Saturday, September 26, 2009

Intelligent Social Networks. Part 1

This article is written in collaboration with Maxim Gorelkin.

Collecting users in one's social network is tricky business. Features that make people solve their problems "manually", such as finding contacts and information, are tedious and cumbersome. Many such networks measure their success by the number of profiles, even if those profiles are inactive, fake, duplicate, or created once to "check out" the site and never used again. Such a system should be analyzed consistently in order to measure its complexity and growth rate accurately, but the key measure should be how dynamic it is, since it is the activity on the network, and the intensity of that activity, that determines its current popularity. One major way to make a network popular is to constantly evolve its intelligence. It should let its users solve their social and professional problems by combining their own intelligence with the artificial kind, within the realm of Collective Intelligence, as demonstrated by digg.com (lots of people liked this story, so you might too), last.fm (people who like Madonna also like this artist), and others (see my first article, "Adaptive Web Sites").

Here I will describe some properties of Intelligent Social Networks that I would be willing to pay to use.

A. A search for a person by name does not always work: the name may have changed, I may have forgotten it, or I may remember it incorrectly. However, I can describe certain facts about this individual, such as when, where, in what role, and with whom he worked, each of which is insufficient on its own to identify the exact person I'm seeking. Even if the combination of facts is not unique, it may narrow the set of similar profiles enough to allow a quick browse. Or perhaps there is a set of individuals who can add details about this person and relay the request down the line until the sought contact information is found and returned, or until someone is able to pass my information directly to the person I'm seeking.
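The narrowing step described in A can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Profile` class, the fact keys ("employer", "city", "role"), the `min_matches` cutoff and the sample people are all invented for the example, and a real network would need fuzzy matching on dates and names rather than exact equality.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Profile:
    # Hypothetical profile: a display name plus sets of known facts per attribute.
    name: str
    facts: Dict[str, Set[str]] = field(default_factory=dict)


def match_score(profile: Profile, query: Dict[str, str]) -> int:
    """Count how many of the remembered facts this profile satisfies."""
    return sum(1 for key, value in query.items()
               if value in profile.facts.get(key, set()))


def shortlist(profiles: List[Profile], query: Dict[str, str],
              min_matches: int = 2) -> List[str]:
    """Return names of profiles matching at least min_matches facts, best first."""
    scored = [(match_score(p, query), p) for p in profiles]
    hits = [(s, p) for s, p in scored if s >= min_matches]
    hits.sort(key=lambda sp: -sp[0])  # stable sort: ties keep input order
    return [p.name for _, p in hits]


# Invented sample data, for illustration only.
profiles = [
    Profile("J. Doe", {"employer": {"Acme"}, "city": {"Boston"}, "role": {"engineer"}}),
    Profile("J. Dow", {"employer": {"Acme"}, "city": {"Boston"}, "role": {"analyst"}}),
    Profile("B. Smith", {"employer": {"Globex"}, "city": {"Boston"}, "role": {"engineer"}}),
]
query = {"employer": "Acme", "city": "Boston", "role": "engineer"}

print(shortlist(profiles, query))                 # ['J. Doe', 'J. Dow', 'B. Smith']
print(shortlist(profiles, query, min_matches=3))  # ['J. Doe']
```

No single fact identifies the person here, but the combination ranks one profile first and leaves a list short enough to browse.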

B. Networks often use names as identifiers, and as a result feature dozens of duplicate entities that denote the same physical instance, complicating the search process. In one network, for example, I claimed four (!) universities in my profile, all of them the same university under different names. Standard classification does not usually work for larger networks, but there is a simple decentralized solution to this problem - if a sufficient number of people who use different names in their profiles indicate that those names denote the same entity, the names should be joined under a common identifier and stored as different values of its "names" attribute. If any uncertainty remains, the assumption can be formulated as a hypothesis and tested on a sample of users with these names.
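One possible reading of the decentralized solution in B, sketched in Python: count how many distinct users assert that two names denote the same entity, and merge the names once the count reaches a threshold. The function name `merge_aliases`, the `(user, name_a, name_b)` vote format, the threshold of 3 and the sample names are my assumptions for the example, not part of any existing network's API. The merging itself uses a standard union-find structure.

```python
from collections import defaultdict


def merge_aliases(votes, min_votes=3):
    """Group names that enough distinct users declared to be the same entity.

    votes: iterable of (user, name_a, name_b) assertions.
    Returns a sorted list of sorted name groups.
    """
    # Collect the distinct users supporting each unordered pair of names.
    supporters = defaultdict(set)
    for user, a, b in votes:
        if a != b:
            supporters[frozenset((a, b))].add(user)

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for pair, users in supporters.items():
        a, b = sorted(pair)
        root_a, root_b = find(a), find(b)  # registers both names
        if len(users) >= min_votes:
            parent[root_a] = root_b        # enough agreement: merge
        # Pairs below the threshold stay separate; they are the
        # "hypotheses" that would still need testing on a user sample.

    groups = defaultdict(set)
    for name in parent:
        groups[find(name)].add(name)
    return sorted(sorted(g) for g in groups.values())


# Invented sample votes, for illustration only.
votes = [
    ("u1", "MSU", "Moscow State University"),
    ("u2", "MSU", "Moscow State University"),
    ("u3", "Moscow State University", "MSU"),
    ("u1", "MSU", "Michigan State University"),  # only one supporter
]
print(merge_aliases(votes))
# [['MSU', 'Moscow State University'], ['Michigan State University']]
```

Each resulting group would share one identifier, with its members kept as the values of the "names" attribute.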

C. Most of the emails I receive every day from my groups have no relation to the interests I described in my profile. This stream is closer to "noise" than to information, and I rarely come across something interesting in it; more often than not I delete all the messages without even skimming them. I would prefer that the social network take on the task of filtering and re-categorizing my email, possibly with an importance indicator. Of course, this would require natural language processing, but not necessarily in real time. Moreover, if I found something interesting in these lists, I would like the network to suggest other relevant discussions, similar in content, as well as other groups in which such discussions occur. By the way, the search for groups is another difficult problem that cannot be solved by name and keyword search alone. For example, one group may match my interests perfectly but have had no activity in the last six months, while another group, with a name that means nothing to me, may be extremely active, with people discussing subjects that I would find fascinating. Hence the problem of groups is a semantic problem. And of course, I would prefer to get not only the information relevant to my stated interests, but also information that MAY interest me even though I am not yet aware of it.
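As a very rough illustration of the filtering in C, the sketch below scores each message against the interests in a profile and uses the score as the importance indicator. It is a crude stand-in for real natural language processing: it compares plain word counts with cosine similarity, so it would miss exactly the semantic matches discussed above. The function names, the 0.1 threshold and the sample texts are my assumptions for the example.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_messages(interests, messages, threshold=0.1):
    """Return (score, message) pairs at or above threshold, most relevant first."""
    profile = term_counts(interests)
    scored = [(cosine(profile, term_counts(m)), m) for m in messages]
    kept = [(s, m) for s, m in scored if s >= threshold]
    return sorted(kept, key=lambda sm: -sm[0])


# Invented sample data, for illustration only.
interests = "machine learning recommendation systems"
messages = [
    "Office party on Friday",
    "New paper on recommendation systems and machine learning",
]
for score, message in rank_messages(interests, messages):
    print(f"{score:.2f}  {message}")
# 0.71  New paper on recommendation systems and machine learning
```

The same score could be computed between a message I marked as interesting and other discussions or groups, which would give a first approximation of "similar in content".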

News: On September 21st, the international team "BellKor's Pragmatic Chaos" (Bob Bell, Martin Chabbert, Michael Jahrer, Yehuda Koren, Martin Piotte, Andreas Töscher and Chris Volinsky) received the $1M Grand Prize in the Netflix Prize contest for improving on Cinematch, Netflix's recommendation algorithm.

See the final part here.

5 comments:

  1. Nice post Mikhail.
    Do you think system intelligence alone is sufficient, or will a hybrid of human and system work better?

    What about better expression of feedback from users, and hence better intelligence?

  2. Certainly interesting.

    Where do you see this machine/human hybrid intelligence today? Places like digg, facebook (likes) or perhaps delicious?

  3. If you consider channels within an online community (OC) such as posts, news, profiles, etc., then a search across multiple channels would filter out much of the chaff. The member could tune that to return more relevant results.

  4. You've probably already read it, but http://bit.ly/8rSYsa is a good read from Dean Pomerleau on the global brain that may emerge from social networks.
