A comment to The Scientific Indian suggested -
Soon the internet might go the ants' way, I believe... mobile routers, network devices exhibiting group intelligence... automatic social bookmarking bots... the unifying thread is the unsophisticated nature of ants themselves.
Unsophisticated nature of the ants themselves - how better could we describe ourselves as we go about bookmarking and tagging items of interest on the web which we know little about? The same post points to a New Scientist article which suggests that Big Brains are not crucial to teaching and gives the example of... yes, ants -
Teaching differs from simply broadcasting information in that the teacher must modify their behaviour, at some cost, to assist a naïve observer to learn more quickly.... follower ants would indeed find food faster when tandem running than when simply searching for it alone, but at the cost to the teacher who would normally reach the food about four times faster if foraging alone.
That's fascinating in that it explains to me why we take the time to (socially) bookmark. I've often wondered why I spend so much more time tagging and annotating my del.icio.us bookmarks than I can rationally explain. At a subconscious level I seem to have bought into a contract that the time I donate as an individual will somehow reward the community, or the collective. And I do modify my behaviour in the sense that I try to use tag words which I think will make the most sense across demographic boundaries.
Jon Udell produced a screencast when taking a look at language evolution in del.icio.us. It's worth watching to the very end where he sums up -
In his book - The Language Instinct - Steven Pinker talks about how you get from a pidgin to a creole. A pidgin is what you get when you throw people together who have no common language and grammatically it's kind of a mess. But the children of pidgin speakers spontaneously create creole languages and those are grammatically complete. All you need to make it work is an environment in which people can easily speak to each other, hear one another and adjust what they say according to what they hear. The social tagging services just get out of the way and try to let those conversations happen. Is this how we wind up creating the Semantic Web? I'm guessing that it is.
That Sem word fired off a neuron in my singular brain and sent me searching for my paperback copy of Steven Johnson's Emergence: The Connected Lives of Ants, Brains, Cities and Software. And here it is on page 75 -
The great bulk of ant information-processing relies on compounds of pheromones, also known as semiochemicals for the way they create a functional sign system among the ants.... These chemical symbols turn out to be the key to understanding swarm logic... pheromones play the central role in the organization of colonies.
Critical mass is crucially important though as Science Frontiers online points out -
Put a hundred army ants on a flat surface and they will walk around in never decreasing circles until they die from exhaustion. But a colony of a million army ants is a sophisticated "super-organism."
It seems that intelligence, natural or artificial, is an emergent property of collective communication. Human consciousness itself may be an epiphenomenon of extraordinary processing power. Although experts prefer to avoid simplistic definitions of intelligence, it seems clear that all intelligence involves the rational manipulation of symbolic information. This is exactly what happens when army ants pass information from individual to individual through the 'writing' and 'reading' of symbols, often in the form of chemical messengers or trail pheromones, which act as stimuli for changing behavior patterns.
With this in mind I decided to test what intelligence might be evident in the del.icio.us social bookmarking service. Del.icio.us already has a rudimentary tag weighting algorithm which I would like to see them using to generate self-organizing OPML hierarchies but for the purposes of this experiment I decided to generate my own. So I visited the del.icio.us popular page, which displays recent highly tagged items. I scanned down to find an item which wasn't in any way familiar to me and didn't have a descriptive title.
The first one I came across was ImageWell. I'd never heard of it but apparently 97 people had tagged it so I clicked through on the link to the URL page. Now, listed on the right hand side of that page are the Common Tags assigned by the 97 taggers. Here they are -
57 software
52 mac
46 osx
32 graphics
21 tools
21 photos
19 free
16 apple
10 freeware
8 image
7 photo
6 tool
5 design
5 editor
4 apps
4 web
4 application
3 macintosh
2 imaging
2 editing
2 photograph
The number apparently indicates the number of times each tag has been applied to the item, listed in order from most frequent at the top to least frequent at the bottom.
So I wanted to manually generate an OPML hierarchy. The rules I decided on were -
- You must go from top (general/broad) to bottom (specific)
- You can skip any tag
- You can combine tags in the order from top to bottom
Which led me to generate this hierarchy -
Mac OSX > Freeware > Image/photo/design editor
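For anyone who would rather see the three rules written down as code, here is a minimal sketch in Python. It is only an illustration of the manual procedure above, not anything del.icio.us provides: the function name build_hierarchy and the idea of passing the chosen tags as a list of levels are my own assumptions, and the picking of tags is still done by a human.

```python
# Common Tags for ImageWell, copied from the list above as (count, tag),
# in the order del.icio.us displayed them (most frequent first).
COMMON_TAGS = [
    (57, "software"), (52, "mac"), (46, "osx"), (32, "graphics"),
    (21, "tools"), (21, "photos"), (19, "free"), (16, "apple"),
    (10, "freeware"), (8, "image"), (7, "photo"), (6, "tool"),
    (5, "design"), (5, "editor"), (4, "apps"), (4, "web"),
    (4, "application"), (3, "macintosh"), (2, "imaging"),
    (2, "editing"), (2, "photograph"),
]


def build_hierarchy(common_tags, levels):
    """Render hand-picked tags as 'broad > ... > specific'.

    `levels` is a list of lists of tag names (a hypothetical input format).
    Tags inside one level are combined; any tag may be skipped; but the
    chosen tags must appear in the same top-to-bottom order as the list.
    """
    # Position in the displayed list stands in for "general to specific".
    rank = {tag: i for i, (_, tag) in enumerate(common_tags)}
    last = -1
    for level in levels:
        for tag in level:
            if tag not in rank:
                raise ValueError(f"unknown tag: {tag}")
            if rank[tag] <= last:
                raise ValueError(f"tag out of order: {tag}")
            last = rank[tag]
    return " > ".join("/".join(level) for level in levels)


# The same choices I made by hand.
levels = [["mac", "osx"], ["freeware"], ["image", "photo", "design", "editor"]]
print(build_hierarchy(COMMON_TAGS, levels))
```

This prints a lowercase approximation of the hierarchy above (mac/osx > freeware > image/photo/design/editor). The only thing the code really enforces is the ordering rule; deciding which tags to skip and which to combine is still the human part.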
The hierarchical representation is language independent, so at this stage you could replace the English words above with, let's say, their Spanish equivalents and proceed just as easily to the next step, which is to interpret the hierarchy in terms of the rules of grammar. Of course English grammar tends to work backwards, from the specific to the general, so starting on the right hand side and working to the left the hierarchy tells me that ImageWell is: photo editing/design freeware for Mac OSX. So I click through on the item link to the ImageWell page and sure enough the first paragraph on their homepage says - "ImageWell, the free and lean image editor...". And it is, naturally, for OSX.
At this stage you're pointing out how convenient it was for me to pick and choose the tags I did to arrive at my definition of ImageWell. But my contention is that as social bookmarking services like del.icio.us improve, critical mass, feedback loops and visualization features will indeed allow us to arrive at such tight definitions.
Let's try another. The next Popular item which meant absolutely nothing to me was Doane Paper. 109 other people had tagged it (25 recently) so I clicked through on its URL information page. And the Common Tags were -
66 paper
35 design
15 business
12 creativity
9 diy
9 graph
8 pdf
8 grid
7 tools
7 useful
6 notes
5 legal
5 moleskine
5 journal
5 productivity
5 writing
3 downloads
3 blog
2 download
2 gtd
2 notetaking
2 templates
2 cool
This time I decided to skip the hierarchy and just see what sense I could make of it by starting at the bottom and working upwards. Remember a hierarchy starts at the top and works down because it needs to go from broad/general to specific. The English language on the other hand tends to go from specific to general (e.g. jolly fat man, dull red car). So here was my first effort -
cool download [for] writing journal(s) [on] moleskine [in] grid [or] graph [format] [for] creativity [in] business design
I've added the words and letters in brackets to signify how the pidgin generated by the first generation of social bookmarking services might evolve into a grammatically complete creole with second generation services.
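The bottom-up reading can be sketched in Python too. Again this is just my manual procedure written down under assumptions: the function name bottom_up_phrase and the GLUE table of bracketed connectives are hypothetical, the connectives are still supplied by a human, and I've dropped the "(s)" on journal to keep it simple.

```python
# Common Tags for Doane Paper, copied from the list above as (count, tag).
COMMON_TAGS = [
    (66, "paper"), (35, "design"), (15, "business"), (12, "creativity"),
    (9, "diy"), (9, "graph"), (8, "pdf"), (8, "grid"), (7, "tools"),
    (7, "useful"), (6, "notes"), (5, "legal"), (5, "moleskine"),
    (5, "journal"), (5, "productivity"), (5, "writing"), (3, "downloads"),
    (3, "blog"), (2, "download"), (2, "gtd"), (2, "notetaking"),
    (2, "templates"), (2, "cool"),
]

# Hand-picked tags, read from the bottom of the list upwards.
CHOSEN = ["cool", "download", "writing", "journal", "moleskine",
          "grid", "graph", "creativity", "business", "design"]

# Bracketed words a human adds in front of a tag (hypothetical table).
GLUE = {
    "writing": "[for]", "moleskine": "[on]", "grid": "[in]",
    "graph": "[or]", "creativity": "[format] [for]", "business": "[in]",
}


def bottom_up_phrase(common_tags, chosen, glue):
    """Join tags from specific (bottom of list) to general (top of list)."""
    rank = {tag: i for i, (_, tag) in enumerate(common_tags)}
    last = len(common_tags)
    words = []
    for tag in chosen:
        # Each chosen tag must sit higher in the list than the one before it.
        if rank[tag] >= last:
            raise ValueError(f"tag out of order: {tag}")
        last = rank[tag]
        if tag in glue:
            words.append(glue[tag])
        words.append(tag)
    return " ".join(words)


print(bottom_up_phrase(COMMON_TAGS, CHOSEN, GLUE))
```

Strip out the GLUE table and what remains is the raw pidgin; the bracketed words are exactly the part I'm hoping second generation services could learn to supply.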
So, I then clicked through on the link for Doane Paper, and wow! The definition was almost exactly as predicted by the army of tagging ants. The pheromones secreted by the taggers had left behind a hierarchical trail to the definition. The semiochemicals have generated a miniature semantic web.
No one has explicitly posted to del.icio.us the specific definition for Doane Paper in the sentence above, but the army of tagging ants has generated it with their collective intelligence.
So where do the next generation of Social Bookmarking services need to go to truly realize this potential?
- Critical mass. The above examples, although pleasingly useful for demonstration purposes, suffer from the small number of taggers. Imagine how the accuracy could benefit from thousands of taggers instead of hundreds.
- Feedback Loops. Del.icio.us currently provides a crude form of feedback when tagging by popping up popular terms but this needs to become much more sophisticated.
- Visualisations. One type of feedback loop we need is the presentation of previously used tags organized in a hierarchy. That would allow us to quickly zoom in on the best choice of tag much more accurately than the current crude weighting system. Much more about this later....
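As a very rough idea of what a hierarchical presentation of previously used tags might start from, here is a sketch in Python that sorts tags into tiers by the share of taggers who used them. Everything about it is an assumption on my part: the function name tier_tags, the three tier names and the 30% and 10% cut-offs are invented for illustration and are not how del.icio.us weights tags.

```python
# The first six Common Tags for Doane Paper from the list above, as
# (count, tag); 109 people had tagged the item.
TOP_TAGS = [
    (66, "paper"), (35, "design"), (15, "business"),
    (12, "creativity"), (9, "diy"), (9, "graph"),
]
TAGGERS = 109


def tier_tags(common_tags, taggers, broad=0.30, narrow=0.10):
    """Group tags by the share of taggers who applied them.

    The cut-offs `broad` and `narrow` are arbitrary illustrative values.
    """
    tiers = {"broad": [], "middle": [], "specific": []}
    for count, tag in common_tags:
        share = count / taggers
        if share >= broad:
            tiers["broad"].append(tag)
        elif share >= narrow:
            tiers["middle"].append(tag)
        else:
            tiers["specific"].append(tag)
    return tiers


for name, tags in tier_tags(TOP_TAGS, TAGGERS).items():
    print(f"{name}: {', '.join(tags)}")
```

With only 109 taggers the lower tiers rest on a handful of people each, which is the critical mass problem from the first bullet all over again.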
In conclusion, for the moment, I'd just like to repeat Jon Udell's wonderful insight -
All you need to make it work is an environment in which people can easily speak to each other, hear one another and adjust what they say according to what they hear. The social tagging services just get out of the way and try to let those conversations happen. Is this how we wind up creating the Semantic Web? I'm guessing that it is.
Technorati Tags: opml, social bookmarking
Great post, James.
You know when you tag a page in del.icio.us and it suggests popular tags that others have used on that same item? I've always thought that that's about as similar as online behaviour can get to the pheromone-trailing ants of Emergence. Free will is retained, the paths of those who came before you influence your actions (i.e. what you tag), and intelligent behaviour emerges from that activity. But, as you point out, the signal to noise ratio only increases to a useful level once you reach a critical mass of participants.
I'm not entirely convinced of Udell's semantic web hypothesis, though (but I think you cover what I'm about to point out). In The Language Instinct, Pinker makes the distinction that creoles evolve because only children have an innate ability to construct new grammatically elegant languages - we lose this ability way before adulthood (this is why new languages are much more difficult for adults to learn, and the best attempt that adults can make at constructing a new language is a pidgin). However, I do think that all you need is the technological equivalent of a child's language ability to construct something semantic out of social data, a framework that can apply meaning (like what you've done manually). The relationship between pidgin and creole is a fantastic analogy of the relationship between social software and the semantic web. Humans can infer meaning from pidgin data, but machines can't - they require the robustness of a creole.
So, is it possible to make a creole (or semantic)-generating framework for the social web?
Posted by: Emmet Connolly | March 02, 2006 at 02:41 PM
The point of tagging things in del.icio.us is so that you can find them again later, the same way you organise files on your hard disk into folders, except that del.icio.us allows multiple folders. Joshua Schachter has always stressed that bookmarking should first and foremost be a selfish action (one of Clay Shirky's requirements for good metadata, see: http://many.corante.com/archives/2005/02/01/folksonomy_the_soylent_green_of_the_21st_century.php ) and that you shouldn't think too much about how to tag an item. If you assign it the first tags that come to mind then it's likely that the same associations will be active in your brain when you're actually looking for it. He purposefully doesn't include too much feedback from other users that might disrupt this process.
Having said that, given the tag overlaps that do arise on del.icio.us, you've nicely demonstrated the possibility of clustering items into tag categories. You should note that the ability to categorize items into type-hierarchies, as you've done, is only a very limited part of what the semantic web is about. In other words, you may be able to derive reasonably good "is-a-kind-of" relations, but the majority of semantic web usefulness will come from the ability to describe arbitrary relations.
Emmet says: "is it possible to make a creole (or semantic)-generating framework for the social web?"
Last summer I did a demo at Barcamp of a prototype web app aimed squarely at this goal. The meager session notes are at http://barcamp.org/StructuredDataForTheMasses . I haven't had time to set it up on a server yet. There have been a couple of apps released in the meantime with some similar ideas, e.g. Google Base and Ning, but they're not quite there.
I think the semantic web can be bootstrapped with a social app without people having to know what the semantic web is. del.icio.us is not it. Nowhere near yet. But I can demonstrate an interface that might come closer. I'd show it at the Blog Awards tech meetup but I'm out of the country. Some other time, promise :)
Posted by: Rowan Nairn | March 02, 2006 at 04:21 PM
Awesome post.
Posted by: Alex Barnett | March 07, 2006 at 06:12 AM
Thanks for the feedback everyone. I'm following the suggested links and brushing up a lot on my understanding of the Semantic Web.
Posted by: James Corbett | March 07, 2006 at 09:39 AM
As a late addition, I saw something recently that reminded me to read this post again. Check out how Flickr infers the different meanings of the word "pitcher" based on the common adjacent tags applied to each photo:
http://flickr.com/photos/tags/pitcher/clusters/
Posted by: Emmet | July 10, 2006 at 02:56 PM
Wow, cool,... now *that's* what I'm talking about :)
But where's my OPML? ;)
Posted by: James Corbett | July 10, 2006 at 03:04 PM
Great insight into the "clouds" of del.icio.us.
Posted by: doane | March 09, 2007 at 05:27 PM
Have a look at http://www.entopica.com/, a new social bookmarking web site
Entopica is an online system that allows you to easily access, categorize, share and store your bookmarks online.
It is free to join and registration is both quick and easy.
Discover a whole new world of social bookmarking. It is user friendly and easy to use.
Posted by: Val | March 08, 2008 at 05:55 AM