The most fascinating thing I've read in a long time is this article about a group of scientists from the Weizmann Institute of Science who have identified the basic principles of communication -
"Mathematically, these concept vectors can go in many directions, and reading the text can be thought of as a tour along paths in the resulting network. The multidimensional concept vectors seem to span a “web of ideas.” The scientists’ work suggests this network is based on a tree-like hierarchy that may be a basic underpinning of language. The reader or listener can reconstruct the hierarchical structure of a text, and thus the multidimensional space of ideas, in his or her mind to grasp “the author’s meaning.”"
How profound is that? When I wrote a hypothesis for how social bookmarking can lead to the semantic web[1] I demonstrated how the hive mind is surprisingly good at extracting meaning from a text by building a tree-like hierarchy of keywords (tags). In fact the collective intelligence describes a webpage better than any individual tagger. Now along[2] comes new research from Stanford aimed at automating the production of hierarchical taxonomies to describe the data from flat tag sets generated by users -
"This paper describes a simple algorithm for constructing hierarchies in social tagging systems that usually works reasonably well. The main contribution is a notion of generality in social tagging systems based on centrality in a similarity graph."
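To make that idea concrete, here's a rough sketch in Python of how generality-by-centrality might work. This is my own illustration, not the paper's actual algorithm: I'm assuming cosine similarity between tags (based on which pages they were applied to), plain degree centrality as the stand-in for "generality", and a made-up `threshold` parameter. The function name `build_hierarchy` and the sample bookmarks are hypothetical too.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts of page -> count)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_hierarchy(bookmarks, threshold=0.4):
    """Turn flat (page, tag) pairs into a dict of tag -> parent tag (None = top level).

    Assumption: a tag is more "general" when it is similar to more other tags,
    i.e. it has a higher degree in the thresholded similarity graph.
    """
    # One vector per tag: which pages it was applied to, and how often.
    vectors = defaultdict(lambda: defaultdict(int))
    for page, tag in bookmarks:
        vectors[tag][page] += 1
    tags = sorted(vectors)

    # Pairwise similarity between all tags.
    sim = {}
    for a, b in combinations(tags, 2):
        sim[a, b] = sim[b, a] = cosine(vectors[a], vectors[b])

    # Degree centrality: count of neighbours at or above the threshold.
    centrality = {
        t: sum(1 for u in tags if u != t and sim[t, u] >= threshold) for t in tags
    }

    # Place the most central (most general) tags first; each later tag hangs
    # under the most similar tag already placed, or at the top level.
    parent = {}
    for tag in sorted(tags, key=lambda t: (-centrality[t], t)):
        best = max(parent, key=lambda p: sim[tag, p], default=None)
        if best is not None and sim[tag, best] >= threshold:
            parent[tag] = best
        else:
            parent[tag] = None
    return parent


# Hypothetical sample data: (page, tag) pairs as users might have saved them.
sample = [
    ("p1", "web"), ("p1", "opml"),
    ("p2", "web"), ("p2", "opml"), ("p2", "rss"),
    ("p3", "web"), ("p3", "rss"),
    ("p4", "web"), ("p4", "tagging"),
    ("p5", "cooking"),
]
hierarchy = build_hierarchy(sample, threshold=0.4)
```

With this sample, the broad tag ends up at the top and the narrower ones hang beneath it, while an unrelated tag starts its own branch.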
This ties in extremely well with my views on progressive feed filtering through hierarchical tag grazing. And it should be noted that RawSugar are also working on tag clustering and similarity detection. But getting back to the whole notion of seeing the world in hierarchies, Damien Mulley and Danny Ayers both had some interesting pushback. Danny said -
"There’s nowt wrong with hierarchical views of web data. It’s only when you start shoehorning web (or general real-world) data into a tree data structure that I reckon you’re on the road to suckiness."
And Damien added -
"We may organise things physically in hierarchies but we certainly do not think in such a way. If we did we'd be a lot slower in reacting to stimuli".
But Danny caught the crux of the confusion -
"The world is a hierarchy? (Or as you put it better, at least we think of it that way)."
Exactly. I had overstretched the hypothesis and confused the concept with the communication of the concept. What the Weizmann study shows is that even though not everything is a hierarchy, we create a hierarchy in the very act of explaining or communicating the thing to ourselves and to others. So if the world isn't a hierarchy we still think of it as such because we think by sub-vocalizing. I always sub-vocalize when communicating an idea (even to myself). That's why I've never mastered speed reading techniques. If I'm not sub-vocalizing while I'm reading I feel like I'm not absorbing the information. I think in English. I have relatives who were born in the Gaeltacht and therefore think in Irish. I've often wondered if we have fundamentally different ways of thinking because of the different languages we use to do so. So it's fascinating to read more from the Weizmann Institute of Science -
"Philosophers from Wittgenstein to Chomsky have taught us that language plays a central evolutionary role in shaping the human brain, and that revealing the structure of language is an essential step to comprehending brain structure. Our contribution to research in this basic field is in the creation of mathematical tools that can be used to make the connection between concepts or ideas and the words used to express them, making it possible to trace in a speech or text the path of an idea in an abstract mathematical space. We can understand theoretically how the structure of the wording serves to transmit concepts and reconstruct them in the mind of the reader."
The podcaster Adam Curry prepares his shownotes in an outliner and has often mentioned how this suits him because he 'thinks in outlines'. Of course outlines are hierarchies. Many who try outliners never go back to using ordinary text editors to organise their thoughts and ideas. We can infer now that this is because the hierarchical structure of an outline matches the hierarchical underpinning of language itself. It's exactly why I'm obsessed with the combination of OPML and social tagging. OPML allows us to model the tree-like hierarchies that underpin the language of communication. I'm convinced that when social bookmarking reaches a critical mass we will have a much more efficient means of navigating the feedosphere because -
- Social Bookmarking sites will render tags as nodes in tree-like hierarchies
- As we drill down through the hierarchies (tree branches) each subsequent tag node will progressively filter the feedosphere on that semantic path
- Tag nodes will be OPML nodes and we'll use feed grazers to browse the hierarchies and consume the feeds
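Here's a minimal sketch in Python of what those three points could look like in practice. It's only an illustration under my own assumptions: the tag tree is a plain nested dict, the feed URLs are made up, and the helper names `filter_feeds` and `to_opml` are hypothetical. The only real format involved is OPML itself, where each `outline` element carries a `text` attribute and feed entries use `type="rss"` with an `xmlUrl`.

```python
import xml.etree.ElementTree as ET


def filter_feeds(feeds, path):
    """Progressively narrow the feeds: each tag on the path filters the result
    of the previous step, so deeper paths give smaller, more specific sets."""
    remaining = dict(feeds)
    for tag in path:
        remaining = {url: tags for url, tags in remaining.items() if tag in tags}
    return sorted(remaining)


def to_opml(tree, feeds, title="Tag hierarchy"):
    """Render a nested dict of tags as OPML, one outline node per tag.

    Design choice (an assumption): feeds are attached only at leaf tags,
    filtered by the full path from the top of the tree down to that leaf.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")

    def walk(children, parent_el, path):
        for tag, sub in sorted(children.items()):
            here = path + [tag]
            node = ET.SubElement(parent_el, "outline", text=tag)
            if sub:
                walk(sub, node, here)
            else:
                for url in filter_feeds(feeds, here):
                    ET.SubElement(node, "outline", type="rss", text=url, xmlUrl=url)

    walk(tree, body, [])
    return ET.tostring(opml, encoding="unicode")


# Hypothetical feeds and the tags the crowd has applied to each.
feeds = {
    "http://example.org/a.xml": {"web", "opml"},
    "http://example.org/b.xml": {"web", "opml", "outliners"},
    "http://example.org/c.xml": {"web", "rss"},
}

# Hypothetical tag hierarchy, e.g. as produced by a social bookmarking site.
tree = {"web": {"opml": {"outliners": {}}, "rss": {}}}

print(filter_feeds(feeds, ["web"]))
print(filter_feeds(feeds, ["web", "opml"]))
print(filter_feeds(feeds, ["web", "opml", "outliners"]))
print(to_opml(tree, feeds))
```

Each step down the path shrinks the list of feeds, and the OPML output is something a feed grazer could open and browse as an outline.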
[1] Interestingly, that blog post was bookmarked by way more del.icio.us users than anything else I've written here
[2] Via John Tropea
Technorati Tags: opml, feed grazing, social bookmarking, tagging
Does "hierarchy" here mean simply nesting one idea in another (with the possibility that one idea might be nested in many others), or does it imply a tree-like structure in which ideas are nested only in one other idea, forming a neat branching structure?
Posted by: David Weinberger | May 17, 2006 at 02:39 PM