Helping out links
The biggest problem with the Net, it seems to me, is sorting through the morass of links to find the really credible information. This is not a new observation; Google is an attempt to make the Net “speak for itself”: highly linked pages appear high on the list, and a link counts for more when it comes from a page that is itself highly linked. And so on recursively. This is a way of expressing trust relationships: if I link to you, I trust what you say, in some general sense. Those with a lot of inbound links are highly trusted (by definition). If you’re highly trusted, and you link to a page, your vote counts more than a less-trusted person’s vote. This makes sense, and maps to how ordinary human relationships work.
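To make the recursion concrete, here is a minimal sketch in Python of link-based ranking by repeated vote-passing. It is a simplified illustration, not Google's actual algorithm; the function name `rank_pages`, the damping value, and the toy pages "a" through "f" are all assumptions made up for the example.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Rank pages by inbound links, weighting each vote by the voter's own rank.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outbound links is shared among all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping + damping * dangling) / n for p in pages}
        # Every page splits its current rank evenly among the pages it links to.
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy Web: "d" and "f" each have exactly one inbound link,
# but the page linking to "d" is itself linked from two other pages.
links = {"a": ["c"], "b": ["c"], "c": ["d"], "e": ["f"]}
ranks = rank_pages(links)
print(sorted(ranks, key=ranks.get, reverse=True))
```

In this toy graph "d" ends up ranked above "f" even though each has a single inbound link, because the vote for "d" comes from a page that others already trust.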
It seems that the linker could do more to help the search engines, though. For instance, what if I want to make clear that I trust someone’s statements about mathematics, but not his or her statements on anthropology? If I link to someone’s Website, and that person happens to talk about both subjects, all Google knows is that I generally trust what this person says. Something more specific would help.
So it would make sense to develop a set of XML tags that both categorize information into a taxonomy and label links with that taxonomy. For instance, the syntax might look like this:
<link url="http://www.philosophy.duq.edu/tophtml/faculty.html#rockmore">
  <topic>
    <main>Philosophy</main>
    <subhead>Continental</subhead>
  </topic>
</link>
This syntax isn’t exactly what we want, because we’d want arbitrarily nestable tags (Philosophy -> Continental Philosophy -> Hegel -> Hegel Scholarship Since 1950 -> . . . ), and the above syntax seems like it wouldn’t scale very well in that direction. But you get the idea.
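One way the nesting could work is to let a topic tag contain further topic tags. The sketch below assumes a hypothetical `<topic name="...">` element (not part of any existing standard) and uses Python's standard `xml.etree.ElementTree` to read each link's topic as a path from the broadest category down to the narrowest.

```python
import xml.etree.ElementTree as ET

# Hypothetical nestable syntax: each <topic> may contain narrower <topic>s.
SAMPLE = """
<link url="http://www.philosophy.duq.edu/tophtml/faculty.html#rockmore">
  <topic name="Philosophy">
    <topic name="Continental Philosophy">
      <topic name="Hegel"/>
    </topic>
  </topic>
</link>
"""


def topic_paths(link_xml):
    """Return the link's URL and every root-to-leaf topic path it declares."""
    root = ET.fromstring(link_xml.strip())
    paths = []

    def walk(element, prefix):
        for child in element.findall("topic"):
            path = prefix + [child.get("name")]
            if child.find("topic") is None:
                paths.append(path)  # no narrower topic below: a complete path
            else:
                walk(child, path)

    walk(root, [])
    return root.get("url"), paths


url, paths = topic_paths(SAMPLE)
print(url)
print(" -> ".join(paths[0]))
```

Because the nesting is recursive, a linker can go as many levels deep as the subject demands without any new tag names being invented.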
A search engine like Google could then take my proposed subject labeling of a page (“this is a page about Hegel”) and decide whether I’m a trustworthy judge of that subject. The way it would decide my “subjectworthiness” is analogous to how it decides trustworthiness overall: by how many people link to my site on that same subject. Highly ranked philosophers’ “votes” would count more than low-ranked philosophers’, and so forth.
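As a rough sketch of “subjectworthiness,” one could run the same kind of vote-passing over only those links labeled with a given subject. Everything here is an assumption for illustration: the `(source, target, topic)` link format, the function name `subject_rank`, and the people and labels in the sample data.

```python
def subject_rank(labeled_links, topic, damping=0.85, iterations=50):
    """Rank pages using only the links that were labeled with `topic`."""
    edges = [(s, t) for s, t, label in labeled_links if label == topic]
    pages = sorted({page for edge in edges for page in edge})
    n = len(pages)
    outgoing = {p: [t for s, t in edges if s == p] for p in pages}
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outbound links is shared among all pages.
        dangling = sum(rank[p] for p in pages if not outgoing[p])
        new_rank = {p: (1.0 - damping + damping * dangling) / n for p in pages}
        for p in pages:
            for target in outgoing[p]:
                new_rank[target] += damping * rank[p] / len(outgoing[p])
        rank = new_rank
    return rank


# Hypothetical labeled links: (linker, linkee, subject of the link).
labeled_links = [
    ("alice", "carol", "Hegel"),
    ("bob", "carol", "Hegel"),
    ("carol", "dave", "Hegel"),
    ("erin", "frank", "Hegel"),
    ("alice", "frank", "Linux"),
    ("bob", "frank", "Linux"),
    ("carol", "frank", "Linux"),
]
hegel = subject_rank(labeled_links, "Hegel")
linux = subject_rank(labeled_links, "Linux")
print(max(hegel, key=hegel.get), max(linux, key=linux.get))
```

In the sample data "frank" has the most inbound links overall, yet on the subject of Hegel "dave" ranks higher, because his one vote comes from someone the other Hegel linkers already trust.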
Possible extensions: if pages were aware of their linkers, this process would become much more interesting. If a number of pages link to me, say on the topics of “the Internet,” “Linux,” and “Hitchcock,” then my server can publish those as “taxonomies that I’m aware of.” When someone links to me, software on the linker’s machine could ask the linker, “This page has already been categorized as a Linux page, a Hitchcock page, and a page about the Internet. Do any of these subjects fit your new link? If not, would you like to add a new topic?” We wouldn’t want taxonomies to become self-reinforcing, so that users would lazily pick “Hitchcock” even if their link was actually about Howard Hawks. But you get the idea: users could very quickly categorize their own data, in effect creating a distributed Yahoo! that does the job much, much better than Yahoo! does.
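The “taxonomies that I’m aware of” list could be as simple as a tally of the labels on inbound links. The following is only a sketch under that assumption; `known_topics` and the sample labels are hypothetical, and a real system would need some guard against the self-reinforcement problem mentioned above.

```python
from collections import Counter


def known_topics(inbound_labels, limit=3):
    """Return the most common labels that linkers have applied to this page."""
    counts = Counter(inbound_labels)
    return [topic for topic, _ in counts.most_common(limit)]


# Hypothetical labels attached to links pointing at my page.
inbound_labels = ["Linux", "Hitchcock", "Linux", "the Internet", "Linux", "Hitchcock"]
suggestions = known_topics(inbound_labels)
print("This page has already been categorized as:", ", ".join(suggestions))
```

The linker's software would show these suggestions and still leave room to type in a new topic.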
One big moral of the above is simply this: the fundamental language of the Web is the hyperlink. Authors have had their own versions of the hyperlink since well before the Internet, be they academic footnotes or literary allusions. The Web just makes linking easier and richer than it has ever been before.
It also seems that my linking to another site often expresses disdain for that site, rather than a vote of confidence in it; often I link so that I can show others how much of a fool I think the linkee is. Google counts this as just another link, so the most highly linked fools will end up at the top of the list for exactly the wrong reason. So again, there should be a syntax to express the degree of distrust for the person we link to. Or at the very least, there should be a Google-specific set of tags that says, “don’t count my vote here!”
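A minimal sketch of how a search engine might honor such a marker: each link carries a vote of "trust", "distrust", or "none", and only "trust" links are handed on to the ranking step. The three vote values, the tuple format, and the function names are assumptions invented for this example.

```python
def countable_links(links):
    """Keep only the links whose author meant them as a vote of confidence."""
    return [(source, target) for source, target, vote in links if vote == "trust"]


def inbound_votes(links):
    """Tally trust and distrust votes per target; "none" links are ignored."""
    tally = {}
    for _source, target, vote in links:
        entry = tally.setdefault(target, {"trust": 0, "distrust": 0})
        if vote in entry:
            entry[vote] += 1
    return tally


# Hypothetical links: (linker, linkee, vote).
links = [
    ("me", "sage.example", "trust"),
    ("me", "fool.example", "distrust"),
    ("you", "fool.example", "distrust"),
    ("you", "fool.example", "none"),
]
print(countable_links(links))
print(inbound_votes(links)["fool.example"])
```

With the markers in place, the three links to the hypothetical fool.example no longer push it up the list, and the distrust tally is available if the engine wants to use it.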
The architecture of expressing trust relationships on the Net is in its infancy, but it’s dreadfully important that we make it work. Traditional media maintain their hold over the Net (nytimes.com, cnn.com, abcnews.com, etc.) because they have carried over the public’s trust onto the Web. And yet the great promise of the Net is that more ground-level media such as Weblogs will displace the huge corporations. This won’t happen unless there are reliable ways to measure trust on the Net.