Sunday, July 27, 2003

Digging for Googleholes
But the oracle—recently described as "a little bit like God" in the New York Times—is not perfect. Certain types of requests foil the Google search system or produce results that frustrate more than satisfy. These are systemic problems, not isolated ones; you can reproduce them again and again. The algorithms that Google's search engine relies on have been brilliantly optimized for most types of information requests, but sometimes that optimization backfires. That's when you find yourself in a Googlehole.

Googlehole No. 1: All Shopping, All the Time. If you're searching for something that can be sold online, Google's top results skew very heavily toward stores, and away from general information. Search for "flowers," and more than 90 percent of the top results are online florists. If you're doing research on tulips, or want to learn gardening tips, or basically want to know anything about flowers that doesn't involve purchasing them online, you have to wade through a sea of florists to find what you're looking for.

The same goes for searching for specific products: Type in the make and model of a new DVD player, and you'll get dozens of online electronics stores in the top results, all of them eager to sell you the item. But you have to burrow through the results to find an impartial product review that doesn't appear in an online catalog.

I suspect this emphasis is due to the convention of linking to an online store when mentioning a product, whether it's a book, CD, or outdoor grill. In addition, a number of sites—such as DealTime—track the latest prices and availability of thousands of items at online stores, which creates even more product links in Google's database. Because PageRank assumes that pages that attract a lot of links are more relevant than pages without links, these most-linked-to product pages bubble up to the top.
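
To make the link-counting argument concrete, here is a minimal sketch of a PageRank-style calculation on a tiny, made-up web. It is a simplified textbook version of the idea, not Google's actual implementation; the page names, the `pagerank` function, the damping value of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its rank (ranks sum to 1).
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank along its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: two blogs and a price tracker all link to a store,
# while nobody links to the gardening page.
toy_web = {
    "blog_a": ["florist_shop"],
    "blog_b": ["florist_shop"],
    "price_tracker": ["florist_shop"],
    "gardening_tips": [],
    "florist_shop": [],
}

ranks = pagerank(toy_web)
top_page = max(ranks, key=ranks.get)
print(top_page)
```

In this toy setup the store collects rank from every page that mentions a product, while the gardening page only receives the baseline share, which is the same dynamic the paragraph above describes.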

Googlehole No. 2: Skewed Synonyms. Search for "apple" on Google, and you have to troll through a couple pages of results before you get anything not directly related to Apple Computer—and it's a page promoting a public TV show called Newton's Apple. After that it's all Mac-related links until Fiona Apple's home page. You have to sift through 50 results before you reach a link that deals with apples that grow on trees: the home page for the Washington State Apple Growers Association. To a certain extent, this probably reflects the interest of people searching as well as those linking, but is the world really that much more interested in Apple Computer than in old-fashioned apples?

At this stage in the Web's development, people who create a lot of links—most notably the blogging community—tend to be more technologically inclined than the general population, and thus more likely to link to Apple Computer than something like the Washington State Apple Growers Association. (This process is sometimes known as "googlewashing," where one group of prolific linkers can alter the online associations with a given word or phrase.) But there's another factor here, which is that categories that don't have central, well-known sites devoted to them will fare poorly when they share a keyword with other categories.…

Googlehole No. 3: Book Learning. Google is beginning to have a subtle but noticeable effect on research. More and more scholarly publications are putting up their issues in PDF format, which Google indexes as though they were traditional Web pages. But almost no one is publishing entire books online in PDF form. So, when you're doing research online, Google is implicitly pushing you toward information stored in articles and away from information stored in books.…

http://slate.msn.com/id/2085668/
