Thursday, February 03, 2005

Search Engines

Imagine going to New York City with your spouse to visit a friend. Your friend is the ultimate sophisticated urbanite who knows the ins and outs of everything trendy in New York. You ask your friend, "Where is a nice place we can go to dinner tonight?"

This is a simple request, and your friend is going to process it very naturally. He will ask you several questions. For example:
  • What kind of cuisine would you prefer?
  • What price range for the restaurant?
  • Fancy or low key? Loud or quiet? Music or not?
Then, using his knowledge of the New York Restaurant Topography, and perhaps adding in a little extra information (your current location in the city, to avoid a long cab ride; his knowledge of your personality and your spouse's; and so on), he is going to match you with a very suitable restaurant. He is going to send you to a very nice, highly rated restaurant that has wonderful food and great service. The whole transaction might take two minutes or so, and you get exactly the information you are looking for.

Compare that transaction with how today's search engines work and you can see how primitive today's search engines are. It really is quite sad. If you go to Google right now and type in the search string "New York City restaurants", you will get over 15 million results. Adding the word "French" really narrows it down -- only four million results. If you try "New York city best French restaurants" that gets you down to two million results.

What you would like, at that point, is for Google to ask you, "What are you trying to do?" Are you trying to find a restaurant for tonight? Or are you trying to compile a list of all French restaurants in New York City for a travel guide you are producing? (Or something else?) Those are completely different tasks. If it is the former, then you would like Google to query you about location, price range and atmosphere, and then give you two or three search results -- the actual pages for the couple of restaurants that are perfect for your meal tonight.
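
To make the "ask a few questions, then return two or three results" idea concrete, here is a minimal Python sketch. It is only an illustration, not how any real search engine works: the `Restaurant` fields, the `recommend` function and the placeholder entries in `SAMPLE` are all made up for this example, and a real system would need far richer data and a way to infer the answers instead of receiving them as arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restaurant:
    # Hypothetical record; every field here is an assumption for illustration.
    name: str
    cuisine: str
    price: int        # 1 (cheap) to 4 (expensive)
    quiet: bool       # True for a quiet room, False for a loud one
    blocks_away: int  # distance from the diner's current location
    rating: float     # higher is better


def recommend(restaurants, cuisine, max_price, want_quiet, limit=3):
    """Filter on the answers to the clarifying questions, then rank.

    Ranking is by rating (best first), with distance as the tie-breaker.
    """
    matches = [
        r for r in restaurants
        if r.cuisine == cuisine
        and r.price <= max_price
        and r.quiet == want_quiet
    ]
    matches.sort(key=lambda r: (-r.rating, r.blocks_away))
    return matches[:limit]


# Placeholder data, not real restaurants.
SAMPLE = [
    Restaurant("Placeholder A", "french", 3, True, 4, 4.6),
    Restaurant("Placeholder B", "french", 4, True, 2, 4.9),
    Restaurant("Placeholder C", "italian", 2, True, 1, 4.8),
    Restaurant("Placeholder D", "french", 2, False, 3, 4.7),
    Restaurant("Placeholder E", "french", 2, True, 10, 4.6),
    Restaurant("Placeholder F", "french", 1, True, 6, 4.1),
]

if __name__ == "__main__":
    # Answers to the three questions: French, price level 3 or less, quiet.
    for pick in recommend(SAMPLE, cuisine="french", max_price=3, want_quiet=True):
        print(pick.name, pick.rating, pick.blocks_away)
```

The point of the sketch is the shape of the interaction: three answers turn millions of candidate pages into a short ranked list. The hard part, which the sketch skips entirely, is knowing which questions to ask in the first place.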

There are so many places right now where search engines fail us because they cannot read and understand text, or understand the intent of the person doing the query. For example, let's say you want to know how many teenagers there are in the United States. That's a simple question. You type in "How many teenagers are there in the U.S." What you would like Google to do is come back with a single result -- the actual number! What you get instead is... Well, look at this. Here is what I got tonight:

That first one really does say, " - Police: Woman threw sex parties for teenagers" and the next one down really does say, "USAID: Earthquake and Tsunami Relief." Is that what I am looking for? Not really. None of these search results have anything to do with the number I am looking for. They are not even close.

Now imagine going further. You want a table showing the number of teenagers in the U.S. by year going back to 1950. Imagine if you could say that to the search engine, and have it actually come back with the table. Then you ask that the table show the teen population for England, Canada, Australia and the U.S. by year going back to 1950. And the search engine does it. Then you ask it to show per capita spending on education by year in the four countries. And it does it.

It is easy to imagine that, in 2050, the best search engine on the planet will not only have spidered all the information on the "Internet" (whatever form that takes in 2050), but will also have read and understood that information, making it a super-intelligent source of all knowledge. It will be able to answer any question that you might have, based on the truly staggering amount of information that is available to it.

When we get to that point, the really interesting question is, "What else will this search engine be able to do?" If it has all of the world's information loaded up in its "brain", what connections, patterns and ideas will it be able to see?



At 9:08 AM, Anonymous said...

Hey MB - you need to find a way to integrate your blogs better. This post on Robotic Nation is relevant:

Taking AI to the next level

At 4:53 AM, Anonymous said...

Relevant article: Search Engines -- The Future

At 2:22 PM, Anonymous said...

Check out the START project at MIT

At 4:49 PM, Blogger Dimitar Vesselinov said...

What 2034 will bring

Jakob Nielsen observed:
"According to Moore's Law, computer power doubles every 18 months, meaning that computers will be a million times more powerful by 2034. According to Nielsen's Law of Internet bandwidth, connectivity to the home grows by 50 percent per year; by 2034, we'll have 200,000 times more bandwidth. That same year, I'll own a computer that runs at 3PHz CPU speed, has a petabyte (a thousand terabytes) of memory, half an exabyte (a billion gigabytes) of hard disk-equivalent storage and connects to the Internet with a bandwidth of a quarter terabit (a trillion binary digits) per second." New Moore's Law
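
The growth figures in the quote can be checked with a few lines of Python. This is only a rough sketch: the 30-year span is an assumption (the quote does not state its starting year), and the variable names are made up for this example.

```python
# Assumption: the projection covers roughly 30 years up to 2034.
YEARS = 30

# Moore's Law as quoted: computer power doubles every 18 months (1.5 years).
moore_doublings = YEARS / 1.5
moore_factor = 2 ** moore_doublings

# Nielsen's Law as quoted: bandwidth grows by 50 percent per year.
nielsen_factor = 1.5 ** YEARS

print(f"Doublings of computer power: {moore_doublings:.0f}")
print(f"Computer power growth factor: {moore_factor:,.0f}")
print(f"Bandwidth growth factor: {nielsen_factor:,.0f}")
```

Under that 30-year assumption, the script gives 20 doublings, or a factor of about 1.05 million for computer power, and a factor of about 192,000 for bandwidth. Both are consistent with the rounded "a million times" and "200,000 times" in the quote.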

At 2:10 PM, Anonymous said...

It looks like Google heard you!

Google intros Q&A service


