
Gmail Voice about future search, not free calls

August 26, 2010

Yesterday’s most exciting news was Google’s introduction of free voice calls to Gmail. In a nutshell, if you have a Gmail account, you can now make free calls from your computer to real landlines and cellphones in North America. You can also call the rest of the world for peanuts, with many countries costing only 2 cents a minute.

The announcement is significant for a number of reasons. For one, it’s direct competition for Skype, which was already pretty direct competition to landline and cellphone companies. Skype has made calling virtually free: I currently pay about $35 a year for unlimited calls within North America through its SkypeOut service, which is obviously a fraction of what phone companies charge.

In the U.S., the computer-based Gmail service works nicely with Google Voice, which is another free calling service that lets smartphone owners use their data plan rather than their voice plan to make calls. In other words, you don’t actually need a voice plan to make phone calls with Google Voice; just the data connection will do. And while the Gmail service is currently shackled to the computer, there’s no real reason it has to stay that way.

Here’s why Google will beat Skype and every other phone company: to those other companies, it’s still about phone calls and figuring out how to make money from them. But, because the actual cost of making a call over the internet is almost zero, Google can afford to swallow this rather incidental cost as a future investment toward its real business: search.

In Sex, Bombs and Burgers, I talk to Franz Och, the man behind Google Translate. The company’s award-winning and pretty accurate service uses statistical machine translation, an approach that studies patterns in large collections of text that have already been translated between languages and then uses those patterns to predict the most likely translation of new text. The system’s accuracy depends heavily on the number of documents it has to work with; the larger the comparison database, the more accurate the translation.
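To make "studies patterns and then predicts" a little more concrete, here is a toy sketch in Python of one classic textbook technique, IBM Model 1 word alignment trained with expectation-maximization. To be clear, this is not Google's code or necessarily its method; the three-sentence German-English corpus and the function names `train_ibm_model1` and `best_translation` are my own assumptions for illustration. The idea is simply that words which keep showing up together across translated sentence pairs end up with a high translation probability.

```python
from collections import defaultdict

# Tiny hypothetical parallel corpus: (source sentence, target sentence).
pairs = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]


def train_ibm_model1(pairs, iterations=20):
    """Estimate t(target_word | source_word) from sentence pairs using EM."""
    # Start with the same weight for every word pair seen in the same sentence pair.
    t_prob = {(t, s): 1.0 for src, tgt in pairs for s in src for t in tgt}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of t aligned to s
        total = defaultdict(float)  # expected count of s aligned to anything

        # E-step: share each target word among the source words in its sentence.
        for src, tgt in pairs:
            for t in tgt:
                norm = sum(t_prob[(t, s)] for s in src)
                for s in src:
                    frac = t_prob[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac

        # M-step: renormalize the expected counts into probabilities.
        t_prob = {(t, s): c / total[s] for (t, s), c in count.items()}

    return t_prob


def best_translation(t_prob, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {t: p for (t, s), p in t_prob.items() if s == source_word}
    return max(candidates, key=candidates.get)


model = train_ibm_model1(pairs)
for word in ["das", "haus", "buch", "ein"]:
    print(word, "->", best_translation(model, word))
```

With only three sentence pairs the model can already untangle which word goes with which, because "das" and "buch" each appear in two different contexts. Real systems work on the same principle with millions of sentence pairs, which is why the size of the document collection matters so much.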

In 2007, the search company launched Google 411, a service that you could call to ask questions, such as the address of a business. The service would send you the requested information back in a text message. The purpose of 411 wasn’t so much for Google to provide you with a rather convoluted information delivery system, but more for the company to gather voice samples to use in building a better voice search system, in the same way that documents were used to build Translate’s database.

Och, in building his original translation system, used United Nations documents because there were millions of them, and they were all already translated into the U.N.’s six official languages. It was a treasure trove of information. Google 411 was a similar early attempt at building a database, and the effort bore fruit with the launch of voice-activated search on the iPhone in 2008. But it wasn’t exactly the same jackpot as the U.N. documents, largely because it wasn’t that useful to consumers.

Google Voice, including Gmail calling, is the next step in that process. Google will use the zillions of calls that go over its Voice service to build up its database, which will allow it to improve the accuracy of its voice search. As the Google Voice privacy policy states:

Google’s computers process the information in your messages for various purposes, including formatting and displaying the information to you, playing you your messages, backing up your messages, and other purposes relating to offering you Google Voice.

That “various purposes” clause is pretty nebulous and can certainly include research and development of search algorithms.

In other words, free phone calls are the jackpot that Google has been looking for. While Skype and phone companies continue to try to find a way to squeeze pennies out of phone calls, for Google it’s extremely valuable to give them away for nothing because it will help the company develop the next generation of search. After all, typing our searches into a web browser is far from the most efficient way of finding information. Saying what we want is much better, and it’s how we’ll primarily be searching in the not-so-distant future.

UPDATE: This post has been picked up by Gizmodo, and some reader comments there have provided some additional clarity on how Google Voice works. The calls made on Google Voice using a smartphone actually go over the cellphone carrier’s voice network, not over its data network as I wrote above. That’s a little different from Skype on a smartphone, which, as far as I know, uses only the data connection. Google Voice could theoretically use the data network, and I’m not really sure why it’s not doing so. In any event, how the service is conveyed doesn’t really make much of a difference in my search theory.

  1. August 26, 2010 at 11:33 am | #1

    “Saying what we want is much better…”

    Indeed.

  2. August 26, 2010 at 2:25 pm | #2

    There are actually two major components to a statistical machine translation system. The first is the language model, which uses large monolingual datasets. This move will clearly help Google improve their spoken language models. The other component is an alignment model, which maps words, phrases or even entire sentences from one language to another. This is why datasets were originally drawn from the UN proceedings. They are parallel documents in that each one contains roughly the same meaning expressed in different languages, allowing an alignment model to be built. It is less clear how this move increases Google’s ability to construct an alignment model, because there is no parallel text. In my view the alignment problem is far trickier. There are tons of sources of monolingual language data, less so for parallel datasets.

    David
