Voice is arguably as unique as our fingerprints. It is not only produced by the voice box, but is also shaped by the throat, tongue, lips, teeth, chest and head. Put that together with accents and dialects and you have quite a challenge understanding spoken language. Windows on the PC can already comprehend 8 languages, and now Windows on the mobile phone can too, or at least one of them anyway.
Recently the Live Search for Mobile team introduced voice input in their application for Windows Mobile. According to some early responses, it works remarkably well. Check out this quick demo by Microsoft’s speech-guru Rob Chambers.
Update: Interestingly enough, the speech recognition doesn’t actually occur on the mobile device itself. Instead, as explained on the Speech team blog, “the phone takes your speech input, sends it to a server, the server does it’s recognition magic, and sends the results back to the phone.”
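To make that round trip a little more concrete, here is a minimal sketch of the idea in Python. It is not how Live Search for Mobile is actually built: the function names, the fake "audio" keys and the lookup table are all made up for illustration, and the "network" is just a function call. It only shows the split of work, with the phone shipping audio off and the server doing the recognising.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the server-side recogniser. A real service would
# run acoustic and language models; this toy table just maps fake audio
# payloads to text so the flow can be followed end to end.
FAKE_SERVER_MODEL = {
    b"audio:pizza-hut": "Pizza Hut",
    b"audio:coffee": "coffee",
}


@dataclass
class RecognitionResult:
    text: str
    ok: bool


def server_recognize(audio: bytes) -> RecognitionResult:
    """Pretend server: turn uploaded audio into text, or report failure."""
    text = FAKE_SERVER_MODEL.get(audio)
    if text is None:
        return RecognitionResult(text="", ok=False)
    return RecognitionResult(text=text, ok=True)


def phone_voice_search(audio: bytes, send=server_recognize) -> str:
    """Pretend phone client: does no recognition itself.

    It only hands the audio to `send` (a network round trip in real life,
    a plain function call here) and displays whatever comes back.
    """
    result = send(audio)
    if not result.ok:
        return "Sorry, please try again."
    return f"Searching for: {result.text}"


if __name__ == "__main__":
    print(phone_voice_search(b"audio:pizza-hut"))
    print(phone_voice_search(b"audio:bus-chatter"))
```

Keeping the client this thin is one plausible reason a small mobile application can offer speech input at all: the heavy lifting lives on the server side of the `send` call.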
One of the people behind the technology and implementation is Oliver Scholz, Program Manager in the Speech Components Group, who not surprisingly also worked on the speech recognition technology in Windows Vista. I had the opportunity to ask him some quick questions.
When and why was the decision made to build voice recognition into Live Search for Mobile?
Is the technology based on the speech-recognition in Windows Vista?
What were some of the challenges of building such a complex system for a mobile device?
The other hard part was building a user experience that made sense and satisfied user needs.
Are there any major differences in quality compared to the PC?
Windows Vista accepts dictation, correction, and full commanding. Live Search for Mobile is limited to business names, categories, city, state, zip and full addresses.
Are there plans to enable it to work with other languages?
What’s the next step? More features? Better accuracy?
I find it interesting voice recognition is built into an application and not the mobile OS. Will Win Mobile 7 provide voice support on a platform level?
Whilst the technology is not there yet for dictation or system-wide speech recognition, imagine how much easier and safer a mobile phone could be hands-free when you’re driving. Late to a meeting? Dictate a message, and SMS it to someone without even touching a button.
This makes me wish my iPaq didn’t die. Good short interview and short video.
The only problem with that is you usually (well I do) have lots of background noise going on when using a phone. Say in the city, shopping centre, bus, train, car etc.
So it would need to be very good at detecting background noise and what you are actually saying.
Though that probably would be corrected via better quality microphones or by using array mics.
So will Vista SP1 improve accuracy?
Of course it’s better than pin-code security
Bimbo’s bitchin burrito, best company name EVER!
@Insomniac
Isn’t background noise generally eliminated at the microphone level on a cell phone? In which case, problem solved.
Very interesting topic!
@Steve,
you would think so.
I’m sure there are phones out there that do it.
But I’m just going by the experience I’ve had using voice commands on my Nokia mobiles in the past.
I have a Dopod 810x now and it works better for voice commands. But it still seems to pick up someone in the bus talking and will launch a command and ignore my command.
The application seems a little “light” to have a built in speech recognition feature.
Is the recognition done locally in the app, or remotely (at MS’s Live servers), or both (the client app strips the voice stream down to just the essential sound components, which then get processed further on the Live servers)?
I can’t get it to work on my Alltel 6700… any reason?
Forgot… I say “Pizza” or “Pizza Hut” and get “Sorry we were unable to complete your request. Please try again.”