EnterpriseMobileToday BlackBerryToday


Solutions: Continuous Dictation, Search on VoiceSignal’s Menu

By James Alan Miller
April 14, 2006

Chances are, if you're turning speech into text or controlling a function on your mobile handset without training, then the software you're using is from Woburn, Massachusetts-based VoiceSignal. The company's solutions are now in 55 million mobile phones and smartphones worldwide, and it owns 90 percent of the speaker-independent speech recognition space.

VoiceSignal has two things in the works: an update to an existing product, VoiceMode, that improves the conversion of speech into text, and a new platform that uses voice to perform searches on a mobile handset. As with all of VoiceSignal's products, the aim is to replace mobile handset keypads and keyboards with users’ voices to improve usability, convenience and even safety.

VoiceMode
VoiceMode will take a giant leap forward this summer, when original equipment manufacturers start shipping handsets with an update to the mobile dictation application. Unlike version 1.0, which required pauses between words, VoiceMode 2.0 enables users to speak naturally, according to VoiceSignal CEO Rich Geruson.

Geruson told PDAStreet the new edition delivers higher accuracy and quicker performance than VoiceMode 1.0. "VoiceMode 2.0 is faster, more natural for users, and leads to less errors," he explained.

VoiceMode 2.0's fully continuous dictation capability is "unconstrained, so you can basically say anything to the phone, and it'll print out in the display so you can send it off," he added. Say "laugh out loud" and it will even write "LOL."

NPD Group research director of mobile devices Neil Strother said, "Even for experienced texters, using the keypad is tedious and slow. Continuous dictation will be a major leap forward in mobile phone usability."

Unlike server-based solutions or PC-based speech-to-text, VoiceMode doesn't use a dictionary, enabling it to run in the memory-constrained environments of even mass-market phones.

It works by breaking speech down into a small number of phonemes (the basic building blocks of speech) and applying patterns of language to assemble words. So VoiceMode doesn't have to sort through a 50,000-odd-word dictionary.
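As a rough illustration only (the article does not describe VoiceMode's internals), the toy Python sketch below shows the general idea of composing text from a small phoneme inventory instead of looking words up in a large word list. The phoneme symbols, the `PHONEME_SPELLINGS` table and the `spell` function are all hypothetical, and real recognizers are far more sophisticated.

```python
# Toy illustration: build spellings from phonemes rather than a word list.
# The symbols and table below are hypothetical, not VoiceSignal's data.
PHONEME_SPELLINGS = {
    "K": "c",
    "AE": "a",
    "T": "t",
    "S": "s",
    "IH": "i",
    "P": "p",
    "N": "n",
}


def spell(phonemes):
    """Map each recognized phoneme to a letter pattern and join the results."""
    return "".join(PHONEME_SPELLINGS[p] for p in phonemes)


# Seven table entries are enough to spell several different words.
print(spell(["K", "AE", "T"]))  # cat
print(spell(["S", "IH", "T"]))  # sit
print(spell(["P", "AE", "N"]))  # pan
```

The point of the sketch is the memory trade-off: the lookup table grows with the number of phonemes, not with the number of words the user might say.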

Geruson said, "The key has always been for the mobile space to get it (the application) into a very small footprint and working with a very weak processor. And that's what makes us truly unique. Our stuff does what otherwise might take gigabytes and Pentium-class processors to handle."

He pointed out that a lot of brainpower went into developing VoiceMode 2.0. It wasn't something you'd create in your garage: VoiceSignal hired five string theorists and converted them into voice scientists.

When VoiceMode 2.0 does launch on a handset, it will be with an "extraordinarily large handset vendor in Europe," according to Geruson, with U.S. phones likely to follow shortly afterward.

Additional VoiceSignal speech recognition applications include VSuite for voice command and control, VSpeak (text-to-speech synthesis) and an upcoming search solution, demonstrated at CTIA last week.

VSearch
VoiceSignal's search product, VSearch, is not a search engine. Rather, it will allow carriers to provide voice-capable searches with any search solution they support, inside or outside their walled gardens.

Geruson gave the following example of how it might work:

You're in Palo Alto, CA, and you want to find a Chinese restaurant. You pick up your phone and say, "Search for Chinese restaurants in Palo Alto, CA."

Ten come up. The third one down is called Great Wall.

You say, "Call Great Wall."

Without you touching the phone keypad, it calls that restaurant.

And you could say, for example, "Get directions to Great Wall."

Turn-by-turn directions then appear on the screen.

If you had VoiceSignal's speech synthesis product, you could then have the phone read the directions back to you. And since most search engines support maps, you could view a map as well.
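The spoken commands in this example could, in principle, be routed to actions with a small set of patterns applied to the recognized text. The Python sketch below is only an assumption about how such routing might look; the article does not describe VSearch's actual grammar, and `COMMAND_PATTERNS` and `route_command` are hypothetical names.

```python
import re

# Hypothetical command patterns; the article does not describe VSearch's grammar.
COMMAND_PATTERNS = [
    ("search", re.compile(r"^search for (?P<query>.+?) in (?P<place>.+)$", re.IGNORECASE)),
    ("directions", re.compile(r"^get directions to (?P<name>.+)$", re.IGNORECASE)),
    ("call", re.compile(r"^call (?P<name>.+)$", re.IGNORECASE)),
]


def route_command(utterance):
    """Return (action, details) for already-recognized text, or ("unknown", {})."""
    text = utterance.strip().rstrip(".")
    for action, pattern in COMMAND_PATTERNS:
        match = pattern.match(text)
        if match:
            return action, match.groupdict()
    return "unknown", {}


# The three commands from the restaurant example.
for spoken in (
    "Search for Chinese restaurants in Palo Alto, CA",
    "Call Great Wall.",
    "Get directions to Great Wall.",
):
    print(route_command(spoken))
```

In this sketch the router only decides which action to take; the search itself would still be handled by whatever search solution the carrier supports.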

Geruson pointed out that VoiceSignal's platform would be good for many kinds of search, including content such as games, ringtones, wallpaper and music.

Unlike VoiceSignal's existing business, in which it earns licensing fees as an embedded software vendor, its search offering will be completely ad-driven. So in the restaurant search example, an ad would appear at the bottom of the list of Chinese restaurants.

The carriers, content providers and VoiceSignal would all earn revenue from advertising. To Geruson, mobile advertising is more targeted than any other type because carriers know more about the end user than any other delivery mechanism does, Internet, TV and print included.

He also asserted VoiceSignal's position as a middleware provider is an advantage, since it puts the company in a neutral position, something a phone manufacturer, a carrier or even a search provider couldn't claim.

Geruson says there has been a lot of interest in the company's speech-enabled VSearch solution. He went so far as to say it made jaws drop at Google during CTIA.

According to the CEO, VoiceSignal is currently working with at least a dozen content providers and carriers, big and small, in the mobile space to get this new technology out and into the hands of consumers.

Competition
While VoiceSignal is a dominant player in the mobile speech recognition market, others, such as desktop veteran Nuance, are set to move in on its turf. At CTIA last week, for example, Nuance showed off Dragon Mobile Dictation, its own continuous speech-to-text solution for mobile handsets.

Dragon Mobile Dictation is based on the well-known PC application Dragon NaturallySpeaking and, like VoiceSignal's VoiceMode, runs entirely on the mobile handset. Dragon Mobile Dictation should arrive first on a handset in the U.K. sometime this year. Nuance may unveil additional mobile speech-enabled applications later in 2006.

In the meantime, also at CTIA, Nuance announced a partnership with Gracenote to develop and market speech-enabled solutions for digital music collections on mobile devices. The offering provides a voice-activated interface to select a song, artist or playlist on MP3-enabled mobile devices.

This combined solution uses underlying speech capabilities from Nuance and Gracenote’s MediaVOCS phonetic database and playlist navigation to allow users, the companies say, to quickly and easily find and play the music stored on their mobile devices.




     