Voice Recognition: Speak And You Shall Be Recognized
Written by SymbianOne   
Sunday, 23 April 2006
Are all voice recognition systems equal? Are they just a gimmick and of no practical use? Richard Bloor takes a look at two solutions for Symbian OS, one from Nokia, the other from VoiceSignal and comes to a clear conclusion.

There was a time when every new mobile advertised its voice dialing capability. It was the feature to have. However, it has been some time since mobiles have made much noise about voice features. Beyond using this feature to switch Bluetooth on and off, I've never used it much on any S60 device (with the exception of the Sendo X). This is because the older system delivered by Nokia needed each contact to be trained before it could be voice dialed, hardly practical on a contacts list of several hundred.

The voice dial application, Voice Command, on the Nokia N90 appears to be a significant improvement over the earlier offering. Now, rather than training each contact, the system is speaker independent. This means that, in theory at least, you can simply activate the voice recognition and speak.

Voice Command on the N90 is activated from the camera button, with the flip in the closed or open "phone" position. A short tone is issued when recognition is activated and a "speak now" dialog is displayed; however, no indication is given of the commands you can provide. After you issue your voice command, the system plays a synthesized version of the matched contact or command and briefly displays the match before activating the command or dialing the contact.

This automatic action is slightly disconcerting. Should the recognition not work correctly, you have to be quick to prevent the system from dialing a wrong number. As voice recognition is likely to be used when you do not have quick access to the keypad, this is something of a limitation.

The voice commands can be tailored to provide access to profiles, voice mailbox, and any application on the system through the Voice Commands application in the tools folder. (In addition, Voice Command also uses the nickname set in the contacts records, allowing tailoring of a contact's recognition words.)

The Profiles option opens a separate folder, which provides a list of profiles.

For each profile and indeed each action, there is the option to change the command. This involves changing the text definition of the command to be recognized.

After the command has been changed it is replayed, using the voice synthesizer.

The application list is initially populated with just a few of the device applications, with Voice Mailbox, Bluetooth, Voice Recorder, and Contacts active. Any additional applications have to be added individually.

Finally, Voice Commands offers options to turn the synthesized voice confirmation off and reset all the personal adaptations.

Initial impressions of Voice Command were poor. In a quiet office environment, the first problem encountered was that Voice Command had difficulty distinguishing between contacts and commands. The request "Bluetooth" repeatedly resulted in Voice Command trying to dial a contact called "Ho Ho", while "Contacts" unfalteringly tried to call "Brenda Nash". Apart from the fairly obvious issue that these recognized contacts seem to bear little audible resemblance to the command spoken, this problem was fixed by using the command adaptations to change the recognized commands to "toggle Bluetooth" and "open contacts".

Name recognition was equally disappointing. From a random selection of ten contacts, no better than 50% were accurately recognized. Some names Voice Command confused every time, and no amount of careful, slow, or fast enunciation changed its inability to recognize them.

On a Nokia 6682, VSuite activates from the voice button on the left of the phone. The first noticeable difference with VoiceSignal's VSuite 2.0 is that it provides far more options for using contacts and activating commands.

VSuite starts with a splash screen before asking you to "say a command".

You immediately get four options, to prefix your command with the action you want to take: to call a contact, send them an SMS, open a contact's details or open an application. While VSuite listens to the command, the small ear icon flashes.

The on-screen guidance is helpful the first time you use VSuite, letting you know what can be done and how to do it. This is in marked contrast to Voice Command, which leaves you to guess what options you have. As these prompts are provided in parallel with the recognition process, they are not a hindrance once you have become familiar with VSuite's operation.

For the call, SMS and lookup options, VSuite provides a list of possible matches by default, starting with the best match. A voice synthesizer asks "Did you say" followed by the matched name. VSuite then gives you the option to confirm that this was the correct contact, reject it, cancel the recognition session, or repeat the name.

The voice-activated Yes, No, Cancel, or Repeat feature was a little disconcerting at first. The natural reaction, on seeing options on the screen, was to try to scroll to one of them rather than speaking the action required. Once you get used to it, this mechanism has another advantage. Unlike the Nokia application, once you have confirmed a contact, VSuite lists all the alternative numbers they can be contacted on, and again a voice command can be used to select the correct one.

This may seem like a long process if you knew you wanted to call Graham Trimmer at home. VoiceSignal's engineers obviously agreed, as you can avoid the step by simply saying "call Graham Trimmer home".
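The prefixed command form described above can be pictured as a small grammar: an action word, a contact name, and an optional number label. The following Python sketch is purely illustrative and works on already-recognized text. The names `parse_command`, `ACTIONS` and `LABELS`, along with the specific action words and labels, are assumptions made for the example, not VSuite's actual vocabulary or implementation.

```python
# Hypothetical action prefixes and number labels (assumptions for this sketch;
# VSuite's real command vocabulary is not documented in this review).
ACTIONS = ("call", "send sms", "look up", "open")
LABELS = ("home", "mobile", "work")


def parse_command(text):
    """Split recognized text into (action, target, label).

    Returns None when the text has no known action prefix or no target.
    label is None unless a call command ends with a known number label.
    """
    words = text.lower().split()
    for action in ACTIONS:
        prefix = action.split()
        if words[:len(prefix)] == prefix:
            rest = words[len(prefix):]
            break
    else:
        return None  # no known action prefix was spoken

    label = None
    # In this sketch only "call" accepts a trailing number label.
    if action == "call" and len(rest) > 1 and rest[-1] in LABELS:
        label = rest.pop()
    if not rest:
        return None  # an action with nothing to act on
    return action, " ".join(rest), label
```

With these assumptions, `parse_command("call Graham Trimmer home")` yields the action, the contact name, and the label in one step, while a bare "Bluetooth" without a prefix is rejected rather than being guessed at as a contact.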

Not all the numbers you might want to call are going to be in your contacts list. To handle that ad hoc dialing, VSuite includes a digit dialing feature. You simply need to say "call" and speak the number required.

The open application feature works differently from the contact-related functions. Given there is a relatively short list of possible matches, VSuite simply opens the application when the voice command has been recognized.

VSuite's options are built into the application and accessed by selecting the menu item "Settings", instead of speaking a voice command.

The choices list option controls the display of a list of likely matches for any contact-related command. It can be set to on, to off, or to activate only when VSuite determines that there is a reasonable chance the recognized speech could match two or more names. With the option off, VSuite behaves in the same way as the Nokia voice suite, immediately dialing the best-matching name. Unlike the Nokia suite, which only dials the contact's first listed number (or the default number if one has been set), VSuite still allows you to select a specific number by saying it after the contact's name.
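The three settings of the choices list can be sketched as a simple decision rule. This is a minimal illustration under stated assumptions, not VoiceSignal's implementation: the mode names, the numeric match scores, the `margin` threshold and the function name `should_show_choices` are all hypothetical.

```python
def should_show_choices(mode, scores, margin=0.1):
    """Decide whether to display the list of likely matches.

    mode   -- "on", "off" or "auto" (hypothetical names for the three settings)
    scores -- match scores sorted best first; higher means a better match
    margin -- assumed gap below which two candidates count as ambiguous
    """
    if mode == "on":
        return True
    if mode == "off":
        return False
    if mode != "auto":
        raise ValueError("mode must be 'on', 'off' or 'auto'")
    # Auto: show the list only when the runner-up is close to the best match.
    return len(scores) >= 2 and (scores[0] - scores[1]) < margin
```

In the automatic mode, a clear winner would be dialed directly, while two similar-sounding names would bring up the list for confirmation.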

The sensitivity option allows VSuite's margin of error on matches to be altered, to include or reject more options. In testing no significant need to alter this setting was identified.

Next is an option to customize and train digit dialing. The process of training digits (as the application notes) takes about a minute to complete and involves repeating 10 sets of digits.

In addition, VSuite can be set to identify the unique digit groupings used in various countries.

The sound options allow the prompts associated with saying the command and confirming digits and names to be turned on and off. In addition, you can also change the volume and speed at which names are read back.

VSuite automatically adds almost all of the device's applications to the list that can be opened by voice command, but any new applications added to the device have to be activated manually with the application launcher option.

Finally, VSuite allows the method it uses to keep in sync with the contacts folder to be set as automatic or manual.

VSuite clearly offers more features than Nokia's offering; however, the key question is how well the voice recognition works.

The first obvious recognition advantage VSuite has is its use of a preface command. This means there is almost no chance of a call to a contact being confused with a request to open an application. In fact, this error never occurred during my testing.

On my ten-contact test, VSuite managed better than 90% recognition in a quiet office environment. Even more impressive was that it coped well with the specific number selection even when it was made with some "natural language" such as "call Graham Trimmer's home" or "call Graham Trimmer at home". I also found no obvious contact or command that VSuite consistently failed to recognize.

Initially I found digit dialing to be unreliable, if only because the leading zero on long distance calls was often dropped. Training the digits seemed to improve the recognition. However, when I realized that my local number format was similar to that in the UK and set the digit dialing location to the UK, I found recognition was as good as that for names.

While the performance of the two voice systems in a quiet office is interesting, it is not really very representative of the types of environments where a user may wish to use this feature.

The most obvious place where voice dialing is of use is while driving. For this test I called on the services of a 10-year-old Land Rover Discovery Diesel. This is by no means a quiet vehicle (a considerable improvement on the Defender, cabin noise wise, but that is a different review).

I undertook two tests. The first was at 60 kph, but with the added distraction of the CD system playing. The second was at 100 kph, but with the CD off. These two sound files are what the phones heard during the tests.

Download the 60 kph Test Audio (AMR format) and 100 kph Test Audio (AMR format).

In both these tests the Nokia voice recognition was very poor. Trying to call the contact "Marylin De'ath" variously failed completely, opened the Web application or recognized "Joseph Keys" and no one else. During the same test VSuite only once completely failed to recognize "Marylin De'ath". The majority of attempts had "Marylin De'ath" as the most likely match although once or twice she was second in the list of possible matches.

Similarly, Nokia's Voice Command unfailingly recognized "Melinie Dodd" as "Ana Pinto". The same name seemed to challenge VSuite at normal speaking speed, but with a little more care its recognition improved dramatically, achieving close to one hundred percent. By contrast, no amount of care seemed to affect Voice Command's misguided attempts to connect me with "Ana Pinto".

While these two examples represent the worst performance shown by each application, they are representative of the difference in performance between the two systems. Voice Command was inclined to incorrectly recognize contacts or commands, while VSuite was inclined to correctly recognize voice commands.

Given that Voice Command failed to perform well in a quiet office it was perhaps unsurprising that, as in the car test, in all other environments it performed very much worse than VSuite.

I did discover one fascinating thing during the testing. Entirely by accident I managed to activate VSuite just as Voice Command's voice synthesizer was repeating a contact name, which VSuite recognized correctly. This little trick has proved impossible to repeat, as reproducing just the right timing seems unachievable. While it does illustrate that Voice Command's voice synthesis is good, it seems to say more about the quality of VSuite's recognition that it can accurately interpret a computer's attempt to reproduce the human voice.

The basic difference between VSuite and Voice Command is that VSuite is usable, out of the box, in demanding and not so demanding environments. VSuite even managed tolerably accurate recognition in environments where carrying out a phone conversation would be a challenge. It was not always perfect, but the worst misrecognitions ended up with the desired contact somewhere on the possible matches list.

By contrast, Voice Command's recognition was poor; even in the most undemanding environment it could not be described as reliable. In anything other than a quiet office its error rate became so bad that it was really unusable, and even in quiet environments its performance hardly impressed. It might be possible, using the command adaptations or by creating contact nicknames, to improve recognition for key commands and contacts. However, this somewhat defeats the purpose of speaker-independent voice recognition. Voice Command has only one obvious advantage over VSuite: the command "Bluetooth" toggles the Bluetooth connection on or off, rather than simply opening the Bluetooth application. This toggling is a handy feature if you regularly use Bluetooth gadgets, like a keyboard, as it is a quick and convenient way to turn Bluetooth on and off (when it works).

VoiceSignal has done an excellent job with VSuite, combining quality voice recognition with an intelligently designed recognition process. It is a practical solution for initiating phone calls and SMS messages or controlling an S60 device in a wide variety of environments. By contrast, the only charitable thing I can say about Voice Command is that it adds a tick to the N90's feature list.

Like so many technologies, voice dialing and commanding suffered from too much hype too early in its development. Expectations were created and not met. VSuite overcomes that early disappointment. I can't wait to get my hands on their dictation system, VoiceMode, and pretty much give up using the keypad altogether.


VSuite is shipped as a try-and-buy application on selected Nokia S60 devices. VoiceSignal has indicated that retail versions of VSuite will be available for both S60 and UIQ devices in the future. For more information see www.voicesignal.com.

Last Updated ( Friday, 28 April 2006 )
 


(c)2003 - 2008, SymbianOne - All rights reserved