Open the Pod Bay Door Please, HAL

  • December 26, 2000
  • By Scott Courtney

The enrollment process is a bit time-consuming but very straightforward. After you pass the audio level test, the User Guru jumps to a screen for picking one of several "stories" to read aloud. You spend a few minutes reading the text, and the User Guru automatically displays a new page when it detects that you've finished the current one. At the end of the text, the enrollment program begins to analyze the input. The analysis took about fifteen minutes on my machine, which has an AMD K6-III processor running at 400 MHz. During the analysis there is little to see, just a progress graph.

Now you are ready to dictate some text! Running the command 'vvstartdictation' from the shell prompt starts the Java runtime and loads the dictation main screen. Ensure that the microphone-enable button in the upper-left corner is active (green background) and start talking. There is a slight lag between speaking words and seeing them appear on the screen, but on my 400 MHz system the delay was sub-second most of the time. Occasionally the software gets a little behind, but then catches up suddenly with a burst of text.

Unlike VoiceType Dictation from the mid-1990s, ViaVoice is a continuous speech recognition system. That means you don't have to pause between words. Punctuation marks are spoken aloud, as are paragraph breaks and line breaks. You can backspace over the last word by saying 'scratch that', and you can turn off the microphone by saying 'go to sleep'.

To get a full list of the verbal commands that are available at any time, just ask aloud, "What can I say?" The window that pops up is dynamic; the list of commands changes depending on the current editing context. There are commands like 'move up N lines', 'select to end of paragraph', and 'delete this'. So it is possible not only to enter text verbally, but to edit it as well. The process is a little clumsy, in my opinion, compared to the unambiguousness of cursor keys and mouse. You do get used to it, though, and productivity improves with practice.
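
As a rough illustration of how a context-sensitive command list might work, the Python sketch below maps an editing context to the commands on offer and parses a parameterized command such as 'move up N lines'. The context names, the grouping of commands under each context, and the function names are all assumptions made for illustration; none of them are taken from ViaVoice itself, and the sketch assumes the recognizer hands back the number as digits.

```python
import re

# Hypothetical editing contexts. Only the command phrases come from the
# article; which context offers which command is an assumption.
COMMANDS_BY_CONTEXT = {
    "no_selection": ["move up N lines", "select to end of paragraph"],
    "selection": ["move up N lines", "delete this"],
}

# Matches utterances such as "move up 3 lines" or "move up 1 line".
MOVE_UP = re.compile(r"^move up (\d+) lines?$")


def what_can_i_say(context):
    """Return the commands offered in the given (hypothetical) context."""
    return COMMANDS_BY_CONTEXT.get(context, [])


def parse_move_up(utterance):
    """Return the line count from 'move up N lines', or None if no match."""
    match = MOVE_UP.match(utterance.strip().lower())
    return int(match.group(1)) if match else None
```

In this sketch an unknown context simply yields an empty list, and any utterance that is not a 'move up' command yields None, so the caller can fall through to ordinary dictation.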

Like all current speech recognizers, ViaVoice is far from perfect. Occasionally it will goof, and the best course of action is to make it learn from its mistake. Say "open correction window" aloud, and the correction dialog will appear. In this dialog, you can either choose a likely replacement from a list or enter your own spelling. You can also select multiple words (however mangled they may be) from the source text and then tell ViaVoice to add this phrase to its user-specific dictionary. User entries (both words and phrases) appear to take priority in the matching algorithm, which makes the program extremely accurate on proper names and technical terms if you take the time to add them to your dictionary.
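
The apparent priority of user entries can be pictured with a small Python sketch. This is not ViaVoice's actual matching algorithm, which the article does not describe; it only assumes that the recognizer produces scored candidate transcriptions and that a candidate found in the user dictionary wins over one that is not. The function name, the score values, and the data layout are hypothetical.

```python
def pick_candidate(candidates, user_dictionary):
    """Pick a transcription, preferring entries from the user dictionary.

    candidates: list of (text, score) pairs; a higher score is better.
    user_dictionary: set of lowercase words or phrases added by the user.
    Returns the chosen text, or None if there are no candidates.
    """
    if not candidates:
        return None
    # Rank by user-dictionary membership first, then by recognizer score.
    best = max(
        candidates,
        key=lambda pair: (pair[0].lower() in user_dictionary, pair[1]),
    )
    return best[0]
```

With candidates `[("core knee", 0.6), ("Courtney", 0.4)]`, the sketch picks "core knee" when the user dictionary is empty and "Courtney" once "courtney" has been added, which mirrors the behavior described above for proper names.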

