Giving Voice to Linux with ViaVoice
Open the Pod Bay Door Please, HAL

Scott Courtney
Tuesday, December 26, 2000 11:00:42 AM
Let's face it: Each and every one of us geeks, from the moment we first
saw 2001: A Space Odyssey in 1968, has dreamt of owning HAL9000. For
32 years we've crept steadily closer to the elusive goal of a self-aware
computer with virtually infinite capacity, and with which we can converse
as naturally as with any human. Admittedly, we'd like to leave out that
part about suffocating us in our sleep and locking us out of our vehicle
when we're a couple billion miles from home. But, after all, that was
just the beta version!
The phenomenal growth in processing power since then has, alas,
failed to bring us the magic of HAL9000. One of the most intractable
problems has turned out to be the recognition of human speech by a
computer. Merely increasing processing speed isn't enough, because
the fundamental difficulties lie in duplicating the contextual ways in
which human beings understand speech. Raymond Kurzweil has written an
overview article describing the problems, for those
who are interested.
Yet there has been much progress. Many of us
can remember the tantalizing glimpse of the future when IBM released
OS/2 Warp Version 4.0 with VoiceType Dictation integrated into the
Workplace Shell user interface. Warp 4 shipped in 1996, and indeed the full integration of
speech recognition with a commercial operating system is something
that has still not been achieved in Redmond. I was an OS/2 user
at the time, and like most others who had the necessary hardware I
couldn't wait to get my hands on VoiceType. It seemed too good to
be true.
To a large extent, unfortunately, it was. Not that it was a bad product -- it was quite advanced for its
day and did basically everything IBM claimed it could do. The integration
with Workplace Shell wasn't quite HAL9000 but it was as seamless as one
could hope to achieve without major redesign of applications and the
Workplace Shell itself. The problems were resource consumption--VoiceType
was, to put it mildly, a pig--and the need to extensively adapt one's
speech pattern to the software. To--use--VoiceType--you--had--to--put--a--break--between--each--word. It felt unnatural,
and for those of us who are fast touch-typists, it was actually less
productive than simply typing the text.
So what happened to VoiceType? Well, we OS/2 users trotted it out for
demonstration to our friends and co-workers, showing off its tight
integration with Workplace Shell along with its sheer gee-whiz factor.
This, we said, was proof that OS/2 was a superior operating system!
Then, as soon as we were alone with the computer again, we shut down
the speech option to save memory, and we reached for the keyboard so
we could get some real work done. VoiceType, as good as it was, remained
on the shelf as an amusing toy. As OS/2 Warp has faded from the
marketplace, VoiceType Dictation joins Workplace Shell in the fond
memory albums of its former users (many of whom now run Linux).
IBM has not, however, been idle in the years since Warp 4 was released.
Always known for its research-and-development leadership, IBM has spent
millions quietly developing and improving its speech recognition tools.
The latest embodiment of that technology is its ViaVoice Dictation
product. ViaVoice was originally only available for Windows, but a Linux
port was released late this summer. I was fortunate enough to receive
a review copy, and began my arduous but rewarding
journey from installation through to productive use.