Tried Vista speech recognition yet?

I had not, but gave it a try today. It is almost a hidden feature, but if you go into Control Panel and double-click Speech Recognition Options, it all starts happening.

I have a few advantages. My microphone is of high quality, I use an external mic preamp, and my office is relatively quiet. After half an hour or so going through the tutorial (which also does basic voice training), I was ready to go. I have no complaints about the ease of setup. Once speech recognition was enabled, I simply opened Word and started dictating.

There are two ways to look at speech recognition. You can consider it an accessibility feature for users who prefer not to type, for whatever reason. For example, RSI is a common problem for writers and computer programmers. Alternatively, you can consider it as a better way of entering text. After all, most of us can speak faster than we can type. Ideally I would like to use it to assist with transcribing interviews.

I had a simple question. Can I get a chunk of text into Word more quickly with speech recognition than with typing? To try this out, I used a few lines that are indelibly imprinted on my brain, since they make up the first stanza of Wordsworth’s poem Daffodils, which I learned at school.

The test

First, I typed it. It took me about 25 seconds, which means I type at over 50 wpm (about 70 wpm according to this test).

Next, I tried speech recognition. I tried it several times, to give it the best possible chance. I found I could do the initial text entry in around 15 seconds, but correcting errors took longer. The best I managed for the entire stanza was about a minute – more than twice as long as typing.
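To put those timings side by side, here is a rough sketch of the arithmetic. It uses only the three figures quoted above (25 seconds typing, 15 seconds dictating, about a minute in total); the function and variable names are my own, made up for illustration.

```python
# Timings reported above, in seconds. All names here are illustrative only.
TYPING_S = 25.0         # typing the text by hand
DICTATION_S = 15.0      # initial dictation, before any corrections
SPEECH_TOTAL_S = 60.0   # best total for dictation plus corrections


def correction_budget(typing_s: float, dictation_s: float) -> float:
    """Seconds available for corrections before dictation loses to typing."""
    return typing_s - dictation_s


def slowdown(speech_total_s: float, typing_s: float) -> float:
    """How many times longer the speech route took compared with typing."""
    return speech_total_s / typing_s


budget = correction_budget(TYPING_S, DICTATION_S)
spent = SPEECH_TOTAL_S - DICTATION_S
ratio = slowdown(SPEECH_TOTAL_S, TYPING_S)

print(f"Correction budget: {budget:.0f} s")
print(f"Correction time actually spent: {spent:.0f} s")
print(f"Speech took {ratio:.1f}x as long as typing")
```

In other words, dictation only wins if the corrections fit into roughly ten seconds, and in my best attempt they took around three quarters of a minute.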

The problem is that certain words and phrases seem to be difficult for speech recognition to get right, and correcting these takes so long that it wipes out any gains from the easy ones. In my case, the line that Vista struggled with most was this one:

That floats on high o’er dales and hills

As I repeated the experiment, I got different variations:

That floats on high powered tables and hills

That floats on high on the walls and hills

That floats on high ideals and hills

The speech engine will always try to make sense of what you say using its dictionary and who-knows-what clever algorithms, but this can work against you. In this case, it is really just the word “o’er” that trips up the engine. If I dictate instead:

That floats on high over dales and hills

it usually transcribes perfectly. Unfortunately, in trying to make sense of “o’er”, it usually messes up several other words as well.
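One way to see how far a single awkward word spreads is to count word-level edits between the line I spoke and each variant that came back. The sketch below is a plain word-based edit distance (insertions, deletions and substitutions each cost one). It says nothing about how Vista's engine works internally; the function name is hypothetical and the strings are just the lines quoted above.

```python
def word_edit_distance(reference: str, hypothesis: str) -> int:
    """Minimum number of word insertions, deletions and substitutions
    needed to turn the reference line into the hypothesis line."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference word
                curr[j - 1] + 1,     # insert a hypothesis word
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


spoken = "That floats on high o'er dales and hills"
variants = [
    "That floats on high powered tables and hills",
    "That floats on high on the walls and hills",
    "That floats on high ideals and hills",
]

for line in variants:
    print(word_edit_distance(spoken, line), line)
```

Each variant comes out at two or three word edits, even though only one of the eight spoken words was the troublemaker.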

Does this mean that a poem with elided text is just a difficult case? Possibly, but unfortunately technical writing seems to pose the same kinds of problems. Everything is fine for a line or two, and then a difficult word or phrase causes garbage to be inserted into your text. Speech correction in Vista is nicely implemented and works well, but it takes time.

Pros and cons

I don’t mean to put you off. I’m actually impressed with Vista’s speech recognition, though it is early days and I’m not sure how well it compares to alternatives like Dragon NaturallySpeaking. I could definitely get some work done, and considered as an accessibility feature, it seems pretty good. Unfortunately, it doesn’t seem quite good enough to be useful even to a proficient typist – at least, not without more time spent voice training and learning to get the best out of it.

One thought on “Tried Vista speech recognition yet?”

  1. I spent some months several years ago using Dragon NaturallySpeaking, rather than the keyboard, to write magazine articles. After a day or so it was exhibiting remarkable accuracy, although it did not prove noticeably faster than using the keyboard – probably because I can (almost) touch-type. However, what I did find, and what eventually stopped me using it, was that my writing style changed when moving between voice recognition and typing. Why this should be I don’t know – perhaps it’s down to differences in the way we express ourselves vocally and in print.