Input Devices Software Apple

Mac Version of NaturallySpeaking Launched

Posted by kdawson
from the listen-what-i-say dept.
WirePosted writes "MacSpeech, the leading supplier of speech recognition software for the Mac, has canned its long-running iListen product and has launched a Mac version of Dragon NaturallySpeaking, the top-selling Windows speech recognition product. MacSpeech had made a licensing agreement with Dragon's developer, Nuance Communications. The new product is said to reach 99% accuracy after 5 minutes of training."
  • by lhaeh (463179) on Wednesday January 16, 2008 @03:37AM (#22063972)
    The last time I tried using voice dictation was when I was running OS/2 Warp 4. Training took forever, and using it was nothing but an exercise in frustration, ending with me screaming at the bloody thing and then seeing neat, yet random, expletives on my screen. I later came across some budget software that required no training yet worked surprisingly well compared to the $400 packages made by the big boys. That software really showed what voice dictation should be like, if only it had been developed further.

    The training and accuracy seem like things that can be overcome, but I would really like to see a solution for things like punctuation and function keys, which don't come naturally with speaking. Instead of having to say "delete that" or "delete", it would be nice to have a button I could hold down while saying things I want interpreted as commands.
  • by LordLucless (582312) on Wednesday January 16, 2008 @04:01AM (#22064102)
    I used Dragon NaturallySpeaking for a while ages ago, and you could program it to respond to its name. Or rather, you set up a "start" sound that would activate the listening algorithm. I had mine set to respond to "computer", but "minion" would work just as well.

    I stopped using it after I accidentally left it on in training mode one day, while I was teaching it the word "bonza". The pet lorikeet outside my room made such a wide variety of noises that, from that time forth, it thought every word I said was bonza, and I couldn't be bothered retraining it - training time was more than 5 minutes back then.

    I was using it more for commands than for dictation, and it was good at that, but there was one major drawback, and that was background noise - especially loud background noise emitted by the computer itself. One of the things I wanted to do was to get the computer to start and stop playing music on command. Unfortunately, once the music was playing, you had to really yell for the computer to differentiate the command from the music.
  • by jimicus (737525) on Wednesday January 16, 2008 @04:14AM (#22064160)
    A few things worth noting about the technology:

    1. 99% accuracy rate is actually pretty bad in the real world. In a typical document, you might expect 12-15 words per line - so you have one error every 7 lines or so.
    2. A 99% accuracy rate is only achievable under ideal circumstances - i.e., using a top quality microphone hooked up to a good soundcard in an environment with very little background noise and no echo. Basically, circumstances you only get in a half-decent recording studio. In the real world, you seldom get this.
    3. Unless you happen to be blessed with amazing self-discipline (and/or can guarantee that nobody is going to approach you while you're working), you'll get back to work after a distraction and find yourself having to delete a conversation you just had with a co-worker.
    4. If you're in an open-plan office (that's probably about 99% of UK offices these days) your colleagues will not thank you for spending all day talking.
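    The error-rate arithmetic in point 1 can be checked with a quick sketch (the 13 words-per-line figure is an assumed midpoint of the 12-15 range above, not from the original):

    ```python
    # Rough estimate of how often a 99%-accurate recognizer errs in a document.
    accuracy = 0.99
    words_per_line = 13            # assumed midpoint of the 12-15 range
    error_rate = 1 - accuracy      # one word in a hundred comes out wrong

    words_per_error = 1 / error_rate              # ~100 words between errors
    lines_per_error = words_per_error / words_per_line

    print(f"~1 error every {words_per_error:.0f} words "
          f"(about every {lines_per_error:.1f} lines)")
    ```

    That works out to roughly one error every 7-8 lines, matching the claim.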
  • by Seumas (6865) on Wednesday January 16, 2008 @05:06AM (#22064408)
    I tried Dragon years ago, and after a couple of hours of training it still completely sucked. Same with IBM ViaVoice. Perhaps Google will help improve things with their GOOG-411 service, which they're using to build up a massive bank of phonetics. Otherwise, it seems like real speech recognition is never seriously going to get off the ground.
  • Accessibility (Score:4, Insightful)

    by Selanit (192811) on Wednesday January 16, 2008 @06:53AM (#22064900)
    Five minutes training for most people, but not everyone. My boss uses Dragon NaturallySpeaking, and it took him nearly two weeks to complete the five-minute training due to some complications.

    Namely, he's blind. He cannot read the training phrases off the screen, because he can't see them. Instead he had to have a screen reader (JAWS in this case) read the phrases aloud to him so that he could repeat them back. But of course, Dragon was not expecting to hear audio input from anything other than the user, so that confused things. There were problems even when using a headset. And since he can't use the program at all without having the screen reader running, it was pretty awful trying to get the training done. I'm not even sure how he finally managed it - I suspect he got a sighted friend to help. Thankfully the training files can be copied from one computer to another, so you don't need to retrain it on each installation.

    Once the training was finally finished, it worked well. He has poor fine motor control as a result of leukemia treatments - he can type, but only slowly and with a high error rate. His speech is slightly slurred as well, which reduces the accuracy of the transcription. Even so, the Dragon transcriptions are definitely better than manual typing. It's helped him a lot.

    I just wish that the Dragon programmers would come up with a more easily accessible training routine. There aren't a whole lot of users with the same disabilities as my boss, but for the few like him having good, well-trained dictation software is vital. With it, he can control his computer reasonably well, if rather more slowly than a sighted person with normal motor control. Without it, using the computer is basically impractical. When he can't use Dragon, sending a single rather short email can take upwards of an hour.
  • by Tibor the Hun (143056) on Wednesday January 16, 2008 @08:27AM (#22065548)
    This is fantastic news for those who need extra accessibility features.
    It may be fine for you or me to hit any key, but there are many other folks with various disabilities for whom such a task is not an easy one. So it may make more sense for them to use their voice and move on.

    If any of us were to lose fingers or hands in an accident, I bet we'd all be using something like Dragon to continue our work, rather than try to become a tap dancer.

    And let's not forget about accessibility in the workplace. This is great news for Mac shops, as now there is one less reason for having to support a rogue Windows machine...
  • by LMacG (118321) on Wednesday January 16, 2008 @09:56AM (#22066412) Journal
    > I tried Dragon years ago

    Yeah, software never gets better or anything. And faster processors and more memory surely couldn't help.
  • by esj at harvee (7456) on Wednesday January 16, 2008 @10:32AM (#22066898) Homepage
    Reading the comments, I see a bunch of TABs (temporarily able-bodied people) with no clue about being disabled, the speech recognition market, the history of the product, or how Nuance is probably hampered by management's attitude towards money and the history of the code base.

    For someone who's been disabled (temporarily or permanently), speech recognition means the difference between making a living (supporting oneself, a mortgage, a family, etc.) and sitting around on your ass in Section 8 housing on Social Security disability. Pain from RSI once made it extremely difficult for me to feed myself. When you've experienced that level of pain, disability, and the associated despair, you come to the attitude that anything giving a disabled person independence and an ability to make a living should be encouraged with all possible resources.

    Listening to someone dictating with speech recognition will drive you mad. You would have the same problem listening to a blind person's text-to-speech. But that's not the fault of speech recognition or text-to-speech; that's the fault of management not providing the disabled person with an acoustically isolated environment (i.e., a reasonable accommodation).

    Desktop speech recognition is a monopoly because speech recognition is extremely expensive and difficult to develop and the market is not large: it consists of lawyers, doctors, and the disabled. There is not enough money to support two or more companies developing desktop speech recognition applications.

    NaturallySpeaking is very buggy. There are bugs causing people problems that were first seen in NaturallySpeaking 5. These are not hidden or hard-to-find bugs, but they don't affect Nuance's ability to sell NaturallySpeaking, so there's no reason for Nuance to fix them except that they interfere with disabled users' use of many programs. If you are just dictating into Microsoft Word or DragonPad, you'll never notice. If you try to dictate into Thunderbird, Firefox, OpenOffice, etc., you're screwed. For example, I cannot dictate directly into Firefox for this comment; I have to use a workaround for dictation and then paste the result into the text box. This problem exists because Nuance management has a reputation for not making any change or adding any feature unless you can make a business case showing the revenue it will bring. This is not such a bad model, because it keeps Nuance profitable and the product available to people who truly need it (i.e., the disabled). The downside is that it leaves no room for changes necessary for the disabled.

    I've heard from people working inside Dragon that part of the problem is also the code base. It was written by a bunch of Ph.D.s who are really, really good at speech recognition but not so good at writing code. Also, in the last few years there has been huge turnover among the people working on the code, as NaturallySpeaking was sold first to L&H and then to Nuance. That kind of change alone will wreak havoc on a code base, as knowledge is lost and never really acquired by the new people. By the way, I have talked with some people from Nuance, and they are basically good people. They understand the needs of the handicapped, but they are constrained in what they can do for us by budget and resources.

    When people suggest open source speech recognition as an alternative, only a TAB would think it would work for the disabled. The recognition speed is significantly slower, the vocabularies are smaller, and these are really more projects to keep grad students busy than anything useful in the real world.

    The last problem with speech recognition sits in your lap if you are a manager of a software product or a developer. As far as I can tell, the number of applications that are speech recognition friendly is vanishingly small. It seems to me that software developers go out of their way to make software handicap hostile. It starts with the multiplatform GUI toolkits that do not
