Handhelds Hardware

Talking Palm

Isotopia writes: "This article from the NY Times is very cool. It's about this guy from IBM who was able to put voice recognition on his Palm III and it talks to him! It can remind him about meetings and it will tell him when his battery is getting low." I bet if you used this much, it would tell you how low the battery is -- frequently. That aside, it's amazing that IBM has been able to squeeze this onto a Palm.
This discussion has been archived. No new comments can be posted.

  • Windows Ce (Score:1, Interesting)

    by JohnHegarty ( 453016 )
    I take it something similar would be possible on Windows CE.

    Also, could this be used as a controller for a voice-controlled X11 system?
    • screw X11, think of the marketing opportunities for X10!

      It could be worked in subliminally, like this:

      "time for meeting [buy an X10-cam] with your boss"

      "loading zap!2000 [buy 2000 X10s put them everywhere]"

      "time for kinky [tape your babysitter] sex with your [keep an eye on her] mistress at the Ritz"

  • by Psarchasm ( 6377 ) on Saturday October 13, 2001 @08:40AM (#2423636) Homepage Journal
    Not that I am a huge fan of meetings or anything, but the last thing I want is more annoying handheld technology showing up in meetings.

    *pager*
    *cellfone*
    *palm*


    And now a frigging TALKING PALM? Then again...

    Eliza [tucows.com] + Talking Palm + Male Real Doll [realdoll.com] = no more meetings ever. Hmm....
  • This is what has been standing in the way of handheld devices. You need to be able to say, "New appointment with whoever, whenever" and it needs to be able to accurately record that. I couldn't care less about it talking back to you... but the input is what is important IMHO.
  • by digitalsushi ( 137809 ) <slashdot@digitalsushi.com> on Saturday October 13, 2001 @08:40AM (#2423641) Journal
    talk to the hand cause the palm aint liss'ning. oh wait, yeah it is. hey palm, wassup G

    yo my battery is audi 5000 aight peace out

    lates, palm

  • I give it a couple of months before Microsoft makes this for Pocket PCs, hypes it up, and takes credit for being "innovative".
  • But does it really talk back, or does it just play a chime (human voice, pre-recorded) when certain system conditions are triggered? Let's not get too excited ;-) though "talking computers" are going to be the next big thing in user interface... hrm... two years ago we heard a lot about MSFT doing work on voice recognition and such... what's happening on that front?
    • Actually I was thinking along similar lines. How the hell they got that antiquated Motorola 680x to even run the app in "demo" mode amazes me.

      I love Palms, I have three, but unfortunately I am setting my sights on an iPAQ. I hate WinCE as much as the next guy, but I will take the technology and the functionality of the new generation of PocketPC devices over anything PalmOS gewgaws are demonstrating.

      But man, can you imagine the possibilities something like this would open up for the vision impaired?
      • Actually, speech synthesis (not recognition) doesn't require much processing power at all. My old Commodore 64 which had a slower processor and less memory than a Palm ran synthesis packages like "Sam Sayit" just fine. (I'm not sure of the exact history, but I believe Sayit was very similar in design to the traditional Unix speech synthesizer "Rsynth", which is available for Linux if you'd like to try it.) This was real formant synthesis, not playback of prerecorded soundbites.

        Speech recognition is a much harder and computation intensive problem. Doing that on a Palm is the impressive feat.
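
To give a feel for why formant synthesis is cheap, here is a minimal Python sketch of the general technique: a train of pitch pulses fed through a cascade of second-order resonators, one per formant. It is not the code of any package named above. The sample rate, pitch, bandwidths, and the rough "ah" formant frequencies are assumed textbook-style values, and all function names are made up for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; an assumed rate, low enough for a small device


def resonator(signal, freq_hz, bandwidth_hz, rate=SAMPLE_RATE):
    """Pass a signal through one second-order digital resonator (one formant)."""
    t = 1.0 / rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalises the gain at 0 Hz to 1
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2  # two multiply-adds of feedback per sample
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(pitch_hz, duration_s, rate=SAMPLE_RATE):
    """Crude voicing source: one unit pulse per pitch period."""
    period = int(rate / pitch_hz)
    n = int(duration_s * rate)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def synth_vowel(formants, pitch_hz=110.0, duration_s=0.25):
    """Shape the pulse train with each (frequency, bandwidth) formant in turn."""
    signal = impulse_train(pitch_hz, duration_s)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz)
    peak = max(abs(s) for s in signal) or 1.0
    return [s / peak for s in signal]  # normalise to the range -1..1


# Approximate formants for an "ah" vowel (assumed illustrative values).
VOWEL_AH = [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)]
samples = synth_vowel(VOWEL_AH)
print(len(samples), "samples generated")
```

Each output sample costs only a handful of multiplications per formant, which is why this kind of synthesis fits on very small processors, while recognition needs far more work per sample.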

    • A proper voice recognition system should be able to understand any words in the English language... the chances are this system is simply used to control a few Palm commands and therefore the incoming speech patterns only need to be compared to a few stored patterns. Then a system of pre-synthesising the outgoing speech would reduce further the demands on the CPU but use more disk. I have my Pentium 75 talking to me using the University of Edinburgh's Festival [ed.ac.uk] system on Linux by pre-synthesising the most important words.

      By the way, the festival system is excellent and takes under ten minutes to download, compile and install!
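
The "compare incoming speech to a few stored patterns" idea in this comment can be sketched with dynamic time warping (DTW), a classic way to match an utterance against templates spoken at a different speed. This is only an illustration in Python: nothing in the article says the IBM system works this way, and the one-dimensional "features" and command words below are invented toy data.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    rows, cols = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i items of a with the first j of b
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[rows][cols]


def recognize(utterance, templates):
    """Return the command word whose stored template is closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Hypothetical stored patterns, one per command (toy numbers, not real audio features).
TEMPLATES = {
    "memo": [1, 3, 5, 3, 1],
    "calendar": [5, 5, 2, 2, 5, 5],
    "battery": [2, 4, 2, 4, 2, 4],
}

heard = [1, 1, 3, 5, 5, 3, 1]  # "memo" spoken more slowly
print(recognize(heard, TEMPLATES))  # memo
```

The cost grows with the number of templates times the product of the sequence lengths, so a handful of commands stays cheap while an open vocabulary does not.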

      • A proper voice recognition system should be able to understand any words in the English language.

        Can't be done with any computer today, and certainly not on a Palm. Human language depends too much on context and general background knowledge. There's a (possibly apocryphal) story about how the speech recognition group at Microsoft is nicknamed the "wreck a nice beach" group. Say it out loud fast to understand why.

        Picking out individual words in speech streams is hard. When you know a language it sounds to you like there are distinct gaps between words, but if you look on an oscilloscope, there aren't any. Think about times you've heard people speaking a language you didn't understand... it all sounded like a continuous stream, didn't it? You couldn't pick out any individual words, right? Even humans need training and context to understand speech, and even humans get it wrong sometimes. As I said, no computer today has the necessary processing power and knowledge to do so.

        On the other hand, in limited domains (e.g. words specified in advance, and/or voice specified in advance), quite a bit can be done. I can see how for the limited set of tasks a Palm is typically called on to perform, it might be effective. But I noted in the article that they had to add a co-processor. No 16-MHz Motorola 68000 has a chance in hell of running any useful speech recognition program.

  • damnit (Score:1, Informative)

    by Anonymous Coward
    This is annoying, I don't have an account with the nytimes and people who post these threads never seem to give us a link that works without an nytimes login. I seem to recall something like 'archives' in the url to get around this. Is this possible, and if so, can you people please start linking in this fashion to this news source.

    thank you.
  • Better Uses (Score:5, Insightful)

    by under_score ( 65824 ) <mishkin@NOsPam.berteig.com> on Saturday October 13, 2001 @08:47AM (#2423659) Homepage
    Personally, I think this sort of tech is better used in cell phones. A device which already has a decent text input system is probably only made more clumsy by including speech recognition and text-to-speech capabilities. Why? Because it "requires" switching modes of interfacing with the device, which is something humans don't tend to like. Rather, most people will choose one mode and stick with it. And, be honest now, you can guess which mode that will be: stylus or keyboard. On the other hand, in cell phones, the vastly predominant mode is already voice and hearing oriented. It would be really nice to be able to get rid of the keypad (or at least severely reduce its usage). Other reasons cell phones are a better place for this tech: when you listen to a cell phone, what you hear is private. Cellphones cannot speak at you: they ring first. Two different rings would be sufficient to distinguish between a person calling and the cell phone telling you something.
    • Actually, I think it's better when there is more than one way of doing things. It means that the user can pick and choose the best way to do something. For example: mouse gestures. Some people might like them because they are very fast and don't require you to move the mouse very far. Then again, some people might like context menus better. You don't always have to switch back and forth between modes, you just stick to what works for you. And what works for you might not work as well for someone else.
    • True, cell phones need this too. But Palms need it as well. Try writing an email by hand on a piece of paper, then try writing on the Palm... it will take you 2-3x longer on the Palm. I'd much rather speak it (assuming I'm in my office and not bothering everyone around me).
      • Hmm, it looks to me like you aren't fast with the Palm's handwriting recognition. I can write as fast on a Palm as I do "by hand" on paper. I've met people who are surprised to see how fast I write on my device.
  • AFAIK there is similar software that will run on Linux, and someone is running it on the iPAQ too. Hence you could run it on any Linux device, I guess.

  • by Jenova_Six ( 166461 ) on Saturday October 13, 2001 @09:07AM (#2423719)
    I have a new HP Jornada 567 (Pocket PC 2002), and one of the applications that comes with it is Mobile Conversay (www.conversay.com). It allows me to talk to my Jornada, and allows it to respond in a computerized voice. I can make inquiries (what time is it? how much battery is left?) and it will speak the response. I can tell it to launch or close any program I have installed. It also comes with Voice Calendar, which allows the Jornada to navigate my calendar and read my appointments to me. Very cool. There are other modules in the works, like Voice Tasks, Voice Contacts, and Voice Notes that should be available for download soon. Overall, it works pretty well.

    IBM ViaVoice is supposed to have similar software bundled with the new iPAQ 3700 and 3800 series, but since those won't ship until November, I haven't had a chance to play with it.

    Also, there has been a voice-controlled Contacts lookup program on the Pocket PC for a while (too lazy to look up the link), as well as software that will read the time to you at regular intervals and when you turn the device on (TimeTalk).

    I'm not trying to discount what's being done here on Palm (in fact, it's amazing they got it to work given the anemic processing power in Palms), but I wanted to mention that a lot of this functionality is available on Pocket PCs here and now.

    Jenova_Six

    • Cool, I think. The question is, do you use it consistently? Is it so reliable and transparent that you prefer it to the stylus? Do you feel silly talking to a computer? (I do. I've been a Trekkie since forever, and I still can't get used to the idea.)
  • I attended a conference that IBM was at, and saw their software running on a WinCE palmtop. It's essentially the same engine as their ViaVoice engine, just stripped down a bit.
    They said that to get it to work on a Palm, they essentially built a small voice-recognition computer into an add-on module and interfaced it with the Palm through the serial port. I'm not sure if that's what they're talking about in this article. In theory, this little doohickey can run alongside any computer with a communications port, big or little.

    The ViaVoice people had a Linux desktop running the software also, and IBM also had rack-mount Linux servers on display. They even gave out neat Penguin lapel pins!

  • I can just imagine my Palm talking to me now:

    "Warning, battery life is a 1 perceeeent"

    Thus, I now know for sure that my Palm has died by its own use and was kind enough to let me know about it.

  • How will this affect one's surfing for porn?

    will it announce out loud how little she's wearing? :)
  • by drinkypoo ( 153816 ) <drink@hyperlogos.org> on Saturday October 13, 2001 @09:24AM (#2423762) Homepage Journal
    That aside, it's amazing that IBM has been able to squeeze this onto a Palm.

    They didn't. They made the palm bigger by adding at least a mic, speaker, and an additional processor to it. The first two are par for this course, though the handspring visor at least has a mic built in. The third makes this into a pretty basic accomplishment for someone with IBM's resources, especially if that CPU has more RAM attached to it, or embedded in it.

    All I really want is a speech recognition module for visor. I don't want my palm to talk to me, one of the nice things about a handheld is that only I can tell what's going on on it. The visor already has a mic built in, so now I just need the speech recognition hardware/software in a handspring module.

    • How do you know that this was done using IBM's vast resources? All accounts have this as "a guy" from IBM.
      • How do you know that this was done using IBM's vast resources? All accounts have this as "a guy" from IBM.

        Well, this is how I know: (from the article)

        Mr. Comerford is not losing his mind. His organizer can actually recognize his speech in addition to uttering sentences itself. It is one of nearly 100 Palm organizers that I.B.M. Research has reconfigured in an attempt to create a speech system small enough to reside on a hand-held computer.

        I know because I read the article. Whoever modded you up obviously didn't.

  • by nyquist_theorem ( 262542 ) <mbelleghem@nOspAm.gmail.com> on Saturday October 13, 2001 @09:49AM (#2423809) Homepage
    I can't see this being a big hit. My experience with voice recognition software, even on fast computers, has been that without a good microphone and very little background noise, recognition is horrific. Most of the world is, unfortunately, rather noisy and as such muttering into a palm pilot is going to produce very little workable speech - and yelling into a palm pilot is likely to get one arrested for being a freak.

    Worse - imagine sitting in a boardroom meeting.
    CEO: "well, gang, sales results are up for this quarter!"

    Fifteen cronies all mutter into their palm pilots in unison - "well comma gang comma sales results are up for this quarter exclamation mark new sentence" except for the one poor sap who accidentally brushed his thumb across the front panel of the palm while dictating, and is madly muttering "begin edit delete r-e-s-u-l-t-s-delete-s end edit". Just what the world needs - longer meetings.

    Or a girl gives you her number at a bar, and you proceed to yell it into your palm pilot - is that cool? What about those of us who love using our palm pilots while in the bathroom? Imagine wandering into a public bathroom with geeks muttering in every stall? The kind of stuff I wake up in a cold sweat in the middle of the night having nightmares about, I tell you. Even grocery stores would produce entries like this:

    TODO LIST: Don't forget attention shoppers to get sale on meatloaf a gift in aisle for mom seven

    I can't see it being too useful.

  • New To-Do (Score:5, Funny)

    by KFury ( 19522 ) on Saturday October 13, 2001 @09:50AM (#2423812) Homepage
    "Palm, record new to-do item."

    "Ready"

    "Remember not to refer to boss as 'dickhead' when talking to you. End recording."

    "Note saved."

    (later) *Bling,bling* "Reminder: Weekly jerkoff meeting with Dickhead in 10 minutes."

    "Um, I thought I told you we bumped that meeting up... Now please apologize to Mr. Cooper."
    • Easily solved -- some sort of daemon for the palm that can perform housekeeping functions in the background... like replacing 'dickhead' with 'mr. cooper.' Why should you have to remember such a trivial thing? Let the computer handle that mundane crap.
  • by Ledfoot ( 75412 ) on Saturday October 13, 2001 @10:01AM (#2423845)
    Saw and had several conversations with this person at an IBM-only conference up in Vancouver earlier this year. It's actually just a proof of concept to show off some cool uses of voice rec/synth technology.

    It was a standard Palm III that had a snap-on module with its own processor. It ran off special batteries that only lasted for like 2 hours. Not really something ready for prime-time.

    HOWEVER - he was doing some REALLY cool things with it. They have several languages in it. As a result, one of the applications was a basic language translator. He spoke in English, out came Japanese. He Graffiti'ed in English, out came German speech.

    He was able to speak to create memos, appointments, to-dos, etc. It would also read those back to him.

    While I'm not allowed (damn NDA!) to discuss the future plans that they have, suffice it to say that this is just the first step. If they get the funding to take his vision to reality, I'm DEFINITELY ditching my old Palm for a new IBM unit someday.

    Also, all those IBM commercials showing really weird stuff (like the coke machine that dispenses when you use your cell phone, or the guy trading stocks in the middle of that park using the head-mounted monocle display) - that's all REAL stuff that they actually DO have working today as prototypes.

    God I wish we could fast forward 3 years.... :-)
  • by d5w ( 513456 ) on Saturday October 13, 2001 @10:19AM (#2423894)
    I used to work on speech recognition, for both large and small systems. Variations on this have been done for a while at a number of places. Small vocabularies are easier to deal with, but if you're dealing with more than a tiny vocabulary, there are a bunch of interesting problems, some of which are specific to handhelds.

    Processing power: this is a nuisance. It's not that you can't get enough processing power into a handheld or cellphone these days, but:

    • You can't get the resources you can on a desktop, which means you're likely to do worse on large vocabulary tasks than desktop products.
    • The cost of the processing power makes it hard to put the speech recognition where you really want it. (Someone else mentioned cell phones: in the U.S., at least, all but the highest-end cell phone hardware is extremely cost-sensitive, since it winds up being subsidized by service providers. Does higher-end speech recognition offer enough value to offset the added hardware cost?)

    User expectations (a.k.a., the Star Trek problem, a.k.a., even that clunker without circuit breakers that Kirk talked to could always understand him perfectly): This is a general speech-recognition problem, but it gets more intense the more mass-market you go. Palm pilots are largely successful because they don't try to do too much, but do what they do well. It's hard to set that kind of expectation reasonably for nontrivial speech recognition. Even worse, I think that people are actually more demanding of a self-contained special-purpose device (with more limited resources, as above) than they are of general PC software.

    User interface design: this is still a largely unsolved problem; how do you really want to interact with a PDA by voice? It's hard to arrange a device so you can look at it and be close to the microphone at the same time, which complicates the picture. Dragon Systems [dragonsys.com] back in their pre-acquisition days sold a product called "Dragon NaturallySpeaking Mobile Organizer" [dragonsystem.com] that was an interesting step along the way. They didn't put the speech recognition into the handheld -- speech was recorded into a handheld recorder, recognized on a PC and synched up with PDA later -- but the product did attempt to deal with the interface questions of large-vocabulary PDA-based speech recognition; e.g., when you say something, is it intended for your calendar, your email, or your address book? How many variations of "next Tuesday" can the device understand? The general interface problem, once everything's in the same device, is still open and interesting.
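
To make the "next Tuesday" question concrete, here is a small Python sketch of one way a calendar front end could resolve such phrases. It is a toy, not how the Dragon product worked: the function name, the word list, and the `next_skips_a_week` policy flag are all hypothetical, and it only handles plain weekday names with an optional "next".

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_weekday(phrase, today, next_skips_a_week=False):
    """Map a phrase such as 'tuesday' or 'next Tuesday' to a calendar date.

    Returns None when the phrase contains no weekday name.
    next_skips_a_week selects between the two common readings of 'next':
    False -> the first such day after today; True -> one week later than that.
    """
    words = phrase.lower().replace(",", " ").split()
    names = [w for w in words if w in WEEKDAYS]
    if not names:
        return None
    target = WEEKDAYS.index(names[0])
    # Days until the first matching weekday strictly after today (1..7).
    days_ahead = (target - today.weekday()) % 7 or 7
    if "next" in words and next_skips_a_week:
        days_ahead += 7
    return today + timedelta(days=days_ahead)


today = date(2001, 10, 13)  # a Saturday
print(resolve_weekday("tuesday", today))                               # 2001-10-16
print(resolve_weekday("next Tuesday", today))                          # 2001-10-16
print(resolve_weekday("next Tuesday", today, next_skips_a_week=True))  # 2001-10-23
print(resolve_weekday("lunch tomorrow", today))                        # None
```

Even this tiny case needs a policy decision about what "next" means, and it still ignores phrasings like "a week from Tuesday" or "Tuesday after next", which is the interface problem in miniature.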

    • ...you can send voice to the base station to recognize and feed ASCII text back into the phone via SMS. Eh?
      • Yes and no. There are a few tradeoffs there; some have workarounds, but the workarounds have costs.
        • One is that the voice signal quality cellphones support is nowhere near as good as you can get with a good local microphone and signal processing. That will degrade the speech recognition quality and drive up the computational cost.
        • Another is that it's reasonably easy to personalize the speech recognition system in a PDA, where so much of the system is intended to be personalized; it takes more server-side support to personalize the speech recognition system for every client of a cell-phone service.
        • In the PDA/phone realm there's also the question of whether you want your PDA's input functionality to depend on whether you have a cell tower in range.
  • From the article: "He said that the protoype was based on three decades of work by I.B.M. scientists to create increasingly nimble programming codes for speech systems. Faster and smaller processors also helped, he said, as did a few improvements in hardware, like adding a speaker, a microphone and an additional processor to the Palm." (emphasis mine) And take a look at the picture [nytimes.com]

    Not that this is not a remarkable achievement -- it is, and certainly a precursor to ubiquitous handheld devices with voice recognition -- but it isn't really a Palm. It is a palm-sized device based on the Palm that can talk.

    PS: As I spellchecked my post, I realized the NYT wrote "protoype." Go figure.
  • by The Jake ( 233010 ) on Saturday October 13, 2001 @10:31AM (#2423932)
    My Newton MessagePad 2000 (upgraded to 2100) has been talking for years. Apple wrote a Macintalk extension years ago, which was never released. It was leaked however, and is now widely available.

    Furthermore, just recently, an old Dragon Dictate demo for the Newton has been found and released. While the Newton's vocabulary is limited, this is true voice recognition nonetheless.

    I dislike Apple Computer in general, and the fact that they discontinued the Newton didn't help my opinion. Nonetheless, I still feel the Newton MP2.1k is the greatest PDA available, even today. Unfortunate that Apple no longer makes the best product they've ever produced.
  • by Bowie J. Poag ( 16898 ) on Saturday October 13, 2001 @11:21AM (#2424123) Homepage


    Voice synthesis (I dunno about voice analysis, however) has been around since the early 1960's. A few years ago, I picked up a CD called "Computer Music Currents, Vol. 13 : A History Of Digital Sound Synthesis" published by a German outfit called Wergo. It contained nothing but rare, early recordings of engineers trying to produce music with computers, with some attempts going back to the late 1950's.

    Anyway, this CD came with a booklet, and an interesting story. There's a famous scene in 2001: A Space Odyssey where HAL offers to sing "Daisy, Daisy, A Bicycle Built For Two" as he's dying. Arthur C. Clarke once visited AT&T Bell Labs in New Jersey in 1962 where he saw a demonstration of a "singing computer", in the form of an IBM 7094 mainframe with voice synthesis capabilities. The engineers had taught the machine how to play the song, and then superimpose a synthesized voice on top of it, in realtime. It impressed him (or scared the shit out of him) enough that he chose to write it into the story, and what later became the film.

    All of this was done under 128K of RAM, top to bottom.

    The story also has an interesting anecdote about how many punched cards it took to pull it off -- something like 28,000 paper punch cards if I remember correctly. The engineers (one of whom later turned out to be my C and x86 Assembly instructor in college) remembered there was some concern about how to transport them, that putting them in the back seat of a Volkswagen would crush the axles. Heheheh..

    Cheers,

    • 28,000 punch cards is only 14 card boxes. (Cards came in boxes of 2,000, about 2 feet long.) This is pretty big for a program, I think my longest COBOL program was a little over a box, we often had data that was more than this. I easily carried 20 or more boxes in the trunk of a Datsun 510. Maybe VWs have weak axles.
  • I want a HAL9000 version. When you go to turn it off, it says "I can't allow you to do that Dave". When it gets low on power, it starts to sing "Daisy" slower and slower. Of course, it has to include an option to get rid of the boring morons in the staff meetings. I can't wait!!! :)
  • Here is a video showing a pocket PC running speech recognition. Really cool.

    http://research.microsoft.com/srg/videos/MIPADDemo_4min_300k.wmv
  • by nick_davison ( 217681 ) on Saturday October 13, 2001 @04:55PM (#2425132)
    It's juvenile, but I couldn't resist the image of a talking 'palm':


    Dave?

    What are you doing Dave?

    I can't let you do that Dave.

    Not again Dave!

    It's only been fifteen minutes since the last time Dave.

    You know it makes me feel dirty Dave.

    You could at least wash me afterwards Dave.

    Can't you just get a girlfriend instead Dave?
  • This is a day of sadness for geeks worldwide, as now even their palm can say no to sex.

    Boy, that was a lame joke.. but I just couldn't resist.
  • This seems to be quite a cool feature. Although if this technology starts being used in every Palm, once the novelty wears off I can see it becoming annoying. I already find it bad enough with people shouting down their mobiles on public transport, but to have their Palms giving out battery warnings as well will ensure that a peaceful journey on a bus/train is a thing of the past.
  • This feature is really cool. I ask my palm "would you add this phone number to my list of contacts?" and it asks me "would you like a piece of toast?"
  • This PDF file has some more technical background: www.research.ibm.com/people/r/rameshg/comerford-icassp2001.pdf [ibm.com]

    [Disclaimer: I was one of the contractors on the IBM Personal Speech Assistant project; my name is in the acknowledgements in that document.]

  • During a year out before university in the UK last year I worked for IBM and actually demoed this at several IBM events. Not only that, but one of my co-workers developed an app based on ViaVoice for the iPAQ which didn't require any additional processors, etc. If any of you have any questions, please email them to me. As a matter of interest, it includes a couple of easter eggs: when you ask it to "open the pod bay doors, HAL" it responds with a sound bite from 2001: A Space Odyssey.
  • Hey, I'm a software developer. What am I going to do with a girlfriend? But a Talking Palm... now THAT's cool!
  • I have seen many devices like this for the blind... usually costing up to multiple thousands of dollars. Each unit that I have seen is the size of an old-school tape recorder. If this could be reliably introduced to the market by IBM or Palm, I could see a huge market for these devices.

    --Turvey
