Text-to-Speech on a Low-Power Chip

Posted by michael on Wednesday November 07, 2001 @04:49PM from the youve-got-voicemail dept.

bluephone writes: "The EE Times has a story on a new chip from Winbond that can take ASCII or UNICODE text and convert it to either spoken English or Mandarin (the Chinese language, not the orange). The low-power chip scans the text and translates it into spoken phenomes and outputs it to a filter for smooth analog sound, or can directly output the digital signal. Imagine a cell phone with this, you can have your email read to you, rather than seeing a line at a time on a dinky screen, street directions from a website, or even Slashdot's headlines. :)"

This discussion has been archived. No new comments can be posted.

Done that (Score:4, Interesting)

by beretboy ( 221801 ) writes: on Wednesday November 07, 2001 @04:51PM (#2534338)

I have a little bot written in Perl and VXML that reads my email. It is far easier than making the phone do the processing, and it's free. See studio.tellme.com

- Re:Done that (Score:1)
  
  by beretboy ( 221801 ) writes:
  
  And making it read headlines would be trivial
- - Re:Done that (Score:1)
    
    by beretboy ( 221801 ) writes:
    
    If you had actually LOOKed at my reference to studio.tellme.com you would realize that this will run ON ANY PHONE through an 800 number!
Dr. Sbaitso (Score:1)

by ChazeFroy ( 51595 ) writes:

Anybody remember Dr. Sbaitso? This program was great for being written way back when (1994?).

I believe Radiohead used it as the voice for their track "Fitter Happier" on OK Computer.
- Re:Dr. Sbaitso (Score:1)
  
  by Scrameustache ( 459504 ) writes:
  
  I'm pretty sure that OK Computer was MacIntalk.
  
  (The bundled voices for text to speech on MacOS 8+)
  - Re:Dr. Sbaitso (Score:4, Funny)
    
    by EvlPenguin ( 168738 ) writes: on Wednesday November 07, 2001 @05:49PM (#2534665) Homepage
    
    MacIntalk is older than that, and quite frankly, it rocks. Man or Astroman [astroman.com] (one of the greatest bands ever -- especially live) use it as their lead singer. Fred really can sing.
    
    In other news, "Man or Astroman wants all the party people.. to say.... yeeeaaaaahhhhhhhhhhh"
    
    And by the way, the voice on "Fitter, Happier" (Radiohead) was actually Thom during an especially intense episode of inebriation >:P
    
- Re:Dr. Sbaitso (Score:1)
  
  by CrazyBrett ( 233858 ) writes:
  
  Yes! Dr. Sbaitso ruled! I'll bet many people here remember it. But...
  
  How many people know where the name Dr. Sbaitso actually came from?
  - Re:Dr. Sbaitso (Score:2)
    
    by Genom ( 3868 ) writes:
    
    sb = Sound Blaster
    
    ai = "artificial intelligence"
    
    tso = ???
    
    text-to-speech output?
  - Re:Dr. Sbaitso (Score:3, Informative)
    
    by generic-man ( 33649 ) writes:
    
    All you had to do was ask. Sound Blaster Acting Intelligent Text-to-Speech Operator.
- Re:Dr. Sbaitso - no it's bruce (Score:1)
  
  by Radiohead ( 86586 ) writes:
  
  Actually the "fitter happier" voice is the Bruce MacInTalk Pro voice that is part of MacOS. You can paste the lyrics into SimpleText on a Mac and get your very own live performance.
  
  The lyrics are here [followmearound.com]
Mac OS has that (Score:1)

by homb ( 82455 ) writes:

MacOS has had that for a while. It works ok. In fact, by default in OS 10.1 it speaks modal error dialogs. It surprised me the first time this happened.
- Re:Mac OS has that (Score:3, Funny)
  
  by SnapShot ( 171582 ) writes:
  
  My Amiga was talking to me 15 years ago.
  
  Actually, my Timex Sinclair 1000 was talking to me 20 years ago, but I think that was the acid...
  - Re:Mac OS has that (Score:2, Insightful)
    
    by SnapShot ( 171582 ) writes:
    
    Actually, on a more serious note, is there anyone working on an open source speech synthesis project?
    - Re:Mac OS has that (Score:3, Informative)
      
      by cduffy ( 652 ) writes:
      
      Actually, on a more serious note, is there anyone working on an open source speech synthesis project?
      
      Yup; it's called Festival [ed.ac.uk].
    - Re:Mac OS has that (Score:2, Informative)
      
      by Chakat ( 320875 ) writes:
      
      Yep, it's called Festival [ed.ac.uk], and the results are pretty decent. Became free as in speech a couple minor versions back, too.
  - Commodore 64 has that! (Score:2)
    
    by bgarcia ( 33222 ) writes:
    
    My Amiga was talking to me 15 years ago.
    
    I can't remember the name, but I had a program for my Commodore 64 that did text to voice.
    Does anyone remember the name of this program? I think it was something like "Simon Says".
    - Re:Commodore 64 has that! (Score:2)
      
      by Emil Brink ( 69213 ) writes:
      
      I'm pretty sure the name was "SAM". I found this page [tripod.com] about a C64 speech synthesis program named SAM, but I'm not sure if it's the same one. Sounds right though, I remember that running SAM added a SAY keyword to the C64's built-in BASIC... Aah, nostalgia!
  - Re:Mac OS has that (Score:2)
    
    by blair1q ( 305137 ) writes:
    
    My Clark-Nova was talking to me in the '60s, but that was its job.
    
    --Blair
Oh no...... (Score:2)

by the_2nd_coming ( 444906 ) writes:

Man, if they put SimpleText (Apple's text/scripting/voice filter program) on one of those things, no one will get their work done!!!
- Re:Oh no...... (Score:2, Funny)
  
  by SPK ( 8321 ) writes:
  
  yes, but simple text and Chinese ... that's like comparing apples to oranges, no?
  - Re:Oh no...... (Score:1)
    
    by jiheison ( 468171 ) writes:
    
    yes, but simple text and Chinese ... that's like comparing apples to oranges, no?
    
    Not in this case, since the chip translates text into written phonemes. This just means that the Chinese language text would have to be written phonetically, in simple text. How it handles the tonal aspect of Mandarin is another story, but I suspect that there is some phonetic writing scheme that accounts for this.
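
One phonetic scheme that does account for tone is Hanyu Pinyin, which can be written in plain ASCII with a trailing tone digit on each syllable. Nothing in this thread says the Winbond chip accepts that notation, so the following is only a sketch of how such input could be split into syllable and tone. The function name, the space-separated input format, and the choice of 5 for the neutral tone are assumptions made for illustration.

```python
import re

# Letters of one syllable followed by an optional tone digit 1-5.
# Treating a missing digit as tone 5 (neutral) is an assumption of this sketch.
SYLLABLE = re.compile(r"([a-zü]+)([1-5]?)")


def parse_pinyin(text):
    """Split space-separated numbered pinyin into (syllable, tone) pairs."""
    pairs = []
    for token in text.lower().split():
        match = SYLLABLE.fullmatch(token)
        if match is None:
            raise ValueError(f"not a numbered pinyin syllable: {token!r}")
        base, tone = match.groups()
        pairs.append((base, int(tone) if tone else 5))
    return pairs


print(parse_pinyin("ni3 hao3 ma5"))
```

A synthesizer front end could then pick a pitch contour per pair, which is the part a plain letter-to-sound table would miss.
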
- - Re:Oh no...... (Score:1)
    
    by the_2nd_coming ( 444906 ) writes:
    
    Sorry if I am too esoteric and dry for you; I am a geek, what do you expect?
Great for all sorts of devices. (Score:3, Interesting)

by Negadin ( 261695 ) writes: on Wednesday November 07, 2001 @04:54PM (#2534353)

Cell phones, PDAs, perhaps new tools for people with vision disabilities, where it could pick up plain text via IR near busy intersections or information kiosks. Text is small, so broadband wouldn't be required, since it's all converted in real time on a chip. Since it is supposed to be low-powered, it would be great for devices that didn't need to be recharged often, like the pagers mentioned in the article.

I wonder how lifelike the voice is though. I don't think any text-to-speech tools are going to become very mainstream until they sound better.

- Re:Great for all sorts of devices. (Score:2)
  
  by IronChef ( 164482 ) writes:
  
  All computers should sound like the Voice of World Control. Audio clips from Colossus: The Forbin Project are here [uiuc.edu] for your enjoyment.
  
  If Stephen Hawking sounded like this, he would have taken over the world long ago.
wouldn't it be easier.... (Score:5, Informative)

by CodePoet82 ( 177189 ) writes: on Wednesday November 07, 2001 @04:54PM (#2534354) Homepage

As the writeup said, this could be used in a cellphone to read what you were looking at, but wouldn't it be simpler, and backwards compatible, to just do text-to-speech synthesis on a remote computer? Every cell phone out there can already just transmit the sound from a remote location, and it wouldn't require any new/expensive chips.

- Re:wouldn't it be easier.... (Score:1)
  
  by LWolenczak ( 10527 ) writes:
  
  That's what the companies would do, so they can use up your minutes when you're getting your email...
  - Re:wouldn't it be easier.... (Score:1)
    
    by CodePoet82 ( 177189 ) writes:
    
    Unlimited minutes for local/same-provider calls are becoming more and more standard. Under that plan, would this logically be a free call?
Converting text to an orange ... (Score:2, Funny)

by (void*) ( 113680 ) writes:

would be something we can all be impressed with.
- Re:Converting text to an orange ... (Score:1)
  
  by Fenris2001 ( 210117 ) writes:
  
  Converting game programmers to oranges would be even more impressive..... [imdb.com]
Great... (Score:2, Funny)

by Anonymous Coward writes:

That's all I need, Stephen Hawking's voice coming at me from my cell phone:

"Get Viagra Because You Care !! Watch the Famous Jennifer Lopez p0rn video Swing LOOOW, Sweet Chariot dash dash dash dash dash 8 eff ess arr...."

Anonymous cowards love the rich meaty taste of spam.
- Re:Great... (Score:2)
  
  by Doc Hopper ( 59070 ) writes:
  
  Heh, I already have this daily on my cell phone using "Voicelink", a service provided by WorldCom (and my employer, Talk2 Technology). I routinely get emails like this:
  
  From: root
  Subject: cron kill `ps -ef | grep username | awk '{print $2}'`
  
  Just imagine how that sounds read back to you over your cell phone. It really beats having to lug a laptop with me just to check my email, but the kinds of email a sysadmin receives often don't translate well into spoken English. However, it's fun to hear this female voice try to get it right. One of these days, I've gotta get together with the programmers here and make sure these things get read right, like "kill back-tick pee-ess dash ee-eff pipe grep user-name"...
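
The "read it right" idea above could be sketched as a pre-processing pass that swaps shell punctuation for spoken names before the text reaches the synthesizer. This is a minimal illustration, not how Voicelink or any Talk2 product works; the symbol table and the function name are hypothetical, and it does not spell out short tokens like "ps" as "pee-ess".

```python
# Hypothetical symbol-to-word table; the spoken names are illustrative choices.
SPOKEN_SYMBOLS = {
    "`": "back-tick",
    "|": "pipe",
    "-": "dash",
    "'": "quote",
    "{": "open-brace",
    "}": "close-brace",
    "$": "dollar",
}


def verbalize(command):
    """Replace each known symbol with its spoken name; pass other text through."""
    words = []
    token = ""
    for ch in command:
        if ch in SPOKEN_SYMBOLS or ch.isspace():
            # A symbol or a space ends the word collected so far.
            if token:
                words.append(token)
                token = ""
            if ch in SPOKEN_SYMBOLS:
                words.append(SPOKEN_SYMBOLS[ch])
        else:
            token += ch
    if token:
        words.append(token)
    return " ".join(words)


print(verbalize("kill `ps -ef | grep username`"))
# prints: kill back-tick ps dash ef pipe grep username back-tick
```

A lookup table keeps the spoken names easy to change per listener, which matters more here than speed.
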
pr0n related use? (Score:2, Funny)

by bludstone ( 103539 ) writes:

could i set it to a deep erotic female sounding voice and have it read dirty stories to me?
A Possibility... (Score:2, Funny)

by VA Porware ( 534937 ) writes:

Can you imagine millions of geeks' cell phones with this chip? "First Post" echoing throughout the world...
Nothing new (Score:1)

by CmdrTroll ( 412504 ) writes:

Radio Shack used to sell these chips years ago. I once built an automated model rocket launcher that used the chip to announce the countdown - pretty damn slick, if you ask me. I believe the same chip was also used in the old TI-99/4A speech synthesizers (if anyone else remembers those).
There's really nothing new about this product, except for its ability to speak Mandarin. And given the state of the Chinese economy, it's not very likely that many citizens over there will be in the market for talking electronic devices anytime soon. Most of them are still trying to get phone service and running water.
-CT
- Re:Nothing new (Score:5, Informative)
  
  by morcheeba ( 260908 ) writes: on Wednesday November 07, 2001 @05:05PM (#2534426) Journal
  
  No, those chips (it was a pair) were power-hungry 5 volt parts made by General Instrument. One was a microcontroller (8051?) with the text-to-phoneme algorithm, and the other was the phoneme-to-audio processor (GI SP0256). Actually, the SP0256 could accept external ROMs for specialized words, so it could have spoken in any language you wanted.
  
  Check out quadravox [quadravox.com] for boards that emulate the SP0256, using ISD's analog flash memory and a microcontroller.
  
  (My misadventure with the old GI chip: -12 instead of +5, just for a split second. After that, it developed a stutter!)
  
- Re:Nothing new (Score:1)
  
  by discoinferno ( 137207 ) writes:
  
  Rat Shack used to sell a chip 'like' this, not these exact chips. The voice quality of these chips is much improved, and it has MANY MANY new features. Perhaps you should read the article and actually check out the specs of the chip before running it down, hmmm?
The only problem with this.... (Score:2)

by Teancom ( 13486 ) writes:

would be reading /. headlines. I mean, text-to-speech is great, but can it spell-check at the same time?
applications... (Score:1)

by recursiv ( 324497 ) writes:

The chip could enable items such as a teddy bear that lulls a child to sleep by reading a bedtime story with the pre-programmed voice of Winnie the Pooh.
Think of the applications for blow up dolls and pr0n!
Text 2 Speech on Linux (Score:1)

by Hornsby ( 63501 ) writes:

I've tried Festival on Linux, and its output is always really fuzzy and hard to understand. Do any of you know of any alternative programs that are more discernible in their delivery of voice? I would love to have my Linux box talk to me like one of those sexy iMac operators...
Why dump more tech than necessary into the phone? (Score:4, Insightful)

by StaticEngine ( 135635 ) writes: on Wednesday November 07, 2001 @04:57PM (#2534381) Homepage

Sure, the tech is cheap and relatively disposable, but is moving every feature but the kitchen sink into the cellphone really the way to do it? The phone can already send and receive voice, so why not keep the text-to-speech synthesis at some central server where the systems can be maintained and upgraded, rather than having to support/manufacture/refurbish thousands of phones out in the field?
The cellphone may have all the power of an original Palm Pilot these days, but we don't need to make it into an Onyx server.

- Re:Why dump more tech than necessary into the phon (Score:1, Interesting)
  
  by Anonymous Coward writes:
  
  Sure, the tech is cheap and relatively disposable, but is moving every feature but the kitchen sink into the phone really the way to do it? Why put a radio transmitter and receiver in the headset-- if you do that, then you'll need a battery to power it. Stay with the corded phones, I say! And, if you have more than one person in a city talking on these newfangled radiophones, you'll need a computer to set the radio frequency! My Gremlin's 8 track player/AM radio doesn't need a computer to change channels -- it's got those big preset buttons to move the dial for me. The cell phone may have all the power of a TRS-80 these days, but we don't need to make it into an IBM-PCjr.
  
  p.s. and don't even get me started on digital phones... converting analog to digital to analog baseband to RF, and then back again!
  - Re:Why dump more tech than necessary into the phon (Score:1)
    
    by beretboy ( 221801 ) writes:
    
    studio.tellme.com
- Re:Why dump more tech than necessary into the phon (Score:5, Insightful)
  
  by DdJ ( 10790 ) writes: on Wednesday November 07, 2001 @05:42PM (#2534634) Homepage Journal
  
  Sure, the tech is cheap and relatively disposable, but is moving every feature but the kitchen sink into the cellphone really the way to do it?
  
  I've got a co-worker, our Oracle admin, who's blind. As things stand, with most cell phones he can't do anything except dial out and answer calls. He can't use the built-in address book to place calls for example, because all of the info is in text on a tiny screen. With text-to-speech software on the phone, he'd be able to use the address book just like sighted folks, read text messages he received earlier even when he's in an area with no coverage just like sighted folks, and so on. This is a good idea.
  
  - Re:Why dump more tech than necessary into the phon (Score:3, Informative)
    
    by Kraft ( 253059 ) writes:
    
    Wow! Your Oracle admin is blind? *im baffled*
    
    I have never worked with blind people, but after reading an article last year about how websites are getting more and more difficult for braille browsers (flash, imagelinks without alt tags etc.), I decided to make a lynx-friendly version of my site - and so should YOU!
    
    Anyways, how does he do it?? Is it worth it to the company you work for, or does it cause everyone else problems? Is he good? Tell! Hopefuly this could encourage others to take on "disabled" in their company....
    - Re:Why dump more tech than necessary into the phon (Score:3, Interesting)
      
      by DdJ ( 10790 ) writes:
      
      Wow! Your Oracle admin is blind? *im baffled*
      
      ...
      Anyways, how does he do it?? Is it worth it to the company you work for, or does it cause everyone else problems? Is he good? Tell! Hopefuly this could encourage others to take on "disabled" in their company...
      
      He's got a variety of tools at his disposal. Just the other day, he gave a demo of some of them to a bunch of us.
      
      He's got an 8-dot braille terminal that gives him enough characters to do C and Perl programming. He's got a hardware speech synthesizer he cranks up to something like 200+ words per minute. I tried, and could only understand a few phrases when it was cranked up to 95 words per minute.
      
      And when a web site he needs or wants to access is inaccessible, he complains to them, and sometimes things get fixed. He can navigate web sites that use alt tags remarkably well. A good rule of thumb is that if a site makes sense with images turned off (or in lynx), then it'll work for him.
      - seeking more info (Score:2)
        
        by Technodummy ( 204943 ) writes:
        
        Are there any websites where you can get a review by a blind person? or anything similar?
        
        We can talk about web standards until we are blue in the face, but when we stop certain people from being able to use the web, that's more than a failure of standards.
        
        - thanks Kynn (Score:2)
          
          by Technodummy ( 204943 ) writes:
          
          thanks Kynn
  - Re:Why dump more tech than necessary into the phon (Score:2)
    
    by Dr Caleb ( 121505 ) writes:
    
    I've got a co-worker, our Oracle admin, who's blind. As things stand, with most cell phones he can't do anything except dial out and answer calls
    I hope he practices safe cell phone use and doesn't call out while he's driving.../humour
- Re:Why dump more tech than necessary into the phon (Score:2, Insightful)
  
  by tych0 ( 519146 ) writes:
  
  You have to remember that economics is what drives these things. If there are yuppies or geeks out there who want to have "every feature but the kitchen sink" in their cellphone, PDA, or whatever, there will be a company out there that will be happy to take their money to implement these technologies.
- Re:Why dump more tech than necessary into the phon (Score:2)
  
  by Glytch ( 4881 ) writes:
  
  It's easier on bandwidth to just send a few hundred bytes of text than streaming audio.
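
The size gap between text and audio can be estimated with rough arithmetic. The figures below are assumptions rather than anything measured in this thread: about 6 bytes per English word including the space, speech at 150 words per minute, and a voice codec running at 13 kbit/s (the GSM full-rate figure).

```python
def audio_bytes_for_text(text_bytes,
                         bytes_per_word=6,
                         words_per_minute=150,
                         codec_bits_per_second=13_000):
    """Estimate how many bytes of encoded speech it takes to read a text aloud."""
    words = text_bytes / bytes_per_word
    seconds = words / words_per_minute * 60
    return seconds * codec_bits_per_second / 8


text_bytes = 300  # "a few hundred bytes of text"
audio = audio_bytes_for_text(text_bytes)
print(f"{text_bytes} bytes of text -> about {audio:.0f} bytes of audio "
      f"({audio / text_bytes:.0f}x)")
```

With these assumed figures a 300-byte message takes about 20 seconds to speak and comes to roughly 32,500 bytes of audio, on the order of a hundred times the size of the text. Different codecs or speaking rates shift the ratio but not the general picture.
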
  - Re:Why dump more tech than necessary into the phon (Score:2)
    
    by Doc Hopper ( 59070 ) writes:
    
    Well, not necessarily. The cell and PSTN networks are designed around carrying audio and that is still what they do best. Today, it's a toss-up as to whether it's better to approach text-to-speech from the back-end (where you can have more flexibility) servers, or by embedding pieces into phones which gives you a whole new set of problems and potentially great solutions.
    
    The problem is, the idea of using this tech in phones is fighting against hundreds and hundreds of millions of deployed telephones without any tech newer than perhaps a microchip for caller ID. Over the long-term, text-to-speech embedded in the device is the more efficient and user-controllable format. Over the short haul, though, we're going to see many years still of central-office-controlled voice apps on your phone.
    
    Niche applications, like on a Pocket PC: now there, something like this would absolutely rock. Get a toehold, and eventually low-power text-to-speech and speech-to-text devices will be all the rage.
    
    Now if only someone would perfect a speech-to-text engine that didn't require hours of training to recognize my accent...
Old news man ... (Score:1)

by Mr. Eradicator ( 470089 ) writes:

/me gets out his Speak n Spell.
Oh the possibilities! (Score:1)

by Neutron_F1uX ( 534720 ) writes:

Oh imagine them! To have one of these babies that understood more languages, and could translate them to one of the others. Need a real translator? Nooope, watch my lil PDA do it for me!

Seriously, that could have tremendous business implications for those who are doing business in other countries.

Their usage of EEPROM is nothing but ingenious, why hasn't anyone done this before? Or have they? It makes a lot more sense than a flash card, and it's cheaper too.
I bet it will choke... (Score:5, Insightful)

by MarkusQ ( 450076 ) writes: on Wednesday November 07, 2001 @05:00PM (#2534401) Journal

I bet it will choke on:
The lead story read: "Unionized environmental health workers object to new chip that can read un-ionized lead levels."
Reading English is a lot tougher than most English speaking people think.
-- MarkusQ

- Re:I bet it will choke... (Score:2, Insightful)
  
  by jiheison ( 468171 ) writes:
  
  I would assume that it could recognize and handle a dash appropriately.
  
  You're right though: rough, bough, cough, laughter, slaughter etc. might give it trouble (they certainly gave Ricky Ricardo a headache).
  
  It would have to do a lot more than simply translate the text to phonemes to be effective with English.
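
The "ough" words above show why one spelling rule cannot cover English. A common workaround is to consult an exception dictionary before falling back to letter-to-sound rules. The sketch below assumes a tiny hand-written table of ARPAbet-style approximations (not taken from any real lexicon), and its fallback simply spells the word letter by letter as a placeholder for real rules. Nothing here describes how the Winbond chip works.

```python
# Hand-written, approximate pronunciations (assumed for illustration only).
EXCEPTIONS = {
    "rough": "R AH F",
    "bough": "B AW",
    "cough": "K AO F",
    "laughter": "L AE F T ER",
    "slaughter": "S L AO T ER",
}


def to_phonemes(word, lexicon=EXCEPTIONS):
    """Look the word up in the exception table first, then fall back."""
    key = word.lower()
    if key in lexicon:
        return lexicon[key]
    # Placeholder fallback: spell the word out, one letter at a time.
    return " ".join(key.upper())


for word in ("rough", "bough", "cough", "chip"):
    print(word, "->", to_phonemes(word))
```

The same "ough" spelling maps to three different endings in the table, which is exactly the case a single rule would get wrong.
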
- Re:I bet it will choke... (Score:2, Informative)
  
  by thelexx ( 237096 ) writes:
  
  Which is why the other language is Chinese. I remember hearing years ago that Chinese is very well suited for voice recognition due to the fact that it is a tonal language with a total set of only a few hundred distinct sounds. Not sure if this is true just for Mandarin or also Cantonese and the others.
  
  LEXX
  - - Re:I bet it will choke... (Score:2)
      
      by Spankophile ( 78098 ) writes:
      
      Far as I know:
      
      Taiwan = Mandarin
      Mainland China = Mandarin
      HongKong = Cantonese
      Toronto = Cantonese
- - +1 Funny, +1 Informative on the MQR standard (Score:2)
    
    by MarkusQ ( 450076 ) writes:
    
    The Chaos
    G. Nolst Trenité
    Dearest creature in creation,
    Study English pronunciation.
    Spot on! Not only would it have to disambiguate homonyms by semantic context, it would even need to use poetic context. Great poem!
    -- MarkusQ
Good for the Blind. (Score:2, Insightful)

by jellomizer ( 103300 ) writes:

Any technology that can translate text to words is a good thing so the Blind people can have less of a hard time with technology which is mostly sight driven. But of course with my Really bad spelling it could drive people nuts. (Yea Yea Lern to spell and that will fix the problem) But I always want the feature to disable it no matter how low processing power it uses. Speaking is generally slower then reading. Plus there is some times were your concentration dosent need a computer speaking to you.
- Don't count on Hearing a Slashdot headline (Score:1)
  
  by Erris ( 531066 ) writes:
  
  The blind have gotten a tremendous shaft from the WWW maintainers. Don't think so? Try surfing with images turned off! Most pages start off with a blizzard of adverts, banners and other junk that gets in the way of real content. Think about having to listen to the same 12 item banner for each new page that contains one or two images, no real links and something dumb for text like a copyright notice. Many government sites are well designed. Other sites like Slashdot can be customized but many have gone in the exact opposite direction. Using images for navigation might look nice, but it usually is not.
  M$'s Front Page is one of the worst offenders. It's full of useless font adjustments and other needless code. Worse, it labels images cryptically and encourages all of the worst practices.
  As Bill Gates once said, software is what is lacking in a world full of technology. He aims to keep it that way for those who trust him.
Kavita Maharaj (Score:2)

by blair1q ( 305137 ) writes:

Does it sound like Kavita Maharaj?

Because I swear, sexy though it is, her voice is synthesized.

--Blair
Phenomes? (Score:4, Insightful)

by Sodium Attack ( 194559 ) writes: on Wednesday November 07, 2001 @05:04PM (#2534420)

The low-power chip scans the text and translates it into spoken phenomes and outputs it to a filter for smooth analog sound, or can directly output the digital signal.
But is it smart enough to pronounce the boldfaced word above as "phonemes"?

Hrmm... (Score:2)

by dasmegabyte ( 267018 ) writes:

Didn't I see this first in "Wargames?"

Incidentally, a guy I work with has a father who designs for Chrysler. He said that the big D-C was "really interested" in applications of text to speech. Think about it: ebooks that read themselves to you while you drive, driving directions and traffic info read to you rather than displayed on a screen (most nav screens require you to take your eyes entirely off the road and down the dash as much as 18 inches...eep!). You've got a much more useful interface, and with a low cost (though they'll charge you a grand, I'm sure), easy to interface chip, they'll have no excuse not to bring this much safer system for data interaction to my dash today, and not six years from now.
- - Re:Hrmm... (Score:2)
    
    by dasmegabyte ( 267018 ) writes:
    
    Yes, but I read a lot of books that don't have a book on tape available. Like 95% of the market. The only books available on tape are best sellers or otherwise popular books...if I want to listen to, say, Dirk Gently's Holistic Detective Agency or the Faerie Queene or RedHat Linux Unleashed, books I can get on ebook but not as a book on tape, I'm up a creek.
    
    Of course, a text to speech reader isn't going to sound anywhere near as nice as an audiobook...but until you can find an audiobook of my email, or my students' papers, or the latest press release from Sun Microsystems, I'll take the coder.
read your e-mail outloud? (Score:1)

by Peyna ( 14792 ) writes:

I can just see it now. You're sitting on a bus on your way to work, reading your e-mail when all of a sudden you hear: "ENLARGE YOUR PENIS NOW!" "XXX GIRLS INSIDE, CUM JOIN US" "COME INSIDE AND LICK MY..." well you get the idea. I hope you have headphones!
- Re:read your e-mail outloud? (Score:2)
  
  by inburito ( 89603 ) writes:
  
  Uh.. who seriously would have their private e-mail read out loud in a bus?
  
  "As your accountant I need to inform you that..",
  "Here is your divorce settlement proposal..",
  "This is your doctor. Test results came in. You have..",
  etc..
  
  In comparison some x-rated junk mail might actually make some poor fellow's day..
This'll be great for phone spam (Score:2, Funny)

by saarbruck ( 314638 ) writes:

Can you imagine getting your Hotmail spam read to you while your significant other is hanging around?

Phone: "Are you looking for hot [chicks|sex|pussy|love]?"

Wife: "um... what was that, honey?"

Phone: "Get your University diploma!"

Wife: "What, I'm not good enough the way I am?"

Phone: "Get out of debt now!"

Wife: "Okay, you know what? That's your birthday present on the Credit card, bucko. That's it. I'm leaving..."
Companies in this business (Score:1)

by aspillai ( 86002 ) writes:

Right now there are companies exclusively in the business of doing this. There is very little mathematical challenge here. Psychologists and linguists have researched phonemes and related material pretty well.

But there are some implementation issues here. For example, if you have GNU, how do you say it? What about Jekka Pukka Sarasate? If you were to take the literal English pronunciation you might never even be able to understand what it's trying to say. Figuring out how to solve that is an interesting CS problem.

But this is a cool invention. Low power wireless research is just taking off. Before we were trying to figure out how to just transmit wireless well. Now we can have fun with it. I truly look forward to a wireless life :)

Me..
Better in the Next Generation (Score:2, Funny)

by Murdock037 ( 469526 ) writes:

Yeah, but you know the power consumption would be a hell of a lot higher on the chip that everybody would really want anyways: the Text-to-Barry-White-Speech chip.

"You've got mail, baby."
Great for ordering Chinese food. (Score:1)

by 10.0.0.1 ( 153985 ) writes:

Will it translate into broken chinenglish? "Me want pork fried rice. Chop chop."
Catching up with an Amiga? (Score:1)

by DumbSwede ( 521261 ) writes:

My old Amiga had decent text to speech processing in 1984.
With all the horsepower available in any modern handheld device -- surely much more than an 8 MHz 68000 with 512K of memory (of which only a fraction was used I'm sure) -- I don't understand why a dedicated chip would be needed to pull this off.
- Oops, 1986 (Score:1)
  
  by DumbSwede ( 521261 ) writes:
  
  Sorry, I was off by 2 years.
- - Re:Catching up with an Amiga? (Score:2)
    
    by uradu ( 10768 ) writes:
    
    > I don't know what model Amiga you had, but if you define decent as "sounding like a robot that
    > forgot what intonation was but could alter its voice half an octave to simulate slight masculine
    > or slight feminine undertones" then I'll agree with you.
    
    Well, that was if you fed text to the translation device, which did its best to generate the required phonetic output--also in ASCII--that was fed to the speak device. This translation could be pretty rough, and could be much improved upon if you generated your own raw phonetic output. You could smooth out, lengthen, shorten, or intonate individual phonemes that way, making the output sound much better. Basically, the translation device needed a good rewrite.
Yeah, but... what about everyone else? (Score:1)

by cd_Csc ( 151701 ) writes:

We all know how annoying it can be when other people have their cell phones ring in public places... the last thing we need is people listening to a monotonous computer voice in public. Not to mention the fact that it's usually much more convenient to read text, which allows for skimming and variable speeds.
So what? (Score:1)

by Erik Fish ( 106896 ) writes:

Why is this a big deal? I was doing text to speech on my Commodore 64 when I was a kid with a program called SAM (which was written in 1979). The C64 had what...1MHz?
Post your Ideas here! (please) (Score:2)

by Uttles ( 324447 ) writes:

"We are looking at devices that don't necessarily have a really powerful processor on board," said Hezi Saar, product marketing manager at Winbond. "Usually most of the accessories for handheld devices don't have the power to run text-to-speech algorithms and they don't have the huge memory capacity to support this feature."

OK, so just imagine that in the near future anything and everything will have one of these small, low cost chips. Now, imagine the possibilities! Everyone I'm sure has their own ideas on how cool this could be, so go ahead and reply with yours!
- Re:Post your Ideas here! (please) (Score:3, Interesting)
  
  by Uttles ( 324447 ) writes:
  
  1) My microwave at home displays "ENJOY YOUR MEAL" when it's finished cooking something. I'd sure love it if instead of cheesy LEDs I heard a sexy voice saying "come and get it, baby."
  
  2) Text messengers for blind people. You know those little IM devices all the kiddies have? Well just put braille on the keys and have one of these chips installed... there you go.
  
  3) Watches. The next time somebody says "what time is it?" you just press a button and the voice chip in your watch simulating someone who sounds extremely pissed off shouts the time.
  
  Well, that's it for now...
Remember when Speak'nSpell was a new thing (Score:1)

by iplayfast ( 166447 ) writes:

They used speech parts to make up words in the same way. (Hope this sounds better though)
They've had stuff like this for a while ... (Score:1)

by wobblie ( 191824 ) writes:

I remember one of my coworkers had gotten a service where you could email his account, which would forward to a voice mail text -> speech system.

It was hilarious sending him obscene and or ridiculous emails and listening to the recorded voice play them back ...
Already done. (Score:2)

by Doc Hopper ( 59070 ) writes:

Well, I jumped ship to this little company I work for now called Talk2 Technology (free plug, I guess). We've taken a different tack in voice-enabling applications. I think there are different target markets -- the Talk2 stuff uses servers on the back-end, which go out and fetch your email to read it to you. Putting this on-chip in the cell phone itself is a great step in the right direction.

Fundamentally it's a different approach than today's "voice portal" technology. Voice Portals retrieve data for you, and read it over standard cell or PSTN network. There are many benefits to this approach, principal among them being improved processing power for additional functionality such as voice-processing (speech to text, or compressing speech for reply email voice attachment). By putting the power into the phone, instead of at an expensive central office, this chip could either be a great advancement for text-to-speech technology, or a "killer app" that puts my company out of business :)

Regardless, I'm excited to see this happening. I've long envisioned a PDA with the only interface being spoken, rather than requiring any video component. This would bring the power consumption and delicacy of these devices down within reason for extended usage. The downside is that speech is necessarily a rather slow interface to a machine; it will be interesting to see how we adapt speech for greater speed with speech-based devices, and how English as a whole will fare.

Now that I've used voice-enabled email, it would be really hard to go back to the "old" way. I still do an enormous amount of correspondence every day by typing, but when I'm on the road I don't need to bother with a laptop since I can have my email read to me over the phone *and reply* with a voice message via email. Until you've used it, it's tough to realize how convenient it is.

I want one of these for my Agenda VR3! Or something...
It's about time... (Score:2)

by jasno ( 124830 ) writes:

I remember my first computer - a ti99/4a - had a box I plugged into the side that generated speech. It didn't sound all that good, but you could recognize it well enough. If I remember correctly, it cost about $100.

That was... 21 years ago. It's sad that this aspect of human computer interaction has been overlooked for so long. It's nice to finally see some development.
Mandarin (not the orange) (Score:4, Insightful)

by gbrandt ( 113294 ) writes: on Wednesday November 07, 2001 @05:30PM (#2534561)

If you feel that you have to state 'not the orange' when using the word Mandarin in a language context, perhaps you should also state 'not the people of England' when using the word English in the same context.

Just what I need (Score:2, Funny)

by endersdad ( 181957 ) writes:

Another female voice calling my cell phone and telling me I'm offtopic...
Winbond's Whitepaper (Score:2, Informative)

by jdclucidly ( 520630 ) writes:

I haven't seen anyone post a link to Winbond's own web page on the WTS701 Text-to-Speech Processor so, here it is straight from the mouth:

Winbond [winbond.com]
Bandwidth problem -- audio is dead (Score:3, Interesting)

by RobertGraham ( 28990 ) writes: on Wednesday November 07, 2001 @05:41PM (#2534623) Homepage

I guess I'm a little skeptical of all technology that attempts to apply "old" paradigms to new problems.
The most important thing about the Internet is "bandwidth". I'm not talking bits on the wire, I'm talking how fast information flows into my brain. Speech is vastly slower than text as a medium for transferring information into my brain. I'm so accustomed to Internet speeds for information, I can no longer watch TV news -- the bandwidth is too slow. I'm glad I don't go to school anymore -- I could barely stand lectures when I was a kid, I would never be able to sit through them as an adult.
Five years ago everyone in Japan walked around with their phone to their ears. These days, everyone in Japan walks around looking at their phone (instant messaging, etc.). I'm not sure if people "get" the bandwidth problem. Sound must be multiplexed into half-duplex, serialized communication. By this I mean you can only input or output at any one time, not both. Also, incoming messages must arrive separately, not in parallel. With audio, I can only talk to one person at a time; with messaging, I can carry on multiple text-based conversations simultaneously. I mean, text-to-voice has long been available on PCs, but nobody uses it for ICQ/AIM/YahooIM/MSIM.
As far as I can tell, audio is dead. Maybe somebody will invent some sort of hyperfast language (didn't Heinlein describe something like that in a book?), but I think the next wave is going to be something new that replaces reading text, not something that goes backwards to audio.

- Re:Bandwidth problem -- audio is dead (Score:2)
  
  by Relic of the Future ( 118669 ) writes:
  
  While reading your comment, something occurred to me: you're right, when using IM software, I can carry on multiple conversations, i.e. I can get information from multiple sources at once, BECAUSE I can read faster than someone can type. However, might that be because just about everyone can't type as fast as they can talk? So, those multiple conversations... are you really getting more bandwidth, or is it just coming in bursts whenever someone hits the 'send' button? And what about the cost of context switching? I think that overall there is some gain in total bandwidth, but that each individual conversation takes longer.
  
  Consider this: have you ever been IMing w/ several people, but then called one of them because it was _really_ important to get that conversation done _fast_? I have, and it makes it much harder to try to keep all the _other_ conversations going.
  
  Fast speech-to-text could give you the best of both worlds, but that's, of course, still a long way off.
Commodore C64: SAM (Score:3, Funny)

by twms2h ( 473383 ) writes: on Wednesday November 07, 2001 @05:42PM (#2534629) Homepage

Great achievement, my Commodore C64 could do that so many years ago that I don't even remember when it was. SAM, the speech synthesizer which could even "sing".

Has anything new happened lately? ;-)

- Re:Commodore C64: SAM (Score:2)
  
  by Velex ( 120469 ) writes:
  
  Great achievement, my Commodore C64 could do that so many years ago that I don't even remember when it was. SAM, the speech synthesizer which could even "sing".
  
  ::nods:: I remember programming my TI-99/4A to read me a menu of games whenever I started it up. Then there was that text-to-speech program for my 386 that came with my soundblaster that enabled me to make my computer announce that it was booted up and ready for his l33tness Velex himself to use dos. Just a few months ago, my roommate downloaded this monkey called Bonzi that talked to him, but Bonzi got annoying so my roommate shot him.
  
  I'm sure that there's been tons of text-to-speech programs that I've never heard of, nor will I ever, because it's been done so many times before, and the AI required to get the computer to talk in an un-Vice Fearless Leader #42 fashion is beyond the grasp of even the most 31337 at the moment. What I would really like for my mobile phone is rudimentary speech recognition.
  
  As it's been pointed out before, text is just simply faster than speech, and who knows, maybe twenty years down the road we'll all carry around little AIM or ICQ devices. What I'd really like to do, though, is skip the miniature keyboards or fumbling on a keypad. Speech recognition is the way to go. I'd much rather tell my phone, "call pink" and have it call pink back at the hotel, than have it announcing all my spam to the world.
  
  No, unfortunately nothing new's happened lately
Technology advances... (Score:3, Funny)

by Znork ( 31774 ) writes: on Wednesday November 07, 2001 @05:46PM (#2534644)

...and everything gets slower. I read between 2-20 times faster than I can comprehend spoken language, depending on the junkfiltering that's possible.

No way in hell do I want to read email on a cell phone (it's a PHONE. You _talk_ to people in it. If it was a generic mail reader it would have at least a 17 inch monitor and a keyboard that lets you type faster than .2 cps. I know this is a difficult concept to grasp for certain cell phone companies, but a phone, as opposed to a computer, does not have these things, and thus it _sucks_ for email and browsing, and will continue to do so until it has those things, at which point in time you will not want to carry it around because it ain't gonna fit in your pocket anymore.). Nor do I want to listen to my email. I don't have the time or the patience for it.

At least until the phone can give me an (intelligent) summary when I say 'Get to the point'.

What Happened to TI (Score:2)

by farrellj ( 563 ) writes:

Texas Instruments used to have some of the best Speech Synthesis chips out there...I remember the TI/99 computer had a speech module, and one skiing game got both male and female realistic sounding voices out of the speech module. If they could do it in the early 1980's why can't they do it now?

ttyl
Farrell
- Re:What Happened to TI (Score:2)
  
  by Detritus ( 11846 ) writes:
  
  From what I remember about the early speech synthesis systems, they cheated. The input to the synthesis module was a stream of phonemes, not normal text. The programmer had to translate the text into phonemes.
  - Re:What Happened to TI (Score:2, Informative)
    
    by CodeShark ( 17400 ) writes:
    
    Yes, the programmer had to translate the text into phonemes, but you know what? That wasn't the hard part, because with a decent translation dictionary, you can get about 95% of the words right with a simple one pass "phoneme compiler" and a good set of rules. The hard part is that English is a highly inflected language without a constant set of rules for doing the inflections.
    Still, I was part of the team that made the first Apple II (at least in the State I lived in at the time) that could read from the screen back in 1981 -- to an "Echo II Speech Synthesizer" which IIRC came from Radio Shack.
    We took some of our stuff to the linguistics department at the University across town, and of all things, had the darn machine speaking understandable Japanese (from Romaji, or romanized letters) within a few days because the Japanese language is consistent not only in phonetic translation but also in inflection. It still sounded like a machine, but that was a limitation of the sound chip's internal phoneme library in the Echo II. The same program with one of today's chips would have sounded very near normal.
    Goes to show you how much more difficult spoken English is than most of us native speakers tend to realize, because I have yet to see a low cost implementation of a text to speech translator that was all that much better than what we were doing back in '81. (not that I have seen everything out there by the way -- I do have a life outside the PC world....occasionally :-)
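The dictionary-plus-rules "phoneme compiler" described in this sub-thread can be sketched roughly as follows. This is only an illustration under stated assumptions: the `LEXICON` entries, the `RULES` table, and the ARPAbet-like phoneme symbols are all hypothetical and are not taken from the Echo II, the TI speech module, or any other product mentioned here.

```python
import re

# Hypothetical exception dictionary: whole words that the naive letter
# rules below would get wrong. Symbols are ARPAbet-like, for illustration.
LEXICON = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
    "the": ["DH", "AH"],
    "phone": ["F", "OW", "N"],
}

# Hypothetical letter-to-sound rules. Digraphs come first so that
# "sh" is matched before "s" followed by "h".
RULES = [
    ("ch", ["CH"]), ("sh", ["SH"]), ("th", ["TH"]), ("ph", ["F"]),
    ("ee", ["IY"]), ("oo", ["UW"]),
    ("a", ["AE"]), ("b", ["B"]), ("c", ["K"]), ("d", ["D"]),
    ("e", ["EH"]), ("f", ["F"]), ("g", ["G"]), ("h", ["HH"]),
    ("i", ["IH"]), ("j", ["JH"]), ("k", ["K"]), ("l", ["L"]),
    ("m", ["M"]), ("n", ["N"]), ("o", ["AA"]), ("p", ["P"]),
    ("q", ["K"]), ("r", ["R"]), ("s", ["S"]), ("t", ["T"]),
    ("u", ["AH"]), ("v", ["V"]), ("w", ["W"]), ("x", ["K", "S"]),
    ("y", ["Y"]), ("z", ["Z"]),
]


def word_to_phonemes(word):
    """Dictionary lookup first, then fall back to the letter rules."""
    word = word.lower()
    if word in LEXICON:
        return list(LEXICON[word])
    phonemes = []
    i = 0
    while i < len(word):
        for pattern, sounds in RULES:
            if word.startswith(pattern, i):
                phonemes.extend(sounds)
                i += len(pattern)
                break
        else:
            i += 1  # no rule covers this character, so skip it
    return phonemes


def text_to_phonemes(text):
    """One pass over the text: split into words, convert each word."""
    return [word_to_phonemes(w) for w in re.findall(r"[A-Za-z]+", text)]


print(text_to_phonemes("The ship"))
```

A table this small mispronounces most English words, which is the point made above: the lookup and the single pass are the easy part, and the exceptions and the intonation are where the work is.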
language translation? (Score:2)

by josepha48 ( 13953 ) writes:

It would be interesting if this could be used for language translation. Turn on Spanish mode (when they get it) and type the Spanish words you see into your phone and voilà.. it tells you what it means in English. Sort of like a speaking babelfish if you will.
Now all we need are really good speech to text converters....
Memory density (Score:2, Funny)

by pogofish ( 514289 ) writes:

Unlike everybody who posted "big deal, my Commodore 64 used to hold long, sexy conversations with my Speak & Spell about the meaning of Wargames," I actually read the article. Near the end it says "The multilevel storage memory system allows the chip to store up to 256 different voltage levels, or the equivalent of 8 bits, into one EEPROM cell, which is up to 8x the capacity of conventional memories..."

Being a software geek with my last classes in EE/CE several years safely in my sordid past, I'm out of touch. Is this a big deal?
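The arithmetic behind the quoted "256 voltage levels, or the equivalent of 8 bits" figure can be checked with a short sketch. It assumes that the "conventional memories" in the quote store one bit (two levels) per cell; the function names are made up for illustration, and nothing here describes how the Winbond part is actually built.

```python
import math


def bits_per_cell(levels):
    """A cell that can hold `levels` distinct voltages stores log2(levels) bits."""
    return math.log2(levels)


def cells_needed(n_bytes, levels):
    """How many cells it takes to store n_bytes at the given number of levels."""
    return math.ceil(n_bytes * 8 / bits_per_cell(levels))


print(bits_per_cell(256))       # multilevel cell from the article
print(bits_per_cell(2))         # assumed conventional single-bit cell
print(cells_needed(1024, 2) / cells_needed(1024, 256))
```

Under that one-bit-per-cell assumption the ratio works out to the 8x in the article. Whether 256 levels can be written and read back reliably is a separate question that the arithmetic does not answer.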
I'll show you low power... (Score:2)

by x136 ( 513282 ) writes:

My Macintosh SE (8MHz 68000, circa 1988) can convert text to speech no problem. It's not necessarily smooth and natural, but it can't be *that* much of a jump...
- Re:I'll show you low power... (Score:2)
  
  by GiMP ( 10923 ) writes:
  
  Yeah, Macs have great text->speech.. even the old ones. Although, the older ones had it on a chip.. whereas Apple has since moved it to software.
  
  For obsolete machines, they still pack a punch as a server for text->speech conversion. :)
Any one remember... (Score:2)

by Pope Slackman ( 13727 ) writes:

The General Instrument SP0256 chipset?
The '256 took coded phonemes and output audio,
while the other chip in the set (don't remember the name) took ASCII serial data and
converted it to phoneme codes the '256 could understand.

This set has been around for prolly close to 20 years now. (I remember finding a variant of it in
the Intellivision voice module ["Bee Sevunteen Bahlllllmer"!] that I believe was circa 1984.)
The '256 has been discontinued for a long time now, and I'm kinda excited to see
something similar to it show up, it was a cool gadget.

C-X C-S
Hey (Score:2, Interesting)

by Infosquawk ( 131022 ) writes:

I like to convert text to mp3s for long journeys so I can listen to Dickens on my Rio. Of course, that takes a lot of disk space. I'd much prefer a little handheld device that simply converts the .txt file which is much smaller, to speech.

I'd pay for it, and I bet a bunch of other people would too.
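For a rough sense of how much smaller the .txt file is than the spoken version, here is a back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration and does not come from the comment above: about 6 bytes per word of plain text, a reading speed of 150 words per minute, and a 64 kbit/s MP3.

```python
def text_bytes(words, bytes_per_word=6):
    """Plain-text size, assuming ~6 bytes per word including the space."""
    return words * bytes_per_word


def audio_bytes(words, words_per_minute=150, kbit_per_s=64):
    """Compressed audio size for the same words read aloud at a fixed rate."""
    seconds = words * 60 / words_per_minute
    return int(seconds * kbit_per_s * 1000 / 8)


words = 100_000  # assumed length of a shortish novel
ratio = audio_bytes(words) / text_bytes(words)
print(text_bytes(words), audio_bytes(words), round(ratio))
```

With these assumptions the audio comes out several hundred times larger than the text. Different bitrates or reading speeds move the number, but not the conclusion.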
Good for accessibility (Score:2)

by kimihia ( 84738 ) writes:

Such a device will be very handy for people that have visual impairments. Instead of the current bulky and expensive kits, this will be an improvement, especially for VI users out-and-about.

What can you do? Make your web pages accessible [w3.org] for a start.
The State of the Art... (Score:2, Informative)

by Sam Lowry ( 254040 ) writes:

Specialized chips for TTS applications have
been around for a while... The problem with
their acceptance is that they have poor voice
quality. Actually, there are two quite different
technologies available to produce speech nowadays:
1. Diphone synthesis and its variations. The idea
is to have one sample of each sound combination
(diphone) in a speechbase and produce the actual
speech by manipulating those sounds. This is what
gives the computer-synthesized, somewhat metallic speech
that most people have already heard somewhere and
this is what is actually used in low-powered devices,
handhelds and speaking dictionaries.

2. Corpus-based synthesis. The idea is to store
a few hours of the speech of a highly trained
speaker in the speechbase and select fragments
of this speech that suit best for the generation.
The second approach gives astonishing results with
the quality of the speech being sometimes
indistinguishable from a human. However, the size
of the speechbase is an issue. You can not fit a
300MB speechbase onto a handheld device yet
and hardware optimizations don't help much when
it concerns fetching data from the speechbase
and performing text-to-phonemes conversion.

Several companies have corpus-based synthesis
demos on-line. Check out SpeechWorks' and
Lernout & Hauspie's sites
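The diphone approach in item 1 above can be sketched as a lookup-and-concatenate step. This is a toy illustration only: the `"_"` silence marker, the diphone naming scheme, and the tiny `inventory` of fake sample values are all hypothetical, and the pitch and duration manipulation that a real synthesizer applies to each unit is left out.

```python
def to_diphones(phonemes):
    """Turn a phoneme sequence into overlapping pairs, padded with silence."""
    padded = ["_"] + list(phonemes) + ["_"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def synthesize(phonemes, inventory):
    """Concatenate the stored sample for each diphone, in order."""
    samples = []
    for name in to_diphones(phonemes):
        samples.extend(inventory[name])  # KeyError if a diphone is missing
    return samples


# Toy speechbase: one short recorded snippet per diphone (fake values).
inventory = {
    "_-HH": [0, 1],
    "HH-AY": [2, 3],
    "AY-_": [4, 0],
}

print(to_diphones(["HH", "AY"]))
print(synthesize(["HH", "AY"], inventory))
```

Because there is only one stored sample per diphone, the inventory stays small, which fits the point above about low-powered devices. Corpus-based synthesis would replace the single lookup with a search over many candidate fragments per unit, which is where the large speechbase comes from.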
