Hardware

Text-to-Speech on a Low-Power Chip

bluephone writes: "The EE Times has a story on a new chip from Winbond that can take ASCII or UNICODE text and convert it to either spoken English or Mandarin (the Chinese language, not the orange). The low-power chip scans the text and translates it into spoken phenomes and outputs it to a filter for smooth analog sound, or can directly output the digital signal. Imagine a cell phone with this, you can have your email read to you, rather than seeing a line at a time on a dinky screen, street directions from a website, or even Slashdot's headlines. :)"
This discussion has been archived. No new comments can be posted.

  • Done that (Score:4, Interesting)

    by beretboy ( 221801 ) on Wednesday November 07, 2001 @04:51PM (#2534338)
    I have a little bot written in Perl and VXML that reads my email. It is far easier than making the phone do the processing, and it's free. See studio.tellme.com
  • Anybody remember Dr. Sbaitso? This program was great for being written way back when (1994?).

    I believe Radiohead used it as the voice for their track "Fitter Happier" on OK Computer.
  • MacOS has had that for a while. It works ok. In fact, by default in OS 10.1 it speaks modal error dialogs. It surprised me the first time this happened.
  • man, if they put simple text (Apple's text/scripting/voice filter program) on one of those things, no one will get their work done!!!
    • by SPK ( 8321 )
      yes, but simple text and Chinese ... that's like comparing apples to oranges, no?
      • yes, but simple text and Chinese ... that's like comparing apples to oranges, no?

        Not in this case, since the chip translates text into written phonemes. This just means that the Chinese language text would have to be written phonetically, in simple text. How it handles the tonal aspect of Mandarin is another story, but I suspect that there is some phonetic writing scheme that accounts for this.
  • by Negadin ( 261695 ) on Wednesday November 07, 2001 @04:54PM (#2534353)
    Cell phones, PDAs, perhaps new tools for people with vision disabilities, where it could pick up plain text via IR near busy intersections or information kiosks. Text is small, so broadband wouldn't be required, since it's all converted in real time on a chip. Since it is supposed to be low-powered, it would be great for devices that don't need to be recharged often, like the pagers mentioned in the article.

    I wonder how lifelike the voice is though. I don't think any text-to-speech tools are going to become very mainstream until they sound better.
  • by CodePoet82 ( 177189 ) on Wednesday November 07, 2001 @04:54PM (#2534354) Homepage
    As the writeup said, this could be used in a cell phone to read what you were looking at, but wouldn't it be simpler, and backwards compatible, to just do text-to-speech synthesis on a remote computer? Every cell phone out there can already just transmit the sound from a remote location, and it wouldn't require any new/expensive chips.
  • would be something we can all be impressed with.
  • Great... (Score:2, Funny)

    by Anonymous Coward

    That's all I need, Stephen Hawking's voice coming at me from my cell phone:

    "Get Viagra Because You Care !! Watch the Famous Jennifer Lopez p0rn video Swing LOOOW, Sweet Chariot dash dash dash dash dash 8 eff ess arr...."

    Anonymous cowards love the rich meaty taste of spam.

    • Heh, I already have this daily on my cell phone using "Voicelink", a service provided by WorldCom (and my employer, Talk2 Technology). I routinely get emails like this:

      From: root
      Subject: cron kill `ps -ef | grep username | awk '{print $2}'`

      Just imagine how that sounds read back to you over your cell phone. It really beats having to lug a laptop with me just to check my email, but the kinds of email a sysadmin receives often don't translate well into spoken English. However, it's fun to hear this female voice try to get it right. One of these days, I've gotta get together with the programmers here and make sure these things get read right, like "kill back-tick pee-ess dash ee-eff pipe grep user-name"...
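One way to get a shell command "read right", as the poster above wishes, is to spell out the punctuation as words before the text reaches the synthesizer. This is only a minimal sketch: the symbol table and the function name `speakable` are invented for illustration, and nothing here reflects how Voicelink or any real service actually normalizes text.

```python
# Hypothetical table of spoken names for shell punctuation (illustrative choices).
SPOKEN_SYMBOLS = {
    "`": "back-tick",
    "|": "pipe",
    "-": "dash",
    "$": "dollar",
    "{": "open-brace",
    "}": "close-brace",
    "'": "quote",
}


def speakable(command: str) -> str:
    """Return `command` with punctuation replaced by spoken words."""
    words = []
    token = ""
    for ch in command:
        if ch in SPOKEN_SYMBOLS:
            # Flush the word collected so far, then say the symbol's name.
            if token:
                words.append(token)
                token = ""
            words.append(SPOKEN_SYMBOLS[ch])
        elif ch.isspace():
            if token:
                words.append(token)
                token = ""
        else:
            token += ch
    if token:
        words.append(token)
    return " ".join(words)


print(speakable("kill `ps -ef | grep username`"))
# kill back-tick ps dash ef pipe grep username back-tick
```

Spelling out short tokens letter by letter ("pee-ess", "ee-eff") would need a second pass that decides which tokens are words and which are abbreviations; that part is left out here.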
  • could i set it to a deep erotic female sounding voice and have it read dirty stories to me?
  • Can you imagine millions of geeks' cell phones with this chip? "First Post" echoing throughout the world...
  • Radio Shack used to sell these chips years ago. I once built an automated model rocket launcher that used the chip to announce the countdown - pretty damn slick, if you ask me. I believe the same chip was also used in the old TI-99/4A speech synthesizers (if anyone else remembers those).

    There's really nothing new about this product, except for its ability to speak Mandarin. And given the state of the Chinese economy, it's not very likely that many citizens over there will be in the market for talking electronic devices anytime soon. Most of them are still trying to get phone service and running water.

    -CT

    • Re:Nothing new (Score:5, Informative)

      by morcheeba ( 260908 ) on Wednesday November 07, 2001 @05:05PM (#2534426) Journal
      No, those chips (it was a pair) were power-hungry 5 volt parts made by General Instrument. One was a microcontroller (8051?) with the text-to-phoneme algorithm, and the other was the phoneme-to-audio processor (GI SP0256). Actually, the SP0256 could accept external ROMs for specialized words, so it could have spoken in any language you wanted.

      Check out quadravox [quadravox.com] for boards that emulate the SP0256, using ISD's analog flash memory and a microcontroller.

      (My misadventure with the old GI chip: -12 instead of +5, just for a split second. After that, it developed a stutter!)
    • Rat Shack used to sell a chip 'like' this, not these exact chips. The voice quality of these chips is much improved, and it has MANY MANY new features. Perhaps you should read the article and actually check out the specs of the chip before running it down, hmmm?
  • would be reading /. headlines. I mean, text-to-speech is great, but can it spell-check at the same time?
  • The chip could enable items such as a teddy bear that lulls a child to sleep by reading a bedtime story with the pre-programmed voice of Winnie the Pooh.

    Think of the applications for blow up dolls and pr0n!

  • I've tried festival on Linux, and its output is always really fuzzy and hard to understand. Do any of you know of any alternative programs that are more discernible in their delivery of voice? I would love to have my Linux box talk to me like one of those sexy iMac operators...
  • by StaticEngine ( 135635 ) on Wednesday November 07, 2001 @04:57PM (#2534381) Homepage
    Sure, the tech is cheap and relatively disposable, but is moving every feature but the kitchen sink into the cellphone really the way to do it? The phone can already send and transmit voice, so why not keep the text-to-speech synthesis at some central server where the systems can be maintained and upgraded, rather than having to support/manufacture/refurbish thousands of phones out in the field?

    The cellphone may have all the power of an original Palm Pilot these days, but we don't need to make it into an Onyx server.

    • by Anonymous Coward
      Sure, the tech is cheap and relatively disposable, but is moving every feature but the kitchen sink into the phone really the way to do it? Why put a radio transmitter and receiver in the headset-- if you do that, then you'll need a battery to power it. Stay with the corded phones, I say! And, if you have more than one person in a city talking on these newfangled radiophones, you'll need a computer to set the radio frequency! My Gremlin's 8 track player/AM radio doesn't need a computer to change channels -- it's got those big preset buttons to move the dial for me. The cell phone may have all the power of a TRS-80 these days, but we don't need to make it into an IBM-PCjr.

      p.s. and don't even get me started on digital phones... converting analog to digital to analog baseband to RF, and then back again!
    • by DdJ ( 10790 ) on Wednesday November 07, 2001 @05:42PM (#2534634) Homepage Journal

      Sure, the tech is cheap and relatively disposable, but is moving every feature but the kitchen sink into the cellphone really the way to do it?

      I've got a co-worker, our Oracle admin, who's blind. As things stand, with most cell phones he can't do anything except dial out and answer calls. He can't use the built-in address book to place calls for example, because all of the info is in text on a tiny screen. With text-to-speech software on the phone, he'd be able to use the address book just like sighted folks, read text messages he received earlier even when he's in an area with no coverage just like sighted folks, and so on. This is a good idea.
      • Wow! Your Oracle admin is blind? *im baffled*

        I have never worked with blind people, but after reading an article last year about how websites are getting more and more difficult for braille browsers (flash, imagelinks without alt tags etc.), I decided to make a lynx-friendly version of my site - and so should YOU!

        Anyways, how does he do it?? Is it worth it to the company you work for, or does it cause everyone else problems? Is he good? Tell! Hopefully this could encourage others to take on "disabled" in their company....
        • Wow! Your Oracle admin is blind? *im baffled*

          ...
          Anyways, how does he do it?? Is it worth it to the company you work for, or does it cause everyone else problems? Is he good? Tell! Hopefully this could encourage others to take on "disabled" in their company...

          He's got a variety of tools at his disposal. Just the other day, he gave a demo of some of them to a bunch of us.

          He's got an 8-dot braille terminal that gives him enough characters to do C and Perl programming. He's got a hardware speech synthesizer he cranks up to something like 200+ words per minute. I tried, and could only understand a few phrases when it was cranked up to 95 words per minute.

          And when a web site he needs or wants to access is inaccessible, he complains to them, and sometimes things get fixed. He can navigate web sites that use alt tags remarkably well. A good rule of thumb is that if a site makes sense with images turned off (or in lynx), then it'll work for him.
      • I've got a co-worker, our Oracle admin, who's blind. As things stand, with most cell phones he can't do anything except dial out and answer calls

        I hope he practices safe cell phone use and doesn't call out while he's driving.../humour

    • You have to remember that economics is what drives these things. If there are yuppies or geeks out there who want to have "every feature but the kitchen sink" in their cellphone, PDA, or whatever, there will be a company out there that will be happy to take their money to implement these technologies.
    • It's easier on bandwidth to just send a few hundred bytes of text than streaming audio.
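A back-of-the-envelope comparison for the bandwidth claim above. The 8,000 bytes per second figure is uncompressed landline-quality PCM (8 kHz sampling, 8 bits per sample, 64 kbit/s); cell phone codecs compress speech well below that, so treat the result as an upper bound. The speaking rate and bytes-per-word values are assumptions picked for illustration, not measurements.

```python
PCM_BYTES_PER_SECOND = 8000  # 8 kHz x 8 bits per sample, uncompressed
WORDS_PER_MINUTE = 150       # assumed speaking rate of the synthesizer
BYTES_PER_WORD = 6           # assumed average word length, including the space


def audio_bytes_for_text(text_bytes: int) -> int:
    """Estimate the PCM audio size needed to speak `text_bytes` of plain text."""
    words = text_bytes / BYTES_PER_WORD
    seconds = words * 60 / WORDS_PER_MINUTE
    return round(seconds * PCM_BYTES_PER_SECOND)


text_bytes = 300
audio = audio_bytes_for_text(text_bytes)
print(f"{text_bytes} bytes of text -> about {audio} bytes of audio "
      f"({audio / text_bytes:.0f}x)")
# 300 bytes of text -> about 160000 bytes of audio (533x)
```

Even with a codec that shrinks the audio by a factor of ten, the text is still far smaller under these assumptions.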
      • Well, not necessarily. The cell and PSTN networks are designed around carrying audio and that is still what they do best. Today, it's a toss-up as to whether it's better to approach text-to-speech from the back-end (where you can have more flexibility) servers, or by embedding pieces into phones which gives you a whole new set of problems and potentially great solutions.

        The problem is, the idea of using this tech in phones is fighting against hundreds and hundreds of millions of deployed telephones without any tech newer than perhaps a microchip for caller ID. Over the long-term, text-to-speech embedded in the device is the more efficient and user-controllable format. Over the short haul, though, we're going to see many years still of central-office-controlled voice apps on your phone.

        Niche applications, like on a Pocket PC, now there something like this would absolutely rock. Get a toehold, and eventually low-power text-to-speech and speech-to-text devices will be all the rage.

        Now if only someone would perfect a speech-to-text engine that didn't require hours of training to recognize my accent...
  • /me gets out his Speak n Spell.
  • Oh imagine them! To have one of these babies that understood more languages, and could translate them to one of the others. Need a real translator? Nooope, watch my lil PDA do it for me!

    Seriously, that could have tremendous business implications for those who are doing business in other countries.

    Their usage of EEPROM is nothing but ingenious, why hasn't anyone done this before? Or have they? It makes a lot more sense than a flash card, and it's cheaper too.
  • by MarkusQ ( 450076 ) on Wednesday November 07, 2001 @05:00PM (#2534401) Journal
    I bet it will choke on:

    The lead story read: "Unionized environmental health workers object to new chip that can read un-ionized lead levels."

    Reading English is a lot tougher than most English speaking people think.

    -- MarkusQ

    • I would assume that it could recognize and handle a dash appropriately.

      You're right though: rough, bough, cough, laughter, slaughter etc. might give it trouble (they certainly gave Ricky Ricardo a headache).

      It would have to do a lot more than simply translate the text to phonemes to be effective with English.
    • Which is why the other language is Chinese. I remember hearing years ago that Chinese is very well suited for voice recognition due to the fact that it is a tonal language with a total set of only a few hundred distinct sounds. Not sure if this is true just for Mandarin or also Cantonese and the others.

      LEXX
  • Any technology that can translate text to words is a good thing so the Blind people can have less of a hard time with technology which is mostly sight driven. But of course with my Really bad spelling it could drive people nuts. (Yea Yea Lern to spell and that will fix the problem) But I always want the feature to disable it no matter how low processing power it uses. Speaking is generally slower than reading. Plus there are times when your concentration doesn't need a computer speaking to you.
    • The blind have gotten a tremendous shaft from the WWW maintainers. Don't think so? Try surfing with images turned off! Most pages start off with a blizzard of adverts, banners and other junk that gets in the way of real content. Think about having to listen to the same 12 item banner for each new page that contains one or two images, no real links and something dumb for text like a copyright notice. Many government sites are well designed. Other sites like Slashdot can be customized but many have gone in the exact opposite direction. Using images for navigation might look nice, but it usually is not.

      M$'s Front Page is one of the worst offenders. It's full of useless font adjustments and other needless code. Worse, it labels images cryptically and encourages all of the worst practices.

      As Bill Gates once said, software is what is lacking in a world full of technology. He aims to keep it that way for those who trust him.

  • Does it sound like Kavita Maharaj?

    Because I swear, sexy though it is, her voice is synthesized.

    --Blair
  • Phenomes? (Score:4, Insightful)

    by Sodium Attack ( 194559 ) on Wednesday November 07, 2001 @05:04PM (#2534420)
    The low-power chip scans the text and translates it into spoken phenomes and outputs it to a filter for smooth analog sound, or can directly output the digital signal.

    But is it smart enough to pronounce the boldfaced word above as "phonemes"?

  • Didn't I see this first in "Wargames?"

    Incidentally, a guy I work with has a father who designs for Chrysler. He said that the big D-C was "really interested" in applications of text to speech. Think about it: ebooks that read themselves to you while you drive, driving directions and traffic info read to you rather than displayed on a screen (most nav screens require you to take your eyes entirely off the road and down the dash as much as 18 inches...eep!). You've got a much more useful interface, and with a low cost (though they'll charge you a grand, I'm sure), easy to interface chip, they'll have no excuse not to bring this much safer system for data interaction to my dash today, and not six years from now.
  • I can just see it now. You're sitting on a bus on your way to work, reading your e-mail when all of a sudden you hear: "ENLARGE YOUR PENIS NOW!" "XXX GIRLS INSIDE, CUM JOIN US" "COME INSIDE AND LICK MY..." well you get the idea. I hope you have headphones!
    • Uh.. who seriously would have their private e-mail read out loud in a bus?

      "As your accountant I need to inform you that..",
      "Here is your divorce settlement proposal..",
      "This is your doctor. Test results came in. You have..",
      etc..

      In comparison some x-rated junk mail might actually make some poor fellow's day..
  • Can you imagine getting your Hotmail spam read to you while your significant other is hanging around?


    Phone: "Are you looking for hot [chicks|sex|pussy|love]?"

    Wife: "um... what was that, honey?"

    Phone: "Get your University diploma!"

    Wife: "What, I'm not good enough the way I am?"

    Phone: "Get out of debt now!"

    Wife: "Okay, you know what? That's your birthday present on the Credit card, bucko. That's it. I'm leaving..."

  • Right now there are companies exclusively in the business of doing this. There is very little mathematical challenge here. Psychologists and linguists have researched phonemes and related material pretty well.

    But there are some implementation issues here. Example, if you have GNU. How do you say it? What about if you have Jekka Pukka Sarasate? If you were to take the literal English pronunciation you might never even be able to understand what it's trying to say. Figuring out how to solve that is an interesting CS problem.

    But this is a cool invention. Low power wireless research is just taking off. Before we were trying to figure out how to just transmit wireless well. Now we can have fun with it. I truly look forward to a wireless life :)

    Me..
  • Yeah, but you know the power consumption would be a hell of a lot higher on the chip that everybody would really want anyways: the Text-to-Barry-White-Speech chip.

    "You've got mail, baby."
  • Will it translate into broken chinenglish? "Me want pork fried rice. Chop chop."
  • My old Amiga had decent text to speech processing in 1984.

    With all the horsepower available in any modern handheld device -- surely much more than an 8 MHz 68000 with 512K of memory (of which only a fraction was used I'm sure) -- I don't understand why a dedicated chip would be needed to pull this off.

    • Sorry, I was off by 2 years.
  • We all know how annoying it can be when other people have their cell phones ring in public places... the last thing we need is people listening to a monotonous computer voice in public. Not to mention the fact that it's usually much more convenient to read text, which allows for skimming and variable speeds.
  • Why is this a big deal? I was doing text to speech on my Commodore 64 when I was a kid with a program called SAM (which was written in 1979). The C64 had what...1MHz?
  • "We are looking at devices that don't necessarily have a really powerful processor on board," said Hezi Saar, product marketing manager at Winbond. "Usually most of the accessories for handheld devices don't have the power to run text-to-speech algorithms and they don't have the huge memory capacity to support this feature."

    OK, so just imagine that in the near future anything and everything will have one of these small, low cost chips. Now, imagine the possibilities! Everyone I'm sure has their own ideas on how cool this could be, so go ahead and reply with yours!
    • 1) My microwave at home displays "ENJOY YOUR MEAL" when it's finished cooking something, I'd sure love it if instead of cheesy LED's I heard a sexy voice saying "come and get it, baby."

      2) Text messengers for blind people. You know those little IM devices all the kiddies have? Well just put braille on the keys and have one of these chips installed... there you go.

      3) Watches. The next time somebody says "what time is it?" you just press a button and the voice chip in your watch simulating someone who sounds extremely pissed off shouts the time.

      Well, that's it for now...
  • They used speech parts to make up words in the same way. (Hope this sounds better though)
  • I remember one of my coworkers had gotten a service where you could email his account, which would forward to a voice mail text -> speech system.

    It was hilarious sending him obscene and or ridiculous emails and listening to the recorded voice play them back ...
  • Well, I jumped ship to this little company I work for now called Talk2 Technology (free plug, I guess). We've taken a different tack in voice-enabling applications. I think there are different target markets -- the Talk2 stuff uses servers on the back-end, which go out and fetch your email to read it to you. Putting this on-chip in the cell phone itself is a great step in the right direction.

    Fundamentally it's a different approach than today's "voice portal" technology. Voice Portals retrieve data for you, and read it over standard cell or PSTN network. There are many benefits to this approach, principal among them being improved processing power for additional functionality such as voice-processing (speech to text, or compressing speech for reply email voice attachment). By putting the power into the phone, instead of at an expensive central office, this chip could either be a great advancement for text-to-speech technology, or a "killer app" that puts my company out of business :)

    Regardless, I'm excited to see this happening. I've long envisioned a PDA with the only interface being spoken, rather than requiring any video component. This would bring the power consumption and delicacy of these devices down within reason for extended usage. The downside is that speech is necessarily a rather slow interface to a machine; it will be interesting to see how we adapt speech for greater speed with speech-based devices, and how English as a whole will fare.

    Now that I've used voice-enabled email, it would be really hard to go back to the "old" way. I still do an enormous amount of correspondence every day by typing, but when I'm on the road I don't need to bother with a laptop since I can have my email read to me over the phone *and reply* with a voice message via email. Until you've used it, it's tough to realize how convenient it is.

    I want one of these for my Agenda VR3! Or something...
  • I remember my first computer - a TI-99/4A - had a box I plugged into the side that generated speech. It didn't sound all that good, but you could recognize it well enough. If I remember correctly, it cost about $100.

    That was... 21 years ago. It's sad that this aspect of human computer interaction has been overlooked for so long. It's nice to finally see some development.
  • by gbrandt ( 113294 ) on Wednesday November 07, 2001 @05:30PM (#2534561)
    If you feel that you have to state 'not the orange' when using the word Mandarin in a language context, perhaps you should also state 'not the people of England' when using the word English in the same context.
  • Another female voice calling my cell phone and telling me I'm offtopic...
  • I haven't seen anyone post a link to Winbond's own web page on the WTS701 Text-to-Speech Processor so, here it is straight from the mouth:

    Winbond [winbond.com]
  • by RobertGraham ( 28990 ) on Wednesday November 07, 2001 @05:41PM (#2534623) Homepage
    I guess I'm a little skeptical of all technology that attempts to supply "old" paradigms to new problems.

    The most important thing about the Internet is "bandwidth". I'm not talking bits on the wire, I'm talking how fast information flows into my brain. Speech is vastly slower than text as a medium for transferring information into my brain. I'm so accustomed to Internet speeds for information, I can no longer watch TV news -- the bandwidth is too slow. I'm glad I don't go to school anymore -- I could barely stand lectures when I was a kid, I would never be able to sit through them as an adult.

    Five years ago everyone in Japan walked around with their phone to their ears. These days, everyone in Japan walks around looking at their phone (instant messaging, etc.). I'm not sure if people "get" the bandwidth problem. Sound must be multiplexed into half-duplex, serialized communication. By this I mean you can only input or output at any one time, but not both. Also, incoming messages must arrive separately, not in parallel. With audio, I can only talk to one person at a time; with messaging, I can carry on multiple text-based conversations simultaneously. I mean, text-to-voice has long been available on PCs, but nobody uses it for ICQ/AIM/YahooIM/MSIM.

    As far as I can tell, audio is dead. Maybe somebody will invent some sort of hyperfast language (didn't Heinlein describe something like that in a book?), but I think the next wave is going to be something new that replaces reading text, not something that goes backwards to audio.

    • While reading your comment, something occurred to me: you're right, when using IM software, I can carry on multiple conversations, i.e. I can get information from multiple sources at once, BECAUSE I can read faster than someone can type. However, might that be because just about everyone can't type as fast as they can talk? So, those multiple conversations... are you really getting more bandwidth, or is it just coming in bursts whenever someone hits the 'send' button? And what about the cost of context switching? I think that overall there is some gain in total bandwidth, but that each individual conversation takes longer.

      Consider this: have you ever been IMing w/ several people, but then called one of them because it was _really_ important to get that conversation done _fast_? I have, and it makes it much harder to try to keep all the _other_ conversations going.

      Fast speech-to-text could give you the best of both worlds, but that's, of course, still a long way off.
  • by twms2h ( 473383 ) on Wednesday November 07, 2001 @05:42PM (#2534629) Homepage
    Great achievement, my Commodore C64 could do that so many years ago that I don't even remember when it was. SAM, the speech synthesizer which could even "sing".

    Has anything new happened lately? ;-)
    • Great achievement, my Commodore C64 could do that so many years ago that I don't even remember when it was. SAM, the speech synthesizer which could even "sing".

      ::nods:: I remember programming my TI-99/4A to read me a menu of games whenever I started it up. Then there was that text-to-speech program for my 386 that came with my soundblaster that enabled me to make my computer announce that it was booted up and ready for his l33tness Velex himself to use dos. Just a few months ago, my roommate downloaded this monkey called Bonzi that talked to him, but Bonzi got annoying so my roommate shot him.

      I'm sure that there have been tons of text-to-speech programs that I've never heard of, nor will I ever, because it's been done so many times before, and the AI required to get the computer to talk in an un-Vice Fearless Leader #42 fashion is beyond the grasp of even the most 31337 at the moment. What I would really like for my mobile phone is rudimentary speech recognition.

      As it's been pointed out before, text is just simply faster than speech, and who knows, maybe twenty years down the road we'll all carry around little AIM or ICQ devices. What I'd really like to do, though, is skip the miniature keyboards or fumbling on a keypad. Speech recognition is the way to go. I'd much rather tell my phone, "call pink" and have it call pink back at the hotel, than have it announcing all my spam to the world.

      No, unfortunately nothing new's happened lately

  • by Znork ( 31774 ) on Wednesday November 07, 2001 @05:46PM (#2534644)
    ...and everything gets slower. I read between 2-20 times faster than I can comprehend spoken language, depending on the junkfiltering that's possible.

    No way in hell do I want to read email on a cell phone (it's a PHONE. You _talk_ to people in it. If it was a generic mail reader it would have at least a 17 inch monitor and a keyboard that lets you type faster than .2 cps. I know this is a difficult concept to grasp for certain cell phone companies, but a phone, as opposed to a computer, does not have these things, and thus it _sucks_ for email and browsing, and will continue to do so until it has those things, at which point in time you will not want to carry it around because it ain't gonna fit in your pocket anymore.). Nor do I want to listen to my email. I don't have the time or the patience for it.

    At least until the phone can give me an (intelligent) summary when I say 'Get to the point'.
  • Texas Instruments used to have some of the best Speech Synthesis chips out there...I remember the TI/99 computer had a speech module, and one skiing game got both male and female realistic sounding voices out of the speech module. If they could do it in the early 1980's why can't they do it now?

    ttyl
    Farrell
    • From what I remember about the early speech synthesis systems, they cheated. The input to the synthesis module was a stream of phonemes, not normal text. The programmer had to translate the text into phonemes.
      • Yes, the programmer had to translate the text into phonemes, but you know what? That wasn't the hard part, because with a decent translation dictionary, you can get about 95% of the words right with a simple one pass "phoneme compiler" and a good set of rules. The hard part is that English is a highly inflected language without a constant set of rules for doing the inflections.

        Still, I was part of the team that made the first Apple II (at least in the State I lived in at the time) that could read from the screen back in 1981 -- to an "Echo II Speech Synthesizer" which IIRC came from Radio Shack.

        We took some of our stuff to the linguistics department at the University across town, and of all things, had the darn machine speaking understandable Japanese (from Romaji, or romanized letters) within a few days because the Japanese language is consistent not only in phonetic translation but also in inflection. It still sounded like a machine, but that was a limitation of the sound chip's internal phoneme library in the Echo II. The same program with one of today's chips would have sounded very near normal.

        Goes to show you how much more difficult spoken English is than most of us native speakers tend to realize, because I have yet to see a low cost implementation of a text to speech translator that was all that much better than what we were doing back in '81. (not that I have seen everything out there by the way -- I do have a life outside the PC world....occasionally :-)
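The dictionary-plus-rules "phoneme compiler" described in this sub-thread can be sketched in a few lines. Everything below is a made-up illustration: the exception dictionary, the letter-to-sound rules, and the ARPAbet-style phoneme labels are assumptions, not the Echo II's or any real chip's tables.

```python
# Hypothetical exception dictionary for words the letter rules get wrong.
EXCEPTIONS = {
    "cough": ["K", "AO", "F"],
    "bough": ["B", "AW"],
}

# Hypothetical letter-to-sound rules; multi-letter patterns come first so
# that "sh" is tried before "s".
RULES = [
    ("sh", ["SH"]), ("ch", ["CH"]), ("th", ["TH"]),
    ("a", ["AE"]), ("b", ["B"]), ("c", ["K"]), ("d", ["D"]),
    ("e", ["EH"]), ("f", ["F"]), ("g", ["G"]), ("h", ["HH"]),
    ("i", ["IH"]), ("k", ["K"]), ("l", ["L"]), ("m", ["M"]),
    ("n", ["N"]), ("o", ["AA"]), ("p", ["P"]), ("r", ["R"]),
    ("s", ["S"]), ("t", ["T"]), ("u", ["AH"]),
]


def to_phonemes(word: str) -> list[str]:
    """One-pass conversion: dictionary lookup first, letter rules otherwise."""
    word = word.lower()
    if word in EXCEPTIONS:
        return list(EXCEPTIONS[word])
    phonemes = []
    i = 0
    while i < len(word):
        for pattern, sounds in RULES:
            if word.startswith(pattern, i):
                phonemes.extend(sounds)
                i += len(pattern)
                break
        else:
            i += 1  # no rule for this character: skip it
    return phonemes


print(to_phonemes("ship"))   # ['SH', 'IH', 'P']
print(to_phonemes("cough"))  # ['K', 'AO', 'F']
```

Removing "bough" from the dictionary makes the rules spell it out letter by letter, which is the irregular-spelling problem the English examples in this discussion point at. Stress and intonation are not modeled at all.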

  • It would be interesting if this could be used for language translation. Turn on Spanish mode (when they get it) and type the Spanish words you see into your phone and voilà.. it tells you what it means in English. Sort of like a speaking babelfish if you will.

    Now all we need are really good speech to text converters....

  • Unlike everybody who posted "big deal, my Commodore 64 used to hold long, sexy conversations with my Speak & Spell about the meaning of Wargames," I actually read the article. Near the end it says "The multilevel storage memory system allows the chip to store up to 256 different voltage levels, or the equivalent of 8 bits, into one EEPROM cell, which is up to 8x the capacity of conventional memories..."


    Being a software geek with my last classes in EE/CE several years safely in my sordid past, I'm out of touch. Is this a big deal?
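The arithmetic behind the quoted claim is short: a cell that can hold one of N distinguishable voltage levels stores log2(N) bits. The function name below is invented for illustration, and the "conventional" cell is assumed to be a plain two-level (one-bit) cell.

```python
import math


def bits_per_cell(levels: int) -> float:
    """Bits one memory cell can hold if it distinguishes `levels` states."""
    return math.log2(levels)


multilevel = bits_per_cell(256)   # the multilevel cell from the article
conventional = bits_per_cell(2)   # assumed: a cell storing a single 0 or 1

print(multilevel)                 # 8.0
print(multilevel / conventional)  # 8.0
```

The price is that 256 levels have to fit in the same voltage range that otherwise holds two, so the gap between adjacent levels is much smaller.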

  • My Macintosh SE (8MHz 68000, circa 1988) can convert text to speech no problem. It's not necessarily smooth and natural, but it can't be *that* much of a jump...
    • Yeah, macs have great text->speech.. even the old ones. Although, the older ones had it on a chip.. whereas apple has since moved it to software.

      For obsolete machines, they still pack a punch as a server for text->speech conversion. :)
  • The General Instrument SP0256 chipset?
    The '256 took coded phonemes and output audio,
    while the other chip in the set (don't remember the name) took ASCII serial data and
    converted it to phoneme codes the '256 could understand.

    This set has been around for prolly close to 20 years now. (I remember finding a variant of it in
    the Intellivision voice module ["Bee Sevunteen Bahlllllmer"!] that I believe was circa 1984.)
    The '256 has been discontinued for a long time now, and I'm kinda excited to see
    something similar to it show up, it was a cool gadget.

    C-X C-S
  • Hey (Score:2, Interesting)

    by Infosquawk ( 131022 )
    I like to convert text to mp3s for long journeys so I can listen to Dickens on my Rio. Of course, that takes a lot of disk space. I'd much prefer a little handheld device that simply converts the .txt file which is much smaller, to speech.

    I'd pay for it, and I bet a bunch of other people would too.
  • Such a device will be very handy for people that have visual impairments. Instead of the current bulky and expensive kits, this will be an improvement, especially for VI users out-and-about.

    What can you do? Make your web pages accessible [w3.org] for a start.

  • Specialized chips for TTS applications have
    been around for a while... The problem with
    their acceptance is that they have poor voice
    quality. Actually, there are two quite different
    technologies available to produce speech nowadays:
    1. Diphone synthesis and its variations. The idea
    is to have one sample of each sound combination
    (diphone) in a speechbase and produce the actual
    speech by manipulating those sounds. This is what
    gives the computer-synthesized, somewhat metallic speech
    that most people have already heard somewhere, and
    this is what is actually used in low-powered devices,
    handhelds and speaking dictionaries.

    2. Corpus-based synthesis. The idea is to store
    a few hours of the speech of a highly trained
    speaker in the speechbase and select the fragments
    of this speech that best suit the generation.
    The second approach gives astonishing results, with
    the quality of the speech sometimes being
    indistinguishable from a human's. However, the size
    of the speechbase is an issue. You cannot fit a
    300Mb speechbase onto a handheld device yet,
    and hardware optimizations don't help much when
    it comes to fetching data from the speechbase
    and performing text-to-phoneme conversion.

    Several companies have corpus-based synthesis
    demos online. Check out SpeechWorks' and
    Lernout & Hauspie's sites.
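To make the diphone idea above concrete, here is a small sketch that turns a phoneme sequence into the list of diphone units a synthesizer would have to fetch from its speechbase. The phoneme labels, the silence marker, and the function name are assumptions for illustration; real systems also adjust pitch and duration when joining the units, which is not shown.

```python
def diphones(phonemes: list[str], silence: str = "_") -> list[str]:
    """Return the adjacent-pair units covering `phonemes`, padded with silence."""
    padded = [silence, *phonemes, silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


print(diphones(["HH", "AH", "L", "OW"]))
# ['_-HH', 'HH-AH', 'AH-L', 'L-OW', 'OW-_']
```

With an inventory of roughly 40 phonemes (an assumed round figure), there are at most about 1,600 such pairs, one short sample each, which is why this approach fits small devices while a corpus of several hours of recorded speech does not.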
