
 



Communications Government Input Devices Privacy Your Rights Online

How the NSA Converts Spoken Words Into Searchable Text 164

Presto Vivace writes: Dan Froomkin reports at The Intercept: "Though perfect transcription of natural conversation apparently remains the Intelligence Community's 'holy grail,' the Snowden documents describe extensive use of keyword searching as well as computer programs designed to analyze and 'extract' the content of voice conversations, and even use sophisticated algorithms to flag conversations of interest." I am torn between admiration of the technical brilliance of building software like this and horror as to how it is being used. It can't just be my brother and me who like to salt all phone conversations with interesting keywords.
This discussion has been archived. No new comments can be posted.


  • by ledow ( 319597 ) on Tuesday May 05, 2015 @11:15AM (#49621179) Homepage

    I can't even get a device - of any power - to recognise my voice beyond the very slow, pronounced basics and I have to train myself to it (not the other way around).

    Would love to know how the NSA have access to technology that the top voice-recognition specialists and software can't manage, let alone dealing with noisy backgrounds, masked keywords, variety of languages, etc.

    "Acres of datacentres" don't help for the simplest of obscurations in the phone call and guess who has a reason to mask their intentions behind innocent words? Terrorists.

    • So siwie just won't wecomend a westoowant for you?

    • I call BS. I can't even get a device - of any power - to recognise my voice beyond the very slow, pronounced basics and I have to train myself to it (not the other way around).

      Sorry to break it to you, but you're wrong.

      For one, what you can't do, and what today's computers and networks certainly can - after being configured and programmed accordingly - is sample bazillions of phone calls from millions and millions of people at insane speeds and aggregate speech patterns and their written equivalent by searchin

    • See, you're thinking they need to perfect the technology for it to be useful, because imperfect technology is a pain in the ass for users of voice commands. But they don't. It's a different use case. Any amount of successful Speech-to-Text processing for archiving and searching is more effective than zero. They obviously would want to raise this as high as possible to avoid missing information, but they don't need perfection either. Even a 50% rate of transcription would yield a staggering amount of da

  • So they use speech to text software, and that's technical brilliance? There's half a dozen exceedingly good ones used in various fields of medicine that are able to handle many different languages and thickly accented English.

    • by ihtoit ( 3393327 )

      Dragon is brilliant. I have the Premium version that has all the bespoke dictionaries. Yes, it's a total hog but there again I do use it to transcribe conferences. At which it is VERY good.

  • "Hey mom, I just named my new dog. I call him President Obama's Secret Isis Terror Gun Bomb. Let's talk about him a lot over the phone using his full name."
  • All of the voice recognition algorithms I've ever tried are terrible at recognizing anything said with an accent other than the one it was programmed to recognize. I have a Brazilian accent AND a speech impediment. I have never been able to use any voice recognition system in any language.
  • by Anonymous Coward

    After seeing Citizenfour, it seems to me habeas corpus is worth jack squat, and the NSA knows no bounds or limits and hasn't even the faintest hint of human decency.
    Just look at what they did to that Lavabit guy: install a backdoor or give us your customers' data, or you will be tossed in some remote jail.
    He chose to just go out of business. What a bunch of fucking assholes.

    • by ihtoit ( 3393327 )

      Habeas corpus is suspended in time of war.

      There is a war on drugs.
      There is a war on terror/ism.
      There is a war on ISI/L/S.

      Three declared wars, plus oodles of undeclared police actions and incursions into sovereign territories, and you think habeas corpus has ANY teeth??

  • And all the voice samples from "OK Google" and Siri are theirs to sort through.
    After turning on the personal voice sampling (to make OK Google more accurate) the tablet never misses a beat; it knows exactly what I'm saying whether I just woke up, am in the middle of eating, or have a cold.

    I also left an older version of the Nexus 7 tablet (first gen) off for several months; upon turning it on I was surprised to see it was "ready to install" a system update, one that I never approved downloading in the first p

  • It's not like the NSA can actually DO anything with this shit.

    We know the government would like some positive press about anticipating events, but a perfect opportunity arose recently when two men opened fire outside a contest for Prophet Mohammed cartoons in a Dallas suburb Sunday night.

    Reminds me of Dionne Warwick, who hawked the Psychic Friends Network. She ran out of money. Why didn't her friends warn her?

  • by Anonymous Coward

    Perhaps it's time to change our standard telephone salutations to include the keywords they are looking for. :P
    You can be sure they're not flagging "hello" or "goodbye". I propose all phone calls now be answered by saying "Death to the president.... Yes, this is Steve. No, I am not interested in changing my long distance provider."

  • Comment removed based on user account deletion
  • by mbone ( 558574 ) on Tuesday May 05, 2015 @11:45AM (#49621425)

    If you want to search an audio or video recording, even a fairly poor speech to text can be very useful. A 90% success rate (1 word in 10 being incorrect) would provide a very frustrating transcript if you wanted to read it. However, if you are looking for a certain set of keywords or phrases, then 90% is likely to be perfectly adequate - after all, the point is to select "conversations of interest" that can then be listened to more intently.
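
    To put rough numbers on that argument, here is a minimal sketch. It assumes transcription errors are independent from word to word (real recognizers make correlated errors, so treat the figures as optimistic), and the function name and parameters are hypothetical, chosen only for illustration.

```python
def hit_probability(word_accuracy: float, mentions: int, phrase_len: int = 1) -> float:
    """Chance that at least one of `mentions` occurrences of a keyword or
    phrase is transcribed intact, assuming independent per-word errors."""
    if not 0.0 <= word_accuracy <= 1.0:
        raise ValueError("word_accuracy must be between 0 and 1")
    if mentions < 0 or phrase_len < 1:
        raise ValueError("mentions must be >= 0 and phrase_len >= 1")
    # A phrase survives only if every one of its words is recognized.
    p_phrase = word_accuracy ** phrase_len
    # The search misses only if every single occurrence is garbled.
    return 1.0 - (1.0 - p_phrase) ** mentions


if __name__ == "__main__":
    for accuracy in (0.9, 0.5):
        for mentions in (1, 3, 4):
            p = hit_probability(accuracy, mentions)
            print(f"accuracy={accuracy:.0%} mentions={mentions} -> {p:.2%}")
```

    Under these assumptions, a keyword spoken three times in a call transcribed at 90% word accuracy is caught with probability 1 - 0.1^3 = 99.9%, and even a 50% transcription rate catches a keyword spoken four times 93.75% of the time. Repetition within a conversation is what makes a mediocre transcript good enough for selection.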

    • by Greyfox ( 87712 )
      It's really not that much of a problem. You just replace any words you don't recognize with "Jihad!" Like "Hey Alex! You want to go to jihad later and get some burgers?" Clearly there's some sort of terrorist activity going on there which justifies an increase to the budget next jihad!
  • Comment removed based on user account deletion
    • by Anonymous Coward

      I'VE BEEN WAITING YEARS FOR THIS TO LEAK OUT!

      I hoped that metadata would purposely be picked as the 1st phase so people realize how bad that is. But I knew they were listening to all our calls for a really long time now. They have special DSPs that handle many phone conversations at a time and transcribe them into text; the hardware and tech was declassified almost a decade ago, and while it may not have been Siri grade, it never needed to be (plus they don't declassify until they have something much better to repl

  • The bomb (Score:5, Funny)

    by ichthus ( 72442 ) on Tuesday May 05, 2015 @11:51AM (#49621467) Homepage
    This technology is the bomb! But, I will provide a colloquialism, à la Admiral Ackbar: "Take evasive action!" Incinerate any predisposition you may have to using keywords, like: bomb, infidel, jihad, Great Satan, etc. Instead, peace be upon you, and all your phone conversations.
    • I remember when ECHELON was the big 5-eyes project that everyone was up in arms about, and someone circulated a list of key words they were supposedly flagging on, so everyone started using those in phone calls / email / web sites / etc. Eventually we discovered ECHELON wasn't as capable as thought, and was much more focused... until it got replaced by the current system.

  • by SethJohnson ( 112166 ) on Tuesday May 05, 2015 @12:06PM (#49621639) Homepage Journal
    In all likelihood, the false positives suggested by the OP and others in this discussion are unlikely to trigger any such NSA attention.

    Coming from a data science background, I suspect they are transcribing and indexing all conversations as best as is possible with their elite voice recognition technology. Once it's in ASCII stored in a database, they can datamine the conversations of known radicals and jihadists. The algorithms that are generated don't so much emphasize specific keywords, but they generate a scoring system across a bunch of conversations by known haters-of-American-Freedom.

    With filters in hand, they can look at who talked to the known villains and score them and run down the trails of phone calls, emails, text messages, and internet chats to see who else might be a solid villain candidate. Even just monitoring internet traffic to known jihadist websites can likely get the filters applied to a person's communications to see if they might be a person-of-interest.

    Keywords will come into play AFTER an attack like the Garland Draw Mohammed contest. The NSA is right now filtering recent past conversations among suspected jihadists looking for relevant keywords such as 'Garland', 'American Freedom Defense Initiative', 'Pamela Geller', and 'Elton Simpson'. Any conversation leading up to the attack including those keywords would absolutely put someone on a watchlist. And everyone who that person is talking to would be suspect as well.

    Bottom line is, these tools are being used retroactively to bolster detective work. Talking about bombs and the President's name doesn't do anything because there are a thousand-million conversations using those words every day.
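
    The scoring idea in this comment can be sketched as a per-word log-likelihood ratio between transcripts already flagged and ordinary background transcripts. This is only an illustration of the general technique: the toy transcripts, the function names `train` and `score`, and the add-one smoothing are all assumptions, and nothing here is taken from the documents the story describes.

```python
import math
from collections import Counter


def train(flagged: list[str], background: list[str]) -> dict[str, float]:
    """Return a weight per word: log P(word | flagged) - log P(word | background),
    using add-one smoothing so unseen words do not produce log(0)."""
    f = Counter(w for doc in flagged for w in doc.lower().split())
    b = Counter(w for doc in background for w in doc.lower().split())
    vocab = set(f) | set(b)
    f_total = sum(f.values()) + len(vocab)
    b_total = sum(b.values()) + len(vocab)
    return {
        w: math.log((f[w] + 1) / f_total) - math.log((b[w] + 1) / b_total)
        for w in vocab
    }


def score(transcript: str, weights: dict[str, float]) -> float:
    """Sum the weights of known words; words never seen in training add nothing."""
    return sum(weights.get(w, 0.0) for w in transcript.lower().split())


# Hypothetical toy data, purely for illustration.
flagged = ["meet at the depot tonight", "bring the package to the depot"]
background = [
    "meet for lunch tonight",
    "the package arrived by mail",
    "bomb of a movie tonight",
]
weights = train(flagged, background)

if __name__ == "__main__":
    for text in ("package at the depot", "that movie was a bomb"):
        print(f"{score(text, weights):+.2f}  {text}")
```

    In this toy setup the word "bomb" ends up with a negative weight because it appears only in everyday chatter, while an unremarkable word like "depot" scores high because it is over-represented in the flagged set. That is the sense in which salting calls with scary keywords would do little against a model trained this way.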
    • Your argument would be compelling if not for the fact that one doesn't need this technology to build historical cases or networks. Investigators are perfectly capable of using forensics to find such connections after the fact. Of course such databases will be used retroactively, to the extent possible, but the stated goal of the intelligence community is to prevent attacks before they happen, not to pick up the pieces afterwards. See, for example, http://www.pbs.org/newshour/bb... [pbs.org]

      • Maybe I wasn't clear about how these tools help ferret out networks of freedom-haters. This line could have been more prominently stated--

        ...to see who else might be a solid villain candidate. Even just monitoring internet traffic to known jihadist websites can likely get the filters applied to a person's communications to see if they might be a person-of-interest.

        That type of work is more than forensics. It's proactively chasing up the networks to make their leadership accountable. Those are vague terms

  • Comment removed based on user account deletion
  • The problem with gadget security is that it will always let you down, which is why mass surveillance is counterproductive. The larger the dataset, the harder it is to extract any useful information. When you're trying to process billions and billions of records, gadget security is your only option. It's a huge waste of effort and, as the Boston Marathon Bombers and those dead idiots in Texas proved, it's still relatively easy to slip through.

    Terrorists are smart enough not to speak in plain language, so I don't

  • There is a story (I've heard it from several sources over the years but I won't vouch for its veracity) about an early translation program that the US military commissioned, sometime in the sixties I think, that illustrates some of the problems. This program was meant to translate English to Russian and vice versa. At the demo all the higher-ups were there, typically not having a clue about the complexity or pitfalls of the task. One of them suggested that the phrase "The spirit is willing but the flesh is
  • Don't they keep insisting that they only collect metadata and not actual conversations? If they collect specific conversations with specific targeted people involved in actual crime, couldn't they just deal with those manually?

  • Anyone with a minimal level of training knows this, and uses methods that our intercepts won't catch.

    We only catch the n00bZ.

    And, in point of fact, the times we get people to give away things, they're not in the US, but in the Middle East (Saudi Arabia, Yemen, Pakistan mostly).

    Intercepts in the US rarely catch anything useful, and have such a high level of red herrings we waste a lot of resources that would be otherwise used profitably overseas, not in the US itself.

  • Think about this for a second. Why is this surprising?

    I don't know about other people here, but I don't even check my voicemail anymore. Google handles that, and has for years. The voicemail transcription I get through Google Voice is almost always good enough that I can determine who called, what they want, and where to call them back to talk further.

    Keep in mind, this is a 'free' service to me, I don't pay anything. Due to the volume of people they do it for, I'm certain they're trying to meet econom

  • ... for starters.

    Yes, it's not a solution to the problem but it's a start.

  • I am torn between admiration of the technical brilliance of building software like this and horror as to how it is being used.

    The technical brilliance of voice recognition combined with data mining need not be met with horror... All the horror can be reserved for the separate issue of mass surveillance.

  • Dialling random numbers from a public phone and saying "Is it done?", or "Man, you gotta help me, I did it but there's blood and brains everywhere!"

    For the win.
