
Obama Administration Places $200 Million Bet On Big Data

Posted by samzenpus
from the always-bet-on-greenbacks dept.
wiredmikey writes "As the Federal Government aims to make use of the massive volume of digital data being generated on a daily basis, the Obama Administration today announced a 'Big Data Research and Development Initiative' backed by more than $200 million in commitments to start. Through the new Big Data initiative and associated monetary investments, the Obama Administration promises to greatly improve the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data. Interestingly, as part of a number of government announcements on big data today, the National Institutes of Health announced that the world's largest set of data on human genetic variation, produced by the international 1000 Genomes Project (at 200 terabytes so far), is now freely available on the Amazon Web Services (AWS) cloud. Additionally, the Department of Defense (DoD) said it would invest approximately $250 million annually across the Military Departments in a series of programs. 'We also want to challenge industry, research universities, and non-profits to join with the Administration to make the most of the opportunities created by Big Data,' Tom Kalil, Deputy Director for Policy at OSTP, noted in a blog post. 'Clearly, the government can't do this on its own. We need what the President calls an 'all hands on deck' effort.'"
This discussion has been archived. No new comments can be posted.


Comments Filter:

  • by macwhizkid (864124) on Thursday March 29, 2012 @07:39PM (#39517079)

I'm a hard science/computer science guy whose livelihood is working on various NIH/NSF projects. A common thread talking to other scientists over the past few years has been that the tools for data analysis have not kept pace with the tools for data acquisition. Companies like National Instruments sell sub-$1000 USB DAQ boards with resolution and bandwidth that would make a scientist from the early 1990s weep for joy. But most data analysis is done the same way it's been done since that same era: with a desktop application working with discrete files, and maybe some ad-hoc scripts. (Only now the scripts are Python instead of C...)

The funny thing is, most researchers haven't yet wrapped their brains around the notion of offloading data onto cloud computing solutions like Amazon AWS. I was at an AWS presentation a couple months ago, and the university's office of research gave an intro talking about their new supercomputer with its 2,000 cores, only to get upstaged 10 minutes later when the Amazon guys introduced their 17,000-core virtual supercomputer (#42 on the Top500 list, IIRC). There's a lot of untapped potential right now for using that infrastructure to crunch big data.

  • Re:Privacy? (Score:3, Interesting)

    by Sarten-X (1102295) on Thursday March 29, 2012 @07:46PM (#39517131) Homepage

    That's an absolutely unfounded concern.

I worked at a Big Data company. About 90% of my job was improving privacy while maintaining the integrity of medical data. Patients' zip codes were reduced to three digits. Any references to states were removed and forgotten (because some zip codes cross state lines). Any names were removed, as were any user-entered comments (doctor's notes, etc.) that might possibly contain personal information. Any personal information that was necessary for the system but might be identifiable was salted and hashed twice before it ever left the source (hospital, insurance provider, etc.).
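    The "salted and hashed twice" step might look roughly like this sketch (a minimal illustration only, not the company's actual scheme; `SITE_SALT` and `deidentify` are hypothetical names, and a real deployment would manage the secret salt far more carefully):

    ```python
    import hashlib
    import hmac

    # Hypothetical per-source secret, kept at the hospital/insurer so the
    # resulting token can't be reversed downstream by the Big Data provider.
    SITE_SALT = b"example-secret-salt"  # assumption: illustrative value only

    def deidentify(value: str) -> str:
        """Salt and hash an identifier twice before it leaves the source."""
        # First pass: keyed hash (HMAC-SHA256) with the source's secret salt.
        first = hmac.new(SITE_SALT, value.encode("utf-8"), hashlib.sha256).digest()
        # Second pass: plain SHA-256 over the first digest.
        return hashlib.sha256(first).hexdigest()

    token = deidentify("patient-12345")
    ```

    The same input always maps to the same token, so records can still be joined across datasets, but nothing identifiable leaves the source.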

    That in itself isn't good enough for privacy, so we then used some proprietary methods (that was kinda outside my job, so I don't know much about them) to intentionally screw up the data we provided to our users. A user could find out, for example, that between one and fifty people in the vicinity of Denver had a particular medical condition on a particular date, and received a particular drug. Narrow down results more than that, and my company's system simply wouldn't fulfill the request.
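    That kind of range-reporting can be sketched like this, in the spirit of the Denver example (the company's actual methods are proprietary and unknown; `report_count` and the 50-person bucket width are assumptions for illustration):

    ```python
    K_THRESHOLD = 50  # assumption: minimum bucket width the system will report

    def report_count(true_count: int, threshold: int = K_THRESHOLD):
        """Return a coarse (low, high) range instead of an exact count,
        refusing to narrow results below the threshold."""
        if true_count <= 0:
            return None  # nothing to report
        # Round up to the next multiple of the threshold, e.g. 37 -> (1, 50).
        high = ((true_count + threshold - 1) // threshold) * threshold
        low = high - threshold + 1
        return (low, high)
    ```

    A query matching 37 people and one matching 12 people both come back as "between 1 and 50," so the response alone can't single anyone out.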

This isn't the exception to how Big Data companies treat their data, either. Believe it or not, Big Data providers take privacy seriously, and are willing to sacrifice perfect accuracy to run an ethical operation. Anyone interested in Big Data is running on statistics anyway, so perturbations that are statistically insignificant are an easy way to preserve privacy.
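    A toy demonstration of that last point: perturbing individual counts with small random noise barely moves the aggregate an analyst cares about (assumption: uniform +/-5 noise on made-up counts; real systems use much more careful perturbation schemes):

    ```python
    import random

    random.seed(0)  # deterministic for the illustration

    # Made-up per-region condition counts, then the same counts with noise added.
    true_counts = [120, 95, 143, 110, 87, 130, 101, 99, 140, 125]
    noisy_counts = [c + random.randint(-5, 5) for c in true_counts]

    # The aggregate statistic survives the per-record perturbation.
    true_mean = sum(true_counts) / len(true_counts)
    noisy_mean = sum(noisy_counts) / len(noisy_counts)
    ```

    Each individual count is now unreliable to within +/-5, yet the means differ by at most a few units, which is the trade the parent describes.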
