Researcher Warns of "Digital Dark Age" 367
alphadogg writes "An assistant professor from the University of Illinois at Urbana-Champaign is sounding a warning that companies, the government and researchers need to come up with a plan for preserving our increasingly digitized data in light of shifting document management and other software platforms (think WordPerfect and floppy disks). Jerome P. McDonough, who teaches at the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign, says there exists about 369 exabytes worth of data, and that includes some pretty hard-to-replace stuff, including tax files, email and photos. Open standards could play a key role in any preservation effort, he says. 'If we can't keep today's information alive for future generations, we will lose a lot of our culture,' McDonough said. 'Even over the course of 10 years, you can have a rapid enough evolution in the ways people store digital information and the programs they use to access it that file formats can fall out of date.'"
I say (Score:5, Funny)
And who needs to store pictures and movies on their computers anyway? In fact, I think the world would be a better place without them!
Now if you'll excuse me, I'm going back to watching Iron Man on my wrist watch.
Re:I say (Score:4, Insightful)
And I'm not immune of course, there's a lot of shitty software out there and it's easy to trivialise the value of Custers Revenge [wikipedia.org] or Giana Sisters [wikipedia.org] but remember that historically archivists want to know about tasteless/racist video games or tributes/Mario-ripoffs just like they want to know about 1980s comedy shows and magazines.
This article is saying that libraries and archivists had a blind-spot when it came to software. It took them decades to realise that people expressed themselves artistically in this medium. Archivists didn't know that they should preserve it like we do other media.
I know how easy it is to mock these efforts (Eg, the tag "!nothingofvaluewaslost") but please consider supporting and justifying this digital culture as part of a wider effort to justify software expression.
It's easy to pick out dumb software but closing
Re:I say (Score:5, Informative)
20th Century culture lost (Score:5, Interesting)
I'm more concerned about losing the culture from the 20th century.
Everyone born after 1975 hates the RIAA, doesn't pay any attention to whatever they say, and file-shares gigabytes without a thought to the music industry definition of 'piracy'. This is as it should be. It means that the music and movies of the (for now) young people is safe because it is widely circulated outside the control of those who have deluded themselves into believing that they own it.
It's all the stuff from the first 2/3rds of the 20th century that will disappear. Because the people who like it are in their 50's, 60's, and 70's now and don't have the technical skills to copy and distribute it. Plus they actually trust the corporations will preserve it. I mean all the books, music recordings, television shows, movies, and plays from the first half of the 20th century. The stuff that is under 'infinite copyright' and will never be in public domain because the corporations will simply pay off the politicians to endlessly extend the copyright period, as they do now.
As soon as all this stuff stops selling (and who nowadays is paying money for the book that was #3 on the New York Times BestSeller list of Oct 28, 1936?), and can't be legally copied because it can't enter public domain, then the corporations will just destroy it. Pulp the books; convert the film stock to ethanol to power their SUVs; dump the magazines in the oceans or in nuclear waste sites to absorb neutrons. When that happens, all this culture will be gone and historians 200 years from now will have little idea about how civilized people actually thought and acted in the critical early years of the modern technological age.
You can talk to the old people about the need to preserve their culture by making 'illegal' copies of the books, magazines, and movies that were important to them, but they are just simply and completely clueless about the extent that their culture will die as they do.
Re:I say (Score:5, Funny)
um..do you have a link to that watch?
And more importantly, a song:
sung to the spiderman tune.
"Iron Man, Iron Man
Does whatever an iron can
Presses pants really fine
Keeps those pleats right in line
Look out! Here comes the Iron Man" - Marvel
Marketing and Management already know! (Score:5, Funny)
Re:Marketing and Management already know! (Score:5, Funny)
In the cloud? Oh my god! What happens when it rains?! The farmers will have all our data! We'll have to sue the farmers for their harvest, since their crops will contain all the data and applications!
Re: (Score:2)
Re:Marketing and Management already know! (Score:4, Funny)
I'm sure the story tellers of old laughed in the same way as the cave painter said, "Ug draw this story on wall."
Re: (Score:3, Insightful)
Except you can explain what a painting is; no one can clearly define what the cloud is. Mostly because it's a marketing term looking for a technical design it can adhere to.
Re: (Score:2)
In theory, a cloud provider (like Amazon's S3) has a responsibility to backup the data. I lose a ton of data every time my drive crashes or I reinstall w/ out backing up my home directory. In contrast, about a decade ago I put some MP3 files on Xythos' webdav server (now known as xythos on demand), and they're still there. The MP3s are no big deal, but the fact is that this cloud provider stored my data for a decade.
10 years isn't exactly 'future proof', but that's the oldest cloud provider I could think
Re: (Score:3, Interesting)
I hacked into Carmen SanDiego and changed the character names to those of the staff of a school I was working at. // went into a time capsule around 1988. In 2013 it will be opened up.
The 5-1/4" floppy, formatted for an Apple
It'll be stuffed, just like the rest of the contents.
Re: (Score:3, Insightful)
Can I ask why you would want to restrict the ability of two consenting adults to enter into a contract together? It seems rather ridiculous.
You should be able to enter into a contract to share benefits with whomever you wish, X- or Y-chromosomes, I'm not sure I understand the difference.
It's called "Civil Unions" and in CA they have the same rights as marriages. The point is that "Marriage" would not be used to describe these Unions in the same way as "heterosexual" would not be used to describe a homosexual person -- it's simply counter to the definition. It's not based on hatred or hope for inequality -- simply concern for a word that would quickly lose 100% of its meaning if we start tampering with the definition. Or do you go around calling homosexuals "straight" because it's big
not to worry.... (Score:2)
I'm not so sure that every megabyte of those old data disks is worth preserving. What of the past centuries' romance of the lost maps that had told of hidden treasure? Let there still be space for legends in future generations. Let the sleeping floppies lie :).
Re:not to worry.... (Score:5, Insightful)
Historically, things that were very uninteresting at the time have been hugely valuable to researchers later on. We may not care about the countless people talking "crap" on bebo right now, but in a few hundred years it might be a different story. When people can easily analyse all those posts for meaningful psychological profiles that aren't currently understood, never mind modelled and easily detected, all of that could tell a lot about our society. Even rubbish tips from thousands of years ago are hugely valuable to archaeologists.
This goes more so, for important government records, etc. Peter Quinn did a great job of explaining that, with his Sovereignty talk.
Re: (Score:2)
We may not care about the countless people talking "crap" on Slashdot right now, but in a few hundred years it might be a different story.
Sorry, but I had to fix that for you... for comedic effect.
Re: (Score:2)
Re: (Score:3, Interesting)
Recently at work we ran into a problem where a "knowledge management" package died. The company had gone belly up and there is no converter. We are printing and re-typing in thousands of pages because there is just no other way.
I collect antiquarian books. Funny that a collection of plays printed up in Latin in 1542 only requires the learning of a language, yet a knowledge base less than 10 years old is unreadable...
Re: (Score:3, Interesting)
Re: (Score:3, Funny)
They're not printing this on paper, right?
Can't you print into a PDF and convert it to a TIFF with ImageMagick and give the OCR thingy that file?
Would go a lot faster, too...
Anal (Score:2, Insightful)
It's only because people are so anal these days. Who gives a shit? It's not like anyone in the future's going to miss anything. Even today with items like the Rosetta stone it's not worth much more than a Trivial Pursuit question - we'd not be any more educated or intelligent if stuff from 2000 years ago hadn't gone missing. Sure, there's a certain entertainment value in it all, but the idea that in 2000 years time anyone's going to be remotely bothered about the loss of websites, games and so on from
Comment removed (Score:5, Informative)
Re:Anal (Score:5, Insightful)
Re: (Score:3, Funny)
This is the first time I've seen a post titled "Anal" modded +5 Insightful.
Re: (Score:2)
No it isn't. Metallurgy is far superior today. We can design at the molecular level now.
Some things aren't needed so no one bothers. That's different than their metallurgy being 'superior'.
Stop spreading that tired old myth. Next some ignorant person is going to tell me about the 'lost' Japanese sword smith technique being superior, or some other load of rubbish.
can't seem to be replicated does not equal superior. Hell, I can have made a sword that is far superior than anything seen in Japan without there
Re: (Score:2)
Hell, I can have made a sword that is far superior than anything seen in Japan without there being any metal in it at all.
[citation needed]?
Re: (Score:2)
Re: (Score:3, Funny)
Ah, but the Library of Alexandria was the offsite backup. This is why we need 3 copies of our important data. ^_^
Re: (Score:3, Insightful)
Re: (Score:2, Insightful)
Given the degree of effort historians and archaeologists today put into finding as much information as possible from times past, including minutia about how ordinary people lived their lives, you're obviously flat out wrong.
Re: (Score:2, Insightful)
Besides, I would rather no one saw the website that I
Archive... (Score:2)
Re:Archive... (Score:5, Insightful)
OPEN file formats and OPEN hardware, well documented.
Even if no program exists anymore to read your data, as long as you have the specs you can rebuild it. And I mean hard- AND software. If you know how to build it, you can build it provided you have the means. And I'm pretty confident that our future cousins will be able to build a current computer with their future technology, as long as they know WHAT they should build.
Re: (Score:2)
Documenting, via open hardware standards, how to make or read a paperback book does no good when the paperbacks are manufactured with acid-laden paper. Ask your local librarian how difficult it is to preserve popular paperback novels, and how many they have to destroy each year.
Now compare that to magnetic and today's optical media. Floppies do not last long without careful handling and temperature control. Magnetic tape is subject to serious problems of the tightly wrapped tape affecting the bits on the ne
Re: (Score:2)
"OPEN file formats and OPEN hardware, well documented."
Same diff.
The format doesn't matter if you have the specs.
It is irrelevant to your OS agenda.
Besides, does anyone really think they won't be able to crack them? We're not talking about stone tablets buried in the mud, we are talking about an ever changing and documented system.
Every change, every item, every document is talked about on the internet. Being able to access the data will be trivial.
Re: (Score:2)
Good point. But make sure the instructions for building the computer are picture-only, in case your language is lost too...
Information outlives technology (Score:5, Insightful)
- Tim Bray
Re:Information outlives technology (Score:5, Insightful)
Been using Excel, MS Word since 1990 and Quicken since 1992.
I can still open all my work from my thesis, and can search credit card purchases from 20 years ago.
No problem here.
Re:Information outlives technology (Score:4, Insightful)
Re:Information outlives technology (Score:4, Insightful)
Change it around. Everyone who's been using the same word processor for the last ten years raise your hands. Every hand probably goes up. For the ones that don't go up, ask: can your current word processor read files written by your word processor ten years ago? The rest go up.
I've got a few archive CDs from over ten years ago. Every file on them is readable today. Even if I'd be a little inconvenienced to dig up a copy of Corel Draw, there are lots of modern drawing and layout programs that can read the files.
Re: (Score:2)
And he might have a point if the stages in between weren't known. Since every upgrade and change is going to be known, getting that data will be easy.
There is no digital data that can't be cracked.
And 10 years? Sure. Is the data you have now going to be valuable in 100 years? Probably not.
As long as you bring it along for the ride, it won't matter.
Re: (Score:3, Insightful)
Tex and LaTeX have lived for 25 years (http://www.xent.com/FoRK-archive/feb98/0307.html). While not exactly a word processor, it's what I use instead of one.
I'm not sure what the definition of "same" is in this context, but I suspect what I'll be using 10 years in the future will still be called LaTeX and will largely be compatible.
And to guard against incompatibility, I can write a script that compiles all my LaTeX documents with all my LaTeX installations and reports errors; this should easen my burden o
Subtly different from all similar warnings (Score:3, Insightful)
The cultural loss isn't something that should be overlooked. Some can bemoan it, but the value of culture is that it exists, and that different ones existed in the past. Culture changes from moment to moment, but without some action the real meat of the early 21st century will be lost forever. That is the big thing here, and that is justification for working on truly readable digital archival methods. There is a project for making minuscule indentations, but that requires a lot of technology to see, much less decode. Other options include continuous duplication, by transferring all old data across all media as they rise and fall, or printing content and storing it in climate-regulated warehouses. We relish seeing things from thousands of years ago. This is humanity, that is our legacy. We need to leave a legacy for our grandchildren.
Migrate, migrate, migrate... (Score:5, Insightful)
The only motivation for a company to invent new ways to preserve data long term is to provide it as a service so they can profit from it. Other than that, a company's main goal is deleting everything it legally can. Anything that no longer exists can't result in a lawsuit.
Everything that is preserved is a potential liability. For items requiring indefinite retention because they are critical to the business... They will be stored, redundant, and backed up appropriately. As the systems that provide those qualities age, they will be replaced in regular maintenance and upgrade schedules as economics and timing come together in the right proportions. In that way, reliability and long-term survivability are maintained - nothing stays on ancient systems that are unmaintainable forever. When systems go out of support, everybody has already been looking to the next solution to migrate to.
So what's wrong with this approach? It's essentially what all "big" companies are currently doing. I don't believe in this proprietary format FUD either - if the proprietary format is no longer supported, you migrate. Potential future cost to migrate is the only concern, not survivability.
Migration is today's solution to long term storage and I see no reason it should be ignored. Like security, data retention is an ongoing objective that requires maintenance - it's not some end-state. Dreaming of a solution that will just last forever seems archaic, no?
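For what it's worth, the "ongoing maintenance" part of migration is easy to check mechanically. Here is a minimal Python sketch, assuming a migration that copies files unchanged from an old storage tree to a new one (it does not cover format conversion). The function names `sha256_of`, `build_manifest` and `verify_migration` are made up for illustration and not taken from any particular archival tool.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_migration(old_root: Path, new_root: Path) -> List[str]:
    """List files that went missing or changed between the two trees."""
    old = build_manifest(old_root)
    new = build_manifest(new_root)
    problems = []
    for name, checksum in old.items():
        if name not in new:
            problems.append(f"missing: {name}")
        elif new[name] != checksum:
            problems.append(f"changed: {name}")
    return problems
```

An empty list from `verify_migration(old, new)` means every file in the old tree arrived byte-for-byte in the new one. Files that exist only in the new tree are deliberately ignored.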
Doubtful... (Score:3, Insightful)
Most of the text in most word processing documents is easily available to be parsed out even without the specs. The formatting would be lost, as would any embedded objects or images.
Open formats would improve it, but I would be more concerned about encrypted documents and media loss than not being able to recover data (text/images/video/music/etc) from available files. There are a lot of clever people that can do amazing things with deciphering proprietary formats.
Re: (Score:2)
How about when ASCII is a distant memory?
We've seen Baudot, ASCII, EBCDIC, and codepages for things not quite ASCII. Now we have UTF-8 (which thankfully has a special relationship to ASCII) and half a dozen other encodings that are 8, 16, 32, or some variable bit length in multiples of 8 bits from 8 bits to 48 bits depending on the character.
Images can actually be easier to recover than your post suggests, and recovering text can be harder.
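To make the encoding point concrete, here is a small Python sketch. It assumes the standard library's `cp037` codec as a stand-in for "EBCDIC" (there are several EBCDIC code pages), and the helper name `try_decodings` is made up for this example. The same bytes are readable under the right encoding, rejected by some others, and silently turned into garbage by the rest.

```python
from typing import Dict, List, Optional


def try_decodings(data: bytes, encodings: List[str]) -> Dict[str, Optional[str]]:
    """Decode data under each encoding; None marks an outright failure."""
    results: Dict[str, Optional[str]] = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


text = "Dark Age"

# The same text stored as EBCDIC (code page 037) and as ASCII.
ebcdic_bytes = text.encode("cp037")
ascii_bytes = text.encode("ascii")

ebcdic_results = try_decodings(ebcdic_bytes, ["cp037", "ascii", "utf-8", "latin-1"])
ascii_results = try_decodings(ascii_bytes, ["ascii", "utf-8"])

for name, decoded in ebcdic_results.items():
    print(f"{name:8} -> {decoded!r}")
```

Only `cp037` gives the text back. `latin-1` never raises an error, so it returns mojibake without complaint, which is arguably the worse failure. The ASCII bytes decode unchanged as UTF-8, which is the "special relationship" mentioned above.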
Re: (Score:3, Informative)
I dare say that if I gave an English-speaking computer geek who had never heard of EBCDIC a long document encoded as EBCDIC, and told him (truthfully) that it consisted mostly of English text, he'd have most o
Dark? (Score:2)
Until... (Score:2)
Urbana-Champaign gets to inventing HAL, I'd say they should stop wasting their time with this sort of thing...
This is anything but a new problem (Score:2)
The problem has existed for a while. Can you read 8" discs? Do you know how to build a device to read those "data drums" IBM used to store data?
Create documented hardware and use documented formats to store your data. Dump everything proprietary because chances are good you don't get the whole information you need to recreate the formats or the hardware flawlessly. If you know how to build it, you can build it. If you can build it today, you sure as hell can build it in the future with better technology.
The only
Professional Write (Score:4, Insightful)
Amazing as it sounds, I still have very VERY old data that goes as far back as 7th grade when I started using computers. I know of no converter for Professional Write that will convert Professional Write documents into ODF, or even MS Word 97/2000/2003.
The only hope I have is that I can use strings to extract the text elements of the data.
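For anyone without the `strings` utility handy, the same idea can be sketched in a few lines of Python. This is a rough approximation of the technique (pull out runs of printable ASCII), not how the real tool is implemented. The function name `extract_strings` and the sample bytes are made up, and nothing here assumes anything about the actual Professional Write file layout.

```python
import re
from typing import List


def extract_strings(data: bytes, min_length: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_length bytes long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up stand-in for an old binary document: text runs separated by
# control bytes and other non-printable filler.
sample = b"\x00\x01Dear diary\x00\x02\xffab\x00seventh grade\x1a"

for line in extract_strings(sample):
    print(line)
```

With a real file you would pass `Path("OLDFILE.DOC").read_bytes()` (hypothetical filename) instead of `sample`. Formatting is lost, and any text stored compressed or in a non-ASCII encoding will not show up.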
Re: (Score:2)
If this data was remotely important, why didn't you hang on to a computer running this software?
Or load the data and either print it or save it as ascii text prior to disposing of whatever you created it on?
Beyond that, what you describe is an *excellent* reason for storing data not in a proprietary or application-specific format, but instead as plain ascii text in the first place. You can always load it into your current modern word-mangler of choice and play with the fonts and margins.
Re: (Score:2)
I was 13 at the time, give me a break. Actually, Professional Write will run under Dosbox.
Books? (Score:4, Insightful)
From the article -
"If we can't keep today's information alive for future generations," McDonough said, "we will lose a lot of our culture."
Hardly.
Apparently none of our culture is stored in books anymore?
Sure if every piece of data was wiped out the world would lose a lot of information... but a lot of valuable and useful information is still put on paper. I don't think that is our biggest cause for concern.
However I do agree that the world really needs to agree on more open / non-proprietary ways of storing data. Sure, I can open a .wav of Blackadder talking about 'sticking a Christmas tree' somewhere from 1992, but I have a bit of trouble opening .ra (real audio) video files from a few years ago.
And working in government everywhere I go the electronics file storage is just a discordant mess. Anything important we have to print and store hardcopies because our electronic systems are just unreliable.
haven't been around the US for the last 8 years? (Score:2)
According to 50% of the USA, they wish we did lose our current culture. That's aside from sages wanting us to not repeat history.
Guess which side it is? Trick question.
On file formats and the future (Score:4, Insightful)
Open standards could play a key role in any preservation effort, he says
The way I see it there are two approaches to the problem. The Quixotic fight consisting in changing the world and forcing in a dictatorship of openness regarding file formats, which doesn't solve the problem for the past 50 years of computer history.
Or let a few hundred people around the world worry about file format parsing or, in the worst case, even emulators to do whatever old computers did. In a hundred years from now, you'll have very complete emulators for our modern PCs. Considering that a 1994 PC is quite comparable to a 2008 PC (and presumably a 2015 PC) from an emulation point of view, you know that's a given, and even then, in case there was no such emulator, you know you could find a good such emulator for machines from the 2040s, which themselves would be well emulated by machines from the 2070s, and so on. That's what we already do. There's hardly any program you used 20 or 30 years ago that you couldn't use today.
Been there, done that. (Score:2)
Shut up, shut up, SHUT UP! (Score:2)
The more things change... (Score:4, Insightful)
How's that for timing? (Score:2)
How's that for timing? PALGN just interviewed Eric Kaltman [palgn.com.au], cataloger at the Stanford University library about his role in cataloging game-related material and the challenges that DRM and MMOs present. Stanford's part of the "preserving virtual worlds" project, along with the University of Illinois mentioned in the article. He's also the guy who writes on the How They Got Game [stanford.edu] blog, where he documents his findings.
It's an interesting field. Far more challenging than I would have thought.
People are starting to take note (Score:3, Interesting)
Government agencies and archivists are starting to wake up to the fact that this is an issue -- I think the Office 2007 file format change was a big factor that is getting it on the radar.
Minnesota, California, Massachusetts and New York definitely have people studying the issue. Unfortunately, there are no easy answers when it comes to these things.
In my opinion -- which is not necessarily the opinion of my employer -- one of the major problems is that there are far too many records being preserved.
If you looked at the archives of a government or corporate office 30 years ago, only official memorandums, some meeting minutes and policies were retained. Today, technology like email has improved communication somewhat, but has also encouraged sloppy office practices so that it is nearly impossible to figure out what is useful and what isn't.
To compound matters, the courts are now mandating document retention and email archiving which encourages the retention of even the most banal communication.
IMO, the period 1990-2020 will be a black hole in history.
The article mixes up 2 problems... (Score:5, Interesting)
The author specifically mentions WordPerfect files. Bad example! The default file format in Wordperfect X4 (released in April, 2008) is the same as what was used in WordPerfect 6--which came out in 1993 (DOS and Windows). While I can't speak for OpenOffice or Google Docs, MS-Word can read those files (and WordPerfect 5.x files) with a simple File/Open. Excel opens Lotus 1-2-3 files as well. So, Word can open popular formats in use since 1988 (WP 5.0) and Excel can open some formats in use since 1983 (1-2-3 r1a). You can also buy programs like FileMerlin [file-convert.com] to convert old documents.
Frankly, when it comes to file formats, conversion apps will exist for a LONG time. For DOS apps, you could even go so far as to create a v/m or use Dosbox, load up your obsolete word processor (I miss "Leading Edge Word Processor"!) and copy/paste the text into Word or Notepad...
Image files, sounds, & videos are no exception... GIF has been around since 1987, JPEG has been around since the early '90s (opening those on a 10Mhz 8088 was slow!), and MPEG/WMV/AVI/Quicktime videos are easily openable...
Finally, the more people that are affected by obsolete files, the more interest there is in some way to convert the data... But don't forget that a LOT of the data is junk--do you really care about your 7th grade paper you wrote on Hong Kong in 1989?
Re:The article mixes up 2 problems... (Score:5, Interesting)
About mine? No... but how about the next Einstein's 7th grade paper, or the next Picasso's?
Simple: shorter copyright (Score:3, Insightful)
Make copyright last 5 years. Then everything worthwhile will be backed up by someone who cares about it.
Licensing Formats (Score:2)
The thing that I've started to dislike is the requirement that you license formats in order to use them. I fully understand where this is coming from, but there was never a need to license IP to build a microfilm reader, CD player, or VHS player (I may be oversimplifying here).
But if you want to play a Blu-ray disc, or Dolby Digital TrueHD audio, suddenly you can't just buy a bunch of off the shelf parts and build something that'll read that data.
We need to do something about making formats become open. I h
Re: (Score:2, Interesting)
There is no license required to build and sell a CD player. There IS a license required if you want to CALL your optical disk reader a "CD player."
And you can still do this. The LG BH100 combination BluRay and HD-DVD player (I had one) couldn't display the HD-DVD logo because it didn't meet all the requirements of the HD-DVD player licensing. But it could still play HD-DVD movies just fine.
Slashdot again misses the point (Score:2, Insightful)
Everyone here seems to be missing the point -- Businesses don't need help preserving data. Anything that's really valuable and needs to be preserved will eventually be put on a laptop and lost in an airport. But what about your wedding photos? What about that book you've worked for three years on, and saved it in word doc format?
The problem of data preservation is not one business needs to address -- there's a million geeks (hi slashdot) that will be eager to earn their pay coming up with washing-machine si
The problem is real to museum conservators. (Score:4, Interesting)
As for Wordperfect and floppy disks: yep. That's a problem in our home. We are having to migrate WP files now and then. It is not sufficient to have old computers that run the programs. I had WP on my computer (but didn't use it.) A series of glitches when upgrading to SP3 had as a side effect the corruption of WP on my computer. Whatever the problem was, I could not even re-install it. We are now down to one computer that can read it.
When I worked in IT, I migrated library data. Getting it into any sort of readable text form was a trial. We have even been sent old Macintosh computers in the hope that we could get stuff off them. Usually we could, but it wasn't done economically, and I cursed the Education system that had highly paid administrators who did not even dimly consider that a data storage system had a finite lifetime. Not even 20 years after my father retired on under half their salary.
The core solution is as the original article says - for all government software, mandate that data export to a widely used open standard be available within the package at no extra charge. I do not know of any impediment to this worth considering. Where there are privacy issues, it is simply exported encrypted and funds are established that allow a few facilities to decrypt and migrate the data. If you cannot sell to government, including any educators, then you are marginal. OK, so some games will be unavailable to future generations. That is inevitable. But then that will be a reason to collect and maintain the hardware if you are a hobbyist.
As for large corporations, it may be sufficient that the auditors require that data be accessible for forensic and liquidation purposes. That is, not readily, but if need be in extreme circumstance.
In short, the immediate solution is an administrative one. Software and hardware is the relatively easy bit.
My own prize example of a dead data format - the Windows
It isnt that fucking hard (Score:2)
If you store digital information (whether you are a library or not), make sure that as long as you have information stored in format "x", that you have the proper equipment for reading format "x". Ideally, if you get new equipment that uses a new format "y", be sure to *both* keep the old equipment that you knew worked properly (and not just one set - keep several, if you can, and make sure that new employees/members know how to use it), but also try to find such new equipment that is capable of both reading
The Big Problem (Score:3, Insightful)
this means you have to bypass access keys or encryption
This is going to be a big problem. I have CAD files, code manuals and other engineering data that cannot be accessed with anything other than the proprietary CAD apps or browsing software. Some of these apps have been 'orphaned', in that the applicable versions are no longer supported by the vendors. Activation keys are locked to a particular machine, so trading in that Windows 98 machine for a nice new XP system is out of the question.
I make sure that none of my contracts oblige me to maintain electronic
false analogies (Score:5, Insightful)
This is one of those fairly bogus, highly overblown stories that keeps cropping up every so often. A similar one is the supposed shortage of scientists and engineers in the US, which has never existed, and is always supposed to be coming Real Soon Now; in fact, the data to support this claim are always either nonexistent or wrong. (E.g., they compare Indian college graduates with US college graduates, but the Indian degree they're comparing with a U.S. bachelor's is more equivalent to an AA degree in the U.S.)
First off, the concern about incompatibility of physical media was valid 30 years ago, but it's a false analogy to try to apply it to today's situation. Thirty years ago, I had data on a mixture of 8-inch floppies and 9-track tapes. I can't read an 8-inch floppy anymore, and although 9-track tapes still exist, most 9-tracks from that era are no longer readable due to physical deterioration of the media. But that was all in an era when hard disks were expensive, and the internet didn't exist. Today, I have all my data on hard disks of various computers, and I use file synchronization software to keep them all in sync. If one of my hard disks dies, I replace it, and I haven't lost any of my data. (I also have backups on optical media, but I basically never need those.)
There's also the concern about formats. People tend to bring up, for example, the image of rooms full of physically deteriorating 9-track tapes with data from old NASA space probe missions. The formats are often not documented. The thing is, most of our data isn't at all analogous to the raw data from Mariner or Voyager or Viking. Those were unique historical events, and the only way to get more data like the data they collected is by sending another space probe. (People also tend to vastly overestimate the value of scientific raw data. It's extremely uncommon for raw data to be of interest decades later.)
Most of the world's data isn't in some obscure NASA format; it's stored in formats that are used by tons of people and are extremely well documented. Sorry, but I just don't believe that the knowledge of how to decode Adobe Acrobat format is going to be lost to future generations. Ditto for HTML, JPEG, and MP3.
Another thing to keep in mind is that nowadays you can emulate old computers with excellent performance. For instance, my first home computer was a TRS-80. I can still run my old TRS-80 games on my Linux box, using an emulator. Sure, emulation isn't perfect, and some information may be lost. But the claimed threat of data loss is vastly overblown.
The biggest threat to the preservation of information isn't technological change, it's copyright. The most likely reason that I wouldn't be able to get back an old piece of digital data is that the people who tried to preserve it and put it on the web got sued by the people who own the copyright -- the same people who let it go out of print. The economic incentives are to hold on to your copyrights (because that doesn't cost you any money) and send out DMCA notices to anyone who puts it on the net (because that doesn't cost you any money either), all in the hope that your content will be worth eleven cents fifty years from now. This is exactly what we see happening, for instance, with ROMs for old video games, which you can play in MAME, except that you have to find an illegal source for the data, because the owners of the copyrights aren't willing to sell you a copy.
He needs tenure (Score:3, Funny)
Look, he's an Assistant Professor, not an Associate Professor. He just got his PhD a couple of years ago and somehow he managed to land a job. He needs to publish something, anything. He needs tenure. So he's saying the library (OK: 'Data' if you will) is on fire and we need a government rule to protect it. The librarians are going to nod wisely and agree with him (I'm a librarian and I've seen way too many wisely nodding librarians in my time.) It's all a bit of a smoke and mirrors thing and he'll be able to milk this for a few more articles to put on his c.v. He's whoring for points just like on /.
Meh?
"Dark Age"? (Score:3, Insightful)
"Dark Age" is kind of an exaggeration. Presumably it's a reference to the period right after the Fall of Rome (476 AD) when most classical literature was lost because existing information technology (hand-transcription of documents) got too expensive for what passed for an economic system. This time around, if we lose much more, it's because we have a lot more to lose. But how much of it matters? If my USB drive dies and takes the last surviving copy of Debeee Does Dingos or the collected bloggings of Joey Joey, it's not that big a deal. But anything that really matters (the complete works of Shakespeare, the Beatles, the user's manual for Ultima IV) is going to be saved in multiple places in multiple formats, and it's just not going to get lost.
I think the big problem is the exact opposite of what TFA warns about: too much preservation of stuff that isn't worth preserving and doesn't really represent our culture. Future generations wading through the digital crap we leave behind — blog rants, porn, advertising, spam, internet rumors, Star Trek flame wars and fan fiction — will be hard put to sift out our serious accomplishments.
Classical Greek civilization is probably the most influential in all of human history. And yet you can buy a single CD containing every single surviving work from the entire civilization! It's quality, not quantity, that defines a cultural heritage.
On a personal note... (Score:4, Interesting)
More importantly, DRM and rent vs buy (Score:4, Insightful)
Re: (Score:3, Insightful)
What you say is all quite true. The interesting thing is that long-term preservation of our cultural heritage in this DRM-crazy/copyright-insanity world may ultimately and largely be due to "piracy"!
Down with the DMCA! Support your local pirate for your grandchildren's sake!
Comment removed (Score:3, Informative)
Who's to say what's important? (Score:3, Informative)
The best example I can think of are personal letters. Usually we judge these by the importance of the person who wrote them, but in some cases we can (today) look at the letters written by ordinary people to their loved ones and gain great historical insight into the events of the time. Take, for example, Ken Burns' "The Civil War". Some of the most compelling information in the documentary was found in the letters written by ordinary soldiers.
Somehow I doubt we'll have records of the emails today's soldiers are sending home 150 years from now.
We can't judge what future generations are going to find valuable in the mountains of data we're generating today. We should find a way to preserve as much of it as we can. I hope someone is working on good, open compression algorithms to go along with the data storage.
Digital Supernova Age (Score:3, Insightful)
It's a Digital Supernova Age and they're bitching. Think of your parents' or grandparents' generation, and try figuring out how much information exists about them. Sure there's the basics like birth certificates, marriage certificates, property records and other big things, there's probably some pictures and maybe they're mentioned in some books but I doubt there's any real record of how their daily life was and what they were doing. I know I have chat logs and such from my youth that are probably way, way more accurate and uncensored records than anything my parents have, even if they kept a diary which they didn't. If I get over how immature I was at the time, that's easily something I could release for research in 50 years time. With blogs and myspace and twitter and facebook and whatnot you can do a lot more, in a lot more detail with pictures and whatnot today and capture a large part of that as it happens.
The only thing happening here is that a few historians look at all this trivia which was always there, but never in a form to be captured, and go "We should preserve ALL of it!" in a historygasm. If you preserved 0.001% you'd still preserve more than any generation of humanity to date. It's a case of diminishing returns; we don't truly need 24/7 live footage of 8 billion people as a historical record. It's certainly important to catch some sample of daily life and not just the big historical events and mainstream media, but I have no doubt that more than enough of this will be preserved anyway. Maybe we're in deep shit if humanity nukes itself out of existence, but otherwise I'm sure it'll be kept as collectables or antique information from hundreds of years ago. Can you imagine in 2544 saying "It's an original (=bit exact) 2008 CD by [Artist]"? That's not going away no matter how crappy it is. And if we do nuke ourselves out of existence, I'm not REALLY concerned with what alien archeologists think of us anyway.
Re: (Score:2, Funny)
*********DOR!
Well at least that's all we got out of the Word files describing the beast.
Re: (Score:3, Funny)
"I didn't know you could do that! Why would anyone want to grow celery that way?"
Re: (Score:2)
Re: (Score:2, Funny)
Re: (Score:2, Funny)
In mspaint, yeah
Re:Of course (Score:5, Interesting)
Re: (Score:3, Interesting)
One of the UK's beer companies used to help sell their cans by having pictures of models on the side. At the time, it was just a beer can with a picture of a model, but now these pictures capture the fashions of the era, which would be hard for any designer to reproduce without having reference pictures (1980s [flickr.com]).
Now these beer cans are actually collectors' items.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Before Gutenberg, Fleming, and Daimler, culture came from song, dance, meals, swords, wives' tales, fairy tales, needlepoint tapestries, disease, famine, pestilence, horses, handwritten scrolls, and campfires. What's your point, exactly? Our culture is not that culture. Sure, we'd have a culture, but not the culture we have currently.
Re: (Score:2)
Yea, I have to reboot my OOXML like 4 times a day.
Re: (Score:2, Informative)
They won't care either (Score:5, Insightful)
Re:They won't care either (Score:4, Insightful)
Re:They won't care either (Score:5, Insightful)
Actually, I don't think garbage is the problem. I don't think there is a problem as it's being presented to us. Lots of printed media is destroyed also. Just the other day I found pieces of a five hundred page story I wrote a long time ago, then lost the disk. I'm not going to type it in again, so I just discarded it. It's not the first time in history and won't be the last. Very little of what is written is ever published. Most of it is discarded by our relatives after we die.
I think the real issue is that some people feel a need to collect everything that's ever created, like digital hoarders. If a tax return is old enough to be on floppy, then you don't need it anymore and any critical information from it probably exists somewhere else.
Content with real value self-perpetuates and remains, and while some value is lost through attrition, such as websites going down, the consequences are often minuscule in comparison to the cost of archiving everything permanently.
Maybe we do lose those digital pictures on the floppy (and the box of floppies it was stored in) but if it was critical, we'd do something about it. We might print it out, but we lose albums too. They get wet, mouldy and burned, and we lose those memories too.
Too often it's not that important to us to keep until we want it later and can't find it.
Like most things hoarded, the value lies in taking good care of what is most important to us, and often we find that what we want to keep is just a reflection of what matters the most.
To quote an interesting book entry I once read: Perspective. Use it or lose it.
That goes for hoarding digital stuff too.
GrpA.
Re:They won't care either (Score:5, Insightful)
What you say is essentially correct, I'm just pointing out that this has always happened, regardless of the transition to digital.
How many pages of Leonardo da Vinci were used over the centuries to start fires or even wipe asses? How many inventions, concepts and ideas were lost forever? How many musical pieces were lost to antiquity simply because they weren't as popular during the era and slowly became removed from history, piece by piece?
What knowledge became undiscovered when the library of Alexandria was lost?
Losses of information are perpetually occurring. Digital stuff is less likely to be lost because it's so easy to copy, so anything needed for long periods tends to be perpetuated by infinite copying.
Archives are nice (thank you, Wayback Machine) when you want to find something now lost, but I don't think the media is to blame.
Think, as you've put it, that it's gone because someone decided to get rid of it... Did they make the right choice? Maybe not, but it was theirs to make.
I think a bigger issue is DRM... I went to watch some old movie clips I had on an archive the other day while browsing it... They all failed - I didn't have the correct codecs. So I tried to download/find them. Nope. They were gone.
So the clip, which I wanted to view was lost... All I have to know what it was is "funnyvideoclip.avi"
But they were only of value to me so what's the big deal?
Maybe if it was my wedding video, I'd be more annoyed, but then, how many wedding videos, pictures, photos and even paintings have been lost throughout history?
Just because the loss affected me, it doesn't mean there's a dark age. I'm saying knowledge is always being lost, due to obscurity, damage, natural disasters, political viewpoints and many other factors.
So let's say we lose all copies of programs for the Commodore 64... Is it a dark age? Or is the knowledge we've kept of the machine quite sufficient for contemporary times?
If anything, I think even more retention is made of digital material than non-digital... Just try finding a service manual for a 40 year old obscure car. Not very likely, but if there is a copy anywhere, I'd almost put money on it being digital!
GrpA.
I'm just helping the RIAA (Score:5, Insightful)
Garbage isn't the problem.. the problem is that we have millions of copies of the same data. Think of the 50 GB of video games you may have installed.. 10 million people have the same games as you. Music? Unless you performed it yourself or it's sub-underground, chances are millions of people each have multiple copies of it. The anime you've torrented has 10,000 downloads.
No, see.. actually I'm just keeping a backup for the RIAA in case they lose their copy. Plus I keep it all transcoded to the next generation formats at no charge. And on top of that it's forward deployed for easy re-distribution without bottlenecking their servers. I even pay the electric bill on the disks and internet connection. So copies are a good thing.
Re:They won't care either (Score:4, Insightful)
Garbage isn't the problem.. the problem is that we have millions of copies of the same data. Think of the 50 GB of video games you may have installed.. 10 million people have the same games as you. Music? Unless you performed it yourself or it's sub-underground, chances are millions of people each have multiple copies of it. The anime you've torrented has 10,000 downloads.
As for images on the internet.. well, every repost is a repost repost.
That's not a problem, that's called redundancy. If everyone has a copy, and you lose yours, you can get it back easily this way.
It's one of the things that make the internet the powerful force it is today: it's nearly impossible to completely destroy data.
And trust me, that's a good thing.
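The redundancy argument can be put into rough numbers. This is only a back-of-the-envelope sketch under a strong assumption that is not in the comment above: each copy is lost independently with the same probability p over some period, so the chance that all n copies are lost is p^n. Real copies are not independent (the same format going obsolete or the same takedown notice can hit many of them at once), so treat the result as an optimistic bound. The function names are hypothetical.

```python
def prob_all_copies_lost(p_loss: float, copies: int) -> float:
    """Chance that every copy is lost, ASSUMING independent losses.

    p_loss: probability that any single copy is lost (0.0 to 1.0).
    copies: number of copies in existence (at least 1).
    """
    if not 0.0 <= p_loss <= 1.0:
        raise ValueError("p_loss must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_loss ** copies


def copies_needed(p_loss: float, target: float) -> int:
    """Smallest number of independent copies so that the chance of
    losing all of them is at most 'target'."""
    if not 0.0 < p_loss < 1.0:
        raise ValueError("p_loss must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n = 1
    while p_loss ** n > target:
        n += 1
    return n
```

For example, even if each individual copy had a coin-flip chance of disappearing, copies_needed(0.5, 0.001) says ten independent copies would push the chance of total loss below one in a thousand. Widely shared files exist in far more copies than that, which is the point the comment is making.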
Re:They won't care either (Score:5, Insightful)
The problem lies in keeping the unimportant stuff. Nobody cares about your myspace, but if an archaeologist came across a 3000 year old obscenity on a bathroom wall, it would be the find of a lifetime.
Re:They won't care either (Score:4, Funny)
Maybe we should bury a time capsule, to be opened in 1000 years. In that time capsule, a strange black object, with a wheel and a screen. And from that object, when powered on, comes a voice from the past: Never gonna give you up, never gonna let you down...
Re: (Score:3, Insightful)
Hell, a discarded ring pull/glass bottle/flint arrowhead/tooth from a dinosaur weren't considered particularly importan