New York Times Wipes Journalist's Online Corpus
thefickler writes "Reading about Peter Wayner and his problems with book piracy reminded me of another writer, Thomas Crampton, who has the opposite problem — a lot of his work has been wiped from the Internet. Thomas Crampton has worked for the New York Times (NYT) and the International Herald Tribune (IHT) for about a decade, but when the websites of the two newspapers were merged two months ago, a lot of Crampton's work disappeared into the ether. Links to the old stories are simply hitting generic pages. Crampton wrote a letter to Arthur Sulzberger, the publisher of the NYT, pleading for his work to be put back online. The hilarious part: according to one analysis, the NYT is throwing away at least $100,000 for every month that the links remain broken."
broken links? (Score:2, Interesting)
the NYT is throwing away at least $100,000 for every month that the links remain broken."
now how much would it cost to fix all those links...
no wonder newspapers are not doing well
Re: (Score:2, Informative)
according to one analysis, the NYT is throwing away at least $100,000 for every month that the links remain broken."
Also according to one analysis: the world is flat.
Apparently the NYT may have a different opinion.
Either that or they're so large $100,000 a month is so insignificant to them it's not the most viable cost-saving/revenue-improving project for them to start at this time.
Re:broken links? (Score:4, Interesting)
Personally I think that analysis is way out.
I'm seeing 396 results on Google for: "thomas Crampton" site:nytimes.com, out of 1130 results from the NYT on-site search engine.
5 of those google links are dated in the last week, which I assume are related to this story.
$100 000 per month estimated loss presumably is advertising revenue on page hits from links for those stories. Earnings of 500c pm (ie $5 for every 1000 visitors) would mean 20 Million visitors a month are clicking through to his stories specifically and can't be assuaged with any other content.
This would only be a loss if a similar / 404 / search landing page had a lower earnings rate.
Seems unlikely to me - I think this is just [very clever] linkbaiting from someone who, it appears, was sacked from the NYT and is trying to make a living elsewise.
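For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch. The $5 CPM is the commenter's own assumption, not a published NYT rate, and the function name is hypothetical.

```python
def views_needed(monthly_loss_usd: float, cpm_usd: float) -> float:
    """Page views per month needed to earn monthly_loss_usd at a given CPM.

    CPM is revenue per 1000 page views, so views = loss / cpm * 1000.
    """
    if cpm_usd <= 0:
        raise ValueError("CPM must be positive")
    return monthly_loss_usd / cpm_usd * 1000


# Figures from the comment: $100,000/month claimed loss, assumed $5 CPM.
print(f"{views_needed(100_000, 5.0):,.0f} page views per month")
```

At those inputs the result is the 20 million figure quoted above; a lower CPM pushes the required traffic even higher.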
Re: (Score:2, Interesting)
$100 000 per month estimated loss presumably is advertising revenue on page hits from links for those stories.
Forgive me, father, for I have RTFA:
So essentially, the "one analysis" says that if they wanted to buy the very-roughly-estimated traffic they hypoth
Re: (Score:3, Interesting)
Re: Expensive Data? (Score:2)
I'm sure you are not a troll (YANAT), but the other side of slashdot is discussing how cheap data is, woe to the pay providers.
I think you mean that keeping data in a sophisticated manner is what grinds out IT Admin time, which eventually means a salary to pay. But the data itself is cheap, and 50,000 fellas on here can whack up something simple as a makeshift in a week for $5,000 and a month's supply of pizza&caffeine.
Re: (Score:1)
But the data itself is cheap, and 50,000 fellas on here can whack up something simple as a makeshift in a week for $5,000 and a month's supply of pizza&caffeine.
I am making a 3TB RAID 10 array (6 1TB drives) rackmount server for around $850
Re: (Score:2)
I'll grant you $150 in misc. expense.
The other $4000 is 2.5 hours of link maintenance a week for two years.
But even so, we agree that $100K is ludicrous. That's the price attitude that killed wall street.
Wayback machine (Score:5, Informative)
CNN's website doesn't have as many broken links. (Score:4, Informative)
Articles over a decade old still work!
Whoever designed theirs deserves a lot of credit.
Re:CNN's website doesn't have as many broken links (Score:4, Insightful)
Seriously though, don't give them standing ovations simply because everybody else fails. Tell me this in 50 years and I'll honestly clap my hands.
Re:CNN's website doesn't have as many broken links (Score:5, Funny)
Pay me $100,000 per month and I'll dishonestly clap my hands right now.
Re: (Score:2)
Hell, the way things are right now you could pay me $10,000 a month and I'll gladly clap my hands 40 hours a week in whatever venue you deem most appropriate.
Re:CNN's website doesn't have as many broken links (Score:5, Funny)
...I'll gladly clap my hands 40 hours a week in whatever venue you deem most appropriate.
Well now, that depends on what you're willing to have in between your hands while clapping, and how soft your hands are...
Re: (Score:2)
I'll do it for $50k, and I'll pretend to be genuinely impressed too!
Re: (Score:2)
my old national geographic magazines from 50 years ago still work. even the old ads are still there... NEAT!
seriously, in a hundred years there is going to be a huge history gap. it's great to read old magazines and books and newspapers. what is anyone in the year 2100 going to read from 2009? nothing will be printed out or compatible with whatever brain-link stuff they use in the future...
all you will have is old shaky-cam JJ Abrams videos as a record of the early 21st century... sad.
This sucks (Score:4, Insightful)
Re: (Score:2, Interesting)
Re: (Score:2)
Yep.
Remember this story from a few hours ago?
http://it.slashdot.org/article.pl?sid=09/05/15/0138204 [slashdot.org]
(~)
Locking up content is fun!! Then you can sue Pirates(TM) when someone copies a plane design. But if the site admin "never actually kept an offline copy" then years of data is gone!!
(/~)
(Speaking of which, that story sounds totally bogus. They coded live on production servers?? Sounds like they played dead.)
Re: (Score:2)
I like the idea in a way though that the consumers decide whether the content is useful enough to hang on to. In a
Re: (Score:2)
Just imagine what anthropologists in 1000 years time will think. They will mark this century as the turning point in human history (just like the last one ... and the one before that), and the only evidence of human culture is a few old tape back-ups from slashdot and 4chan.
Re: (Score:2)
Re: (Score:1)
Probably the love child of Dr. Seuss and Isaac Asimov.
Re:This sucks (Score:4, Informative)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
That'd be "Ms Fnd in a Lbry", but you'd do better to go back to the original source material, which would have been JL Borges's The Library of Babel.
Re: (Score:2)
Re: (Score:3, Interesting)
Bad Merge (Score:1)
This is so unfortunate. IHT was great before the merge, which was touted as a "new" version of IHT. Instead, they just canned it and attempted to transfer its content to the existing NYT site. And did a dreadful job, it seems.
I understand the logic - newspapers need to cut costs because they can't figure out the internet and it is killing them. But they lost a dedicated reader in me with this move.
The Internet Is the New Library of Alexandria (Score:5, Interesting)
Re: (Score:2)
Re:The Internet Is the New Library of Alexandria (Score:5, Funny)
And it's got unlimited space.
The internet is actually nearly full, I hope there is eno
Re: (Score:2)
Sadly, data on the internet is currently a lot more volatile than the library of Alexandria, and the internet's contents are likely to survive for much less time. Digital media doesn't have the lifespan of ancient media, even papyrus. :-(
Are broken links the real issue? (Score:1)
Re: (Score:2)
I still find it strange that it seems to be only old-world, *major* corporations that have this problem so badly.
Every random kid's blog and webcomic has archives dating back to the day the thing started and easily accessible.
Re: (Score:2)
Error establishing a database connection (Score:5, Funny)
I was interested in reading the analysis that led to the $100,000 loss for every month the guy's work was offline. So doing what you do, I clicked on the link and found it grandly hilarious to receive a 500 error stating: "Error establishing a database connection". Oh, the irony.
Re: (Score:2)
They're slashdotted, losing lots of traffic: so yes, it's ironic. But you can read the article if you want:
Paste the link "http://www.globaltechproducts.com/blog/1734/how-not-to-redesign-your-website-a-marketing-lesson-from-nytimescom/" into Google, you'll find the article in the Google cache.
The (to me questionable) basis for the calculation is that all old International Herald Tribune links are broken. It used to get X million hits per month, which are by a hokey calculation worth $100k.
Re: (Score:2)
this comment is missing from google cache, however...
Re: (Score:2)
Links should be permanent (Score:5, Interesting)
Whenever I redesign my site, I try hard to avoid changing any URLs. But if I do have to change a URL, I always make sure that there is a redirect (preferably an HTTP 301 permanent redirect) that points from the old URL to the new URL. Updating links is not enough, because you will always have links that come from external sites that you don't control, user bookmarks, links found in "Hey, check this article out" e-mails, etc.
This is one of those basic principles of the web that the W3C (and for those who don't pay attention to them, you can substitute that with "plain old common sense" here) strongly recommends.
It means that users can always find and view content. It means that you still retain your ad revenue. It means that you still keep your PageRank for external sites that link. It means less bitrot and a more useful web...
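A rough sketch of the redirect idea described above, using only Python's standard library. The paths in REDIRECTS are made up for illustration, and a real site would more likely do this in its web server or CMS configuration than in a hand-rolled handler.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping of retired paths to their new locations.
REDIRECTS = {
    "/articles/2005/old-story.php": "/2005/old-story/",
}


def resolve(path: str):
    """Return (status, location) for a request path.

    Known old paths get a 301 and the new location. Anything else is
    reported as 404 here; a real site would serve its pages instead.
    """
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 404, None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location is not None:
            # The Location header tells browsers and crawlers where
            # the content now lives.
            self.send_header("Location", location)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the demo server until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling serve() and requesting the old path should return a 301 pointing at the new one, which is what keeps bookmarks and external links working after a redesign.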
Re: (Score:1)
Well.. we have a huge majority of "designers" out there who design to Microsoft dogma and can't be bothered to even check their web page using Firefox on their own machine right now. They couldn't care less about any type of good practice, let alone trying to conform to (or even reading in the first place) the ideas that W3C has put out.
None of this is surprising to me in the least... just sad.
Re: (Score:2)
The problem there is this only works if one controls the _entire_ URL.
I had pages on AOL's FTP/webspace since its inception through AOL's ``sunsetting'' those services --- unfortunately, I published a number of papers which had links to http://members.aol.com/willadams [aol.com] so all the printed copies are out of date since there's no way to update them to http://mysite.verizon.net/william_franklin_adams/ [verizon.net]
It's this sort of thing which makes the MLA's decision to omit hard-coded URLs from their references....
http://w [insidehighered.com]
You're continuing your problem. (Score:2)
You complain about how all of your AOL-hosted links ceased to work and how you're unable to update all the places they were used to point to your (currently) Verizon-hosted content. Do you see the problem with this?
The solution to this is to get your own domain, so you retain the ability to move it at will. I started out with my primary domain (http://www.fencepost.net/ [fencepost.net]) because I wanted a reliable email address after two successive ISPs were bought out. I would never use a carrier-provided email address as
Re: (Score:2)
Acknowledged. For my part, I've quit putting my homepage URL in papers and instead will just upload stuff to CTAN and point to that.
I looked into registering a domain name, but couldn't find one I liked (not that I like william_franklin_adams) --- ::grrr:: squatters.
William
Re: (Score:2)
The solution might be to place a GUID and keywords in anything you post online, and specify its location by saying "Google GUID1239872129412 Joe Schmoe Lemur behavior paper"
Then if you move hosts, it'll eventually get picked up by search engines and people will be able to find it, even if the URL itself has changed. (Hell, it might even find a copy someone made and posted at their own site.)
Another story about the necessity of backups.... (Score:3, Interesting)
Re:Another story about the necessity of backups... (Score:3, Interesting)
It's hard to tell from the linked article (yeah, I read it), but it doesn't seem that Crampton lacks copies of the articles (surely he would keep copies of his own stuff); they're just not accessible on the Internet. All the links that should point to them from the NYT and the IHT went kablammo when the two sites merged.
There's no way a back up on his end could fix this problem.
Re: (Score:3, Insightful)
I feel for the guy and his lost articles, [...]
I feel for him too. Of course the articles aren't his, they are his employer's (unless he has a contract that says otherwise) - which is probably why he's bothered. If they were _his_ articles then he could wholesale upload them to his own site and reap the rewards (whatsoever they may be).
And THIS, dear-readers, is why paper will win (Score:5, Interesting)
In the digital age, wiping out thousands of volumes of material takes mere seconds. Permanently. Gone. Poof.
We have books, printed books, which go back hundreds and hundreds of years (well, written material; the printing press is a fairly recent invention).
We don't even have a record of some newspaper articles that came out 5 years ago. We're LOSING our history, not retaining it, because we lack sufficient "printing" to always keep a copy in circulation. Witness the Avsim.com [slashdot.org] debacle and hundreds of other cases where this has happened.
Until we can have a hard-copy of digital media which can NOT be changed, edited, altered or redacted... we're lost.
When we all have "Kindle DX2" devices in the classroom for digital copies of our textbooks... what is stopping them from "gently changing" some of the wording over time, over a few years, to permanently alter the way our youth views the history of times they never lived through?
How can you compare one version of a website today, with the one that was there last week? Was anything changed? Was article content "censored" in any subtle way?
We're heading down a very slippery slope, when digital information can't remain static enough to hold through the years, and be validated and verified to be unchanged, with sufficient copies in enough hands, to ensure survivability. The Internet is not the place to "store" things you want to keep for years and decades.
Re: (Score:2)
What makes you believe this isn't already occurring with paper textbooks? I can't speak for the current crop (as new editions are pushed on schools practically every year) but when I was in middle / high school our social studies and
Re:And THIS, dear-readers, is why paper will win (Score:4, Insightful)
Fahrenheit 451 [wikipedia.org]:
Re: (Score:2)
You mean like these [wikipedia.org]?
Re: (Score:2)
Yes, except the shelf life of standard, single-use, recordable CDs is 5-8 years, max.
What do you envision happening when those CDs "expire" at that point? Copying the data down to a hard drive and re-burn every decade? Not feasible either.
Re: (Score:2)
I'm going to guess that varies based on the quality of the disc, because I have recordable CDs past that age that still work without a hitch. You're right that paper's the best way to go, but it's not the only way.
Re: (Score:2)
Would not like to be a historian in the future (Score:2)
Much of what we know about past days is from written material. With the move towards net everything and the decline in print as the internet changes (and I do not mean just the web; email, gopher, irc, usenet, ftp archives, et al are all prone to this problem), much of our history will be lost to generations to come purely through attrition.
Then we have the problem of changing file formats, media which decays rapidly when compared to paper and decent inks, obsolescence of technology (try finding a laptop with a
Magazine websites do this all the time (Score:4, Interesting)
My company links to articles on a lot of magazine websites, and I'm just amazed at how often the links become broken. Sites get redesigned and they don't bother redirecting the old URLs to the corresponding new locations. Or, even worse, they just discard all of the old articles, or random articles disappear or come up blank or mangled. Does it not occur to them that websites, search engines, and blogs are left with broken links? Do they not realize that people bookmark the articles?
Re: (Score:2)
Rather than bookmark an article I save a copy to disk. It's the surest way of being able to read it later. Even if the site's admins are competent enough to keep the URL pointing at the right place, there's no guarantee that the article won't disappear behind a paywall.
Re: (Score:2)
This is why Google Notebook is (was) so nice - makes it very easy to copy (with most formatting retained), while keeping the link to where it came from.
I've dabbled with some of the free replacements (like Zotero) but none have been able to match the features and ease of use of Google's service.
Has it been fixed? (Score:2)
I clicked on the two links listed at the bottom of the open letter to Arthur Sulzberger (both are IHT links), and both now are redirected to the correct articles on the www.nytimes.com domain. Has the NYT fixed the problem and no one has just bothered to mention that?
Re: (Score:2)
Works for me too. Perhaps the web dudes at NYT were in cahoots to help him get this linkbait up.
Re: (Score:2)
Perhaps they want people to buy archive prints or reprints instead of looking up old content?
Do newspapers get many hits for old stories? I'd have thought most people go to a front page or section-page and work from there to get their news?
Jumping the gun? All the articles seem to be up (Score:2)
Re: (Score:2)
Read TFA more closely. He has reported for both the Times and the IHT. It's his IHT work that has disappeared, while the Times stuff is still there.
Re: (Score:1)
Errm... everyone knew Iraq had that stockpile of yellowcake, it wasn't a secret. The trouble was, since the IAEA had been monitoring it and it hadn't been touched, the US couldn't exactly claim that it showed Iraq was making WMDs. Hence the faked evidence of Saddam Hussein attempting to purchase more yellowcake that the IAEA wouldn't know about.
another part of terrible times web site (Score:2)
any good /. er could go on and on about the problems of the times website. I actually had to tell them that they needed a button so people could go back or forward one day at a time (any std site for a journal has this feature - look at say amer chem soc journals, there is a button that goes forward or back one issue)
I have repeatedly told them their comments suck and they should have slashcode and wikipedia - can you imagine how much traffic the times website would generate if each of their great articles
the real story (Score:2)
A "*pedia" in a far far future (Score:2)
Peter Wayner - author of a famous and well known book on compression algorithms, which managed to survive the Big Howl of Internet due to its relative popularity at the time it was written. It was recovered thanks to thousands of fragments found in hundreds of hard drives all over the world.
Thomas Crampton - Supposedly a journalist for the once famous New York Times. His personality is quite obscure and nothing is known about him, except for a short reference in the once famous Slashdot forum on Internet.
Thomas Crampton is an idiot. (Score:3, Informative)
When you read the article, you find one of the main reasons he wants the articles back up is because he himself doesn't have copies of the articles. TFA and Slashdot are full of angst towards the megacorp, but nobody seems to have noted this point.
The issue has been resolved. (Score:2)
Interesting. I got quite upset with the IHT-NYT change a while ago for exactly this reason: many bookmarks and links to news articles that I had made throughout the years evaporated overnight, making me regret not printing or saving the text of those articles when I had the chance. But apparently the NYT has fixed it now. Crampton links to two articles of a scoop he had a few years ago, and they resolve to a new page. And a bookmark that I have on the computer I'm working on now has the same thing, suggesti
Re: (Score:2)
Apologies for the reply to self, but I tried a few more links which did not resolve, but the current IHT landing page [nytimes.com] says it all: "The most recent IHT articles can now be found by searching NYTimes.com. We are in the process of moving IHT articles dating back to 1991 over to NYTimes.com. Thanks for your patience as we complete this transition."
$100k / month? (Score:2)
Analyses are a dime-a-dozen, and as we know from past experience, analysts are often biased, stupid, or insane.
So does it really matter that one analyst came up with a number that, if true, would make NYT look foolish?
His political leanings? (Score:2)
I don't know anything about this gentleman, but, maybe, his writings simply go against the current Illiberal pro-Democrat bias of the paper? They weren't always this way — most famously, NYT used to be against government-mandated minimum wage [ncpa.org] until 1999.
Perhaps, they are trying to score some favors from the current government in the hopes of getting substantial financial help (a bailout [washingtontimes.com], that was, no doubt, already promised to them) and certain writers are no longer welcome?
One does not need to be
Waaaaah (Score:2)
"They took my work and erased it! Please mommy help me!" - That's one solution. The other solution is for this journalist to get off his fat ass, buy a personal website, and publish all his back work for everyone to see.
You know, when I left Lockheed ten years ago most of my work ended-up in the dumpster too. That's life. If I felt it was important enough to publish, I'd simply copy it to my c: drive and later my personal website. It's a much simpler solution than whining to my ex-boss. It's MY job t
Welcome to the Web (Score:4, Insightful)
One of the greatest delusions that people have about the Web is that almost all information can be found on it somewhere. What total nonsense.
Stories rot from the Web faster than newspaper print ever has or ever will. All that we're left with is the most recent version or revision, which may have *nothing* to do with what was first written.
If you don't keep copies of your work that appears on the Web, you might as well have thrown them into a fire-place. And, as for everyone else, if you assume for even a moment that what you read on the Web about what happened even in technology news even five years ago reflects what people really wrote and thought at the time, you're a fool.
It's thanks to delusions like this that, for example, people can argue sincerely that Windows is popular because it's good; and not because Microsoft forced a monopoly on hardware vendors. Almost all the reports of DoJ vs. Microsoft from the time are long gone now. The proof that Microsoft's products are only popular because Microsoft made damn sure that no one else would have a chance to compete against them has vaporized.
The only thing newsworthy about what's happened here is that people think that stories disappearing like this is in any way what-so-ever noteworthy. It happens every day.
Steven
Fake figures - who says $100K (Score:2)
So where did the value of $100,000 come from?
"To buy that traffic from Google at $.20/click, you'd have to pay $100,000 a month"
So Google says it's worth 20 cents a click. What if I say it's only worth a cent a click? Then it's worth $5000, or perhaps at 0.1 cents a click it's worth $500.
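The same sums as a small sketch, for anyone who wants to plug in their own per-click price. The function names are hypothetical; the only inputs taken from the quoted analysis are the $100,000/month total and the $0.20/click price, from which the click count is inferred.

```python
from decimal import Decimal


def implied_clicks(monthly_value_usd: str, price_per_click_usd: str) -> Decimal:
    """Clicks per month implied by a monthly total and a per-click price."""
    return Decimal(monthly_value_usd) / Decimal(price_per_click_usd)


def revalue(clicks: Decimal, price_per_click_usd: str) -> Decimal:
    """Monthly value of the same traffic at a different per-click price."""
    return clicks * Decimal(price_per_click_usd)


# Traffic implied by the quoted analysis: $100,000 at $0.20 per click.
clicks = implied_clicks("100000", "0.20")

# Re-price that traffic at the alternative rates suggested in the comment.
for price in ("0.20", "0.01", "0.001"):
    print(f"${price}/click -> ${revalue(clicks, price):,.2f} per month")
```

The click volume stays fixed while the dollar figure scales linearly with whatever price you pick, which is the whole point of the complaint.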
All make believe. Don't tell me "an expert told you so" because I think a bunch of "experts" called "bankers" just got discredited a few months ago for overvaluing other virtual sales... ;-)
Except I guess this is America so the