New York Times Wipes Journalist's Online Corpus
thefickler writes "Reading about Peter Wayner and his problems with book piracy reminded me of another writer, Thomas Crampton, who has the opposite problem — a lot of his work has been wiped from the Internet. Thomas Crampton has worked for the New York Times (NYT) and the International Herald Tribune (IHT) for about a decade, but when the websites of the two newspapers were merged two months ago, a lot of Crampton's work disappeared into the ether. Links to the old stories are simply hitting generic pages. Crampton wrote a letter to Arthur Sulzberger, the publisher of the NYT, pleading for his work to be put back online. The hilarious part: according to one analysis, the NYT is throwing away at least $100,000 for every month that the links remain broken."
broken links? (Score:2, Interesting)
the NYT is throwing away at least $100,000 for every month that the links remain broken.
now how much would it cost to fix all those links...
no wonder newspapers are not doing well
Links should be permanent (Score:5, Interesting)
Whenever I redesign my site, I try hard to avoid changing any URLs. But if I do have to change a URL, I always make sure there is a redirect (preferably an HTTP 301 permanent redirect) pointing from the old URL to the new one. Updating links is not enough, because there will always be links you don't control: links from external sites, user bookmarks, links found in "Hey, check this article out" e-mails, etc.
This is one of those basic principles of the web that the W3C (or, for those who don't pay attention to the W3C, plain old common sense) strongly recommends.
It means that users can always find and view content. It means that you retain your ad revenue. It means that you keep the PageRank benefit of every external site that links to you. It means less bitrot and a more useful web...
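Not the NYT's setup, obviously, but a minimal sketch of the idea using Python's standard library; the old-to-new URL mapping here is made up for illustration:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical mapping from old article URLs to their new homes.
    REDIRECTS = {
        "/2007/05/some-old-article.html": "/articles/2007/05/some-old-article/",
    }

    class RedirectHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            new_url = REDIRECTS.get(self.path)
            if new_url:
                # 301 = moved permanently: browsers, bookmarks, and
                # search engines all learn the new address.
                self.send_response(301)
                self.send_header("Location", new_url)
                self.end_headers()
            else:
                self.send_error(404)

    if __name__ == "__main__":
        HTTPServer(("", 8000), RedirectHandler).serve_forever()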
And THIS, dear readers, is why paper will win (Score:5, Interesting)
In the digital age, wiping out thousands of volumes of material takes mere seconds. Permanently. Gone. Poof.
We have books, printed books, which go back hundreds and hundreds of years (well, written material; the printing press is a fairly recent invention).
We don't even have a record of some newspaper articles that came out 5 years ago. We're LOSING our history, not retaining it, because we lack sufficient "printing" to always keep a copy in circulation. Witness the Avsim.com [slashdot.org] debacle and hundreds of other cases where this has happened.
Until we can have a hard-copy of digital media which can NOT be changed, edited, altered or redacted... we're lost.
When we all have "Kindle DX2" devices in the classroom for digital copies of our textbooks... what is stopping them from "gently changing" some of the wording over time, over a few years, to permanently alter the way our youth views the history of times they never lived through?
How can you compare one version of a website today with the version that was there last week? Was anything changed? Was article content "censored" in some subtle way?
We're heading down a very slippery slope when digital information can't remain static enough to hold through the years and be validated and verified as unchanged, with sufficient copies in enough hands to ensure survivability. The Internet is not the place to "store" things you want to keep for years and decades.
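There's no general answer, but for a single page you care about, fingerprinting it yourself is at least cheap. A minimal sketch in Python, with a placeholder URL:

    import hashlib
    import urllib.request

    URL = "http://example.com/article.html"  # placeholder

    # Fetch the page and record its SHA-256 fingerprint; compare it
    # against the hash you saved last week to detect any edit at all.
    data = urllib.request.urlopen(URL).read()
    print(hashlib.sha256(data).hexdigest())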
Magazine websites do this all the time (Score:4, Interesting)
My company links to articles on a lot of magazine websites, and I'm just amazed at how often the links become broken. Sites get redesigned and they don't bother redirecting the old URLs to the corresponding new locations. Or, even worse, they just discard all of the old articles, or random articles disappear or come up blank or mangled. Does it not occur to them that websites, search engines, and blogs are left with broken links? Do they not realize that people bookmark the articles?
Re:Another story about the necessity of backups... (Score:3, Interesting)
It's hard to tell from the linked article (yeah, I read it), but the problem doesn't seem to be that Crampton has no copies of the articles (surely he keeps copies of his own stuff); it's that they're no longer accessible on the Internet. All the links that should point to them from the NYT and the IHT went kablammo when the two sites merged.
There's no way a backup on his end could fix this problem.
Re:broken links? (Score:4, Interesting)
Personally, I think that analysis is way off.
I'm seeing 396 results on Google for "Thomas Crampton" site:nytimes.com, out of 1,130 results from the NYT's own on-site search engine.
5 of those Google links are dated in the last week, which I assume are related to this story.
The $100,000-per-month estimated loss is presumably advertising revenue on page hits from links to those stories. A rate of $5 CPM (i.e., $5 for every 1,000 visitors) would mean 20 million visitors a month are clicking through to his stories specifically and can't be satisfied with any other content.
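For what it's worth, the arithmetic behind that figure (a quick Python check):

    monthly_loss = 100000               # claimed loss, $ per month
    cpm = 5.0                           # $5 per 1,000 visitors
    visitors = monthly_loss / cpm * 1000
    print(visitors)                     # 20,000,000 visitors per month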
This would only be a loss if a similar / 404 / search landing page had a lower earnings rate.
Seems unlikely to me - I think this is just [very clever] linkbaiting from someone who, it appears, was sacked by the NYT and is trying to make a living elsewhere.
Re:This sucks (Score:3, Interesting)
You do realize that it's sample code, and thus is used to illustrate the API in question? As it's illustrative only, it will be missing a lot of essential code in the name of clarity. Stuff like error handling (the docs will tell you what it returns and how it returns it), parameter/return checks on associated API calls, or even input checks. After all, most people want to see how to use the API in a few lines of code, not wade through a 1,000-line program because the author decided to check every return value (even the ones from printf()) and abort gracefully in every possible failure case. That's not sample code, and extracting the "how do I use this API?" information from it is quite difficult because of all the extraneous code.
It's assumed a halfway competent programmer would realize that, and use the API properly with proper error checking and input sanitization. Alas, that isn't the case most of the time, and you'll find the sample code copied and pasted into production code by codemonkeys who don't appear to think. If I were a particularly vicious developer at Microsoft, I might write sample code with known security holes (ones any halfway decent programmer would fix, since they'd be obvious) and then check applications for those holes later...
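To make the contrast concrete, here's a made-up file-copy example in Python (not from any real vendor's docs). The first version is sample-code style; the second is the same operation with the checks that sample code omits:

    # Sample-code style: shows the operation in two lines, happy path only.
    data = open("in.txt").read()
    open("out.txt", "w").write(data)

    # Production style: same operation, with the error handling added back.
    import sys

    try:
        with open("in.txt") as src:
            data = src.read()
        with open("out.txt", "w") as dst:
            dst.write(data)
    except OSError as e:
        print("copy failed: %s" % e, file=sys.stderr)
        sys.exit(1)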
Re:broken links? (Score:2, Interesting)
The $100,000-per-month estimated loss is presumably advertising revenue on page hits from links to those stories.
Forgive me, father, for I have RTFA:
So essentially, the "one analysis" says that if they wanted to buy the very-roughly-estimated traffic they hypothetically lost, it would cost them $100K to do so.
"They'd have to spend $100K to get the traffic they were before" is NOT the same as "they are losing $100K as a result of the lost traffic," which the "analysis" suggests.
- RG>