New York Times Wipes Journalist's Online Corpus

Posted by timothy on Friday May 15, @08:12AM
from the check-the-unpersonals dept.
thefickler writes "Reading about Peter Wayner and his problems with book piracy reminded me of another writer, Thomas Crampton, who has the opposite problem — a lot of his work has been wiped from the Internet. Thomas Crampton has worked for the New York Times (NYT) and the International Herald Tribune (IHT) for about a decade, but when the websites of the two newspapers were merged two months ago, a lot of Crampton's work disappeared into the ether. Links to the old stories are simply hitting generic pages. Crampton wrote a letter to Arthur Sulzberger, the publisher of the NYT, pleading for his work to be put back online. The hilarious part: according to one analysis, the NYT is throwing away at least $100,000 for every month that the links remain broken."

Related Stories

Ask Slashdot: What Can I Do About Book Pirates? 962 comments
peterwayner writes "Six of the top ten links on a Google search for one of my books point to a pirate site when I type in 'wayner data compression textbook.' Other search strings actually locate pages that are selling legit copies, including digital editions for the Kindle. I've started looking around for suggestions. Any thoughts from the Slashdot crowd? The free copies aren't boosting sales for my books. Do I (1) get another job, (2) sue people, or (3) invent some magic spell? Is society going to be able to support people who synthesize knowledge or will we need to rely on the Wikipedia for everything? I'm open to suggestions."
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • the NYT is throwing away at least $100,000 for every month that the links remain broken."

    now how much would it cost to fix all those links...

    no wonder newspapers are not doing well

    • Re: (Score:2, Informative)

      according to one analysis, the NYT is throwing away at least $100,000 for every month that the links remain broken."

      Also according to one analysis: the world is flat.

      Apparently the NYT may have a different opinion.

      Either that or they're so large that $100,000 a month is insignificant to them, and fixing it is not the most viable cost-saving/revenue-improving project for them to start at this time.

      • Re:broken links? (Score:4, Interesting)

        by pbhj (607776) on Friday May 15, @09:31AM (#27965205) Homepage Journal

        Personally I think that analysis is way out.

        I'm seeing 396 results on Google for: "thomas Crampton" site:nytimes.com, out of 1130 results from the NYT on-site search engine.

        5 of those google links are dated in the last week, which I assume are related to this story.

        The estimated loss of $100,000 per month is presumably advertising revenue on page hits from links to those stories. Earnings of 500c pm (i.e. $5 for every 1000 visitors) would mean 20 million visitors a month are clicking through to his stories specifically and can't be assuaged with any other content.

        This would only be a loss if a similar / 404 / search landing page had a lower earnings rate.

        Seems unlikely to me - I think this is just [very clever] linkbaiting from someone who, it appears, was sacked from the NYT and is trying to make a living elsewise.
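
For anyone who wants to check the arithmetic above, here is a minimal sketch. It assumes the only inputs are the $100,000 per month claim and the $5 per 1000 visitors rate used in this comment; the function name is made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the comment above.
# Both inputs are the comment's own assumptions, not measured data.
monthly_loss_usd = 100_000       # claimed loss per month
usd_per_1000_visitors = 5.0      # assumed ad earnings per 1000 visitors


def visitors_needed(loss_usd: float, usd_per_thousand: float) -> float:
    """Visitors per month needed to earn `loss_usd` at the given rate."""
    return loss_usd / usd_per_thousand * 1000


required = visitors_needed(monthly_loss_usd, usd_per_1000_visitors)
print(f"{required:,.0f} visitors per month")  # prints "20,000,000 visitors per month"
```

Halving the assumed rate doubles the required traffic, so the conclusion is quite sensitive to the earnings figure.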

        • Re: (Score:3, Interesting)

          Personally I think that analysis is way out.

          I'm seeing 396 results on Google for: "thomas Crampton" site:nytimes.com, out of 1130 results from the NYT on-site search engine.

          That's why you're not an investigative journalist. The $100k/mo. estimate is for all IHT articles which were erased in the merger; not just the one author.

          Do try to keep up.

  • Wayback machine (Score:5, Informative)

    by wjousts (1529427) on Friday May 15, @08:25AM (#27964539)
    Groovy baby [archive.org].
  • CNN's website doesn't have as many broken links.
    Articles over a decade old still work!
    Whoever designed theirs deserves a lot of credit.
  • This sucks (Score:4, Insightful)

    by ILongForDarkness (1134931) on Friday May 15, @08:30AM (#27964581)
    We've come to rely on being able to find things on the internet, so it is sad to think that information might go away and cease to exist. That said, I guess it depends on the writer's contract whether he has a right to have his body of work preserved or not. I mean, if a company pays for your work it is theirs and not yours unless your contract entitles you to it. Once you've sold your work to somebody, they can keep anyone from ever reading it and use it to line hamster cages for all they care.
    • Re: (Score:2, Interesting)

      Yet another reason why locking up content is wrong. Let it be freely copied, and then ANYONE who finds the work valuable can potentially become a caretaker of the work and keep it accessible online. Then the only way a work would disappear is if nobody has interest or time to preserve it.
        • Re: (Score:3, Interesting)

          Hehe yeah. More than a GB of help files and still no help. Have you ever read the EULA for MSDN? It has nice phrases like "code supplied as is" etc. No guarantee that it will work, that it is suitable for any purpose, etc. Pretty boilerplate. (I've always found it funny, though, that documentation that says "this is how you do that" isn't held to at least work for what it is sold to work for.) Then it has a bunch of stuff that says, in effect, you won't accuse MS of writing bad code, if someone sues them bec

  • And it's got unlimited space. Strangely enough, some people are adamant about keeping their works out of this library. And I say they have the right to ensure the internet forgets about them when they die. This poor soul seems to understand what's going on.
  • I was interested in reading the analysis that led to the figure of a $100,000 loss for every month the guy's work was offline. So, doing what you do, I clicked on the link and found it grandly hilarious to receive a 500 error stating: "Error establishing a database connection". Oh, the irony.

  • by code65536 (302481) on Friday May 15, @08:49AM (#27964749) Homepage Journal

    Whenever I redesign my site, I try hard to avoid changing any URLs. But if I do have to change a URL, I always make sure that there is a redirect (preferably an HTTP 301 permanent redirect) that points from the old URL to the new URL. Updating links is not enough, because you will always have links that come from external sites that you don't control, user bookmarks, links found in "Hey, check this article out" e-mails, etc.

    This is one of those basic principles of the web that the W3C (and for those who don't pay attention to them, you can substitute that with "plain old common sense" here) strongly recommends.

    It means that users can always find and view content. It means that you still retain your ad revenue. It means that you still keep your PageRank for external sites that link. It means less bitrot and a more useful web...
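
As a rough illustration of the redirect idea, here is a minimal sketch of an old-to-new URL lookup that a request handler could consult before returning a 404. The paths and the `resolve_redirect` name are hypothetical, made up for this example; a real site would more likely express the same mapping in its web server's rewrite configuration.

```python
from typing import Optional, Tuple

# Hypothetical old-to-new URL map; these paths are invented for illustration.
REDIRECTS = {
    "/articles/2005/01/example-story.php": "/2005/01/example-story.html",
    "/old-section/": "/new-section/",
}

PERMANENT_REDIRECT = 301  # HTTP "Moved Permanently"


def resolve_redirect(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_location) if `path` has moved, else None.

    None means the path is not in the map and should be handled
    normally (served, or answered with a 404 if it never existed).
    """
    target = REDIRECTS.get(path)
    if target is None:
        return None
    return PERMANENT_REDIRECT, target
```

A handler would send the returned status along with a `Location` header set to the new path, so that bookmarks and external links keep working.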

  • I feel for the guy and his lost articles, but I am wondering why he did not keep backups of everything. The stories seem to be gone forever, or else his letter would be about republishing his stories on his own website... That is a rather bad case of negligence on the publisher's side, but more so on the part of Mr. Crampton. For comparison: I work with a professional photojournalist who has been working for 50 years now and has archived everything (more than 1.5 million pictures) like a mad squirrel. If you ask him about an article he wrote in 1961, it takes him about five minutes to find a copy of the article and the raw materials. Everything analog, but nonetheless... That makes you wonder whether, while embracing digital media and the blogosphere, many journalists have not brought with them the necessary tools to manage and archive their digital assets.
    • It's hard to tell from the linked article (yeah, I read it), but it doesn't seem that Crampton has no copies of the articles (surely he would keep copies of his own stuff); rather, they're just not accessible on the Internet. All the links that should point to them from the NYT and the IHT went kablammo when the two sites merged.

      There's no way a back up on his end could fix this problem.

    • Re: (Score:3, Insightful)

      I feel for the guy and his lost articles, [...]

      I feel for him too. Of course the articles aren't his, they are his employer's (unless he has a contract that says otherwise) - which is probably why he's bothered. If they were _his_ articles then he could wholesale upload them to his own site and reap the rewards (whatsoever they may be).

  • In the digital age, wiping out thousands of volumes of material takes mere seconds. Permanently. Gone. Poof.

    We have books, printed books, which go back hundreds and hundreds of years (well, written material; the printing press is a fairly recent invention).

    We don't even have a record of some newspaper articles that came out 5 years ago. We're LOSING our history, not retaining it, because we lack sufficient "printing" to always keep a copy in circulation. Witness the Avism.com [slashdot.org] debacle and hundreds of other cases where this has happened.

    Until we can have a hard-copy of digital media which can NOT be changed, edited, altered or redacted... we're lost.

    When we all have "Kindle DX2" devices in the classroom for digital copies of our textbooks... what is stopping them from "gently changing" some of the wording over time, over a few years, to permanently alter the way our youth views the history of times they never lived through?

    How can you compare one version of a website today, with the one that was there last week? Was anything changed? Was article content "censored" in any subtle way?

    We're heading down a very slippery slope, when digital information can't remain static enough to hold through the years, and be validated and verified to be unchanged, with sufficient copies in enough hands, to ensure survivability. The Internet is not the place to "store" things you want to keep for years and decades.
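
One possible way to answer the "was anything changed?" question, assuming you saved your own copy of the page text last week: compare a SHA-256 digest of each snapshot and, if they differ, list the changed lines. This is only a sketch using Python's standard `hashlib` and `difflib` modules; the function names and the sample text are made up for illustration.

```python
import difflib
import hashlib
from typing import List


def digest(text: str) -> str:
    """SHA-256 fingerprint of a snapshot; any edit changes the digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_lines(old: str, new: str) -> List[str]:
    """Lines removed ("- ") or added ("+ ") between two snapshots."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Invented sample snapshots standing in for saved copies of a page.
last_week = "Officials confirmed the report.\nNo comment was given."
today = "Officials denied the report.\nNo comment was given."

if digest(last_week) != digest(today):
    for line in changed_lines(last_week, today):
        print(line)
```

This only works if someone kept the earlier copy and its digest somewhere the publisher cannot alter, which is the same "sufficient copies in enough hands" point made above.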

  • My company links to articles on a lot of magazine websites, and I'm just amazed at how often the links become broken. Sites get redesigned and they don't bother redirecting the old URLs to the corresponding new locations. Or, even worse, they just discard all of the old articles, or random articles disappear or come up blank or mangled. Does it not occur to them that websites, search engines, and blogs are left with broken links? Do they not realize that people bookmark the articles?

  • When you read the article, you find one of the main reasons he wants the articles back up is because he himself doesn't have copies of the articles. TFA and Slashdot are full of angst towards the megacorp, but nobody seems to have noted this point.

  • Welcome to the Web (Score:4, Insightful)

    by sjvn (11568) <sjvn.vna1@com> on Friday May 15, @10:46AM (#27966541) Homepage

    One of the greatest delusions that people have about the Web is that almost all information can be found on it somewhere. What total nonsense.

    Stories rot from the Web faster than newspaper print ever has or ever will. All that we're left with is the most recent version or revision, which may have *nothing* to do with what was first written.

    If you don't keep copies of your work that appears on the Web, you might as well have thrown it into a fireplace. And, as for everyone else, if you assume for even a moment that what you read on the Web about what happened in technology news even five years ago reflects what people really wrote and thought at the time, you're a fool.

    It's thanks to delusions like this that, for example, people can argue sincerely that Windows is popular because it's good; and not because Microsoft forced a monopoly on hardware vendors. Almost all the reports of DoJ vs. Microsoft from the time are long gone now. The proof that Microsoft's products are only popular because Microsoft made damn sure that no one else would have a chance to compete against them has vaporized.

    The only thing newsworthy about what's happened here is that people think that stories disappearing like this is in any way whatsoever noteworthy. It happens every day.

    Steven