The DIY Book Scanner
azoblue writes "Daniel Reetz did not want to lug around heavy textbooks, so he built a book scanner to create digital copies. '... over three days, and for about $300, he lashed together two lights, two Canon Powershot A590 cameras, a few pieces of acrylic and some chunks of wood to create a book scanner that's fast enough to scan a 400-page book in about 20 minutes (PDF). To use it, he simply loads in a book and presses a button, then turns the page and presses the button again. Each press of the button captures two pages, and when he's done, software on Reetz's computer converts the book into a PDF file. The Reetz DIY book scanner isn't automated — you still need to stand by it to turn the pages. But it's fast and inexpensive.'"
Too bad slavery is illegal (Score:3, Funny)
This would be a good activity for the winter months when farming isn't possible.
Re: (Score:2, Funny)
This would be a good activity for the winter months when farming isn't possible.
That's why God gave us illegal immigrants.
Look out! (Score:4, Insightful)
Here comes the Publisher's Copyright Enforcement Gundams to give you "What For!".
Imagine that, thinking you could actually DO Something like that with your very own property.
What cheek!
Re: (Score:3, Insightful)
Not sure he did it for his own property. But it does prove that books have the best DRM of all.
Re: (Score:2)
Yes, like the dozens of times they've pulled it before.
Re: (Score:3, Insightful)
Let's recap:
-Consumers aren't having their rights protected
-Some courts are actually ruling in favor of removing customers' rights (i.e. every time the RIAA has won some part of a case)
-Legislation to remove rights from consumers is getting more and more popular
-The RIAA and other organizations are making bank off of their sue, settle, and drop campaign. (Sue at random, settle for thousands, drop the cas
Re: (Score:3, Insightful)
Right. After all, scanners have only been around for about fifty years: the publishers just haven't noticed yet. This homebrew effort is sure to bring the matter to their attention.
Re: (Score:3, Interesting)
This allows people to generate high-quality scans of books. Especially with the price of high-quality camera
A bargain (Score:5, Informative)
Except for the lack of an automatic page-turner, Daniel's device is the same as one you can buy commercially for about $20,000 (http://www.treventus.com/bookscanner_pageturner.html).
He was wise to decide on manual page-turning.
Re: (Score:3, Interesting)
I have Kinko's/Staples/Office Depot cut off the spine ($1-$5), clip it on all sides, and go home to my Fujitsu ScanSnap for ADF scanning, automatic color or b/w selection, and OCR. Oh, and you press the button once and walk away.
Re: (Score:3, Insightful)
The automatic page turner costs an additional $19,700 / 833 hours = $23.64 per hour. Hire a high school student for $8.
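A minimal sketch of the arithmetic in that comment. The $20,000 commercial price and the roughly $300 DIY cost come from this thread; the 833 hours is taken from the comment as given (its derivation is not shown here), and the function name is hypothetical.

```python
def extra_cost_per_hour(commercial_price, diy_price, hours_of_scanning):
    """Premium paid for the automatic page-turner, spread over the scanning hours."""
    return (commercial_price - diy_price) / hours_of_scanning

# Figures from the thread: $20,000 commercial unit, $300 DIY rig, 833 hours of work.
rate = extra_cost_per_hour(20_000, 300, 833)
print(f"${rate:.2f} per hour")
```

Rounded to the nearest cent this comes out a cent above the figure quoted in the comment, which appears to have been truncated rather than rounded.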
Re: (Score:2)
Yeah, but then you have an automatic page turner you can sell on ebay.
Re: (Score:2)
True ... how much does a used automatic page turner fetch on ebay?
Re: (Score:3, Funny)
Better yet, how much does a high school student fetch on ebay?
Re: (Score:2)
http://www.vietnamhumanrights.net/Forum/JB_41104.htm [vietnamhumanrights.net]
Starting bid of $5400.
automated page turners pretty cheap (Score:2)
Re: (Score:2)
That's 35 DAYS of actual scanning; people don't work solidly.
More realistically, with a 50 hour working week (which is rather on the high side), it's 17 working weeks or about a third of a man-year.
If you use minimum wage labour (or value your own time that low) it's probably still cheaper than buying the commercial scanner new and throwing it away/putting it in the loft and forgetting about it afterwards, but I bet it's higher than the cost of buying the commercial scanner used, doing the scans and reselling
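The conversion from hours to days and working weeks can be sketched like this. The 833 hours is the figure quoted earlier in this thread and the 50 hour week is the commenter's; the 52 week year and the function name are assumptions made for illustration.

```python
def scanning_effort(total_hours, hours_per_week=50, weeks_per_year=52):
    """Express a block of scanning hours as nonstop days, working weeks and years."""
    days_nonstop = total_hours / 24             # scanning around the clock
    working_weeks = total_hours / hours_per_week
    fraction_of_year = working_weeks / weeks_per_year
    return days_nonstop, working_weeks, fraction_of_year

days, weeks, years = scanning_effort(833)
print(f"{days:.1f} days nonstop, {weeks:.1f} working weeks, {years:.2f} of a year")
```

Both results land just under the rounded numbers in the comment (35 days, 17 weeks, about a third of a year).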
Heh (Score:4, Insightful)
I was excited when I read this because it is a pain in the ass to turn the pages in a 1000 page Constitutional Law textbook. Thus, you can imagine my disappointment when I read that his machine doesn't automate this.
Most universities have at least one library which has a Ricoh scanner that does exactly what his does, i.e. it writes out a PDF onto your USB stick. I don't know where he's a graduate student, but I bet if he looked in his library he could have saved himself $300.
Re: (Score:2)
Most universities have at least one library which has a Ricoh scanner that does exactly what his does, i.e. it writes out a PDF onto your USB stick. I don't know where he's a graduate student, but I bet if he looked in his library he could have saved himself $300.
Except most scanners take on the order of tens of seconds to scan a page, and force you to pick up the book, turn the page, and put it flat again. This arrangement takes a picture and the book is in its normal orientation, so page turning is easy.
Re:Heh (Score:4, Informative)
He has details of the reasons on his blog danreetz.com/blog [danreetz.com]
Re: (Score:3, Insightful)
I do this for my law school textbooks (unless you're a book publisher, in which case I am joking and would never break the law).
What law are you breaking?
Whether you scan it and convert the OCRed text into an audio book, rip all the pages out and turn it into an art exhibit, or use the book for toilet paper, the publisher has no legal right (AFAIK) to stop you.
Re: (Score:2)
IANAL, but... Scanning books is a form of copying. Converting OCRed text into an audio book is a form of creating a derivative work. Both of these fall under the purview of copyright law, and may or may not fall under fair use. It may be the sort of thing that you could fight and win in court, but you'd probably have to fight. And, of course, if the original poster explicitly created this machine because textbooks are expensive, then the "significant non-infringing uses" defense is definitely weaker.
Using
Re: (Score:2)
You might want to read the front matter of just about every book published to see that they specifically address feeding the book into a computer in any way possible and say it is a violation of the copyright if done without permission.
Of course, nobody gives a rat's ass about copyright any longer. So torrenting the books from somewhere like Romania should be just fine.
Re:Heh (Score:4, Insightful)
It doesn't matter what they say. It matters what the law says, and if they tell you that you can't do something the law says you can, the law wins. The more books add legal crap in order to be more like software EULAs, the more lies they will incorporate, like software EULAs.
I doubt there's much of a chance at all that you would be found guilty of copyright infringement for making a format change of your own book, for your own use. That's nearly the most straightforward example of fair use you could imagine. If you distributed it, sure; that's not fair use.
Re:Heh (Score:4, Insightful)
Not sure where you live, but in most areas format shifting is usually recognized as fair use. Whether or not torrenting the PDF counts as format shifting isn't a question that the courts have answered yet, but it's currently the most convenient method.
Inevitable DMCA smackdown coming? (Score:3, Insightful)
Re: (Score:2)
These have existed for a while now, I remember seeing one that actually did turn pages (but used a real scanner and wasn't gentle when it turned the page).
http://www.engadget.com/2006/02/22/build-your-own-fullauto-bookscanner/ [engadget.com]
That isn't to take away from what was done here, just to point out this isn't so new that the publishers/manufacturers don't already know about it.
Cameras usually stink for this.... (Score:4, Insightful)
The cameras he used were only five megapixels.
Might work for looking at the pages on your iPhone. Not gonna look very readable on your laptop screen, and forget about reading the book's footnotes.....
Re: (Score:3, Informative)
There's no problem with the resolution.
9" x 6" page, scanned at 300 dpi = 2700 x 1800 pixels = 4.86 MP.
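That calculation, as a small sketch; the function name is hypothetical, and the page size and 300 dpi are the commenter's figures.

```python
def scan_megapixels(width_in, height_in, dpi):
    """Pixel dimensions and megapixel count needed to cover a page at a given dpi."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px / 1_000_000

w, h, mp = scan_megapixels(9, 6, 300)
print(f"{w} x {h} pixels = {mp:.2f} MP")
```

This only counts pixels; it says nothing about lens sharpness or focus, which other comments here argue matter more.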
Re: (Score:2, Informative)
Lots of book scanners use ccds. They are good enough. No one really wants a 'portable' scanned document that weighs in at 3 gigabytes anyway, current laptop IO makes that a pain in the ass.
Re:Cameras usually stink for this.... (Score:5, Informative)
You haven't actually tried this, have you? I've had various flatbed A4 scanners over the years, all at much higher resolution than a camera, and hence everything got down-sampled afterwards for my display, which is only 1.5MP anyway. Then I switched to using a phone camera with only a 2MP CCD, but a really good lens and decent macro mode (Sony Ericsson Cybershot for those that are interested). As long as the focus was good it produced perfectly readable shots, and so it became my portable scanner. These days I mostly shoot stuff at home so I have a 12MP DSLR to hand. It's huge overkill, and I massively down-sample stuff afterwards, but the results are entirely readable. So your basic claim that this can't be done with a camera based on the resolution compared to a scanner is a complete load of bollocks. The focus of the lens tends to be the important issue.
Re: (Score:3, Informative)
FYI, the color camera on the Mars Rovers.
One Megapixel. Really spiffy and detailed images of the Martian landscape for only one megapixel, don't you think?
Also, TFA states he's using OCR to create a PDF.
If the image from the camera is sharp enough, the OCR software should have no trouble "reading" it.
Re: (Score:2)
It might be nice if you understood digital photography before opining on something you clearly know nothing about.
The Mars Rover camera is a very special instrument. How consumer digital cameras work is with something called a Bayer matrix of red, green and blue filters. The end result is that you get RGB values by interpolation - in reality you have about 1/4th the resolution of the sensor. You can get pretty fancy with the interpolation, but there is still a huge loss of detail. When the output is a J
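A rough sketch of what a Bayer mosaic means for per-colour sampling. It assumes the common RGGB tile layout, and the function name is hypothetical. It only counts photosites; it does not model the interpolation (demosaicing) step the comment mentions.

```python
# Sketch: which colour each photosite records in an RGGB Bayer mosaic.
BAYER_TILE = (("R", "G"),
              ("G", "B"))  # assumed RGGB layout, repeated across the sensor

def bayer_channel_counts(width, height):
    """Count how many photosites sit behind each colour filter."""
    counts = {"R": 0, "G": 0, "B": 0}
    for y in range(height):
        for x in range(width):
            counts[BAYER_TILE[y % 2][x % 2]] += 1
    return counts

counts = bayer_channel_counts(8, 6)
total = 8 * 6
for colour, n in counts.items():
    print(f"{colour}: {n} of {total} photosites ({n / total:.0%})")
```

Under this layout red and blue are each measured at a quarter of the photosites and green at half; how much effective detail that costs in practice depends on the demosaicing and on the subject.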
Re:Cameras usually stink for this.... (Score:5, Funny)
I am well aware of how the Mars cameras work, having done a metric shitload of B&W "color" photography via filters myself.
And you, obviously, know exactly dick about not being an asshole.
Re: (Score:2)
No, it's fine. A laptop screen is 1400x900 or thereabouts; even a cheap camera will have better resolution than that. It's not going to be a problem unless you're doing a fair amount of zooming in.
At 1200dpi, an 8.5" x 11" document will be 10,200 x 13,200 resolution. That may be useful for some purposes, but for simple text browsing it's overkill by nearly a whole order of magnitude.
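To put the "overkill" in numbers, here is a sketch comparing the scan size with the screen size. The 1400x900 screen and the 1200 dpi letter page are the commenter's figures; the function name is hypothetical, and it compares width to width and height to height without rotating the page.

```python
def linear_overkill(page_in, dpi, screen_px):
    """Scan dimensions in pixels and how many times larger they are than the screen."""
    scan_px = tuple(round(side * dpi) for side in page_in)
    ratios = tuple(scan / screen for scan, screen in zip(scan_px, screen_px))
    return scan_px, ratios

scan_px, ratios = linear_overkill((8.5, 11), 1200, (1400, 900))
print(scan_px, [f"{r:.1f}x" for r in ratios])
```

The width ratio is a bit over 7, which fits the "nearly a whole order of magnitude" estimate; the height ratio is larger because the page is portrait and the screen is landscape.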
I've (Score:4, Funny)
What a coincidence! I too have a book scanner that scans books, and requires a human operator to attend to turning the pages.
It's called a scanner.
Re: (Score:2)
(200 scans) * (6 seconds / scan) = 1200 seconds
Otherwise known as 20 minutes.
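The same throughput estimate as a sketch. Two pages per capture and a 400-page book come from the story summary, 6 seconds per scan from the comment above; the function name is hypothetical.

```python
import math

def scan_time_minutes(pages, pages_per_capture, seconds_per_capture):
    """Total time to capture a book, given pages per shot and seconds per shot."""
    captures = math.ceil(pages / pages_per_capture)  # an odd last page still needs a shot
    return captures * seconds_per_capture / 60

print(scan_time_minutes(400, 2, 6), "minutes")
```

Swapping in 1 page per capture shows the cost of a flatbed that images one page at a time at the same pace.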
Re: (Score:2)
And if you can do 400 pages on a flatbed scanner in 20 min, I bet you could do it much, much faster on this guy's setup.
Re:I've (Score:5, Informative)
http://www.geocities.jp/takascience/lego/fabs_en.html [geocities.jp]
turning the pages and scanning is child's play
Some people please mod this right to the top! (Score:2)
Mods - please mod this right to the top!
Quoting from link:
...
Or, everyone had been thinking so, until I found
- that a scratch of an eraser can turn a page, and
- that if you place the scanner upside down you don't need to flip the book,
Re: (Score:2)
Plus it uses a normal scanner and has an automated page turner !
Looks like the BOM is under $200
Re: (Score:3, Funny)
Has anyone tried shotgun scanning yet? Irregularly shred the books, feed the shreds into a bulk scanner, and use a computer to reassemble.
Re: (Score:2)
Why? Just give them the digital file, and they'll regain some much needed shelf space.
Re: (Score:2)
My library doesn't appreciate lending books and returning them as a digital copy. Can you direct me to one that does?
Seek, and ye shall find.
repost (Score:5, Informative)
http://bkrpr.org/doku.php [bkrpr.org]
Same thing, much cheaper (I built mine for ~150 USD.)
Re:repost (Score:5, Informative)
Re: (Score:2)
Besides being half the price of the other setup, there is a larger consideration. It is the size. I have no place to store the scanner they use in this article. I am hard pressed to find a place I could set it up other than in the middle of my living room. The smaller scanner for half the price I could find a place to store it when I am not using it.
Re: (Score:2)
I don't get it. Can't you simply leave out the "front" side of the box, that is, the side where you'd sit if you were reading the book? The cameras don't need a piece of glass there, and the whole contraption could still be stable. That way you could reach in and turn the page without lifting the glass box. Seems much more convenient. I must be missing something.
Re: (Score:3, Informative)
Your idea would end up with bent pages.
Re: (Score:2)
Yep, there it is. That is what I was missing. Thanks.
Are there scanners that accept a stack of sheets? (Score:2, Insightful)
If so, wouldn't it be easier to just rip out the binding and put in the pages? The $15 cost of buying another copy is less than all that boring, repetitive manual labor.
Re:Are there scanners that accept a stack of sheet (Score:5, Informative)
You must not have ever gone to college. A textbook for $15? Get real.
Re:Are there scanners that accept a stack of sheet (Score:4, Insightful)
One semester's worth of books in college today runs around $1000. With this device you can return the books after you've scanned them. If you rip out the binding, most bookstores are going to frown on returns.
So this device saves about $700 the first semester, and $1000/semester after that.
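A sketch of that savings estimate. The $1000 per semester and the roughly $300 scanner cost are the thread's figures; the function name is hypothetical, and like the comment it assumes every book can be returned for a full refund.

```python
def cumulative_savings(semesters, books_per_semester=1000, scanner_cost=300):
    """Money saved after a number of semesters, net of the one-off scanner cost."""
    return semesters * books_per_semester - scanner_cost

for n in (1, 2, 8):
    print(f"after {n} semester(s): ${cumulative_savings(n)}")
```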
Re: (Score:2)
I'd be for cutting off the binding of all the books and using a standard duplex scanner - you'll be able to sell the books on to a poor student (or give them away) and you'll be able to sell copies of the format shift to your fellow students; you'll need proof that a) they own the book and b) the publisher doesn't do an electronic public sale already. You could even buy a glued-tape style binding machine if you found that it was cost effective.
Re: (Score:2)
Oh it's absolutely illegal. But how would you ever hope to catch them?
He's just pretending (Score:3, Insightful)
He keeps talking about how expensive the books are. Clearly he is just using this to scan other people's books to avoid paying.
Still a pretty cool build though :P
Re:He's just pretending (Score:5, Insightful)
He may be scanning books to pirate them. However, I am a college student as well but trying to save money by pirating the books is not my objective.
I am in my 40's and my eyesight is not what it used to be. Here is why I would buy the books and scan them.
1. To be legal and comply with the law. I may very well buy the books used, to get them as cheaply as possible. But I will buy them.
2. It is much lighter for me to carry one laptop around on campus, perhaps with copies of all the books I have used for all terms up to the current term.
3. I can zoom the pages to a comfortable size to read the text.
4. I now have the ability to search through the text.
5. I can use a text-to-speech reader to listen to the book, I can even make an mp3 of the book if I so desired.
To me it sounds like a bargain
Re: (Score:2)
"Clearly he is just using this to scan other people's books to avoid paying."
Textbook makers and colleges exploit a captive student population, so that attitude is understandable.
high quality digital cameras doom textbooks (Score:5, Interesting)
This is a market that relies on outrageous reproduction prices just like CDs used to. They are equally doomed. I know a LOT of college students who no longer buy books ... they rent them for free by buying them, shooting them, and returning them. It may take a couple of hours to do manually without a device like this, but $80 per hour is pretty good wages for a college student.
Re: (Score:2)
Or just download them from other students?
I recently taught an upper level computer science course in a second world country. I was worried about whether the students would have access to books. No problem: students have already digitized all common undergraduate textbooks and share them on various Eastern European websites. So the official course webpages often just link the textbook directly.
Re: (Score:2)
Thanks to the DPMCA (the Digital Post-Modernist Copyright Act), it's now required to have a professor explain post-modernist works to you.
Any device which enables you to circumvent the professor and understand Lacan or Derrida directly can land you in jail!
Bandsaw (Score:2)
Just use a bandsaw to cut off the spine and feed it through a normal scanner with a sheet feeder. Duh. Faster, cheaper, and better results along the spine.
Oh, you wanted to keep the books INTACT?
better wy (Score:4, Informative)
from the comments with the article
posted by: irrational | 12/11/09 | 11:56 pm
I do it in 5 steps, and you get rid of the book when you’re done since you don’t need to store it. After you get done putting 200 hours into your creation, you’ll have spent thousands of dollars worth of your time. I solved this problem much more quickly years ago:
1. Buy a good sheet-fed and high-speed scanner. I have a Panasonic KV-S2026 color.
2. Get a decent jigsaw from Home Depot. Use metal cutting blades (24 teeth/inch or better)
3. Saw the spines off the book and for God’s sake use some C-clamps on each end of the book. Preferably sandwich them between two flat boards.
4. Remove and feed sheets through the scanner to OmniPage and text recognize the pages.
5. Save as PDF.
6. Repeat. You now have searchable digital books!
Re:better wy (Score:4, Insightful)
Even thousands of dollars worth of your time can be recouped easily over 4-5 years of college book costs. And rarely will a college student find a job that pays better than scanning their own books to save book costs.
A million monkeys... (Score:2)
might not be able to *write* the entire collection of Shakespeare, but with this setup, I'm quite sure that they would be able to digitize it!
Well, ironically (Score:3, Insightful)
Ironically, all these books that he and others are trying to scan into a digital format were created in a digital format from the start, sitting on a publisher's computer somewhere.
Thanks copyright laws! Thank you very little.
Re: (Score:3, Informative)
This means that every textbook HAS a doc or PDF version you can get from the publisher. As a professor I regularly get pdf versions of my text books for "disabled" students who can't afford the $95 these leeches charge for the text I use.
I'm in the process of putting together a "text pack" that consists of short excerpts from dozens
Did the same thing with just a single camera (Score:3, Informative)
Re: (Score:3, Insightful)
I was reading about OCR accuracy in my Game Developer magazine just last night, and they were lamenting that 98% accuracy really wasn't good enough for them. I know that the difference between personal and professional use is rather wide, but they printed a few sentences with 98% accuracy and I will admit, it was distracting. Of course, if they hadn't mentioned, would I have noticed?
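To get a feel for what 98% means, here is a back-of-the-envelope sketch. It assumes the figure is per-character accuracy and that errors are independent, which real OCR errors are not; the 2000 characters per page and the function names are hypothetical.

```python
def expected_errors(char_count, accuracy):
    """Expected number of wrong characters in a run of text."""
    return char_count * (1 - accuracy)

def prob_word_correct(word_length, accuracy):
    """Chance that every character of a word is right, assuming independent errors."""
    return accuracy ** word_length

print(f"{expected_errors(2000, 0.98):.0f} expected errors on a 2000-character page")
print(f"{prob_word_correct(5, 0.98):.1%} chance a 5-letter word is fully correct")
```

Under these assumptions, roughly one five-letter word in ten would contain a mistake, which matches the impression that 98% is distracting to read.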
Re: (Score:3, Interesting)
When you OCR the resulting PDFs from using a scanner, you use a mode that includes data from the original scan. For instance, I just use Adobe Acrobat's "clear scan" OCR mode. What it does is it OCRs the text, and uses the OCR data to sharpen the scan of the letters in the PDF document. It then downsamples all the image data to a resolution that you specify. Basically, the resulting PDF is a hybrid between an OCRed file and the original image data that was scanned in. You can easily read all of the tex
Copier (Score:2)
I thought about doing this several years ago to archive a huge stack of old lab notebooks, then we bought some Ricoh copiers that were also scanners with a platen large enough to scan two pages at once. I was able to turn a 300 page notebook into pdfs in about a half hour.
Dupe ? (Score:3, Informative)
The scanner was described 3 months ago in a question to Ask Slashdot:
http://ask.slashdot.org/story/09/09/27/199251/Software-To-Flatten-a-Photographed-Book [slashdot.org]
The answer:
http://ask.slashdot.org/comments.pl?sid=1383895&cid=29559637 [slashdot.org]
on a related note (Score:2)
I have a project that requires text recognition. I need to quickly identify the presence of text URLs in several thousand photographs. In the easy cases, the URL is a solid color on a contrasting background, added as a band across the top or bottom of the photo. But in the hard cases it's a partially transparent watermark across the center of the photo that may be rotated several degrees from horizontal. The good news is that the URLs all start with "http://", and I don't need the software to capture
Re: (Score:2)
If it can't directly, OCR to a plain text file and grep for http://
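A minimal sketch of the "grep for http://" step, assuming the OCR stage has already written one plain text file per photo. The directory layout, the `.txt` extension and the function names are all hypothetical; no particular OCR tool is implied.

```python
import re
from pathlib import Path

# Anything starting with http:// up to the next whitespace counts as a hit.
URL_PATTERN = re.compile(r"http://\S+", re.IGNORECASE)

def find_urls(ocr_text):
    """Return every http:// URL-like token found in a block of OCR output."""
    return URL_PATTERN.findall(ocr_text)

def files_with_urls(text_dir):
    """Map each .txt file in text_dir that contains a URL to the URLs found in it."""
    hits = {}
    for path in sorted(Path(text_dir).glob("*.txt")):
        urls = find_urls(path.read_text(errors="ignore"))
        if urls:
            hits[path.name] = urls
    return hits

print(find_urls("photo courtesy of http://example.com/gallery 2009"))
```

If the OCR output garbles the prefix (for example reading "http" as "hitp"), a looser pattern would be needed; this sketch only matches the exact prefix.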
See also the BookLiberator, a more compact design (Score:5, Interesting)
See also the BookLiberator [bookliberator.com], a somewhat more compact cube-in-cradle design, that's also easy to build. Although soon you won't have to build your own: we're prototyping a manufacturable, flat-packed kit to sell from our online store; see questioncopyright.org/bookliberator [questioncopyright.org] for more about the project. It should be ready next year.
None of which is to detract from Reetz's accomplishment, of course. This renaissance in personal book scanners is going to make it easier for all of them, in the long run, especially as we can share the same open source software among all the scanners.
I wonder if anyone in my area has such a rig? (Score:2)
I have a book I would LOVE to preserve digitally. I have an extremely rare and out of print book -- it doesn't have an ISBN or anything! Technically, though, I believe it is copyrighted. I would like to scan it in and OCR it into a usable format that can then be put anywhere. (PDF bitmap pages are ridiculously large!) It is "Home Again" by James Edmiston. Copyright 1955 by James Ewen Edmiston, Jr. First Edition, signed by the author. Library of Congress number 55-5265. It is a significant and import
Re:I wonder if anyone in my area has such a rig? (Score:5, Insightful)
Based on the last 40 years of Disney legislation?
For-fucking-ever.
Re: (Score:2)
Well Merry Christmas to me! I found it! The very same file... and perhaps in the very same place. Check out the file dates in here...
ftp://ftp.de.flightgear.org/ftp.monash.edu.au/pub/nihongo/ [flightgear.org]
eBook (Score:2)
Now if only textbooks came as e-books, then this whole tech would be unnecessary.
I could have sworn I saw this on slashdot before.. (Score:2)
Something just like this setup was in a comment for an ask slashdot article -
http://ask.slashdot.org/comments.pl?sid=1383895&cid=29559637 [slashdot.org]
Plenty of links, but what about page turning? (Score:2)
There's plenty of people working on this at the DIY Book Scanning site [diybookscanner.org], but what they all lack... is page turning. I found this great project [youtube.com] some students came up with that is simplistic and doesn't require you to preload pages at all.
Incorporate that, with the glass/plexi platen of the stock DIY book scanning projects, and you have a 100% complete, automatic, turn-it-on-and-walk-away book scanner from beginning to end.
Just saw the spine off! (Score:3, Informative)
Re: (Score:2)
You have a Kodak printer too, eh? :P
You can run it from a Windows VM.
Re: (Score:2)
http://www.theregister.co.uk/2009/07/18/amazon_removes_1984_from_kindle/ [theregister.co.uk]
Re: (Score:2)
An old wall-wart from some ancient piece of electronic gear can often deliver the needed power. Anyone capable of building this scanner should be able to scrounge up a functioning replacement power source.