
How Google's High Speed Book Scanner De-Warps Pages

Posted by ScuttleMonkey
from the onto-dewarping-brains-next dept.
Hugh Pickens writes "Patent 7,508,978, awarded to Google, shows how the company has already managed to scan more than 7 million books. Google's system uses two cameras and infrared light to automatically correct for the curvature of pages in a book. By constructing a 3D model of each page and then 'de-warping' it afterward, Google can present flat-looking pages online without having to slice books up or mash them onto a flatbed scanner. Stephen Shankland writes that the 'sophistication of the technology illustrates that would-be competitors who want to feature their own digitized libraries won't have a trivial time catching up to Google.' First, a book is placed on a flat surface, while above it, an infrared projector displays a special mazelike pattern onto the pages. Next, two infrared cameras photograph the infrared pattern from different perspectives. 'The images can be stereoscopically combined, using known stereoscopic techniques, to obtain a three-dimensional mapping of the pattern,' according to the patent. 'The pattern falls on the surface of (the) book, causing the three-dimensional mapping of the pattern to correspond to the three-dimensional surface of the page of the book.'"
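The stereoscopic step quoted above can be illustrated with the textbook depth-from-disparity relation. This is only a sketch under stated assumptions: it presumes two rectified pinhole cameras, which the patent text quoted here does not specify, and the function name, parameter names, and units are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a pattern point seen by two rectified cameras.

    Assumes the standard rectified pinhole model (an assumption, not
    something stated in the summary): Z = f * B / d, where f is the
    focal length in pixels, B the distance between the cameras in
    metres, and d the horizontal shift of the point between the two
    images in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px
```

Applying this to every matched point of the projected pattern would give a height sample per point, which is the kind of 3D mapping of the page surface the summary describes.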

  • Re:Playing Catch-up (Score:5, Informative)

    by fuzzyfuzzyfungus (1223518) on Friday May 15, 2009 @04:47PM (#27972387) Journal
    Obviously it was worthy enough to be issued; but I don't know how worthy it is in the broader sense.

    Notably, for instance, there has been a fair bit of interest, for some years, in using digital cameras in concert with projectors, either for automatic keystone/distortion correction for projectors that aren't perfectly aligned with the projection surface, or for automatic coordination of multiple projectors illuminating the same surface without laborious manual tiling adjustment. This is, in essence, an equivalent problem (inferring a surface's geometry based on pictures of a known image projected upon it).

    The IEEE has held "Projector-Camera systems" workshops since 2003 [procams.org], and somebody was obviously working on it before that. I'm not saying that Google's patent falls into asshole troll territory or anything; but the notion of doing surface geometry inference based on known image projection isn't nearly as novel as it might seem.
  • Re:Patent!!??!! (Score:2, Informative)

    by aashenfe (558026) on Friday May 15, 2009 @04:51PM (#27972439) Journal
    Simple, I was trying to be funny. Notice the smiley :)
  • by ebingo (533762) on Friday May 15, 2009 @04:54PM (#27972479)
    There are scanners that flip pages themselves like this one: http://www.youtube.com/watch?v=UyB5c3S4vzc&feature=related [youtube.com] but I've seen somewhere (can't remember where though) a video of a scanner that was faster and didn't use vacuum to flip pages. It was quite a lot less noisy.
  • Re:Patent!!??!! (Score:5, Informative)

    by Dewin (989206) on Friday May 15, 2009 @04:57PM (#27972511)

    I believe the pattern barcode scanners use is simply trying to look for the barcode in several different directions, but I could be wrong.

    I also believe there's either rudimentary correction for common types of distortion (e.g. on cylindrical objects) or just wide enough tolerances to allow it to work anyway.

  • Re:Why? (Score:5, Informative)

    by ChaosDiscord (4913) * on Friday May 15, 2009 @05:11PM (#27972681) Homepage Journal
    Google is mostly scanning books borrowed from university libraries. Librarians get cranky if you borrow a book and return a stack of loose sheets of paper.
  • by Anonymous Coward on Friday May 15, 2009 @05:14PM (#27972713)

    Uhhh doesn't Rock Band 2 do that with a miniature microphone (and light sensor) built into the revised guitar?

  • Re:Patent!!??!! (Score:4, Informative)

    by profplump (309017) <zach-slashjunk@kotlarek.com> on Friday May 15, 2009 @05:35PM (#27972963)
    It's just wide tolerances. The whole UPC-scanning system was designed so that the output from the light return sensor could be read directly (ignoring some minor gain control/etc.) as a digital data stream, with the clock rate determined by the horizontal scan rate. There's no reason to do distortion correction because it's not reading an image in the first place; it's just reading a series of high/low signal returns as serial data. I'm sure you could build a more complicated system that does 2-D or 3-D imaging and distortion correction, but it's way more work than is necessary to read a linear UPC.
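    A rough illustration of reading the sensor output as a serial stream rather than an image. This is a sketch, not a UPC decoder: the threshold, the sample values, and both function names are made up for the example, and normalising by the narrowest run is an assumption about how a reader might tolerate a stretched scan.

```python
def run_lengths(samples, threshold=0.5):
    """Threshold the light-return samples and run-length encode them.

    A low return is treated as a dark bar (1), a high return as a
    space (0). Returns a list of (bit, sample_count) pairs.
    """
    runs = []
    for s in samples:
        bit = 1 if s < threshold else 0
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1
        else:
            runs.append([bit, 1])
    return [(bit, count) for bit, count in runs]


def module_widths(runs):
    """Express each run as a multiple of the narrowest run.

    Only relative widths survive, so a uniformly stretched scan
    yields the same result.
    """
    unit = min(count for _, count in runs)
    return [(bit, round(count / unit)) for bit, count in runs]
```

    Feeding the same bar pattern sampled three times as densely through both functions gives identical module widths, which is one way to picture why no image-style distortion correction is needed for a linear code.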
  • Re:Patent!!??!! (Score:5, Informative)

    by Timmmm (636430) on Friday May 15, 2009 @05:51PM (#27973099)

    You jest, but this technique *has* been around for years. I remember when digital cameras first became available there was a product that could perform a 3D scan by projecting a pattern onto the object and using an offset picture. I think the pattern came on a slide - that's how long ago it was! Here's a whole wikipedia page about the scanning technique: http://en.wikipedia.org/wiki/Structured_Light_3D_Scanner [wikipedia.org]

    This picture is especially good: http://en.wikipedia.org/wiki/File:6-seat.jpg [wikipedia.org]

    Anyway, after reading the patent abstract, it isn't about the 3D scanning at all; it appears to be about an algorithm to find the fold once you've already got the point cloud. I would have thought that was fairly trivial. A possible approach would be to take the Radon transform of the height map and find the smallest value that's roughly in the middle.
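    A minimal sketch of that idea, simplified to a single projection angle: it assumes the fold runs roughly parallel to the image columns, so the column sums of the height map stand in for one slice of the transform. The function name and the `middle_fraction` parameter are hypothetical, and this is not the method claimed in the patent.

```python
def find_fold_column(height_map, middle_fraction=0.5):
    """Return the column index most likely to contain the book's fold.

    height_map is a list of rows of page heights. Summing each column
    is a projection at one angle; the fold shows up as the smallest
    sum among the columns in the central band of the image.
    """
    rows = len(height_map)
    cols = len(height_map[0])
    column_sums = [sum(height_map[r][c] for r in range(rows)) for c in range(cols)]
    # Ignore the outer edges, where the pages also drop to table height.
    margin = int(cols * (1 - middle_fraction) / 2)
    candidates = range(margin, cols - margin)
    return min(candidates, key=lambda c: column_sums[c])
```

    A book that sits at an angle would need the search repeated over several projection angles, which is where the full transform comes in.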

  • Re:Playing Catch-up (Score:3, Informative)

    by Pinky's Brain (1158667) on Friday May 15, 2009 @06:49PM (#27973699)

    Really? Structured light to find 3D geometry is old hat ... the optical and signal processing parts of book scanning seem pretty easy; making the mechanical part for page flipping robust seems a lot harder to me.

  • by The Empiricist (854346) on Friday May 15, 2009 @06:58PM (#27973781)

    Cough, you don't have to. I can copy your book all gad damn day long and have not violated your rights or the copyright code.
    The moment I try to distribute them, then it's a copyright violation.

    Be sure to check out the exclusive rights in copyrighted works [cornell.edu] before making blanket assertions on what is and is not legal under copyright law. The exclusive rights granted by copyright include both reproduction and distribution. There are lots of exceptions to these exclusive rights, but an interpretation that completely eviscerates the exclusive right to reproduce a work is not supported by the Copyright Act.

  • Re:Patent!!??!! (Score:4, Informative)

    by petermgreen (876956) <plugwash.p10link@net> on Friday May 15, 2009 @07:25PM (#27974043) Homepage

    It certainly is mathematics, and it's not that hard to understand either. Basically it is the mathematical equivalent of what a hard field tomograph does.

    Consider a function of two values and consider those values to be 2D coordinates. Consider also that the function is zero outside of a defined area.

    Now consider that there are an infinite number of infinitely long straight lines passing through that area, and each can be defined by two parameters: an angle and an offset from the origin in the direction perpendicular to the line.

    Along each of those lines an integral can be calculated. Those integrals form the Radon transform of the function (with each integral being identified by the two parameters).

    Not really that complicated; the trickiest bit is probably deciding how best to approximate the line integrals from your limited number of data points.
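    One crude way to approximate a single line integral from gridded data, as a sketch only: it uses nearest-neighbour sampling at a fixed step along the line and places the origin at the centre of the grid. Those choices, the function name, and the `step` parameter are assumptions made for illustration.

```python
import math


def radon_line_integral(image, theta, offset, step=0.5):
    """Approximate the integral of `image` along one straight line.

    The line is identified by two parameters: `theta`, the angle of its
    normal, and `offset`, its signed distance from the grid centre
    along that normal. `image` is a list of rows and is treated as
    zero outside its bounds.
    """
    rows, cols = len(image), len(image[0])
    cx, cy = (cols - 1) / 2.0, (rows - 1) / 2.0
    nx, ny = math.cos(theta), math.sin(theta)  # unit normal to the line
    dx, dy = -ny, nx                           # unit direction along the line
    half_len = math.hypot(rows, cols) / 2.0    # long enough to cross the grid
    n_steps = int(2 * half_len / step)
    total = 0.0
    for i in range(n_steps + 1):
        t = -half_len + i * step
        x = cx + offset * nx + t * dx
        y = cy + offset * ny + t * dy
        col, row = int(round(x)), int(round(y))
        if 0 <= row < rows and 0 <= col < cols:
            total += image[row][col] * step    # nearest-neighbour sample
    return total
```

    Evaluating this over a grid of angles and offsets gives a sampled version of the transform; interpolating between neighbouring grid values instead of taking the nearest one would reduce the error at the cost of more arithmetic.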

  • by Bob Wehadababyitsabo (629809) on Friday May 15, 2009 @08:22PM (#27974601)
    There are automatic page turning machines that use puffs of air and a stylus to move through a book.
  • by mattack2 (1165421) on Friday May 15, 2009 @09:55PM (#27975333)

    I can't find proof in a quick search, but I do remember others posting responses here recently (possibly Anonymous Cowards) to people mentioning the 20% time with things like (paraphrase) "that will be useful for Google". In other words, the implication (or at least my inference) was that while they are technically "non-Google", the intent was that eventually they would be Google projects or the projects would be killed off.

    I have no first hand knowledge of that, however.

    The small paragraph http://en.wikipedia.org/wiki/Google#Innovation_Time_Off [wikipedia.org] is interesting, and says (with a citation)

    In a talk at Stanford University, Marissa Mayer, Google's Vice President of Search Products and User Experience, stated that her analysis showed that half of the new product launches originated from the 20% time.

  • Re:Playing Catch-up (Score:5, Informative)

    by tomz16 (992375) on Friday May 15, 2009 @11:25PM (#27975913)

    It's simply called adaptive optics (AO). In AO, a guidestar is a natural isolated point-like star that is close to your science object (what you are trying to look at). If a laser is used to excite the sodium layer to create an artificial reference, it's called a "laser guidestar".

    Anyway, this "trick" is completely different from adaptive optics in both the mathematics and implementation.

