The Surprising History of Google’s Push to Scan Millions of Library Books

For nearly 20 years, Google has pursued an ambitious mission to digitize the contents of some of the world’s largest research libraries.

It seemed like the beginning of a new era, when scholars and the public would be able to make new connections and discoveries in the kind of collective digital library that was once the stuff of science fiction. But it soon became apparent that the plan would prove more controversial than its organizers had ever imagined.

In this week’s EdSurge Podcast, we tell the story of this ambitious book scanning effort that sparked an epic legal battle between publishers, authors and technologists. In a way, it’s a story that seems largely forgotten.

To do this, we reached out to Roger C. Schoenfeld, co-author of the new book, “Along Came Google: A History of Library Digitization.” Schoenfeld is a longtime leader in the library community and is the Director of Programs at Ithaka S+R, a nonprofit educational consulting firm.

We asked why people aren’t talking more about this part of recent educational technology history, and what lessons can still be learned from it.

EdSurge: Not so long ago, it was pretty rare to have whole texts of books scanned and available, right?

Roger C. Schoenfeld: Not so long ago at all. You know, 15 years ago it was pretty rare, actually. So the way people discovered books was really different. You browsed the card catalog, or went to a bookstore, or browsed through the stacks. It was a very different experience.

So remind us what Google did around 2004.

There had been a whole series of efforts to digitize library materials before that. This is a really important thing to keep in mind. Our story isn’t just that there was nothing and then there was Google. Our story is that there was already a lot of activity going on. The Internet Archive was active. Carnegie Mellon University was active. There were many, many individual libraries and collaborating libraries active in digitization.

But the efforts were sporadic. They did not scale. They were often risk averse and concerned about digitizing material that was still under copyright. There were all kinds of limitations, and that does not detract from the great work that was done.

And then Google came along. And what really happened was that this dream that librarians, technologists, and others had held for decades, to expand access to knowledge and make book collections widely accessible, found the catalyst that was necessary to make it happen at the scale needed to realize the vision.

The catalyst had two components. Some people will really focus on, “Well, my God, Google had unlimited money, you know, relatively.” But in fact, the amount of money that Google invested was an amount that some institutions were probably willing to invest. Surely 50 or a hundred universities collectively could easily have invested it. So it wasn’t literally the amount of money they brought in.

They also brought in some technology, and designed some new ways to do book scanning faster and more efficiently. But I would argue that the thing that Google brought was actually a kind of catalytic role, to say, “This is going to happen, and this is going to happen quickly, and we will work with anyone who is willing to do it with us.”

Instead of trying to do something in some sort of consensus-driven collaboration across a dozen or a hundred major university libraries, they said, “Let’s find five who are willing to work with us, and we’ll use confidentiality and other kinds of methods to get those five moving at the speed that we want to go,” on, if I could call it that, a Silicon Valley timeline rather than a traditional academic timeline.

Listen to the full interview on this week’s EdSurge Podcast.
