The Internet is already home to millions of books, and may soon have all of them neatly parsed and linked. But can you curl up with a good site?

In secret locations and using secret methods, human beings are scanning lots and lots of books for Google, the world's largest Web-search company. That humans are involved is beyond doubt (fingers are visible in the corners of many scanned pages), although this is uncharacteristic of Google, which has a fetish for purist technology.

Google will not divulge exact numbers, but Daniel Clancy, the project's lead engineer, gives enough guidance for an educated guess: Google's contract with one university library, Berkeley's, stipulates that it must digitize 3,000 books a day. The minimum for the other 12 universities involved may be lower, but the rate for participating publishers is higher. So a conservative estimate has Google digitizing at least 10 million books a year. The total number of titles in existence is estimated to be about 65 million.
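The arithmetic behind that guess can be sketched in a few lines of Python. Only the 3,000-books-a-day figure, the 12 other universities and the 10 million total come from the reporting above; the assumption that scanning runs 365 days a year, and the decision to leave publishers out entirely, are simplifications made here for illustration.

```python
# Back-of-the-envelope check of the scanning estimate.
# Assumption (not from the article): scanning runs every day of the year.
BERKELEY_BOOKS_PER_DAY = 3_000   # stated in Berkeley's contract
OTHER_UNIVERSITIES = 12          # libraries besides Berkeley
DAYS_PER_YEAR = 365              # assumed
ESTIMATE_PER_YEAR = 10_000_000   # the article's conservative estimate

# Berkeley alone, at its contractual minimum.
berkeley_per_year = BERKELEY_BOOKS_PER_DAY * DAYS_PER_YEAR

# Ceiling if every university library matched Berkeley's rate.
all_at_berkeley_rate = berkeley_per_year * (1 + OTHER_UNIVERSITIES)

# Average daily rate the other libraries would need to reach the
# estimate on their own, ignoring books supplied by publishers.
required_other_rate = (ESTIMATE_PER_YEAR - berkeley_per_year) / (
    OTHER_UNIVERSITIES * DAYS_PER_YEAR
)

print(f"Berkeley alone: {berkeley_per_year:,} books a year")
print(f"All 13 at Berkeley's rate: {all_at_berkeley_rate:,} books a year")
print(f"Needed from each other library: {required_other_rate:,.0f} books a day")
```

On these assumptions the other libraries could scan at roughly two-thirds of Berkeley's rate and the total would still reach 10 million, before any publisher contributions are counted.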

Google's is not the only project of its kind. The Internet Archive, for instance, is a nonprofit organization founded in 1996 by Brewster Kahle, a San Francisco idealist who wants to re-create a modern Library of Alexandria containing all public-domain texts and videos. Amazon has been scanning books, as have Microsoft and Yahoo, Google's biggest rivals, and individual libraries around the world. Eager not to be left out, publishers are doing the same. But Google's effort, in scale and ambition, is off the charts.

As books go digital, new questions, both philosophical and commercial, arise. How, physically, will people read books in future? Will technology "unbind" books, as it has unbundled other media, such as music albums? Will reading habits change? What happens when books are interlinked? And what is a book, anyway?

Change is least likely in the physical medium of books. Electronic books do exist; the best-known is the Sony Reader, a book-sized gadget made by the eponymous consumer-electronics company. Sony currently makes 12,000 books available online for download, but "our mission is not to replace the print book," says Ron Hawkins, the Sony Reader's marketing boss.

There is an obvious analogy between what Apple's iPods have done to CD players and what electronic books may do to the printed page, but the shift is unlikely to be quite so comprehensive. The simplest difference is that transferring one's old music CDs onto an iPod is easy, whereas transferring one's old books onto an e-book reader is impossible.

So who is going to read the millions of pages that Google and its colleagues are so busy digitizing? Some people will read them on-screen, some will use Google as a taster for books they will then buy in paper form or borrow from a library, and still more will use it to look for specific snippets that interest them.

The biggest changes are likely to be seen in what becomes a book in the first place. Here the Internet may indeed be to some book genres what Apple has been to music or what YouTube (now part of Google) has been to video. (And who knows what the new iPhone will do for any of these?) Among younger listeners albums are dead. They have been replaced by playlists of individual songs designed to be shared with friends.

In books this has already happened for encyclopedias. Wikipedia, which is free, collaborative and online, has eaten into sales of paper-bound alternatives. So books that people would not traditionally read in their entirety, or that require frequent updating, are likely to migrate online and perhaps to cease being books at all. Telephone directories and dictionaries, and probably cookbooks and textbooks, will all fall into this category.

With nonfiction, the situation is more nuanced. Many nonfiction books express an intellectual idea. Traditionally, the only way to deliver such an idea profitably involved binding it into a 300-page book, says Seth Godin, a blogger and author of eight books on marketing.

"If you had a 50-page idea, you couldn't make any money from it," he says, so a lot of nonfiction books end up on shelves with 250 unread pages. Freedom from such rigidities may save a lot of authorial time.

Nonfiction books will also benefit from another change that comes with digitization. Like Web pages, digitized books can have incoming and outgoing hyperlinks. At the moment, links are only to entire books. But in future, says Google's Clancy, links will point to and from specific phrases or words inside books. Footnotes, citations and bibliographies are obvious points for live links.

This has several benefits. It will help scholarly research, since it makes primary sources much more accessible. And it will reduce the slog of academic book-worming - jotting down the location of a book, trudging through the library, pulling it off the shelf, waiting for the photocopier - to the negligible effort of clicking a mouse.

Such links will make books easier to discover, by helping search engines. As link structures develop around books, search algorithms can count incoming links as "votes," giving more weight to incoming links from much-cited places and less to obscure ones. The (offline) citation culture of academic literature already works this way. This is what gave Larry Page, one of Google's co-founders, the original idea for his search algorithm, which he called PageRank.
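The voting idea can be made concrete with a small sketch in Python. It is a simplified, textbook-style version of link-based ranking, not Google's actual implementation: the damping factor of 0.85, the fixed number of iterations, the `pagerank` function and the five made-up book titles are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank nodes by incoming links, weighting each link by its source's rank.

    `links` maps each title to the list of titles it cites.
    """
    # Collect every title that cites or is cited.
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start with equal weight

    for _ in range(iterations):
        # Every node gets a small baseline share regardless of links.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A citing book splits its "vote" among the books it cites.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A book that cites nothing spreads its weight evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: each title lists the titles it cites.
books = {
    "survey": ["classic", "monograph"],
    "monograph": ["classic"],
    "pamphlet": ["classic", "obscure"],
    "classic": [],
    "obscure": [],
}

ranks = pagerank(books)
for title, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{title}: {score:.3f}")
```

In this toy graph the much-cited "classic" comes out on top, and a citation from a book that is itself cited counts for more than one from a book nobody links to.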

What about all the genres of books that fill a different human need? Some types of fiction - novels as well as novellas - are likely to migrate online and to cease being books. Many fantasy fans, for example, have already put aside books and logged on to "virtual worlds" such as World of Warcraft, in which muscular heroes and heroines get together to slay dragons and such. Science fiction may go the same way, and is arguably already being created by "residents" of online worlds such as Second Life.

Most stories, however, will never find a better medium than the paper-bound novel. That is because readers immersed in a storyline want above all not to be interrupted, and all online media teem with distractions (even a hyperlink is an interruption). People do not read fiction in order to accomplish a specific task in a limited amount of time, as they do reference books and schoolbooks. Random-access dictionaries and cookbooks may be useful; random-access novels less so.

What about short stories and poems? Being short, they fit the new media, so some may do well online and need not be bound in paper. Commuters could receive their daily haiku or sonnet on their mobile phones while taking the bus to work. They might also use the new media to enjoy poetry in a more traditional way. "Storytelling started as oral history," says Adam Smith, the boss of Google's book project, so a partial reversion to that form, through podcasting, would be natural.

But even anthologies of short stories and poems, like longer novels, are unlikely to disappear. People want to be guided by others. They also want media suitable for unhurried reading in beds and bathtubs and on beaches. Above all, they want paper books for what digitization is revealing them to be. Books are not primarily artifacts, nor necessarily vehicles for ideas. Rather, as Godin puts it, they are "souvenirs of the way we felt" when we read something. That is something that people are likely to go on buying.