http://www.nybooks.com/articles/archives/2010/dec/23/library-three-jeremiad…
The Library: Three Jeremiads
December 23, 2010
Robert Darnton
When I look back at the plight of American research libraries in 2010,
I feel inclined to break into a jeremiad. In fact, I want to deliver
three jeremiads, because research libraries are facing crises on three
fronts; but instead of prophesying doom, I hope to arrive at a happy
ending.
I can even begin happily, at least in describing the state of the
university library at Harvard. True, the economic crisis hit us hard,
so hard that we must do some fundamental reorganizing, but we can take
measures to make a great library greater, and we can put our current
difficulties into perspective by seeing them in the light of a long
history. Having begun in 1638 with the 400 books in John Harvard’s
library, we now have accumulated nearly 17 million volumes and 400
million manuscript and archival items scattered through 45,000
distinct collections. I could string out the statistics indefinitely.
We collect in more than 350 languages and many different formats. We
have 12.8 million digital files, more than 100,000 serials, nearly 10
million photographs, online records of 3.4 million zoological
specimens, and endlessly rich special collections, including the
largest library of Chinese works outside of China (with the exception
of the Library of Congress) and more Ukrainian titles than exist in
Ukraine.
We want to make it possible for other people to consult those
collections by digitizing large portions of them and making them
available, free of charge, to the rest of the world from an online
repository. We group the material around themes such as women at work,
immigration, epidemics and disease control, Islamic heritage, and
scientific explorations—2.3 million pages in all. This Open
Collections Program, as we call it, is part of a general policy of
opening up our library to the outside world and sharing our
intellectual wealth. The latest project is devoted to reading, its
practices and history. It involved the digitization of more than
250,000 pages from manuscripts and rare books, including richly
annotated works such as Melville’s copy of Emerson’s essays and
Keats’s copy of Shakespeare.
There are few places aside from research libraries where rare books
and e-books can be brought together. At Harvard we use combinations of
them for teaching as well as research. I now teach a seminar on the
history of books in our rare book library. It begins with Gutenberg.
The students investigate the origins of printing by examining a
Gutenberg Bible, the real thing, and they do not just stare at it from
a respectful distance, but they are invited to leaf (carefully)
through its pages in order to appreciate the varieties of rubrication
and typographical design. The seminar ends in a high-tech lab on the
bottom floor of Widener Library, where experts in digitization explain
how to adjust nuances of color while scanning medieval manuscripts.
Despite financial pressure, we therefore are advancing on two fronts,
the digital and the analog. People often talk about printed books as
if they were extinct. I have been invited to so many conferences on
“The Death of the Book” that I suspect it is very much alive.
In fact, more printed books are produced each year than the year
before. Soon there will be a million new titles published worldwide
each year. A research library cannot ignore this production on the
grounds that our readers are now “digital natives” living in a new
“information age.” If the history of books teaches anything, it is
that one medium does not displace another, at least not in the short
run. Manuscript publishing continued to thrive for three centuries
after Gutenberg, because it was often cheaper to produce a small
edition by hiring scribes than by printing it. The codex—a book with
pages that you turn rather than a scroll that you read by unrolling—is
one of the greatest inventions of all time. It has served well for two
thousand years, and it is not about to become extinct. In fact, it may
be that the new technology used in print-on-demand will breathe new
life into the codex—and I say this with due respect to the Kindle, the
iPad, and all the rest.
But without neglecting our collections of printed books, we must forge
ahead on the other, the digital front. Our purchases of e-resources
increased by 25 percent at Harvard last year. We are expanding our
enormous Digital Repository Service in a campaign not just to save
digital texts but to help solve the problem of preserving them. A new
Library Lab is inventing techniques for digital browsing and the
preservation of e-mail, websites, and born-digital archives. Our
open-access repository, DASH, is making current articles by Harvard
faculty available online and free of charge throughout the world. And
we plan to collaborate with MIT in building joint digital collections.
In short, we are looking far ahead into the twenty-first century, and
we hope to help shape the information society of the future.
Still, there is no disguising the fact that research libraries are
going through hard times—times so hard that they are inflicting
serious damage on the entire world of learning. We face three
especially difficult problems, which I would like to discuss by
drawing on my own experience, recounted in the form of three
jeremiads.
Jeremiad 1
In 1998 I had my first encounter with a problem that now pervades the
academic world. It can be described as a vicious circle: the
escalation in the price of periodicals forces libraries to cut back on
their purchase of monographs; the drop in the demand for monographs
makes university presses reduce their publication of them; and the
difficulty in getting them published creates barriers to careers among
graduate students. Although librarians have lived with this problem
for decades, faculty are only dimly aware of its existence—not
surprisingly, because libraries pay for the journals, professors
don’t.
When this problem first dawned on me as chairman of Princeton’s
library committee in the 1980s, the price of journals had already
increased far more than the inflation rate; and the disparity has
continued until today. In 1974 the average cost of a subscription to a
journal was $54.86. In 2009 it came to $2,031 for a US title and
$4,753 for a non-US title, an increase greater than ten times that of
inflation. Between 1986 and 2005, the prices for institutional
subscriptions to journals rose 302 percent, while the consumer price
index went up by 68 percent. Faced with this disparity, libraries have
had to adjust the proportions of their acquisitions budgets. As a
rule, they used to spend about half of their funds on serials and half
on monographs. By 2000 many libraries were spending three quarters of
their budget on serials. Some had nearly stopped buying monographs
altogether or had eliminated them in certain fields.
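The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above; the function names are illustrative, and deflating the 302 percent rise by the 68 percent rise in the consumer price index is one conventional way to express an increase in real terms, not a calculation taken from the sources of the statistics.

```python
# Figures quoted in the paragraph above.
PRICE_1974 = 54.86          # average journal subscription, 1974 (USD)
PRICE_2009_US = 2031.00     # average US title, 2009 (USD)
PRICE_2009_NON_US = 4753.00 # average non-US title, 2009 (USD)


def growth_factor(old_price, new_price):
    """How many times larger the new price is than the old one."""
    return new_price / old_price


def real_increase_pct(nominal_pct, inflation_pct):
    """Percentage increase that remains after deflating by inflation."""
    nominal = 1 + nominal_pct / 100
    inflation = 1 + inflation_pct / 100
    return (nominal / inflation - 1) * 100


us_factor = growth_factor(PRICE_1974, PRICE_2009_US)
non_us_factor = growth_factor(PRICE_1974, PRICE_2009_NON_US)
# 1986-2005: subscriptions up 302 percent, consumer prices up 68 percent.
real_1986_2005 = real_increase_pct(302, 68)

print(f"US titles, 1974-2009: {us_factor:.1f} times the 1974 price")
print(f"Non-US titles, 1974-2009: {non_us_factor:.1f} times the 1974 price")
print(f"Real increase, 1986-2005: {real_1986_2005:.1f} percent")
```

On these numbers, the average US title cost roughly 37 times its 1974 price by 2009, and institutional subscriptions rose by well over 100 percent between 1986 and 2005 even after inflation is taken out.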
Another rule of thumb used to prevail among the better university
presses. They could count on research libraries purchasing about eight
hundred copies of any new monograph. By 2000 that figure had fallen to
three or four hundred, often less, and not enough in most cases to
cover production costs. Therefore, the presses abandoned subjects like
colonial Latin America and Africa. They fell back on books about local
folklore or cooking or birds, works that fit into niches or could be
marketed to a broader public but that had little to do with scholarly
research. And graduate students fell victim to the notorious syndrome
of publish or perish.
As president of the American Historical Association in 1999, I thought
I could do something, at least in a small way, to reverse this trend.
I persuaded the Andrew W. Mellon Foundation to finance a program,
called Gutenberg-e, that would award prizes to the best Ph.D. theses
in the most endangered fields. The prize money would subsidize the
cost of converting the dissertations into books, books of a new kind,
electronic books that would take advantage of the new technology to
incorporate all sorts of new elements—film clips, recordings, images,
and whole collections of documents. The originality and the quality of
these e-books would legitimate a new form of scholarly communication
and revive the monograph.
One of the first questions that the people at Mellon asked me was
“What is your business plan?” Although I had never heard of a business
plan, I soon began to appreciate the economic conditions of
scholarship. Columbia University Press developed a program to sell the
e-books to research libraries as a package for a moderate subscription
price. The libraries responded favorably, but the scholars had
difficulty in producing their books on time, the pipeline became
clogged, and the delayed output hurt sales. In the end, after seven
years of struggle, we produced a fine series of thirty-five books, and
we had begun to cover our costs. But Columbia, like many university
presses, came under severe financial pressure. It decided that it
could not continue the series after the Mellon grant ran out. The
books were assimilated into the Humanities E-Book program developed by
the American Council of Learned Societies, and they are still
available online. But Gutenberg-e did not open up an escape route from
the problems of sustainability that were plaguing academic life.
Jeremiad 2
A few years later, “sustainability” had become a buzzword, and the
inflationary spiral of journal prices had continued unabated. In 2007
I became director of the Harvard University Library, a strategic
position from which to take the full measure of the business
constraints on academic life. Although economic conditions had
worsened, the faculty’s understanding of them had not improved.
How many professors in chemistry can give you even a ballpark estimate
of the cost of a year’s subscription to Tetrahedron (currently
$39,082)? Who in medical schools has the foggiest notion of the price
of The Journal of Comparative Neurology ($27,465)? What physicist can
come up with a reasonable guess about the average price of a journal
in physics ($3,368), and who in the humanities can compare that with
the average price of a journal in language and literature ($275) or
philosophy and religion ($300)?
Librarians who buy these subscriptions for the use of faculty and
students can shower you with statistics. In 2009, Elsevier, the giant
publisher of scholarly journals based in the Netherlands, made a $1.1
billion profit in its publishing division, yet 2009 was a disastrous
year for library budgets. Harvard’s seventy-three libraries cut their
expenditures by more than 10 percent, and other libraries suffered
even greater reductions, but the journal publishers were not
impressed. Many of them raised their prices by 5 percent and sometimes
more. This year, the publishers of the several Nature journals
announced that they were increasing the cost of subscriptions for
libraries in the University of California by 400 percent. Profit
margins of journal publishers in the fields of science, technology,
and medicine recently ran to 30–40 percent; yet those publishers add
very little value to the research process, and most of the research is
ultimately funded by American taxpayers through the National
Institutes of Health and other organizations.
University libraries have little defense against excessive pricing. If
they cancel a subscription, the faculty protest about being cut off
from the circulation of knowledge, and the publishers impose drastic
cancellation fees. Those fees are written into contracts, which often
cover “bundles” of journals, sometimes hundreds of them, over a period
of several years. The contracts provide for annual increases in the
cost of the bundle, even though a library’s budget may decrease; and
the publishers usually insist on keeping the terms secret, so that one
library cannot negotiate for cheaper rates by citing an advantage
obtained by another library. A recent court case in the state of
Washington makes it seem possible that publishers will no longer be
able to prevent the circulation of information about their contracts.
But they continue to sell subscriptions in bundles. If in negotiating
the renewal of a contract a library attempts to unbundle the offer in
order to eliminate the least desirable journals, the publishers
commonly raise the prices of the other journals so much that the total
cost remains the same.
While prices continued to spiral upward, professors became entrapped
in another kind of vicious circle, unaware of the unintended
consequences. Reduced to essentials, it goes like this: we academics
devote ourselves to research; we write up the results as articles for
journals; we referee the articles in the process of peer reviewing; we
serve on the editorial boards of the journals; we also serve as
editors (all of this unpaid, of course); and then we buy back our own
work at ruinous prices in the form of journal subscriptions—not that
we pay for it ourselves, of course; we expect our library to pay for
it, and therefore we have no knowledge of our complicity in a
disastrous system.
Professors expect services from their libraries, even if they never
set foot in them and consult Tetrahedron or The Journal of Comparative
Neurology from computers in their labs. A few, however, have stared
the problem in the face and seized it by the horns. In 2001 scientists
at Stanford and Berkeley circulated a petition calling for their
colleagues to submit articles only to open-access journals—that is,
journals that made them available from digital repositories free of
charge, either immediately or after a delay.
The effectiveness of such journals had been proven by BioMed Central,
a British enterprise, which had been publishing a whole series of them
since 1999. Led by Harold Varmus, a Nobel laureate who is now director
of the National Cancer Institute, American researchers allied with the
Public Library of Science founded their own series, beginning with
PLoS Biology in 2003. Foundations provided start-up funding, and
ongoing publication costs were covered by the research grants received
by the authors of the articles. Thanks to rigorous peer review and the
prestige of the authors, the PLoS publications were a great success.
According to citation indexes and statistics of hits, open-access
journals were consulted more frequently than most commercial
publications. By 2008, when the National Institutes of Health required
the recipients of its grants to make their work available through open
access—although it permitted an embargo of up to twelve months—cracks
were appearing everywhere in the commercial monopoly of publishing in
the medical sciences.
But what could be done in all the other disciplines, especially those
in the humanities and social sciences, where grants are not so
generous, if they exist at all? Several universities passed
resolutions in favor of open access and established digital
repositories for articles, but the compliance rate of the professors,
often 4 percent or less, made them look ineffective. At Harvard we
developed a new model. By a unanimous vote on February 12, 2008,
professors in the Faculty of Arts and Sciences bound themselves to
deposit all of their future scholarly articles in an open-access
repository to be established by the library and also granted the
university permission to distribute them.
This arrangement had an escape clause: anyone could refuse to comply
by obtaining a waiver, which would be granted automatically. In this
way, professors retained the liberty to publish in closed-access
journals, which might refuse to accept an article available elsewhere
on open access or might require an embargo. This model has now spread
to other faculties at Harvard and to other universities, but it is not
a business model. If the monopolies of price-gouging publishers are to
be broken, we need more than open-access repositories. We need
open-access journals that will be self-sustaining.
A supplementary program at Harvard now subsidizes publishing fees for
articles submitted to open-access journals, up to a yearly limit, for
each professor. The idea is to reverse the economics of journal
publishing by covering costs, rationally determined, at the production
end instead of by paying for an exorbitant profit in addition to the
production costs at the consumption end. If other universities adopt
the same policy and if professors apply pressure on editorial boards,
journals will shift, little by little, one after the other, to open
access. The Compact for Open-Access Publishing Equity (COPE), launched
this year, is an attempt to create a coalition of universities to push
journal publishing in this direction. It also envisages subsidies for
authors who cannot expect financial help from grants or their home
universities.
If COPE succeeds, it could save billions of dollars in library
budgets. But it will only succeed in the long run. Meanwhile, the
prices of commercial journals continue to rise, and the balance sheets
of university presses continue to sink into the red. In 2003 Walter
Lippincott, the director of Princeton University Press, predicted that
twenty-five of the eighty-two university presses in the United States
would disappear within five years. They are still alive, but they are
barely holding on by their fingernails. They may find a second life by
publishing online and by taking advantage of technological innovations
such as the Espresso Book Machine. This can download an electronic
text from a database, print it out within four minutes, and make it
available at a moderate price as an instant print-on-demand paperback.
But just when this glimmer of hope appeared on the horizon, it was
overshadowed by the most powerful technological innovation of them
all: relevance-ranking search engines linked to gigantic databases, as
in the case of Google Book Search, which is already providing readers
with access to millions of books. This brings me to Jeremiad 3.
Jeremiad 3
Google represents the ultimate in business plans. By controlling
access to information, it has made billions, which it is now investing
in the control of the information itself. What began as Google Book
Search is therefore becoming the largest library and book business in
the world. Like all commercial enterprises, Google’s primary
responsibility is to make money for its shareholders. Libraries exist
to get books to readers—books and other forms of knowledge and
entertainment, provided for free. The fundamental incompatibility of
purpose between libraries and Google Book Search might be mitigated if
Google could offer libraries access to its digitized database of books
on reasonable terms. But the terms are embodied in a 368-page document
known as the “settlement,” which is meant to resolve another conflict:
the suit brought against Google by authors and publishers for alleged
infringement of their copyrights.
Despite its enormous complexity, the settlement comes down to an
agreement about how to divide a pie—the profits to be produced by
Google Book Search: 37 percent will go to Google, 63 percent to the
authors and publishers. And the libraries? They are not partners to
the agreement, but many of them have provided, free of charge, the
books that Google has digitized. They are being asked to buy back
access to those books along with those of their sister libraries, in
digitized form, for an “institutional subscription” price, which could
escalate as disastrously as the price of journals. The subscription
price will be set by a Book Rights Registry, which will represent the
authors and publishers who have an interest in price increases.
Libraries therefore fear what they call “cocaine pricing”—a strategy
of beginning at a low rate and then, when customers are hooked,
ratcheting up the price as high as it will go.
To become effective, the settlement must be approved by the district
court in the Southern Federal District of New York. The Department of
Justice has filed two memoranda with the court that raise the
possibility, indeed the likelihood, that the settlement could give
Google such an advantage over potential competitors as to violate
antitrust laws. But the most important issue looming over the legal
debate is one of public policy. Do we want to settle copyright
questions by private litigation? And do we want to commercialize
access to knowledge?
I hope that the answer to those questions will lead to my happy
ending: a National Digital Library—or a Digital Public Library of
America (DPLA), as some prefer to call it. Google demonstrated the
possibility of transforming the intellectual riches of our libraries,
books lying inert and underused on shelves, into an electronic
database that could be tapped by anyone anywhere at any time. Why not
adapt its formula for success to the public good—a digital library
composed of virtually all the books in our greatest research libraries
available free of charge to the entire citizenry, in fact, to everyone
in the world?
To dismiss this goal as naive or utopian would be to ignore digital
projects that have proven their worth and feasibility throughout the
last twenty years. All major research libraries have digitized parts
of their collections. Since 1995 the Digital Library Federation has
worked to combine their catalogues or “metadata” into a general
network. More ambitious enterprises such as the Internet Archive,
Knowledge Commons, and Public.Resource.Org have attempted
digitization on a larger scale. They may be dwarfed by Google, but
several countries are now determined to out-Google Google by scanning
the entire contents of their national libraries.
In December 2009 President Nicolas Sarkozy of France announced that he
would make €750 million available for digitizing the French cultural
“patrimony.” The National Library of the Netherlands aims to digitize
within ten years every Dutch book, newspaper, and periodical produced
from 1470 to the present. National libraries in Japan, Australia,
Norway, and Finland are digitizing virtually all of their holdings;
and Europeana, an effort to coordinate digital collections on an
international scale, will have made over ten million objects—from
libraries, archives, museums, and audiovisual holdings—freely
accessible online by the end of 2010.
If these countries can create national digital libraries, why can’t
the United States? Because of the cost, some would argue. Far more
works exist in English than in Dutch or Japanese, and the Library of
Congress alone contains 30 million volumes. Estimates of the cost of
digitizing one page vary enormously, from ten cents (the figure cited
by Brewster Kahle, who has digitized over a million books for the
Internet Archive) to ten dollars, depending on the technology and the
required quality. But it should be possible to digitize everything in
the Library of Congress for less than Sarkozy’s €750 million—and the
cost could be spread out over a decade.
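A rough sketch shows how sensitive such an estimate is to its inputs. The 30 million volumes and the range of ten cents to ten dollars per page come from the text; the average of 300 pages per volume is purely an assumption made for illustration, and no exchange rate is applied because none is given here.

```python
VOLUMES = 30_000_000      # Library of Congress volumes, as cited in the text
PAGES_PER_VOLUME = 300    # ASSUMPTION for illustration only; not from the text


def digitization_cost_usd(volumes, pages_per_volume, cents_per_page):
    """Total cost in whole dollars; integer cents avoid rounding error."""
    total_cents = volumes * pages_per_volume * cents_per_page
    return total_cents // 100


# Low and high per-page figures cited in the text: $0.10 and $10.00.
low_estimate = digitization_cost_usd(VOLUMES, PAGES_PER_VOLUME, 10)
high_estimate = digitization_cost_usd(VOLUMES, PAGES_PER_VOLUME, 1000)

print(f"At ten cents a page: ${low_estimate:,}")
print(f"At ten dollars a page: ${high_estimate:,}")
```

Under the assumed page count, the low figure comes to $900 million and the high figure to $90 billion, so the total depends almost entirely on the per-page cost and on the true average length of a volume.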
The greatest obstacle is legal, not financial. Presumably, the DPLA
would exclude books currently being marketed, but it would include
millions of books that are out of print yet covered by copyright,
especially those published between 1923 and 1964, a period when
copyright coverage is most obscure, owing to the proliferation of
“orphans”—books whose copyright holders have not been located.
Congress would have to pass legislation to protect the DPLA from
litigation concerning copyrighted, out-of-print books. The rights
holders of those books would have to be compensated, yet many of them,
especially among academic authors, might be willing to forgo
compensation in order to give their books new life and greater
diffusion in digitized form. Several authors protested against the
commercial character of Google Book Search and expressed their
readiness to make their work available free of charge in memoranda
filed with the New York District Court.
Perhaps even Google itself could be enlisted in the cause. It has
digitized about two million books in the public domain. It could turn
them over to the DPLA as the foundation of a collection that would
grow to include more recent books—at first those from the problematic
period of 1923–1964, then those made available by their rights
holders. Google would lose nothing by this generosity; each digitized
book that it made available could, if other donors agree, be
identified as a contribution from Google; and it might win admiration
for its public-spiritedness.
Even if Google refused to cooperate, a coalition of foundations could
provide enough to finance the DPLA, and a coalition of research
libraries could provide the books. By working systematically through
their holdings, they could form a great collection. It would conform
to the highest standards in its bibliographical apparatus, its
scanning, its editorial decisions, and its commitment to preservation
for the use of future generations.
Should the Google Book Search agreement not be upheld by the court,
its unraveling would come at an extraordinary moment in the
development of an information society. We have now reached a period of
fluidity, uncertainty, and opportunity. Things have come undone, and
they can be put together in new ways, subordinating private profit to
the public good and providing everyone with access to a commonwealth
of culture.
Would a Digital Public Library of America solve all the other
problems—the inflation of journal prices, the economics of scholarly
publishing, the unbalanced budgets of libraries, and the barriers to
the careers of young scholars? No. Instead, it would open the way to a
general transformation of the landscape in what we now call the
information society. Rather than better business plans (not that they
don’t matter), we need a new ecology, one based on the public good
instead of private gain. This may not be a satisfactory conclusion.
It’s not an answer to the problem of sustainability. It’s an appeal to
change the system.
—November 23, 2010