Economics and Organization of Bibliographic Data
Posted by Sean Chen on June 9, 2007
In the background paper for the third meeting of the Library of Congress’ Working Group on the Future of Bibliographic Control there is a series of questions on which the meeting requests comment. One of them stood out for me:
4) A recurrent theme of the previous meetings was more fully integrating bibliographic data (such as MARC records, terminologies, authority files, et al.), which currently exist as “data silos,” into the fabric of the World Wide Web. In particular, terminologies and authorities were seen as important resources that could be used in a variety of ways. From a design perspective, how do we move from “data silos” to “data services,” that increase the potential value of bibliographic data by treating them as interconnected resource collections, addressable via URIs and accessible over Web protocols? Organizationally, how might this goal be accomplished, supported, and maintained? Economically, what factors need to be considered?
We’ve been talking around this.
Much of the discourse until now has been along the lines of, “Wouldn’t it be great if … LCC was available in an open web service … MARC went away … webify our infrastructure … get rid of catalogers … have better OPACs.”
So what makes a silo? I’m pretty sure I can speak for everyone and say we (libraries) want to be relevant in the information future. How have we backed ourselves into a corner? And what exactly is that corner?
I am kind of suspicious of talk about all silos being bad. I think an argument can be made on behalf of silos. Silos exist for a reason: different communities have different information needs. Then again, there is probably a stronger argument that our silos exist because of the way we have acquired, collected, organized, and developed resources with our vendors and within our libraries.
Who do I see as the custodian and maintainer of a bibliographic future of web services, standards, and open data? Here are the players:
- Library of Congress & the Program for Cooperative Cataloging
- OCLC
- Vendors
- ARL
- Corporations: Microsoft, Google, etc.
So … who has been actually doing anything?
The distinct impression I get with LC is that they are initiating this entire process because there is a crisis in scalability and, more importantly, severe budgetary pressure. Are they really going to take, and more importantly be capable of taking, the next steps? Those steps include drastic organizational changes, significant changes in legislation to make things happen, and of course dealing with everything that being a Federal agency entails.
OCLC? Is the cooperative really moving in that direction? I get a sense it is. WorldCat.org is supposed to open up more later in the year. But is this the right direction? We’re actually talking about a different beast: opening up and revealing a lot more of the plumbing for a whole host of applications that we can only just begin to imagine. I have actually seen very little comment from their representatives in the meeting summaries. In my mind OCLC is probably where the organization, support, and funding will be headed. This is something that member libraries need and appear to want.
ARL has a stake, but being the plumbers of our bibliographic future isn’t in their charge.
The search corporations? Are they thinking on a long enough horizon here? Google probably is. I’m not really sure what Microsoft was bringing to the table (they are listed as having a representative).
Our library system vendors? I don’t think they are that interested in selling us a new way of doing things that may very well put them all out of business.
So … do we need a new organization to do this? This organization would have to be more than just a big player in making standards happen. It would also have to create systems, get people to buy into them, provide services over a very long horizon, and be a good custodian of our cataloging as a “public good”. I don’t think we will have to reinvent the wheel, especially with things happening in Internet time, but it might be necessary.
One last thought: my hunch is that moving forward with the opening up of bibliographic data will cost a bunch, but … I think in the long run it will pay for itself in terms of improved retrieval, resource sharing, new opportunities created, and just plain coolness. The alternative is even more irrelevance for libraries and their staggering collections.
Of course … does anyone have hard numbers? I would be curious to see some real calculations or studies. Heck, a methodology would be a start. I’m not sure what would be a comparable situation, or how you can calculate the value of metadata and of the wealth it in turn provides access to or creates in new knowledge.

June 13, 2007 at 1:24 am
You can always keep it in a silo
I read with interest Sean Chen’s post Economics and Organization of Bibliographic Data over on schenizzle. He was reacting to the overall theme he sees coming out of the Library of Congress’ Working Group on the Future of Bibliographic Con…