MPoW has sent me to the Ex Libris Users of North America’s first annual meeting. As far as I understand it, the conference is a merging of the old Aleph users’ groups with the user groups for the other library tools that Ex Libris has, i.e. SFX/Metalib.
First thing up on the plate was a pre-conference workshop devoted to the MARC 21 Format for Holdings Data (MFHD). The three-hour workshop was presented by Patti Hatch, a librarian/analyst in the Harvard University library system.
Now when most librarians hear about MARC they probably think of bibliographic data: title, statement of responsibility, subjects, entries, years of publication, etc. Fortunately, or unfortunately, depending on your point of view, there is a whole raft of library data that is encoded in a MARC format, including authority records (which author, subject, or genre heading/string is the “right” one), classification schemes (as far as I know, used by the Library of Congress to produce ClassWeb and the print Library of Congress Classification), and holdings data (what does the library own, where is it, and when can I use the resource).
Now I like to think that holdings is the messy underbelly of library data. No two libraries do it the same, and there is no clear content standard (what we consider the metadata), versus extensive data (how we encode the metadata) and display (how we spit it back out to the users) standards. There is sometimes no bright line between these things; witness how cataloging standards (Anglo-American Cataloging Rules) are easily conflated with the data standard, MARC, when MARC is actually used to encode a whole panoply of metadata which isn’t necessarily cataloged under AACR2.
So what makes holdings so messy? I’m not entirely sure how to explain it. What seems true is that holdings are unique for each library, i.e. every library is unique.
My main objection to MFHD is that it is bloody complicated. Well, strike that, you can implement it in such a way that it is beyond complicated. In my automated library world there is usually a minimum requirement of being able to track what you own at the item/piece level, usually in a database/inventory type system. Why the heck do you need a second way to record the same exact sort of information? At the least, if you are going to reformat the display of your inventory, obviously it should derive from your existing inventory/item data, not be maintained in a separate file. But that is what certain library systems end up having you do.
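To show what I mean by deriving the display from the item data, here is a minimal sketch in Python. It assumes each item record already carries a plain volume number; the function name `summarize_holdings` and the "v." caption are made up for illustration, and no real ILS or MFHD field is being modeled.

```python
def summarize_holdings(volumes):
    """Collapse item-level volume numbers into a compact summary string.

    `volumes` is an iterable of integers taken from (hypothetical) item
    records. Consecutive volumes are merged into ranges; gaps stay visible.
    """
    ranges = []
    for v in sorted(set(volumes)):
        if ranges and v == ranges[-1][1] + 1:
            # Extend the current run of consecutive volumes.
            ranges[-1][1] = v
        else:
            # Start a new run after a gap (or at the very beginning).
            ranges.append([v, v])
    return ", ".join(
        f"v.{start}" if start == end else f"v.{start}-{end}"
        for start, end in ranges
    )
```

Called as `summarize_holdings([1, 2, 3, 5, 7, 8])`, this gives `v.1-3, v.5, v.7-8`. The summary is computed from the inventory on demand, so there is no second file to keep in sync.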
So why? Well, you can use the MFHD to record data that doesn’t comfortably fit into other places! In this case it also encodes publication pattern information. Egads! What? In a sense the MFHD contains a regular expression which says what you should have and what you will have in the future. Why would you do this? In a word, materials handling. Being able to predict when a serial will arrive from your bookseller, what will arrive, and what it will be called is obviously something you want in your library system, especially if you have any number of periodicals. Except for the life of me I can’t figure out how this type of data got into the holdings record, rather than being set at the title/bibliographic level. Now “publication pattern” is an oxymoron. Sure, most periodicals are periodic: they come out every month, or every quarter, or whatnot. The problem is that every publisher does it a different way. Some may issue combined numbers (American Libraries has a combined June/July issue; I guess everyone is too busy at the ALA general meeting to do any work), change their frequency, issue supplements, issue an index, have a symposium issue, or just skip publishing.
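To make the prediction idea concrete, here is a rough sketch in Python of a toy pattern for a monthly title with combined and skipped issues. This is not the MFHD encoding, which is far more elaborate; the `Pattern` class, its fields, and `expected_issues` are all hypothetical names invented for this example, and the year and starting issue number are arbitrary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


@dataclass
class Pattern:
    """Toy publication pattern for a monthly serial (illustrative only)."""
    # Maps the first month of a combined issue to its last month,
    # e.g. {6: 7} for a combined June/July issue.
    combined: Dict[int, int] = field(default_factory=dict)
    # Months (1-12) in which nothing is published.
    skipped: Set[int] = field(default_factory=set)


def expected_issues(year: int, first_number: int,
                    pattern: Pattern) -> List[Tuple[int, str]]:
    """Return (issue number, chronology label) pairs expected for one year."""
    issues = []
    number = first_number
    month = 1
    while month <= 12:
        if month in pattern.skipped:
            month += 1
            continue
        # A combined issue covers several months but uses one issue number.
        end = pattern.combined.get(month, month)
        if end == month:
            label = MONTHS[month - 1]
        else:
            label = f"{MONTHS[month - 1]}/{MONTHS[end - 1]}"
        issues.append((number, f"{label} {year}"))
        number += 1
        month = end + 1
    return issues
```

With `Pattern(combined={6: 7})`, a year yields eleven issues rather than twelve, and the sixth is labeled "June/July". Even this toy version needs special cases for two of the exceptions listed above, and it still says nothing about supplements, indexes, or frequency changes.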
So to handle this complexity the regular expression encoding is complicated. Really, really complicated, like twenty pages for just the single field long. Like needing a mere one hundred and seventy-two slides long. Just thinking about using it gives me a headache, let alone beginning to program a parser.
Unfortunately I’m not fully sure what the alternatives are. ONIX has a proposal/recommendation for a format that a publisher or vendor could use to push individual issue publication information to library systems, but that is in the works and would, in my opinion, take a while for ILSs to even begin to implement, let alone for librarians to digest the implications of. On top of that, you are depending on publishers to feel that their customers will be able to use this pushed/harvested acquisitions data.