NCSU, Endeca ProFind, Online Public Access Catalogs

The Duke University Libraries hosted North Carolina State University’s Kristen Antelman, Associate Director for Information Technology, and Emily Lynema, lead developer for the new catalog interface, at our monthly library staff presentation/talk. Some tidbits from the talk:

  • They started really talking about redoing their OPAC last year, but the actual handoff of the project to a developer didn’t happen until summer of 2005. Their initial plan was to have something deployed before the spring 2006 academic semester. They went live with it in January 2006. I think this is indicative of how things are now. Things happen faster; libraries need to get used to this AND take advantage of it.
  • Similarly, they actually had a very small implementation team: department heads (i.e., systems, cataloging, reference), current ILS admins (all with other responsibilities) and ONE dedicated developer. I’m not sure if this is a sign of the apocalypse, but it is the reality of how things are done in libraries.
  • NCSU seems to be able to get the most out of their people. It seems like they have faith in their employees to just do it: give smart people the tools and they can do it. That is a lesson for all library managers out there, I think.
  • Part of the decision to build their own application was that Sirsi/Dynix didn’t look like they were going to do anything revolutionary with the OPAC.
  • Endeca ProFind is another layer between their ILS (Sirsi/Dynix of some flavor) and the web application which is their OPAC. Catalog records with expanded item information are extracted nightly and fed into the Endeca software to update the indexes/databases which the OPAC works off of.
  • I suspect that most of the work that NCSU had to do was to make the web application work with whatever database the Endeca software generated from the extracted MARC records. I’m sure Endeca had to do a bit of work to extract stuff from MARC records and map it to their database. Another thing is that the MARC that is extracted from NCSU’s catalog is simple flat-file records: not transmission format, but records broken up by some sort of marcbreaking routine.
  • Setup cost was six figures. Yearly maintenance fee was 18%. Serious money. Especially for a new cost. NCSU’s collection is on the order of 1.6M records, so who knows how that cost would scale for really large collections. At the same time there is a question as to how cost-effective this solution would be for a smaller collection. They mentioned that the cost was ameliorated by the fact that the system just works and they don’t have to waste time on the ILS pig.
  • They are still optimizing performance of the backend. Nightly indexing takes on the order of seven hours. This will supposedly improve after some bugs are swatted.
  • Future enhancements include the implementation of FRBR identification of records in order to “roll-up” related works and manifestations. There was a blurb on the slide about taking advantage of LC’s FRBR algorithm but I didn’t get to ask them for any details [hey, the room was full of oohhhing-ahhhing librarians].
  • I didn’t get a good sense of how they were doing relevancy, but the gist of it was that they were basing it on the semantics of the actual search, weighting certain fields (title, subject, author) more than others (notes). I didn’t hear anything about how well this is actually working out though. They did note that their original OPAC would only really give “relevant” results in the first five returned records for a keyword search about 15% of the time.
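
To make the field-weighting idea from the last bullet a bit more concrete, here is a rough sketch of what "title, subject and author count for more than notes" could look like. To be clear, this is my own toy illustration and not how Endeca or NCSU actually do it: the weights, the field names and the `score`/`rank` helpers are all made up, since the talk didn't give any of those details.

```python
# Hypothetical field weights; the talk gave no actual values.
# Higher weight means a match in that field counts for more.
FIELD_WEIGHTS = {
    "title": 3.0,
    "subject": 2.0,
    "author": 2.0,
    "notes": 0.5,
}


def score(record, query):
    """Sum weighted term matches across the fields of one record.

    `record` is a plain dict of field name -> text (an assumed shape,
    not a real MARC or Endeca structure).
    """
    terms = query.lower().split()
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = record.get(field, "").lower().split()
        for term in terms:
            # Each occurrence of a query term adds the field's weight.
            total += weight * words.count(term)
    return total


def rank(records, query):
    """Return matching records, best score first; non-matches are dropped."""
    scored = [(score(r, query), r) for r in records]
    hits = [(s, r) for s, r in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in hits]


if __name__ == "__main__":
    catalog = [
        {"title": "A history of Durham", "notes": "one chapter on cooking"},
        {"title": "Cooking basics", "subject": "cooking"},
    ]
    for rec in rank(catalog, "cooking"):
        print(rec["title"])
```

With weights like these, a record that only mentions a search term in its notes lands below one that has the term in its title or subject, which is the gist of what they described.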

Overall, good stuff. It was nice hearing it from the horses’ mouths.