New Projects

An upcoming school project is providing the impetus to begin another, bigger, long-term project: the archive. My last post expounded on the great expanse of cataloging and what is involved in the data management and location side of establishing an accessible archive. I already have an archive per se, a collection of photos, documents, papers, letters, and a handful of artifacts. The issue is that it is little more than boxes of stuff, not the searchable and accessible collection it should be. As the previous post indicated, the metadata captured and the form it takes provide the searchable elements of the catalog. This is where the old computer axiom, garbage in, garbage out, holds very true. Poorly applied, low-quality, or non-standard metadata is worse than none at all.
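To make the garbage-in, garbage-out point concrete, here is a minimal sketch of the kind of check that could run before a record enters a catalog. The field names, the date pattern, and the `validate_record` helper are all hypothetical, chosen only for illustration; they are not the schema I have settled on or anything built into a particular platform.

```python
import re

# Hypothetical required fields; a real archive would take these from its chosen standard.
REQUIRED_FIELDS = ("title", "creator", "date", "format")

# Assumed convention: dates written as YYYY, YYYY-MM, or YYYY-MM-DD.
ISO_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one metadata record (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat absent, empty, and whitespace-only values the same way.
        if not str(record.get(field, "")).strip():
            problems.append(f"missing {field}")
    date = str(record.get("date", "")).strip()
    if date and not ISO_DATE.match(date):
        problems.append(f"non-standard date: {date!r}")
    return problems


good = {"title": "Family letter", "creator": "Unknown", "date": "1942-05", "format": "paper"}
bad = {"title": "Letter", "creator": "", "date": "Spring '42", "format": "paper"}

print(validate_record(good))
print(validate_record(bad))
```

The first record passes cleanly, while the second is flagged twice: once for the empty creator and once for a date that a search by year could never match.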

There will be more on the school project in future posts. Suffice it to say at this point that it is a pilot project to define the standards for a permanent digital media archive composed of digitized magnetic analog media. The goal is to establish the background policies and procedures for an entity to build a media archive from old magnetic media before it degrades to the point it can no longer be accessed, and to make the resulting digital media searchable and accessible. Searchable and accessible are the key words, hence the need for a thorough look at what metadata will be useful and to what level the metadata should be standardized to integrate easily with other institutions in a shared environment.

While my own archive has been nagging at the back of my mind for years, having a graduate project along similar lines helps break through the rust of apathy and stagnation. The project is under the auspices of an internship that will span two sub-terms, from mid-August to mid-December, and will be a pilot project that is primarily an investigation of what would be required to establish an archive. While I will be digitizing some media for the project, the primary objective is to gather data for a thorough report that will outline the policies and procedures for starting and maintaining a permanent archive, along with the projected cost of maintaining it. I am hopeful that my report from the pilot project will result in a decision by the organization to take on the full archive. Even if they choose not to, I will have set out to build a working archive and will have the pilot project to show for it, as well as the skills to continue with my own archive.

One of my biggest questions was where to build the archive website. I am running a testbed on an internet-accessible server to learn the platform I have chosen to build the archive on, but I wanted to build the pilot project on an internal machine, something not hosted by a provider, to allow for complete control and an opportunity to try to break it. I decided to run the pilot on a Raspberry Pi 5 with 8GB of memory and a 1TB SSD. I am familiar with running servers on the Pi platform and keeping them secure in a production environment, which will help reduce IT needs. I have an isolated sandbox and can tunnel into the server, which reduces the organizational expense to all but naught. I am used to taking on the IT/IS responsibilities for projects of this scale, so this was a no-brainer. In the run-up to the project’s start date I am familiarizing myself with the inner workings of the Omeka platform and how it handles metadata and customization.

That’s all for now,
~Jon

Cataloging

Over the summer I have been evaluating my studies and potential paths forward academically and professionally, as well as considering optimal locations to pursue these paths. My proclivity towards physical objects leads me towards the public history side of my interests. I find the prospect of part-time teaching while working in an historical archive or museum very appealing. I am also drawn to the idea of digitizing an organization’s assets and producing expanded presentations of digitized assets. Expanding an object’s presence in an archive might include multiple photographic representations, 3D scans, video, audio, text transcriptions, documentary editing, and a host of other representations. A printed pamphlet might lead to a dozen representations, each needing to be preserved, cataloged, archived, and made accessible to a range of consumers.

Cataloging of assets can be an overlooked, or at least underappreciated, element of public history training. This shows particularly in the community museum or cultural archive environment, where volunteers make up the majority of practitioners in the typical facility. Public history is a nebulous term in general and spans so many disciplines that one could not expect to become a subject matter expert in any one area in a single graduate program; it is a generalist degree. While cataloging and finding aids are discussed and their importance expressed, no detailed attention is placed on cataloging models or on how to establish or improve a system, presumably to leave the details up to on-the-job training. OJT is fine for institutional knowledge, but too little attention is placed on the mechanics of cataloging in a typical public history program. A detailed appreciation of cataloging comes from library science. I have come to this conclusion after years of casually trying to establish an archive cataloging system for a family genealogy collection and personal mixed media library.

The last six years of academics have seen my personal library grow at a rate measured in yards per semester, with an extra yard or so on non-academic interests over the course of each year. While my interest in finding a mixed-media-friendly cataloging system has yielded some possibilities, the search for an institutional-level solution has not met with success. This summer I decided to actively research mixed media Integrated Library Systems, also called Library Management Systems (ILS/LMS), with built-in cataloging features. Thus far I have added another half-yard of books and come to a handful of conclusions. The first concerns the ILS: to really explore this I will need to build a Linux, Apache, MySQL, PHP (LAMP) server for each of the prospective ILS options, of which two stand out as likely candidates, Koha and Evergreen. Another is that I will still need to manually assign a catalog code to each entry, which means deciding between Dewey Decimal Classification (DDC) and Library of Congress Classification (LOC) numbers. I will likely use Library of Congress Subject Headings (LCSH) either way. Lastly, I have come to the conclusion that library science reference materials are bloody expensive.

Thus far the so-called guides on classification are good at high-level descriptions of the basic mechanics of classification but not so good at the nuts and bolts. The core organizational and metadata guidance, like MARC, RDA, and AACR2, is at first glance arcane and exceedingly complex and will take time to digest and internalize. The current DDC index, all four volumes, is far too expensive for the curious non-institutional researcher. A subscription to the LOC index is $375 and WebDewey is $391. The advantage with Dewey is that I can order the print version and amortize it over 10 years if it is ordered once a decade; presently the four-volume set would run $520.
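The arithmetic behind that comparison can be sketched in a few lines. This is a rough estimate only: it assumes both subscriptions renew annually at today's price for the full ten years, and that the print set is bought once and not replaced within the decade.

```python
# Rough ten-year cost comparison using the prices quoted above.
# Assumptions: subscriptions are billed annually at a flat rate;
# the print set is a single purchase per decade.
YEARS = 10

loc_subscription = 375       # USD per year (assumed annual)
webdewey_subscription = 391  # USD per year (assumed annual)
ddc_print_set = 520          # USD, one-time, four-volume set

costs = {
    "LOC subscription": loc_subscription * YEARS,
    "WebDewey subscription": webdewey_subscription * YEARS,
    "DDC print set": ddc_print_set,
}

for name, total in costs.items():
    print(f"{name}: ${total} total, ${total / YEARS:.2f} per year")
```

Under those assumptions the print set works out to $52 a year, against $375 and $391 a year for the two subscriptions.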

So, where does this leave me in regard to cataloging? There are two main considerations to resolve. The first is the ILS. I can build a LAMP server on a single-board computer with PoE and SSD-bootable capabilities for $250-350, which should work fine as an evaluation server. This would give me the necessary data to determine the specifications and cost of a production system. The second consideration is the classification system, DDC or LOC. At this time I am leaning towards DDC. I am already familiar with Dewey, and having a hard copy, though less fun to search through, will provide years of access, unlike an annual digital subscription. Regardless, the hardware is the easy part. I will work on the parts list and put together at least one LAMP server to test ILSs. The cataloging/classification piece of the puzzle will require some more research.