Pilot Project and More

The intention was to post last month, but the workload for school has been heavier than expected. The pilot project described in the last post, building an organization’s media archive, has been going well. One of the first determinations was that Omeka would not be the platform for the project. Omeka is a fine platform, but it is too complex for the organization’s environment, its personnel, the single media type, and the minimal metadata needs. In this case a simpler PHP/HTML front end for a small database will be more effective and easier to train the organization’s staff on if they decide to continue with the project after the pilot program is completed in December. As of today, the server is built and the platform software (a Linux system with Apache, MariaDB, and PHP, i.e. a LAMP server) is installed. Now it’s time to work on the database tables and a simple front end. This will likely be the main daily activity well into October.
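To give a sense of how small the database behind such a front end can be, here is a minimal sketch of a single-table catalog with a keyword search. It is written in Python with the built-in sqlite3 module purely as a stand-in so that it runs anywhere; the actual server uses MariaDB and PHP. The table name, column names, helper functions, and sample record are all hypothetical and are not the pilot's real schema.

```python
import sqlite3


def build_catalog(conn):
    """Create a one-table catalog (hypothetical columns, minimal metadata)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS media_item (
            id            INTEGER PRIMARY KEY,
            title         TEXT NOT NULL,
            recorded_on   TEXT,           -- ISO date string, e.g. 1990-01-01
            source_format TEXT,           -- original carrier of the recording
            file_path     TEXT NOT NULL,  -- where the digital file lives
            description   TEXT
        )
        """
    )
    conn.commit()


def add_item(conn, title, recorded_on, source_format, file_path, description=""):
    """Insert one record and return its new id."""
    cur = conn.execute(
        "INSERT INTO media_item "
        "(title, recorded_on, source_format, file_path, description) "
        "VALUES (?, ?, ?, ?, ?)",
        (title, recorded_on, source_format, file_path, description),
    )
    conn.commit()
    return cur.lastrowid


def search(conn, term):
    """Return (id, title) rows whose title or description contains the term."""
    like = f"%{term}%"
    return conn.execute(
        "SELECT id, title FROM media_item "
        "WHERE title LIKE ? OR description LIKE ? ORDER BY id",
        (like, like),
    ).fetchall()


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
build_catalog(conn)
# Made-up sample record, for illustration only.
add_item(conn, "Sample recording", "1990-01-01", "magnetic tape",
         "media/sample-0001.mp4", "Placeholder entry for testing")
print(search(conn, "Sample"))
```

A real MariaDB version would differ in details such as column types and auto-increment syntax, but the overall shape, one table plus a simple search query, would be about the same.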

In the Local History / Public History class, the next assignment is to analyze an occupation in the field or sub-field of interest. I chose paper conservation, as this is an area of great interest to me. Without any local practitioners or local educational resources, the decision was made to dive in with a survey of practitioners. The survey is very narrow demographically and geographically: it is specific to paper conservation for books and documents and limited to the New England area. I would like to relocate to Maine in the near future, so the geographic location made sense. Due to the narrow confines imposed, there was a total pool of 115 practitioners invited. As of this writing, there have been 16 respondents, or 13.9%, which is pretty good. The hope is to get as close as possible to a 30% response, which is asking a lot. Anyone who has done a survey by cold-emailing professionals in a field knows anything above 10% is a good response. There are a few more days before the paper needs to be written, so we shall see how close I get. The survey has 5 demographic questions and 5 questions on the education of new candidates to the field. SurveyMonkey was used because I have used that tool in the past. The results that have come in have provided an idea for a more detailed research project, an expanded version of this paper, that could make for an interesting journal article. SurveyMonkey is out of the question for an expanded research project, however, as it has gotten far more expensive than is practical. Poking around a bit, I found a survey platform that can be added to an existing website and was easy to install and get started with. The platform still needs to be explored and learned. It could make conducting survey research much easier, in a way that fits my workflow and reduces the time to process and publish, without spending a lot of money on what should be basic feature sets.
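The response-rate numbers above are simple arithmetic, and a short sketch makes it easy to rerun as more responses arrive. The function names are just illustrative; the only inputs are the figures from this post (115 invited, 16 respondents, a 30% target).

```python
import math


def response_rate(respondents, invited):
    """Response rate as a percentage of everyone invited."""
    return 100 * respondents / invited


def respondents_needed(target_pct, invited):
    """Smallest whole number of respondents that meets the target rate."""
    return math.ceil(target_pct * invited / 100)


invited = 115
respondents = 16

rate = response_rate(respondents, invited)
needed = respondents_needed(30, invited)

print(f"Current response rate: {rate:.1f}%")
print(f"Respondents needed for 30%: {needed}")
print(f"Still to go: {needed - respondents}")
```

By this calculation, reaching 30% means 35 total respondents, or 19 more than the current 16.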

In addition to the two projects above, a collection management system and a research project on paper conservators, there is also a documentary edition project that may move forward. On a recent visit to the local museum, while I was talking with one of the collections staff about another project for class, they brought up a journal they saw in one of the archive spaces that sounded like a good candidate for a documentary edition for publication. I have expressed an interest in pursuing this project. There has been no response as of yet; however, an in-person follow-up will be forthcoming, as this would be an enjoyable project and a solid CV / portfolio builder.

Until next time,
~Jon

Edited 21 SEP 2024

New Projects

An upcoming school project is providing the impetus to begin another, bigger, long-term project: the archive. My last post expounded on the great expanse of cataloging and all that is involved in the data management and location side of establishing an accessible archive. I already have an archive per se, a collection of photos, documents, papers, letters, and a handful of artifacts. The issue is that it is little more than boxes of stuff, not the searchable and accessible collection it should be. As the previous post indicated, the metadata captured and the form it takes provide the searchable elements of the catalog. This is where the old computer axiom, garbage in – garbage out, holds very true. Poorly applied, low-quality, or non-standard metadata is worse than none at all.

There will be more on the school project in future posts. Suffice it to say at this point that it is a pilot project to define the standards for a permanent digital media archive comprised of digitized magnetic analog media. The goal is to establish the background policies and procedures for an entity to build a media archive from old magnetic media before it degrades to the point it can no longer be accessed, and to make the created digital media searchable and accessible. Searchable and accessible are the key operators, hence the need for a thorough look at what metadata will be useful and to what level the metadata should be standardized to integrate easily with other institutions in a shared environment.

While my own archive has been nagging at the back of my mind for years, having a project along similar lines for a graduate project helps break the rust of apathy and stagnation. The project is under the auspices of an internship that will span two sub-terms, from mid-August to mid-December, and will be a pilot project that is primarily an investigation of what would be required to establish an archive. While I will be digitizing some media for the project, the primary objective is to gather data for a thorough report that will outline the policies and procedures for starting and maintaining a permanent archive, along with the projected cost of maintaining it. I am hopeful that my report from the pilot project will result in a decision by the organization to take on the full archive, but even if they choose not to, I will have set out to build a working archive and will have the pilot project to show for it, as well as the skills to continue with my own archive.

One of my biggest questions was where to build the archive website. I am running a testbed on an internet-accessible server to learn the platform I have chosen to build the archive on, but I wanted to build the pilot project on an internal machine, something not hosted by a provider, to allow for complete control and an opportunity to try to break it. I decided to run the pilot on a Raspberry Pi 5 with 8GB of memory and a 1TB SSD. I am familiar with running servers on the Pi platform and keeping them secure in a production environment, which will help reduce IT needs. I have an isolated sandbox and can tunnel into the server, which reduces the organizational expense to all but naught. I am used to taking on the IT/IS responsibilities for projects of this scale, so this was a no-brainer. In the run-up to the project’s start date I am working on familiarizing myself with the inner workings of the Omeka platform and how it handles metadata and customization.

That’s all for now,
~Jon

Cataloging

Over the summer I have been evaluating my studies and potential paths forward academically and professionally, as well as considering optimal locations to pursue these paths. My proclivity towards physical objects leads me towards the public history side of my interests. I find the prospect of part-time teaching while working in an historical archive or museum very appealing. I am also drawn to the idea of digitizing an organization’s assets and producing expanded presentations of digitized assets. Expanding an object’s presence in an archive might include multiple photographic representations, 3D scans, video, audio, text transcriptions, documentary editing, and a host of other representations. A printed pamphlet might lead to a dozen representations, each needing to be preserved, cataloged, archived, and made accessible to a range of consumers.

Cataloging of assets can be an overlooked, or at least underappreciated, element in public history training. This can be seen particularly in the community museum or cultural archive environment, where volunteers make up the majority of practitioners in the typical facility. Public history is a nebulous term in general and spans so many disciplines that one could not expect to become a subject matter expert in any one area in a single graduate program; it is a generalist degree. While cataloging and finding aids are discussed and their importance expressed, no detailed attention is placed on cataloging models or how to establish or improve a system, presumably to leave the details up to on-the-job training. OJT is fine for institutional knowledge, but too little attention is placed on the mechanics of cataloging in a typical public history program. A detailed appreciation of cataloging comes from library sciences. I have come to this conclusion after years of casually trying to establish an archive cataloging system for a family genealogy collection and personal mixed media library.

The last six years of academics have seen my personal library grow in the measure of yards-per-semester, with an extra yard or so on non-academic interests over the course of each year. While my interest in finding a mixed-media-friendly cataloging system has yielded some possibilities, the search for an institutional-level solution has not been successful. This summer I decided to actively search for and research a mixed media Integrated Library System (ILS/LMS) with built-in cataloging features. Thus far I have added another half-yard of books and come to a handful of conclusions. One, with regard to the ILS, is that to really explore this I will need to build a Linux, Apache, MySQL, PHP (LAMP) server for each of the prospective ILS options, of which two stand out as likely candidates, Koha and Evergreen. Another is that I will still need to manually assign a classification number to each entry, which means deciding between Dewey Decimal Classification (DDC) and Library of Congress Classification (LOC) numbers. I will likely use Library of Congress Subject Headings (LCSH) either way. Lastly, I have come to the conclusion that library science reference materials are bloody expensive.

Thus far the so-called guides on classification are good at high-level descriptions of the basic mechanics of classification but not so good at the nuts and bolts. The core organizational and metadata guidance, like MARC, RDA, and AACR2, is at first glance arcane and exceedingly complex and will take time to digest and internalize. The current DDC index, all four volumes, is way too expensive for the curious non-institutional researcher. A subscription to the LOC index is $375 and WebDewey is $391. The advantage with Dewey is that I can order the print version and amortise that over 10 years if ordered once a decade. Presently the four-volume set would run $520.
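The amortisation argument is easy to put into numbers. The sketch below uses only the prices quoted above and makes two assumptions that are mine, not the publishers': that the subscription prices are annual and stay flat for the whole decade, and that the print set is bought exactly once in that period. The variable and function names are illustrative.

```python
def amortised_cost_per_year(purchase_price, years):
    """Spread a one-time purchase evenly over the years it is used."""
    return purchase_price / years


YEARS = 10                  # assumed replacement cycle for the print set
PRINT_SET_PRICE = 520       # four-volume print DDC, one-time purchase
WEBDEWEY_PER_YEAR = 391     # assumed to be an annual price that stays flat
LOC_PER_YEAR = 375          # assumed to be an annual price that stays flat

print_per_year = amortised_cost_per_year(PRINT_SET_PRICE, YEARS)
webdewey_total = WEBDEWEY_PER_YEAR * YEARS
loc_total = LOC_PER_YEAR * YEARS

print(f"Print DDC: ${print_per_year:.2f} per year, ${PRINT_SET_PRICE} per decade")
print(f"WebDewey:  ${WEBDEWEY_PER_YEAR} per year, ${webdewey_total} per decade")
print(f"LOC index: ${LOC_PER_YEAR} per year, ${loc_total} per decade")
print(f"Print saves ${webdewey_total - PRINT_SET_PRICE} over WebDewey in {YEARS} years")
```

Under those assumptions the print set works out to $52 a year against $391 a year for WebDewey, a difference of $3,390 over the decade.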

So, where does this leave me in regard to cataloging? There are two main considerations to resolve. The first is the ILS. I can build a LAMP server on a single-board computer with PoE and SSD-bootable capabilities for $250-350, which should work fine as an evaluation server. This would give me the necessary data to determine the specifications and cost of a production system. The second consideration is the classification system, DDC or LOC. At this time I am leaning towards DDC. I am already familiar with Dewey, and having a hard copy, though less fun to search through, will provide years of access, unlike an annual digital subscription. Regardless, the hardware is the easy part. I will work on the parts list and put together at least one LAMP server to test ILSs. The cataloging/classification piece of the puzzle will require some more research.