Resources
LibraryCloud Overview proposal - May 2013
Design Phase Technical and Requirements documentation (Google Documents Folder)
Projects under consideration
Guidelines
Here are some proposed guidelines for considering which projects to pursue, when, and with what level of commitment.
Categories:
(A) Viability
(B) Short-term value
(C) Long-term value
- Resources are available for it (A)
- Few extrinsic obstacles (copyright, privacy, etc.) (A)
- Can be accomplished relatively quickly (or is at least demonstrable) (A)
- Illustrates LCloud's potential (C)
- Is extensible, reusable, or provides a useful model for other projects (C)
- Addresses LCloud use case (B)
- Doesn't exist already (B)
- There is a customer who will use it (B)
- Sustainable/reusable (C)
- Leverages work that is going on now (A)
A brief taxonomy
- A gimme: So low resource that it'd be silly not to do it, even if relatively low value
- Low-hanging: Low resource requirements, quick beta, illustrates the value of LCloud or is provocative
- High value: Central to other efforts and/or of significant value in itself. Needs serious development and management resources.
Suggested projects
These are projects that have been suggested in various brainstorm-y sessions. (This list is incomplete. Add more.)
# | Project | Description (about a paragraph) | Current State (any existing capability that attempts to address the need, including ongoing projects) | Owner for further exploration | Owner for providing description |
---|---|---|---|---|---|
1 | Interop homepage | mainly for internal comms and knowledge sharing, but also public | sketch | David W | David W |
2 | API for Aleph | Already available via PRESTO for bib records; an API for holdings or item data would be new. Also: PRESTO and the Lib. Innov. Lab's LibraryCloud both provide APIs, with approx. 60% overlap. What is the right architecture and approach to providing a broadly useful set of APIs to items, usage, etc.? | PRESTO and Lib. Innov. Lab's LibraryCloud API. DPLA's API may also be relevant/useful | Bobbi, Corinna, David, Paul | David W |
3 | Project to get item-level descriptive metadata from EAD finding aids into LibraryCloud to enable discovery and access alongside descriptive metadata from other sources | Encoded Archival Description (EAD) is the standard for describing archival collections. It mimics the analog practice of “finding aids,” which describe content at the collection level, providing background information and an inventory that typically lists, and briefly describes, boxes and folders (and sometimes items within folders). In the online environment, EADs do not play well with most other standard descriptive metadata for collections, which usually describe material at the item level (e.g., a MARC record for a book, an FGDC record for a map, or a VRA record for an image). Within a finding aid, the contextual information required to adequately describe an item is not found in the item-level label alone. Rather, it is found in the information above the item (in sequence and hierarchy), such as the collection title, date, and description, as well as labels for the series, box, and folder. To enable discovery and access of these archival collections alongside items from other sources, we need to free the item-level metadata from EAD finding aids. (A minimal flattening sketch appears after the table.) | Current work by Michael Vandermillen to free item-level metadata described in EAD finding aids includes:
Other related activities include:
| Michael, Robin, Wendy (see EADComponentsProjectOverview.docx)
| Wendy, Michael
|
4 | API for HOLLIS usage data | The Library Innovation Lab is building an API for usage data using the Library of Congress classification outline (LC call-number taxonomy) and Harvard circulation data: how many works in a given subject have been checked out, recalled, reserved, etc. over a given period. | The Innovation Lab will be adapting its current LibraryCloud item API to handle usage data. A new schema will be worked out and several API extensions will be implemented. We are using the Library of Congress classification outline to categorize Aleph items by subject (based on LC call number) and Cognos reporting to harvest usage metrics. (A categorization sketch appears after the table.) The project is funded through Library Lab and is slated for completion in fall 2013. | Paul | Paul |
5 | API for real-time availability data | Ability to query Aleph and obtain availability for items | Currently available via PRESTO. The base syntax is: http://webservices.lib.harvard.edu/ | Randy, Michael | |
6 | API for various non-book data, e.g., VIA, Harv. Geospatial Library | Ability to search and obtain detailed metadata records for catalogs with content-specific schemas | There is a PRESTO API to retrieve VIA record data in MODS. There is no search API for VIA, although image records can be located through the existing PRESTO HOLLIS search API. HGL is based on OpenGeoportal, which supports Solr search and retrieval of metadata records. I'm not sure whether an API exists to retrieve the native FGDC schema metadata record. | Randy | |
7 | Extend the API for DASH | ||||
8 | Collection building in LibraryCloud | For curators, one of the benefits of the web is the ability to unite physically dispersed material through online digital collections. For web-based presentations, Harvard curators often create “virtual collections” by drawing together related content from multiple catalogs and/or collections. Such collaborations can also include non-Harvard content (e.g., Emily Dickinson Archive and Colonial Archive of North America). To do this, curators need to be able to identify related content, select it, and mark it as a collection (or “set”). Then, they need to be able to confine a search to retrieve a set and display it in a web-based collection presentation UI. To support this, the following collections functionality is needed in LibraryCloud:
| Harvard’s web-based collection-building application, Virtual Collection (VC), currently supplies the following functionality but is in need of an upgrade:
| Michael, Wendy | Michael, Wendy
|
9 | Bring in some non-Harvard data: e.g., Colonial NA Digitization, bib data from other libraries, metadata about some high-value Web sites | ||||
10 | Package up LCloud version of Omeka | Harvard’s special collections, archives, libraries, and museums are looking for web-based solutions for providing information about, and seamless delivery of, static images, textual works, sound recordings, and moving image materials in one place. The project would aim to offer special collections, libraries, archives, faculty, and students their own installation of Omeka on an LTS server using a “Harvard install” of Omeka. The packaged version would include a plugin (or plugins) for importing metadata from LibraryCloud (OAI harvest, CSV import plugin, or potentially a new import plugin if needed); in addition, a wide variety of plugins available to Omeka users would be bundled. (An OAI-PMH harvesting sketch appears after the table.) Future development may include a single portal to Harvard’s Omeka content, facilitating a number of interesting projects, such as building timelines across repositories or doing geospatial work. The packaged version could be bundled with a set of data entry guidelines/recommended ways to enter content (and in which fields) to promote consistency of data across instances, as well as a plugin to get data back into LibraryCloud.
| Omeka is in use by, or has been experimented with by, a number of Harvard repositories (such as the Center for the History of Medicine, which has a robust system, is planning additional development this summer, and could lend expertise). This has required local development for improved functionality, something many repositories, students, and faculty do not have access to. With promotion, the University could build a substantial amount of content currently hidden from users and encourage the use of a system that could be centrally harvested and disseminated. Because the data is OAI-PMH compliant (Dublin Core, MODS, CDWA Lite), it could be harvested for display in a central system, such as HOLLIS, to bump up collection visibility. Related Omeka plugins: OAI-PMH Harvester, OAI-PMH Repository, Catalog Search | Emily, Michael, Jonathan | Emily, Michael, Jonathan |
11 | Linked Data project/exploration | | | Julie | |
12 | Guidelines for how to make collections more interoperable | Identify and analyze metadata in both shared and local systems utilized by Harvard’s special collections and museum communities, and author guidelines for creating and mapping metadata meeting local needs to broader metadata standards (such as Dublin Core) to facilitate data sharing. The objective is not to point user communities to particular systems but rather to consider how metadata could be mapped (and how, perhaps, metadata entry could be standardized) for the purposes of aggregation, and to encourage consistency in defining and populating database fields by providing specific “how to” examples. (A crosswalk sketch appears after the table.) Additionally, the project would ask the community to consider how content in silos or locked in proprietary systems could otherwise be disseminated. | Because of the number and richness of systems at play, it would be best to pick one standard (Dublin Core for OAI-PMH harvesting?) and look at a variety of records from collections management systems/databases/etc. to get a sense of the magnitude of mapping and of the specific content guidelines informing those fields. A literature review is also needed. Of possible interest: Metadata for Special Collections in CONTENTdm: How to Improve Interoperability of Unique Fields Through OAI-PMH. | Wendy, Emily | |
13 | Resolution service: feed it bib data and get back standard IDs, etc. | Feed it a URL and get back bib data. (This would help with a problem for the CATCH Annotation Hub, as well as having broader utility.) | | Paolo | |
14 | Library collection case study/demonstration project: Harvard CNA (Colonial North America, a collaboration among Harvard repositories) and Federation CANA (Colonial Archive of North America, an external collaboration that includes collections from Harvard, BPL, Mass. Hist. Soc., and Bibliothèque et Archives nationales du Québec (BAnQ)) | As case studies/demonstration projects, the CNA and CANA digital projects would address many of the challenges posed by projects in this list:
In addition, the CNA and CANA project workflows need to support both retrospective and prospective work, which may be a good test of two data-transfer workflows. (If there is interest in this, Wendy just needs to confirm approval from the CNA Planning Committee.) | Jim, Wendy | Wendy |
15 | Integrating open Web materials about books | E.g., NPR's records of its on-air stories include a tag if the story concerns a book, and many also include an ISBN. These records could be tied to Aleph records and could be extended via uniform title to editions other than the one with the ISBN. Then, from the Aleph record, the NPR story could be tied to other books with the same LCSHs. (A matching sketch appears after the table.) | The Lib. Innov. Lab has already worked with NPR to get their 16,000 book-related records and is talking with the NYT and CBC as well. It has also experimented with other open Web sources, including Wikipedia. The Lib. Innov. Lab is committed to exploring this at least for LibraryMist. | Paul D. | Paul D. |
16 | HBS-HOLLIS integration | Search for an HBS unit and see all the books the members of that unit have published. Search for a professor and see all of her/his books mashed up with other books on the same subjects in HOLLIS. Plus more. | Contract dev has created wireframes. The development is funded. | Library Innovation Lab | David W. |
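Illustrative sketches
The sketches below are illustrative only, all in Python. Endpoints, field names, and record shapes are assumptions made for the examples, not descriptions of existing Harvard systems.
For project 3, a minimal sketch of the flattening step: walk an EAD finding aid's component hierarchy and emit one record per item-level component, carrying down the ancestor titles that give the item its context. It assumes a non-namespaced EAD 2002 file with unnumbered <c> components; real finding aids vary, and the output record shape here is hypothetical.

```python
# Hypothetical sketch, not the project's actual code. Assumes a
# non-namespaced EAD 2002 file with unnumbered <c> components.
import xml.etree.ElementTree as ET

def flatten(component, context, records):
    title = component.findtext("did/unittitle", default="")
    if component.get("level") == "item":
        records.append({
            "title": title,
            # ancestor titles (collection > series > box/folder) supply
            # the context the bare item-level label lacks
            "context": " > ".join(context),
        })
    for child in component.findall("c"):
        flatten(child, context + [title] if title else context, records)

tree = ET.parse("finding_aid.xml")            # hypothetical input file
archdesc = tree.getroot().find("archdesc")
collection_title = archdesc.findtext("did/unittitle", default="")
items = []
for c in archdesc.find("dsc").findall("c"):
    flatten(c, [collection_title], items)
```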
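For project 4, a sketch of the categorization step: bucketing an item's LC call number into a class from the Library of Congress classification outline. The outline dictionary is truncated to four classes for illustration; the project's actual schema and outline handling may differ.

```python
# Hypothetical sketch of the categorization step; the LCC outline is
# truncated to four classes for illustration.
import re

LCC_OUTLINE = {
    "QA": "Mathematics",
    "Q":  "Science (General)",
    "PS": "American literature",
    "P":  "Language and literature (General)",
}

def lcc_class(call_number):
    m = re.match(r"[A-Z]{1,3}", call_number.strip())
    if not m:
        return None
    letters = m.group(0)
    # prefer the longest matching prefix (QA before Q)
    for n in range(len(letters), 0, -1):
        if letters[:n] in LCC_OUTLINE:
            return LCC_OUTLINE[letters[:n]]
    return None

assert lcc_class("QA76.9 .D3") == "Mathematics"
assert lcc_class("PS3523 .O46") == "American literature"
```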
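For project 10, a sketch of what an OAI-PMH import could look like. The verbs, the oai_dc metadata prefix, and the resumptionToken paging are standard OAI-PMH 2.0; the endpoint URL is a placeholder.

```python
# OAI-PMH harvesting sketch. The verbs, the oai_dc prefix, and
# resumptionToken paging follow the OAI-PMH 2.0 spec; the endpoint
# below is a placeholder, not a real service.
import requests
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def harvest(endpoint):
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    while True:
        root = ET.fromstring(requests.get(endpoint, params=params).content)
        for record in root.iter(OAI + "record"):
            yield [t.text for t in record.iter(DC + "title")]
        # an empty or absent resumptionToken ends the list
        token = root.find(".//" + OAI + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

for titles in harvest("https://example.org/oai"):   # placeholder endpoint
    print(titles)
```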
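For project 12, a sketch of the kind of crosswalk the guidelines would document: mapping local fields to unqualified Dublin Core and flagging anything unmapped for review. The local field names are hypothetical, not drawn from any actual Harvard system.

```python
# Illustrative crosswalk; the local field names are hypothetical.
LOCAL_TO_DC = {
    "ObjectTitle":  "dc:title",
    "Maker":        "dc:creator",
    "DateMade":     "dc:date",
    "MaterialType": "dc:format",
    "Credit":       "dc:rights",
}

def to_dublin_core(local_record):
    dc, unmapped = {}, {}
    for field, value in local_record.items():
        if field in LOCAL_TO_DC:
            dc[LOCAL_TO_DC[field]] = value
        else:
            unmapped[field] = value   # flag for the "how to" guidelines
    return dc, unmapped

dc, todo = to_dublin_core({"ObjectTitle": "Astrolabe", "Provenance": "..."})
```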
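For project 15, a sketch of the matching chain described in the table (ISBN match, then other editions via uniform title, then related works via shared LCSH). Record shapes are assumed plain dicts with "isbn", "isbns", "uniform_title", and "lcsh" keys; the real NPR and Aleph data models will differ.

```python
# Sketch of the matching chain; record shapes are assumptions.
def link_npr_story(story, aleph_records):
    # 1. direct match on the story's ISBN
    direct = [r for r in aleph_records if story["isbn"] in r.get("isbns", [])]
    # 2. extend to other editions via uniform title
    titles = {r["uniform_title"] for r in direct if r.get("uniform_title")}
    editions = [r for r in aleph_records if r.get("uniform_title") in titles]
    # 3. fan out to works sharing an LCSH heading with those editions
    headings = {h for r in editions for h in r.get("lcsh", [])}
    related = [r for r in aleph_records if headings & set(r.get("lcsh", []))]
    return direct, editions, related
```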