LD4L-Labs Harvard Geospatial Library metadata conversion
From the Harvard LD4L-Labs Statement of Work
3. Pilot linked data conversion, publication, and visualization of Harvard Geospatial Library (HGL) metadata. Working with the other project partners, Harvard will develop a BIBFRAME/LD4L profile; develop metadata conversion software to convert existing geospatial metadata records from the Harvard Geospatial Library and from Stanford (see the Stanford SOW in section 6.5) describing raster maps and vector map data layers into BIBFRAME; publish the RDF to the Harvard linked data endpoint; and integrate a beta of graph visualization software into the Harvard Geospatial Library or an Omeka virtual collection to assess end user value. This project will focus on converting a subset of OpenGeoMetadata metadata records from the Harvard Geospatial Library and Stanford (where they are now represented using the geospatial community standard Federal Geographic Data Committee (FGDC) schema, ISO 19139) into linked data descriptions using BIBFRAME/LD4L as a base ontology. Deliverables for the project would include: a BIBFRAME/LD4L profile for geospatial datasets; a set of mapping rules for FGDC geospatial metadata standards to the BIBFRAME/LD4L profile; reconciled linked data entities in the source metadata for Originators, Place and Theme keywords, and series works; a linked data triplestore with published descriptions; and a user interface for searching and visualizing geospatial dataset descriptions.
May 2018 Project Update
The second year of the Harvard LD4L-Labs geospatial metadata conversion project has focused on:
- Finalizing a target ontology for FGDC XML to LOD RDF data conversion based on: BIBFRAME, bibliotek-o, and the LD4P Cartographic Material Extension project's Geospatial and Cartographic Resources Ontology (GCRO).
- Building concordance files and reconciling text strings to linked data URIs for agent names, topic keywords, places of publication, and place keywords in the converted data sets
- Deploying a geospatial metadata conversion tool and converting a set of 8,800 Harvard Geospatial Library and 5,100 Stanford Earthworks geospatial metadata records to the target geospatial LOD ontology
- Loading the converted data into a Harvard geospatial data instance of VitroLib
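The reconciliation step listed above (matching text strings to linked data URIs through a concordance file) can be illustrated with a minimal sketch. The concordance layout (a two-column label,uri CSV), the example.org URIs, and the function names are all assumptions made for illustration; they are not taken from the project's actual concordance files or tooling.

```python
import csv
import io

# Hypothetical concordance: source text string -> linked data URI.
# The rows and URIs below are placeholders, not project data.
CONCORDANCE_CSV = """label,uri
Harvard Map Collection,http://example.org/agent/harvard-map-collection
Boston (Mass.),http://example.org/place/boston
"""


def normalize(label):
    """Collapse whitespace and case so minor variants share one key."""
    return " ".join(label.split()).casefold()


def load_concordance(text):
    """Read label,uri rows into a dict keyed by the normalized label."""
    reader = csv.DictReader(io.StringIO(text))
    return {normalize(row["label"]): row["uri"] for row in reader}


def reconcile(label, concordance):
    """Return the URI for a label, or None when the label is unmatched."""
    return concordance.get(normalize(label))


concordance = load_concordance(CONCORDANCE_CSV)
```

Unmatched labels return None in this sketch, so they can be collected for manual review rather than silently dropped.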
Remaining project work will focus on:
- Creating a Geospatial and Cartographic Resources Ontology (GCRO) + target ontology SHACL application profile (May-June 2018)
- Manual customization of the Harvard geospatial data instance of VitroLib to support display, creation, and editing of geospatial and cartographic resource native linked data descriptions (May-June 2018)
- Test cataloging of a selection of geospatial and cartographic resources using the GCRO + target ontology application profile in the Harvard geospatial data instance of VitroLib (May-June 2018)
- Completing a demo visualization tool for converted geospatial metadata collections (January-June 2018)
For more information and links to project documentation and files see:
- Linked Data for Production (LD4P) Cartographic Extension and Harvard Cartographic Materials project wikis
- Linked Data for Production (LD4P) GitHub Repository
Harvard Geospatial Metadata Converter
Conversion strategy and development plan - Harvard project work plan for conversion of Harvard Geospatial Library (HGL) FGDC metadata records to linked data descriptions.
Geospatial metadata core fields for FGDC conversion (working document) - list of geospatial metadata concepts with supporting information and priority level for conversion
Harvard Core Conversion Mappings for LD4L Domain Projects (working document) - includes mapping of HGL FGDC elements to the LD4L/P target ontology
Test converter files - source FGDC metadata test files (XML) and expected output LD4L conversion files (TTL) used to test converter results by iteratively adding conversion elements
Geospatial and Cartographic Resources Ontology (GCRO) - The target ontology for the geospatial metadata conversion will be based on the LD4L/P BIBFRAME Extension (bibliotek-o) ontology and the Linked Data for Production (LD4P) Cartographic Materials Extension ontology recommendations.
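To make the element-by-element conversion approach described above more concrete, the following is a minimal sketch of mapping FGDC elements to RDF-style triples. It is not the Harvard converter, which is a customization of the Cornell converter tool. The FGDC element paths (idinfo/citation/citeinfo/title and origin) follow the FGDC metadata standard, but the sample record, the subject URI, and the predicates in FIELD_MAP are placeholders chosen for illustration rather than the project's actual mappings.

```python
import xml.etree.ElementTree as ET

# Invented sample record using standard FGDC element names.
SAMPLE_FGDC = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Example Originator</origin>
        <title>Example Layer Title</title>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>"""

# Placeholder mapping from FGDC element path to predicate URI.
# These predicates are assumptions, not the project's target ontology.
FIELD_MAP = {
    "idinfo/citation/citeinfo/title": "http://www.w3.org/2000/01/rdf-schema#label",
    "idinfo/citation/citeinfo/origin": "http://example.org/vocab/originator",
}


def convert(xml_text, subject_uri):
    """Return (subject, predicate, literal) tuples for each mapped element."""
    root = ET.fromstring(xml_text)
    triples = []
    for path, predicate in FIELD_MAP.items():
        for element in root.findall(path):
            value = (element.text or "").strip()
            if value:  # skip empty elements
                triples.append((subject_uri, predicate, value))
    return triples


triples = convert(SAMPLE_FGDC, "http://example.org/layer/1")
```

Adding one entry to FIELD_MAP at a time and comparing the result against an expected output file mirrors the iterative testing described for the test converter files.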
VitroLib Customization Work
VitroLib for cartographic materials/geospatial metadata - Harvard has built a cartographic materials/geospatial metadata editing tool based on the Cornell VitroLib tool. Harvard customization work will support the additional requirements for creating linked data descriptions in the domain extensions (i.e., custom data input forms for Spatial Extent - Bounding Box, Projection Data, and Cartographic Relief). Harvard has loaded a set of 8,800 Harvard Geospatial Library and 5,100 Stanford Earthworks converted geospatial metadata records into the Harvard geospatial data instance of VitroLib.
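As an illustration of the kind of data a Spatial Extent - Bounding Box input form has to handle, here is a minimal sketch of a bounding box with basic range checks and a WKT polygon rendering. The class, its field names, and the choice of WKT are assumptions for illustration only; the source does not describe how the VitroLib custom form validates or stores bounding box values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical bounding box in decimal degrees."""
    west: float
    east: float
    north: float
    south: float

    def validate(self):
        """Return a list of problems; an empty list means the box is usable."""
        errors = []
        for name in ("west", "east"):
            if not -180 <= getattr(self, name) <= 180:
                errors.append(f"{name} must be between -180 and 180")
        for name in ("north", "south"):
            if not -90 <= getattr(self, name) <= 90:
                errors.append(f"{name} must be between -90 and 90")
        if self.south > self.north:
            errors.append("south must not exceed north")
        # west > east is allowed: the box may cross the antimeridian.
        return errors

    def to_wkt(self):
        """Render the box as a closed WKT polygon ring (lon lat order)."""
        w, e, n, s = self.west, self.east, self.north, self.south
        return f"POLYGON(({w} {s}, {e} {s}, {e} {n}, {w} {n}, {w} {s}))"


box = BoundingBox(west=-71.2, east=-70.9, north=42.4, south=42.2)
```

Returning a list of errors rather than raising on the first one lets a form report every problem with an entry at once.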
Marc McGee, Geospatial Metadata Librarian, mmcgee (at) fas (dot) harvard (dot) edu
Archived Project Updates
April 2017 Project Update
The first year of the Harvard LD4L-Labs geospatial metadata conversion project has focused on: identifying and building a target ontology for metadata conversion, developing a conversion strategy and plan, developing a converter tool, and beginning VitroLib customization work to support display, creation, and editing of geospatial linked data.
The Harvard geospatial metadata conversion project will use the LD4P/LD4L Labs BIBFRAME extension and Linked Data for Production (LD4P) Cartographic Materials Extension ontology recommendations as its target conversion ontology. Harvard has begun to customize the Cornell MARC to LD4P/LD4L Labs BIBFRAME extension converter tool so that it can convert Harvard Geospatial Library FGDC XML metadata records to the target ontology. Source and target conversion test files have been developed. Priority metadata elements for conversion have been identified, a mapping of the priority elements to the target ontology has been developed, and supporting target ontology files are in the process of being written. In addition, Cornell's VitroLib editor is being customized to support additional requirements for creating linked data descriptions of geospatial resources, including a custom input form for capturing spatial extent/bounding box data. Upcoming work will focus on finishing the target ontology, further developing the FGDC to target ontology converter, writing an application profile for VitroLib, customizing VitroLib to support other parts of the developed ontology, reconciling entities in the source metadata, and converting the Harvard Geospatial Library metadata to the target ontology.