DRAFT (subject to change)

UC1. As an LD4P partner I want to convert a MARC record to the LD4L Ontology as the basis for LD4P cataloging/extension work

  • Take MARCXML file as input, output LD4L RDF. No expectation of deduping across a collection of records, or of reconciliation (with the exception of URI minting strategies based on local data, e.g. URIs/identifiers supplied from the data).
  • Roughly equivalent to the LC BF converter core plus the postprocessing used in the LD4L1 project, but with a new target ontology. Expect better code, however, with unit and regression tests.
  • Expect this core structural conversion will be extensible to other input formats and output modifications (e.g. FGDC input, BF2 output).
  • This is what we have committed to produce by the end of 2017-03, based on availability of the LD4L Ontology by 2017-01-01. Rebecca will work on this with help from Jim, and in coordination with Josh's validation engine.
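To make the shape of UC1 concrete, here is a minimal sketch of a record-by-record MARCXML conversion that mints URIs from an identifier supplied in the data. It is an illustration under stated assumptions, not the planned converter: `BASE_URI`, `TITLE_PREDICATE`, and the function names are hypothetical, and the predicate is a placeholder rather than an LD4L Ontology term. Only the MARCXML namespace and the MARC tags (001 for the control number, 245 $a for the title) are taken from the MARC format itself.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
NS = {"marc": MARC_NS}

# Assumptions for illustration only: neither URI below is a real LD4L term.
BASE_URI = "http://example.org/ld4l/instance/"
TITLE_PREDICATE = "http://example.org/ontology/title"


def mint_uri(local_id):
    """Mint a URI from an identifier supplied in the data (here, MARC 001)."""
    return BASE_URI + local_id.strip()


def convert_record(record):
    """Convert one marc:record element to a list of (s, p, o) tuples."""
    control = record.find("marc:controlfield[@tag='001']", NS)
    if control is None or not (control.text or "").strip():
        raise ValueError("record has no 001 control number to mint a URI from")
    subject = mint_uri(control.text)
    triples = []
    title = record.find(
        "marc:datafield[@tag='245']/marc:subfield[@code='a']", NS
    )
    if title is not None and title.text:
        triples.append((subject, TITLE_PREDICATE, title.text.strip()))
    return triples


def convert_marcxml(xml_text):
    """Convert every record in a MARCXML document.

    Records are handled independently: no deduping, no reconciliation.
    """
    root = ET.fromstring(xml_text)
    triples = []
    # iter() also matches the root, so a bare <record> document works too.
    for record in root.iter("{%s}record" % MARC_NS):
        triples.extend(convert_record(record))
    return triples


SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">12345</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Example title</subfield>
    </datafield>
  </record>
</collection>"""

triples = convert_marcxml(SAMPLE)
```

Keeping `convert_record` free of any cross-record state matches the stated scope of no deduping across a collection, and it makes each record straightforward to cover with unit and regression tests.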

UC2. As an LD4P partner I want to convert HFA Data from Filemaker Pro to the LD4L Ontology as the basis for LD4P cataloging/extension work

  • Access HFA metadata directly from Filemaker db.
  • Expect to extend the converter core from use case 1
    • Directly re-use core converter "element converters" for shared elements, such as TitleBuilder, CreatorBuilder, PublisherBuilder, etc., to generate RDF and manage URI minting, entities
    • Re-use or extend lower level converter methods to the greatest extent possible to generate RDF for extension elements
    • Re-use or extend lower level converter methods to the greatest extent possible to manage URI minting
  • Automated access to convert from live database or clone at a frequency TBD
  • Extend the converter core to recognize the new format and model; output using both core and extension modeling
  • Specific vocabulary/entity reconciliation needs
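The re-use described above could be sketched as follows: each input format gets its own small extraction step that maps source fields to a shared, normalized form, and the same element converters then generate triples for every format. Only the names TitleBuilder and CreatorBuilder come from this page; their internals, the `ElementBuilder` base class, `FilmFormatBuilder`, the `EX` namespace, and all source column names are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
EX = "http://example.org/ontology/"  # placeholder, not the LD4L Ontology


class ElementBuilder:
    """Base class: builds triples for one element from normalized fields."""
    key = ""
    predicate = ""

    def build(self, subject: str,
              fields: Dict[str, Optional[str]]) -> List[Triple]:
        value = fields.get(self.key)
        if not value:
            return []
        return [(subject, self.predicate, value.strip())]


class TitleBuilder(ElementBuilder):
    key, predicate = "title", EX + "title"


class CreatorBuilder(ElementBuilder):
    key, predicate = "creator", EX + "creator"


class FilmFormatBuilder(ElementBuilder):
    """Hypothetical extension element used only by the HFA conversion."""
    key, predicate = "film_format", EX + "filmFormat"


# Shared by every input format.
CORE_BUILDERS = [TitleBuilder(), CreatorBuilder()]


def extract_marc(record):
    """Map a simplified MARC dict (tag + subfield keys) to normalized fields."""
    return {"title": record.get("245a"), "creator": record.get("100a")}


def extract_hfa(row):
    """Map a FileMaker row to normalized fields; column names are invented."""
    return {
        "title": row.get("Title"),
        "creator": row.get("Director"),
        "film_format": row.get("Format"),
    }


def convert(subject, fields, builders):
    """Run each builder over the normalized fields and collect the triples."""
    triples = []
    for builder in builders:
        triples.extend(builder.build(subject, fields))
    return triples


marc_triples = convert(
    "http://example.org/r/1",
    extract_marc({"245a": "A title", "100a": "Doe, Jane"}),
    CORE_BUILDERS,
)
hfa_triples = convert(
    "http://example.org/r/2",
    extract_hfa({"Title": "A film", "Director": "Roe, Sam", "Format": "16mm"}),
    CORE_BUILDERS + [FilmFormatBuilder()],
)
```

In this arrangement a new format only contributes an extraction function and any extension builders, while the core builders stay untouched.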

UC3. As an LD4P partner I want to convert FGDC XML to the LD4L Ontology as the basis for LD4P cataloging/extension work

  • FGDC metadata is in XML, stored by and available to Harvard LTS
  • Expect to extend the converter core from use case 1
    • Directly re-use core converter "element converters" for shared elements, such as TitleBuilder, CreatorBuilder, PublisherBuilder, etc., to generate RDF and manage URI minting, entities
    • Re-use or extend lower level converter methods to the greatest extent possible to generate RDF for extension elements
    • Re-use or extend lower level converter methods to the greatest extent possible to manage URI minting
  • Automated access to convert from live XML at a frequency TBD
  • Extend the converter core to recognize the new model; output using both core and extension modeling
  • Specific vocabulary/entity reconciliation needs
  • Some form of convenient wrapper around core converter (possibly with extensions) that handles a batch of records and provides helpful debugging output to identify records that fail while continuing to process other records.
  • Work on deduplication of local URIs necessary
  • Scope here is a modest number of records as input for manual work, so performance is likely not an issue
  • *M records, so performance is an issue
  • Manual work to help with deduplication and reconciliation is out-of-scope – expectation is that improvements would come from fixes to the source MARC and/or to authority data
  • Not actually part of LD4L Labs proposal but we would like to do it
  • Builds on 5, expect deduplication, reconciliation, performance.
  • Catalog has daily updates (so repeating a complete conversion is not possible)
  • Must deal with updates! Expectation that doing this in a triplestore will require a named graph for the triples from each record, thus allowing delete and then add in order to update.
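Two of the points above, a batch wrapper that keeps going when individual records fail and a named graph per record so that an update is a delete followed by an add, can be sketched together. This is a toy under stated assumptions: `QuadStore` is an in-memory stand-in for a real triplestore, and `graph_uri_for`, `update_batch`, and `toy_convert` are hypothetical names, not part of any existing converter.

```python
import logging
from typing import Callable, Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
log = logging.getLogger("batch")


class QuadStore:
    """Toy in-memory stand-in for a triplestore with named graphs."""

    def __init__(self) -> None:
        self.graphs: Dict[str, Set[Triple]] = {}

    def replace_graph(self, graph_uri: str, triples: Iterable[Triple]) -> None:
        self.graphs.pop(graph_uri, None)       # delete the old triples
        self.graphs[graph_uri] = set(triples)  # add the new ones


def graph_uri_for(record_id: str) -> str:
    """One named graph per source record (URI pattern is an assumption)."""
    return "http://example.org/graph/" + record_id


def update_batch(store: QuadStore,
                 records: Iterable[Tuple[str, dict]],
                 convert: Callable[[dict], List[Triple]]) -> List[str]:
    """Convert each record; log failures and continue. Returns failed ids."""
    failed = []
    for record_id, record in records:
        try:
            triples = convert(record)
        except Exception as exc:
            log.warning("record %s failed: %s", record_id, exc)
            failed.append(record_id)
            continue
        # Replace only after a successful conversion, so a failing update
        # leaves the previously loaded graph in place.
        store.replace_graph(graph_uri_for(record_id), triples)
    return failed


def toy_convert(record: dict) -> List[Triple]:
    """Stand-in for the real converter: fails on records without a title."""
    if "title" not in record:
        raise ValueError("missing title")
    return [(record["uri"], "http://example.org/ontology/title",
             record["title"])]


store = QuadStore()
first_failed = update_batch(store, [
    ("r1", {"uri": "http://example.org/r/1", "title": "Old title"}),
    ("r2", {"uri": "http://example.org/r/2"}),  # no title: fails, batch continues
    ("r3", {"uri": "http://example.org/r/3", "title": "Third"}),
], toy_convert)

# A later daily update touches only r1; its graph is replaced wholesale.
second_failed = update_batch(store, [
    ("r1", {"uri": "http://example.org/r/1", "title": "New title"}),
], toy_convert)
```

Against a real triplestore, `replace_graph` would likely map to dropping the record's graph and reloading it (for example with SPARQL 1.1 Update `DROP GRAPH` followed by `INSERT DATA`), but that choice depends on the store and is not settled here.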

UC7. As an LD4P partner I want to convert a batch of MARC/XXX records to the BF2 Ontology as the basis for LD4P cataloging/extension work

  • Relies on ability to customize core to output BF2 instead of LD4L Ontology
  • Harvard assumes there will be BF2.0 Converters available from LC and vendors. (With all the work to be done and unanswered questions, LD4L should not specifically strive for this.)