
FGDC Converter on GitHub

FGDC Converter on Stanley

Strategy

  • Leverage the Cornell converter base code to create the FGDC converter
  • Extend the XML parser from the base converter to parse FGDC-XML
  • Use the Mapping Specification spreadsheet
  • Fields with MARC equivalents
    • Use the base converter to generate RDF directly for each field
    • Additional mapping code may need to be written, e.g. to reconcile language codes from MARC with the default language (English) for FGDC
      • e.g. originator in FGDC is not exactly the same as author in MARC (1xx, 7xx)
  • Fields without MARC equivalents
    • Extend the base converter to add new methods to generate RDF for each field
  • Include JUnit tests for all fields
    • Create the tests as we develop the code
    • Every field should have a JUnit test
    • Marc M should provide test fixture data (inputs and outputs)
  • Test with the FGDC test suite
  • Convert the FGDC corpus of 9000 records and load into the Vitrolib triple store
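
The "parse FGDC-XML" step above can be sketched with the XML and XPath classes in the Java standard library. This is a minimal illustration, not the base converter's parser: the class name FgdcFieldSketch, the field keys, and the choice of three fields are assumptions made for the example. The element paths use the FGDC CSDGM short names (idinfo, citeinfo, origin, descript).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the Cornell base converter.
final class FgdcFieldSketch {

    // Hypothetical field keys mapped to XPaths built from FGDC CSDGM short names.
    private static final Map<String, String> XPATHS = new LinkedHashMap<>();
    static {
        XPATHS.put("title", "/metadata/idinfo/citation/citeinfo/title");
        XPATHS.put("originator", "/metadata/idinfo/citation/citeinfo/origin");
        XPATHS.put("abstract", "/metadata/idinfo/descript/abstract");
    }

    /** Returns the non-empty fields found in the record, in the order listed above. */
    static Map<String, String> extract(String fgdcXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(fgdcXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<String, String> fields = new LinkedHashMap<>();
            for (Map.Entry<String, String> entry : XPATHS.entrySet()) {
                // evaluate() returns the text of the first match, or "" if none,
                // so repeated elements (e.g. several origin values) are not all captured.
                String value = xpath.evaluate(entry.getValue(), doc).trim();
                if (!value.isEmpty()) {
                    fields.put(entry.getKey(), value);
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse FGDC XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<metadata><idinfo><citation><citeinfo>"
                + "<origin>Example Originator</origin><title>Example Title</title>"
                + "</citeinfo></citation></idinfo></metadata>";
        System.out.println(extract(xml));
    }
}
```

A map of extracted values like this would also be a convenient unit to assert against in per-field JUnit tests, with the fixture XML as input and the expected values as output.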

Questions

    1. What is the mapping definition that Cornell is using? What RDF patterns will be produced?
      https://docs.google.com/spreadsheets/d/1k664EP8PKKkDF1utbzdp_7Awx8Xdve63ChaSDiMra2A/edit?usp=sharing
    2. Which entity reconciliations will Cornell provide? People?
    3. What additional reconciliation would be specific to the FGDC converter? Would URIs be added to the FGDC records ahead of time, or would an AI algorithm have to match against an external authority?

FGDC Fields for Converter, in Priority Order

See Status in the geospatial tab of the overall field planning spreadsheet: https://docs.google.com/spreadsheets/d/1k664EP8PKKkDF1utbzdp_7Awx8Xdve63ChaSDiMra2A/edit#gid=1545790267

FGDC Element | Conversion Requirement | Conversion Priority | Marc Equivalent | LD4 Pattern
Local Identifier (layername) | Required | 1 | None | Identifiers
Title | Required | 2 | 245 $a | Titles
Geometry | Required | local conversion | 034 $d $e $f $g | Spatial Extent
Originator | Required | 3 | 1xx, 7xx | Activity
Publication Date | Required | 4 | 260/264 $$c | Activity
Publication Place | Required | 4 | 260/264 $$a | Activity
Publisher | Required | 4 | 260/264 $$b | Activity
Online resource (linkage) | Required | 5 | 856 $$u | bf:electronicLocator
Theme Keyword | Required | 6 | 650 $$a | Subject
ISO Topic Category | Required | 6 | 650 $$a | Subject
Place Keywords | Required | 7 | 651 $$a | Coverage
Resource Type | Required | 8 | constant data: bf:Cartography and bf:Dataset | Work subtypes
Abstract | Required | 9 | 520 $$a | Annotations
Purpose | Required | 10 | 5xx | Annotations
Presentation Form | Required | 11 | 008 / 33x | Instance subtypes
Edition | Required | 12 | 250 $$a | bf:editionStatement
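
The Geometry row pairs the FGDC bounding coordinates with MARC 034 $d $e $f $g (west, east, north, south). A minimal sketch of reading those four values in that order is shown here. It assumes the coordinates sit under idinfo/spdom/bounding as decimal degrees, which is the FGDC CSDGM layout; the class name FgdcBoundingBoxSketch and the validation rule are illustrative assumptions, and no particular Spatial Extent RDF pattern is implied.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the Cornell base converter.
final class FgdcBoundingBoxSketch {

    private static final String BOUNDING = "/metadata/idinfo/spdom/bounding/";

    // FGDC element names listed in the same order as MARC 034 $d $e $f $g.
    private static final String[] NAMES = {"westbc", "eastbc", "northbc", "southbc"};

    /** Returns {west, east, north, south} in decimal degrees. */
    static double[] extract(String fgdcXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(fgdcXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            double[] box = new double[NAMES.length];
            for (int i = 0; i < NAMES.length; i++) {
                String text = xpath.evaluate(BOUNDING + NAMES[i], doc).trim();
                if (text.isEmpty()) {
                    throw new IllegalArgumentException("Missing " + NAMES[i]);
                }
                box[i] = Double.parseDouble(text);
            }
            // Assumed sanity check: the southern edge must not lie north of the northern edge.
            if (box[3] > box[2]) {
                throw new IllegalArgumentException("southbc is greater than northbc");
            }
            return box;
        } catch (IllegalArgumentException e) {
            throw e; // includes NumberFormatException from non-numeric coordinates
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse FGDC XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<metadata><idinfo><spdom><bounding>"
                + "<westbc>-122.5</westbc><eastbc>-122.0</eastbc>"
                + "<northbc>37.9</northbc><southbc>37.2</southbc>"
                + "</bounding></spdom></idinfo></metadata>";
        System.out.println(java.util.Arrays.toString(extract(xml)));
    }
}
```

West is deliberately not compared with east, since an extent that crosses the antimeridian can legitimately have a western value larger than its eastern one.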
     

 

Development plan

Phase | Task | Description | Dependency | Who | Target date | Status/Notes
0 | Define FGDC fields | Define the list of FGDC fields that we expect the converter to handle for this grant | | Marc | 3/15/17 | Done
0 | Define test cases | Add test XML files to the FGDC test suite (listed under "Test cases" below the table) | fgdc-record-2.xml and later depend on the field set available in the base converter | Marc | 3/15/17 | Done (files listed under "Test cases" below the table)
1 | FGDC parser | Install base converter, and extend the XML parser from the base converter to parse 1_MinimalFGDC_Title_and_ID.xml | base converter by 3/12 | David | 3/15/17 | Done
1 | Minimal converter | Extend the base converter to process 1_MinimalFGDC_Title_and_ID.xml to generate 1_MinimalFGDC_Title_and_ID.rdf | | David | 4/1/17 | Done
1 | Minimal converter JUnit tests | Add JUnit tests for the fields in 1_MinimalFGDC_Title_and_ID.xml | | David | 4/3/17 | Done
1 | Approve 1_MinimalFGDC_Title_and_ID.rdf | Inspect and approve the RDF generated for 1_MinimalFGDC_Title_and_ID.xml | | Marc | 4/4/17 | Done
1 | Deploy converter to Stanley | Deploy the converter to Stanley in a form such that Marc can run it on new test data | | David | 4/6/17 | Done
1 | Minimal ontology in Vitrolib | Load an ontology sufficient to describe 1_MinimalFGDC_Title_and_ID.xml | | Marc | 4/6/17 | Done
1 | Minimal Vitrolib | Load the 1_MinimalFGDC_Title_and_ID.rdf from processing 1_MinimalFGDC_Title_and_ID.xml into the Vitrolib triple store | | David | 4/7/17 | Done
1 | Approve 1_MinimalFGDC_Title_and_ID.rdf in Vitrolib | Inspect and approve, using Vitrolib, the RDF generated for 1_MinimalFGDC_Title_and_ID.xml | | Marc | 4/8/17 | Done
2 | Minimal+spatial_extent converter | Extend the base converter to process 2_MinimalFGDC_SpatialExtent.xml to generate 2_MinimalFGDC_SpatialExtent.rdf | | David | 4/24/17 | Done
2 | Minimal+spatial_extent converter JUnit tests | Add JUnit tests for the fields in 2_MinimalFGDC_SpatialExtent.xml | | David | | Done
2 | Approve 2_MinimalFGDC_SpatialExtent.rdf | Inspect and approve the RDF generated for 2_MinimalFGDC_SpatialExtent.xml | | Marc | | Done
2 | Update converter to Stanley | Deploy the converter to Stanley in a form such that Marc can run it on new test data | | David | | Done
2 | Minimal+spatial_extent ontology in Vitrolib | Load an ontology sufficient to describe 2_MinimalFGDC_SpatialExtent.rdf | | Marc | | Done
2 | Minimal+spatial_extent Vitrolib | Load the 2_MinimalFGDC_SpatialExtent.rdf from processing 2_MinimalFGDC_SpatialExtent.xml into the Vitrolib triple store | | David | | Done
2 | Approve 2_MinimalFGDC_SpatialExtent.rdf in Vitrolib | Inspect and approve, using Vitrolib, the RDF generated for 2_MinimalFGDC_SpatialExtent.xml | | Marc | | Done
3 | Approve 3_Generated_CoreFGDC_Originator_and_Publisher_Activities.n3 | Inspect and approve the RDF generated for 3_CoreFGDC_Originator_and_Publisher_Activities.xml | | Marc | | Done

Test cases

Add the following test XML files to the FGDC test suite:

  1. 1_MinimalFGDC_Title_and_ID.xml - minimal FGDC record (1 field)
  2. 2_MinimalFGDC_SpatialExtent.xml - minimal plus spatial extent

To be planned:

  • fgdc-record-2.xml (MARC fields that are planned to be handled by the Cornell converter on April 1)
  • fgdc-record-3.xml (MARC fields that are handled by the Cornell converter on May 1)
  • fgdc-record-4.xml (final MARC fields that are handled by the Cornell converter plus all geo-specific fields)

Done:

  • 1_MinimalFGDC_Title_and_ID.xml
  • 2_MinimalFGDC_SpatialExtent.xml
  • 3_CoreFGDC_Originator_and_Publisher_Activities.xml
  • 4_CoreFGDC_Abstract_and_Purpose_Annotations.xml
  • 5_Generated_CoreFGDC_Online_Resource_and_Edition.xml
  • 6_CoreFGDC_Thematic_and_Place_Keywords.xml