Project purpose and goals

Library services rely on accurate metadata to ensure that faculty, students, and researchers can discover, access, and use the library materials needed for research and education. Library metadata standards, strategies, and schemas are undergoing dramatic change, driven by the changing nature of library collections, data creation workflows, and evolving expectations for interoperability and reuse of metadata. As Harvard Library embarks on a migration to a new library processing platform, the Access and Discovery Standing Committee spearheaded a project to enhance and upgrade metadata in Aleph, SFX, and Verde to achieve two key objectives:

1) optimize the ease and accuracy of metadata migration from Aleph to Alma

2) increase the quality and efficacy of resource sharing for patrons at Harvard and beyond 

Subproject | Alma Migration | OCLC Synchronization | Other Corrections (most will use automated scripts)
Target date for subproject completion | Dec. 2017 | Aug. 2017 | Dec. 2017
No. error conditions identified | 68 | 130 | 292
No. error conditions defined as priorities: 100
No. error conditions resolved | 4 | 4 | 0
Estimated no. records impacted | unknown | 300,000 | Some error conditions will impact all bibliographic and holdings records (approx. 30 million)
Target for no. of record corrections | 400,000 | 90,000 |
No. record corrections completed | 11,000 | 53,400 |
Revised forecast for no. corrections by end of project | 400,000 | 90,000 | 30 million records (all)

Weekly Update & Dashboard

June 28, 2017

Estimated number of Aleph bibliographic records to be corrected: 300,000

Modifications to Date

Type | Est. Count | % by Batch
Item | 1,546 | 62
Orders   1034



In the works:

See prior updates and project status for more details.


Phase / Issue | Status / % Complete | Est. No. of Records | Aleph Records Corrected
1: Paired field errors (subfield $$6) | | Bibs: 29,396 |
1: 'Sparse' records | | Bibs: 21,538 |
1: Missing/Problematic titles (245) | | Bibs: 855 |
1: Invalid identifiers | | Bibs: > complete db check | Bibs: 0
1: Form of item conflict: Invalid combination LDR/06 & LDR/07 | | Bibs: 8,071 (revised est.) |
1: Form of item conflict: Non-standard or obsolete codes in LDR/06 (Record Type) | | Bibs: 15,413 |
1: Form of item conflict: Non-standard or obsolete codes in 008/23 (BK) | | Bibs: 87 | 0
1: Form of item conflict: Non-standard or obsolete codes in 008/23 (CF) | | Bibs: 61 | 0
1: Form of item conflict: Non-standard or obsolete codes in 008/23 (MX) | | Bibs: 3 | 0
1: Form of item conflict: Non-standard or obsolete codes in 008/23 (MX) | | Bibs: 12 | 0
1: Form of item conflict: Non-standard or obsolete codes in 008/29 (VM) | | Bibs: 42,229 | 0
1: Form of item conflict: Non-standard or obsolete codes in 008/29 (MP) | | Bibs: 20 | 0
Alma: Active holdings on deleted/suppressed bibs | Bibs: 24.86%; Hol: 17.66% | Holdings: 15,575; Bibs: 16,031 | 2751 / 3625
1: Set lending byte | | 13,588,541 - Unknown policy; 514,505 - Invalid values |
1: Set reproduction byte | | 13,669,916 - Unknown policy; 515,184 - Invalid values |
1: Set retention byte | | 140,442 - Unknown policy; 9,045,492 - Invalid values |
0: LDR - Invalid length | | Bibs: 376 |
0: LDR/05 (Record status) - Invalid codes | | Bibs: 10,551 |
0: Orphan bibs | | Bibs: 41,000 |
0: 008 (Fixed field) missing | | Bibs: 36 |
0: 008 (Fixed field) - Invalid length | | Bibs: 918102 |
0: 008/06 (Date Type) - Invalid codes | | Bibs: 20,398 |
0: 008/15-17 (Language - Invalid & obsolete codes) - All formats except music/sound | | Active Bibs: 8,554 |
0: 008/15-17 (Language - Invalid codes or blank) - Sound recordings | | Bibs: 30,256 | 21,683
0: 008/21 (Music Parts) - Invalid codes | | Bibs: 1,146 (1,143 have obsolete code 'a') |
0: 008/33 LitF (BK) - Invalid codes | | Active Bibs: 21,858 contain invalid codes; 1217 contain an obsolete code | 0
0: 008/33 Alph (SE) - Invalid codes | | Active Bibs: 406 | 0
0: 008/33 TMat (VM) - Invalid codes | | Active and suppressed bibs: 42,257 contain invalid or obsolete codes | 0
0: 008/35-37 Place of publication - Invalid & obsolete codes | | Active Bibs: 152,697 |
0: 260 Imprint missing | | Judaica Manual by unit |
0: Obsolete 440: convert series statement to 490/830 | | Bibs: 1,161,189, Approx. 1.2m tags |
Alma: Item Processing Status - invalid codes | | Items: 1 |
Alma: Item Status - invalid codes | | Items: 22643 |
Alma: Items: Material Type - invalid codes | | Items: 930 |
Alma: Order Status - invalid codes | | Orders: 1010 |
Alma: Acquisition Method - invalid codes | | Orders: 23 | 23
Alma: Open Orders | | Orders to be closed to be identified by ASWG as part of test load process. |
Alma: RLN conversion | | Holdings: 10,260. Debbie corrected 9,263 holdings. She will forward the list of holdings that could not be resolved to contacts at the sublibraries for resolution. | 9,263
Alma: LDR/07 with Harvard-defined '9' | | Bibs: 17 | 17
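
Several of the phase 0 conditions in the dashboard above are structural checks on the MARC leader and 008 field. The sketch below shows one way such checks might be expressed. It is not the project's actual script: the function name and the plain-string record representation are assumptions made for illustration, while the lengths (24 for the leader, 40 for the 008) and the LDR/05 code list come from the MARC 21 bibliographic format.

```python
from typing import List, Optional

LEADER_LENGTH = 24       # MARC 21 leader is a fixed 24 characters
FIELD_008_LENGTH = 40    # MARC 21 bibliographic 008 is a fixed 40 characters
VALID_RECORD_STATUS = {"a", "c", "d", "n", "p"}  # LDR/05 codes for bibliographic records


def find_error_conditions(leader: str, field_008: Optional[str]) -> List[str]:
    """Return dashboard-style labels for structural problems in one record.

    Hypothetical helper: `leader` is the raw leader string and `field_008`
    is the raw 008 string, or None when the record has no 008 field.
    """
    errors = []
    if len(leader) != LEADER_LENGTH:
        errors.append("LDR - Invalid length")
    elif leader[5] not in VALID_RECORD_STATUS:
        # Positions are only trustworthy when the length is right,
        # so LDR/05 is checked only for well-formed leaders.
        errors.append("LDR/05 (Record status) - Invalid codes")
    if field_008 is None:
        errors.append("008 (Fixed field) missing")
    elif len(field_008) != FIELD_008_LENGTH:
        errors.append("008 (Fixed field) - Invalid length")
    return errors
```

Counting how many records return each label would give per-condition estimates of the kind reported in the "Est. No. of Records" column.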

Project Approach

This project is funded by the Harvard Library and will be managed by a project team from Library Technology Services. The project team will gather information from Harvard Library staff, the OCLC Data Sync Working Group, and the Metadata Standards Working Group about current and historic coding practices and workflows in Aleph, SFX, and Verde. Using that information, the team will identify and recommend metadata candidates for normalization and improvement, define and recommend remediation objectives and strategies, and complete remediation of prioritized data issues.

An oversight committee representing Library Technology Services, the Access & Discovery Standing Committee, and Harvard Library ITS and Access Services will prioritize work and approve data remediation strategies.

Objectives, strategies, and recommendations will be informed by information gathered from experts across Harvard Library.

  • Gather information on current and historic data and workflows
  • Conduct metadata analysis to identify candidates for correction
  • Prioritize candidate data elements to be corrected
  • Design and execute data correction processes, using automated processing as much as possible
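
As an illustration of what an automated correction might look like, the sketch below converts an obsolete 440 series field into a 490/830 pair, one of the corrections listed in the dashboard. It is a simplified assumption, not the project's procedure: the dictionary-based field representation and the function name are hypothetical, the subfield mapping is deliberately minimal, and any field containing subfields outside that mapping is returned as None so it can be routed to manual review.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory field: tag, two indicators, ordered (code, value) subfields.
Field = Dict[str, object]

AUTO_CONVERTIBLE = {"a", "v", "x"}  # assumption: only these subfields are handled automatically


def convert_440_to_490_830(field_440: Field) -> Optional[Tuple[Field, Field]]:
    """Build a 490/830 pair from a 440, or return None to flag manual review."""
    subfields: List[Tuple[str, str]] = field_440["subfields"]  # type: ignore[assignment]
    if any(code not in AUTO_CONVERTIBLE for code, _ in subfields):
        return None  # leave unusual fields for a cataloger to examine
    field_490: Field = {
        "tag": "490",
        "ind1": "1",  # series is traced, so an 8XX field accompanies it
        "ind2": " ",
        "subfields": list(subfields),
    }
    field_830: Field = {
        "tag": "830",
        "ind1": " ",
        "ind2": field_440["ind2"],  # nonfiling character count carries over
        "subfields": [(c, v) for c, v in subfields if c in ("a", "v")],
    }
    return field_490, field_830
```

Returning None for anything outside the simple case keeps the automated pass conservative: routine fields are converted in bulk, and the remainder can be reported for manual correction by unit.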


  • A list of key areas to correct or improve, prioritized by impact on users
  • A communication strategy and dashboard to monitor progress
  • Final summary report

The project will be complete when accepted recommendations have been implemented and correction projects defined as highest priority have been completed.

Project Team

Lynn Stram, Metadata Migration Analyst, Library Technology Services (lead)
Corinna Baksik, Library Technology Services
Michael Edwards, Library Technology Services
Laura Morse, Library Technology Services
Allison Powers, Library Technology Services 
Additional Term Project Resources (tbd, Library Technology Services, Information and Technical Services) 

Oversight Committee

Michelle Durocher, Information and Technical Services
Laura Morse, Library Technology Services
Ken Peterson, Access Services
Tracey Robinson, Library Technology Services
Scott Wicks, Information and Technical Services
Suzanne Wones, Harvard Library


A preliminary schedule has been created but may need adjustment based on the outcome of the analysis, hiring patterns for project staff, and dependencies on the OCLC Data Synchronization Project:

Information gathering, data review, development of database remediation project list: May 2016 – September 2016

Finalize priorities for remediation, communication plan, and dashboard: October 2016

Determine options for data remediation initiatives: November – December 2016

Execute data remediation: December 2016 – April 2017

Prepare final report: April 2017 – May 2017


Metadata Optimization Charter

Resource Links 

PowerPoint Slides from Public Presentations (October 2016)


Contact Team Members:

Milestone Timeline

May-September 2016: Information Gathering
October 2016: Finalize Phase 1 Priorities
November-December 2016: Define and Execute Phase 1 Remediation Options
January 2017: Finalize Phase 2 Priorities
February 2017: Define and Execute Phase 2 Remediation Options
February 2017: Finalize Phase 3 Priorities
May 2017: Define and Execute Phase 3 Remediation Options
May 2017: Finalize Phase 4 Priorities
April-May 2017: Final Report
June 2017: Define and Execute Phase 4 Remediation Options