1-4:30pm, Gutman Library basement. About 35 attendees.
We held two discussion sessions, the first with four separate groups and the second with five. The topics were suggested and chosen by the attendees as the first order of business.
The following are rough notes taken on what the groups reported back. It was understood that these reports would not attempt to capture everything that had happened, but rather what the group most needed to know about the discussions. (Potentially actionable ideas are in purple.)
Organizational and legal roadblocks
Looked at legal agreements among providers. The Library would benefit from more direct access to legal support with respect to information management and IP. Maybe a Library/Law School fellowship, or more connections between the Library and the Law School.
Need to revisit privacy and usage constraints/protections. In the real world, orgs are collecting info about people. Maybe we can think differently while still operating within the ethical landscape. We should think creatively about who we get at the table.
It'd help to improve cross-university communication, and encourage people to do work that is not traditionally within their job description. Institute a formal project-sharing network, or enable people to spend time every week on projects outside their formal job description.
We shouldn't try to come up with one standard definition of interoperability. We each define it from within the project and tech mindset we come to it with. Hence: interoperability is discussion.
The different areas in which it exists: social, technical.
Different layers of achieving interop. You start out with one sense, but as the project progresses, its meaning changes.
Should we have a super API, or a loose collection of more domain-specific APIs? Hierarchical?
Real humans don't use APIs. You need a UI to make what's discovered usable. May have different ways of exploring using the same API. How do you expose the richness that is useful to people? E.g., linked data is powerful but how do you expose it to people so they find it useful?
The illusion of interoperability. E.g., Bento boxes can look interoperable without being so, and that's fine.
// David Read has submitted his fairly detailed notes of the session itself. Thank you, David. Here they are:
- Dublin Core was the first attempt at this. Maybe not that successful?
- Looking for "the truth": what is Harvard's authoritative name, ID, location, etc. for a resource (of any type)?
- Linked data as the answer?
- VIAF (international standard? for personal and corporate name authority)
- User-generated metadata tools are needed (bottom-up as well as top-down curation)
- Semantics are always going to be a problem, with different fields having different meanings for terms
- The several-records-for-the-same-object issue
- APIs at Harvard?
- Semantic Web folks are already doing some of this metadata work
- How do we get some common metadata where humans don't have to be in the loop?
- Example of a metadata management challenge:
  - 12 records in OCLC for The Hobbit: should be 1 record with 250 locations
  - The different editions of Huckleberry Finn can be very important depending on the user
- Need to be able to co-locate things for different audiences
- Granularity in standards is difficult
- A unique identifier is the key...
- DOIs often don't change for various formats of paper collections (this can be a problem)
- Shared Canvas app and protocol?
- Where are there author identifiers that could be considered authoritative at Harvard?
- Chain of trust vs. authority
  - Creation of authority by pulling in resources to establish authority?
- Need to be able to bring together research done to establish authoritative names with internal and external constituencies.
- Example of the challenge of ontologies:
  - Datasets for a cancer collection that use 10 different ontologies for one type of cancer
- Various levels of description needed: container vs. domain (from W3C)
- ORCID project?
- Harvard museums don't have APIs currently
- LibraryCloud core standard as a model?
- How do you expose your data model?
  - If you do expose your data and model with an API: how do I interpret your data?
- The Semantic Web's challenge with interpreting data/data models:
  - Many embedded links needed to interpret data
- Challenges of legacy records and metadata and obsolete hardware/software: digital forensics and metadata
- BIBFRAME project?
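The Hobbit example above (12 OCLC records that should be 1 record with 250 locations) is at bottom a record-deduplication problem. A minimal sketch of the idea, assuming records are simple dicts and using a hypothetical match key of normalized title plus author:

```python
from collections import defaultdict

def merge_duplicates(records, key_fields=("title", "author")):
    """Collapse records sharing a match key into one record whose
    'locations' list is the union of all the duplicates' holdings."""
    merged = defaultdict(list)
    for rec in records:
        # Normalize the key fields so trivial variants match.
        key = tuple(rec[f].strip().lower() for f in key_fields)
        merged[key].extend(rec.get("locations", []))
    return [
        {"title": key[0], "author": key[1], "locations": sorted(set(locs))}
        for key, locs in merged.items()
    ]

records = [
    {"title": "The Hobbit", "author": "Tolkien", "locations": ["Widener"]},
    {"title": "the hobbit ", "author": "Tolkien", "locations": ["Lamont"]},
]
# merge_duplicates(records) collapses these to one record with both locations
```

Real matching is of course much harder than string normalization (editions of Huckleberry Finn that matter to some users are exactly the cases a naive key would wrongly merge), which is the granularity problem noted above.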
Interop with library non-standard data
For example, user-generated content. Could a UGC tool provide some level of integration across systems?
Should the discovery layer try to make sense of the non-standard data, or use ingestion routines?
Marc: Open Geoportal Group. Maybe a front-end tool.
See how commercial entities work with non-standard UG data. E.g., are Amazon's facets generated on the fly?
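On the "facets generated on the fly" question: the basic technique is just counting field values over a result set at query time. A hedged sketch (the field names and records are invented):

```python
from collections import Counter

def build_facets(results, facet_fields):
    """Count the distinct values of each facet field across a result set."""
    return {
        field: Counter(r[field] for r in results if field in r)
        for field in facet_fields
    }

results = [
    {"format": "book", "language": "eng"},
    {"format": "book", "language": "fre"},
    {"format": "map"},
]
facets = build_facets(results, ["format", "language"])
# facets["format"] counts book=2, map=1; records missing a field are skipped
```

Search engines do this server-side over the matched set, which is why facets can adapt to any query without a fixed taxonomy, even over non-standard user-generated data.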
Linked Data and Science
Make URIs, maybe LTS identifiers, for all descriptive records in Aleph, VIA, OASIS, etc. (id.lib: already building the first version for DRS 2).
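A trivial sketch of what minting such identifiers could look like; the hostname and path scheme here are assumptions for illustration, not the actual LTS design:

```python
def mint_uri(system, record_id, base="http://id.lib.harvard.edu"):
    """Mint a persistent URI for a descriptive record.
    The base host and system/record path layout are hypothetical."""
    return f"{base}/{system.lower()}/{record_id}"

# mint_uri("aleph", "012345678") -> "http://id.lib.harvard.edu/aleph/012345678"
```

The point of the pattern is that the URI is stable and system-qualified, so Aleph, VIA, and OASIS records can all be referenced (and linked) the same way.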
The Harvard Problem: Systems don't interoperate. More unconfs.
Extreme programming week and hackathons for bottom up interop.
Encourage people to use ORCID. How to do that? Maybe a mandate from granting agencies. Maybe an outreach program to the faculty. Require it as part of the dissertation submission process. Make it normal.
Berkman does weekly hackathons.
More ABCD tutorials, like Unix for Librarians
Content delivery and interoperability
Rapid viewing of content as page images to make quick decisions
Image based searching someday
There's utility to low-quality scans
APIs to the content. E.g., the International Image Interoperability Framework. Support standards. (Standard APIs are a relatively quick win.)
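For reference, IIIF Image API requests follow a fixed URL template: {identifier}/{region}/{size}/{rotation}/{quality}.{format}. A small builder to show the shape of a request; the server prefix and identifier are made up:

```python
def iiif_image_url(server, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL from its path segments.
    (The quality keyword is 'native' in Image API 1.1, 'default' from 2.0.)"""
    return f"{server}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

url = iiif_image_url("https://ids.example.edu/iiif", "12345",
                     region="0,0,600,400", size="300,")
# Requests a 600x400 crop from the origin, scaled to 300px wide
```

Because every conforming server answers the same URL grammar, one viewer can page through images from any institution, which is what makes this a quick interoperability win.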
Make available the 100,000s of jpg files in Law School.
Allow mass downloads?
- Fresh baked catalog records - tweet when done
- Stack trails - heatmap ????
- What's in HD?
- Swipe to get out of the library
- Course reserves unreleased - make them viewable and accessible
- Desperate recall - one button does either recall or Borrow Direct
- Relaxed recall - let me get it in the next month or so.
- Rendezvous recall - anonymously msg the borrower who has the book I want
- Provide a tool that enables users to enrich the content - crowdsourcing, descriptive metadata, OCR correction.
This is from a board for gathering potential quick steps we could take:
Interop workbench – take some sets of data, make them available to researchers and developers, and provide tools to play with that data
Extreme programming week or hackathon
Take a subset or all of HOLLIS and put it into BIBFRAME to play with it.
Put the DPLA API on top of HOLLIS
HOLLIS match with the HBS faculty research system (RIS)
RIS to Catalyst to Faculty Finder to HOLLIS interop
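On the "DPLA API on top of HOLLIS" idea: the DPLA API is a simple JSON-over-HTTP search interface, so the quick step amounts to answering DPLA-style item queries from HOLLIS data. A sketch of what building such a query URL looks like; the endpoint host and key are placeholders:

```python
from urllib.parse import urlencode

def dpla_style_query(base, q, api_key, page_size=10):
    """Build a DPLA-style /v2/items search URL (host and key are placeholders)."""
    params = {"q": q, "api_key": api_key, "page_size": page_size}
    return f"{base}/v2/items?{urlencode(params)}"

url = dpla_style_query("https://api.example.org", "hobbit", "MY_KEY")
# A HOLLIS front end exposing this same URL grammar would be usable
# by any client already written against the DPLA API.
```

The attraction is reuse: clients, harvesters, and apps built for DPLA would work against HOLLIS unchanged.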
Standards, tools, and technologies we're pretty safe betting on, perhaps investing some development around, etc.
- International Image Interoperability Framework
- ORCID iDs
- EAC-CPF