
Spreadsheets for digitization orders are based on templates. The process of filling in these spreadsheets is explained here (the various template options are presented in detail here).

Template Header Basics:


Spreadsheet specifics:

     Unique object IDs:

One of the most significant elements of creating the spreadsheet is generating a file name for each item being imaged.  These file names must be expressed consistently, correctly, and uniquely.  Note that file names are often visible to the public.  Though they are administrative in nature, they also serve a citation function and are used to identify the work when paper copies are delivered to patrons in the case of suppressed material.

The first part of the file name is made up of the owning repository:

modbm- Modern Books & Manuscripts (>1800)

pga- Printing & Graphic Arts

earbm- Early Books & Manuscripts (<1600)

hew- Harry Elkins Widener

htc- Harvard Theater Collection

hyde- Hyde Collection and Early Modern (1600-1800)

wpr- Woodberry Poetry Room

trc- Theodore Roosevelt Collection

Owing to a limitation in Alpha, an order may list only one owning repository even when the material actually belongs to multiple owning repositories.  There may be a note in “Special Comments” concerning ownership; be sure to review the material and the record thoroughly to catch any misattributions.  Each repository must be correctly linked to its objects in an order so that the objects are appropriately filed in the DRS.

Following the owning repository code, an edited form of the call number is added to the file name.

e.g. A cartographical sketch-book of Siberia created in the late 1600s, part of the Leo Bagrow collection of maps, MS Russ 72, is filed under the Hyde Collection.  The call number is made all lower case and spaces are represented by underscores:

hyde_ms_russ_72

Then, an item, folder, or volume number is added to the file name:

e.g. The sketch-book is identified as item six in the Bagrow collection:

hyde_ms_russ_72_6

Typically, each object requires a unique object ID.  Specify the unique object ID for each item within its own template header.  Note that manuscript material has a limit of 300 page images per METS, even for complex objects.  To capture longer materials or collections in full, it may be necessary to create more than one object.  Break up the collection into logical ranges if needed (1-100, 101-200, etc. or A-D, E-G, etc.) and append the section information to the unique file ID, e.g.:


Note: The file name should contain only the substance of the call number, not additional location information. There are a few collections in which this is trickier than it sounds. For example, Lowell is a location designation, but it is formally part of the call number, so it would be included in the file name. The same applies to designations like Horblit or HEW (the latter then redundantly repeats its repository code, hew_hew). Locations like Vault, or a suffix such as "23.1" (a shelf location seen with Incunabula), would not be included.
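The basic assembly described above (repository code, then the lowercased call number with underscores for spaces, then an optional item number) can be sketched in Python. This is only an illustration: the function name build_object_id is hypothetical, and the sketch assumes the call number has already been reduced to its substance (no location or size information) and contains no punctuation other than spaces.

```python
# Repository codes as listed on this page.
REPOSITORY_CODES = {"modbm", "pga", "earbm", "hew", "htc", "hyde", "wpr", "trc"}


def build_object_id(repo_code, call_number, item=None):
    """Join the repository code, the lowercased call number, and an optional item number.

    Hypothetical helper: assumes the call number is already stripped of
    location and size designations and uses only spaces as separators.
    """
    if repo_code not in REPOSITORY_CODES:
        raise ValueError(f"unknown repository code: {repo_code!r}")
    # Lowercase the call number; each run of spaces becomes one underscore.
    parts = [repo_code] + call_number.lower().split()
    if item is not None:
        parts.append(str(item))
    return "_".join(parts)


print(build_object_id("hyde", "MS Russ 72"))     # hyde_ms_russ_72
print(build_object_id("hyde", "MS Russ 72", 6))  # hyde_ms_russ_72_6
```

Rejecting unknown repository codes early is a design choice of this sketch; it is meant to catch typos before a file name reaches the spreadsheet.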

     File Names:

File names may be entered to describe the particular item being captured, or they may be automatically generated extensions from the unique object ID.

e.g.:      The whole of MS Russ 72 (6) is to be digitized.  The file names may be automatically generated in sequence from an initial entry of hyde_ms_russ_72_0001 in the first cell underneath “file name.”  The final number should always be expressed as a four-digit series.

e.g.:      Two folios of MS Russ 72 (6) will be digitized.  Unique file names must be constructed to define the folios: hyde_ms_russ_f5_recto and hyde_ms_russ_f12_verso.

If the material lacks obvious identification (no page numbers, an unnumbered plate between pages, etc.), a reasonable and replicable identifier should be created that will allow for future digital interfiling.
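For the automatically generated case, the four-digit sequence can be sketched as follows. The function name sequential_file_names is hypothetical, and the sketch assumes the initial entry ends in an underscore followed by exactly four digits, as in hyde_ms_russ_72_0001. In practice the spreadsheet itself may generate the sequence; this only shows the pattern.

```python
def sequential_file_names(first_name, count):
    """Return `count` file names continuing the four-digit sequence of `first_name`.

    Hypothetical helper: expects a name such as 'hyde_ms_russ_72_0001'.
    """
    stem, _, number = first_name.rpartition("_")
    if not stem or len(number) != 4 or not (number.isascii() and number.isdigit()):
        raise ValueError(f"expected a name ending in _NNNN, got {first_name!r}")
    start = int(number)
    # Zero-pad every number to four digits so the names sort correctly.
    return [f"{stem}_{n:04d}" for n in range(start, start + count)]


for name in sequential_file_names("hyde_ms_russ_72_0001", 3):
    print(name)
# hyde_ms_russ_72_0001
# hyde_ms_russ_72_0002
# hyde_ms_russ_72_0003
```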


     Naming Miscellany:

Neither size designation (f/folio or pf/portfolio) nor location information (b/boxed or Lobby, etc.) is included in the citation or the file name, e.g.:

hyde_ms_russ_72_6  not  hyde_f_ms_russ_72_6

Asterisks are not included in the citation or the file name. 

Versions (a, b, c, etc.) should be noted after the call number.  These must be included in the citation and file name for disambiguation in case future versions of the work are also deposited.

Underscores should be used in place of all other punctuation (periods, parentheses, etc.).

Non-standard characters should be replaced with the most analogous standard keyboard option, e.g.:

IC6 Sf579 649ℓ becomes earbm_ic6_sf579_649l where the ℓ is replaced by a lowercase L

     (Note: These characters should not be altered in the citation for the work.)
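The character rules above (drop asterisks, replace non-standard characters, turn remaining punctuation into underscores) can be sketched as a small normalization step. The names SUBSTITUTIONS and normalize_call_number are hypothetical, the substitution table contains only the one character mentioned on this page, and collapsing adjacent punctuation into a single underscore is an assumption inferred from MS Russ 72 (6) becoming ms_russ_72_6. Size and location designations still have to be removed by hand.

```python
import re

# Assumed substitution table; only the script "ℓ" comes from this page.
SUBSTITUTIONS = {"ℓ": "l"}


def normalize_call_number(call_number):
    """Return the file-name form of a call number (without the repository code)."""
    text = call_number.replace("*", "")  # asterisks are dropped entirely
    for original, replacement in SUBSTITUTIONS.items():
        text = text.replace(original, replacement)
    text = text.lower()
    # Every run of spaces or punctuation becomes a single underscore (assumption).
    text = re.sub(r"[^a-z0-9]+", "_", text)
    return text.strip("_")


print("earbm_" + normalize_call_number("IC6 Sf579 649ℓ"))  # earbm_ic6_sf579_649l
print("hyde_" + normalize_call_number("MS Russ 72 (6)"))   # hyde_ms_russ_72_6
```

The citation for the work is left untouched; only the file name goes through this normalization.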

When designating the page or folio number, no underscore is required between the designator and the number, e.g.:

modbm_lmc8_m5745C_1824cp_p9  not  modbm_lmc8_m5745C_1824cp_p_9

Note any peculiarities about the capture in the last part of the file name: fold_out, double_spread, illustration (if it is a partial page capture), etc.

Translate Roman numerals to Arabic numerals for file names, particularly to avoid confusion between the Roman numeral v and v. for verso.  Include the Roman numerals in page labels to represent pagination/foliation as it appears in the text.
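A minimal sketch of the Roman-to-Arabic translation for file names follows. The function name roman_to_arabic is hypothetical, and the sketch assumes a well-formed numeral; it does not reject irregular forms such as "iiii". The page label keeps the numeral as printed.

```python
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_arabic(numeral):
    """Convert a Roman numeral such as 'xiv' or 'IX' to an integer."""
    letters = numeral.lower()
    if not letters or any(ch not in ROMAN_VALUES for ch in letters):
        raise ValueError(f"not a Roman numeral: {numeral!r}")
    values = [ROMAN_VALUES[ch] for ch in letters]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (iv = 4, ix = 9).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


print(f"p{roman_to_arabic('xiv')}")  # p14
print(f"p{roman_to_arabic('v')}")    # p5
```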

     Optical character recognition (OCR):

Optical character recognition is rarely applied to OTC orders; however, it should be applied to any whole printed work dated after 1800.  Paperwork for the order will specify that OCR should be applied.  In this case, simply change "no" to "yes" in the cell adjacent to OCR in the spreadsheet header.

     Noting Restrictions on Spreadsheets:

Though there is a field for indicating restrictions in the spreadsheet, Houghton policy is typically to enter a public “P” value for this cell to allow for quality control review of material upon its return from Imaging.  To restrict material to Harvard ID holders, indicate *Restrict to Harvard Access Only* in cell C1.

      Object Label / Titling Information:

All data entered in this field must be presented on the same line (no returns with the "enter" key).  Any broken lines will result in incomplete transfer of the object label.
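Where labels are pasted in from another source, a quick check for stray line breaks can help. The sketch below uses a hypothetical helper, single_line_label, that replaces each line break (and any whitespace around it) with a single space; it is an illustration, not part of the spreadsheet template.

```python
import re


def single_line_label(label):
    """Return the label with every line break replaced by a single space."""
    return re.sub(r"\s*[\r\n]+\s*", " ", label).strip()


broken = "Herman Melville Copy.\nAC85 M4977 Zz891s2."
print(single_line_label(broken))  # Herman Melville Copy. AC85 M4977 Zz891s2.
```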

Association copies require an additional note preceding the call number to indicate the previous owner, e.g. Dickinson Family Library Copy.  EDR 200.  or Herman Melville Copy.  AC85 M4977 Zz891s2.
