HOW TO SQUEEZE ART OBJECT SUBJECT METADATA OUT OF SCHOLARLY TEXTS

The Basics

 

  1. Start with a collection that
    1. has descriptive cataloging information at the item level and
    2. includes links or potential links to (digital) (art) objects

  2. Identify or create a target object ‘authority list’ for the collection being cataloged

  3. Select scholarly texts that provide rich item-level description of objects in the image collection; scan them and encode them with barebones TEI markup

  4. Process each scholarly text to identify with a high degree of accuracy each mention of a target art object (TOI)

  5. Parse each text to identify noun phrases and other likely metadata-bearing content

  6. Process each text to identify and correlate all text blocks (sentences, paragraphs, pages, chapters, footnotes, captions, etc.) that appear to refer to specific target objects (SEGMENTATION)

  7. Identify the important phrases and vocabulary that can be used as metadata using:

  8. Format and tag the metadata, incorporate it into the corresponding descriptive item records with links to art objects, and load the records into a bibliographic and/or image search system
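
A Rough Sketch of Steps 2, 4, 6 and 8

The following Python sketch is one possible way to picture how an authority list can drive object identification and segmentation. It is not the method used by any particular project. The authority list contents, the object IDs, the record field names, and the function names (build_patterns, split_sentences, segment, to_records) are all hypothetical, and simple name matching stands in for whatever higher-accuracy identification technique is actually used.

```python
import re
from collections import defaultdict

# Hypothetical authority list (step 2): object ID -> known name variants.
# The first variant is treated as the preferred name.
AUTHORITY = {
    "obj-001": ["Pantheon", "Santa Maria Rotonda"],
    "obj-002": ["Arch of Constantine"],
}


def build_patterns(authority):
    """Compile one case-insensitive, whole-word pattern per target object."""
    patterns = {}
    for obj_id, names in authority.items():
        # Try longer variants first so they win over shorter ones.
        ordered = sorted(names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in ordered)
        patterns[obj_id] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def segment(text, authority):
    """Steps 4 and 6: map each object ID to the sentences that name it."""
    patterns = build_patterns(authority)
    segments = defaultdict(list)
    for sentence in split_sentences(text):
        for obj_id, pattern in patterns.items():
            if pattern.search(sentence):
                segments[obj_id].append(sentence)
    return dict(segments)


def to_records(segments, authority):
    """Step 8 (simplified): one dictionary per object, ready for loading."""
    return [
        {
            "object_id": obj_id,
            "preferred_name": authority[obj_id][0],
            "passages": passages,
        }
        for obj_id, passages in segments.items()
    ]


sample = (
    "The Pantheon has a coffered dome. "
    "Its portico has granite columns. "
    "The Arch of Constantine stands near the Colosseum."
)
segments = segment(sample, AUTHORITY)
records = to_records(segments, AUTHORITY)
for record in records:
    print(record["object_id"], record["passages"])
```

Note that the second sample sentence ("Its portico ...") is not attached to any object, because it refers to the building only by pronoun. Plain name matching misses such references, which is one reason segmentation has to correlate larger text blocks (paragraphs, captions, footnotes) and not only the sentences that name an object.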