The subcommittee of the Libraries' Archives Committee charged with developing a uniform structure for Columbia finding aids on the Web began with the assumption that it was desirable to create finding aids with a relatively consistent look and feel, even though they might emanate from different parts of the University and describe very different collections. However, because they must follow the structure of the archival files, finding aids differ considerably in complexity, length, and degree of detail. Because of the limited flexibility of HTML, it is not possible to produce a single template that is suitable for all finding aids. We have therefore devised a set of models designed to produce finding aids that may vary in the nature of their contents or descriptive lists but are uniform in their visual structure. People wishing to encode a finding aid in HTML should examine the models carefully to see which is closest to their own finding aid in the structure of the contents list, and should then adapt that model to their specific needs. (NOTE: It is recommended that anyone attempting to encode a finding aid have both a knowledge of the principles of archival arrangement and some familiarity with HTML.)
HTML-encoded finding aids can be created from existing word-processing files. They can also be created directly as part of archival processing. Some people have found it satisfactory to scan typed documents into Microsoft Word files and then encode those documents using the appropriate tools. In all cases, some tweaking or editing will be necessary.
(http://archive.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html).
In order to handle the tabular data, knowledge of HTML tables (http://home.netscape.com/assist/net_sites/tables.html) is desirable, but there are great conversion tools that would let you convert existing spreadsheet tables into HTML without it.
Finally, to create the desirable online Finding Aid layout with Navigator, Title page, etc., knowledge of Frames (http://home.netscape.com/assist/net_sites/frames.html) is needed. Note that this part could be done for you as part of support from LSO.
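As a minimal sketch of the table markup mentioned above, a short container list might be encoded as follows. The column headings and the BORDER attribute are assumptions chosen for illustration, not part of these guidelines, and the cell contents are placeholders.

```html
<!-- Illustrative only: column headings and BORDER value are assumptions -->
<TABLE BORDER="1">
  <TR>
    <TH>Box</TH>
    <TH>Folder</TH>
    <TH>Contents</TH>
  </TR>
  <TR>
    <TD>1</TD>
    <TD>1</TD>
    <TD>Put description here</TD>
  </TR>
  <TR>
    <TD>1</TD>
    <TD>2</TD>
    <TD>Put description here</TD>
  </TR>
</TABLE>
```

Each <TR> is one row of the original spreadsheet, so a conversion tool would normally produce one such row per spreadsheet line.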
(http://www.columbia.edu/cu/libraries/indiv/rare/guides/) for examples of online Finding Aids created by following the guidelines in this document.
(uglov@columbia.edu). It could include:
A single online Finding Aid at Columbia consists of at least 4 parts (files): frameset [F], title page [T], main page(s) [M], navigator [N], all appearing in the same web browser window at different positions and times. All the files related to a single online Finding Aid reside in the same directory, which is devoted exclusively to that Finding Aid.
Frameset [F]: index.html. This file brings the rest of the files comprising the online Finding Aid together in one browser window using <FRAMESET> and <FRAME> tags, and it repeats the contents of the Navigator so that frames-incapable browsers (such as Lynx) can still be used to view the Finding Aid. It can also contain preliminary AMC information.
Title page [T]: title.html. This file is a document giving the author and title information of the collection, the repository, and possibly the author's photo. This is the page users will see first in the main frame of the frameset when they get to the Finding Aid.
Main page(s) [M]: main.html; if there are several main pages, invent your own naming convention. When the user clicks on a link in the Navigator, he or she is taken to a particular place in the Main page(s).
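The way these files fit together can be sketched as a minimal index.html. This is an illustration under stated assumptions, not the subcommittee's template: the file name nav.html for the Navigator, the frame names "nav" and "main", the 25%/75% column split, and the anchor #box1 are all hypothetical.

```html
<HTML>
<HEAD>
<TITLE>TITLE OF COLLECTION: Finding Aid</TITLE>
</HEAD>
<!-- Assumed layout: Navigator on the left, Title/Main pages on the right -->
<FRAMESET COLS="25%,75%">
  <!-- nav.html is a hypothetical name for the Navigator file -->
  <FRAME SRC="nav.html" NAME="nav">
  <!-- The Title page is what users see first in the main frame -->
  <FRAME SRC="title.html" NAME="main">
  <NOFRAMES>
  <BODY>
    <!-- Repeats the Navigator contents for frames-incapable browsers -->
    <H1>TITLE OF COLLECTION</H1>
    <UL>
      <LI><A HREF="title.html">Title page</A>
      <LI><A HREF="main.html#box1">Box 1</A>
    </UL>
  </BODY>
  </NOFRAMES>
</FRAMESET>
</HTML>
```

Under the same assumptions, a Navigator link such as `<A HREF="main.html#box1" TARGET="main">Box 1</A>` would load the Main page into the frame named "main" and jump to the place marked with `<A NAME="box1">` in main.html.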
No. | You have | Do this |
---|---|---|
1 | Finding Aid in machine-readable form (word-processor, spreadsheet, etc.) | Convert to HTML: see the Complexity Classification for instructions |
2 | Finding Aid in print only | Use a scanner with OCR when possible. Process the results either directly into HTML (observing the guidelines for HTML coding below) or into a non-HTML machine-readable form (word-processor, spreadsheet, etc.) and then convert: see the Complexity Classification for instructions |
3 | Finding Aid does not exist; the online Finding Aid must be created from scratch | Two options: |
In general, begin by separating files by their type. Convert each file separately: see
For word-processor-like documents, Finding Aid HTML encoding is based on a simple tag set associated with typical finding aid elements. Tags are used to identify the title of the collection, the series and sub-series entries, box or container numbers, notes, personal and corporate names, and contents description. The latter two lists, personal and corporate names and contents, contain the substantive information in the finding aid and are its most important part.
The heart of the encoding system is two kinds of list: a definition list <DL><DD></DL> for all contents and an unordered list <UL><LI></UL> for all lists of personal/corporate names.
After the container is identified, for example:
<H3>Box 1</H3>
the contents of the container are described:
<DL><DD>Put description here</DL>
The important, and tricky, part of this system of tagging is multiple indents. Each indent is indicated by a completely new <DL><DD></DL> structure. Hence, it is entirely possible to have multiple, nested <DL><DD></DL> structures. Given this minor complication it is necessary to remember where you are in any particular structure and to CLOSE OFF each structure when it is finished.
Here is an example:
Box 35
    Offprints by Merton:
        The Christmas Sermons of Bl. Guerric
        The Climate of Monastic Prayer
        Conversatio Morum
        Examination of Conscience
        For a Renewal of Eremitism in the Monastic State
        Liturgical Renewal
        The Pasternak Affair in Perspective
        La vida solitaria
        The Zen Koa
    Offprints and articles relating to Merton
        Pamphlets:
            Miscellaneous 2 folders
            Monasteries
            The Monk in the Diaspora
            Thomas Merton Books, Fall 1988 (catalog).
            Two Articles by Thomas Merton
        Tearsheets from the Columbia Yearbook, 1937
        Tearsheets from "The Jester"
        Vespers Funeral Mass & Burial Mass for Thomas Merton
1st indent (open) <DL><DD>
    2nd indent (will be closed) <DL><DD></DL>
1st indent (cont.) <DD>
    2nd indent (new, open) <DL><DD>
        3rd indent (will be closed) <DL><DD></DL>
    2nd indent (cont., closed) <DD></DL>
1st indent (empty, closed) </DL>
<h3>Box 35</h3>
<dl>
    <dd>Offprints by Merton:
    <dl>
        <dd>The Christmas Sermons of Bl. Guerric
        <dd>The Climate of Monastic Prayer
        <dd>Conversatio Morum
        <dd>Examination of Conscience
        <dd>For a Renewal of Eremitism in the Monastic State
        <dd>Liturgical Renewal
        <dd>The Pasternak Affair in Perspective
        <dd>La vida solitaria
        <dd>The Zen Koa
    </dl>
    <dd>Offprints and articles relating to Merton
    <dl>
        <dd>Pamphlets:
        <dl>
            <dd>Miscellaneous 2 folders
            <dd>Monasteries
            <dd>The Monk in the Diaspora
            <dd>Thomas Merton Books, Fall 1988 (catalog).
            <dd>Two Articles by Thomas Merton
        </dl>
        <dd>Tearsheets from the Columbia Yearbook, 1937
        <dd>Tearsheets from "The Jester"
        <dd>Vespers Funeral Mass & Burial Mass for Thomas Merton
    </dl>
</dl>
The tagging for lists of personal/corporate names is quite simple. Each list must start with <UL> and end with </UL>, and each name must be introduced by <LI>, e.g.
<H3>Box 1</H3>
<UL>
    <LI>name
    <LI>name
</UL>
The overall structure of a finding aid is
<H1>TITLE</H1>
<H2>SERIES</H2>
<H3>Box or container number</H3>
<DL><DD>Contents</DL> OR <UL><LI></UL>
<H2>SUB-SERIES</H2>
<H3>Box or container number</H3>
<DL><DD>Contents</DL> OR <UL><LI></UL>

Tags such as <b>, <i>, <BLOCKQUOTE> can be used just about anywhere in the structure. However, series titles should not come after Box numbers.
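Filled in with a few of the Box 35 entries from the example above, the overall structure might look like the following sketch. The <HTML>, <HEAD>, and <BODY> wrappers are assumptions added to make the file complete; TITLE, SERIES, SUB-SERIES, the second box heading, and the names are placeholders, and the contents list is abbreviated.

```html
<HTML>
<HEAD>
<TITLE>TITLE</TITLE>
</HEAD>
<BODY>
<H1>TITLE</H1>

<H2>SERIES</H2>
<H3>Box 35</H3>
<!-- Contents: one DL per level of indent, each closed when finished -->
<DL>
  <DD>Offprints by Merton:
    <DL>
      <DD>The Christmas Sermons of Bl. Guerric
      <DD>The Climate of Monastic Prayer
    </DL>
  <DD>Offprints and articles relating to Merton
</DL>

<H2>SUB-SERIES</H2>
<H3>Box or container number</H3>
<!-- Personal/corporate names: a UL with one LI per name -->
<UL>
  <LI>name
  <LI>name
</UL>
</BODY>
</HTML>
```

Note that each series heading precedes the box headings that belong to it, in keeping with the rule that series titles should not come after Box numbers.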
Generally, there are 2 ways to do the conversion:
rtftohtml filename
With regard to results, all of the above recipes are equally mediocre: they produce output littered with unneeded font modification tags while ignoring many essential formatting features (a reminder: write straight in HTML if you are writing from scratch). You should pick the one that is logistically most suitable.
NOTE: the resulting HTML document should not exceed 50K; otherwise it will take a conspicuously long time to open in a web browser. Split the file into several files if needed.
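One possible way to split an oversized Main page is sketched below. The file names main1.html and main2.html are a hypothetical naming convention, and the series and box content are placeholders; the only point illustrated is that each part can end with a link to the next so that readers can move between parts.

```html
<!-- main1.html: first part of a split Main page (hypothetical file name) -->
<HTML>
<HEAD>
<TITLE>TITLE, part 1</TITLE>
</HEAD>
<BODY>
<H2>SERIES</H2>
<H3>Box 1</H3>
<DL>
  <DD>Put description here
</DL>
<!-- Link to the next part; main2.html is also a hypothetical name -->
<P><A HREF="main2.html">Continued in part 2</A></P>
</BODY>
</HTML>
```

Splitting at a series or box boundary keeps every <DL> and <UL> structure closed within its own file.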
Once you've got that: