GUIDELINES FOR MARKUP OF VIRTUAL
READING ROOM TEXTS
The overall markup guidelines
for the Columbia Virtual Reading
Room (VRR) project shall be those
found in the following documents,
to the extent that they apply
to the specific types of texts
included in this contract.
Within the context of those standards
and guidelines, the following
editorial principles should be
applied and the specific markup
subset defined below used for
VRR documents.
I. General Editorial Principles

1. Subdivisions of the front, body, and back should be marked with
numbered <div> tags, <div1> representing the largest subdivision,
<div2> the next largest, and so on. (There should be no <div0> or
plain <div> tags.)

2. All <div> subdivisions should have an appropriate "type" attribute
(e.g., type="chapter" or type="scene").

3. All repeating <div> subdivisions (chapters, books, acts, scenes,
etc.) should be given an "n" attribute (e.g., <div1 type="act" n="2">).
Subdivisions that are part of larger subdivisions should be given a
value of "n" that reflects their hierarchical position: the number(s)
of the larger subdivision(s), each followed by a decimal point, and
then the ordinal number of the current subdivision. For example,
n="3.2" would be the attribute of scene two of act three in a play,
and n="1.4.3" the attribute of chapter three of book four of volume
one.

4. Every major divisional marker must be immediately followed by a
<head> tag, even if there is no information to put inside the <head>
tag.

5. All other features explicitly numbered in the text (e.g., notes,
numbered lines in a poem, etc.) should also be given an "n" attribute.

6. Page numbers should be replaced with a <pb> tag, with the number,
as it is presented on the page, made the value of the "n" attribute
(e.g., <pb n="iii" />). Include a <pb> tag for unnumbered pages as
well, with the implied number in brackets as the value of the "n"
attribute (e.g., <pb n="[17]" />).

7. The <pb> tags should always be placed at the top rather than the
bottom of the page. They should fall within a <div> tag, not between
<div> tags.

8. Accented and special characters that are not part of the simple
ASCII character set should be presented as entity references, using
the ISO 8859-1 (Latin-1) character entities as found on the chart at
http://etext.lib.virginia.edu/tei/iso88591.html. Thus, "â" would be
rendered as "&acirc;" (without the quotation marks), and "é" as
"&eacute;". Ampersands occurring in the text should also be rendered
as the entity reference "&amp;".

9. Tables of contents should be marked up and linked to the
appropriate location in the text.

10. Elements that are not converted (e.g., formulas, tables) should be
designated by the <gap> element, with the omitted element
characterized.
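
As an illustration only, the fragment below sketches how principles 3, 4, 6, 7, 8, and 10 might combine in a single chapter. Several details are assumptions made for this sketch and are not prescribed above: the <div3> level (implied by "and so on" in principle 1), the sample words, and the "desc" and "reason" attributes on <gap> (the guidelines name the element but not its attributes). The entity references also assume a DTD that declares the ISO 8859-1 entity set.

```xml
<div1 type="volume" n="1">
  <head>Volume I.</head>
  <div2 type="book" n="1.4">
    <head>Book IV.</head>
    <!-- Principle 3: n="1.4.3" is chapter 3 of book 4 of volume 1 -->
    <div3 type="chapter" n="1.4.3">
      <!-- Principle 4: a <head> is present even when there is nothing to put in it -->
      <head></head>
      <!-- Principles 6 and 7: the page break sits inside the <div>, at the
           top of the page; brackets mark an implied page number -->
      <pb n="[17]" />
      <!-- Principle 8: sample words only, showing entity references -->
      <p>ch&acirc;teau, caf&eacute;, Smith &amp; Sons</p>
      <!-- Principle 10: attribute names here are an assumption -->
      <gap desc="table" reason="not converted" />
    </div3>
  </div2>
</div1>
```

Note that the <pb> tag follows the <head> rather than preceding it, so that the <head> still comes immediately after the divisional marker.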
II. MARKUP OF THE
TEXT ITSELF (NOT INCLUDING
TEI HEADER)
A. List of Tags To Be Used

| Req | Element | Attributes Req If Available | Notes |
| R/A | <argument> | none | -- for any abstracts, precis, or listing of contents that may appear at the beginning of a chapter (but not the overall table of contents for a book, which should be its own <div>-tagged section in the front or back matter) |
| R/A | <back> | none | -- marks material usually found at the back of works, such as notes, advertisements, indices. The content model of back matter is identical to that of front matter |
| R/A | <bibl> | none | -- bibliographic reference; use only in notes or bibliographies; no markup of components within this element required |
| REQ | <body> | none | -- the wrapper for the main part of the <text>, excluding <front> and <back> matter |
| R/A | <byline> | none | -- the primary statement of responsibility given for a work on its title page or at the head or end of the work |
| R/A | <closer> | none | -- for letters (no further breakdown required, other than <lb> tags to note line breaks) |
| R/A | <div1>, <div2> | [see Sec I above] | [see Sec I above] |
| R/A | <docAuthor> | none | -- contains the name of the author of the document, as given on the title page [in <titlePage>] |
| R/A | <docDate> | none | -- contains the date of the document, as given (usually) on the title page [in <titlePage>] |
| R/A | <docEdition> | none | -- contains an edition statement as presented on a title page of a document [in <titlePage>] |
| R/A | <docImprint> | none | -- contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page [in <titlePage>] |
| REQ | <docTitle> | none | -- contains the title of a document, including all its constituents, as given on a title page. Must be divided into <titlePart> elements [in <titlePage>] |
| R/A | <epigraph> | none | -- a quotation or citation at the beginning of a work, a section or chapter, or on a title page. Often indicates a sentiment, moral, or mood. This should contain bibliographical information if possible |
| R/A | <figure> | none | -- indicates the location of a graphic, illustration, or figure; any caption or description present in the text is coded as <head> within <figure>; otherwise the element is empty. (?) |
| R/A | <front> | none | -- the wrapper for all prefatory material (contents, introduction, preface, etc.) |
| R/A | <head> | | -- contains any heading, for example, the title of a section, or the heading of a list or glossary. [see Sec I above about inclusion with every major <div>] |
| R/A | <hi> | rend="italics", rend="bold" | -- marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made; use with attribute rend="italics" or rend="bold" |
| R/A | <item> | none | -- contains one component of a list |
| R/A | <l> | "n" attribute only when numbering is already present in text | -- lines of verse or verse drama. Must occur inside an <lg> tag. |
| R/A | <lb /> | none | -- marks line break in non-verse text. Do not use for ends of lines in regular paragraphs, or for items in lists, but only where essential to preserve meaning through arrangement of text, e.g. on a title page, in a <head> element, or in the <opener> or <closer> of a letter |
| R/A | <lg> | none | -- contains a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.; used in verse or verse drama |
| R/A | <list> | none | -- contains any sequence of items organized as a list, whether numbered, bulleted, or of another type |
| R/A | <note> | n=[note # if present]; place="foot", place="end", or place="inline" | -- contains a note or annotation, with attributes to indicate the location and note number, if present |
| R/A | <opener> | none | -- |
| R/A | <p> | none | -- marks paragraphs in prose |
| R/A | <pb> | n=[page number]. See also Sec I above. | -- |
| R/A | <ptr> | target=[footnote number] | -- for footnote numbers in the body of the text (not the numbers attached to the note itself) |
| R/A | <sp> | none | -- |
| R/A | <speaker> | none | -- |
| R/A | <stage> | none | -- |
| REQ | <teiHeader> | -- | -- for elements of header, see separate list |
| REQ | <TEI.2> | -- | -- top-level element for a TEI text |
| REQ | <text> | none | -- |
| REQ | <titlePage> | none | -- |
| REQ | <titlePart> | type="main", type="sub", or type="alt" | -- in <titlePage>. Distinguishes between main title and subtitle. Should include type attribute (values: "main", "sub", "alt" for main title, subtitle, and alternate title) |
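
The title-page elements in the list above are not shown in the examples that follow, so a rough sketch of how they might nest is given here. Bracketed capitals are placeholders for wording transcribed from the source, in the manner of the examples below. Two placements are assumptions about common TEI practice rather than requirements stated in these guidelines: <docAuthor> inside <byline>, and <docDate> inside <docImprint>.

```xml
<front>
  <titlePage>
    <!-- <docTitle> is required and must be divided into <titlePart> elements -->
    <docTitle>
      <titlePart type="main">[MAIN TITLE AS PRINTED]</titlePart>
      <titlePart type="sub">[SUBTITLE AS PRINTED]</titlePart>
    </docTitle>
    <byline>[BYLINE WORDING] <docAuthor>[AUTHOR NAME AS PRINTED]</docAuthor></byline>
    <docEdition>[EDITION STATEMENT AS PRINTED]</docEdition>
    <!-- <lb /> is used here only because line arrangement matters on a title page -->
    <docImprint>[PLACE OF PUBLICATION]<lb />
      [PUBLISHER NAME]<lb />
      <docDate>[DATE AS PRINTED]</docDate>
    </docImprint>
  </titlePage>
</front>
```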
B. Examples of Text Tagging

The following examples (slightly modified versions of ones in UVA's
Electronic Text Center's Guide to Document Preparation) indicate how
these tags should be applied to various types of text.
1. PROSE
Example a. Partial markup for a multi-volume work

<TEI.2>
[TEI header information goes here]
<text>
<body>
<head>Wuthering Heights</head>
    <div1 type="volume" n="1">
    <head>Volume I.</head>
        <div2 type="chapter" n="1.1">
        <head>Chapter I.</head>
        [TEXT OF CHAPTER ONE, VOLUME ONE GOES HERE]
        </div2>
        <div2 type="chapter" n="1.2">
        <head>Chapter II.</head>
        [TEXT OF CHAPTER TWO, VOLUME ONE GOES HERE]
        </div2>
    </div1>*
[*NOTE: The <div1> closes only when the end of the "Volume" has
actually been reached.]
    <div1 type="volume" n="2">
    <head>Volume II.</head>
        <div2 type="chapter" n="2.1">
        <head>Chapter I.</head>
        [TEXT OF CHAPTER ONE, VOLUME TWO GOES HERE, ETC...]
        </div2>
        <div2 type="chapter" n="2.2">
        <head>Chapter II.</head>
        [AGAIN, <div2 type="chapter" n="2.xx"> WILL CONTINUE UNTIL THE
        END OF VOLUME TWO...]
        </div2>
    </div1>
</body>
</text>
</TEI.2>
Example b. Markup of a chapter, showing placement of page break

<div1 type="chapter" n="1">
<head>Marley's Ghost</head>
<pb n="9" />
<p>Marley was dead, to begin with. There is no doubt whatever about
that. The register of his burial was signed by the clergyman, the
clerk, the undertaker, and the chief mourner. Scrooge signed it. And
Scrooge's name was good upon 'Change, for anything he chose to put his
hand to.</p>
[... additional text here ...]
</div1>
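
Several tags from the list in II.A (<argument>, <epigraph>, <bibl>, <hi>, <ptr>, <note>, <figure>, <list>, <item>) do not appear in the prose examples above. The fragment below is a hedged sketch of how they might sit inside a chapter. Bracketed capitals are placeholders, not source text, and the ordering of elements is an assumption about typical TEI usage. The <ptr> carries its footnote number as content only because the letter example in these guidelines does so.

```xml
<div1 type="chapter" n="2">
  <head>[CHAPTER HEADING]</head>
  <!-- summary of contents printed at the head of the chapter -->
  <argument><p>[SUMMARY OF CONTENTS]</p></argument>
  <!-- quotation at the head of the chapter, with its source where possible -->
  <epigraph><p>[QUOTATION]</p><bibl>[SOURCE OF QUOTATION]</bibl></epigraph>
  <p>[TEXT] <hi rend="italics">[ITALICIZED PHRASE]</hi> [TEXT]
    <ptr target="1">[1]</ptr>
    <note n="1" place="foot">[1] [TEXT OF FOOTNOTE]</note></p>
  <!-- the caption is coded as <head>; without a caption <figure> is empty -->
  <figure><head>[CAPTION AS PRINTED]</head></figure>
  <list>
    <head>[HEADING OF LIST]</head>
    <item>[FIRST ITEM]</item>
    <item>[SECOND ITEM]</item>
  </list>
</div1>
```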
2. VERSE
Example a. One of a collection of poems

<div1 type="fit" n="1">
<head>Fit the First: THE LANDING</head>
<pb n="45" />
    <lg type="stanza">
    <l>"Just the place for a Snark!" the Bellman cried,</l>
    <l rend="indent">As he landed his crew with care;</l>
    <l>Supporting each man on the top of the tide</l>
    <l rend="indent">By a finger entwined in his hair.</l>
    </lg>
    <pb n="46" />
    <lg type="stanza">
    <l>"Just the place for a Snark! I have said it twice:</l>
    <l rend="indent">That alone should encourage the crew.</l>
    <l>Just the place for a Snark! I have said it thrice:</l>
    <l rend="indent">What I tell you three times is true."</l>
    </lg>
    [ETC....]
</div1>
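
Principle 5 in Section I and the <l> entry in the tag list call for an "n" attribute only where the source itself prints a line number. A minimal sketch follows, using bracketed placeholders rather than real verse and assuming, for illustration, a source that numbers every fifth line:

```xml
<lg type="stanza">
  <!-- no "n" attribute: the source prints no number beside these lines -->
  <l>[FIRST LINE]</l>
  <l>[SECOND LINE]</l>
  <l>[THIRD LINE]</l>
  <l>[FOURTH LINE]</l>
  <!-- "n" attribute: the source prints the number 5 beside this line -->
  <l n="5">[FIFTH LINE]</l>
</lg>
```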
3. DRAMA
Example a. Tagging of parts of the text of King Lear

<text>
<front>
    <div1 type="Dramatis Personae"><head>Dramatis Personae</head>
        <list>
        <item>LEAR king of Britain</item>
        <item>KING OF FRANCE</item>
        <item>DUKE OF BURGUNDY</item>
        <item>DUKE OF CORNWALL</item>
        <item>DUKE OF ALBANY</item>
        <item>EARL OF KENT</item>
        <item>EARL OF GLOUCESTER</item>
        <item>EDGAR son to Gloucester.</item>
        <item>EDMUND bastard son to Gloucester.</item>
        <item>CURAN a courtier.</item>
        <item>Old Man tenant to Gloucester.</item>
        <item>Doctor</item>
        <item>Fool</item>
        <item>OSWALD steward to Goneril.</item>
        <item>A Captain employed by Edmund.</item>
        <item>Gentleman attendant on Cordelia.</item>
        <item>A Herald.</item>
        <item>Servants to Cornwall.</item>
        <item>GONERIL, REGAN, CORDELIA } daughters to Lear.</item>
        <item>Knights of Lear's train, Captains, Messengers,
        Soldiers, and Attendants</item>
        </list>
    <p><stage>Scene: Britain.</stage></p>
    </div1>
</front>
<body>
    <div1 type="act" n="1">
    <head>Act 1</head>
        <div2 type="scene" n="1.1">
        <head>Scene 1</head>
        <p><stage>King Lear's palace.</stage></p>
        <p><stage>Enter KENT, GLOUCESTER, and EDMUND</stage></p>
        <sp><speaker>KENT</speaker>
            <p>I thought the king had more affected the Duke of Albany
            than Cornwall.</p></sp>
        <sp><speaker>GLOUCESTER</speaker>
            <p>It did always seem so to us: but now, in the division
            of the kingdom, it appears not which of the dukes he
            values most; for equalities are so weighed, that curiosity
            in neither can make choice of either's moiety.</p></sp>
        <sp><speaker>KENT</speaker>
            <p>Is not this your son, my lord?</p></sp>
        <sp><speaker>GLOUCESTER</speaker>
            <p>His breeding, sir, hath been at my charge: I have so
            often blushed to acknowledge him, that now I am brazed to
            it.</p></sp>
        <sp><speaker>KENT</speaker>
            <p>I cannot conceive you.</p></sp>
        <sp><speaker>GLOUCESTER</speaker>
            <p>Sir, this young fellow's mother could: whereupon she
            grew round-wombed, and had, indeed, sir, a son for her
            cradle ere she had a husband for her bed. Do you smell a
            fault?</p></sp>
        [ETC......]
        </div2>
    </div1>
</body>
</text>
4. LETTER
<TEI.2>
<teiHeader>
[TEI Header goes here]
</teiHeader>
<text>
<body>
<div1 type="letter">
<pb n="1" />
<opener>
Manassas junction <lb />
Oct. 8th 1861<lb />
<lb />
Dear Cousin
</opener>
<p>I write afew lines this morning to inform you that I am well at
this time and hopeing that it may find you all injoying the same
blesing. The health of our company is better at this time than it has
bin for some time.</p>
<p>I have no news of intrust to write to you. It is thought that we
will have a battle in a few days. Its reported that thay was fighting
yesterday at fawls Church. I dont know wether it was so or not. One of
the Danville Grays was upto see us last night. He said the yankees was
in four miles of them. Thay are stationed at Farfax Court House six
miles a head of us. It is thought that we will have a verry hard
battle when it does come off, I received a letter from Addie
<ptr target="1"> [1]</ptr> last eavning. It afforded me great pleasure
to hear that he was improveing so fast.</p>
<p>I will ad no more at present so good bye.</p>
<pb n="2" />
<closer>
Write soon to<lb />
your affectionate Cousin<lb />
<lb />
James Booker
</closer>
</div1>
</body>
<back>
<div1 type="notes">
<head>Notes</head>
<note n="1">[1] "Addie" probably refers to Drury Addison Blair
(1839-1864), the Bookers' cousin. Blair joined Company D when it was
formed in May of 1861, but was discharged due to chronic bronchitis in
August of 1861 (Gregory 81). See James Booker's letter of July 14,
1861, in which "A. Blair" includes a postscript to Chloe Unity
Blair.</note>
</div1>
</back>
</text>
</TEI.2>
III. TAGGING OF THE TEI HEADER
A. List of Tags to Include (with an indication of the larger elements
within which they occur)
<fileDesc>
    <titleStmt>
        <title> -- transcribe main title from title page, followed by bracketed phrase: [an electronic transcription]
        <author> -- last name, first name (transcribe from title page)
        <respStmt>
            <resp>
            <name> -- include tags for both creation of electronic text and creation of TEI markup
    <extent> -- approximate size in bytes
    <publicationStmt>
        <publisher> -- "Columbia University Libraries"
        <pubPlace> -- "New York"
        <date> -- current year
    <seriesStmt> -- "Columbia Virtual Reading Room Texts"
    <sourceDesc>
        <biblFull> -- transcribe info below from title page
            <titleStmt>
                <title>
                <author> -- last name, first name
                <respStmt> -- this area for editor, translator rather than main author
                    <resp>
                    <name>
            <extent> -- pagination of print original
            <publicationStmt>
                <pubPlace>
                <publisher>
                <date>
            etc.
B. SAMPLE HEADER FOR MACHIAVELLI'S PRINCE
<teiHeader>
    <fileDesc>
        <titleStmt>
            <title>
                The Prince [a machine-readable transcription]
            </title>
            <author>
                Machiavelli, Niccolo
            </author>
            <respStmt>
                <resp>
                    Creation of machine-readable version:
                </resp>
                <name>
                    Core Curriculum Office, Columbia University
                </name>
                <resp>
                    Conversion to TEI.2-conformant markup:
                </resp>
                <name>
                    Virtual Reading Room Project, Columbia
                    University Libraries
                </name>
            </respStmt>
        </titleStmt>
        <extent>
            ca. 220 Kb
        </extent>
        <publicationStmt>
            <publisher>
                Columbia University Libraries
            </publisher>
            <pubPlace>
                New York
            </pubPlace>
            <date>
                2000
            </date>
        </publicationStmt>
        <seriesStmt>
            <p>
                Columbia Virtual Reading Room Texts
            </p>
        </seriesStmt>
        <sourceDesc>
            <biblFull>
                <titleStmt>
                    <title level="m">
                        The prince
                    </title>
                    <author>
                        Niccolo Machiavelli
                    </author>
                    <respStmt>
                        <resp>
                            Editor and Translator
                        </resp>
                        <name>
                            David Wootton
                        </name>
                    </respStmt>
                </titleStmt>
                <extent>
                    xlvi, 83 p.
                </extent>
                <publicationStmt>
                    <publisher>
                        Hackett Pub. Co.
                    </publisher>
                    <pubPlace>
                        Indianapolis
                    </pubPlace>
                    <date>
                        c1995
                    </date>
                </publicationStmt>
            </biblFull>
        </sourceDesc>
    </fileDesc>
</teiHeader>