GUIDELINES FOR MARKUP OF VIRTUAL READING
ROOM TEXTS
The overall markup guidelines for the Columbia Virtual Reading
Room (VRR) project shall be those found in the following documents,
to the extent that they apply to the specific types of texts
included in this contract.
Within the context of those standards and guidelines, the following
editorial principles should be applied and the specific markup
subset defined below used for VRR documents.
I. General Editorial Principles
     1.  Subdivisions of the front, body, and back should be marked
with numbered <div> tags, <div1> representing the largest
subdivision, <div2> the next largest, and so on. (There should be
no <div0> or plain <div> tags.)
     2.  All <div> subdivisions should have an appropriate "type"
attribute (e.g., type="chapter" or type="scene").
     3.  All repeating <div> subdivisions (chapters, books, acts,
scenes, etc.) should be given an "n" attribute (e.g.,
<div1 type="act" n="2">). Subdivisions that are part of larger
subdivisions should be given a value of "n" that reflects their
hierarchical position: the number(s) of the larger subdivision(s),
each followed by a decimal point, and then the ordinal number of the
current subdivision. For example, n="3.2" would be the attribute of
scene two of act three in a play, and n="1.4.3" the attribute of
chapter three of book four of volume one.
     4.  Every major divisional marker must be immediately followed
by a <head> tag, even if there is no information to put inside the
<head> tag.
     5.  All other features explicitly numbered in the text (e.g.,
notes, numbered lines in a poem, etc.) should also be given an "n"
attribute.
     6.  Page numbers should be replaced with a <pb> tag, with the
number, as it is presented on the page, made the value of the "n"
attribute (e.g., <pb n="iii" />). Include a <pb> tag for unnumbered
missing pages as well, with the implied number in brackets as the
value of the "n" attribute (e.g., <pb n="[17]" />).
     7.  The <pb> tags should always be placed at the top rather than
the bottom of the page. They should fall within a <div> tag, not
between <div> tags.
     8.  Accented and special characters not part of the simple ASCII
character set should be presented as entity references using the
ISO 8859-1 Latin character entities as found on the chart at
http://etext.lib.virginia.edu/tei/iso88591.html.
Thus, "â" would be rendered as "&acirc;" (without the quotation
marks), and "é" as "&eacute;". Ampersands occurring in the text
should also be rendered as the entity reference "&amp;".
     9.  Tables of contents should be marked up and linked to the
appropriate location in the text.
    10.  Elements that are not converted (e.g., formulas, tables)
should be designated by the <gap> element, with the omitted element
characterized.
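Principles 3, 6, and 8 are mechanical enough to sketch in code. The
following Python is only an illustration, not part of the VRR
specification: the function names are hypothetical, and the entity
names come from Python's HTML 4 table, which is assumed here to agree
with the ISO 8859-1 chart cited above for code points 160 to 255. It
is meant for raw transcribed text, not for text that already contains
entity references.

```python
from html.entities import codepoint2name


def hierarchical_n(*ordinals: int) -> str:
    """Join ordinals from the largest subdivision down, e.g. (3, 2) -> "3.2"."""
    return ".".join(str(o) for o in ordinals)


def pb_tag(number: str, printed: bool = True) -> str:
    """Build a <pb> tag; an implied (unprinted) number is put in brackets."""
    value = number if printed else f"[{number}]"
    return f'<pb n="{value}" />'


def to_entities(text: str) -> str:
    """Replace '&' and Latin-1 characters above ASCII with entity references."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "&":
            out.append("&amp;")
        elif 160 <= code <= 255:
            # Assumption: HTML 4 names match the ISO 8859-1 chart.
            out.append(f"&{codepoint2name[code]};")
        else:
            out.append(ch)
    return "".join(out)
```

For instance, hierarchical_n(1, 4, 3) gives the "n" value for chapter
three of book four of volume one, and pb_tag("17", printed=False)
gives the bracketed form used for an implied page number.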
II. MARKUP OF THE TEXT ITSELF (NOT INCLUDING TEI
HEADER)
A. List of Tags To Be Used
| Req | Element | Attributes Req If Available | Notes |
| R/A | <argument> | none | -- for any abstracts, precis, or listing of contents that may appear at the beginning of a chapter (but not the overall table of contents for a book, which should be its own <div>-tagged section in the front or back matter) |
| R/A | <back> | none | -- marks material usually found at the back of works, such as notes, advertisements, indices. The content model of back matter is identical to that of front matter |
| R/A | <bibl> | none | -- bibliographic reference; use only in notes or bibliographies; no markup of components within this element required |
| REQ | <body> | none | -- the wrapper for the main part of the <text>, excluding <front> and <back> matter |
| R/A | <byline> | none | -- the primary statement of responsibility given for a work on its title page or at the head or end of the work |
| R/A | <closer> | none | -- for letters (no further breakdown required, other than <lb> tags to note line breaks) |
| R/A | <div1>, <div2> | [see Sec I above] | [see Sec I above] |
| R/A | <docAuthor> | none | -- contains the name of the author of the document, as given on the title page [in <titlePage>] |
| R/A | <docDate> | none | -- contains the date of the document, as given (usually) on the title page [in <titlePage>] |
| R/A | <docEdition> | none | -- contains an edition statement as presented on a title page of a document [in <titlePage>] |
| R/A | <docImprint> | none | -- contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page [in <titlePage>] |
| REQ | <docTitle> | none | -- contains the title of a document, including all its constituents, as given on a title page. Must be divided into <titlePart> elements [in <titlePage>] |
| R/A | <epigraph> | none | -- a quotation or citation at the beginning of a work, a section or chapter, or on a title page. Often indicates a sentiment, moral, or mood. This should contain bibliographical information if possible |
| R/A | <figure> | none | -- indicates the location of a graphic, illustration, or figure; any caption or description present in the text is coded as <head> within <figure>; otherwise the element is empty. (?) |
| R/A | <front> | none | -- the wrapper for all prefatory material (contents, introduction, preface, etc.) |
| R/A | <head> | | -- contains any heading, for example, the title of a section, or the heading of a list or glossary. [see Sec I above about inclusion with every major <div>] |
| R/A | <hi> | rend="italics", rend="bold" | -- marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made; use with attribute rend="italics" or rend="bold" |
| R/A | <item> | none | -- contains one component of a list |
| R/A | <l> | "n" attribute only when numbering is already present in text | -- lines of verse or verse drama. Must occur inside of <lg> tag. |
| R/A | <lb /> | none | -- marks line break in non-verse text. Do not use for ends of lines in regular paragraphs, or for items in lists, but only where essential to preserve meaning through arrangement of text, e.g. on title page, in a <head> element, or in the <opener> or <closer> of a letter |
| R/A | <lg> | none | -- contains a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.; used in verse or verse drama |
| R/A | <list> | none | -- contains any sequence of items organized as a list, whether numbered, bulleted, or of another type |
| R/A | <note> | n=[note # if present]; place="foot", place="end", place="inline" | -- contains a note or annotation, with attributes to indicate the location and note number, if present |
| R/A | <opener> | none | -- |
| R/A | <p> | none | -- marks paragraphs in prose |
| R/A | <pb> | n=[page number]. See also Sec I above. | -- |
| R/A | <ptr> | target=[footnote number] | -- for footnote numbers in the body of the text (not the numbers attached to the note itself) |
| R/A | <sp> | none | -- |
| R/A | <speaker> | none | -- |
| R/A | <stage> | none | -- |
| REQ | <teiHeader> | -- | -- for elements of header, see separate list |
| REQ | <TEI.2> | -- | -- top-level element for a TEI text |
| REQ | <text> | none | -- |
| REQ | <titlePage> | none | -- |
| REQ | <titlePart> | type="main", type="sub", or type="alt" | -- in <titlePage>. Distinguishes between main and subtitle. Should include type attribute (values: "main", "sub", "alt" for main, subtitle, and alternate title) |
B. Examples of Text Tagging
The following examples (slightly modified versions of ones
in UVA's Electronic Text Center's Guide to Document Preparation)
indicate how these tags should be applied to various types
of text.
1. PROSE
Example a. Partial markup for a multi-volume work
     <TEI.2>
     [TEI header information goes here]
     <text>
     <body>
     <head>Wuthering Heights</head>
     <div1 type="volume" n="1">
     <head>Volume I.</head>
           <div2 type="chapter" n="1.1">
           <head>Chapter I.</head>
           [TEXT OF CHAPTER ONE, VOLUME ONE GOES HERE]
           </div2>
           <div2 type="chapter" n="1.2">
           <head>Chapter II.</head>
           [TEXT OF CHAPTER TWO, VOLUME ONE GOES HERE]
           </div2>
     </div1>*
[*NOTE: The <div1> closes only when the end of the
"Volume" has actually been reached.]
     <div1 type="volume" n="2">
     <head>Volume II.</head>
           <div2 type="chapter" n="2.1">
           <head>Chapter I.</head>
           [TEXT OF CHAPTER ONE, VOLUME TWO GOES HERE, ETC...]
           </div2>
           <div2 type="chapter" n="2.2">
           <head>Chapter II.</head>
           [AGAIN, <div2 type="chapter" n="2.xx"> ELEMENTS WILL
           CONTINUE UNTIL THE END OF VOLUME TWO...]
           </div2>
     </div1>
     </body>
     </text>
     </TEI.2>
Example b. Markup of a chapter, showing placement of page break
     <div1 type="chapter" n="1">
     <head> Marley's Ghost </head>
     <pb n="9" />
     <p>Marley was dead, to begin with. There is no
doubt whatever about that. The register of his burial was
signed by the clergyman, the clerk, the undertaker, and the
chief mourner. Scrooge signed it. And Scrooge's name was good
upon 'Change, for anything he chose to put his hand to.</p>
     [... additional text here ...]
     </div1>
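The <div> rules in Section I (no <div0> or plain <div>, a "type"
attribute on every subdivision, and a <head> immediately after each
divisional marker) can be checked mechanically. The sketch below is
one possible approach, not a project tool; check_divs is a
hypothetical name. It assumes the fragment is well-formed XML, and
Python's ElementTree parser will reject undeclared entity references
such as &eacute;, so real files would need those declared or resolved
first.

```python
import re
import xml.etree.ElementTree as ET

NUMBERED_DIV = re.compile(r"^div[1-9]$")


def check_divs(xml_text: str) -> list[str]:
    """Return a list of Section I rule violations found in a TEI fragment."""
    problems = []
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag in ("div", "div0"):
            problems.append(f"<{elem.tag}> is not allowed; use <div1>, <div2>, ...")
            continue
        if not NUMBERED_DIV.match(elem.tag):
            continue
        if "type" not in elem.attrib:
            problems.append(f"<{elem.tag}> lacks a type attribute")
        children = list(elem)
        leading_text = (elem.text or "").strip()
        # The first thing inside the div must be a <head> element.
        if leading_text or not children or children[0].tag != "head":
            problems.append(f"<{elem.tag}> is not immediately followed by <head>")
    return problems
```

Run against Example b above (with the bracketed placeholder removed),
the function should return an empty list, since the <head> comes
first and the <pb> falls inside the <div1>.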
2. VERSE
Example a. One of a collection of poems
<div1 type="fit" n="1">
     <head> Fit the First: THE LANDING </head>
     <pb n="45" />
           <lg type="stanza">
           <l>"Just the place for a Snark!" the Bellman cried,</l>
           <l rend="indent">As he landed his crew with care;</l>
           <l>Supporting each man on the top of the tide</l>
           <l rend="indent">By a finger entwined in his hair.</l>
           </lg>
           <pb n="46" />
           <lg type="stanza">
           <l>"Just the place for a Snark! I have said it twice:</l>
           <l rend="indent">That alone should encourage the crew.</l>
           <l>Just the place for a Snark! I have said it thrice:</l>
           <l rend="indent">What I tell you three times is true."</l>
           </lg>
           [ETC....]
     </div1>
3. DRAMA
Example 1. Tagging of parts of the text of King Lear.
<text>
<front>
     <div1 type="Dramatis Personae"><head>Dramatis Personae</head>
           <list>
           <item>LEAR king of Britain </item>
           <item>KING OF FRANCE</item>
           <item>DUKE OF BURGUNDY</item>
           <item>DUKE OF CORNWALL</item>
           <item>DUKE OF ALBANY</item>
           <item>EARL OF KENT</item>
           <item>EARL OF GLOUCESTER</item>
           <item>EDGAR son to Gloucester.</item>
           <item>EDMUND bastard son to Gloucester.</item>
           <item>CURAN a courtier.</item>
           <item>Old Man tenant to Gloucester.</item>
           <item>Doctor</item>
           <item>Fool</item>
           <item>OSWALD steward to Goneril.</item>
           <item>A Captain employed by Edmund. </item>
           <item>Gentleman attendant on Cordelia. </item>
           <item>A Herald.</item>
           <item>Servants to Cornwall.</item>
           <item>GONERIL, REGAN, CORDELIA } daughters to Lear.</item>
           <item>Knights of Lear's train, Captains, Messengers,
Soldiers, and Attendants</item>
           </list>
     <p><stage>Scene: Britain.</stage></p>
     </div1>
     </front>
     <body>
     <div1 type="act" n="1">
     <head>Act 1</head>
           <div2 type="scene" n="1.1">
           <head>Scene 1</head>
<p><stage>King Lear's palace.</stage></p>
<p><stage>Enter KENT, GLOUCESTER, and EDMUND</stage></p>
<sp><speaker>KENT</speaker>
           <p>I thought the king had more affected the
Duke of Albany than Cornwall. </p></sp>
<sp><speaker>GLOUCESTER</speaker>
           <p>It did always seem so to us: but now, in
the division of the kingdom, it appears not which of the dukes
he values most; for equalities are so weighed, that curiosity
in neither can make choice of either's moiety. </p></sp>
<sp><speaker>KENT</speaker>
           <p>Is not this your son, my lord? </p></sp>
<sp><speaker>GLOUCESTER</speaker>
           <p>His breeding, sir, hath been at my charge:
I have so often blushed to acknowledge him, that now I am brazed
to it.</p></sp>
<sp><speaker>KENT</speaker>
           <p>I cannot conceive you.</p></sp>
<sp><speaker>GLOUCESTER</speaker>
           <p>Sir, this young fellow's mother
could: whereupon she grew round-wombed, and had, indeed, sir,
a son for her cradle ere she had a husband for her bed. Do
you smell a fault?</p></sp>
           [ETC......]
           </div2>
     </div1>
     </body>
</text>
4. LETTER
<TEI.2>
<teiHeader>
[TEI Header goes here]
</teiHeader>
<text>
<body>
<div1 type="letter">
<pb n="1" />
<opener>
Manassas junction <lb />
Oct. 8th 1861<lb />
<lb />
Dear Cousin
</opener>
<p>I write afew lines this morning to inform you that
I am well at this time and hopeing that it may find you all
injoying the same blesing. The health of our company is
better at this time than it has bin for some time. </p>
<p>I have no news of intrust to write to you. It is
thought that we will have a battle in a few days. Its reported
that thay was fighting yesterday at fawls Church. I dont know
wether it was so or not. One of the Danville Grays was upto
see us last night. He said the yankees was in four miles
of them. Thay are stationed at Farfax Court House six miles
a head of us. It is thought that we will have a verry hard
battle when it does come off, I received a letter from Addie
<ptr target="1"> [1]</ptr> last eavning.
It afforded me great pleasure to hear that he was improveing
so fast.</p>
<p>I will ad no more at present so good bye.</p>
<pb n="2" />
<closer>
Write soon to<lb />
your affectionate Cousin<lb />
<lb />
James Booker
</closer>
</div1>
</body>
<back>
<div1 type="notes">
<head>Notes</head>
<note n="1">[1] "Addie"
probably refers to Drury Addison Blair (1839-1864), the Bookers'
cousin. Blair joined Company D when it was formed in May of
1861, but was discharged due to chronic bronchitis in August
of 1861 (Gregory 81). See James Booker's letter of July 14,
1861, in which "A. Blair" includes a postscript
to Chloe Unity Blair. </note>
</div1>
</back>
</text>
</TEI.2>
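In the letter above, the <ptr target="1"> in the body is paired with
<note n="1"> in the back matter. A small consistency check along the
following lines could catch pointers whose note is missing. This is a
sketch only: unmatched_pointers is a hypothetical helper, and it
assumes, as in the example, that a pointer's "target" value is the
same string as the "n" value of its note.

```python
import xml.etree.ElementTree as ET


def unmatched_pointers(xml_text: str) -> list[str]:
    """Return the target values of <ptr> tags that have no matching <note n=...>."""
    root = ET.fromstring(xml_text)
    note_numbers = {
        note.get("n") for note in root.iter("note") if note.get("n") is not None
    }
    return [
        ptr.get("target", "")
        for ptr in root.iter("ptr")
        if ptr.get("target") not in note_numbers
    ]
```

An empty list means every footnote number in the body has a
corresponding note; a <ptr> with no "target" attribute is reported as
an empty string.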
III. TAGGING OF THE TEI HEADER
A. List of Tags to Include (with an indication of the larger elements
within which they occur)
| <fileDesc> | | | |
| <titleStmt> | | | |
| | <title> | | transcribe main title from title page, followed by bracketed phrase: [an electronic transcription] |
| | <author> | | last name, first name (transcribe from title page) |
| <respStmt> | | | |
| | <resp> | | |
| | | <name> | include tags for both creation of electronic text and creation of TEI markup |
| <extent> | | | approximate size in bytes |
| <publicationStmt> | | | |
| | <publisher> | | "Columbia University Libraries" |
| | <pubPlace> | | "New York" |
| | <date> | | current year |
| <seriesStmt> | | | "Columbia Virtual Reading Room Texts" |
| <sourceDesc> | | | |
| | <biblFull> | | transcribe info below from title page |
| | | <titleStmt> | |
| | | <title> | complete title statement from t.p. |
| | | <author> | complete author statement from t.p. |
| | | <respStmt> | this area for editor, translator, etc. rather than main author |
| | | <resp> | |
| | | <name> | |
| | | <extent> | pagination of print original |
| | | <publicationStmt> | publication information as it appears on title page, etc. |
| | | <pubPlace> | |
| | | <publisher> | |
| | | <date> | |
B. SAMPLE HEADER FOR MACHIAVELLI'S PRINCE
<teiHeader>
    <fileDesc>
        <titleStmt>
            <title>
                The Prince [a machine-readable transcription]
            </title>
            <author>
                Machiavelli, Niccolo
            </author>
            <respStmt>
                <resp>
                    Creation of machine-readable version:
                </resp>
                <name>
                    Core Curriculum Office, Columbia University
                </name>
                <resp>
                    Conversion to TEI.2-conformant markup:
                </resp>
                <name>
                    Virtual Reading Room Project, Columbia
                    University Libraries
                </name>
            </respStmt>
        </titleStmt>
        <extent>
            ca. 220 Kb
        </extent>
        <publicationStmt>
            <publisher>
                Columbia University Libraries
            </publisher>
            <pubPlace>
                New York
            </pubPlace>
            <date>
                2000
            </date>
        </publicationStmt>
        <seriesStmt>
            <p>
                Columbia Virtual Reading Room Texts
            </p>
        </seriesStmt>
        <sourceDesc>
            <biblFull>
                <titleStmt>
                    <title level="m">
                        The prince
                    </title>
                    <author>
                        Niccolo Machiavelli
                    </author>
                    <respStmt>
                        <resp>
                            Editor and Translator
                        </resp>
                        <name>
                            David Wootton
                        </name>
                    </respStmt>
                </titleStmt>
                <extent>
                    xlvi, 83 p.
                </extent>
                <publicationStmt>
                    <publisher>
                        Hackett Pub. Co.
                    </publisher>
                    <pubPlace>
                        Indianapolis
                    </pubPlace>
                    <date>
                        c1995
                    </date>
                </publicationStmt>
            </biblFull>
        </sourceDesc>
    </fileDesc>
</teiHeader>
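Because the publisher, place, and series values are fixed for every
VRR text, the electronic-edition part of the header lends itself to
generation. The sketch below is illustrative only and build_header is
a hypothetical name. It omits <respStmt> and <sourceDesc>, which must
be transcribed per title, and it takes the bracketed phrase as a
parameter because the tag list and the sample header word it
differently.

```python
import xml.etree.ElementTree as ET


def build_header(title, author, year, size,
                 phrase="[an electronic transcription]"):
    """Build a partial <teiHeader> with the fixed VRR publication values."""
    header = ET.Element("teiHeader")
    file_desc = ET.SubElement(header, "fileDesc")

    title_stmt = ET.SubElement(file_desc, "titleStmt")
    ET.SubElement(title_stmt, "title").text = f"{title} {phrase}"
    ET.SubElement(title_stmt, "author").text = author  # last name, first name

    ET.SubElement(file_desc, "extent").text = size

    pub = ET.SubElement(file_desc, "publicationStmt")
    ET.SubElement(pub, "publisher").text = "Columbia University Libraries"
    ET.SubElement(pub, "pubPlace").text = "New York"
    ET.SubElement(pub, "date").text = str(year)

    series = ET.SubElement(file_desc, "seriesStmt")
    ET.SubElement(series, "p").text = "Columbia Virtual Reading Room Texts"
    return header
```

Calling ET.tostring(build_header("The Prince", "Machiavelli, Niccolo",
2000, "ca. 220 Kb"), encoding="unicode") yields a single-line string
that can be pasted into a file and completed by hand.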