Technical Recommendations for Digital Imaging Projects

Prepared by the Image Quality Working Group of
ArchivesCom, a joint Libraries/AcIS committee.

Last Revised April 2, 1997

URL: http://www.columbia.edu/acis/dl/imagespec.html


Table of Contents
  1. Introduction
  2. Quick Guide
  3. Explanations and Definitions
  4. References
  5. Sample Sites
  6. Campus Display Resources


Introduction

This document provides recommendations for image quality, file formats, and other capture and storage issues when converting paper, photographic and other physical materials into digital form. Additional documents on the selection of materials for digitization, on how to describe and index the materials being digitized, and on digital library access mechanisms will be added in the future.

These documents are intended for use by faculty, library and computing staff as a guideline for image presentation using the Digital Library. Recommendations have been made based on lessons learned here at Columbia University as well as elsewhere. We recommend you speak to a Library or AcIS staff member (AcIS Help Desk, 854-1919, or e-mail to consultant@columbia.edu) before beginning an imaging project.

The Quick Guide provides a brief overview of file formats, resolutions, pixel depth, etc., with specific recommendations for conversion based on the type of the original documents.

The Explanations and Definitions section provides additional definitions and technical details, and Campus Display Resources describes the capabilities available today to display images using the current Columbia Digital Library resources.

References included at the end of this section provide much greater detail on digital image conversion, file formats, and pointers to sample projects.

Sample Sites include pointers to on-campus projects that provide file format and presentation examples for on-screen and printing purposes.

Advantages of Digital Images

Storing two-dimensional materials in digital formats offers a number of advantages (Ester, 1996, pp. 2-4).


Quick Guide

Different original media types require different digital conversion techniques as well as different file storage formats. This area is evolving, both as conversion techniques improve (better scanners and digital cameras) and as new file formats develop. The following chart represents a set of recommendations derived from national digital library recommendations (Reilly, 1996) and the Columbia large maps project (Gertz, 1994-1995, and Gertz, 1996).

| Media Type | Conversion Method | Resolution | Archive File Format | Screen Presentation Format | Print Presentation Format |
|---|---|---|---|---|---|
| Black & White Text Document | Flatbed Scanner or Digital Camera | 1-bit, 600 dpi | TIFF w/CCITT Fax 4 Compression | GIF, 4-bit, 120 to 200 dpi | Acrobat (PDF), 1-bit, 300 or 600 dpi |
| Illustrations, Maps, Manuscripts, etc. | Flatbed Scanner or Digital Camera | 8-bit grayscale or 24-bit color, 200 to 300 dpi | TIFF | Multiple JPEG, 24-bit, 512x768, 1024x1536, 2048x3072, Quality Level 50 | JPEG, 24-bit, 2048x3072, Quality Level 50-100 |
| 3-dimensional objects to be represented in two dimensions | Digital Camera | 24-bit color, 200 to 300 dpi | TIFF | Multiple JPEG, 24-bit, 512x768, 1024x1536, 2048x3072, Quality Level 50 | JPEG, 24-bit, 2048x3072, Quality Level 50-100 |
| 35mm Black & White or Color slide or negative | PhotoCD or Slide Scanner | 24-bit, 2048x3072 | PhotoCD or TIFF | Multiple JPEG, 24-bit, 512x768, 1024x1536, 2048x3072, Quality Level 50 | JPEG, 24-bit, 2048x3072, Quality Level 50-100 |
| Medium to Large Format photograph, slide, negative, transparency or color microfiche | ProPhotoCD or Drum Scanner | 24-bit, 4096x6144 | PhotoCD or TIFF | Multiple JPEG, 24-bit, Quality Level 50 | JPEG, 24-bit, 4096x6144, Quality Level 50-100 |
| Black & White Microfilm | Microfilm Scanner | 1-bit, 600 dpi | TIFF w/CCITT Fax 4 Compression | GIF, 4-bit, 120 to 200 dpi | PDF, 1-bit, 300 or 600 dpi |
| Black & White Microfilm | Microfilm Scanner | 8-bit, 300 dpi | TIFF | GIF, 8-bit, 120 to 200 dpi | PDF, 8-bit, 300 or 600 dpi |

Explanations and Definitions

Use of Film Intermediaries

Scanning can be done directly from the item, or a film intermediary can be made and scanned. Film intermediaries most commonly include 35 mm slides, 4 x 5 transparencies, microfilm, and single-frame microfiche. If properly made and stored, the film intermediary can act as a preservation copy of the item.

The quality of the intermediary has a direct impact on the quality of the digital image. If the intermediary is poorly made, scratched, faded, or out of focus, the scanned image will be inferior. If the intermediary is of high quality, the scanned image will normally also be of high quality. It is best to use camera negatives whenever possible. Every time a slide or other type of film is duplicated, it loses detail and resolution, and the resulting scan is of poorer quality.

In general, it is better to work from a negative than from a positive not only because of generational loss but because the negative provides a smoother curve in the dynamic range, so that highlights and shadows are handled better (Ester, 1996).

Recommendations

Image Quality for Permanent/Archival Capture

When converting an original to digital form, a high-quality archival digital image should be created which "safeguards the long-term value of images and the investment in acquiring them" (Ester, 1996, p. 11). For presentation, other images may be copied from this archival quality image and stored in different formats and quality levels, the most common being on-screen and printer presentation formats. The following sections describe our recommendations for archival quality capture.
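
As a rough illustration of copying presentation images from a single archival image, the sketch below lists the pixel dimensions obtained by repeatedly halving an archival 2048x3072 capture. This reproduces the 512x768, 1024x1536, and 2048x3072 presentation sizes listed in the Quick Guide. The function name derivative_sizes and the halving rule are assumptions made for illustration only; this document does not prescribe how the derivative sizes are computed.

```python
def derivative_sizes(width, height, levels):
    """Return `levels` (width, height) pairs, smallest first.

    Assumption: each smaller derivative halves both dimensions of the
    next larger one, which matches the sizes listed in the Quick Guide.
    """
    sizes = []
    for _ in range(levels):
        sizes.append((width, height))
        width //= 2
        height //= 2
    # Reverse so the thumbnail-like size comes first.
    return sizes[::-1]


if __name__ == "__main__":
    for w, h in derivative_sizes(2048, 3072, 3):
        print(f"{w}x{h}")
```

Keeping the archival image untouched and always deriving from it means new presentation sizes can be produced later without rescanning the original.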

Tonality (pixel depth or bit-depth)

Bit-depth concerns the number of bits used to convey tonality for each pixel; that is, black and white, gray-scale, or color. In general, the more bits per pixel, the larger the file size.
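
To make the relationship between bit-depth and tonality concrete, the short sketch below computes how many distinct values a pixel can take at the bit-depths used in the Quick Guide. It relies only on the fact that n bits can encode 2 to the power n values; the function name tonal_levels is an illustrative choice, not a term from the references.

```python
def tonal_levels(bits_per_pixel):
    """Number of distinct values one pixel can hold at this bit-depth."""
    return 2 ** bits_per_pixel


if __name__ == "__main__":
    # 1-bit is bi-tonal, 4-bit and 8-bit are typical gray-scale or
    # palette depths, and 24-bit is full color (8 bits per channel).
    for bits in (1, 4, 8, 24):
        print(f"{bits:>2}-bit: {tonal_levels(bits):,} possible values per pixel")
```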

Resolution (dots per inch)

In digital images, resolution typically refers to the number of horizontal and vertical pixels that make up the image. For example, 512x768 refers to 512 pixels across by 768 pixels down. DPI (dots per inch) typically refers to the number of pixels per inch stored by the digital file, but the term is used in several ways. It refers to the number of pixels or dots captured per inch from the original material. It is also used to describe the number of pixels per inch on computer displays and the output quality of printers. The capture sense and the display or printer sense are NOT the same. In this section, we are referring only to capture, which provides us with an effective number of dots per inch relative to the original. Note that when using film intermediaries, careful calculations must be made in order to determine the effective dpi of the source material. For instance, a document 10" across scanned at 600 dpi requires 6000 pixels. If the document is reproduced as a microfiche with an image that is 4" across, it will take 1500 dpi to achieve the same 6000 pixels and the same level of resolution. The Large Maps Report (Gertz, 1994-1995) goes into this in much greater detail.
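
The film intermediary calculation above can be sketched as follows. The function names are hypothetical and the code assumes the intermediary reproduces the full width of the original without cropping; it is meant only to show the arithmetic, not to replace the more detailed treatment in the Large Maps Report.

```python
def pixels_across(width_inches, dpi):
    """Pixels needed to cover a given width at a given capture dpi."""
    return round(width_inches * dpi)


def required_scan_dpi(original_width_inches, target_dpi, film_width_inches):
    """Scanner dpi needed on the film to match target_dpi on the original."""
    return original_width_inches * target_dpi / film_width_inches


def effective_dpi(scan_dpi, film_width_inches, original_width_inches):
    """Dpi relative to the original when the film image is scanned at scan_dpi."""
    return scan_dpi * film_width_inches / original_width_inches


if __name__ == "__main__":
    # Example from the text: a 10" document, 600 dpi target, 4" film image.
    print(pixels_across(10, 600))
    print(required_scan_dpi(10, 600, 4))
    print(effective_dpi(1500, 4, 10))
```

Running the example reproduces the figures in the text: 6000 pixels across, a required film scan of 1500 dpi, and an effective 600 dpi relative to the original.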

Selection of the optimum resolution starts with a determination of what is the smallest meaningful element that must be legible in the end product. When dealing with textual materials, this determination is relatively easy: find the smallest letter, numeral, diacritic, or symbol that must be clearly distinguished. In printed books the smallest textual element is often the superscript footnote numbers or letters with diacritics; with handwritten documents there is a great deal of variation. It is much more difficult to determine what the smallest meaningful element in a photograph or artwork is. In part it depends on who will use the scanned image and in what way. A non-specialist may look at a landscape photograph casually, while a geologist may need to be able to distinguish the stratigraphy of the cliff in the background.

Legibility results from a combination of resolution and bit depth. Resolution concerns the number of pixels or dots per inch (dpi) -- the more pixels, the more detail is captured. Note that the higher the resolution, the larger the file size.

Pixel depth complicates this simple relationship, because an 8-bit pixel captures more information than a 1-bit pixel, and a 24-bit pixel captures even more. This means that it may be possible to use lower resolution with gray-scale and color than with bi-tonal to achieve the same degree of legibility.

Recommendations

File Formats (based on Reilly, 1996, and Gertz, 1996)

We recommend the following image formats for archival storage and for presentation purposes:

Storage Issues

Digital image files may require a great deal of storage space, especially full-color files intended for archival storage purposes. The chart below compares archival and presentation file formats, showing how the use of compression can greatly reduce the amount of space needed to store presentation quality images. The file sizes are estimates for 35mm color slides or negatives:

| File Format | Resolution, bit-depth | File size |
|---|---|---|
| TIFF | 2048x3072, 24-bit | 18,000 Kilobytes |
| PhotoCD | 2048x3072, 24-bit | 4,000 Kilobytes |
| JPEG | 2048x3072, 24-bit | 400 Kilobytes (medium quality) |
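
The TIFF estimate in the chart can be checked with a short calculation, sketched below under the assumption that the TIFF is stored uncompressed and that a Kilobyte means 1024 bytes. The function name uncompressed_kilobytes is illustrative. The PhotoCD and JPEG sizes are taken from the chart as given, since compressed sizes depend on image content and cannot be derived from the dimensions alone.

```python
def uncompressed_kilobytes(width_px, height_px, bits_per_pixel):
    """Size of the raw pixel data in Kilobytes (1 KB = 1024 bytes).

    Assumption: no compression and no file header overhead.
    """
    total_bits = width_px * height_px * bits_per_pixel
    return total_bits / 8 / 1024


if __name__ == "__main__":
    raw_kb = uncompressed_kilobytes(2048, 3072, 24)
    print(f"Uncompressed 2048x3072, 24-bit: {raw_kb:,.0f} KB")

    # Estimated compressed sizes from the chart, in Kilobytes.
    for name, size_kb in (("PhotoCD", 4000), ("JPEG", 400)):
        print(f"{name}: roughly {raw_kb / size_kb:.1f} times smaller")
```

The raw pixel data comes to 18,432 KB, which the chart rounds to 18,000 Kilobytes.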

Conversion Methods

Regardless of whether digital conversion is done in-house or outsourced, great care should be taken to ensure that the conversion process is done properly and that it results in uniform, high-quality digital files. If the conversion process is outsourced, the vendor should provide sample results, and all work should be inspected for quality by in-house staff. If the work is to be done in-house, it is important to read the references below, which include information about scanning, photography, and quality control methods.

Gray and Color Standard Bars

These bars are narrow strips which contain shades of gray from white to black or standard color blocks, plus an inch or meter scale. Their purpose is (1) to give the viewer the scale of the item and (2) to allow the scanner and the viewer to calibrate equipment to permit best possible viewing and printing with accurate color.

When scanning from original objects in gray-scale, the gray standard bar should be included with every scan; when scanning from originals in color, both gray and color bars should be used, since the color bar is used for color accuracy while the gray bar is used to deal with highlights and shadows. When scanning from film intermediaries, the slides or transparencies should be shot with color and gray standard bars in the same way as the originals. Placement of the bars should be consistent to allow them to be automatically cropped out of derived images for certain display purposes, and to minimize the amount of space they consume.

If color accuracy is critical, computer equipment with color accurate displays is also needed.


References

Ester, Michael. Digital Image Collections: Issues and Practice. Washington, D.C.: Commission on Preservation and Access, 1996. To order a copy, see
http://www-cpa.stanford.edu/cpa/publist.html

Gertz, Janet. Oversize Color Images Project, 1994-1995. Washington, D.C.: Commission on Preservation and Access, 1995. HTML version found at
http://www.columbia.edu/dlc/nysmb/reports/phase1.html

Gertz, Janet. Oversize Color Images Project Phase II, 1996. HTML version found at
http://www.columbia.edu/dlc/nysmb/reports/phase2.html

Reilly, James M. Recommendations for the Evaluation of Digital Images Produced from Photographic, Microphotographic, and Various Paper Formats. Washington, D.C.: Library of Congress National Digital Library Project, 1996. PDF copy found at
http://lcweb2.loc.gov/ammem/ipirpt.html


Sample Sites

As part of the Oversized Images Project, over 800 pages of text were digitized from microfilm and placed online. The following links display a sample page in archival TIFF, screen presentation GIF, and print presentation PDF file formats.

http://www.columbia.edu/dlc/nysmb - New York State Museum Bulletins Project
  GIF Screen Presentation Format - Bulletin 80, Page 134
  PDF Print Presentation Format - Bulletin 80, Page 134 (300dpi, 1-bit)
  TIFF Archive Format - Bulletin 80, Page 134 (600dpi, 1-bit)

The Museum Educational Site Licensing Project includes examples of physical objects photographed then digitized for screen presentation purposes.

http://www.columbia.edu/dlc/mesl - Museum Educational Site Licensing Project
  Textile blanket - GIF Thumbnail presentation with three JPEG resolutions available.

The image reserve collection for Art History was converted to digital form from 35mm slides. An outside vendor provided JPEG presentation formats for the first two examples; the third example was converted to presentation formats from PhotoCD.

http://www.columbia.edu/cu/arthistory/courses/huma-c1121/ - Art History Image Reserve Collection
  Michelangelo's David - GIF thumbnail and two JPEG presentation images, from PhotoCD.
  Hunters in the snow, Bruegel - GIF thumbnail and two JPEG presentation images, derived from PhotoCD.
  Kaufman House, Frank Lloyd Wright - GIF thumbnail and four presentation images from PhotoCD.

The Digital Scriptorium Project includes examples of medieval and renaissance manuscripts photographed and then digitized for archival and screen presentation purposes.

http://www.columbia.edu/cu/libraries/indiv/rare/images/ - Medieval and Renaissance Manuscripts
  Plimpton MS 88 (frag.), verso detail - GIF Thumbnail presentation with three JPEG resolutions available.
  Plimpton MS 21, f. 50v - GIF Thumbnail presentation with three JPEG resolutions available.

Campus Display Resources

There are a number of on-campus resources available for viewing digital images. JPEG, GIF, and PDF formats may be displayed on most public-access ColumbiaNet stations and on all public-access personal computers and workstations. In addition, software is readily available to students to view these images from residence hall computers attached to the campus network.

| Machine Type | Quantity | Resolution | Bit-Depth | File Format Support |
|---|---|---|---|---|
| ColumbiaNet Stations | 140 | 1024x768 to 1280x1024 | 8-bit | GIF, JPEG, PDF |
| AcIS HP Workstations | 70 | 1024x768 | 8-bit | GIF, JPEG, PDF, TIFF |
| AcIS Power Macintoshes | 160 | 640x480 to 1280x1024 | 16-bit, 24-bit | GIF, JPEG, PDF, TIFF |
| AcIS Printer Stations (Jake) | 22 | 600dpi printer resolution | 8-bit gray | PostScript (all formats using web browsers, helper apps) |
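
As a hedged illustration of matching presentation images to these displays, the sketch below picks the largest of the Quick Guide JPEG sizes that fits entirely on a screen of a given resolution. The function name largest_fitting and the fit rule are assumptions for illustration; the rule ignores window borders, browser toolbars, and scrolling, all of which affect what a viewer actually sees.

```python
# Screen presentation sizes from the Quick Guide, as (width, height) in pixels.
DERIVATIVES = [(512, 768), (1024, 1536), (2048, 3072)]


def largest_fitting(display_width, display_height, sizes=DERIVATIVES):
    """Return the largest (width, height) that fits the display, or None.

    Assumption: an image "fits" only if both dimensions are no larger
    than the display, with no allowance for scrolling or rotation.
    """
    fitting = [
        (w, h) for (w, h) in sizes
        if w <= display_width and h <= display_height
    ]
    if not fitting:
        return None
    return max(fitting, key=lambda size: size[0] * size[1])


if __name__ == "__main__":
    for display in ((640, 480), (1024, 768), (1280, 1024)):
        print(display, "->", largest_fitting(*display))
```

Under this rule only the 512x768 image fits without scrolling on the 1024x768 and 1280x1024 displays, and none fits on a 640x480 display, which is one reason to offer several sizes along with a thumbnail.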