Background:

The Perlin Papers are a collection of about 250,000 pages relating to the investigation, trial, and execution of Julius and Ethel Rosenberg. In 1951, the Rosenbergs were found guilty of conspiracy to violate the Espionage Act of 1917; they were executed in 1953. The papers were declassified in the 1970s through the efforts of the two Rosenberg children, Michael and Robert Meeropol, and their lawyer, Marshall Perlin (a 1942 graduate of Columbia Law School), who invoked the Freedom of Information Act. The papers were then given to Columbia University's Law School as a gift, in the hope of circulating the documents more widely.

For more information regarding the history of the trial, click here.

Picture of Ethel and Julius Rosenberg

Ossining train station, the day of the Rosenberg executions

Current Project:

In the early 1990s, about 150,000 of the 250,000 pages were scanned into digital images under the direction of Professor James Hoover of the Columbia University Law School. This work was originally part of the JANUS digital library project at Columbia University, funded by the Strategic Investment Fund of the Office of the Vice-Provost. Willem Scholten, now director of the Center for Technology in the Public Library, initiated the scanning project. Scanning has continued over the past few years, but the Perlin Papers project itself became dormant.

The Center for Research on Information Access (CRIA), headed by Professor Judith Klavans, is now reviving the project and aims to finish the work that was started about six years ago. David Millman, Manager of Digital Library projects at Columbia's Academic Information Systems division, and Brian Donnelly, Instructional Services Librarian at the Law Library, are working with Kris Concepcion, a senior in the Department of Computer Science, to bring the papers to the public.

The Medium:

We are attempting to create a searchable index of the Perlin Papers on the World Wide Web. The main task is twofold: convert the TIFF files (resolution of 3392 x 4400) to GIF files so that web browsers can display the images, and use Optical Character Recognition (OCR) techniques to create text files from these images so that the search engine can search efficiently through the large amount of data available.
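
To illustrate why plain text files make the collection searchable, here is a minimal sketch of an inverted index over OCR output. It is not the project's search engine, which is not specified here. The page identifiers, the sample text, and the function names (tokenize, build_index, search) are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the sorted page ids that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    # Use .get so a lookup never adds empty entries to the index.
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical page ids and OCR text, invented for illustration only.
sample_pages = {
    "vol27-p001": "Memorandum regarding the trial record",
    "vol27-p002": "Record of the appeal and the trial transcript",
    "vol28-p001": "Correspondence about the appeal",
}

index = build_index(sample_pages)
print(search(index, "trial record"))
```

Because the index is built once from the text files, a query only has to look up each word rather than rescan every page, which matters for a collection of this size.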

Click here to view the recent error report on OCR text that we created.

Click here to view the test-suite that will be sent to Lucent-Bell Laboratories for OCR testing.

Click here to view the sample Nuremberg Trial data that was used as a preliminary test for the Perlin Paper Project.

The Papers:

Please note that the OCR text links do not work yet.

Volume 27 (24 pages)
Volume 28 (77 pages)
Volume 29 Part 1 (46 pages)
Volume 29 Part 2 (39 pages)
Volume 29 Part 3 (16 pages)
Volume 30 (40 pages)
Volume 31 (95 pages)
Volume 32 (6 pages)
Volume 33 (4 pages)
Volume 34 (30 pages)
Volume 35 (40 pages)
Volume 36 Part 1 (26 pages)
Volume 36 Part 2 (60 pages)
Volume 37 Part 1 (15 pages)
Volume 37 Part 2 (51 pages)
Volume 38 Part 1 (35 pages)
Volume 39 (57 pages)
Volume 40 (124 pages)
Volume 41 (132 pages)
Volume 42 (107 pages)
Volume 43 (20 pages)

Welcome to the Perlin Papers On-line

Comments to: Whitney Bagnall last updated: Apr. 5, 2006