Columbia University Libraries Digital Program
Advanced Papyrological Information System (APIS)


Validator 3.0 Design: 3.2.0 I/O
  

3.2.1 Input:

  • The validator application requires two input parameters, each with its option flag: either -f [filename] or -l [filename], plus the institution, -i [institution]. The -f flag gives the path of a single file, while -l gives the path of a loader file. A loader file lists multiple file paths and is used for batch processing.
    e.g. validator.pl -l myLoaderFile -i columbia
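
The input rule above can be sketched as follows. This is a minimal illustration in Python, not the validator's own Perl code. The function name resolve_inputs is hypothetical, and two details are assumptions the text does not state: that a loader file holds one file path per line, and that supplying both -f and -l together is an error.

```python
from pathlib import Path
from typing import List, Optional


def resolve_inputs(single_file: Optional[str], loader_file: Optional[str]) -> List[str]:
    """Return the list of input file paths to validate."""
    # Assumption: exactly one of -f / -l may be supplied.
    if (single_file is None) == (loader_file is None):
        raise ValueError("supply exactly one of -f or -l")
    if single_file is not None:
        # -f: a single file path.
        return [single_file]
    # -l: assumption: one path per line; blank lines are skipped.
    lines = Path(loader_file).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]
```

With -f the result is a one-element list, so the rest of a batch pipeline can treat both modes the same way.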

3.2.2 Options:

  • -f [input filename], validates a single file
    e.g. validator.pl -f inputFile1 -i columbia
  • -l [loader filename], validates multiple files as specified in the loader file
    e.g. validator.pl -l loaderFile1 -i columbia
  • -m [new master filename], uses the supplied filename instead of the default master file filename
    e.g. validator.pl -f inputFile1 -i columbia -m masterFile1.txt
  • -i [institution], required. Institution-specific settings and checks are based on this entry, which allows for institution-specific exceptions.
    e.g. validator.pl -f inputFile1 -i columbia
  • -e [directory], replaces the default directory for the error files (the root of validator.pl). The directory string must end with a forward slash '/'.
    e.g. validator.pl -f inputFile1 -i columbia -e /errors/today/
  • -d, turns on debug printing. It displays the validation and groupings hashes on screen, which is useful when debugging new code and when reviewing changes to the .dat files. It also gives a more human-readable view of the data behind the scenes.
    e.g. validator.pl -f inputFile1 -i columbia -d
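
The option set above can be modelled with a standard argument parser. The sketch below uses Python's argparse purely as an illustration. It is not how validator.pl itself parses its flags, and the names build_parser and error_dir are hypothetical. Treating -f and -l as mutually exclusive is an assumption; the trailing-slash check on -e follows the rule stated above.

```python
import argparse


def error_dir(value: str) -> str:
    # The -e directory string must end with a forward slash.
    if not value.endswith("/"):
        raise argparse.ArgumentTypeError("error directory must end with '/'")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="validator.pl")
    # Assumption: -f and -l cannot be combined; one of them is required.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-f", dest="input_file", metavar="filename")
    source.add_argument("-l", dest="loader_file", metavar="filename")
    parser.add_argument("-i", dest="institution", required=True)
    parser.add_argument("-m", dest="master_file", default=None)
    parser.add_argument("-e", dest="error_dir", type=error_dir, default=None)
    parser.add_argument("-d", dest="debug", action="store_true")
    return parser


args = build_parser().parse_args(
    ["-f", "inputFile1", "-i", "columbia", "-e", "/errors/today/"]
)
print(args.input_file, args.institution, args.error_dir, args.debug)
```

Leaving -m and -e unset yields None here, standing in for the defaults described above.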

3.2.3 Output:

  • error file - stores the errors for the record being validated. Records that contain no errors have no associated error file.
    e.g. error_sampleFile3_842002_11320.txt
  • master file - the master list of all records being validated. Each record (a separate input file) has its own line with its validation status: whether the record passed and how many errors it has.
    e.g. masterFile_loaderFile2_842002_11320.txt
  • new loader file - all records that pass validation are written to this file. It can then in turn be used to continue the workflow for only those records that passed.
    e.g. loaderFile2_new_842002_11320.txt
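
How the three outputs relate can be sketched as below. This Python illustration is an assumption-laden model, not the validator's implementation: the function name summarize, the tab-separated master line layout, and the error codes in the usage example are all hypothetical, since the actual file formats are not given here.

```python
from typing import Dict, List, Tuple


def summarize(results: Dict[str, List[str]]) -> Tuple[List[str], List[str], List[str]]:
    """Split validation results into master lines, passing and failing records."""
    master_lines: List[str] = []
    passed: List[str] = []
    failed: List[str] = []
    for record, errors in results.items():
        status = "pass" if not errors else "fail"
        # Hypothetical master-file line: record, status, error count.
        master_lines.append(f"{record}\t{status}\t{len(errors)}")
        if errors:
            failed.append(record)   # each of these would get an error file
        else:
            passed.append(record)   # these would go into the new loader file
    return master_lines, passed, failed
```

For example, summarize({"inputFile1": [], "inputFile2": ["E001"]}) puts inputFile1 in the passing list and inputFile2 in the failing list, so only inputFile2 would get an error file.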

3.2.4 Data File Input:

  • definition file - describes the tags and their properties, including items such as whether a tag is required, its relationships, and its validation type
    for apis: APISdefinitions.dat
  • error code file - lists the error codes and their descriptions. These descriptions appear in the validation error file
    for apis: ErrorCodes.dat
  • institution codes - Columbia-defined institution codes, e.g. 'columbia', 'princeton', and 'duke'. This file lists all acceptable schools and each one's exceptions file (a modified definition file).
    InstitutionCodes.dat
  • modified LC language codes - additional language codes that are needed but not specified in LC MARC
    LClanguageCodes.dat
  • LC institution codes - codes from LC MARC that are used in the input files
    LCinstitutionCodes.dat

    *LC refers to the Library of Congress MARC codes


 


Last revision: 09/25/03
© Columbia University