Administration Guide

LOAD, IMPORT, and EXPORT File Formats

Four types of files can be imported to a database, and three types can be exported or loaded. The type indicates the format of the data within the operating system file. The supported file formats are:

DEL

Delimited ASCII, for exchanging files with a wide variety of industry applications, especially other database products. This is a commonly used way of storing data that separates column values with a special delimiting character.

ASC

Non-delimited ASCII for importing or loading data from other applications that create flat text files with aligned column data.

WSF

Work-Sheet formats, for exchange with products such as Lotus** 1-2-3** and Symphony**. The LOAD utility does not support this data type. The database manager supports WSF files generated and/or supported by:

Lotus 1-2-3 Release 1, 1A, 2 and 2J
Lotus Symphony Release 1.0 and 1.1

IXF

PC version of the Integrated Exchange Format, the preferred method for exchange within the database manager. Use PC/IXF to export data from a table so it can be imported later into the same or another table.

For DEL, WSF, and ASC data file formats, define the table, including its column names and data types, before importing the file. The data types in the operating system file fields are converted into the corresponding type of data in the database table. The IMPORT utility accepts data with minor incompatibility problems, including character data imported with possible padding or truncation, and numeric data imported into different types of numeric fields.

For IXF data file formats, the table does not need to exist before beginning the import. It can be automatically created when the data is imported. User-defined distinct types (UDTs) are not made part of the new table column types; instead, the base type is used.

Similarly, when exporting to the IXF data file format, UDTs are stored as base data types in the IXF file.

When working with typed tables and using the PC/IXF data file format, the types and tables do not need to exist before beginning the IMPORT command. The types and tables can be automatically created when the data is imported. User-defined distinct types (UDTs) are not made part of the new table column types; instead, the base type is used.

The following topics describe these file formats:

"Delimited ASCII (DEL) File Format"
"Nondelimited ASCII (ASC) File Format"
"WSF File Format"
"PC/IXF File Format"

For more information on using these formats, see the Command Reference .

Delimited ASCII (DEL) File Format

A DEL file is a sequential ASCII file with row and column delimiters. It can be used to exchange data with a variety of products using different column delimiters.

Each DEL file is a stream of ASCII characters consisting of cell values ordered by row and then by column. Rows in the data stream are separated by row delimiters. Within a row, the individual cell values are separated by column delimiters. When a file is defined as DEL, spaces that precede the first character or follow the last character of a cell value are discarded.

You can override the default characters for the column delimiter (,), the character string delimiter ("), and the decimal point (.).

The following is an example of a DEL file:

   "Smith, Bob",4973,15.46
   "Jones, Bill",12345,16.34
   "Williams, Sam",452,193.78

Each line ends with a row delimiter which is the end-of-line indicator used by the operating system. In the case of UNIX-based implementations, the end-of-line indicator is an ASCII line feed (LF) character. In the case of Intel-based implementations, the end-of-line indicator is an ASCII carriage return/line feed (CRLF) sequence. Each line ends with a line feed (LF) character which is the row delimiter. In this example, a row is "Smith, Bob",4973,15.46.

Quotes (that is, character string delimiters: ") are required so that the commas in the names are not interpreted as being column delimiters. In the example DEL file above, the first column contains "Smith, Bob" "Jones, Bill" "Williams, Sam".

If you change the column delimiter to a semicolon (;), the character string delimiter to a single quote ('), and the decimal point character to a comma (,), the same file would appear as follows:

   'Smith, Bob';4973;15,46
   'Jones, Bill';12345;16,34
   'Williams, Sam';452;193,78

When importing or exporting DEL files, keep in mind the following:

For the character string and column delimiters:
- A space (X'20') is never a valid delimiter or column delimiter.
- The period (.) is not a valid character string delimiter, because it conflicts with periods in time and timestamp values.
- When exporting to a DEL file, for the character delimiter string choose a character that does not occur within the data to be exported. An attempt to export character data containing a character string delimiter will cause a warning message. An attempt to import such a file will produce erroneous results.
Import of character strings that are not enclosed in character string delimiters is allowed. The end of a nondelimited character string is determined by the first occurrence of a space, a character string delimiter, or a row delimiter.
A null value is indicated by the absence of a cell value where one would normally occur, or by a string of spaces.
Because some other products restrict the length of character fields, the EXPORT command sends a warning message whenever a character column greater than 254 characters is selected for export. The IMPORT command accommodates fields as long as the longest possible length, which is 32 700 bytes.
When working with DB2 on Intel-based operating systems, the first occurrence of an end-of-file character (X'1A') that is not within a character string indicates the end of the file. No data following the end-of-file character is imported. If the NOEOFCHAR option is specified, this character is ignored.
Integer, decimal, and scientific notation constants can be imported into numeric database columns that are within the proper range.
The acceptable forms for importing date and time data are based on the country code of the target database.

When exporting DEL files, all dates by default are in YYYYMMDD format. To get ISO format (YYYY-MM-DD), specify DATEISO in the FILETMOD attribute.

Code Page Considerations: When you are importing or exporting a DEL file, the code page for the data is assumed to be the same as that of the application executing the utility. If it is different, unpredictable results may occur. When loading a DEL file, the code page for the data is assumed to be the same as that of the database.

Any graphic data extracted (using EXPORT) by a client running under Japanese or Traditional-Chinese EUC code pages will be encoded using the EUC encoding rather than the UCS-2 internal representation when it is written to the file. Any graphic data imported to (using IMPORT) or loaded by (using LOAD) clients running under these code pages will be converted from the EUC encoding to the UCS-2 internal representation before the data is inserted or loaded, respectively, into the database.

Nondelimited ASCII (ASC) File Format

An ASC file is a sequential ASCII file with row delimiters. It can be used to exchange data with any ASCII product that can create data in a columnar format, including word processors.

Each ASC file is a stream of ASCII characters consisting of data values organized by row and column. Rows in the data stream are separated by a row delimiter, which is the end-of-line indicator used by the operating system. In the case of UNIX-based implementations, the end-of-line indicator is an ASCII line feed (LF) character. In the case of Intel-based implementations, the end-of-line indicator is an ASCII carriage return/line feed (CRLF) sequence. If the RECLEN=x option is used, each "x" characters is considered one row.

Each column within a row is defined by a beginning-ending location pair. Each pair represents locations specified as byte positions within a row. (The first position within a row is byte position 1.) The first element of each location pair is the byte within the row where the column begins and the second element is the byte where the column ends. The columns may overlap. Within one ASCII file, every row has the same column definitions.

No special processing is done for column names. Each row is considered to be data, which means that ASC files are assumed to have no row or column names.

See the API Reference and the Command Reference for more information about ASCII file formats used for import.

Code Page Considerations: When you are importing an ASC file, the code page for the data is assumed to be the same as that of the application executing the utility. If it is different, unpredictable results may occur. When loading an ASC file, the code page for the data is assumed to be the same as that of the database.

WSF File Format

Lotus 1-2-3 and Symphony products use the same basic format, with additional functions added at each new release. The database manager supports the subset of the worksheet records that are the same for all the Lotus products. That is, for the releases of Lotus 1-2-3 and Symphony products supported by the database manager, all file names with any three-character extension are accepted, for example: WKS, WK1, WRK, WR1, WJ2.

Each WSF file represents one worksheet. The database manager uses the following conventions to interpret worksheets and to provide consistency in worksheets generated by its export operations:

Cells in the first row (ROW value 0) are reserved for descriptive information about the entire worksheet. All data within this row is optional. It is ignored during import.
Cells in the second row (ROW value 1) are used for column labels.
The remaining rows are data rows (records, or rows of data from the table).
Cell values under any column heading are values for that particular column or field.
A null value is indicated by the absence of a real cell content record (for example, no integer, number, label, or formula record) for a particular column within a row of cell content records.
Note: A row of all nulls will be neither imported nor exported.

In order to create a file that is compliant with WSF format, some loss of data may occur when exporting from a table into a file with WSF format.

Code Page Considerations: Data in the WSF files use a Lotus code point mapping that is not necessarily the same as existing code pages supported by DB2. As a result, when importing or exporting a WSF file, data is converted from the Lotus code points to/from the code points used by the application code page. DB2 supports conversion between the Lotus code points and code points defined by code pages 437, 819, 850, 860, 863, and 865.
Note: For multi-byte character set users, no conversions are performed.

PC/IXF File Format

The personal computer (PC) version of the IXF format is a specific format used by the database manager. IMPORT and LOAD accept only PC/IXF files, not host IXF files. PC/IXF is a structured description of a database table that contains an external representation of the internal table. Data exported in PC/IXF format can be imported into another DB2 for Universal Database product database. The code page value stored in the IXF file must pass code page checks with the application environment and database. The IMPORT utility can be invoked with the parameter settings indicating that code page mismatches are to be ignored.

Keep the following rules in mind when importing PC/IXF files into tables and views:

A non-nullable PC/IXF column can be loaded or imported into a nullable column.
A nullable PC/IXF column can be loaded or imported into a non-nullable column, although some rows may be rejected.
Numeric columns accept columns of any numeric type, although some data may be rejected because it is out of range.
Fixed-length string columns in the PC/IXF file that are too long for the target column are not compatible and are not imported or loaded. Variable-length string columns with actual lengths that are not compatible with the target column are processed according to the compatibility rules used when adding data to a table or view. The data is padded on the right with spaces if necessary.
Date, time, and timestamp columns accept data from PC/IXF columns with matching types and from character PC/IXF columns. Data values from character PC/IXF columns must be valid input values for dates, times, or timestamps for successful insertion into each of the corresponding type columns.
A file with more than 1024 columns will be rejected.
Large object (LOB) files can only go into large objects (LOBs).
Large objects (LOBs) can go into CHAR fields.

Code Page Considerations: A PC/IXF file does not have to be using the same code page as the application running the import or load utility. The code page of the data in the PC/IXF file is stored in the file.

If the PC/IXF file and the application performing the import or load are using the same code page, processing occurs as for a regular application. If they are using different code pages, processing depends on how the import or load utility were invoked:

If the FORCEIN option has been specified, the file code page is ignored and the import or load assume that the data is in the application code page.
If the FORCEIN option is not specified, the results depend on whether a code page conversion table exists for the file code page and the application code page for IMPORT or the database code page for LOAD.
- If a conversion table exists, the IMPORT utility or LOAD utility converts the data, and the utility continues with a warning that the conversion has occurred.
- If there is no conversion table, the IMPORT utility or LOAD utility ends with an error.

When exporting a PC/IXF file using the LOBSINFILE option and then importing or loading to a client having a different code page, any CLOBs or DBCLOBs are not converted. The CLOBs and DBCLOBs are kept in separate files when the rest of the data is imported or loaded. To properly import or load CLOB and DBCLOB data, the utility must be used as an application having the same code page as the PC/IXF file.

Any graphic data imported to (using IMPORT) or loaded by (using LOAD) clients running under Japanese or Traditional-Chinese Extended Unix Code (EUC) code pages will be assumed to be encoded using the UCS-2 code set. Mixed character data is assumed to be encoded using the EUC code set. Similarly, any graphic data extracted (using EXPORT) by clients running under either of the two EUC code pages remains encoded as UCS-2. This is done to improve performance.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]