IBM Books

Administration Guide


LOAD, IMPORT, and EXPORT File Formats

Four types of files can be imported to a database, and three types can be exported or loaded. The type indicates the format of the data within the operating system file. The supported file formats are:

DEL
Delimited ASCII, for exchanging files with a wide variety of industry applications, especially other database products. This is a commonly used way of storing data that separates column values with a special delimiting character.

ASC
Non-delimited ASCII for importing or loading data from other applications that create flat text files with aligned column data.

WSF
Work-Sheet formats, for exchange with products such as Lotus** 1-2-3** and Symphony**. The LOAD utility does not support this data type. The database manager supports WSF files generated and/or supported by:

IXF
PC version of the Integrated Exchange Format, the preferred method for exchange within the database manager. Use PC/IXF to export data from a table so it can be imported later into the same or another table.

For DEL, WSF, and ASC data file formats, define the table, including its column names and data types, before importing the file. The data types in the operating system file fields are converted into the corresponding type of data in the database table. The IMPORT utility accepts data with minor incompatibility problems, including character data imported with possible padding or truncation, and numeric data imported into different types of numeric fields.

For IXF data file formats, the table does not need to exist before beginning the import. It can be automatically created when the data is imported. User-defined distinct types (UDTs) are not made part of the new table column types; instead, the base type is used.

Similarly, when exporting to the IXF data file format, UDTs are stored as base data types in the IXF file.

When working with typed tables and using the PC/IXF data file format, the types and tables do not need to exist before beginning the IMPORT command. The types and tables can be automatically created when the data is imported. User-defined distinct types (UDTs) are not made part of the new table column types; instead, the base type is used.

The following topics describe these file formats:

For more information on using these formats, see the Command Reference .

Delimited ASCII (DEL) File Format

A DEL file is a sequential ASCII file with row and column delimiters. It can be used to exchange data with a variety of products using different column delimiters.

Each DEL file is a stream of ASCII characters consisting of cell values ordered by row and then by column. Rows in the data stream are separated by row delimiters. Within a row, the individual cell values are separated by column delimiters. When a file is defined as DEL, spaces that precede the first character or follow the last character of a cell value are discarded.

You can override the default characters for the column delimiter (,), the character string delimiter ("), and the decimal point (.).

The following is an example of a DEL file:

   "Smith, Bob",4973,15.46
   "Jones, Bill",12345,16.34
   "Williams, Sam",452,193.78

Each line ends with a row delimiter which is the end-of-line indicator used by the operating system. In the case of UNIX-based implementations, the end-of-line indicator is an ASCII line feed (LF) character. In the case of Intel-based implementations, the end-of-line indicator is an ASCII carriage return/line feed (CRLF) sequence. Each line ends with a line feed (LF) character which is the row delimiter. In this example, a row is "Smith, Bob",4973,15.46.

Quotes (that is, character string delimiters: ") are required so that the commas in the names are not interpreted as being column delimiters. In the example DEL file above, the first column contains "Smith, Bob" "Jones, Bill" "Williams, Sam".

If you change the column delimiter to a semicolon (;), the character string delimiter to a single quote ('), and the decimal point character to a comma (,), the same file would appear as follows:

   'Smith, Bob';4973;15,46
   'Jones, Bill';12345;16,34
   'Williams, Sam';452;193,78

When importing or exporting DEL files, keep in mind the following:

When exporting DEL files, all dates by default are in YYYYMMDD format. To get ISO format (YYYY-MM-DD), specify DATEISO in the FILETMOD attribute.

Code Page Considerations: When you are importing or exporting a DEL file, the code page for the data is assumed to be the same as that of the application executing the utility. If it is different, unpredictable results may occur. When loading a DEL file, the code page for the data is assumed to be the same as that of the database.

Any graphic data extracted (using EXPORT) by a client running under Japanese or Traditional-Chinese EUC code pages will be encoded using the EUC encoding rather than the UCS-2 internal representation when it is written to the file. Any graphic data imported to (using IMPORT) or loaded by (using LOAD) clients running under these code pages will be converted from the EUC encoding to the UCS-2 internal representation before the data is inserted or loaded, respectively, into the database.

Nondelimited ASCII (ASC) File Format

An ASC file is a sequential ASCII file with row delimiters. It can be used to exchange data with any ASCII product that can create data in a columnar format, including word processors.

Each ASC file is a stream of ASCII characters consisting of data values organized by row and column. Rows in the data stream are separated by a row delimiter, which is the end-of-line indicator used by the operating system. In the case of UNIX-based implementations, the end-of-line indicator is an ASCII line feed (LF) character. In the case of Intel-based implementations, the end-of-line indicator is an ASCII carriage return/line feed (CRLF) sequence. If the RECLEN=x option is used, each "x" characters is considered one row.

Each column within a row is defined by a beginning-ending location pair. Each pair represents locations specified as byte positions within a row. (The first position within a row is byte position 1.) The first element of each location pair is the byte within the row where the column begins and the second element is the byte where the column ends. The columns may overlap. Within one ASCII file, every row has the same column definitions.

No special processing is done for column names. Each row is considered to be data, which means that ASC files are assumed to have no row or column names.

See the API Reference and the Command Reference for more information about ASCII file formats used for import.

Code Page Considerations: When you are importing an ASC file, the code page for the data is assumed to be the same as that of the application executing the utility. If it is different, unpredictable results may occur. When loading an ASC file, the code page for the data is assumed to be the same as that of the database.

Any graphic data extracted (using EXPORT) by a client running under Japanese or Traditional-Chinese EUC code pages will be encoded using the EUC encoding rather than the UCS-2 internal representation when it is written to the file. Any graphic data imported to (using IMPORT) or loaded by (using LOAD) clients running under these code pages will be converted from the EUC encoding to the UCS-2 internal representation before the data is inserted or loaded, respectively, into the database.

WSF File Format

Lotus 1-2-3 and Symphony products use the same basic format, with additional functions added at each new release. The database manager supports the subset of the worksheet records that are the same for all the Lotus products. That is, for the releases of Lotus 1-2-3 and Symphony products supported by the database manager, all file names with any three-character extension are accepted, for example: WKS, WK1, WRK, WR1, WJ2.

Each WSF file represents one worksheet. The database manager uses the following conventions to interpret worksheets and to provide consistency in worksheets generated by its export operations:

In order to create a file that is compliant with WSF format, some loss of data may occur when exporting from a table into a file with WSF format.

Code Page Considerations: Data in the WSF files use a Lotus code point mapping that is not necessarily the same as existing code pages supported by DB2. As a result, when importing or exporting a WSF file, data is converted from the Lotus code points to/from the code points used by the application code page. DB2 supports conversion between the Lotus code points and code points defined by code pages 437, 819, 850, 860, 863, and 865.
Note:For multi-byte character set users, no conversions are performed.

PC/IXF File Format

The personal computer (PC) version of the IXF format is a specific format used by the database manager. IMPORT and LOAD accept only PC/IXF files, not host IXF files. PC/IXF is a structured description of a database table that contains an external representation of the internal table. Data exported in PC/IXF format can be imported into another DB2 for Universal Database product database. The code page value stored in the IXF file must pass code page checks with the application environment and database. The IMPORT utility can be invoked with the parameter settings indicating that code page mismatches are to be ignored.

Keep the following rules in mind when importing PC/IXF files into tables and views:

Code Page Considerations: A PC/IXF file does not have to be using the same code page as the application running the import or load utility. The code page of the data in the PC/IXF file is stored in the file.

If the PC/IXF file and the application performing the import or load are using the same code page, processing occurs as for a regular application. If they are using different code pages, processing depends on how the import or load utility were invoked:

When exporting a PC/IXF file using the LOBSINFILE option and then importing or loading to a client having a different code page, any CLOBs or DBCLOBs are not converted. The CLOBs and DBCLOBs are kept in separate files when the rest of the data is imported or loaded. To properly import or load CLOB and DBCLOB data, the utility must be used as an application having the same code page as the PC/IXF file.

Any graphic data imported to (using IMPORT) or loaded by (using LOAD) clients running under Japanese or Traditional-Chinese Extended Unix Code (EUC) code pages will be assumed to be encoded using the UCS-2 code set. Mixed character data is assumed to be encoded using the EUC code set. Similarly, any graphic data extracted (using EXPORT) by clients running under either of the two EUC code pages remains encoded as UCS-2. This is done to improve performance.


[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]