Posts Tagged ‘sql loader control file’

SQL LOADER CONCEPTS

SQL*Loader is a bulk loader utility used for moving data from external files(flat files) into the Oracle database.

SQL*Loader reads from a data file(s)(usually a flat file) and a control file(file that contains description of the data to be loaded and it has nothing to do with database control files) . During loading , process information regarding loading is written to log file, bad rows are to bad file and discarded rows to the discard file.

SQL*Loader environment

SQL*Loader Control File

The control file is a text file that contains DDL instructions and provides the following information to SQL*Loader:

  • The name and location of the input data file
  • The format of the records in the input data file
  • The name of the table(s) to be loaded
  • Table and field list information
  • SQL*Loader configuration(memory management, record rejection criteria, interrupted .
  • The names and locations of bad file and  discard file

SQL*Loader Options:

Following are few commonly used Options in control file:

 SKIP = n (Number of records to be skipped before loading)
  ERRORS = n (Number of erroneous records allowed )
 ROWS = n (Number of rows before committing )
  DIRECT = {TRUE | FALSE}
 PARALLEL = {TRUE | FALSE}

 

If there is a small amount of data then its possible for the control file to contain the actual data to be loaded. This is achieved by using BEGINDATA clause before the actual data in the control file.

Case Study for simple control file including data

SQL*Loader Input Data Files

SQL*Loader reads data from one or more files organized as records in one of three formats:

  • Fixed-record format
  • Variable-record format
  • Stream-record format

Fixed-Record Format

A file is in fixed-record format when all records in a data file are the same byte length.

INFILE  ‘filename’ “fix  n”

The following example shows a control file that specifies a fixed-record format data file. The data file contains 2 physical records. The first record is [0001, abcd], which is exactly nine

bytes long (using a single-byte character set) and the carriage return is the tenth byte.

Control File:

load data

infile ‘example.dat’    “fix 10”

into table example

fields terminated by ’,’

(col1, col2)

input.dat:

0001,abcd

0002,fghi

Variable-Record Format

A file is in variable-record format when the length of each record in a character field is included at the beginning of each record in the data file.
INFILE  ‘filename’ “var  n”   (n is the number of bytes in record length field)

The following example tells SQL*Loader to look for data in the

input.dat data file and to expect variable-record format where the record length fields are 3

bytes long. The example.dat data file consists of two physical records. The first is specified

to be 009 ( nine) bytes long, the second is 010 bytes long (including a one-character

newline).

Control File:

load data

infile ‘example.dat’     “var 3“

into table example

fields terminated by ’,’ optionally enclosed by ’”’

(col1 char(5),col2 char(7))

input.dat:

009hello,cd,

010world,im,

Stream-Record Format

A file is in stream-record format when the records are not specified by size; instead SQL*Loader forms records by scanning for the record terminator.

INFILE  ‘filename’ “str terminator_string”

The following example illustrates loading data in stream-record format where the terminator string is specified using a character string, ’|\n’.

Control File:

load data

infile ‘example.dat’   “str ’|\n’“

into table example

fields terminated by ’,’ optionally enclosed by ’”’

(col1 char(5), col2 char(7))

input.dat:

hello,world,|

james,bond,|

* Filler fields in control file are fields that have names but they are not loaded into the table.

Loading Methods

SQL*Loader provides two methods for loading data:

• Conventional path

• Direct path

Conventional Path Load

Conventional path load builds an array of rows to be inserted and uses the SQL INSERT statement to load the data. During conventional path loads, input records are parsed based on field specifications, and an array of records is built and inserted into the table specified in the control file.

Direct Path Load

A direct path load builds blocks of data in memory and saves these blocks directly into the extents allocated for the table being loaded. Online redo log file entries are not generated unless the database is in ARCHIVELOG mode. Direct path loads use the field specifications to build whole Oracle blocks of data, and write the blocks directly to Oracle data files above the high water mark.

The high-water mark is the highest point to which data has been written into the table so far. Direct path load bypasses the database buffer cache and accesses the SGA only for extent management and adjustment of the high-water mark.

Comparing Direct and Conventional Path Loads

Conventional Load Direct Path Load
Uses COMMIT to make changes permanent Uses data saves
Redo entries always generated Generates redo only under specific conditions
Enforces all constraints Enforces only PRIMARY KEY, UNIQUE, and NOT NULL
INSERT triggers fire INSERT triggers do not fire
Can load into clustered tables Cannot load into clustered tables
Other users can make changes to tables Other users cannot make changes to tables
INSERT Triggers are fired INSERT Triggers disabled at start and enabled at the end
Does not lock the table being loaded into so other transactions can change the table Locks table in Exclusive mode except when several parallel direct load sessions are used concurrently

Comparing Direct and Conventional Path Loads

Method of Saving Data

Conventional path loads use SQL processing and a database COMMIT for saving data. The

insertion of an array of records is followed by a commit operation.Direct path loads use data saves to write blocks of data to Oracle data files. The following features differentiate a data save from a COMMIT:

• During a data save, only full database blocks are written to the database.

• The blocks are written after the high-water mark of the table.

• After a data save, the high-water mark is moved.

• Internal resources are not released after a data save.

• A data save does not end the transaction.

• Indexes are not updated at each data save.

Logging Changes

Conventional path loading generates redo entries similar to any DML statement. When using a direct path load, redo entries are not generated if:
• The database is in NOARCHIVELOG mode

• The database is in ARCHIVELOG mode, but logging is disabled.

Logging can be disabled by setting the NOLOGGING attribute for the table or by using the UNRECOVERABLE clause in the control file.

Conventional Vs Direct Path Load (from Shailesh’s Blog )

Useful  Links:

Maximizing SQL*Load Performance( from Burleson Consulting )

Oracle FAQ on Orafaq

Intro to SQL*Loader by O’Reilly

Oracle SQL Loader Concepts on Oracle Site