SQL*Loader is a bulk loader utility used for moving data from external files(flat files) into the Oracle database.
SQL*Loader reads from a data file(s)(usually a flat file) and a control file(file that contains description of the data to be loaded and it has nothing to do with database control files) . During loading , process information regarding loading is written to log file, bad rows are to bad file and discarded rows to the discard file.
SQL*Loader environment
SQL*Loader Control File
The control file is a text file that contains DDL instructions and provides the following information to SQL*Loader:
- The name and location of the input data file
- The format of the records in the input data file
- The name of the table(s) to be loaded
- Table and field list information
- SQL*Loader configuration(memory management, record rejection criteria, interrupted .
- The names and locations of bad file and discard file
SQL*Loader Options:
Following are few commonly used Options in control file:
SKIP = n (Number of records to be skipped before loading)
ERRORS = n (Number of erroneous records allowed )
ROWS = n (Number of rows before committing )
DIRECT = {TRUE | FALSE}
PARALLEL = {TRUE | FALSE}
If there is a small amount of data then its possible for the control file to contain the actual data to be loaded. This is achieved by using BEGINDATA clause before the actual data in the control file.
Case Study for simple control file including data
SQL*Loader Input Data Files
SQL*Loader reads data from one or more files organized as records in one of three formats:
- Fixed-record format
- Variable-record format
- Stream-record format
Fixed-Record Format
A file is in fixed-record format when all records in a data file are the same byte length.
INFILE ‘filename’ “fix n”
The following example shows a control file that specifies a fixed-record format data file. The data file contains 2 physical records. The first record is [0001, abcd], which is exactly nine
bytes long (using a single-byte character set) and the carriage return is the tenth byte.
Control File:
load data
infile ‘example.dat’ “fix 10”
into table example
fields terminated by ’,’
(col1, col2)
input.dat:
0001,abcd
0002,fghi
Variable-Record Format
A file is in variable-record format when the length of each record in a character field is included at the beginning of each record in the data file.
INFILE ‘filename’ “var n” (n is the number of bytes in record length field)
The following example tells SQL*Loader to look for data in the
input.dat data file and to expect variable-record format where the record length fields are 3
bytes long. The example.dat data file consists of two physical records. The first is specified
to be 009 ( nine) bytes long, the second is 010 bytes long (including a one-character
newline).
Control File:
load data
infile ‘example.dat’ “var 3“
into table example
fields terminated by ’,’ optionally enclosed by ’”’
(col1 char(5),col2 char(7))
input.dat:
009hello,cd,
010world,im,
Stream-Record Format
A file is in stream-record format when the records are not specified by size; instead SQL*Loader forms records by scanning for the record terminator.
INFILE ‘filename’ “str terminator_string”
The following example illustrates loading data in stream-record format where the terminator string is specified using a character string, ’|\n’.
Control File:
load data
infile ‘example.dat’ “str ’|\n’“
into table example
fields terminated by ’,’ optionally enclosed by ’”’
(col1 char(5), col2 char(7))
input.dat:
hello,world,|
james,bond,|
* Filler fields in control file are fields that have names but they are not loaded into the table.
Loading Methods
SQL*Loader provides two methods for loading data:
• Conventional path
• Direct path
Conventional Path Load
Conventional path load builds an array of rows to be inserted and uses the SQL INSERT statement to load the data. During conventional path loads, input records are parsed based on field specifications, and an array of records is built and inserted into the table specified in the control file.
Direct Path Load
A direct path load builds blocks of data in memory and saves these blocks directly into the extents allocated for the table being loaded. Online redo log file entries are not generated unless the database is in ARCHIVELOG mode. Direct path loads use the field specifications to build whole Oracle blocks of data, and write the blocks directly to Oracle data files above the high water mark.
The high-water mark is the highest point to which data has been written into the table so far. Direct path load bypasses the database buffer cache and accesses the SGA only for extent management and adjustment of the high-water mark.
Comparing Direct and Conventional Path Loads
Conventional Load |
Direct Path Load |
Uses COMMIT to make changes permanent |
Uses data saves |
Redo entries always generated |
Generates redo only under specific conditions |
Enforces all constraints |
Enforces only PRIMARY KEY, UNIQUE, and NOT NULL |
INSERT triggers fire |
INSERT triggers do not fire |
Can load into clustered tables |
Cannot load into clustered tables |
Other users can make changes to tables |
Other users cannot make changes to tables |
INSERT Triggers are fired |
INSERT Triggers disabled at start and enabled at the end |
Does not lock the table being loaded into so other transactions can change the table |
Locks table in Exclusive mode except when several parallel direct load sessions are used concurrently |
Comparing Direct and Conventional Path Loads
Method of Saving Data
Conventional path loads use SQL processing and a database COMMIT for saving data. The
insertion of an array of records is followed by a commit operation.Direct path loads use data saves to write blocks of data to Oracle data files. The following features differentiate a data save from a COMMIT:
• During a data save, only full database blocks are written to the database.
• The blocks are written after the high-water mark of the table.
• After a data save, the high-water mark is moved.
• Internal resources are not released after a data save.
• A data save does not end the transaction.
• Indexes are not updated at each data save.
Logging Changes
Conventional path loading generates redo entries similar to any DML statement. When using a direct path load, redo entries are not generated if:
• The database is in NOARCHIVELOG mode
• The database is in ARCHIVELOG mode, but logging is disabled.
Logging can be disabled by setting the NOLOGGING attribute for the table or by using the UNRECOVERABLE clause in the control file.
Conventional Vs Direct Path Load (from Shailesh’s Blog )
Useful Links:
Maximizing SQL*Load Performance( from Burleson Consulting )
Oracle FAQ on Orafaq
Intro to SQL*Loader by O’Reilly
Oracle SQL Loader Concepts on Oracle Site