Loading CSV files into Oracle tables using PLSQL

There are various ways to load flat files into Oracle tables like using few popular GUI i.e. TOAD , PLSQL Developer, SQL Developer and many more..

But we can also use this by few other techniques such as using External tables or by using UTL_FILE package.

Here I am going to show how to use UTL_FILE package and read the CSV from the Oracle directory and load into the target table and finally rename the CSV by concatenating with sysdate.

By renaming the file it is always easier to track by file name with date or timestamp like when the source flat file was imported into the oracle table.

So to understand all the above mentioned we will have to go through a complete example.

Let us create a sample target table:

SQL> CREATE TABLE temp
 ( col1 NUMBER,
 col2 VARCHAR2(40),
 col3 DATE
 );

Table created.

SQL>

Create a CSV file with the below sample data:


COL1 COL2 COL3
1 AA 1-Jan-12
2 BB12 2-Feb-12
3 CCcc11 22-Aug-12
4 DD 22-Aug-12

Put the CSV file into the Oracle directory and if you do not have any directory then create it as like below:


SQL>CREATE OR REPLACE DIRECTORY ORA_DIR AS 'C:\temp'; -- Change the path as per your wish

SQL> GRANT READ, WRITE ON DIRECTORY ORA_DIR TO << Your User >>;

Now here is the Procedure to read the CSV file and load it into the temp table which we have created earlier and finally rename the source file with

<<File Name>><<_DD_MON_RR>>.csv  .

After so many comments to my older post , I thought of making it little better and generic way to load the CSV file into my tables. Earlier what I used to do is write the insert script for the target table and execute it to load the file.

Now if I have to load 3 files into 3 different tables then I have to write 3 different procedure to load the files but then thought why not just pass the table name and file name as a parameter and load it directly into my target table and then finally all my files would be renamed like concatenating with date .

Again it would be helpful if I have a table which will have the track of what files are been loaded and how many number of records are been passed along with that if I have a table to get the log like the discarded records then would be very helpful to do further analysis on why the records are getting failed.

Check if the user is having UTL_FILE privileges or not :


SQL> SELECT OWNER,
OBJECT_TYPE
FROM ALL_OBJECTS
WHERE OBJECT_NAME = ‘UTL_FILE’
AND OWNER =<>;

 

 

If the user is not having the privileges then grant “UTL_FILE” to user from SYS user:

SQL> GRANT EXECUTE ON UTL_FILE TO <>;

To make use of the I have created two different tables as below: 

import_hist table and badlog table to track the history of the load and another to check the bad log if it occurs while doing the load .

Under the same user create an error log table to log the error out records while doing the import:

 

SQL> CREATE TABLE badlog
(
table_name VARCHAR2(200)
errm VARCHAR2(4000),
data VARCHAR2(4000) ,
error_date TIMESTAMP
);

 

 

Under the same user create Load history table to log the details of the file and tables that are imported with a track of records loaded:

 
SQL> create table IMPORT_HIST
(
FILENAME varchar2(200),
TABLE_NAME varchar2(200),
NUM_OF_REC number,
IMPORT_DATE DATE
);

 

 

One of my team member who is an intern ,took this as an challenge to make the correction on my blog and to make it very generic like what I mentioned for PIPE delimited files.

Thanks a ton Vani for your contribution…!!!

CREATE OR REPLACE FUNCTION LOAD_CSV ( p_table in varchar2,
 p_dir in varchar2 DEFAULT 'ORA_DIR' ,
 P_FILENAME in varchar2,
 p_ignore_headerlines IN INTEGER DEFAULT 1,
 p_delimiter in varchar2 default ',',
 p_optional_enclosed in varchar2 default '"' )
 return number
 is
/***************************************************************************
-- PROCEDURE LOAD_CSV
-- PURPOSE: This Procedure read the data from a CSV file.
-- And load it into the target oracle table.
-- Finally it renames the source file with date.
--
-- P_FILENAME
-- The name of the flat file(a text file)
--
-- P_DIRECTORY
-- Name of the directory where the file is been placed.
-- Note: The grant has to be given for the user to the directory
-- before executing the function
--
-- P_IGNORE_HEADERLINES:
-- Pass the value as '1' to ignore importing headers.
--
-- P_DELIMITER
-- By default the delimiter is used as ','
-- As we are using CSV file to load the data into oracle
--
-- P_OPTIONAL_ENCLOSED
-- By default the optionally enclosed is used as '"'
-- As we are using CSV file to load the data into oracle
--
-- AUTHOR:
-- Sloba
-- Version 1.0
-- Vani (bobba.vani31@gmail.com)
-- Version 1.1
**************************************************************************/
 l_input utl_file.file_type;
 l_theCursor integer default dbms_sql.open_cursor;
 l_lastLine varchar2(4000);
 l_cnames varchar2(4000);
 l_bindvars varchar2(4000);
 l_status integer;
 l_cnt number default 0;
 l_rowCount number default 0;
 l_sep char(1) default NULL;
 L_ERRMSG varchar2(4000);
 V_EOF BOOLEAN := false;

 begin

 l_cnt := 1;
 for TAB_COLUMNS in (
 select column_name, data_type from user_tab_columns where table_name=p_table order by column_id
 ) loop
 l_cnames := l_cnames || tab_columns.column_name || ',';
 l_bindvars := l_bindvars || case when tab_columns.data_type in ('DATE', 'TIMESTAMP(6)') then 'to_date(:b' || l_cnt || ',"YYYY-MM-DD HH24:MI:SS"),' else ':b'|| l_cnt || ',' end;

 l_cnt := l_cnt + 1;
 end loop;
 l_cnames := rtrim(l_cnames,',');
 L_BINDVARS := RTRIM(L_BINDVARS,',');

 L_INPUT := UTL_FILE.FOPEN( P_DIR, P_FILENAME, 'r' );
 IF p_ignore_headerlines > 0
 THEN
 BEGIN
 FOR i IN 1 .. p_ignore_headerlines
 LOOP
 UTL_FILE.get_line(l_input, l_lastLine);
 END LOOP;
 EXCEPTION
 WHEN NO_DATA_FOUND
 THEN
 v_eof := TRUE;
 end;
 END IF;

 if not v_eof then
 dbms_sql.parse( l_theCursor, 'insert into ' || p_table || '(' || l_cnames || ') values (' || l_bindvars || ')', dbms_sql.native );

 loop
 begin
 utl_file.get_line( l_input, l_lastLine );
 exception
 when NO_DATA_FOUND then
 exit;
 end;

 if length(l_lastLine) > 0 then
 for i in 1 .. l_cnt-1
 LOOP

 dbms_sql.bind_variable( l_theCursor, ':b'||i,
 rtrim(rtrim(ltrim(ltrim(
 REGEXP_SUBSTR(l_lastline,'(^|,)("[^"]*"|[^",]*)',1,i),p_delimiter),p_optional_enclosed),p_delimiter),p_optional_enclosed));
 end loop;
 begin
 l_status := dbms_sql.execute(l_theCursor);
 l_rowCount := l_rowCount + 1;
 exception
 when OTHERS then
 L_ERRMSG := SQLERRM;
 insert into BADLOG ( TABLE_NAME, ERRM, data, ERROR_DATE )
 values ( P_TABLE,l_errmsg, l_lastLine ,systimestamp );
 end;
 end if;
 end loop;

 dbms_sql.close_cursor(l_theCursor);
 utl_file.fclose( l_input );
 commit;
 end if;
 insert into IMPORT_HIST (FILENAME,TABLE_NAME,NUM_OF_REC,IMPORT_DATE)
 values ( P_FILENAME, P_TABLE,l_rowCount,sysdate );

 UTL_FILE.FRENAME(
 P_DIR,
 P_FILENAME,
 P_DIR,
 REPLACE(P_FILENAME,
 '.csv',
 '_' || TO_CHAR(SYSDATE, 'DD_MON_RRRR_HH24_MI_SS_AM') || '.csv'
 ));
 commit;
 RETURN L_ROWCOUNT;
 end LOAD_CSV;

Here is the sample PLSQL block which will call “LOAD_CSV” procedure to read the flat file (i.e. CSV file) from the directory which I have used as ‘ORA_DIR’ and rename the source file:

DECLARE
 p_table VARCHAR2(200) := <>;
 p_dir VARCHAR2(200) := <>;
 p_filename VARCHAR2(200) := <>;
 v_return NUMBER;
BEGIN
 v_return := LOAD_CSV(p_table => p_table
 ,p_dir => p_dir
 ,p_filename => p_filename
 ,p_ignore_headerlines => p_ignore_headerlines
 ,p_delimiter => p_delimiter
 ,p_optional_enclosed => p_optional_enclosed);

 dbms_output.put_line(‘v_return = ‘ || v_return);
END;

You can modify the and utilize it ….

Thanks ,

Sloba

Advertisements

102 thoughts on “Loading CSV files into Oracle tables using PLSQL

  1. Hello ,
    how can we capture the file name from where the data is being loaded into table – when loading data using sql loader ??

    thanks and regards,
    Nitya

    • If you see the function for the below lines the filename is already been captured:

      check the below lines from the code provided :
      fucntion ….
      ….
      insert into IMPORT_HIST (FILENAME,TABLE_NAME,NUM_OF_REC,IMPORT_DATE)
      values ( P_FILENAME, P_TABLE,l_rowCount,sysdate );
      ….
      end;

    • You can determine the number of columns and names from a flat file before loading it but how does that help in loading into the target tables as you must have specific number of columns that already exists where you want to load it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s