Loading CSV files into Oracle tables using PL/SQL

There are various ways to load flat files into Oracle tables, for example using popular GUI tools such as TOAD, PL/SQL Developer, SQL Developer, and many more.

But we can also do this with a few other techniques, such as external tables or the UTL_FILE package.
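For instance, an external table maps a flat file directly onto a queryable table. A minimal sketch is below; the directory ORA_DIR, the file temp.csv, and the column list are assumptions matching the example used later in this post:

```sql
-- Minimal external-table sketch (assumes directory ORA_DIR and file temp.csv)
CREATE TABLE temp_ext (
  col1 NUMBER,
  col2 VARCHAR2(40),
  col3 DATE
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY ORA_DIR
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    SKIP 1
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    -- date parsing follows session NLS settings unless a date mask is given
  )
  LOCATION ('temp.csv')
);

-- Loading then becomes plain SQL:
-- INSERT INTO temp SELECT * FROM temp_ext;
```

The trade-off is that external tables need DDL per file layout, whereas the UTL_FILE approach shown in this post can be made generic.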

Here I am going to show how to use the UTL_FILE package to read a CSV file from an Oracle directory, load it into the target table, and finally rename the CSV by appending the sysdate.

Renaming the file makes it easy to tell from the file name's date or timestamp when the source flat file was imported into the Oracle table.

To understand all of the above, let us walk through a complete example.

Let us create a sample target table:

SQL> CREATE TABLE temp
 ( col1 NUMBER,
 col2 VARCHAR2(40),
 col3 DATE
 );

Table created.

SQL>

Create a CSV file with the below sample data:


COL1,COL2,COL3
1,AA,2012-01-01
2,BB12,2012-02-02
3,CCcc11,2012-08-22
4,DD,2012-08-22

(The dates are in YYYY-MM-DD format so that they match the date mask the load function uses for DATE columns.)

Put the CSV file into an Oracle directory; if you do not have one, create it as below:


SQL>CREATE OR REPLACE DIRECTORY ORA_DIR AS 'C:\temp'; -- Change the path as per your wish

SQL> GRANT READ, WRITE ON DIRECTORY ORA_DIR TO << Your User >>;
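To confirm the directory object and its OS path are visible to your user, you can query the standard dictionary view:

```sql
-- Lists the directory object and its path, if visible to your user
SELECT directory_name, directory_path
  FROM all_directories
 WHERE directory_name = 'ORA_DIR';
```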

Now here is the function to read the CSV file, load it into the temp table we created earlier, and finally rename the source file as <<File Name>>_DD_MON_RR.csv.

After many comments on my older post, I thought of making it a little better and more generic for loading CSV files into my tables. Earlier, I used to write an insert script for the target table and execute it to load the file.

Now if I have to load 3 files into 3 different tables, I would have to write 3 different procedures. So why not just pass the table name and file name as parameters, load directly into the target table, and finally rename each file by appending the date?

It also helps to have a table that tracks which files have been loaded and how many records were processed, plus a log table for discarded records, so we can analyze why those records failed.

Check whether the user has access to UTL_FILE:


SQL> SELECT owner, object_type
       FROM all_objects
      WHERE object_name = 'UTL_FILE';

If this returns no rows, the user cannot see UTL_FILE.
If the user does not have the privilege, grant execute on UTL_FILE from the SYS user:

SQL> GRANT EXECUTE ON UTL_FILE TO <>;
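As a quick smoke test of both the directory and UTL_FILE grants (the file name test.txt here is just an assumption), you can try writing a small file to the directory:

```sql
-- Writes a one-line file to ORA_DIR; fails if the directory object
-- or the UTL_FILE grant is missing
DECLARE
  f utl_file.file_type;
BEGIN
  f := utl_file.fopen('ORA_DIR', 'test.txt', 'w');
  utl_file.put_line(f, 'hello');
  utl_file.fclose(f);
END;
/
```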

To make use of this, I have created two tables: an import_hist table to track the history of each load, and a badlog table to capture any records rejected during the load.

Under the same user, create an error log table to capture the records that error out during the import:

SQL> CREATE TABLE badlog
(
table_name VARCHAR2(200),
errm VARCHAR2(4000),
data VARCHAR2(4000),
error_date TIMESTAMP
);

 

 

Under the same user, create a load history table to record the file and table names for each import, along with the number of records loaded:

 
SQL> create table IMPORT_HIST
(
FILENAME varchar2(200),
TABLE_NAME varchar2(200),
NUM_OF_REC number,
IMPORT_DATE DATE
);

 

 

One of my team members, an intern, took this as a challenge to correct the code on my blog and make it fully generic, handling pipe-delimited files as mentioned above.

Thanks a ton Vani for your contribution…!!!

CREATE OR REPLACE FUNCTION LOAD_CSV ( p_table in varchar2,
 p_dir in varchar2 DEFAULT 'ORA_DIR' ,
 P_FILENAME in varchar2,
 p_ignore_headerlines IN INTEGER DEFAULT 1,
 p_delimiter in varchar2 default ',',
 p_optional_enclosed in varchar2 default '"' )
 return number
 is
/***************************************************************************
-- FUNCTION LOAD_CSV
-- PURPOSE: This function reads the data from a CSV file,
-- loads it into the target Oracle table,
-- and finally renames the source file with the date appended.
--
-- P_TABLE
-- The name of the target table
--
-- P_FILENAME
-- The name of the flat file (a text file)
--
-- P_DIR
-- Name of the directory where the file has been placed.
-- Note: the user must be granted access to the directory
-- before executing the function
--
-- P_IGNORE_HEADERLINES
-- Number of header lines to skip (pass 1 to skip the header row).
--
-- P_DELIMITER
-- By default the delimiter is used as ','
-- As we are using CSV file to load the data into oracle
--
-- P_OPTIONAL_ENCLOSED
-- By default the optionally enclosed is used as '"'
-- As we are using CSV file to load the data into oracle
--
-- AUTHOR:
-- Sloba
-- Version 1.0
-- Vani (bobba.vani31@gmail.com)
-- Version 1.1
**************************************************************************/
 l_input utl_file.file_type;
 l_theCursor integer default dbms_sql.open_cursor;
 l_lastLine varchar2(4000);
 l_cnames varchar2(4000);
 l_bindvars varchar2(4000);
 l_status integer;
 l_cnt number default 0;
 l_rowCount number default 0;
 l_sep char(1) default NULL;
 L_ERRMSG varchar2(4000);
 V_EOF BOOLEAN := false;

 begin

 l_cnt := 1;
 for TAB_COLUMNS in (
 select column_name, data_type from user_tab_columns where table_name=p_table order by column_id
 ) loop
 l_cnames := l_cnames || tab_columns.column_name || ',';
 -- date columns are parsed with a fixed YYYY-MM-DD HH24:MI:SS mask; change it to match your file
 l_bindvars := l_bindvars || case when tab_columns.data_type in ('DATE', 'TIMESTAMP(6)') then 'to_date(:b' || l_cnt || ',''YYYY-MM-DD HH24:MI:SS''),' else ':b'|| l_cnt || ',' end;

 l_cnt := l_cnt + 1;
 end loop;
 l_cnames := rtrim(l_cnames,',');
 L_BINDVARS := RTRIM(L_BINDVARS,',');

 L_INPUT := UTL_FILE.FOPEN( P_DIR, P_FILENAME, 'r' );
 IF p_ignore_headerlines > 0
 THEN
 BEGIN
 FOR i IN 1 .. p_ignore_headerlines
 LOOP
 UTL_FILE.get_line(l_input, l_lastLine);
 END LOOP;
 EXCEPTION
 WHEN NO_DATA_FOUND
 THEN
 v_eof := TRUE;
 end;
 END IF;

 if not v_eof then
 dbms_sql.parse( l_theCursor, 'insert into ' || p_table || '(' || l_cnames || ') values (' || l_bindvars || ')', dbms_sql.native );

 loop
 begin
 utl_file.get_line( l_input, l_lastLine );
 exception
 when NO_DATA_FOUND then
 exit;
 end;

 if length(l_lastLine) > 0 then
 for i in 1 .. l_cnt-1
 LOOP

 dbms_sql.bind_variable( l_theCursor, ':b'||i,
 rtrim(rtrim(ltrim(ltrim(
 REGEXP_SUBSTR(l_lastline,'(^|,)("[^"]*"|[^",]*)',1,i),p_delimiter),p_optional_enclosed),p_delimiter),p_optional_enclosed));
 end loop;
 begin
 l_status := dbms_sql.execute(l_theCursor);
 l_rowCount := l_rowCount + 1;
 exception
 when OTHERS then
 L_ERRMSG := SQLERRM;
 insert into BADLOG ( TABLE_NAME, ERRM, data, ERROR_DATE )
 values ( P_TABLE,l_errmsg, l_lastLine ,systimestamp );
 end;
 end if;
 end loop;

 dbms_sql.close_cursor(l_theCursor);
 utl_file.fclose( l_input );
 commit;
 end if;
 insert into IMPORT_HIST (FILENAME,TABLE_NAME,NUM_OF_REC,IMPORT_DATE)
 values ( P_FILENAME, P_TABLE,l_rowCount,sysdate );

 UTL_FILE.FRENAME(
 P_DIR,
 P_FILENAME,
 P_DIR,
 REPLACE(P_FILENAME,
 '.csv',
 '_' || TO_CHAR(SYSDATE, 'DD_MON_RRRR_HH24_MI_SS') || '.csv' -- HH24 precludes an AM/PM meridian indicator
 ));
 commit;
 RETURN L_ROWCOUNT;
 end LOAD_CSV;
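The field-splitting REGEXP_SUBSTR call inside the loop can be tried standalone. Here is an illustrative check of occurrence 2 of the same pattern against a made-up sample line:

```sql
-- Occurrence i of the pattern returns field i, still carrying the
-- leading delimiter and optional enclosing quotes
SELECT REGEXP_SUBSTR('3,"CC,cc11",22-Aug-12',
                     '(^|,)("[^"]*"|[^",]*)', 1, 2) AS raw_field2
  FROM dual;
```

This returns the raw token for the second field (including the leading comma and the quotes), which the nested ltrim/rtrim calls in the function then strip off before binding.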

Here is a sample PL/SQL block which calls the LOAD_CSV function to read the flat file (i.e. the CSV file) from the 'ORA_DIR' directory and rename the source file:

DECLARE
 p_table VARCHAR2(200) := <>;     -- your target table name
 p_dir VARCHAR2(200) := <>;       -- your Oracle directory name
 p_filename VARCHAR2(200) := <>;  -- your CSV file name
 v_return NUMBER;
BEGIN
 v_return := LOAD_CSV(p_table => p_table
 ,p_dir => p_dir
 ,p_filename => p_filename
 ,p_ignore_headerlines => 1
 ,p_delimiter => ','
 ,p_optional_enclosed => '"');

 dbms_output.put_line('v_return = ' || v_return);
END;
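After a run, the two tracking tables created earlier can be checked directly:

```sql
-- Load history: which file went into which table, how many rows, and when
SELECT filename, table_name, num_of_rec, import_date
  FROM import_hist
 ORDER BY import_date DESC;

-- Rejected lines with the Oracle error message, for failure analysis
SELECT error_date, errm, data
  FROM badlog
 ORDER BY error_date DESC;
```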

You can modify this and utilize it as you wish.

Thanks ,

Sloba


111 thoughts on "Loading CSV files into Oracle tables using PL/SQL"

  1. Hello ,
    how can we capture the file name from where the data is being loaded into table – when loading data using sql loader ??

    thanks and regards,
    Nitya

    • If you look at the function, the filename is already being captured. Check the below lines from the code provided:

      function ....
      ....
      ….
      insert into IMPORT_HIST (FILENAME,TABLE_NAME,NUM_OF_REC,IMPORT_DATE)
      values ( P_FILENAME, P_TABLE,l_rowCount,sysdate );
      ….
      end;

    • You can determine the number of columns and their names from a flat file before loading it, but how does that help with the load? The target table must already exist with a specific set of columns.

  2. One of my columns stores a value generated by a sequence. How can I handle this case using this generic API? Should I pass null for this column in the CSV file and allow a trigger to insert the value? Will this work? Any suggestions?

  3. hey sloba…I tried to use your code and it works perfectly fine. But my table has some default values assigned to it. So, those default values are not getting inserted as a result of your code. Is there something I need to change for this? Can you let me know?

  4. Hi Sloba,

    I did use your code and it works perfectly fine. But if there are any not-null columns in the table with default values defined when creating the table, your code for some reason does not apply those defaults. It inserts nulls instead.

    Is there a way to get over this?

    Regards,
    Sowmya.K

  5. Please find the structure below.

    DROP TABLE XXNRECA.TEST CASCADE CONSTRAINTS;

    CREATE TABLE XXNRECA.TEST
    (
    GUID VARCHAR2(36 BYTE) NOT NULL,
    CS_PERSON_ID VARCHAR2(50 BYTE) NOT NULL,
    USER_NAME VARCHAR2(50 BYTE) ,
    LAST_NAME VARCHAR2(100 BYTE) NOT NULL,
    FIRST_NAME VARCHAR2(100 BYTE) NOT NULL,
    MIDDLE_NAME VARCHAR2(100 BYTE) ,
    GENDER VARCHAR2(1 BYTE) NOT NULL,
    PREFERRED_NAME VARCHAR2(100 BYTE),
    VETERAN_STATUS VARCHAR2(100 BYTE),
    DISABILITY VARCHAR2(100 BYTE),
    ETHNIC_ORIGIN VARCHAR2(100 BYTE),
    ADDRESS_LINE_1 VARCHAR2(100 BYTE) NOT NULL,
    ADDRESS_LINE_2 VARCHAR2(100 BYTE),
    CITY VARCHAR2(50 BYTE) NOT NULL,
    STATE VARCHAR2(2 BYTE) NOT NULL,
    ZIP_CODE VARCHAR2(10 BYTE) NOT NULL,
    COUNTRY VARCHAR2(30 BYTE),
    ADDRESS_TYPE VARCHAR2(20 BYTE) NOT NULL,
    ORG_ID NUMBER,
    POSITION VARCHAR2(100 BYTE) NOT NULL,
    LOCATION_ID NUMBER,
    VACANCY VARCHAR2(50 BYTE),
    SUPERVISOR_ID NUMBER,
    CO_COST_CENTER VARCHAR2(50 BYTE) NOT NULL,
    SALARY_BASIS VARCHAR2(20 BYTE),
    SALARY_NEW_VALUE NUMBER NOT NULL,
    HIRE_DATE DATE NOT NULL,
    SEAT_NUMBER VARCHAR2(50 BYTE) NOT NULL,
    GRADE VARCHAR2(10 BYTE),
    DISPLAY_JOB_TITLE VARCHAR2(100 BYTE),
    PHONE_MOBILE VARCHAR2(20 BYTE),
    PERSONAL_EMAIL_ADDRESS VARCHAR2(50 BYTE),
    ASSIGNMENT_CATEGORY VARCHAR2(40 BYTE),
    EX_EMP_NUMBER NUMBER,
    EX_CWK_NUMBER NUMBER,
    ORACLE_PERSON_ID NUMBER,
    ASSIGNMENT_ID NUMBER,
    RECEIVED_FROM_CS_DATE DATE DEFAULT SYSDATE,
    LOADED_TO_ORACLE_DATE DATE ,
    --CS_REHIRE_FLAG VARCHAR2(10 BYTE),
    LOAD_IN_ORACLE_FLAG VARCHAR2(5 BYTE) DEFAULT 'Y' NOT NULL,
    CREATION_DATE DATE DEFAULT SYSDATE NOT NULL ,
    CREATED_BY NUMBER(15) NOT NULL,
    LAST_UPDATE_DATE DATE DEFAULT SYSDATE NOT NULL ,
    LAST_UPDATED_BY NUMBER(15) NOT NULL,
    LAST_UPDATE_LOGIN NUMBER(15) NOT NULL,
    REQUEST_ID NUMBER(15),
    PERSON_API_SUCCESS VARCHAR2(1 BYTE),
    PERSON_API_MESSAGE VARCHAR2(1000 BYTE),
    ASSIGNMENT_API_SUCCESS VARCHAR2(1 BYTE),
    ASSIGNMENT_API_MESSAGE VARCHAR2(1000 BYTE),
    SALARY_API_SUCCESS VARCHAR2(1 BYTE),
    SALARY_API_MESSAGE VARCHAR2(1000 BYTE),
    PHONE_API_SUCCESS VARCHAR2(1 BYTE),
    PHONE_API_MESSAGE VARCHAR2(1000 BYTE),
    ADDRESS_API_SUCCESS VARCHAR2(1 BYTE),
    ADDRESS_API_MESSAGE VARCHAR2(1000 BYTE),
    EMP_FINAL_PROC_API_SUCCESS VARCHAR2(1 BYTE),
    EMP_FINAL_PROC_API_MESSAGE VARCHAR2(1000 BYTE),
    REHIRE_EMPLOYEE_API_SUCCESS VARCHAR2(1 BYTE),
    REHIRE_EMPLOYEE_API_MESSAGE VARCHAR2(1000 BYTE),
    TERM_CWK_API_SUCCESS VARCHAR2(1 BYTE),
    TERM_CWK_API_MESSAGE VARCHAR2(1000 BYTE),
    ETHNIC_API_SUCCESS VARCHAR2(1 BYTE),
    ETHNIC_API_MESSAGE VARCHAR2(1000 BYTE),
    VERSION_NUMBER NUMBER DEFAULT 1 NOT NULL ,
    NOTES_TO_HR VARCHAR2(2000 BYTE),
    SSN VARCHAR2(20 BYTE),
    DOB DATE,
    BENEFIT_GROUP VARCHAR2(30 BYTE),
    ASSIGNMENT_GROUP VARCHAR2(30 BYTE),
    ADJUSTED_SERVICE_DATE DATE,
    ADDRESS_LINE_3 VARCHAR2(100 BYTE),
    REA VARCHAR2(30 BYTE),
    JOB VARCHAR2(50 BYTE),
    PAYROLL VARCHAR2(30 BYTE),
    ASSIGNMENT_STATUS VARCHAR2(50 BYTE),
    ASSIGNMENT_NUMBER VARCHAR2(10 BYTE),
    GRE VARCHAR2(50 BYTE),
    REPORTING_ESTABLISHMENT VARCHAR2(50 BYTE),
    ASSIGNMENT_REASON VARCHAR2(40 BYTE),
    SAL_REASON VARCHAR2(70 BYTE),
    NEW_HIRE_FLAG VARCHAR2(1 BYTE),
    RE_HIRE_FLAG VARCHAR2(1 BYTE),
    HIRE_SCENARIO VARCHAR2(25 BYTE),
    REHIRE_PERSON_TYPE VARCHAR2(25 BYTE)
    )
    TABLESPACE APPS_TS_TX_DATA
    PCTUSED 0
    PCTFREE 10
    INITRANS 1
    MAXTRANS 255
    STORAGE (
    INITIAL 64K
    NEXT 1M
    MINEXTENTS 1
    MAXEXTENTS UNLIMITED
    PCTINCREASE 0
    BUFFER_POOL DEFAULT
    )
    LOGGING
    NOCOMPRESS
    NOCACHE
    MONITORING;

    CREATE OR REPLACE SYNONYM APPS.TEST FOR XXNRECA.TEST;
    CREATE OR REPLACE SYNONYM APPSRO.TEST FOR XXNRECA.TEST;

    -- as xxnreca
    GRANT ALTER, DELETE, INDEX, INSERT, REFERENCES, SELECT, UPDATE, ON COMMIT REFRESH, QUERY REWRITE, DEBUG, FLASHBACK ON XXNRECA.TEST TO APPS;

    • OK, I think you are not passing the default values, which is why it is failing. Did you pass DEFAULT instead of NULL for the columns where you want the default values applied?

      Example:

      drop table my_csv_load_test;

      create table my_csv_load_test
      (
      col1 varchar2(100),
      col2 varchar2(100),
      col3 varchar2(100) default 'Y' not null
      );

      insert into my_csv_load_test values ('1','first',null);

      Error starting at line : 10 in command -
      insert into my_csv_load_test values ('1','first',null)
      Error report -
      ORA-01400: cannot insert NULL into ("SLOBA"."MY_CSV_LOAD_TEST"."COL3")

      insert into my_csv_load_test values ('1','first',default);

      1 row inserted.

      commit;

      Commit complete.

      select * from my_csv_load_test;
