csv格式的数据导入到oracle数据库

系统环境:centos6.5

oracle版本:oracle11gr2

     由于客户导出的格式是csv格式的,并且数据量比较大,两千多万,使用plsql不太合适。考虑使用oracle客户端工具sqlldr。

     如果提供的有表结构最好,直接导入创建表就行了。如果没有提供,可以先使用wps打开csv文件,根据内容自己去创建表结构。

使用sqlldr有几种导入方式,这里使用direct直接路径加并行的模式。

  1.    先在数据库创建好空表。
  2. 创建ctl文件
options(skip=1,columnarrayrows=20971520,ROWS=10000,READSIZE=20971520,ERRORS=999999999) 
load data
infile '/ora11/tmp/zz.csv' 
append into table "*****_RECORD" 
fields terminated by ','
Optionally enclosed by '"'
(id,record_organ_code,record_organ_name)

1行是导入参数配置,
已经是支持大量数据导入的参数方案。

第3行infile指定导入的文件是USER_INFO.csv;

第4行into table前面的insert表示导入方式:

insert :默认方式,在导入记录前要求表为空;
append :在表中追加新导入的记录;
replace :删除旧记录(等价delete from table语句),替换成新导入的记录;
truncate:删除旧记录(等价truncate table语句),替换成新导入的记录;
into table后面指定导入数据库表USER_INFO,
且表名必须大写;
第5行指定每一行的字段是以逗号(,)分隔;

第6行指定字段是用两个分号(')包围起来的,可选的;

最后一行对应导入的字段,
注意如果导入的是时间字段,
需要指明时间转换的格式(在这里有坑,要注意格式。本人在实施的过程中已经被坑过了。格式不对会报

SQL*Loader-951: Error calling once/load initialization
ORA-02373: Error parsing insert statement for table ESNS."JWXZZ_COVID_RECORD".
ORA-00936: missing expression

 

)。

正确转换格式写法

birth_date date "yyyy-mm-dd",

 

    3、使用sqlldr导入

sqlldr userid=esns/***** control=/ora11/tmp/loadfile_csv.ctl direct=true parallel=true

4、导入的相关日志

SQL*Loader: Release 11.2.0.1.0 - Production on Sun Feb 27 23:39:42 2022

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Control File:   /ora11/tmp/loadfile_csv.ctl
Data File:      /ora11/tmp/zz.csv
  Bad File:     /ora11/tmp/zz.bad
  Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 1
Errors allowed: 999999999
Continuation:    none specified
Path used:      Direct - with parallel option.

Table "JWXZZ_COVID_RECORD", loaded from every logical record.
Insert option in effect for this table: APPEND

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID                                  FIRST     *   ,  O(") CHARACTER            
RECORD_ORGAN_CODE                    NEXT     *   ,  O(") CHARACTER            
RECORD_ORGAN_NAME                    NEXT     *   ,  O(") CHARACTER            
CHILD_CODE                           NEXT     *   ,  O(") CHARACTER            
CHILD_NAME                           NEXT     *   ,  O(") CHARACTER            
GENDAR                               NEXT     *   ,  O(") CHARACTER            
            

SQL*Loader-281: Warning: ROWS parameter ignored in parallel mode.


Table "********_RECORD":
  25730788 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Bind array size not used in direct path.
Column array  rows :   10000
Stream buffer bytes:  256000
Read   buffer bytes:20971520

Total logical records skipped:          1
Total logical records read:      25730788
Total logical records rejected:         0
Total logical records discarded:        0
Total stream buffers loaded by SQL*Loader main thread:     2786
Total stream buffers loaded by SQL*Loader load thread:    24336

Run began on Sun Feb 27 23:39:42 2022
Run ended on Sun Feb 27 23:43:52 2022

Elapsed time was:     00:04:09.46
CPU time was:         00:02:55.86

参考文档&感谢:https://blog.csdn.net/bugzeroman/article/details/103031317

posted @ 2022-02-27 17:51  wazz_s  阅读(1932)  评论(0编辑  收藏  举报