(转)谈论 读取Excel丢失内容的问题

原文出自:http://freeoh.spaces.live.com/blog/cns!57853FFFBF81D887!1185.entry?wa=wsignin1.0

引用 :读取Excel丢失内容的问题

通常读取Excel文件的方法有两种,一是通过ADO的OleDb,一是通过Com组件;
这里要说的是通过OleDb读取时,遇到的内容丢失问题。
症状:使用Microsoft.Jet.OLEDB.4.0读取数据时,当某一字段内分别含有文本和数字的混合数据时,某一类型的数据会产生丢失。
读取方式:
string connString = " Provider=Microsoft.Jet.OLEDB.4.0; Data Source=文件名; Extended Properties='Excel 8.0;'";
OleDbConnection conn = new OleDbConnection(connString);
string sql = " SELECT * FROM [Sheet1$] ";
conn.Open();
OleDbDataAdapter adapter = new OleDbDataAdapter(sql, conn);
DataSet ds = new DataSet();
adapter.Fill(ds, "[Sheet1$]");
案例1
Excel数据如下:(一共9行)
"入库指令单号","SF公司","物料编号","实收数量","包装","批号","生产日期","入库时间","PO#","状态"
"11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
"11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
......
"11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
"11-20080808-000001","01","xxxx","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
读取结果并不理想,第9行的xxxx数据丢失了
案例2
Excel数据如下:(一共9行)
"入库指令单号","SF公司","物料编号","实收数量","包装","批号","生产日期","入库时间","PO#","状态"
"11-20080808-000001","01","xxxx","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
"11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
......
"11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
"11-20080808-000001","01","xxxx","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
读取结果也不理想,第1行和第9行的xxxx数据都丢失了

问题分析:
  产生这种问题的根源与Excel ISAM(Indexed Sequential Access Method,即索引顺序存取方法)驱动程序的限制有关,Excel ISAM 驱动程序通过检查前几行中实际值确定一个 Excel 列的类型,然后选择能够代表其样本中大部分值的数据类型。也即Excel ISAM查找某列前几行(默认情况下是8行),把占多的类型作为其处理类型。例如如果数字占多,那么其它含有字母等文本的数据项就会置空;相反如果文本居多,纯数字的数据项就会被置空。
现具体分析Extended Properties项中的HDR和IMEX所代表的含义。HDR用来设置是否将Excel表中第一行作为字段名,“YES”代表是,“NO”代表不是即也为数据内容;IMEX是用来告诉驱动程序使用Excel文件的模式,其值有0、1、2三种,分别代表导出、导入、混合模式。当我们设置IMEX=1时将强制混合数据转换为文本,但仅仅这种设置并不可靠,IMEX=1只确保在某列前8行数据至少有一个是文本项的时候才起作用,它只是把查找前8行数据中数据类型占优选择的行为作了略微的改变。例如某列前8行数据全为纯数字,那么它仍然以数字类型作为该列的数据类型,随后行里的含有文本的数据仍然变空。

解决方案1
前提,你的Excel包括Header,且Header都是文本,如案例1和案例2
使用:Provider=Microsoft.Jet.OLEDB.4.0; Data Source=文件名; Extended Properties='Excel 8.0;HDR=NO;IMEX=1'
IMEX=1将强制混合数据转换为文本,HDR=NO将第一行作为内容,由于第一行Header都是文本,因此所有列的类型都将转换成文本。
解决方案2
如果你的Excel不包含Header,且某列的数据又是混合数据,这种情况只能通过Com组件来读取Excel。

使用OleDb读取Excel的条件总结:
只有一条,保证Excel中每一列的前8行数据中至少有一个文本。(多适用于有Header的内容)

posted @ 2008-09-09 21:44  代码乱了  阅读(1369)  评论(3编辑  收藏  举报