【820】Python and R: controlling data types when reading CSV files

Python: specify the data type by column name

import pandas as pd
# Read the "id" column as strings. Without dtype, pandas infers the type,
# so numeric-looking ids would typically be stored as int.
df = pd.read_csv("df.csv", dtype={"id": str})
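To see why this matters, here is a minimal sketch using an in-memory CSV. The sample data and variable names are made up for illustration; they are not from the original df.csv.

```python
import io
import pandas as pd

# Made-up sample data: ids with leading zeros.
csv_text = "id,value\n001,1.5\n002,2.5\n"

# Default behaviour: pandas infers the type, so "001" is parsed as the integer 1.
inferred = pd.read_csv(io.StringIO(csv_text))

# With dtype, the "id" column keeps its original text.
typed = pd.read_csv(io.StringIO(csv_text), dtype={"id": str})

inferred_ids = inferred["id"].tolist()
typed_ids = typed["id"].tolist()
print(inferred_ids)
print(typed_ids)
```

The leading zeros survive only in the second read, which is usually what you want for identifiers, postal codes and similar fields.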

R: use one-letter abbreviations for the column types

df <- readr::read_csv("df_eu_sim.csv", col_types = "Ddc")

Details:

col_types

One of NULL, a cols() specification, or a string. See vignette("readr") for more details.

If NULL, all column types will be inferred from guess_max rows of the input, interspersed throughout the file. This is convenient (and fast), but not robust. If the guessed types are wrong, you'll need to increase guess_max or supply the correct types yourself.

Column specifications created by list() or cols() must contain one column specification for each column. If you only want to read a subset of the columns, use cols_only().

Alternatively, you can use a compact string representation where each character represents one column:

c = character
i = integer
n = number
d = double
l = logical
f = factor
D = date
T = date time
t = time
? = guess
_ or - = skip
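pandas has no built-in equivalent of this compact string, but a rough counterpart can be sketched by translating each letter into read_csv arguments. The helper compact_to_kwargs and the mapping CODE_TO_DTYPE below are hypothetical names, and the choice of pandas dtypes is an assumption, not an official correspondence. The "t" (time) code is not handled, and "n" is treated as a plain float.

```python
import io
import pandas as pd

# Hypothetical mapping from readr's one-letter codes to pandas dtypes.
CODE_TO_DTYPE = {
    "c": "string",
    "i": "Int64",
    "n": "float64",  # assumption: readr's number parser is more lenient than this
    "d": "float64",
    "l": "boolean",
    "f": "category",
}
DATE_CODES = {"D", "T"}   # handled through parse_dates
SKIP_CODES = {"_", "-"}   # column is dropped through usecols


def compact_to_kwargs(columns, col_types):
    """Translate a readr-style compact string into pd.read_csv keyword arguments."""
    if len(columns) != len(col_types):
        raise ValueError("need exactly one type code per column")
    dtype, parse_dates, usecols = {}, [], []
    for name, code in zip(columns, col_types):
        if code in SKIP_CODES:
            continue
        usecols.append(name)
        if code in DATE_CODES:
            parse_dates.append(name)
        elif code in CODE_TO_DTYPE:
            dtype[name] = CODE_TO_DTYPE[code]
        elif code != "?":  # "?" leaves the column to pandas' own inference
            raise ValueError(f"unsupported type code: {code!r}")
    return {"usecols": usecols, "dtype": dtype, "parse_dates": parse_dates}


# Made-up sample data with four columns; the last one is skipped.
csv_text = "day,price,code,note\n2023-03-13,1.5,007,x\n"
kwargs = compact_to_kwargs(["day", "price", "code", "note"], "Ddc_")
df = pd.read_csv(io.StringIO(csv_text), **kwargs)
print(df.dtypes)
```

Unlike readr, this sketch needs the column names up front, because pandas keys dtype by name rather than by position.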

posted on 2023-03-13 17:53 McDelfino
