python 读写 HDFS

pandas dataframe写入hdfs csv文件的两种方式:

1、

from hdfs.client import Client

cleint.write(hdfs_url, df.to_csv(idnex=False), overwrite=True, encoding='utf-8')

2、

with client.write(hdfs_url, overwrite=True) as writer:

  df.to_csv(writer, encoding='utf-8', index=False)

推荐使用方法二,写入效率要比方法一高得多。

 

从hdfs读文本数据
from hdfs.client import Client

client = Client("http://localhost:50070")

filepath="test.txt"
with client.read(filepath) as fs:
  content = fs.read()
  print(content)

 

从hdfs读excel

with client.read(filepath) as fs:
  content = fs.read()
  table = pd.read_excel(content)

posted @ 2020-12-08 17:24  zcsh  阅读(2571)  评论(0编辑  收藏  举报