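Converting Unicode escapes to Chinese in pandas
A column holds literal Unicode escape sequences (such as \u6253) that need to be converted into the Chinese characters they represent.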
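import pandas as pd
# Sample rows: tags_name holds literal backslash-u escapes, not real Chinese characters
data = [{'ds': 20200621, 'tags_name': '(\\u6253\\u91ce")"'},
        {'ds': 20200621, 'tags_name': '(10-19\\u5e01""'},
        {'ds': 20200621, 'tags_name': '(10-19\\u5e01")"'},
        {'ds': 20200621, 'tags_name': '(11-20\\u5e01""'},
        {'ds': 20200621, 'tags_name': '(11-20\\u5e01")"'}]
df = pd.DataFrame(data)
# Strip spaces
df['tags_name'] = df['tags_name'].str.replace(' ', '', regex=False)
# Normalise the escape prefix \U to \u (only the prefix, not every capital U)
df['tags_name'] = df['tags_name'].str.replace('\\U', '\\u', regex=False)
# Decode the escapes; this round trip is only safe while the strings are ASCII-only
df['tags_name'] = df['tags_name'].apply(lambda x: x.encode('utf-8').decode('unicode_escape'))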
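Key points:
- Converting the Unicode escapes to Chinese: x.encode('utf-8').decode('unicode_escape')
- Python needs \U replaced with \u first, because \U expects eight hex digits while \u takes four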
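The two points above can be sketched as one small helper. This is a sketch under stated assumptions, not the code from this post: decode_unicode_escapes is a hypothetical name, and it swaps the encode/decode round trip for a regular-expression substitution, so that text which already contains Chinese characters passes through unchanged. It assumes every escape has exactly four hex digits and does not combine surrogate pairs.

```python
import re

# Matches a literal backslash, then u or U, then exactly four hex digits.
_ESCAPE = re.compile(r"\\[uU]([0-9a-fA-F]{4})")


def decode_unicode_escapes(text: str) -> str:
    """Replace each literal \\uXXXX (or \\UXXXX) with the character it names.

    Hypothetical helper for illustration; anything that is not a
    four-digit escape is left untouched.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Illustrative inputs: two escaped tags and one that is already Chinese.
samples = ["(\\u6253\\u91ce)", "(10-19\\U5e01)", "已是中文"]
decoded = [decode_unicode_escapes(s) for s in samples]
print(decoded)
```

On a DataFrame the same helper could be applied with df['tags_name'].apply(decode_unicode_escapes). Because the pattern accepts both \u and \U, the separate replace step is not needed in this variant.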

