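Python 3: lexical splitting

1. You can use translate together with the string module

2. You can use jieba for word segmentation (jieba splits the text into words, but I need to break it into segments rather than words, so it is not used here)

3. You can solve it with Python's built-in functions

Only the third approach is shown here; the other two are simpler, so they are left out. On to the code: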

s = input("Search: ")
# Current run for each category:
# py = pinyin (ASCII letters), zw = Chinese, sz = digits, fh = symbols
py = zw = sz = fh = ''
py_list = []
zw_list = []
sz_list = []
fh_list = []
# Index of the last character seen in each category
py_num = zw_num = sz_num = fh_num = 0
for i in range(len(s)):
    # bytes.isalpha() is True only for ASCII letters
    if s[i].encode('UTF-8').isalpha():
        if i == py_num or i - py_num == 1 or py == '':
            # Adjacent to the previous letter (or first letter): extend the run
            py += s[i]
            py_num = i
        else:
            # Not adjacent: close the current run and start a new one
            py_list.append(py)
            py = s[i]
            py_num = i
    elif s[i].isdigit():
        if i == sz_num or i - sz_num == 1 or sz == '':
            sz += s[i]
            sz_num = i
        else:
            sz_list.append(sz)
            sz = s[i]
            sz_num = i
    elif s[i].isalpha():
        # Non-ASCII letters are treated as Chinese
        if i == zw_num or i - zw_num == 1 or zw == '':
            zw += s[i]
            zw_num = i
        else:
            zw_list.append(zw)
            zw = s[i]
            zw_num = i
    else:
        # Everything else counts as a symbol
        if i == fh_num or i - fh_num == 1 or fh == '':
            fh += s[i]
            fh_num = i
        else:
            fh_list.append(fh)
            fh = s[i]
            fh_num = i
# Flush the last run of each category, skipping empty ones
if py:
    py_list.append(py)
if sz:
    sz_list.append(sz)
if zw:
    zw_list.append(zw)
if fh:
    fh_list.append(fh)
print('Digits:{}\nChinese:{}\nPinyin:{}\nSymbols:{}\n'.format(''.join(sz_list), ''.join(zw_list), ''.join(py_list), ''.join(fh_list)))
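
The run-splitting done by the loop above can probably be written more compactly with itertools.groupby. The following is only a sketch, not the code from this post: the function names classify and split_runs are made up for illustration, the category keys reuse the py/sz/zw/fh abbreviations, and str.isascii() assumes Python 3.7 or later.

```python
from itertools import groupby


def classify(ch):
    """Return the category key for a single character."""
    if ch.isascii() and ch.isalpha():
        return 'py'  # ASCII letters (pinyin)
    if ch.isdigit():
        return 'sz'  # digits
    if ch.isalpha():
        return 'zw'  # non-ASCII letters, assumed here to be Chinese
    return 'fh'      # everything else: symbols, punctuation, whitespace


def split_runs(s):
    """Group consecutive characters of the same category into runs."""
    runs = {'py': [], 'sz': [], 'zw': [], 'fh': []}
    for kind, chars in groupby(s, key=classify):
        runs[kind].append(''.join(chars))
    return runs


if __name__ == '__main__':
    for kind, parts in split_runs("abc123你好,d4!").items():
        print(kind, parts)
```

Keeping the classification in one small function means the four near-identical branches collapse into a single loop, and each list holds the separate runs rather than one joined string.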

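For comparison, the first approach from the list at the top (translate plus the string module) could look roughly like this. It is a sketch under assumptions: the function name split_by_category is hypothetical, and anything that is not an ASCII letter, digit, punctuation mark or whitespace character is simply assumed to be Chinese, so full-width punctuation would end up in the Chinese part.

```python
import string


def split_by_category(s):
    """Return (digits, chinese, letters, symbols) extracted from s."""

    def keep(chars):
        # Build a table that deletes every character of s not in `chars`
        drop = ''.join(set(s) - set(chars))
        return s.translate(str.maketrans('', '', drop))

    ascii_symbols = string.punctuation + string.whitespace
    digits = keep(string.digits)
    letters = keep(string.ascii_letters)
    symbols = keep(ascii_symbols)
    # Assumption: whatever is left after deleting the ASCII classes is Chinese
    ascii_all = string.digits + string.ascii_letters + ascii_symbols
    chinese = s.translate(str.maketrans('', '', ascii_all))
    return digits, chinese, letters, symbols


if __name__ == '__main__':
    print(split_by_category("abc123你好,d4!"))
```

Unlike the loop-based version, this only collects the characters of each category in order; it does not keep track of where one run ends and the next begins.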
 

posted @ 2019-02-28 10:58  耳虫