英文文章分词处理

s = '''
PREFACE 
  So long as there shall exist, by virtue of law and custom, decrees of damnation pronounced by society, artificially creating hells amid the civilization of earth, and adding the element of human fate to divine destiny; so long as the three great problems of the century-- the degradation of man through pauperism, the corruption of woman through hunger, the crippling of children through lack of light-- are unsolved; so long as social asphyxia is possible in any part of the world;--in other words, and with a still wider significance, so long as ignorance and poverty exist on earth, books of the nature of Les Miserables cannot fail to be of use. 
  HAUTEVILLE HOUSE, 1862.
Victor Hugo
'''
special_char = [',','.','-','\n',';']
zllist = []

for item in special_char:
    if item in s:
        s = s.replace(item, " ")

for item in s.split(" "):
    if len(item)>0:
        zllist.append(item)

for index, item in enumerate(zllist):
    print(index, item.strip())

 

posted @ 2019-06-20 01:11  n0page404  阅读(172)  评论(0)    收藏  举报