Using a custom function (def) as the mapping rule for pandas map

 

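My problem: I have a pile of data in which each value begins with one of a known set of strings, and I want to replace those values with numbers.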

 

 

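Originally I used a brute-force approach: build a dictionary first, then pass that dictionary to map as the mapping rule.

But the strings are very long, so it is easy to make a mistake when typing the dictionary keys. Any value that does not exactly match a key comes out as NaN, and I ended up with a lot of NaN values, so I switched to a new approach.

Use a function as the mapping rule instead. Inside the function, the check only needs to look at the beginning of the string, so far fewer values fail to match.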
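To see why a slightly wrong key produces NaN, here is a minimal sketch. The two-entry dictionary and the sample Series are made up for illustration (they are shortened stand-ins, not the real data), and `unmatched` is just a helper name chosen here. The idea is to list the raw strings that failed to map so the offending keys can be found quickly.

```python
import pandas as pd

# Hypothetical, shortened data: the second value has a leading space,
# so it does not exactly match any dictionary key.
s = pd.Series(['Not applicable', ' Has never worked for pay', 'Not applicable'])
mapping = {'Not applicable': 0, 'Has never worked for pay': 1}

# Series.map with a dict requires an exact key match; anything else becomes NaN.
mapped = s.map(mapping)

# Raw strings that were present in the data but produced NaN after mapping.
unmatched = s[mapped.isna() & s.notna()].unique()

print(mapped.isna().sum())   # how many values failed to map
print(list(unmatched))       # which raw strings to inspect
```

Checking `unmatched` right after the map call makes it obvious whether the NaN values come from typos in the keys or from missing data in the column.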

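# These columns appear to hold the occupations of the students' parents.
# Consider whether they really need to be converted to numbers, or whether the
# broader parent-occupation category that appears later is enough for the analysis.
# Note to self: working through this data is itself part of the process.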


def fx_asbg20(x):
    # Map an occupation description to a numeric code by checking only how the string starts.
    if x.startswith('Not applicable'):
        return 0
    elif x.startswith('Has never worked'):
        return 1
    elif x.startswith('Small Business Owner'):
        return 2
    elif x.startswith('Clerical Worker Includes'):
        return 3
    elif x.startswith('Service or Sales Worker'):
        return 4
    elif x.startswith('Skilled Agricultural'):
        return 5
    elif x.startswith('Craft or Trade Worker'):
        return 6
    elif x.startswith('Plant or Machine Operator'):
        return 7
    elif x.startswith('General Laborers'):
        return 8
    elif x.startswith('Corporate Manager'):
        return 9
    elif x.startswith('Professional Includes scientists'):
        return 10
    elif x.startswith('Technician or Associate Professional'):
        return 11
    # No prefix matched: the function returns None, which shows up as a missing value after map.
    return None

#dfASH_avg['ASBH20A']=dfASH_avg['ASBH20A'].map(mapASBH20)
dfASH_avg['ASBH20A']=dfASH_avg['ASBH20A'].map(fx_asbg20)

#dfASH_avg['ASBH20B']=dfASH_avg['ASBH20B'].map(mapASBH20)
dfASH_avg['ASBH20B']=dfASH_avg['ASBH20B'].map(fx_asbg20)

dfASH_avg.head()
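The same prefix rule can also be written as a lookup table plus a loop, which is shorter to maintain than a long if/elif chain. This is only a sketch: `PREFIX_CODES` and `encode_occupation` are names invented here, the prefixes and codes are copied from fx_asbg20 above, and stripping surrounding whitespace and returning None for non-string values (such as NaN) are assumptions about how the data should be treated.

```python
# (prefix, code) pairs copied from fx_asbg20; order matters only if prefixes overlap.
PREFIX_CODES = [
    ('Not applicable', 0),
    ('Has never worked', 1),
    ('Small Business Owner', 2),
    ('Clerical Worker Includes', 3),
    ('Service or Sales Worker', 4),
    ('Skilled Agricultural', 5),
    ('Craft or Trade Worker', 6),
    ('Plant or Machine Operator', 7),
    ('General Laborers', 8),
    ('Corporate Manager', 9),
    ('Professional Includes scientists', 10),
    ('Technician or Associate Professional', 11),
]


def encode_occupation(value):
    """Return the code of the first prefix that starts `value`, or None if none match."""
    if not isinstance(value, str):
        # Assumption: non-string cells (e.g. NaN, which is a float) stay unmapped.
        return None
    text = value.strip()  # assumption: leading/trailing spaces are noise
    for prefix, code in PREFIX_CODES:
        if text.startswith(prefix):
            return code
    return None


print(encode_occupation(' Service or Sales Worker Includes travel attendants'))
print(encode_occupation('Some unexpected answer'))
```

It can be passed to map in the same way as fx_asbg20, for example `dfASH_avg['ASBH20A'].map(encode_occupation)`. Adding a new category then means adding one line to the table rather than another elif branch.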

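Here is the complicated dictionary I originally wrote. It was time-consuming, laborious and inefficient to build; I am keeping it here as a memento.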

mapASBH20={'Has never worked for pay':1,
            'Small Business Owner Includes owners of small businesses (fewer than 25 employees) such as retail shops, services, resta':2,
            'Clerical Worker Includes office clerks; secretaries; typists; data entry operators;customer service cler':3,
            ' Service or Sales Worker Includes travel attendants; restaurant service workers;personal care workers; protective service workers; junior military; salespersons; street vend':4,
            'Skilled Agricultural or Fishery Worker Includes farmers; forestry workers; fishery workers; hunters and trapper':5,
            'Craft or Trade Worker Includes builders, carpenters, plumbers, electricians,metal workers; machine mechanics; handicraft workers':6,
            ' Plant or Machine Operator Includes plant and machine operators;assembly-line operators; motor-vehicle drivers':7,
            'General Laborers Includes domestic helpers and cleaners; building caretakers;messengers, porters, and doorkeepers; farm, fishery,agricultural, and construction workers':8,
            'Corporate Manager or Senior Official Includes corporate managers such as managers of large companies (25 or more employees) or managers of departments within large companies; legislators or senior government officials; senior officials of special-interest organizations; military officers':9,
            'Professional Includes scientists; mathematicians; computer scientists;architects; engineers; life science and health professionals;teachers; legal professionals; police officers; social scientists;writers and artists; religious professionals':10,
            'Technician or Associate Professional Includes science, engineering, and computer associates and technicians; life science and health technicians and assistants;teacher aides; finance and sales associate professionals;business service agents; administrative assistants':11,
            'Not applicable':0
}

 

posted @ 2023-02-14 17:56  今天也是开心的一天呀