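Using a custom function (def) as the mapping rule for pandas map
My problem: I have a column of data where each value starts with a certain string, and I want to replace each value with a number.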

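At first I did it the brute-force way: build a dictionary and pass it to map as the mapping rule.
But the strings are long and there are many of them, so the dictionary was easy to get wrong, and the mapping produced a lot of NaN values. So I switched to a new approach.
I now pass a function as the mapping rule. Inside the function, each value is checked only by the string it starts with, which matches far more reliably.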
# These columns look like the occupation of each student's parents. Consider whether they need to be
# converted to numbers, and whether the broad occupation category alone is enough for the analysis.
# Realizing that working through this data is itself a process.
def fx_asbg20(x):
    if x.startswith('Not applicable'):
        return 0
    elif x.startswith('Has never worked'):
        return 1
    elif x.startswith('Small Business Owner'):
        return 2
    elif x.startswith('Clerical Worker Includes'):
        return 3
    elif x.startswith('Service or Sales Worker'):
        return 4
    elif x.startswith('Skilled Agricultural'):
        return 5
    elif x.startswith('Craft or Trade Worker'):
        return 6
    elif x.startswith('Plant or Machine Operator'):
        return 7
    elif x.startswith('General Laborers'):
        return 8
    elif x.startswith('Corporate Manager'):
        return 9
    elif x.startswith('Professional Includes scientists'):
        return 10
    elif x.startswith('Technician or Associate Professional'):
        return 11
    # A value that matches none of the prefixes falls through and returns None.

#dfASH_avg['ASBH20A']=dfASH_avg['ASBH20A'].map(mapASBH20)
dfASH_avg['ASBH20A'] = dfASH_avg['ASBH20A'].map(fx_asbg20)
#dfASH_avg['ASBH20B']=dfASH_avg['ASBH20B'].map(mapASBH20)
dfASH_avg['ASBH20B'] = dfASH_avg['ASBH20B'].map(fx_asbg20)
dfASH_avg.head()
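The same prefix idea can also be written as a table of (prefix, code) pairs, which keeps the rules in one place. This is only a minimal sketch, not the code from my notebook: the names PREFIX_CODES and code_by_prefix and the sample values are made up for illustration, and it assumes the column may contain missing values or strings with stray leading spaces.

```python
import pandas as pd

# Hypothetical prefix table: (prefix, code) pairs, checked in order.
PREFIX_CODES = [
    ('Not applicable', 0),
    ('Has never worked', 1),
    ('Small Business Owner', 2),
    ('Clerical Worker', 3),
]

def code_by_prefix(x):
    """Return the code of the first matching prefix, or None if nothing matches."""
    if not isinstance(x, str):   # NaN / None have no startswith method
        return None
    x = x.strip()                # assumption: stray spaces should be ignored
    for prefix, code in PREFIX_CODES:
        if x.startswith(prefix):
            return code
    return None

# Made-up sample data, for illustration only.
s = pd.Series(['Not applicable',
               ' Clerical Worker Includes office clerks',
               None,
               'Something else'])
coded = s.map(code_by_prefix)
```

Adding a new category then only means adding one line to the table, and a missing value no longer raises an error inside the function.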
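Here is the complicated dictionary I wrote originally. It took a lot of time and effort and still worked poorly, so I am keeping it here as a memento.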
mapASBH20={'Has never worked for pay':1,
'Small Business Owner Includes owners of small businesses (fewer than 25 employees) such as retail shops, services, resta':2,
'Clerical Worker Includes office clerks; secretaries; typists; data entry operators;customer service cler':3,
' Service or Sales Worker Includes travel attendants; restaurant service workers;personal care workers; protective service workers; junior military; salespersons; street vend':4,
'Skilled Agricultural or Fishery Worker Includes farmers; forestry workers; fishery workers; hunters and trapper':5,
'Craft or Trade Worker Includes builders, carpenters, plumbers, electricians,metal workers; machine mechanics; handicraft workers':6,
' Plant or Machine Operator Includes plant and machine operators;assembly-line operators; motor-vehicle drivers':7,
'General Laborers Includes domestic helpers and cleaners; building caretakers;messengers, porters, and doorkeepers; farm, fishery,agricultural, and construction workers':8,
'Corporate Manager or Senior Official Includes corporate managers such as managers of large companies (25 or more employees) or managers of departments within large companies; legislators or senior government officials; senior officials of special-interest organizations; military officers':9,
'Professional Includes scientists; mathematicians; computer scientists;architects; engineers; life science and health professionals;teachers; legal professionals; police officers; social scientists;writers and artists; religious professionals':10,
'Technician or Associate Professional Includes science, engineering, and computer associates and technicians; life science and health technicians and assistants;teacher aides; finance and sales associate professionals;business service agents; administrative assistants':11,
'Not applicable':0
}
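A dictionary passed to map only matches a value when the key is exactly equal to it, so an extra space or a truncated key is enough to produce NaN. The sketch below shows one way to list the values that found no key; the sample values and the names raw, mapping and unmatched are made up for illustration and are not from my data.

```python
import pandas as pd

# Made-up sample values: the second has a leading space, the third a trailing space.
raw = pd.Series(['Not applicable',
                 ' Not applicable',
                 'Has never worked for pay ',
                 'Has never worked for pay'])
mapping = {'Not applicable': 0, 'Has never worked for pay': 1}

mapped = raw.map(mapping)                          # exact key match, otherwise NaN
unmatched = raw[mapped.isna()].unique().tolist()   # values that found no key
print(unmatched)
```

Printing the unmatched values makes it easier to see whether a key is misspelled or the data carries extra whitespace.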