sklearn--feature extraction--face recognition

1. Loading the raw data

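    import matplotlib.pyplot as plt
    from sklearn.datasets import fetch_lfw_people

    # Keep only people with at least 20 photos; shrink each image to 70% size.
    people = fetch_lfw_people(min_faces_per_person=20, resize=0.7)
    image_shapes = people.images[0].shape

    # 2x5 grid of subplots with tick marks hidden.
    fig, axes = plt.subplots(2, 5, figsize=(15, 8),
                             subplot_kw={'xticks': (), 'yticks': ()})
    # Show the first ten faces, each titled with the person's name.
    for target, image, ax in zip(people.target, people.images, axes.ravel()):
        ax.imshow(image)
        ax.set_title(people.target_names[target])
    plt.show()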
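  • The LFW data behind fetch_lfw_people has to be downloaded from the official site; the function does this on its first call and caches the result
  • image_shapes holds the shape of a single image: 87*65 pixels
  • The returned dataset is a dictionary-like object (a Bunch) whose keys can be read with dot syntax, e.g. target, images, target_names
  • axes.ravel() flattens the 2*5 grid of subplots so that all of them can be visited in one loop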
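The plotting loop draws only ten faces because zip stops at its shortest input, and axes.ravel() yields just the 2*5 = 10 subplots. A minimal sketch of that behaviour follows, using a NumPy array of placeholder strings instead of real matplotlib axes; the names grid, flat and images are illustrative only and do not appear in the original code.

```python
import numpy as np

# Stand-in for the 2x5 array of Axes returned by plt.subplots(2, 5).
grid = np.array([[f"ax{r}{c}" for c in range(5)] for r in range(2)])

# ravel() flattens the grid row by row into a 1-D sequence of 10 items.
flat = grid.ravel()

# Stand-in for people.images: far more items than there are subplots.
images = list(range(100))

# zip stops at the shortest input, so only the first 10 images are paired.
pairs = list(zip(images, flat))
print(len(pairs))
print(pairs[5][0], pairs[5][1])
```

The sixth pair lands on the first subplot of the second row, which is the row-major order in which the real loop fills the figure.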
    In [20]: people.images.shape
    Out[20]: (2341, 87, 65)

    In [21]: len(people.target_names)
    Out[21]: 39
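  • 2341 photos in total
  • Each image is 87*65 pixels
  • The photos belong to 39 different people
  • Each image has a matching entry in target that serves as its label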
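How unevenly the photos are spread across people can be checked by counting the labels. The following is a minimal sketch with np.bincount, run on a small made-up target array and hypothetical names so that it works without downloading LFW; with the real data one would pass people.target and people.target_names instead.

```python
import numpy as np

# Hypothetical stand-ins for people.target and people.target_names.
target = np.array([0, 2, 2, 1, 2, 0, 2, 2])
target_names = np.array(["Person A", "Person B", "Person C"])

# bincount returns, for each integer label, how many times it occurs.
counts = np.bincount(target)

for name, count in zip(target_names, counts):
    print(f"{name}: {count}")
```

A person whose count is much larger than the rest is the kind of skew that a per-person cap is meant to limit.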

2. Preprocessing

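To make the data less skewed, keep at most 50 photos per person. This also limits overfitting to any single person who would otherwise have far more photos than the rest.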

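    import numpy as np

    # Boolean mask with one entry per image, all False to start.
    mask = np.zeros(people.target.shape, dtype=bool)
    for target in np.unique(people.target):
        # Mark the first 50 images of each person as kept.
        mask[np.where(people.target == target)[0][:50]] = True
    X_people = people.data[mask]
    y_people = people.target[mask]
    # Scale grey values to [0, 1]; assumes raw pixel values in 0-255.
    X_people = X_people / 255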

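mask is a boolean array with one entry per image. Every entry starts as False, and the entries set to True mark the images that are kept.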
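The masking step can be traced on a toy example. This is a minimal sketch assuming a tiny label array and a cap of 2 images per class instead of 50; target and max_per_class are illustrative names, not part of the original code.

```python
import numpy as np

# Toy labels: class 0 appears 4 times, class 1 twice, class 2 once.
target = np.array([0, 0, 1, 0, 2, 0, 1])
max_per_class = 2  # the article uses 50

mask = np.zeros(target.shape, dtype=bool)  # all False to start
for label in np.unique(target):
    # Indices of this class in dataset order; keep only the first few.
    keep = np.where(target == label)[0][:max_per_class]
    mask[keep] = True

print(mask)
print(np.bincount(target[mask]))
```

Classes with fewer images than the cap are kept whole, so only the over-represented class 0 loses samples.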
X_people

    In [32]: X_people
    Out[32]:
    array([[ 0.24313726,  0.2379085 ,  0.20261438, ...,  0.70065361,
         0.66013068,  0.64313728],
       [ 0.31633985,  0.32287583,  0.39084965, ...,  0.20522875,
         0.20522875,  0.21045752],
       [ 0.78954244,  0.78562087,  0.77908498, ...,  0.05359477,
         0.05359477,  0.05228758],
       ...,
       [ 0.15163399,  0.15294118,  0.15294118, ...,  0.19346404,
         0.16862746,  0.16732027],
       [ 0.36732024,  0.4130719 ,  0.44705883, ...,  0.94901961,
         0.95816994,  0.96732026],
       [ 0.07843138,  0.08496732,  0.11633987, ...,  0.51764709,
         0.53202617,  0.53464049]], dtype=float32)

y_people

    In [34]: y_people
    Out[34]: array([19, 10,  6, ...,  0,  3, 12])
posted @ 2018-01-05 15:36  天波-风客