2.2.3 Preparing the data: normalizing numeric values

from numpy import zeros, shape, tile

def autoNorm(dataSet):
    minVals = dataSet.min(0)   # min(0) takes the minimum of each column; note the argument 0
    maxVals = dataSet.max(0)   # max(0) takes the maximum of each column; note the argument 0
    ranges = maxVals - minVals   # value range of each feature
    normDataSet = zeros(shape(dataSet))
    m = dataSet.shape[0]   # number of samples (rows)
    normDataSet = dataSet - tile(minVals, (m, 1))   # tile builds an m-row array: [minVals, minVals, ...]
    normDataSet = normDataSet / tile(ranges, (m, 1))   # element-wise division gives the normalized result
    return normDataSet, ranges, minVals
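The function above performs min-max scaling, newValue = (oldValue - min) / (max - min), so every feature column ends up in the range [0, 1]. As a minimal sketch, the same computation can be written with NumPy broadcasting instead of tile. The name auto_norm_broadcast and the toy array are hypothetical and are not part of the original kNN module.

```python
import numpy as np

def auto_norm_broadcast(data_set):
    # Hypothetical variant of autoNorm that relies on broadcasting instead of tile.
    min_vals = data_set.min(axis=0)            # per-column minimum, shape (n,)
    ranges = data_set.max(axis=0) - min_vals   # per-column range, shape (n,)
    # (m, n) - (n,) broadcasts min_vals and ranges across every row
    norm = (data_set - min_vals) / ranges
    return norm, ranges, min_vals

# Toy data made up for illustration: 3 samples, 2 features
toy = np.array([[10.0, 1.0],
                [20.0, 3.0],
                [40.0, 2.0]])
norm, ranges, min_vals = auto_norm_broadcast(toy)
```

Both versions divide by the column range, so a feature column whose values are all identical (range 0) would cause a division by zero and needs separate handling.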

 

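The call below returns the following variables:

normMat: the normalized feature values, shape (1000, 3)

ranges: the range of each feature column (maximum minus minimum), shape (3,), because there are 3 feature columns

minVals: the minimum of each feature column, shape (3,)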

                  

normMat, ranges, minVals = kNN.autoNorm(datingDataMat)

ranges
Out[174]: array([9.1273000e+04, 2.0919349e+01, 1.6943610e+00])

normMat
Out[175]: 
array([[0.44832535, 0.39805139, 0.56233353],
       [0.15873259, 0.34195467, 0.98724416],
       [0.28542943, 0.06892523, 0.47449629],
       ...,
       [0.29115949, 0.50910294, 0.51079493],
       [0.52711097, 0.43665451, 0.4290048 ],
       [0.47940793, 0.3768091 , 0.78571804]])

len(normMat)
Out[176]: 1000

ranges
Out[177]: array([9.1273000e+04, 2.0919349e+01, 1.6943610e+00])

minVals
Out[178]: array([0.      , 0.      , 0.001156])
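autoNorm returns ranges and minVals alongside the normalized matrix, presumably so that new samples can be scaled the same way as the training data before classification. A minimal sketch of that step follows, reusing the ranges and minVals values printed above. The sample values in new_sample are made up for illustration.

```python
import numpy as np

# Values copied from the session output above
ranges = np.array([9.1273000e+04, 2.0919349e+01, 1.6943610e+00])
minVals = np.array([0.0, 0.0, 0.001156])

# Hypothetical new sample with the same 3 features (values are invented)
new_sample = np.array([10000.0, 10.0, 0.5])

# Apply the same min-max scaling that autoNorm applied to the training data
scaled = (new_sample - minVals) / ranges
```

A sample whose raw values fall outside the training minimum or maximum will scale to a value below 0 or above 1.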

 

posted @ 2021-07-19 16:59  xinxinmama