From Two-Class to Multiclass Classification
Please credit the source when reposting: http://www.cnblogs.com/OldPanda
My last post on the perceptron algorithm ended with a note on applying the algorithm to multiclass classification, along with a link to the relevant slides. Over the past couple of days I skimmed the perceptron section of Pattern Recognition; some parts read more densely than Statistical Learning Methods, but the book's coverage is more complete, and it includes exactly this problem of going from two-class to multiclass classification, via a method called the Kesler construction. The book also works through an example, and the slides (starting from slide 35) present it in great detail, so I won't repeat the mechanics of the algorithm here.
Instead, here is my solution to the book's example.

Example: Consider a three-class problem in two-dimensional space. The training vectors for each class are (as listed in the code below):
Class 1: (1, 1), (2, 2), (2, 1)
Class 2: (1, -1), (1, -2), (2, -2)
Class 3: (-1, 1), (-1, 2), (-2, 1)
Since the vectors of the different classes lie in different quadrants, this is clearly a linearly separable problem.
The overall approach is this: first extend each vector into three-dimensional space (by appending a constant 1), then use the Kesler construction to expand these 9 vectors into 18 vectors, each of size 9×1. The corresponding weight vector stacks the three per-class weight vectors w1, w2, w3 (each 3×1) into a single 9×1 vector.
For coding convenience, I merge the three weight vectors directly into one 1×9 vector w. Running the perceptron algorithm then amounts to finding a w such that wx > 0 for all 18 feature vectors; in other words, all of the vectors must lie on the same side of the decision hyperplane. The initial weights are usually generated at random; for simplicity I set them all to zero here.
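To make the construction concrete, here is a small standalone sketch (not the code from the book or the slides; the helper name `kesler_expand` is my own) of how one labeled sample expands into its Kesler vectors. A sample of class i, after the bias extension, produces one 9-dimensional vector per other class j: block i holds the extended sample, block j holds its negation, and the remaining block is zero.

```python
import numpy as np

# Sketch of the Kesler construction for a single labeled sample.
# A class-i sample x, extended to (x, 1), becomes (C - 1) vectors of
# length C * (d + 1): block i holds the extended x, one block j != i
# holds -x, and all other blocks are zero.
def kesler_expand(x, label, num_classes=3):
    x_ext = np.append(np.asarray(x, dtype=float), 1.0)  # append bias term
    d = len(x_ext)
    i = label - 1  # classes are numbered from 1
    out = []
    for j in range(num_classes):
        if j == i:
            continue
        v = np.zeros(num_classes * d)
        v[i * d:(i + 1) * d] = x_ext
        v[j * d:(j + 1) * d] = -x_ext
        out.append(v)
    return out

# e.g. the first training vector (1, 1) of class 1 expands into:
# [ 1.  1.  1. -1. -1. -1.  0.  0.  0.]
# [ 1.  1.  1.  0.  0.  0. -1. -1. -1.]
vecs = kesler_expand((1, 1), 1)
```

With 3 classes and 9 samples this yields the 18 expanded vectors mentioned above.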
As before, here is my code for solving it:
import numpy as np

# A simple example; the training set and parameter sizes are fixed
training_set = [((1, 1), 1), ((2, 2), 1), ((2, 1), 1),
                ((1, -1), 2), ((1, -2), 2), ((2, -2), 2),
                ((-1, 1), 3), ((-1, 2), 3), ((-2, 1), 3)]
num_of_class = 3
new_features = np.array([])  # features after the Kesler construction
w = np.zeros((1, 9))  # initial weights; could also come from a random generator
learning_rate = 0.5

# extend the feature into a higher dimension, e.g. from 2D to 3D
def extend(feature):
    return np.append(feature, 1)

# build the Kesler construction: each sample yields two 9-dim vectors
def k_construct(item):
    global new_features
    dimension = len(item[0]) + 1  # length of one per-class block
    extended_feature = extend(item[0])
    minus_feature = np.negative(extended_feature)
    zero_vector = np.zeros(dimension)
    i = item[1] - 1  # class index, numbered from 0
    res_1 = np.zeros((1, dimension * num_of_class))
    res_2 = np.zeros((1, dimension * num_of_class))
    flag = False
    for j in range(num_of_class):
        if i == j:
            res_1[0, i * dimension:(i + 1) * dimension] = extended_feature
            res_2[0, i * dimension:(i + 1) * dimension] = extended_feature
        elif not flag:
            res_1[0, j * dimension:(j + 1) * dimension] = minus_feature
            res_2[0, j * dimension:(j + 1) * dimension] = zero_vector
            flag = True
        else:
            res_1[0, j * dimension:(j + 1) * dimension] = zero_vector
            res_2[0, j * dimension:(j + 1) * dimension] = minus_feature
    if len(new_features) == 0:
        new_features = np.vstack((res_1, res_2))
    else:
        new_features = np.vstack((new_features, res_1, res_2))

# update the weights
def update(item):
    global w
    w += learning_rate * item

# evaluate the condition the hyperplane should satisfy (w . x > 0)
def cal(item):
    return float(np.dot(w[0], item))

# one pass over the data; returns True if everything is classified correctly
def check():
    misclassified = False
    for item in new_features:
        if cal(item) <= 0:
            misclassified = True
            update(item)
    return not misclassified

if __name__ == "__main__":
    for item in training_set:
        k_construct(item)
    for i in range(1000):  # if 1000 passes still give no result, okay, goodbye!
        if check():
            print("RESULT: w: " + str(w))
            break
    else:
        print("The training set is not linearly separable.")
The final output is RESULT: w: [[ 1. 0.5 -1. 0.5 -1.5 0.5 -1.5 1. 0.5]]; a different initial w gives a different result. Since my Python plotting skills are still not up to scratch, I won't include a figure of the result.
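As a sanity check (my own snippet, not from the book), the learned 9-dimensional w can be split back into the three per-class weight vectors, and each training point should then be assigned to the class whose weight vector gives the largest score:

```python
import numpy as np

# Split the learned w back into per-class weight vectors and classify
# by the largest score; every training point should be assigned correctly.
w = np.array([1, 0.5, -1, 0.5, -1.5, 0.5, -1.5, 1, 0.5])
w1, w2, w3 = w[0:3], w[3:6], w[6:9]

def classify(x):
    x_ext = np.append(x, 1)  # same bias extension as in training
    scores = [np.dot(w1, x_ext), np.dot(w2, x_ext), np.dot(w3, x_ext)]
    return int(np.argmax(scores)) + 1  # classes are numbered from 1

training_set = [((1, 1), 1), ((2, 2), 1), ((2, 1), 1),
                ((1, -1), 2), ((1, -2), 2), ((2, -2), 2),
                ((-1, 1), 3), ((-1, 2), 3), ((-2, 1), 3)]
assert all(classify(x) == label for x, label in training_set)
```

Checking by hand, e.g. for (1, 1): the three scores are 0.5, -0.5 and 0, so class 1 wins as expected.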
