Using Support Vector Machines (SVM) in R (2. The kernlab Package)

The package's ksvm() function implements the SVM algorithm through the .Call interface, using the optimization routines of the bsvm and libsvm libraries. For classification it provides the C-SVM and ν-SVM algorithms, along with a bound-constrained version of the C classifier. For regression it provides the ε-SVM and ν-SVM algorithms. For multi-class classification it offers both the one-against-one approach and native multi-class formulations, described below. For example:

> library("kernlab")  # load the package

> data("iris")  # load the iris data set

> irismodel <- ksvm(Species ~ ., data = iris,
+                   type = "C-bsvc", kernel = "rbfdot",
+                   kpar = list(sigma = 0.1), C = 10,
+                   prob.model = TRUE)  # train the model

The type argument determines whether the model is used for classification, regression, or novelty detection. By default it is inferred from whether y is a factor: the default is C-svc for classification and eps-svr for regression. Possible values are:

• C-svc  C classification

• nu-svc  nu classification

• C-bsvc  bound-constraint svm classification

• spoc-svc  Crammer, Singer native multi-class

• kbb-svc  Weston, Watkins native multi-class

• one-svc  novelty detection

• eps-svr  epsilon regression

• nu-svr  nu regression

• eps-bsvr bound-constraint svm regression
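To illustrate the regression side of this list, here is a small sketch I added (a toy example, not from the original post) that fits eps-svr to noisy sine data:

```r
library(kernlab)

# Toy regression data: a noisy sine curve (illustration only)
set.seed(1)
x <- as.matrix(seq(-2 * pi, 2 * pi, length.out = 200))
y <- sin(x[, 1]) + rnorm(200, sd = 0.1)

# epsilon-SVM regression; epsilon sets the width of the insensitive tube
regm <- ksvm(x, y, type = "eps-svr", kernel = "rbfdot",
             epsilon = 0.1, C = 1)

head(predict(regm, x))  # fitted values on the training inputs
```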

The kernel argument sets the kernel function. The available kernels are:

• rbfdot Radial Basis kernel "Gaussian"

• polydot  Polynomial kernel

• vanilladot  Linear kernel

• tanhdot  Hyperbolic tangent kernel

• laplacedot  Laplacian kernel

• besseldot  Bessel kernel

• anovadot  ANOVA RBF kernel

• splinedot  Spline kernel

• stringdot  String kernel
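To show how kernels are swapped, the iris model can be refit with the linear kernel vanilladot (a sketch I added, not from the original post):

```r
library(kernlab)
data(iris)

# Refit the iris classifier with a linear kernel; vanilladot takes no
# kernel parameters, so kpar can be omitted
linmodel <- ksvm(Species ~ ., data = iris, type = "C-svc",
                 kernel = "vanilladot", C = 10)

error(linmodel)  # training error of the linear-kernel fit
```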

> irismodel

Support Vector Machine object of class "ksvm"

SV type: C-bsvc (classification)

parameter : cost C = 10

Gaussian Radial Basis kernel function.

Hyperparameter : sigma = 0.1

Number of Support Vectors : 32

Training error : 0.02

Probability model included.

> predict(irismodel, iris[c(3, 10, 56, 68, 107, 120), -5], type = "probabilities")

setosa   versicolor  virginica

[1,] 0.986432820 0.007359407 0.006207773

[2,] 0.983323813 0.010118992 0.006557195

[3,] 0.004852528 0.967555126 0.027592346

[4,] 0.009546823 0.988496724 0.001956452

[5,] 0.012767340 0.069496029 0.917736631

[6,] 0.011548176 0.150035384 0.838416441
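Besides type = "probabilities", predict() with its default type returns hard class labels. A minimal sketch, rebuilding the model above:

```r
library(kernlab)
data(iris)

# Same model as above
irismodel <- ksvm(Species ~ ., data = iris, type = "C-bsvc",
                  kernel = "rbfdot", kpar = list(sigma = 0.1),
                  C = 10, prob.model = TRUE)

# Default predict() returns the predicted class labels as a factor
predict(irismodel, iris[c(3, 10, 56, 68, 107, 120), -5])
```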

ksvm() also supports user-defined kernel functions. For example:

> k <- function(x, y) { (sum(x * y) + 1) * exp(-0.001 * sum((x - y)^2)) }

> class(k) <- "kernel"

> data("promotergene")

> gene <- ksvm(Class ~ ., data = promotergene, kernel = k, C = 10, cross = 5)  # train with 5-fold cross-validation

> gene

Support Vector Machine object of class "ksvm"

SV type: C-svc (classification)

parameter : cost C = 10

Number of Support Vectors : 66

Training error : 0

Cross validation error : 0.141558
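A function with class "kernel" can also be passed to kernlab's kernelMatrix() to inspect the resulting Gram matrix directly (a sketch I added, not from the original post):

```r
library(kernlab)
data(iris)

# The same user-defined kernel as above (note the negative exponent,
# which keeps the RBF factor bounded)
k <- function(x, y) (sum(x * y) + 1) * exp(-0.001 * sum((x - y)^2))
class(k) <- "kernel"

# Gram matrix of the kernel on the first five iris rows (features only)
K <- kernelMatrix(k, as.matrix(iris[1:5, 1:4]))
dim(K)
```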

For binary classification problems, the fitted model can be visualized with plot(). For example:

> x <- rbind(matrix(rnorm(120), , 2), matrix(rnorm(120, mean = 3), , 2))

> y <- matrix(c(rep(1, 60), rep(-1, 60)))

> svp <- ksvm(x, y, type = "C-svc", kernel = "rbfdot", kpar = list(sigma = 2))

> plot(svp)

The package's CRAN page is at http://cran.r-project.org/web/packages/kernlab/index.html

posted on 2009-03-16 18:28 by zgw21cn
