Sunday, February 16, 2014

code snippet: extracting weights from the linear svm in R

Here are a few snippets of code showing how to extract the weights, the decision hyperplane, and the distance of each test point to that hyperplane from a linear SVM fit in R (the e1071 interface to libsvm).

The setup: train.data and test.data are data frames (the formula and indexing code below assume data frames, not matrices) with examples in the rows and voxels in the columns (i.e. two non-overlapping subsets of the full dataset, making up a single cross-validation fold). The first column in each (named "target") contains the class labels, and the rest of the columns ("vox1", "vox2", ...) hold the BOLD activation values.
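For readers who want to run the snippets end-to-end, here is one way to simulate a little dataset in that format. Everything in this chunk (the number of examples and voxels, the class names "a" and "b", and the +1 shift that makes the classes separable) is made up for illustration; substitute your own data.

# simulate a toy dataset: two classes ("a" and "b"), 5 voxels, 20 training and 10 testing examples
set.seed(42);   # so the simulated data are reproducible
make.examples <- function(n.examples, n.vox=5) {
  vox <- matrix(rnorm(n.examples * n.vox), nrow=n.examples);   # random "BOLD" values
  target <- factor(rep(c("a", "b"), length.out=n.examples));   # alternating class labels
  vox[target == "b",] <- vox[target == "b",] + 1;   # shift class "b" so the classes are (somewhat) separable
  dset <- data.frame(target=target, vox);
  colnames(dset) <- c("target", paste0("vox", 1:n.vox));
  dset;
}
train.data <- make.examples(20);
test.data <- make.examples(10);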

fit (train) the SVM using train.data:
library(e1071);   # svm() is in the e1071 package
fit <- svm(target~., data=train.data, type="C-classification", kernel="linear", cost=1, scale=FALSE);   # scale=FALSE keeps the voxels in their original units

extract the weights and constant from the fitted SVM:
w <- t(fit$coefs) %*% fit$SV;   # weight vector: 1 x (number of voxels), valid for a two-class linear-kernel fit
b <- -1 * fit$rho;   # constant offset, sometimes called w0

Now, the equation of the decision hyperplane is 0 = b + w1*vox1 + w2*vox2 ...
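To make that equation concrete, it can be wrapped in a one-line helper (decision.value is a made-up name, not part of e1071): plug in one test example's voxel values, and the sign of the result says which side of the hyperplane it falls on.

# hypothetical helper: evaluate b + w1*vox1 + w2*vox2 + ... for one example's voxel values
decision.value <- function(vox.values) { as.numeric((w %*% unlist(vox.values)) + b); }
decision.value(test.data[1, 2:ncol(test.data)]);   # decision value for the first test example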

The distance of each point (test case) from the hyperplane can be calculated as usual for a point-plane distance:
((w %*% t(test.data[i, 2:ncol(test.data)])) + b) / sqrt(w %*% t(w));   # i is the row number of a single test example
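The same calculation can be done for every test example at once with a bit of matrix algebra; this sketch again assumes the target labels sit in the first column of test.data.

# signed distance of every test example to the hyperplane
test.vox <- as.matrix(test.data[, 2:ncol(test.data)]);   # test examples in rows, voxels in columns
dists <- ((test.vox %*% t(w)) + b) / sqrt(sum(w^2));   # one signed distance per test example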

The predicted class of each test point is given by the sign of the same quantity, sign((w %*% x) + b):

sign((w %*% t(test.data[i, 2:ncol(test.data)])) + b);   # again for the test example in row i
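As a sanity check, these by-hand decision values are the same quantity libsvm returns through predict(); comparing the two (and tabulating the predicted classes against the true labels) is a quick way to confirm nothing went wrong with the indexing.

# sanity check: the by-hand decision values should equal the ones predict() returns
preds <- predict(fit, newdata=test.data, decision.values=TRUE);
by.hand <- as.vector((as.matrix(test.data[, 2:ncol(test.data)]) %*% t(w)) + b);
all.equal(by.hand, as.vector(attr(preds, "decision.values")));   # should be TRUE
table(predicted=preds, true=test.data$target);   # confusion matrix for this fold's test set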
