CompactClassificationSVM
Compact support vector machine (SVM) for one-class and binary classification
Description
CompactClassificationSVM is a compact version of the support vector machine (SVM) classifier. The compact classifier does not include the data used for training the SVM classifier. Therefore, you cannot perform some tasks, such as cross-validation, using the compact classifier. Use a compact SVM classifier for tasks such as predicting the labels of new data.
Creation
Create a CompactClassificationSVM model from a full, trained ClassificationSVM classifier by using compact.
Properties
SVM Properties
Alpha
—Trained classifier coefficients
numeric vector
This property is read-only.
Trained classifier coefficients, specified as an s-by-1 numeric vector. s is the number of support vectors in the trained classifier, sum(Mdl.IsSupportVector).
Alpha contains the trained classifier coefficients from the dual problem, that is, the estimated Lagrange multipliers. If you remove duplicates by using the RemoveDuplicates name-value pair argument of fitcsvm, then for a given set of duplicate observations that are support vectors, Alpha contains one coefficient corresponding to the entire set. That is, MATLAB® attributes a nonzero coefficient to one observation from the set of duplicates and a coefficient of 0 to all other duplicate observations in the set.
Data Types: single | double
Beta
—Linear predictor coefficients
numeric vector
This property is read-only.
Linear predictor coefficients, specified as a numeric vector. The length of Beta is equal to the number of predictors used to train the model.
MATLAB expands categorical variables in the predictor data using full dummy encoding. That is, MATLAB creates one dummy variable for each level of each categorical variable. Beta stores one value for each predictor variable, including the dummy variables. For example, if there are three predictors, one of which is a categorical variable with three levels, then Beta is a numeric vector containing five values.
If KernelParameters.Function is 'linear', then the classification score for the observation x is
$f(x) = (x/s)'\beta + b.$
Mdl stores β, b, and s in the properties Beta, Bias, and KernelParameters.Scale, respectively.
To estimate classification scores manually, you must first apply any transformations to the predictor data that were applied during training. Specifically, if you specify 'Standardize',true when using fitcsvm, then you must standardize the predictor data manually by using the mean Mdl.Mu and standard deviation Mdl.Sigma, and then divide the result by the kernel scale in Mdl.KernelParameters.Scale.
All SVM functions, such as resubPredict and predict, apply any required transformation before estimation.
If KernelParameters.Function is not 'linear', then Beta is empty ([]).
Data Types: single | double
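The manual score computation described above can be sketched as follows. This is an illustrative example only: it assumes the fisheriris sample data set (restricted to two species so that the problem is binary) and the default linear kernel, and the variable names are not part of the original page.

```matlab
% Assumption: fisheriris ships with Statistics and Machine Learning Toolbox.
load fisheriris
inds = ~strcmp(species,'setosa');      % keep two classes only
X = meas(inds,:);
Y = species(inds);

% Train a linear SVM with standardization, then compact it.
Mdl  = fitcsvm(X,Y,'Standardize',true);
CMdl = compact(Mdl);

% Apply the same transformations used in training:
% standardize with Mu and Sigma, then divide by the kernel scale.
Xs = (X - CMdl.Mu)./CMdl.Sigma;
s  = CMdl.KernelParameters.Scale;

% Linear score f(x) = (x/s)'*beta + b for every row of X.
f = (Xs/s)*CMdl.Beta + CMdl.Bias;

% Compare with the positive-class score returned by predict.
[~,score] = predict(CMdl,X);
maxAbsDiff = max(abs(f - score(:,2)))
```

If the manual computation matches, maxAbsDiff should be at the level of floating-point round-off. Implicit expansion in `X - CMdl.Mu` requires R2016b or later.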
Bias
—Bias term
scalar
This property is read-only.
Bias term, specified as a scalar.
Data Types: single | double
KernelParameters
—Kernel parameters
structure array
This property is read-only.
Kernel parameters, specified as a structure array. The kernel parameters property contains the fields listed in this table.
Field | Description |
---|---|
Function | Kernel function used to compute the elements of the Gram matrix. |
Scale | Kernel scale parameter used to scale all elements of the predictor data on which the model is trained. |
To display the values of KernelParameters, use dot notation. For example, Mdl.KernelParameters.Scale displays the kernel scale parameter value.
The software accepts KernelParameters as inputs and does not modify them.
Data Types: struct
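As a small sketch of the dot notation mentioned above, the following code trains a Gaussian-kernel SVM with an explicit kernel scale and reads the fields back from the compact model. The data set (fisheriris) and the chosen scale value are assumptions made for illustration.

```matlab
% Assumption: fisheriris sample data, restricted to two species.
load fisheriris
inds = ~strcmp(species,'setosa');

% Train with a nonlinear kernel and a user-chosen kernel scale.
Mdl  = fitcsvm(meas(inds,:),species(inds), ...
    'KernelFunction','gaussian','KernelScale',2);
CMdl = compact(Mdl);

% Read the kernel parameters with dot notation.
kernelName  = CMdl.KernelParameters.Function
kernelScale = CMdl.KernelParameters.Scale

% For a nonlinear kernel, Beta is expected to be empty.
betaIsEmpty = isempty(CMdl.Beta)
```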
SupportVectorLabels
—Support vector class labels
s-by-1 numeric vector
This property is read-only.
Support vector class labels, specified as an s-by-1 numeric vector. s is the number of support vectors in the trained classifier, sum(Mdl.IsSupportVector).
A value of +1 in SupportVectorLabels indicates that the corresponding support vector is in the positive class (ClassNames{2}). A value of -1 indicates that the corresponding support vector is in the negative class (ClassNames{1}).
If you remove duplicates by using the RemoveDuplicates name-value pair argument of fitcsvm, then for a given set of duplicate observations that are support vectors, SupportVectorLabels contains one unique support vector label.
Data Types: single | double
SupportVectors
—Support vectors
s-by-p numeric matrix
This property is read-only.
Support vectors in the trained classifier, specified as an s-by-p numeric matrix. s is the number of support vectors in the trained classifier, sum(Mdl.IsSupportVector), and p is the number of predictor variables in the predictor data.
SupportVectors contains rows of the predictor data X that MATLAB considers to be support vectors. If you specify 'Standardize',true when training the SVM classifier using fitcsvm, then SupportVectors contains the standardized rows of X.
If you remove duplicates by using the RemoveDuplicates name-value pair argument of fitcsvm, then for a given set of duplicate observations that are support vectors, SupportVectors contains one unique support vector.
Data Types: single | double
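The relationship between Alpha, SupportVectorLabels, and SupportVectors can be checked with a short sketch like the one below. It assumes the fisheriris sample data set restricted to two species; the variable names are illustrative.

```matlab
% Assumption: fisheriris sample data, restricted to two species.
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,:);
Y = species(inds);

Mdl  = fitcsvm(X,Y,'Standardize',true);
CMdl = compact(Mdl);

% s is counted on the full model, which keeps IsSupportVector.
s = sum(Mdl.IsSupportVector);

% The three support vector properties share the same number of rows.
[nSV,p] = size(CMdl.SupportVectors);
sizesAgree = isequal(s,nSV,numel(CMdl.Alpha),numel(CMdl.SupportVectorLabels))

% Labels are coded as -1 (ClassNames{1}) and +1 (ClassNames{2}).
labelCodes = unique(CMdl.SupportVectorLabels)'
```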
Other Classification Properties
CategoricalPredictors
—Categorical predictor indices
vector of positive integers|[]
This property is read-only.
Categorical predictor indices, specified as a vector of positive integers. CategoricalPredictors contains index values indicating that the corresponding predictors are categorical. The index values are between 1 and p, where p is the number of predictors used to train the model. If none of the predictors are categorical, then this property is empty ([]).
Data Types: double
ClassNames
—Unique class labels
categorical array|character array|logical vector|numeric vector|cell array of character vectors
This property is read-only.
Unique class labels used in training, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. ClassNames has the same data type as the class labels Y. (The software treats string arrays as cell arrays of character vectors.) ClassNames also determines the class order.
Data Types: single | double | logical | char | cell | categorical
Cost
—Misclassification cost
numeric square matrix
This property is read-only.
Misclassification cost, specified as a numeric square matrix.
For two-class learning, the Cost property stores the misclassification cost matrix specified by the Cost name-value argument of the fitting function. The rows correspond to the true class and the columns correspond to the predicted class. That is, Cost(i,j) is the cost of classifying a point into class j if its true class is i. The order of the rows and columns of Cost corresponds to the order of the classes in ClassNames.
For one-class learning, Cost = 0.
Data Types: double
ExpandedPredictorNames
—Expanded predictor names
cell array of character vectors
This property is read-only.
Expanded predictor names, specified as a cell array of character vectors.
If the model uses dummy variable encoding for categorical variables, then ExpandedPredictorNames includes the names that describe the expanded variables. Otherwise, ExpandedPredictorNames is the same as PredictorNames.
Data Types: cell
Mu
—Predictor means
numeric vector|[]
This property is read-only.
Predictor means, specified as a numeric vector. If you specify 'Standardize',1 or 'Standardize',true when you train an SVM classifier using fitcsvm, the length of Mu is equal to the number of predictors.
MATLAB expands categorical variables in the predictor data using dummy variables. Mu stores one value for each predictor variable, including the dummy variables. However, MATLAB does not standardize the columns that contain categorical variables.
If you set 'Standardize',false when you train the SVM classifier using fitcsvm, then Mu is an empty vector ([]).
Data Types: single | double
PredictorNames
—Predictor variable names
cell array of character vectors
This property is read-only.
Predictor variable names, specified as a cell array of character vectors. The order of the elements of PredictorNames corresponds to the order in which the predictor names appear in the training data.
Data Types: cell
Prior
—Prior probabilities
numeric vector
This property is read-only.
Prior probabilities for each class, specified as a numeric vector.
For two-class learning, if you specify a cost matrix, then the software updates the prior probabilities by incorporating the penalties described in the cost matrix.
For two-class learning, the software normalizes the prior probabilities specified by the Prior name-value argument of the fitting function so that the probabilities sum to 1. The Prior property stores the normalized prior probabilities. The order of the elements of Prior corresponds to the elements of Mdl.ClassNames.
For one-class learning, Prior = 1.
Data Types: single | double
ScoreTransform
—Score transformation
character vector|function handle
Score transformation, specified as a character vector or function handle. ScoreTransform represents a built-in transformation function or a function handle for transforming predicted classification scores.
To change the score transformation function to function, for example, use dot notation.
For a built-in function, enter a character vector.
Mdl.ScoreTransform = 'function';
This table describes the available built-in functions.
Value | Description |
---|---|
'doublelogit' | $1/(1 + e^{-2x})$ |
'invlogit' | $\log(x/(1 - x))$ |
'ismax' | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to 0 |
'logit' | $1/(1 + e^{-x})$ |
'none' or 'identity' | $x$ (no transformation) |
'sign' | -1 for x < 0, 0 for x = 0, 1 for x > 0 |
'symmetric' | $2x - 1$ |
'symmetricismax' | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to -1 |
'symmetriclogit' | $2/(1 + e^{-x}) - 1$ |
For a MATLAB function or a function that you define, enter its function handle.
Mdl.ScoreTransform = @function;
function must accept a matrix (the original scores) and return a matrix of the same size (the transformed scores).
Data Types: char | function_handle
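The two ways of setting ScoreTransform can be compared with a brief sketch. It assumes the fisheriris sample data set restricted to two species, and it uses an anonymous function handle for interactive use only (the code generation notes on this page restrict anonymous functions).

```matlab
% Assumption: fisheriris sample data, restricted to two species.
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,:);
Y = species(inds);
CMdl = compact(fitcsvm(X,Y));

% Untransformed scores (ScoreTransform is 'none' by default).
[~,rawScore] = predict(CMdl,X);

% Built-in transformation, set with a character vector.
CMdl.ScoreTransform = 'logit';
[~,logitScore] = predict(CMdl,X);

% Equivalent transformation supplied as a function handle.
CMdl.ScoreTransform = @(x) 1./(1 + exp(-x));
[~,handleScore] = predict(CMdl,X);

% Both routes should agree with the logit of the raw scores.
maxDiffBuiltin = max(abs(logitScore(:) - 1./(1 + exp(-rawScore(:)))))
maxDiffHandle  = max(abs(logitScore(:) - handleScore(:)))
```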
Sigma
—Predictor standard deviations
[] (default) | numeric vector
This property is read-only.
Predictor standard deviations, specified as a numeric vector.
If you specify 'Standardize',true when you train the SVM classifier using fitcsvm, the length of Sigma is equal to the number of predictor variables.
MATLAB expands categorical variables in the predictor data using dummy variables. Sigma stores one value for each predictor variable, including the dummy variables. However, MATLAB does not standardize the columns that contain categorical variables.
If you set 'Standardize',false when you train the SVM classifier using fitcsvm, then Sigma is an empty vector ([]).
Data Types: single | double
Object Functions
Function | Description |
---|---|
compareHoldout | Compare accuracies of two classification models using new data |
discardSupportVectors | Discard support vectors for linear support vector machine (SVM) classifier |
edge | Find classification edge for support vector machine (SVM) classifier |
fitPosterior | Fit posterior probabilities for compact support vector machine (SVM) classifier |
gather | Gather properties of Statistics and Machine Learning Toolbox object from GPU |
incrementalLearner | Convert binary classification support vector machine (SVM) model to incremental learner |
lime | Local interpretable model-agnostic explanations (LIME) |
loss | Find classification error for support vector machine (SVM) classifier |
margin | Find classification margins for support vector machine (SVM) classifier |
partialDependence | Compute partial dependence |
plotPartialDependence | Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots |
predict | Classify observations using support vector machine (SVM) classifier |
shapley | Shapley values |
update | Update model parameters for code generation |
Examples
Reduce Size of SVM Classifier
Reduce the size of a full support vector machine (SVM) classifier by removing the training data. Full SVM classifiers (that is, ClassificationSVM classifiers) hold the training data. To improve efficiency, use a smaller classifier.
Load the ionosphere data set.
load ionosphere
Train an SVM classifier. Standardize the predictor data and specify the order of the classes.
SVMModel = fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'})
SVMModel = 
  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
          NumObservations: 351
                    Alpha: [90x1 double]
                     Bias: -0.1343
         KernelParameters: [1x1 struct]
                       Mu: [0.8917 0 0.6413 0.0444 0.6011 0.1159 0.5501 ... ]
                    Sigma: [0.3112 0 0.4977 0.4414 0.5199 0.4608 0.4927 ... ]
           BoxConstraints: [351x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [351x1 logical]
                   Solver: 'SMO'
  Properties, Methods
SVMModel is a ClassificationSVM classifier.
Reduce the size of the SVM classifier.
CompactSVMModel = compact(SVMModel)
CompactSVMModel = 
  CompactClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
                    Alpha: [90x1 double]
                     Bias: -0.1343
         KernelParameters: [1x1 struct]
                       Mu: [0.8917 0 0.6413 0.0444 0.6011 0.1159 0.5501 ... ]
                    Sigma: [0.3112 0 0.4977 0.4414 0.5199 0.4608 0.4927 ... ]
           SupportVectors: [90x34 double]
      SupportVectorLabels: [90x1 double]
  Properties, Methods
CompactSVMModel is a CompactClassificationSVM classifier.
Display the amount of memory used by each classifier.
whos('SVMModel','CompactSVMModel')
  Name                 Size     Bytes   Class                                                 Attributes
  CompactSVMModel      1x1      31058   classreg.learning.classif.CompactClassificationSVM
  SVMModel             1x1     141148   ClassificationSVM
The full SVM classifier (SVMModel) is more than four times larger than the compact SVM classifier (CompactSVMModel).
To label new observations efficiently, you can remove SVMModel from the MATLAB® Workspace, and then pass CompactSVMModel and new predictor values to predict.
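A minimal sketch of that workflow follows. Because the example has no separate test set, the first five rows of X stand in for new observations; Xnew is a hypothetical name introduced only for illustration.

```matlab
load ionosphere
SVMModel = fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'});
CompactSVMModel = compact(SVMModel);

% Remove the full classifier from the workspace.
clear SVMModel

% Stand-in for new predictor data (assumption: same 34 predictors).
Xnew = X(1:5,:);

% predict applies the stored standardization before scoring.
[labels,scores] = predict(CompactSVMModel,Xnew)
```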
To further reduce the size of the compact SVM classifier, use the discardSupportVectors function to discard support vectors.
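The following sketch illustrates discarding support vectors from a linear compact classifier. It assumes the default linear kernel used in this example; SmallSVMModel is an illustrative name.

```matlab
load ionosphere
CompactSVMModel = compact(fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'}));
labelsBefore = predict(CompactSVMModel,X);

% Discard Alpha, SupportVectors, and SupportVectorLabels.
% This applies to linear SVM classifiers, which predict from Beta and Bias.
SmallSVMModel = discardSupportVectors(CompactSVMModel);
labelsAfter = predict(SmallSVMModel,X);

supportVectorsEmpty = isempty(SmallSVMModel.SupportVectors)
sameLabels = isequal(labelsBefore,labelsAfter)

% Compare memory use of the two compact classifiers.
whos('CompactSVMModel','SmallSVMModel')
```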
Train and Cross-Validate SVM Classifier
Load the ionosphere data set.
load ionosphere
Train and cross-validate an SVM classifier. Standardize the predictor data and specify the order of the classes.
rng(1); % For reproducibility
CVSVMModel = fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'},'CrossVal','on')
CVSVMModel = 
  ClassificationPartitionedModel
    CrossValidatedModel: 'SVM'
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 10
              Partition: [1x1 cvpartition]
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'
  Properties, Methods
CVSVMModel is a ClassificationPartitionedModel cross-validated SVM classifier. By default, the software implements 10-fold cross-validation.
Alternatively, you can cross-validate a trained ClassificationSVM classifier by passing it to crossval.
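A short sketch of that alternative route is shown here. The names SVMModel, CVSVMModel2, and genError2 are illustrative, and the resulting loss depends on the random partition, so no expected value is given.

```matlab
load ionosphere
rng(1); % For reproducibility

% Train the full classifier first, then cross-validate it.
SVMModel = fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'});
CVSVMModel2 = crossval(SVMModel);   % 10-fold by default

% Each trained fold is a compact classifier.
foldClass = class(CVSVMModel2.Trained{1})

% Estimate the generalization error.
genError2 = kfoldLoss(CVSVMModel2)
```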
Inspect one of the trained folds using dot notation.
CVSVMModel.Trained{1}
ans = 
  CompactClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
                    Alpha: [78x1 double]
                     Bias: -0.2210
         KernelParameters: [1x1 struct]
                       Mu: [0.8888 0 0.6320 0.0406 0.5931 0.1205 0.5361 ... ]
                    Sigma: [0.3149 0 0.5033 0.4441 0.5255 0.4663 0.4987 ... ]
           SupportVectors: [78x34 double]
      SupportVectorLabels: [78x1 double]
  Properties, Methods
Each fold is a CompactClassificationSVM classifier trained on 90% of the data.
Estimate the generalization error.
genError = kfoldLoss(CVSVMModel)
genError = 0.1168
On average, the generalization error is approximately 12%.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- To integrate the prediction of an SVM classification model into Simulink®, you can use the ClassificationSVM Predict block in the Statistics and Machine Learning Toolbox™ library or a MATLAB Function block with the predict function.
- When you train an SVM model by using fitcsvm, the following restrictions apply.
  - The value of the 'ScoreTransform' name-value pair argument cannot be an anonymous function. For generating code that predicts posterior probabilities given new observations, pass a trained SVM model to fitPosterior or fitSVMPosterior. The ScoreTransform property of the returned model contains an anonymous function that represents the score-to-posterior-probability function and is configured for code generation.
  - For fixed-point code generation, the value of the 'ScoreTransform' name-value pair argument cannot be 'invlogit'. Also, the value of the 'KernelFunction' name-value pair argument must be 'gaussian', 'linear', or 'polynomial'.
  - For fixed-point code generation and code generation with a coder configurer, the following additional restrictions apply.
    - Categorical predictors (logical, categorical, char, string, or cell) are not supported. You cannot use the CategoricalPredictors name-value argument. To include categorical predictors in a model, preprocess them by using dummyvar before fitting the model.
    - Class labels with the categorical data type are not supported. Both the class label value in training data (Tbl or Y) and the value of the ClassNames name-value argument cannot be an array with the categorical data type.
For more information, see Introduction to Code Generation.
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
The following object functions fully support GPU arrays:
The following object functions offer limited support for GPU arrays:
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2014a
R2022a: Cost property stores the user-specified cost matrix
Starting in R2022a, the Cost property stores the user-specified cost matrix, so that you can compute the observed misclassification cost using the specified cost value. The software stores normalized prior probabilities (Prior) that do not reflect the penalties described in the cost matrix. To compute the observed misclassification cost, specify the LossFun name-value argument as "classifcost" when you call the loss function.
Note that model training has not changed and, therefore, the decision boundaries between classes have not changed.
For training, the fitting function updates the specified prior probabilities by incorporating the penalties described in the specified cost matrix, and then normalizes the prior probabilities and observation weights. This behavior has not changed. In previous releases, the software stored the default cost matrix in the Cost property and stored the prior probabilities used for training in the Prior property. Starting in R2022a, the software stores the user-specified cost matrix without modification, and stores normalized prior probabilities that do not reflect the cost penalties. For more details, see Misclassification Cost Matrix, Prior Probabilities, and Observation Weights.
Some object functions use the Cost and Prior properties:
- The loss function uses the cost matrix stored in the Cost property if you specify the LossFun name-value argument as "classifcost" or "mincost".
- The loss and edge functions use the prior probabilities stored in the Prior property to normalize the observation weights of the input data.
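As a hedged illustration of the behavior described above, the following sketch trains a classifier with a nondefault cost matrix and evaluates the observed misclassification cost. It assumes R2022a or later; the cost values and variable names are chosen only for illustration, and the resulting loss value is not shown because it depends on the release and solver.

```matlab
load ionosphere

% Hypothetical cost matrix: misclassifying a true 'b' as 'g' costs 2.
% Rows are true classes and columns are predicted classes, in ClassNames order.
C = [0 2; 1 0];

Mdl = fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'},'Cost',C);
CMdl = compact(Mdl);

% In R2022a and later, the stored matrix is the user-specified one.
storedCost = CMdl.Cost

% Observed misclassification cost on the training data.
L = loss(CMdl,X,Y,'LossFun','classifcost')
```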
If you specify a nondefault cost matrix when you train a classification model, the object functions return a different value compared to previous releases.
If you want the software to handle the cost matrix, prior probabilities, and observation weights as in previous releases, adjust the prior probabilities and observation weights for the nondefault cost matrix, as described in Adjust Prior Probabilities and Observation Weights for Misclassification Cost Matrix. Then, when you train a classification model, specify the adjusted prior probabilities and observation weights by using the Prior and Weights name-value arguments, respectively, and use the default cost matrix.