
After training an SVM classifier, I call predict_proba() to get the class probabilities for my data, but it fails with the error "predict_proba is not available when fitted with probability=False":

>>> clf.predict_proba(X_test)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\pkumar81\Anaconda2\lib\site-packages\sklearn\svm\base.py", line 596, in _predict_proba
    raise NotFittedError("predict_proba is not available when fitted "
sklearn.exceptions.NotFittedError: predict_proba is not available when fitted with probability=False

I am using the following code:

>>> from sklearn import svm
>>> clf = svm.SVC()
>>> clf.fit(X_test, y_test)
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
>>> clf.predict_proba(X_test)

1 Answer


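By default, SVC is created with probability=False, so predict_proba() is not available. Enable probability estimates by passing probability=True when you create the classifier:

clf = svm.SVC(probability=True)

Refit the classifier after this change and predict_proba() should work. Note that fitting becomes slower, because scikit-learn runs an internal cross-validation to calibrate the probabilities.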
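
A minimal, self-contained sketch of the probability=True setting is given here. The iris dataset, the train/test split and the random_state values are assumptions chosen only for illustration; the question does not say which data was used.

```python
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumed toy data: the original question does not show its dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# probability=True has to be set before fit(); setting it afterwards
# on an already fitted model does not make predict_proba() available.
clf = svm.SVC(probability=True, random_state=0)
clf.fit(X_train, y_train)

# One row per test sample, one column per class (ordered as clf.classes_).
proba = clf.predict_proba(X_test)
print(proba.shape)
print(proba[:3].round(3))
```

Each row of proba sums to 1, and the column order follows clf.classes_.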

Alternatively, you can wrap the classifier in CalibratedClassifierCV, which adds predict_proba() through a separate calibration step. From the scikit-learn documentation:

When performing classification you often want to predict not only the class label, but also the associated probability. This probability gives you some kind of confidence on the prediction. However, not all classifiers provide well-calibrated probabilities, some being over-confident while others being under-confident. Thus, a separate calibration of predicted probabilities is often desirable as a postprocessing.

Have a look at the following example with CalibratedClassifierCV().

>>> from sklearn.calibration import CalibratedClassifierCV
>>> from sklearn.svm import SVC
>>> svc = SVC()
>>> clf = CalibratedClassifierCV(svc)
>>> clf.fit(X_train, y_train)
CalibratedClassifierCV(base_estimator=SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False),
            cv=3, method='sigmoid')
>>> clf.predict_proba(X_test)
array([[0.02352877, 0.64021213, 0.33625911],


...
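
For a version of the CalibratedClassifierCV approach that can be run end to end, here is a hedged sketch. The iris data and the split are assumptions for illustration only. The SVC is passed positionally because the keyword name for the wrapped estimator differs between scikit-learn releases.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumed toy data, not taken from the original question.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The wrapped SVC keeps its default probability=False; the wrapper
# calibrates the SVC's decision_function scores with cross-validation.
calibrated = CalibratedClassifierCV(SVC(), method="sigmoid", cv=3)
calibrated.fit(X_train, y_train)

# One row per test sample, one column per class.
calibrated_proba = calibrated.predict_proba(X_test)
print(calibrated_proba.shape)
print(calibrated_proba[:3].round(3))
```

The two approaches generally give somewhat different probabilities, so it is worth comparing them on held-out data before choosing one.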