I am applying the LogisticRegression() class from the scikit-learn module to imbalanced data. How can I use the *class_weight* parameter to assign different weights to the classes?


Best answer

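The *class_weight* parameter of *LogisticRegression()* in the *scikit-learn* module accepts either a dictionary or the string 'balanced'. If it is left at its default of None, every class gets a weight of one. If you want each class to contribute equally to the fit despite the imbalance, so that the rarer class receives a larger weight, you can use *class_weight='balanced'*.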

>>> from sklearn.linear_model import LogisticRegression

>>> clf = LogisticRegression(random_state=0, class_weight='balanced')

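When 'balanced' is used, the class weights are set inversely proportional to the class frequencies in the input data, computed as n_samples / (n_classes * np.bincount(y)). The less frequent a class is, the larger its weight.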

Here is an example to show how the weight is computed when 'balanced' is used.

>>> import numpy as np
>>> y = np.asarray([1,0,0,1,1,0,0,0,0,0,1])
>>> y
array([1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1])
>>> n_samples = len(y)
>>> n_classes = 2
>>> n_samples / (n_classes * np.bincount(y))
array([0.78571429, 1.375     ])
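The manual calculation can be cross-checked against scikit-learn's own helper, *compute_class_weight* from *sklearn.utils.class_weight*. The following is a minimal sketch, assuming a reasonably recent scikit-learn version in which the arguments are passed by keyword; the variable names are chosen only for illustration.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Same label vector as in the example: 7 samples of class 0, 4 of class 1.
y = np.asarray([1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1])
classes = np.unique(y)

# Weights that class_weight='balanced' derives from these labels.
balanced_weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)

# The manual formula, n_samples / (n_classes * np.bincount(y)), for comparison.
manual_weights = len(y) / (len(classes) * np.bincount(y))

print(dict(zip(classes.tolist(), balanced_weights.tolist())))
print(np.allclose(balanced_weights, manual_weights))
```

The minority class (1) ends up with the larger weight, which is what "inversely proportional to frequency" means in practice.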

If you want to assign different weights to the classes manually, you can pass a dictionary that maps each class label to its weight.

>>> clf = LogisticRegression(random_state=0, class_weight={0:2,1:3})

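In the above example, class 0 gets a weight of 2 and class 1 gets a weight of 3, so the ratio of the weights of class 0 to class 1 is 2:3. In other words, an error on a class 1 sample is penalized 1.5 times as heavily as an error on a class 0 sample.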
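To get a feel for what the different settings do, the three options can be compared on a toy problem. This is only a sketch under stated assumptions: the data is synthetic (generated with *make_classification* at roughly a 90/10 class split), *max_iter=1000* is set just to avoid convergence warnings, and the models are scored on their own training data for simplicity. None of these choices come from the original question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic imbalanced data (assumption): about 90% class 0 and 10% class 1.
X, y = make_classification(
    n_samples=1000, n_features=5, weights=[0.9, 0.1], random_state=0
)

# The three class_weight settings discussed in this answer.
settings = {"none": None, "balanced": "balanced", "manual": {0: 2, 1: 3}}

n_predicted_positive = {}
for name, class_weight in settings.items():
    clf = LogisticRegression(random_state=0, class_weight=class_weight, max_iter=1000)
    clf.fit(X, y)
    # Count how many samples the fitted model assigns to the minority class.
    n_predicted_positive[name] = int((clf.predict(X) == 1).sum())

print("actual positives:   ", int((y == 1).sum()))
print("predicted positives:", n_predicted_positive)
```

Giving the minority class a larger weight generally pushes the model to predict that class more often, usually trading some precision for recall. Which trade-off is right depends on your problem, so it is worth checking with a proper train/test split and a metric such as recall or F1.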