Feature-based Measures

`f1`(X, y)	Calculates the Maximum Fisher's discriminant ratio (F1) metric.
`f1v`(X, y)	Calculates the Directional vector maximum Fisher's discriminant ratio (F1v) metric.
`f2`(X, y)	Calculates the Volume of overlapping region (F2) metric.
`f3`(X, y)	Calculates the Maximum individual feature efficiency (F3) metric.
`f4`(X, y)	Calculates the Collective feature efficiency (F4) metric.

problexity.classification.f1(X, y)

Calculates the Maximum Fisher’s discriminant ratio (F1) metric.

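The measure describes the overlap between the feature values of the two classes. In the formula below, r_{f_i} is the discriminant ratio of the i-th feature and m is the number of features.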

\[F1=\frac{1}{1+\max_{i=1}^{m}r_{f_{i}}}\]

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels of binary classification task ([0,1])

Return type:

float

Returns:

F1 score
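
As a rough illustration of the F1 formula, the sketch below computes a per-feature discriminant ratio and plugs the largest one into 1 / (1 + max r). The helper names `fisher_ratio` and `f1_sketch` are hypothetical, and the ratio used here (between-class scatter divided by within-class scatter) is an assumed, commonly used definition; it is not taken from the problexity source and may differ from what `problexity.classification.f1` does internally.

```python
def fisher_ratio(values, labels):
    """Assumed discriminant ratio of one feature: between-class / within-class scatter."""
    overall_mean = sum(values) / len(values)
    between, within = 0.0, 0.0
    for c in set(labels):
        group = [v for v, l in zip(values, labels) if l == c]
        mean_c = sum(group) / len(group)
        between += len(group) * (mean_c - overall_mean) ** 2
        within += sum((v - mean_c) ** 2 for v in group)
    return between / within  # assumes the within-class scatter is non-zero


def f1_sketch(X, y):
    """Hypothetical helper: 1 / (1 + largest per-feature discriminant ratio)."""
    columns = list(zip(*X))  # one tuple of values per feature
    ratios = [fisher_ratio(col, y) for col in columns]
    return 1.0 / (1.0 + max(ratios))


X = [[0.0, 1.0], [1.0, 2.0], [10.0, 1.5], [11.0, 2.5]]
y = [0, 0, 1, 1]
print(f1_sketch(X, y))
```

In this toy data the first feature separates the classes well, so its ratio dominates the maximum and the resulting value is close to 0.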

problexity.classification.f1v(X, y)

Calculates the Directional vector maximum Fisher’s discriminant ratio (F1v) metric.

\[F1v=\frac{1}{1+dF}\]
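
The formula above does not define dF. Assuming the usual formulation of this measure in the complexity-measure literature (an assumption, not confirmed by this page), dF is the directional Fisher criterion:

\[dF=\frac{d^{T}Bd}{d^{T}Wd},\qquad d=W^{-1}(\mu_{c_1}-\mu_{c_2})\]

\[B=(\mu_{c_1}-\mu_{c_2})(\mu_{c_1}-\mu_{c_2})^{T},\qquad W=p_{c_1}\Sigma_{c_1}+p_{c_2}\Sigma_{c_2}\]

Here \mu_{c_j}, \Sigma_{c_j} and p_{c_j} stand for the mean vector, covariance matrix and proportion of samples of class c_j, and W^{-1} is typically taken as a pseudo-inverse.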

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels of binary classification task ([0,1])

Return type:

float

Returns:

F1v score

problexity.classification.f2(X, y)

Calculates the Volume of overlapping region (F2) metric.

The measure describes the overlap of the feature values within the classes. It is determined from the minimum and maximum values of each feature in each class. The overlap is then calculated per feature and normalized by the range of values in each class.

\[F2=\prod_{i=1}^{m}{\frac{\mathrm{overlap}(f_i)}{\mathrm{range}(f_i)}}\]

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels of binary classification task ([0,1])

Return type:

float

Returns:

F2 score
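
The F2 product can be sketched as below. The helper name `f2_sketch` is hypothetical, and the concrete definitions of overlap (the interval shared by both classes' min/max bounds) and range (the span covered by both classes together) are assumptions based on the common formulation of this measure; they are not read from the problexity source.

```python
def f2_sketch(X, y):
    """Hypothetical helper: product over features of overlap / range (binary task)."""
    classes = sorted(set(y))
    if len(classes) != 2:
        raise ValueError("this sketch expects exactly two classes")
    product = 1.0
    for col in zip(*X):  # iterate feature by feature
        a = [v for v, l in zip(col, y) if l == classes[0]]
        b = [v for v, l in zip(col, y) if l == classes[1]]
        # Assumed overlap: length of the interval covered by both classes (0 if disjoint).
        overlap = max(0.0, min(max(a), max(b)) - max(min(a), min(b)))
        # Assumed range: span of the feature across both classes (must be non-zero).
        value_range = max(max(a), max(b)) - min(min(a), min(b))
        product *= overlap / value_range
    return product


X = [[0, 0], [4, 2], [2, 1], [6, 3]]
y = [0, 0, 1, 1]
print(f2_sketch(X, y))
```

Because the per-feature ratios are multiplied, a single feature with no overlap drives the whole product to 0.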

problexity.classification.f3(X, y)

Calculates the Maximum individual feature efficiency (F3) metric.

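The measure describes the efficiency of each individual feature in separating the classes. It is determined by the most efficient feature, that is, the one with the smallest fraction n_o(f_i)/n of samples in its overlapping region, which is why the formula takes a minimum.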

\[F3=\min^{m}_{i=1}{\frac{n_o(f_i)}{n}}\]

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels of binary classification task ([0,1])

Return type:

float

Returns:

F3 score
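
A minimal sketch of the F3 formula follows. The helper names `overlap_count` and `f3_sketch` are hypothetical, and counting n_o(f_i) as the number of samples whose value falls inside the interval shared by both classes is an assumption; the library may treat boundaries differently.

```python
def overlap_count(col, y):
    """Assumed n_o(f): samples whose value lies in the interval shared by both classes."""
    classes = sorted(set(y))
    a = [v for v, l in zip(col, y) if l == classes[0]]
    b = [v for v, l in zip(col, y) if l == classes[1]]
    low = max(min(a), min(b))
    high = min(max(a), max(b))
    # If low > high the classes do not overlap on this feature and the count is 0.
    return sum(1 for v in col if low <= v <= high)


def f3_sketch(X, y):
    """Hypothetical helper: smallest overlapping fraction over all features."""
    n = len(X)
    return min(overlap_count(col, y) for col in zip(*X)) / n


X = [[1, 1], [2, 2], [5, 3], [4, 2], [6, 3], [7, 4]]
y = [0, 0, 0, 1, 1, 1]
print(f3_sketch(X, y))
```

Here the first feature leaves two samples in the overlapping region and the second leaves four, so the first feature determines the result.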

problexity.classification.f4(X, y)

Calculates the Collective feature efficiency (F4) metric.

The measure gives an overview of how the features work together. In each round, the most discriminant feature that has not been used yet is selected, and the instances it separates are excluded from further analysis. The process continues until all instances are classified or all features are used. The measure is the number of instances remaining in the overlapping region after the last round, divided by the total number of samples.

\[F4=\frac{n_o(f_{\min}(T_l))}{n}\]

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels of binary classification task ([0,1])

Return type:

float

Returns:

F4 score
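
The iterative procedure described for F4 can be sketched as follows. The helper names `overlap_indices` and `f4_sketch` are hypothetical. The sketch assumes that "most discriminant" means the unused feature leaving the fewest samples in its overlapping region, and that a remaining set containing a single class counts as fully separated; neither detail is confirmed by this page.

```python
def overlap_indices(X, y, rows, feature):
    """Row indices (among `rows`) whose value for `feature` lies where both classes overlap."""
    classes = sorted(set(y[r] for r in rows))
    if len(classes) < 2:
        return []  # assumption: a single remaining class counts as separated
    a = [X[r][feature] for r in rows if y[r] == classes[0]]
    b = [X[r][feature] for r in rows if y[r] == classes[1]]
    low, high = max(min(a), min(b)), min(max(a), max(b))
    return [r for r in rows if low <= X[r][feature] <= high]


def f4_sketch(X, y):
    """Hypothetical helper: fraction of samples still overlapping after all rounds."""
    rows = list(range(len(X)))
    unused = set(range(len(X[0])))
    while rows and unused:
        # Overlapping rows for every unused feature, restricted to the remaining samples.
        candidates = {f: overlap_indices(X, y, rows, f) for f in sorted(unused)}
        # Assumed selection rule: the feature that leaves the fewest overlapping samples.
        best = min(candidates, key=lambda f: len(candidates[f]))
        rows = candidates[best]  # only still-overlapping samples go to the next round
        unused.remove(best)
    return len(rows) / len(X)


X = [[1, 1], [2, 2], [5, 3], [4, 2], [6, 3], [7, 4]]
y = [0, 0, 0, 1, 1, 1]
print(f4_sketch(X, y))
```

In this example the first feature leaves two overlapping samples, and the second feature then separates those two, so no samples remain in the overlapping region.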