Complexity Calculator

ComplexityCalculator([metrics, colors, ...])

Complexity Calculator Class.

class problexity.ComplexityCalculator(metrics=None, colors=None, ranges=None, weights=None, mode='classification', multiclass_strategy='ovo')

Bases: object

Computes all or a selected subset of complexity metrics for a given dataset. The results are available as a plain vector of metric values, as a dictionary report that also contains the dataset parameters, and as a radar visualisation.

Parameters:

metrics (list, optional (default=all the metrics available in problexity)) – List of classification complexity measures computed for the given dataset.
mode (string, optional (default=classification)) – Recognition task for which the metrics are calculated. Either classification or regression.
multiclass_strategy (string, optional (default=ovo)) – Strategy used to integrate metrics over multiple classes. Either ova or ovo.
ranges (dict, optional (default=the six default groups of metrics)) – Configuration of the radar visualisation that groups metrics, so that each group can be drawn in its own color.
colors (list, optional (default=six-color palette)) – List of colors assigned to the groups on the radar visualisation.
weights (list, optional (default=1 for every measure whose value is smaller for simpler problems, -1 otherwise)) – List of weights taken into account by the score() procedure.
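
To make the two multiclass_strategy options concrete, the sketch below enumerates the binary subproblems under the usual readings of the abbreviations: one-vs-all (ova) and one-vs-one (ovo). The helper names are hypothetical, and how problexity combines the per-subproblem values into a single metric is not stated here, so that step is left out.

```python
from itertools import combinations


def ova_subproblems(classes):
    """One-vs-all: each class against all remaining classes (hypothetical helper)."""
    return [(c, [o for o in classes if o != c]) for c in classes]


def ovo_subproblems(classes):
    """One-vs-one: every unordered pair of classes (hypothetical helper)."""
    return list(combinations(classes, 2))


classes = [0, 1, 2, 3]
ova = ova_subproblems(classes)
ovo = ovo_subproblems(classes)

# Number of binary subproblems each strategy produces for four classes.
print(len(ova), len(ovo))
```

With k classes, ova yields k subproblems while ovo yields k(k-1)/2, so the number of subproblems under ovo grows quadratically with the number of classes.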

Variables:

complexity (list) – The values computed for each measure listed in metrics.
n_samples (int) – The number of samples in the fitted dataset.
n_features (int) – The number of features of the fitted dataset.
n_classes (int) – The number of classes in the fitted dataset.
classes (array-like, shape (n_classes, )) – The class labels.
prior_probability (array-like, shape (n_classes, )) – The prior probability of classes.
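
As an illustration of what these attributes describe, the sketch below derives them from a label vector using only the standard library. It assumes that the prior probability of a class is its relative frequency in y, which agrees with the balanced two-class example below but is not confirmed as the library's implementation.

```python
from collections import Counter

# Toy label vector: three classes with 2, 3 and 1 samples.
y = [0, 0, 1, 1, 1, 2]

counts = Counter(y)
classes = sorted(counts)          # the distinct class labels
n_samples = len(y)
n_classes = len(classes)

# Assumption: prior probability = relative class frequency.
prior_probability = [counts[c] / n_samples for c in classes]

print(classes, n_classes, prior_probability)
```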

Examples:

>>> from problexity import ComplexityCalculator
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification()
>>> cc = ComplexityCalculator().fit(X, y)
>>> print(cc.complexity)
[0.3158144010174404, 0.1508882806154997, 0.005974480517635054, 0.57, 0.0, 
 0.10518058962953956, 0.1, 0.07, 0.135, 0.48305940839428635, 0.27, 0.11, 
 1.0, 0.9642, 0.9892929292929293, 0.9321428571428572, 0.9297111755529109, 
 0.2, 0.16, 0.8, 0.0, 0.0]
>>> report = cc.report()
>>> print(report)
{
    'n_samples': 100, 'n_features': 20, 'n_classes': 2, 
    'classes': array([0, 1]), 
    'prior_probability': array([0.5, 0.5]), 
    'score': 0.377, 
    'complexities': 
    {
        'f1': 0.316, 'f1v': 0.151, 'f2': 0.006, 'f3': 0.57, 'f4': 0.0, 
        'l1': 0.105, 'l2': 0.1, 'l3': 0.07, 
        'n1': 0.135, 'n2': 0.483, 'n3': 0.27, 'n4': 0.11, 't1': 1.0, 'lsc': 0.964, 
        'density': 0.989, 'clsCoef': 0.932, 'hubs': 0.93, 
        't2': 0.2, 't3': 0.16, 't4': 0.8, 'c1': 0.0, 'c2': 0.0
    }
}

fit(X, y)

Calculates the metrics for the given dataset.

Parameters:

X (array-like, shape (n_samples, n_features)) – The input samples.
y (array-like, shape (n_samples, )) – The input labels. For classification, the labels may describe a binary or a multi-class problem.

Return type:

ComplexityCalculator class object

Returns:

The fitted ComplexityCalculator object, which allows calls to be chained as in ComplexityCalculator().fit(X, y).

plot(figure, spec=(1, 1, 1))

Draws the radar visualisation of problem complexity on the given figure.

Parameters:

figure (matplotlib figure object) – Figure to draw radar on.
spec (tuple, optional (default=(1,1,1))) – Matplotlib subplot location.

Return type:

object

Returns:

Matplotlib axis object.
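
The spec tuple has the shape of the (nrows, ncols, index) triple that matplotlib's Figure.add_subplot accepts. The sketch below uses plain matplotlib to place a polar axis at such a location and draw a few values taken from the example report; it assumes that plot() interprets spec in the same way and does not reproduce the library's actual radar styling.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# A few values copied from the example report, one per metric.
labels = ["f1", "l1", "n1", "lsc", "t2", "c1"]
values = [0.316, 0.105, 0.135, 0.964, 0.2, 0.0]

fig = plt.figure(figsize=(8, 4))
spec = (1, 2, 2)  # 1 row, 2 columns, second cell

# Assumption: a spec tuple maps directly onto add_subplot's arguments.
ax = fig.add_subplot(*spec, projection="polar")

step = 2 * math.pi / len(values)
angles = [i * step for i in range(len(values))]
ax.bar(angles, values, width=step)
ax.set_xticks(angles)
ax.set_xticklabels(labels)
ax.set_ylim(0, 1)
```

A spec of (1, 2, 2) leaves the left half of the figure free, for example for a scatter plot of the dataset next to the radar.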

report(precision=3)

Returns a report of problem complexity.

Parameters:

precision (int, optional (default=3)) – The rounding precision.

Return type:

dict

Returns:

Dictionary with the complexity report.

Examples:

>>> from problexity import ComplexityCalculator
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification()
>>> cc = ComplexityCalculator().fit(X, y)
>>> report = cc.report()
>>> print(report)
{
    'n_samples': 100, 'n_features': 20, 'n_classes': 2, 
    'classes': array([0, 1]), 
    'prior_probability': array([0.5, 0.5]), 
    'score': 0.377, 
    'complexities': 
    {
        'f1': 0.316, 'f1v': 0.151, 'f2': 0.006, 'f3': 0.57, 'f4': 0.0, 
        'l1': 0.105, 'l2': 0.1, 'l3': 0.07, 
        'n1': 0.135, 'n2': 0.483, 'n3': 0.27, 'n4': 0.11, 't1': 1.0, 'lsc': 0.964, 
        'density': 0.989, 'clsCoef': 0.932, 'hubs': 0.93, 
        't2': 0.2, 't3': 0.16, 't4': 0.8, 'c1': 0.0, 'c2': 0.0
    }
}
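
The effect of precision can be checked against the numbers shown in this documentation: rounding the raw complexity vector from the class-level example to three decimal places reproduces the 'complexities' entry of the report. The sketch below performs only that rounding; pairing names with values by position is an assumption based on the order in which both are printed.

```python
# Metric names in the order they appear in the example report.
names = ["f1", "f1v", "f2", "f3", "f4", "l1", "l2", "l3",
         "n1", "n2", "n3", "n4", "t1", "lsc",
         "density", "clsCoef", "hubs", "t2", "t3", "t4", "c1", "c2"]

# Raw values printed by cc.complexity in the class-level example.
complexity = [0.3158144010174404, 0.1508882806154997, 0.005974480517635054,
              0.57, 0.0, 0.10518058962953956, 0.1, 0.07, 0.135,
              0.48305940839428635, 0.27, 0.11, 1.0, 0.9642,
              0.9892929292929293, 0.9321428571428572, 0.9297111755529109,
              0.2, 0.16, 0.8, 0.0, 0.0]

precision = 3
rounded = {name: round(value, precision)
           for name, value in zip(names, complexity)}

print(rounded)
```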

score()

Returns the integrated score of problem complexity.

Parameters:

weights (array-like, optional (default=None), shape (n_metrics)) – Optional weights of metrics.

Return type:

float

Returns:

Single score for the integrated metrics.
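
As a sanity check on the example report, the reported score of 0.377 coincides with the plain arithmetic mean of the 22 listed complexity values. The sketch below reproduces only that arithmetic; whether score() computes a mean internally, and how non-default weights enter the result, are assumptions that this documentation does not confirm.

```python
from statistics import fmean

# Rounded complexity values copied from the example report.
complexities = [0.316, 0.151, 0.006, 0.57, 0.0,
                0.105, 0.1, 0.07,
                0.135, 0.483, 0.27, 0.11, 1.0, 0.964,
                0.989, 0.932, 0.93,
                0.2, 0.16, 0.8, 0.0, 0.0]

# Unweighted mean of all reported values, rounded like the report.
mean_score = round(fmean(complexities), 3)

print(mean_score)
```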