Complexity Calculator
- class problexity.ComplexityCalculator(metrics=None, colors=None, ranges=None, weights=None, mode='classification', multiclass_strategy='ovo')
Bases:
object
Complexity Calculator Class.
A class that computes all or selected complexity metrics for a given data set. The results can be returned as a simple vector of metrics or as a dictionary containing all set parameters, and they can be visualised in the form of a radar.
- Parameters:
metrics (list, optional (default=all the metrics available in problexity)) – List of classification complexity measures used to validate a given set.
mode (string, optional (default=classification)) – Recognition task for which metrics should be calculated. Either classification or regression.
multiclass_strategy (string, optional (default=ova)) – Strategy used for multiclass metric integration. Either ova or ovo.
ranges (dict, optional (default=all the default six groups of metrics)) – Configuration of the radar visualisation, allowing metrics to be grouped by color.
colors (list, optional (default=six-color palette)) – List of colors assigned to groups on radar visualisation.
weights (list, optional (default=list of weights, where the weight is 1 for every measure where simpler problems have smaller values, otherwise -1)) – List of weights taken into account in the score() procedure.
- Variables:
complexity (list) – The list of all the scores acquired with metrics defined by metrics list.
n_samples (int) – The number of samples in the fitted dataset.
n_features (int) – The number of features of the fitted dataset.
n_classes (int) – The number of classes in the fitted dataset.
classes (array-like, shape (n_classes, )) – The class labels.
prior_probability (array-like, shape (n_classes, )) – The prior probability of classes.
- Examples:
>>> from problexity import ComplexityCalculator
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification()
>>> cc = ComplexityCalculator().fit(X, y)
>>> print(cc.complexity)
[0.3158144010174404, 0.1508882806154997, 0.005974480517635054, 0.57, 0.0, 0.10518058962953956, 0.1, 0.07, 0.135, 0.48305940839428635, 0.27, 0.11, 1.0, 0.9642, 0.9892929292929293, 0.9321428571428572, 0.9297111755529109, 0.2, 0.16, 0.8, 0.0, 0.0]
>>> report = cc.report()
>>> print(report)
{
    'n_samples': 100,
    'n_features': 20,
    'n_classes': 2,
    'classes': array([0, 1]),
    'prior_probability': array([0.5, 0.5]),
    'score': 0.377,
    'complexities': {
        'f1': 0.316, 'f1v': 0.151, 'f2': 0.006, 'f3': 0.57, 'f4': 0.0,
        'l1': 0.105, 'l2': 0.1, 'l3': 0.07,
        'n1': 0.135, 'n2': 0.483, 'n3': 0.27, 'n4': 0.11,
        't1': 1.0, 'lsc': 0.964, 'density': 0.989, 'clsCoef': 0.932, 'hubs': 0.93,
        't2': 0.2, 't3': 0.16, 't4': 0.8,
        'c1': 0.0, 'c2': 0.0
    }
}
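The multiclass_strategy parameter names two common ways of reducing a multiclass problem to binary ones: ova (one-vs-all) and ovo (one-vs-one). The sketch below only enumerates the binary subproblems that each strategy would produce. The helper binary_subproblems is hypothetical and not part of problexity, and how the library combines the per-subproblem metric values is not covered here.

```python
from itertools import combinations


def binary_subproblems(classes, strategy="ova"):
    """Enumerate binary subproblems as (positive, negative) label tuples.

    Hypothetical helper for illustration only; not part of problexity.
    """
    classes = list(classes)
    if strategy == "ova":
        # One-vs-all: each class against the union of the remaining classes.
        return [((c,), tuple(o for o in classes if o != c)) for c in classes]
    if strategy == "ovo":
        # One-vs-one: every unordered pair of classes.
        return [((a,), (b,)) for a, b in combinations(classes, 2)]
    raise ValueError(f"unknown strategy: {strategy!r}")


for strategy in ("ova", "ovo"):
    print(strategy, binary_subproblems([0, 1, 2, 3], strategy))
```

With k classes, ova yields k subproblems while ovo yields k(k-1)/2, so the two strategies can differ noticeably in cost as the number of classes grows.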
- fit(X, y)
Calculates metrics for given dataset.
- Parameters:
X (array-like, shape (n_samples, n_features)) – The training input samples.
y (array-like, shape (n_samples, )) – The training input labels.
- Return type:
ComplexityCalculator class object
- Returns:
ComplexityCalculator class object.
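Because fit returns the calculator itself, construction and fitting can be chained, as in ComplexityCalculator().fit(X, y). The toy class below is a hypothetical stand-in, not problexity's implementation: it fills in only the dataset-level attributes listed under Variables and shows the return-self pattern, assuming the prior probability of a class is its share of the samples.

```python
from collections import Counter


class DatasetSummary:
    """Toy stand-in (hypothetical) that mimics the fitted attributes only."""

    def fit(self, X, y):
        self.n_samples = len(X)
        self.n_features = len(X[0])
        counts = Counter(y)
        self.classes = sorted(counts)
        self.n_classes = len(self.classes)
        # Assumed definition: prior probability = class share of the samples.
        self.prior_probability = [counts[c] / self.n_samples for c in self.classes]
        # Returning self is what makes DatasetSummary().fit(X, y) chainable.
        return self


X = [[0.1, 1.0], [0.4, 0.8], [0.9, 0.2], [0.7, 0.3]]
y = [0, 0, 1, 0]
summary = DatasetSummary().fit(X, y)
print(summary.n_samples, summary.n_features, summary.n_classes)
print(summary.classes, summary.prior_probability)
```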
- plot(figure, spec=(1, 1, 1))
Draws the radar visualisation of problem complexity on the given figure.
- report(precision=3)
Returns report of problem complexity
- Parameters:
precision (int, optional (default=3)) – The rounding precision.
- Return type:
dict
- Returns:
Dictionary with complexity report.
- Examples:
>>> from problexity import ComplexityCalculator
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification()
>>> cc = ComplexityCalculator().fit(X, y)
>>> report = cc.report()
>>> print(report)
{
    'n_samples': 100,
    'n_features': 20,
    'n_classes': 2,
    'classes': array([0, 1]),
    'prior_probability': array([0.5, 0.5]),
    'score': 0.377,
    'complexities': {
        'f1': 0.316, 'f1v': 0.151, 'f2': 0.006, 'f3': 0.57, 'f4': 0.0,
        'l1': 0.105, 'l2': 0.1, 'l3': 0.07,
        'n1': 0.135, 'n2': 0.483, 'n3': 0.27, 'n4': 0.11,
        't1': 1.0, 'lsc': 0.964, 'density': 0.989, 'clsCoef': 0.932, 'hubs': 0.93,
        't2': 0.2, 't3': 0.16, 't4': 0.8,
        'c1': 0.0, 'c2': 0.0
    }
}
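The effect of the precision parameter can be illustrated without the library. The sketch below assumes the report simply rounds each raw complexity value to the requested number of decimal places; build_report is a hypothetical helper, and the input values are taken from the cc.complexity output shown in the class example.

```python
def build_report(names, values, precision=3):
    """Hypothetical helper: pair metric names with rounded values."""
    return {name: round(value, precision) for name, value in zip(names, values)}


# Raw values copied from the documented cc.complexity output.
names = ["f1", "f1v", "f2", "hubs"]
values = [
    0.3158144010174404,
    0.1508882806154997,
    0.005974480517635054,
    0.9297111755529109,
]

print(build_report(names, values))               # default precision of 3
print(build_report(names, values, precision=1))  # coarser rounding
```

At the default precision the rounded values match the corresponding entries of the documented report ('f1': 0.316, 'f1v': 0.151, 'f2': 0.006, 'hubs': 0.93).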