Smoothness Measures
problexity.regression.s1 | Calculates the output distribution (S1) measure.
problexity.regression.s2 | Calculates the input distribution (S2) measure.
problexity.regression.s3 | Calculates the error of nearest neighbor regressor (S3) measure.
- problexity.regression.s1(X, y, normalize=True)
Calculates the output distribution (S1) measure.
Calculates complexity from the similarity of instances that are adjacent in the minimum spanning tree (MST) of the data. Returns the average absolute difference between the labels (y) of samples connected by an MST edge. By default, 0-1 interval normalization is performed.
\[S1=\frac{1}{n}\sum_{(i,j) \in MST}|y_i - y_j|\]
- Parameters:
X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels
- Return type:
- Returns:
S1 score
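The S1 formula above can be sketched in plain Python to make the computation concrete. This is an illustrative sketch, not problexity's implementation: the name `s1_sketch` is hypothetical, the MST is built with Prim's algorithm over Euclidean distances (an assumption), and the 0-1 normalization step is omitted, so values may differ from what `s1` returns.

```python
import math


def s1_sketch(X, y):
    """Sum of |y_i - y_j| over MST edges, divided by n (no normalization).

    Hypothetical helper: X is a list of feature vectors, y a list of targets.
    """
    n = len(X)
    if n < 2:
        return 0.0

    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known edge linking each node to the tree
    parent = [-1] * n           # other endpoint of that cheapest edge
    best_dist[0] = 0.0          # start Prim's algorithm from sample 0
    total = 0.0

    for _ in range(n):
        # Take the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if parent[u] != -1:
            # (parent[u], u) is an MST edge: accumulate the label difference.
            total += abs(y[u] - y[parent[u]])
        # Update the cheapest connecting edge for the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(X[u], X[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    parent[v] = u

    # The formula divides by n, although the MST has n - 1 edges.
    return total / n
```

For four one-dimensional samples at 0, 1, 2 and 10 with targets 0, 1, 3 and 3, the MST is the chain 0-1-2-10, the label differences are 1, 2 and 0, and the sketch returns 3/4.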
- problexity.regression.s2(X, y, normalize=True)
Calculates the input distribution (S2) measure.
Calculates complexity from the similarity of the features (X) of instances with close output values (y). Returns the average Euclidean norm of the difference between the input values of samples that are neighbours after sorting by output value. By default, 0-1 interval normalization is performed.
\[S2=\frac{1}{n}\sum_{i=2}^{n}||x_i-x_{i-1}||_2\]
- Parameters:
X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels
- Return type:
- Returns:
S2 score
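A minimal sketch of the S2 formula follows, assuming the samples are ordered by ascending output value before consecutive inputs are compared. The name `s2_sketch` is hypothetical, and the 0-1 normalization is left out, so results are not guaranteed to match `s2`.

```python
import math


def s2_sketch(X, y):
    """Sum of Euclidean distances between inputs of output-sorted neighbours,
    divided by n (no normalization).

    Hypothetical helper: X is a list of feature vectors, y a list of targets.
    """
    n = len(X)
    if n < 2:
        return 0.0

    # Indices of the samples ordered by ascending output value.
    order = sorted(range(n), key=lambda i: y[i])

    # Distance between each sample and its predecessor in that ordering.
    total = sum(math.dist(X[order[k]], X[order[k - 1]]) for k in range(1, n))

    # The formula divides by n, although only n - 1 distances are summed.
    return total / n
```

With inputs (0, 0), (3, 4), (3, 0) and targets 2.0, 0.5, 1.0, the sorted order is (3, 4), (3, 0), (0, 0); the two distances are 4 and 3, giving 7/3.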
- problexity.regression.s3(X, y, normalize=True)
Calculates the error of nearest neighbor regressor (S3) measure.
Returns the mean squared error of a 1-nearest neighbor regressor, estimated with a leave-one-out procedure. By default, the data is normalized with 0-1 interval normalization.
\[S3=\frac{1}{n}\sum_{i=1}^{n}(NN(x_i)-y_i)^2\]
- Parameters:
X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels
- Return type:
- Returns:
S3 score
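The leave-one-out procedure behind S3 can be sketched as below. This is an illustration under stated assumptions rather than the library's code: `s3_sketch` is a hypothetical name, nearest neighbours are found by brute force with Euclidean distance, ties go to the lowest index, and normalization is skipped.

```python
import math


def s3_sketch(X, y):
    """Leave-one-out mean squared error of a 1-nearest-neighbour regressor
    (no normalization).

    Hypothetical helper: X is a list of feature vectors, y a list of targets.
    """
    n = len(X)
    if n < 2:
        raise ValueError("leave-one-out needs at least two samples")

    squared_error = 0.0
    for i in range(n):
        # NN(x_i): target of the closest sample, excluding sample i itself.
        nearest = min((j for j in range(n) if j != i),
                      key=lambda j: math.dist(X[i], X[j]))
        squared_error += (y[nearest] - y[i]) ** 2

    return squared_error / n
```

For inputs 0, 1, 3 with targets 0, 2, 2, the held-out predictions are 2, 0 and 2, so the squared errors are 4, 4 and 0 and the sketch returns 8/3.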