Geometry Measures

`l3`(X, y[, normalize])	Calculates the non-linearity of a linear regressor (L3) measure.
`s4`(X, y[, normalize])	Calculates the non-linearity of a nearest neighbor regressor (S4) measure.
`t2`(X, y)	Calculates the average number of examples per dimension (T2) measure.

problexity.regression.l3(X, y, normalize=True)

Calculates the non-linearity of a linear regressor (L3) measure.

Linearly interpolates both the input (X) and output (y) values of each pair of samples with similar output values, generating l = n - 1 synthetic samples (x'_i, y'_i) from the n original ones. It then measures the mean squared error of a linear regressor f, fitted on the original data and evaluated on the synthetic points. Samples are normalized by default.

\[L3=\frac{1}{l}\sum_{i=1}^{l}(f(x'_i) - y'_i)^2\]

Parameters:

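X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples,)) – Target values for the regression task
normalize (bool, optional) – Whether to normalize the samples before computing the measure. Defaults to True.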

Return type:

float

Returns:

L3 score
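
The procedure described above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the names `fit_line` and `l3_sketch` are hypothetical, the data is restricted to a single feature so that ordinary least squares has a closed form, normalization is skipped, and every synthetic point is taken with a fixed interpolation weight `alpha` (an assumption made only to keep the result deterministic).

```python
def fit_line(x, y):
    """Closed-form ordinary least squares for one feature (hypothetical helper).

    Assumes x contains at least two distinct values.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def l3_sketch(x, y, alpha=0.5):
    """Illustrative L3: MSE of a line fitted on (x, y), scored on synthetic points.

    Assumptions (not taken from the library): single feature, no
    normalization, and a fixed interpolation weight alpha.
    """
    slope, intercept = fit_line(x, y)

    # Pair each sample with its neighbour in sorted-output order,
    # which yields l = n - 1 pairs with similar output values.
    order = sorted(range(len(y)), key=lambda i: y[i])

    squared_errors = []
    for a, b in zip(order, order[1:]):
        # Interpolate input and output with the same weight.
        x_new = alpha * x[a] + (1 - alpha) * x[b]
        y_new = alpha * y[a] + (1 - alpha) * y[b]
        prediction = slope * x_new + intercept
        squared_errors.append((prediction - y_new) ** 2)

    return sum(squared_errors) / len(squared_errors)


linear_score = l3_sketch([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
quadratic_score = l3_sketch([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 4.0, 9.0])
```

On exactly linear data the synthetic points fall on the fitted line and the score is zero, while the quadratic example gives a positive score. With the library itself, the call would follow the signature documented above, `problexity.regression.l3(X, y)`.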

problexity.regression.s4(X, y, normalize=True)

Calculates the non-linearity of a nearest neighbor regressor (S4) measure.

Linearly interpolates both the input (X) and output (y) values of each pair of samples with similar output values, generating l = n - 1 synthetic samples (x'_i, y'_i) from the n original ones. It then measures the mean squared error of a nearest neighbor regressor NN, fitted on the original data and evaluated on the synthetic points. Samples are normalized by default.

\[S4=\frac{1}{l}\sum_{i=1}^{l}(NN(x'_i) - y'_i)^2\]

Parameters:

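X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples,)) – Target values for the regression task
normalize (bool, optional) – Whether to normalize the samples before computing the measure. Defaults to True.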

Return type:

float

Returns:

S4 score
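
A minimal pure-Python sketch of the same idea follows, for illustration only. The names `nearest_neighbor_predict` and `s4_sketch` are hypothetical, and the sketch assumes a 1-nearest-neighbor regressor with Euclidean distance, no normalization, and a fixed interpolation weight `alpha`. None of these choices is confirmed by the documentation above.

```python
import math


def nearest_neighbor_predict(X_train, y_train, point):
    """Return the target of the training sample closest to `point`.

    Hypothetical helper: 1-NN with Euclidean distance; on ties the
    sample with the lowest index wins.
    """
    best = min(range(len(X_train)), key=lambda i: math.dist(X_train[i], point))
    return y_train[best]


def s4_sketch(X, y, alpha=0.5):
    """Illustrative S4: MSE of a 1-NN regressor on interpolated samples.

    Assumptions (not taken from the library): k = 1, no normalization,
    and a fixed interpolation weight alpha.
    """
    # Neighbours in sorted-output order give l = n - 1 pairs.
    order = sorted(range(len(y)), key=lambda i: y[i])

    squared_errors = []
    for a, b in zip(order, order[1:]):
        # Interpolate every feature and the output with the same weight.
        x_new = [alpha * u + (1 - alpha) * v for u, v in zip(X[a], X[b])]
        y_new = alpha * y[a] + (1 - alpha) * y[b]
        prediction = nearest_neighbor_predict(X, y, x_new)
        squared_errors.append((prediction - y_new) ** 2)

    return sum(squared_errors) / len(squared_errors)


s4_score = s4_sketch([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])
```

Each synthetic point here is the midpoint of two training samples whose outputs differ by 1, so the nearest neighbor is always off by 0.5 and the score is 0.25. With the library itself, the call would follow the signature documented above, `problexity.regression.s4(X, y)`.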

problexity.regression.t2(X, y)

Calculates the average number of examples per dimension (T2) measure.

Returns the ratio of the number of samples n to the number of features d. Higher values indicate simpler problems.

\[T2=\frac{n}{d}\]

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples,)) – Target values for the regression task

Return type:

float

Returns:

T2 score
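
As a quick worked example of the formula, the ratio can be computed directly from the shape of the dataset. The helper name `t2_sketch` is hypothetical and the sketch assumes X is a non-empty list of equal-length rows.

```python
def t2_sketch(X):
    """Illustrative T2: number of samples divided by number of features.

    Assumes X is a non-empty list of rows that all have the same length.
    """
    n_samples = len(X)
    n_features = len(X[0])
    return n_samples / n_features


# 150 samples with 4 features each.
X_example = [[0.0, 0.0, 0.0, 0.0] for _ in range(150)]
t2_score = t2_sketch(X_example)  # 150 / 4 = 37.5
```

With the library itself, the call would follow the signature documented above, `problexity.regression.t2(X, y)`.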