Dimensionality Measures
| t2(X, y) | Calculates the average number of features per point (T2) metric. |
| t3(X, y) | Calculates the average number of PCA dimensions per point (T3) metric. |
| t4(X, y) | Calculates the ratio of the PCA dimension to the original dimension (T4) metric. |
- problexity.classification.t2(X, y)
Calculates the average number of features per point (T2) metric.
To obtain this measure, the number of dimensions describing the dataset, \(m\), is divided by the number of instances, \(n\).
\[T2=\frac{m}{n}\]
- Parameters:
X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels
- Return type:
- Returns:
T2 score
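The T2 formula can be reproduced directly from the shape of the data. The following is a minimal sketch in plain NumPy, not the library's implementation; `t2_sketch` is a hypothetical name, and the labels `y` are left out because the formula does not involve them.

```python
import numpy as np


def t2_sketch(X):
    """Return m / n: number of features divided by number of instances."""
    X = np.asarray(X)
    n_samples, n_features = X.shape
    return n_features / n_samples


# Toy dataset: 200 instances described by 10 features
X = np.zeros((200, 10))
t2_value = t2_sketch(X)
print(t2_value)  # 0.05
```

Lower values mean more instances are available per feature, so the dataset is less sparse relative to its dimensionality.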
- problexity.classification.t3(X, y)
Calculates the average number of PCA dimensions per point (T3) metric.
To obtain this measure, first, the number of PCA components needed to represent 95% of data variability is calculated. That count, \(m'\), is then divided by the number of instances in the dataset, \(n\).
\[T3=\frac{m'}{n}\]
- Parameters:
X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels
- Return type:
- Returns:
T3 score
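The two steps described for T3 can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library's code: `n_pca_components` and `t3_sketch` are hypothetical helpers, the component count is taken from the singular values of the mean-centered data, and constant data (zero total variance) is not handled.

```python
import numpy as np


def n_pca_components(X, variance=0.95):
    """Count the principal components needed to reach `variance` of total variability."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional to
    # the variance explained by each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values ** 2
    cumulative = np.cumsum(explained) / explained.sum()
    # First position at which the cumulative ratio reaches the threshold.
    return int(np.searchsorted(cumulative, variance) + 1)


def t3_sketch(X):
    """Return m' / n: PCA component count divided by number of instances."""
    n_samples = np.asarray(X).shape[0]
    return n_pca_components(X) / n_samples


# Deterministic toy data: 5 features built from only 2 independent signals
t = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
a, b = np.cos(t), np.sin(t)
X = np.column_stack([a, b, a + b, a - b, 2 * a])

m_prime = n_pca_components(X)
t3_value = t3_sketch(X)
print(m_prime, t3_value)  # 2 0.02
```

In this toy data the first component explains 70% of the variance and the second the remaining 30%, so two components are needed to pass the 95% threshold.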
- problexity.classification.t4(X, y)
Calculates the ratio of the PCA dimension to the original dimension (T4) metric.
To obtain this measure, the number of PCA components needed to represent 95% of data variability, \(m'\), is divided by the original number of dimensions, \(m\). This measure describes the proportion of relevant dimensions in the dataset.
\[T4=\frac{m'}{m}\]
- Parameters:
X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels
- Return type:
- Returns:
T4 score
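As a rough illustration of the T4 ratio, the sketch below counts the components that cover 95% of the variance and divides by the number of original features. It assumes plain NumPy rather than the library's internals; `t4_sketch` is a hypothetical name, and data with zero total variance is not handled.

```python
import numpy as np


def t4_sketch(X, variance=0.95):
    """Return m' / m: PCA component count divided by original feature count."""
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    centered = X - X.mean(axis=0)
    # Variance explained by each principal component, up to a constant factor.
    explained = np.linalg.svd(centered, compute_uv=False) ** 2
    cumulative = np.cumsum(explained) / explained.sum()
    # Number of components needed to reach the variance threshold.
    m_prime = int(np.searchsorted(cumulative, variance) + 1)
    return m_prime / n_features


# Deterministic toy data: 5 features, of which only 2 carry independent information
t = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
a, b = np.cos(t), np.sin(t)
X = np.column_stack([a, b, a + b, a - b, 2 * a])

t4_value = t4_sketch(X)
print(t4_value)  # 0.4
```

A value well below 1, as here, indicates that many of the original features are redundant; a value of 1 would mean every original dimension is needed to reach the threshold.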