Network Measures

`density`(X, y)	Calculates the Density metric.
`clsCoef`(X, y)	Calculates the Clustering Coefficient metric.
`hubs`(X, y)	Calculates the Hubs metric.

problexity.classification.clsCoef(X, y)

Calculates the Clustering Coefficient metric.

Generates an epsilon-Nearest Neighbours graph with epsilon set to 0.15. Edges are selected based on the Gower distance between samples, normalized to the range between 0 and 1, and edges between instances of distinct classes are removed. For each vertex v_i, its neighborhood N_i is the set of instances directly connected to it, and k_i is the size of that neighborhood. The number of edges between the vertex's neighbors is divided by the maximum possible number of edges between them, k_i(k_i-1)/2. The final measure is one minus the average of this ratio over all n points.

\[ClsCoef=1-\frac{1}{n}\sum^{n}_{i=1}\frac{2|\{e_{jk} : v_j, v_k \in N_i\}|}{k_i(k_i-1)}\]
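The formula above can be sketched in plain Python for a graph that has already been built. This is only an illustration, not the library's implementation: the function name `cls_coef_from_adjacency` and the dictionary-of-sets input are hypothetical, and treating vertices with fewer than two neighbors as contributing 0 is an assumption, since the ratio is undefined for them.

```python
from itertools import combinations


def cls_coef_from_adjacency(neighbors):
    """Sketch of ClsCoef for an undirected graph.

    `neighbors` maps each vertex to the set of vertices directly connected
    to it (its neighborhood N_i), so k_i = len(neighbors[i]).
    """
    n = len(neighbors)
    total = 0.0
    for n_i in neighbors.values():
        k_i = len(n_i)
        if k_i < 2:
            # Assumption: the ratio is undefined for k_i < 2; count it as 0.
            continue
        # Count edges e_jk whose endpoints both lie in N_i.
        links = sum(
            1 for j, k in combinations(sorted(n_i), 2) if k in neighbors[j]
        )
        total += 2 * links / (k_i * (k_i - 1))
    return 1 - total / n


# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
score = cls_coef_from_adjacency(toy)
```

In the toy graph, vertices 0 and 1 have fully connected neighborhoods (ratio 1), vertex 2 has one of three possible edges among its neighbors (ratio 1/3), and vertex 3 contributes 0, so the score is 1 - (7/3)/4 = 5/12.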

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels of a binary classification task (values 0 or 1)

Return type:

float

Returns:

Clustering Coefficient score

problexity.classification.density(X, y)

Calculates the Density metric.

Generates an epsilon-Nearest Neighbours graph with epsilon set to 0.15. Edges are selected based on the Gower distance between samples, normalized to the range between 0 and 1, and edges between instances of distinct classes are removed. The measure is one minus the number of edges in the final graph, |E|, divided by the total possible number of edges, n(n-1)/2, so a sparser graph yields a higher score.

\[Density =1 - \frac{2|E|}{n(n-1)}\]
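The graph construction and the formula above can be sketched together in plain Python. This is a minimal illustration under stated assumptions, not the library's code: the names `gower_distance` and `density_sketch` are hypothetical, all features are assumed numeric (so the Gower distance reduces to the mean range-scaled absolute difference, which already lies between 0 and 1), and an edge is assumed to exist when the distance is strictly below epsilon.

```python
def gower_distance(a, b, ranges):
    """Gower distance for numeric features: mean range-scaled absolute difference."""
    parts = [abs(p - q) / r if r > 0 else 0.0 for p, q, r in zip(a, b, ranges)]
    return sum(parts) / len(parts)


def density_sketch(X, y, eps=0.15):
    """Sketch of Density = 1 - 2|E| / (n(n-1)) on a same-class epsilon graph."""
    n = len(X)
    columns = list(zip(*X))
    ranges = [max(c) - min(c) for c in columns]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: keep an edge only for same-class pairs closer than eps.
            if y[i] == y[j] and gower_distance(X[i], X[j], ranges) < eps:
                edges.add((i, j))
    return 1 - 2 * len(edges) / (n * (n - 1))


# Toy data: three close class-0 points, two close class-1 points,
# and one class-1 point sitting inside the class-0 cluster.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [0.05, 0.05]]
y = [0, 0, 0, 1, 1, 1]
score = density_sketch(X, y)
```

Here the three class-0 points form a triangle and the two nearby class-1 points share one edge, giving |E| = 4 out of 15 possible edges. The last point is close to the class-0 cluster but has a different label, so it stays isolated, and the score is 1 - 8/30.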

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels of a binary classification task (values 0 or 1)

Return type:

float

Returns:

Density score

problexity.classification.hubs(X, y)

Calculates the Hubs metric.

Generates an epsilon-Nearest Neighbours graph with epsilon set to 0.15. Edges are selected based on the Gower distance between samples, normalized to the range between 0 and 1, and edges between instances of distinct classes are removed. For each vertex, its neighborhood is the set of instances directly connected to it. Each sample v_i receives a hub score hub(v_i) based on its number of connections to neighbors, weighted by the number of connections those neighbors have. The final measure is one minus the average hub score over all n samples.

\[Hubs=1-\frac{1}{n}\sum^{n}_{i=1}hub(v_i)\]
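One way to read hub(v_i) is as the HITS-style hub score, that is, the principal eigenvector of A^T A for the adjacency matrix A, scaled so that the largest score equals 1. The sketch below follows that reading using plain power iteration. It is an assumption about how the score could be computed, not a description of the library's internals, and the names `mat_vec`, `hub_scores`, and `hubs_sketch` are hypothetical.

```python
def mat_vec(adjacency, vector):
    """Multiply a square 0/1 adjacency matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in adjacency]


def hub_scores(adjacency, iterations=100):
    """Approximate hub scores by power iteration, scaled so the maximum is 1."""
    n = len(adjacency)
    scores = [1.0] * n
    for _ in range(iterations):
        # The graph is undirected, so A^T A equals A applied twice.
        scores = mat_vec(adjacency, mat_vec(adjacency, scores))
        top = max(scores)
        if top == 0:
            # No edges at all: every vertex gets a hub score of 0.
            return [0.0] * n
        scores = [s / top for s in scores]
    return scores


def hubs_sketch(adjacency):
    """Sketch of Hubs = 1 - mean hub score."""
    scores = hub_scores(adjacency)
    return 1 - sum(scores) / len(scores)


# Toy graph: a triangle 0-1-2 plus an isolated vertex 3.
toy = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
score = hubs_sketch(toy)
```

In the toy graph the three triangle vertices are equally well connected and each gets a score of 1, while the isolated vertex gets 0, so the measure is 1 - 3/4 = 0.25. A fixed number of iterations keeps the sketch short; a convergence check would be the usual refinement.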

Parameters:

X (array-like, shape (n_samples, n_features)) – Dataset
y (array-like, shape (n_samples)) – Labels of a binary classification task (values 0 or 1)

Return type:

float

Returns:

Hubs score