glassbox.core.math
Low-level statistical functions, distance metrics, and tree split utilities — all implemented with NumPy.
calc_mean
Calculate the mean of a 1D array.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Numeric array of shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | The calculated mean. |

Source code in glassbox/core/math.py
calc_median
Calculate the median of a 1D array.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Numeric array of shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | The calculated median. |

Source code in glassbox/core/math.py
calc_mode
Calculate the mode of a 1D array.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Array of shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float \| str` | The calculated mode. |

Source code in glassbox/core/math.py
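
As an illustration of how a mode can be computed with NumPy alone, here is a minimal sketch. The name `mode_sketch` is hypothetical and this is not necessarily how `calc_mode` is implemented; in particular, the tie-breaking rule (the smallest value in sorted order wins) is an assumption.

```python
import numpy as np


def mode_sketch(arr):
    """Return the most frequent value of a 1D array (hypothetical helper).

    Assumption: ties are broken in favour of the smallest sorted value,
    because np.unique returns sorted values and argmax picks the first max.
    """
    values, counts = np.unique(np.asarray(arr), return_counts=True)
    return values[np.argmax(counts)]


print(mode_sketch(np.array([3, 1, 3, 2, 1, 3])))  # 3
print(mode_sketch(np.array(["a", "b", "b"])))     # b
```

Because `np.unique` works on both numeric and string arrays, the same code covers the `float | str` return type listed above.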
calc_std
Calculate the standard deviation of a 1D array.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Numeric array of shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | Standard deviation. |

Source code in glassbox/core/math.py
calc_variance
Calculate the variance of a 1D array, i.e. the mean squared error (MSE) of the values around their mean.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Array of continuous values, shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | Calculated variance. |

Source code in glassbox/core/math.py
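
The link between variance and MSE can be made concrete with a short sketch. The name `variance_sketch` is hypothetical, and dividing by `n` (population variance, `ddof=0`) is an assumption suggested by the "MSE" wording; the library may use a different normalisation.

```python
import numpy as np


def variance_sketch(arr):
    """Population variance: mean squared deviation from the mean.

    Assumption: normalises by n (ddof=0), not by n - 1.
    """
    arr = np.asarray(arr, dtype=float)
    return float(np.mean((arr - arr.mean()) ** 2))


values = np.array([1.0, 2.0, 3.0, 4.0])
print(variance_sketch(values))  # 1.25
```

Under the same assumption, the standard deviation is simply the square root of this value.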
generate_bootstrap_indices
Generate random indices for a bootstrap sample.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_samples` | `int` | Number of samples in the original dataset. | required |

Returns:

| Type | Description |
|---|---|
| `ndarray` | Array of bootstrapped indices of shape (n_samples,). |

Source code in glassbox/core/math.py
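
A bootstrap sample draws `n_samples` indices with replacement, so some rows repeat and others are left out. The sketch below shows one way to do this; `bootstrap_indices_sketch` and its `rng` parameter are hypothetical and not part of the documented signature.

```python
import numpy as np


def bootstrap_indices_sketch(n_samples, rng=None):
    """Draw n_samples indices uniformly from [0, n_samples) with replacement.

    Assumption: `rng` is an optional NumPy Generator added here only to
    make the example reproducible.
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.integers(0, n_samples, size=n_samples)


idx = bootstrap_indices_sketch(10, rng=np.random.default_rng(0))
X = np.arange(10) * 10
print(X[idx])  # a resampled view of X, same length as the original
```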
generate_feature_subset_indices
Generate random indices for a feature subset whose size is the square root of the total number of features.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `n_features` | `int` | Number of total features. | required |

Returns:

| Type | Description |
|---|---|
| `ndarray` | Array of subset feature indices. |

Source code in glassbox/core/math.py
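
The square-root rule is the usual choice for per-split feature sampling in random forests. A minimal sketch follows; the name `feature_subset_indices_sketch`, the `rng` parameter, rounding down, and sampling without replacement are all assumptions rather than documented behaviour.

```python
import numpy as np


def feature_subset_indices_sketch(n_features, rng=None):
    """Pick floor(sqrt(n_features)) distinct feature indices at random.

    Assumptions: the subset size is rounded down (with a minimum of 1)
    and features are drawn without replacement.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = max(1, int(np.sqrt(n_features)))
    return rng.choice(n_features, size=k, replace=False)


subset = feature_subset_indices_sketch(16, rng=np.random.default_rng(0))
print(len(subset))  # 4
```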
calc_skew
Calculate the skewness of a 1D array.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Numeric array of shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | Skewness value. |

Source code in glassbox/core/math.py
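
Skewness is commonly defined as the third central moment divided by the second central moment raised to the power 3/2. The sketch below uses that biased (population) form; `skew_sketch` is a hypothetical name, and the library may apply a bias correction that this example does not.

```python
import numpy as np


def skew_sketch(arr):
    """Fisher-Pearson skewness g1 = m3 / m2**1.5 (no bias correction).

    Assumption: the input has non-zero variance; a constant array would
    divide by zero.
    """
    arr = np.asarray(arr, dtype=float)
    centered = arr - arr.mean()
    m2 = np.mean(centered ** 2)
    m3 = np.mean(centered ** 3)
    return float(m3 / m2 ** 1.5)


print(skew_sketch([1, 2, 3, 4, 5]))    # 0.0 for a symmetric sample
print(skew_sketch([1, 1, 1, 10]) > 0)  # True: long right tail
```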
calc_kurtosis
Calculate the kurtosis of a 1D array.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Numeric array of shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | Kurtosis value. |

Source code in glassbox/core/math.py
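
Kurtosis is built from the fourth central moment. The page does not say whether `calc_kurtosis` returns raw kurtosis or excess kurtosis (raw minus 3), so the sketch below exposes that choice through a hypothetical `excess` flag; `kurtosis_sketch` is likewise an illustrative name.

```python
import numpy as np


def kurtosis_sketch(arr, excess=True):
    """Kurtosis m4 / m2**2, optionally shifted so a normal distribution is 0.

    Assumptions: population moments (no bias correction) and non-zero
    variance; `excess` is a made-up parameter for this example.
    """
    arr = np.asarray(arr, dtype=float)
    centered = arr - arr.mean()
    m2 = np.mean(centered ** 2)
    m4 = np.mean(centered ** 4)
    raw = m4 / m2 ** 2
    return float(raw - 3.0 if excess else raw)


print(kurtosis_sketch([1, 2, 3, 4, 5], excess=False))  # 1.7
```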
calc_pearson
Calculate the Pearson correlation coefficient between two numerical arrays.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr_x` | `ndarray` | First numeric array of shape (n_samples,). | required |
| `arr_y` | `ndarray` | Second numeric array of shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | Pearson correlation coefficient. |

Source code in glassbox/core/math.py
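
The Pearson coefficient is the covariance of the two arrays divided by the product of their standard deviations. A minimal sketch, with `pearson_sketch` as a hypothetical name and the assumption that neither input is constant:

```python
import numpy as np


def pearson_sketch(arr_x, arr_y):
    """Pearson r = sum(dx * dy) / sqrt(sum(dx**2) * sum(dy**2)).

    Assumption: both arrays have non-zero variance and equal length.
    """
    x = np.asarray(arr_x, dtype=float)
    y = np.asarray(arr_y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))


x = np.array([1.0, 2.0, 3.0, 4.0])
print(pearson_sketch(x, 2 * x + 1))  # close to 1.0: perfect linear relation
print(pearson_sketch(x, -x))         # close to -1.0
```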
calc_cramers_v
Calculate Cramér's V, a measure of association between two categorical arrays.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr_x` | `ndarray` | First nominal array of shape (n_samples,). | required |
| `arr_y` | `ndarray` | Second nominal array of shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | Cramér's V score between 0.0 and 1.0. |

Source code in glassbox/core/math.py
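
Cramér's V is derived from the chi-squared statistic of the contingency table: V = sqrt(chi2 / (n * (min(rows, cols) - 1))). The sketch below follows that textbook form without bias correction; `cramers_v_sketch` is a hypothetical name, and returning 0.0 when either array has a single category is an assumption.

```python
import numpy as np


def cramers_v_sketch(arr_x, arr_y):
    """Cramér's V from a contingency table built with NumPy only."""
    # Encode each category as an integer code 0..k-1.
    _, x_codes = np.unique(np.asarray(arr_x).ravel(), return_inverse=True)
    _, y_codes = np.unique(np.asarray(arr_y).ravel(), return_inverse=True)
    n_rows = int(x_codes.max()) + 1
    n_cols = int(y_codes.max()) + 1

    # Count co-occurrences of (x category, y category).
    table = np.zeros((n_rows, n_cols))
    np.add.at(table, (x_codes, y_codes), 1)

    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = np.sum((table - expected) ** 2 / expected)

    k = min(n_rows, n_cols) - 1
    if k == 0:  # assumption: a single category carries no association
        return 0.0
    return float(np.sqrt(chi2 / (n * k)))


print(cramers_v_sketch(["a", "a", "b", "b"], ["u", "u", "v", "v"]))  # 1.0
print(cramers_v_sketch(["a", "a", "b", "b"], ["u", "v", "u", "v"]))  # 0.0
```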
calc_percentile
Calculate a percentile of an array using interpolation.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Array to compute the percentile from. | required |
| `p` | `float` | Percentile to compute, in the range 0-100. | required |

Returns:

| Type | Description |
|---|---|
| `float` | Calculated percentile. |

Source code in glassbox/core/math.py
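
"Using interpolation" most likely means linear interpolation between the two nearest sorted values, which is also the default of `np.percentile`. The sketch below spells that out; `percentile_sketch` is a hypothetical name and the interpolation scheme is an assumption.

```python
import numpy as np


def percentile_sketch(arr, p):
    """Percentile via linear interpolation between neighbouring ranks.

    Assumption: the fractional rank is p / 100 * (n - 1), matching the
    default 'linear' method of np.percentile.
    """
    data = np.sort(np.asarray(arr, dtype=float))
    rank = (p / 100.0) * (len(data) - 1)
    lo = int(np.floor(rank))
    hi = int(np.ceil(rank))
    frac = rank - lo
    return float(data[lo] + frac * (data[hi] - data[lo]))


print(percentile_sketch([4, 1, 3, 2], 50))  # 2.5
print(percentile_sketch([4, 1, 3, 2], 25))  # 1.75
```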
calc_iqr
Calculate the Interquartile Range (IQR) bounds.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Array to compute the bounds for. | required |

Returns:

| Type | Description |
|---|---|
| `Tuple` | Tuple containing the lower and upper limits. |

Source code in glassbox/core/math.py
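
IQR bounds are typically used as outlier fences: values below Q1 - k * IQR or above Q3 + k * IQR are flagged. The page does not state the multiplier, so the conventional k = 1.5 used below is an assumption, as are the names `iqr_bounds_sketch` and `k`.

```python
import numpy as np


def iqr_bounds_sketch(arr, k=1.5):
    """Return (lower, upper) fences based on the interquartile range.

    Assumption: Tukey's rule with multiplier k = 1.5.
    """
    q1, q3 = np.percentile(np.asarray(arr, dtype=float), [25, 75])
    iqr = q3 - q1
    return float(q1 - k * iqr), float(q3 + k * iqr)


lower, upper = iqr_bounds_sketch(np.arange(1, 10))
print(lower, upper)  # -3.0 13.0
```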
calc_split_gain
Calculate the information gain or variance reduction of a split.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `parent_cost` | `float` | Cost of the parent node. | required |
| `left_cost` | `float` | Cost of the left child node. | required |
| `right_cost` | `float` | Cost of the right child node. | required |
| `n_parent` | `int` | Number of samples in the parent node. | required |
| `n_left` | `int` | Number of samples in the left child node. | required |
| `n_right` | `int` | Number of samples in the right child node. | required |

Returns:

| Type | Description |
|---|---|
| `float` | The calculated gain. |

Source code in glassbox/core/math.py
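
Given the parameters above, the gain is presumably the parent cost minus the sample-weighted average of the child costs. The sketch below shows that standard formula; `split_gain_sketch` is a hypothetical name and the exact weighting used by the library is an assumption.

```python
def split_gain_sketch(parent_cost, left_cost, right_cost,
                      n_parent, n_left, n_right):
    """Gain = parent cost - weighted average of the two child costs.

    Assumption: each child is weighted by its share of the parent's samples.
    """
    weighted_children = (
        (n_left / n_parent) * left_cost + (n_right / n_parent) * right_cost
    )
    return parent_cost - weighted_children


# A split that separates two classes perfectly removes all impurity.
print(split_gain_sketch(0.5, 0.0, 0.0, 10, 5, 5))  # 0.5
# A split whose children are as impure as the parent gains nothing.
print(split_gain_sketch(0.5, 0.5, 0.5, 10, 5, 5))  # 0.0
```

The cost can be Gini impurity for classification or variance for regression, which is why the same function covers both information gain and variance reduction.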
calc_gini_impurity
Calculate the Gini impurity of an array of categorical labels.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `arr` | `ndarray` | Array of categorical labels, shape (n_samples,). | required |

Returns:

| Type | Description |
|---|---|
| `float` | Calculated Gini impurity. |

Source code in glassbox/core/math.py
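
Gini impurity is one minus the sum of squared class proportions. A minimal NumPy sketch, with `gini_sketch` as a hypothetical name and a non-empty input assumed:

```python
import numpy as np


def gini_sketch(arr):
    """Gini impurity 1 - sum(p_k**2) over the class proportions p_k.

    Assumption: the input contains at least one label.
    """
    _, counts = np.unique(np.asarray(arr), return_counts=True)
    probs = counts / counts.sum()
    return float(1.0 - np.sum(probs ** 2))


print(gini_sketch(["a", "a", "b", "b"]))  # 0.5: evenly mixed two classes
print(gini_sketch(["a", "a", "a"]))       # 0.0: pure node
```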
calc_euclidean
Calculate the Euclidean distance between two vectors.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ndarray` | First numeric array. | required |
| `y` | `ndarray` | Second numeric array. | required |

Returns:

| Type | Description |
|---|---|
| `float` | Euclidean distance. |

Source code in glassbox/core/math.py
calc_manhattan
Calculate the Manhattan distance between two vectors.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ndarray` | First numeric array. | required |
| `y` | `ndarray` | Second numeric array. | required |

Returns:

| Type | Description |
|---|---|
| `float` | Manhattan distance. |
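
The two distance metrics differ only in how per-coordinate differences are combined: Euclidean takes the square root of the summed squares, Manhattan sums the absolute values. The sketch below shows both side by side; `euclidean_sketch` and `manhattan_sketch` are hypothetical names, and both inputs are assumed to have the same shape.

```python
import numpy as np


def euclidean_sketch(x, y):
    """L2 distance: sqrt(sum((x - y)**2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum(diff ** 2)))


def manhattan_sketch(x, y):
    """L1 distance: sum(|x - y|)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sum(np.abs(diff)))


a, b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
print(euclidean_sketch(a, b))  # 5.0
print(manhattan_sketch(a, b))  # 7.0
```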