glassbox.frame.dataset
The Dataset class — a lightweight, named-column wrapper around a 2-D NumPy array.
Dataset
Container for data matrices with multiple helper functions.
Attributes:
| Name | Type | Description |
|---|---|---|
| `data` | `ndarray` | Data arranged as a 2D matrix. To access columns, take the transpose. |
| `columns` | `List[str]` | Names of the columns stored in a list. |
| `shape` | `Tuple[int, int]` | Shape of the dataset (# of rows, # of columns). |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `ndarray` | Data arranged as a 2D matrix of shape (n_rows, n_cols). | required |
| `columns` | `List[str]` | Column names; the number of names must match `data.shape[1]`. | required |
Source code in glassbox/frame/dataset.py
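
The shape contract described above can be checked with plain NumPy. The snippet below is a minimal sketch, not glassbox code: the array values and the names `height` and `weight` are made up for illustration. It only demonstrates the documented rule that `columns` needs one name per column of `data`, and the transpose approach to column access mentioned in the attribute table.

```python
import numpy as np

# Made-up example values; not taken from glassbox.
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])  # shape (n_rows, n_cols) = (3, 2)
columns = ["height", "weight"]

# Documented constraint: one name per column.
assert len(columns) == data.shape[1]

# data.T has shape (n_cols, n_rows), so iterating over it yields whole columns.
by_name = dict(zip(columns, data.T))
print(by_name["weight"])
```
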
get_columns
Retrieve data for specific columns by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `names` | `str \| List[str]` | A single column name or a list of column names. | required |
Returns:
| Type | Description |
|---|---|
| `ndarray` | Array slice representing the requested columns. |
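
Name-based column lookup can be sketched in plain NumPy as follows. The helper `select_columns` is hypothetical and is not part of glassbox; it assumes names are resolved to integer positions through the column list. Whether the real method returns a 1-D or a 2-D array for a single name is not stated in this reference, so the sketch always keeps two dimensions.

```python
import numpy as np
from typing import List, Union


def select_columns(data: np.ndarray, columns: List[str],
                   names: Union[str, List[str]]) -> np.ndarray:
    """Hypothetical helper: return the columns of `data` listed in `names`."""
    if isinstance(names, str):
        names = [names]  # treat a single name as a one-element list
    # list.index raises ValueError when a name is not present.
    idx = [columns.index(n) for n in names]
    return data[:, idx]  # integer-list indexing keeps the result 2-D


data = np.arange(6).reshape(3, 2)  # rows: [0, 1], [2, 3], [4, 5]
cols = ["a", "b"]
picked = select_columns(data, cols, "b")
swapped = select_columns(data, cols, ["b", "a"])
```

The order of `names` determines the order of the returned columns, so `swapped` holds column `b` first.
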
get_rows
Get specific rows based on indices and return as a new dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `indices` | `ndarray` | Integer array of row indices. | required |
Returns:
| Type | Description |
|---|---|
| `Dataset` | A new Dataset instance containing the selected rows. |
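
Row selection with an integer array maps directly onto NumPy's integer-array indexing. The snippet below is a sketch on a raw array rather than on a `Dataset`, with made-up values. It shows one property that matters if the method is implemented this way (an assumption, since the source is not shown here): integer-array indexing returns a copy, so changes to the selection do not reach the original data.

```python
import numpy as np

# Made-up example values; not taken from glassbox.
data = np.array([[1, 2],
                 [3, 4],
                 [5, 6],
                 [7, 8]])
indices = np.array([0, 2])

subset = data[indices]  # selects rows 0 and 2 as a new array (a copy)
subset[0, 0] = 99       # modifying the copy leaves `data` untouched
```
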
update_column
Update the array content for an existing column.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Target column to update. | required |
| `new_data` | `ndarray` | Array values to overwrite the column. | required |
Returns:
| Type | Description |
|---|---|
| `None` | |
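
An in-place column overwrite can be sketched as below. The helper `overwrite_column` is hypothetical, not the glassbox implementation, and the length check is an assumption: this reference does not say how a `new_data` array of the wrong length is handled.

```python
import numpy as np
from typing import List


def overwrite_column(data: np.ndarray, columns: List[str],
                     name: str, new_data: np.ndarray) -> None:
    """Hypothetical helper: overwrite one named column of `data` in place."""
    new_data = np.asarray(new_data)
    # Assumed check: the replacement must supply exactly one value per row.
    if new_data.shape != (data.shape[0],):
        raise ValueError(
            f"expected shape {(data.shape[0],)}, got {new_data.shape}"
        )
    data[:, columns.index(name)] = new_data  # mutates `data`; returns None


data = np.zeros((3, 2))
overwrite_column(data, ["a", "b"], "b", np.array([7.0, 8.0, 9.0]))
```

Because the assignment writes into the existing array, values are cast to the dtype of `data`; writing floats into an integer array would truncate them.
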
drop_columns
Remove columns by name from the dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `names` | `str \| List[str]` | Target column or list of columns to remove. | required |
Returns:
| Type | Description |
|---|---|
| `None` | |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If one of the columns to drop does not exist in the dataset. |
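
The drop-by-name behaviour, including the documented `KeyError`, can be sketched on a raw array and a name list. The helper `drop_named_columns` is hypothetical. Unlike the documented method, which returns `None` and presumably modifies the dataset itself, this sketch returns the reduced array and name list so it stays self-contained.

```python
import numpy as np
from typing import List, Tuple, Union


def drop_named_columns(data: np.ndarray, columns: List[str],
                       names: Union[str, List[str]]
                       ) -> Tuple[np.ndarray, List[str]]:
    """Hypothetical helper: return `data` and `columns` without `names`."""
    if isinstance(names, str):
        names = [names]  # treat a single name as a one-element list
    missing = [n for n in names if n not in columns]
    if missing:
        # Mirrors the documented error for unknown columns.
        raise KeyError(f"columns not found: {missing}")
    keep = [i for i, c in enumerate(columns) if c not in names]
    return data[:, keep], [columns[i] for i in keep]


data = np.arange(6).reshape(2, 3)  # rows: [0, 1, 2], [3, 4, 5]
reduced, remaining = drop_named_columns(data, ["a", "b", "c"], "b")
```

Validating every name before touching the array means a failed call leaves nothing half-removed.
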
add_columns
Add the columns of another Dataset alongside the existing columns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `new_dataset` | `Dataset` | New data to append. | required |
Returns:
| Type | Description |
|---|---|
| `None` | |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If a column to add already exists. |
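
Appending columns amounts to a horizontal stack plus a name-clash check. The helper `append_columns` below is a hypothetical sketch on raw arrays, not the glassbox implementation. The duplicate-name `ValueError` follows the table above; the row-count check is an added assumption, and the sketch returns new objects where the documented method returns `None`.

```python
import numpy as np
from typing import List, Tuple


def append_columns(data: np.ndarray, columns: List[str],
                   new_data: np.ndarray, new_columns: List[str]
                   ) -> Tuple[np.ndarray, List[str]]:
    """Hypothetical helper: place `new_data` to the right of `data`."""
    clash = sorted(set(columns) & set(new_columns))
    if clash:
        # Mirrors the documented error for names that already exist.
        raise ValueError(f"columns already exist: {clash}")
    # Assumed check: both blocks must have the same number of rows.
    if data.shape[0] != new_data.shape[0]:
        raise ValueError("row counts differ")
    return np.hstack([data, new_data]), columns + new_columns


left = np.array([[1, 2], [3, 4]])
right = np.array([[5], [6]])
merged, names = append_columns(left, ["a", "b"], right, ["c"])
```
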