Getting Started
Installation
GlassBox requires Python 3.11+ and NumPy.
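As a quick sanity check before going further, the two stated requirements can be verified from Python itself. This is a minimal sketch that uses only the standard library; the variable names are illustrative and not part of GlassBox.

```python
import importlib.util
import sys

# Requirement 1: Python 3.11 or later.
python_ok = sys.version_info >= (3, 11)

# Requirement 2: NumPy is importable in the current environment.
numpy_ok = importlib.util.find_spec("numpy") is not None

print(f"Python >= 3.11: {python_ok}")
print(f"NumPy available: {numpy_ok}")
```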
Quickstart
A minimal end-to-end workflow: load → inspect → clean → train → predict.
1. Load a Dataset
from glassbox.frame import read_csv

# Load the CSV file into a GlassBox Dataset
ds = read_csv("data.csv")
print(ds)  # Dataset(shape=(1000, 12), columns=[...])
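To give a sense of what a CSV loader has to do at minimum, the sketch below parses a header row and numeric cells into a list of column names and a NumPy array. It is not how `read_csv` is implemented (this page does not describe that). The inline CSV text and the choice to map empty cells to NaN are assumptions made for illustration.

```python
import csv
import io

import numpy as np

# Inline CSV text stands in for data.csv so the sketch runs without a file.
raw = io.StringIO("feat_1,feat_2,target\n1.0,2.0,0\n3.0,,1\n5.0,6.0,0\n")

reader = csv.reader(raw)
columns = next(reader)           # first row holds the column names
rows = [row for row in reader]   # remaining rows hold the values

# Assumption: empty cells become NaN so they can be imputed later.
data = np.array(
    [[float(cell) if cell else np.nan for cell in row] for row in rows]
)

print(columns)     # ['feat_1', 'feat_2', 'target']
print(data.shape)  # (3, 3)
```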
2. Inspect the Data
from glassbox.inspector import DataAuditor
auditor = DataAuditor()
report = auditor.run_audit(ds)
# Feature types detected automatically
print(report.feature_types)
# Outlier counts per numeric column
print(report.outliers_info)
# Export to JSON
print(report.to_json())
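This page does not say which rule the auditor uses to flag outliers. One common choice is the interquartile-range (IQR) rule, sketched below in plain NumPy. The function name `iqr_outlier_counts` and the multiplier `k=1.5` are assumptions for illustration, not GlassBox API.

```python
import numpy as np


def iqr_outlier_counts(X: np.ndarray, k: float = 1.5) -> np.ndarray:
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] in each column."""
    q1 = np.nanpercentile(X, 25, axis=0)
    q3 = np.nanpercentile(X, 75, axis=0)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Comparisons against NaN are False, so missing values are never counted.
    mask = (X < lower) | (X > upper)
    return mask.sum(axis=0)


X = np.array(
    [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0], [4.0, 13.0], [100.0, 14.0]]
)
counts = iqr_outlier_counts(X)
print(counts)  # [1 0]
```

Only the value 100.0 in the first column falls outside the fences, so the counts are one and zero.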
3. Clean the Data
import numpy as np
from glassbox.cleaner import SimpleImputer, StandardScaler

# Select the feature columns as a 2-D float array
X = ds.get_columns(["feat_1", "feat_2"]).data.astype(float)
# Impute missing values with the column mean
imputer = SimpleImputer()
X = imputer.fit_transform(X)
# Standardize each column to zero mean and unit variance
scaler = StandardScaler()
X = scaler.fit_transform(X)
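The two transformations above can be written out in a few lines of NumPy, which makes clear what each one computes. This is a sketch of the underlying arithmetic, not GlassBox's code. It assumes missing values are encoded as NaN and that the scaler uses the population standard deviation (`ddof=0`).

```python
import numpy as np

X = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan]])

# Mean imputation: replace each NaN with the mean of its column's observed values.
col_means = np.nanmean(X, axis=0)
X_imputed = np.where(np.isnan(X), col_means, X)

# Standardization: subtract the column mean, then divide by the column std.
mean = X_imputed.mean(axis=0)
std = X_imputed.std(axis=0)  # assumption: population std (ddof=0)
std[std == 0] = 1.0          # guard against division by zero for constant columns
X_scaled = (X_imputed - mean) / std

print(X_imputed)
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

After scaling, every column has a mean of approximately 0 and a standard deviation of approximately 1, up to floating-point error.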
4. Train a Model
from glassbox.models import DecisionTreeClassifier

# Extract the target column as a 1-D float array
y = ds.get_columns("target").data[:, 0].astype(float)

# Fit a decision tree with a maximum depth of 10
model = DecisionTreeClassifier(max_depth=10)
model.fit(X, y)
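To illustrate what fitting a decision tree involves, the sketch below finds the best split threshold on a single feature by minimizing weighted Gini impurity. This page does not state which split criterion GlassBox uses, so Gini is an assumption here, and the helper names `gini` and `best_threshold` are hypothetical.

```python
import numpy as np


def gini(y: np.ndarray) -> float:
    """Gini impurity of a label array: 1 - sum(p_k ** 2)."""
    if y.size == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / y.size
    return float(1.0 - np.sum(p ** 2))


def best_threshold(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Return (threshold, weighted impurity) for the best split x <= threshold."""
    best_t, best_score = float("nan"), float("inf")
    values = np.unique(x)
    # Candidate thresholds are midpoints between consecutive distinct values.
    for t in (values[:-1] + values[1:]) / 2.0:
        left, right = y[x <= t], y[x > t]
        score = (left.size * gini(left) + right.size * gini(right)) / y.size
        if score < best_score:
            best_t, best_score = float(t), score
    return best_t, best_score


x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
threshold, impurity = best_threshold(x, y)
print(threshold, impurity)  # 6.5 0.0
```

A full tree applies this search across every feature and recurses on each side of the chosen split until a stopping condition such as `max_depth` is reached.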
5. Predict
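No code is given for this step. As a sketch of the usual pattern, the example below assumes a scikit-learn-style `predict(X)` method that returns one label per row; whether GlassBox models expose exactly this method is an assumption. `MajorityClassifier` is a hypothetical stand-in so that the snippet runs without GlassBox installed.

```python
import numpy as np


class MajorityClassifier:
    """Hypothetical stand-in model (not part of GlassBox).

    Always predicts the most common label seen during fit.
    """

    def fit(self, X: np.ndarray, y: np.ndarray) -> "MajorityClassifier":
        labels, counts = np.unique(y, return_counts=True)
        self.label_ = labels[np.argmax(counts)]
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        # One prediction per input row.
        return np.full(X.shape[0], self.label_)


X = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])
y = np.array([1.0, 1.0, 0.0, 1.0])

model = MajorityClassifier().fit(X, y)
y_pred = model.predict(X)  # assumed call shape: one label per row of X

# Accuracy: fraction of predictions that match the true labels.
accuracy = float(np.mean(y_pred == y))
print(y_pred, accuracy)
```

Scoring on the training data, as done here, only checks that the call pattern works; a held-out set is needed for a meaningful accuracy estimate.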
What's Next?
- Frame — Data loading and manipulation.
- Inspector — Exploratory Data Analysis.
- Cleaner — Data preprocessing pipeline.
- Models — Machine learning algorithms.
- API Reference — Auto-generated from source docstrings.