HazelJS v0.3.0: Feature Store, Experiment Tracking, and Drift Detection for TypeScript
Introducing the first TypeScript-native feature store, MLflow-style experiment tracking, and production ML monitoring—making HazelJS the only full-stack data + ML framework for Node.js
Author: HazelJS Team
Today, we're thrilled to announce HazelJS v0.3.0—our most ambitious release yet. This update introduces five major features that position HazelJS as the only full-stack data + ML framework for TypeScript, competing directly with Python tools like MLflow, Feast, dbt, and Dagster.
The Problem: TypeScript's ML Gap
If you've built ML-powered applications in Node.js, you've faced this reality: all the good ML tooling is in Python. Need a feature store? Use Feast (Python). Want experiment tracking? MLflow (Python). Production monitoring? Evidently AI (Python).
The result? TypeScript teams either:
- Build everything from scratch (expensive, time-consuming)
- Maintain a Python microservice alongside their Node.js stack (complex, fragile)
- Skip production ML best practices entirely (risky)
HazelJS v0.3.0 changes this.
What's New: 5 Game-Changing Features
1. Feature Store — The First for TypeScript
We've built the first and only TypeScript-native feature store, bringing Feast-like capabilities to Node.js without any Python dependencies.
Why it matters: Feature stores solve the "training-serving skew" problem—ensuring your model sees the same features in production as it did during training. Until now, TypeScript teams had no native solution.
What you get:
- Dual storage architecture — Online store (Redis/Memory) for low-latency serving, offline store (Postgres/File) for training
- Point-in-time correctness — Prevents data leakage by retrieving features as they existed at training time
- Decorator-driven API — Define features with `@Feature` and `@FeatureView`
- Zero infrastructure mode — Start with in-memory/file storage, scale to Redis/Postgres when ready
```typescript
@FeatureView({
  name: 'user-behavior',
  entities: ['user'],
  online: true,
  offline: true,
})
class UserBehaviorFeatures {
  @Feature({ valueType: 'number' })
  loginCount: number;

  @Feature({ valueType: 'number' })
  avgSessionDuration: number;
}

// Get features for real-time inference
const features = await featureStore.getOnlineFeatures(
  ['user123'],
  ['loginCount', 'avgSessionDuration']
);

// Get historical features for training (point-in-time correct)
const trainingFeatures = await featureStore.getOfflineFeatures(
  ['user123'],
  ['loginCount', 'avgSessionDuration'],
  new Date('2024-01-01') // Features as they were on this date
);
```
No Python. No Feast. Just TypeScript.
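Under the hood, point-in-time retrieval boils down to "the latest value recorded at or before the requested timestamp." Here is a minimal, self-contained sketch of that idea; the `FeatureRow` shape and `pointInTimeValue` helper are illustrative, not HazelJS internals:

```typescript
// Illustrative sketch: point-in-time lookup picks, for each entity,
// the latest feature value recorded at or before the requested timestamp.
// This prevents leakage: a training row dated 2024-01-01 never sees
// a value that was written after that date.
interface FeatureRow {
  entityId: string;
  value: number;
  recordedAt: Date;
}

function pointInTimeValue(
  rows: FeatureRow[],
  entityId: string,
  asOf: Date,
): number | undefined {
  return rows
    .filter((r) => r.entityId === entityId && r.recordedAt <= asOf)
    .sort((a, b) => b.recordedAt.getTime() - a.recordedAt.getTime())[0]?.value;
}

const rows: FeatureRow[] = [
  { entityId: 'user123', value: 5, recordedAt: new Date('2023-12-15') },
  { entityId: 'user123', value: 9, recordedAt: new Date('2024-02-01') },
];

// As of 2024-01-01, only the December value existed.
console.log(pointInTimeValue(rows, 'user123', new Date('2024-01-01'))); // 5
```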
2. Experiment Tracking — MLflow for Node.js
Track your ML experiments like the pros—runs, metrics, parameters, artifacts—all in TypeScript.
Why it matters: Without experiment tracking, you lose the ability to reproduce results, compare models, or understand what actually worked. MLflow is the gold standard, but it's Python-only.
What you get:
- Complete experiment lifecycle — Create experiments, start runs, log metrics/params/artifacts
- Run comparison — Find your best model by any metric
- Artifact storage — Save models, plots, logs alongside metrics
- File-based storage — Zero infrastructure required (SQLite/file backend)
- `@Experiment` decorator — Auto-track training runs
```typescript
const experiment = experimentService.createExperiment('sentiment-classifier');

const run = experimentService.startRun(experiment.id, {
  params: { learningRate: 0.01, epochs: 10, batchSize: 32 },
});

// Log metrics during training
experimentService.logMetric(run.id, 'accuracy', 0.95);
experimentService.logMetric(run.id, 'loss', 0.05);

// Save the model
experimentService.logArtifact(run.id, 'model', 'model', modelBuffer);

experimentService.endRun(run.id, 'completed');

// Find the best run
const bestRun = experimentService.getBestRun(experiment.id, 'accuracy', 'max');
console.log('Best accuracy:', bestRun.metrics.accuracy);
```
Your entire ML workflow, tracked and reproducible.
3. Drift Detection & Monitoring — Production ML Without Python
Monitor your models in production with five statistical drift detection methods, all implemented in pure TypeScript.
Why it matters: Models degrade over time. User behavior changes. Data distributions shift. Without monitoring, you won't know your model is failing until customers complain.
What you get:
- 5 statistical methods — PSI, Kolmogorov-Smirnov, Jensen-Shannon, Chi-square, Wasserstein
- Feature drift detection — Catch when input distributions change
- Prediction drift detection — Detect when model outputs shift
- Continuous monitoring — Set up alerts for accuracy degradation
- Pure TypeScript — No Python, no native dependencies
```typescript
const driftService = new DriftService();

// Set reference distribution from training data
driftService.setReferenceDistribution('age', trainingAges);

// Detect drift in production data
const result = driftService.detectDrift('age', productionAges, {
  method: 'ks', // Kolmogorov-Smirnov test
  threshold: 0.1,
});

if (result.driftDetected) {
  console.warn(`Drift detected: ${result.message}`);
  console.log(`KS statistic: ${result.score}, p-value: ${result.pValue}`);
}

// Set up continuous monitoring
const monitorService = new MonitorService(driftService);
monitorService.registerModel({
  modelName: 'credit-risk-model',
  featureDrift: { method: 'ks', threshold: 0.1 },
  accuracyMonitor: { threshold: 0.85, windowSize: 100 },
  checkIntervalMinutes: 60,
});

monitorService.onAlert(async (alert) => {
  // Send to Slack, PagerDuty, etc.
  console.error(`[${alert.severity}] ${alert.message}`);
});
```
Production ML monitoring, no Python required.
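For a feel of what these statistical tests do, here is a minimal sketch of one of them, the Population Stability Index (PSI), in plain TypeScript. This illustrates the technique itself, not the HazelJS implementation:

```typescript
// Illustrative sketch of the Population Stability Index (PSI):
// PSI = sum over bins of (actual% - expected%) * ln(actual% / expected%).
// A common rule of thumb: PSI < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 drifted.
function psi(reference: number[], production: number[], bins = 10): number {
  const min = Math.min(...reference);
  const max = Math.max(...reference);
  const width = (max - min) / bins || 1;

  const histogram = (data: number[]): number[] => {
    const counts: number[] = new Array(bins).fill(0);
    for (const x of data) {
      // Clamp out-of-range values into the first/last bin.
      const i = Math.min(bins - 1, Math.max(0, Math.floor((x - min) / width)));
      counts[i]++;
    }
    // Convert counts to proportions; epsilon avoids log(0) on empty bins.
    return counts.map((c) => Math.max(c / data.length, 1e-6));
  };

  const expected = histogram(reference);
  const actual = histogram(production);
  return expected.reduce(
    (sum, e, i) => sum + (actual[i] - e) * Math.log(actual[i] / e),
    0,
  );
}

const training = Array.from({ length: 1000 }, (_, i) => (i % 100) / 100);
const shifted = training.map((x) => x + 0.5); // distribution has moved

console.log(psi(training, training)); // 0: identical distributions
console.log(psi(training, shifted) > 0.25); // true: significant drift
```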
4. Data Contracts — Schema Agreements for Data Pipelines
Implement the hottest trend in data engineering: data contracts that define schema agreements between producers and consumers.
Why it matters: Data pipelines break when schemas change unexpectedly. Data contracts prevent this by making schema agreements explicit, versioned, and validated.
What you get:
- `@DataContract` decorator — Define schemas, SLAs, owners, and consumers
- Breaking change detection — Automatically detect field removals and type changes
- SLA tracking — Monitor freshness, completeness, quality, and availability
- Version management — Support multiple contract versions with semver
- Violation recording — Track and query contract violations over time
```typescript
@DataContract({
  name: 'user-events',
  version: '1.0.0',
  owner: 'analytics-team',
  schema: {
    userId: { type: 'string', required: true },
    eventType: { type: 'string', required: true },
    timestamp: { type: 'date', required: true },
  },
  sla: {
    freshness: { maxDelayMinutes: 5 },
    completeness: {
      minCompleteness: 0.95,
      requiredFields: ['userId', 'eventType'],
    },
  },
  consumers: ['recommendation-service', 'analytics-dashboard'],
})
@Pipeline('user-events-pipeline')
class UserEventsPipeline extends PipelineBase {
  // Pipeline automatically validates against contract
}

// Validate data
const validation = registry.validate('user-events', data);
if (!validation.valid) {
  console.error('Contract violations:', validation.violations);
}

// Detect breaking changes between versions
const diff = registry.diff('user-events', '1.0.0', '2.0.0');
if (diff.isBreaking) {
  console.warn('Breaking changes:', diff.breakingChanges);
}
```
Data quality and reliability, built into your pipelines.
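To make "breaking change detection" concrete, here is a rough sketch of the kind of comparison a contract registry performs between schema versions. The `Schema` type and `breakingChanges` helper are illustrative, not the HazelJS internals:

```typescript
// Illustrative sketch: a change is breaking for consumers if a field they
// relied on is removed or changes type; adding an optional field is safe,
// while adding a *required* field breaks existing producers.
type Schema = Record<string, { type: string; required?: boolean }>;

function breakingChanges(oldSchema: Schema, newSchema: Schema): string[] {
  const changes: string[] = [];
  for (const [field, spec] of Object.entries(oldSchema)) {
    if (!(field in newSchema)) {
      changes.push(`field removed: ${field}`);
    } else if (newSchema[field].type !== spec.type) {
      changes.push(`type changed: ${field} (${spec.type} -> ${newSchema[field].type})`);
    }
  }
  for (const [field, spec] of Object.entries(newSchema)) {
    if (!(field in oldSchema) && spec.required) {
      changes.push(`new required field: ${field}`);
    }
  }
  return changes;
}

const v1: Schema = {
  userId: { type: 'string', required: true },
  timestamp: { type: 'date' },
};
const v2: Schema = {
  userId: { type: 'number', required: true },
};

console.log(breakingChanges(v1, v2));
// [ 'type changed: userId (string -> number)', 'field removed: timestamp' ]
```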
5. Production Connectors — JSONL & PostgreSQL
Two essential connectors for production data pipelines.
JSONL Connector — Newline-delimited JSON for efficient streaming:
```typescript
const jsonlSource = new JsonlSource({ filePath: './data.jsonl' });
await jsonlSource.open();

for await (const record of jsonlSource.read()) {
  console.log(record);
}
```
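For context, JSONL is simply one JSON object per line, which is why it streams so well: each line can be parsed independently without loading the whole file. A minimal illustration of the format itself:

```typescript
// Illustrative: parsing JSONL by hand. Each non-empty line is a complete,
// independently parseable JSON document.
const jsonl = '{"id":1,"name":"Ada"}\n{"id":2,"name":"Grace"}';

const records = jsonl
  .split('\n')
  .filter((line) => line.trim().length > 0)
  .map((line) => JSON.parse(line));

console.log(records.length); // 2
```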
PostgreSQL Connector — Full database integration with upsert support:
```typescript
const pgSink = new PostgresSink({
  host: 'localhost',
  database: 'mydb',
  table: 'users',
  columns: ['id', 'name', 'email'],
  conflictColumn: 'id', // enables upserts
});

await pgSink.writeBatch(records);
```
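In Postgres, upserts are expressed with the standard `INSERT ... ON CONFLICT ... DO UPDATE` clause. The sketch below shows the kind of SQL a sink configured with `conflictColumn` could generate; this is an assumption about internals, not the exact statement HazelJS emits:

```typescript
// Illustrative: building a parameterized Postgres upsert statement.
// EXCLUDED refers to the row that failed to insert due to the conflict.
function upsertSql(table: string, columns: string[], conflictColumn: string): string {
  const placeholders = columns.map((_, i) => `$${i + 1}`).join(', ');
  const updates = columns
    .filter((c) => c !== conflictColumn)
    .map((c) => `${c} = EXCLUDED.${c}`)
    .join(', ');
  return (
    `INSERT INTO ${table} (${columns.join(', ')}) VALUES (${placeholders}) ` +
    `ON CONFLICT (${conflictColumn}) DO UPDATE SET ${updates}`
  );
}

console.log(upsertSql('users', ['id', 'name', 'email'], 'id'));
// INSERT INTO users (id, name, email) VALUES ($1, $2, $3) ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email
```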
Why This Matters: The Competitive Landscape
Let's be clear: this combination of features doesn't exist in any TypeScript framework.
| Feature | Python Tools | TypeScript (Before) | HazelJS v0.3.0 |
|---|---|---|---|
| Feature Store | Feast, Tecton | ❌ None | ✅ Native |
| Experiment Tracking | MLflow, W&B | ❌ None | ✅ Native |
| Drift Detection | Evidently AI, WhyLabs | ❌ None | ✅ Native |
| Data Contracts | dbt, Great Expectations | ❌ None | ✅ Native |
Result: HazelJS is now the only full-stack data + ML framework for TypeScript—not just an ORM or pipeline runner, but a complete platform.
Architecture Decisions: Built for Production
We made five key architectural decisions to ensure HazelJS scales with your needs:
1. Zero Dependencies by Default
All new connectors (Postgres, Redis) are optional peer dependencies. The core functionality works with in-memory/file storage—no databases required.
2. Decorator-First API
Every feature includes decorators (`@Feature`, `@FeatureView`, `@Experiment`, `@DataContract`) for consistency with existing HazelJS patterns.
3. Pure TypeScript Implementation
All statistical tests (PSI, KS, JSD, Chi-square, Wasserstein) are implemented in pure TypeScript—no Python or native dependencies.
4. Pluggable Storage
- Feature store: Memory, Redis, File, Postgres
- Experiment tracking: File, SQLite
- Easy to extend with custom backends
5. Type Safety First
Full TypeScript type inference, strict null checks, and ESLint compliance with zero warnings.
Getting Started
Installation
```bash
npm install @hazeljs/ml @hazeljs/data @hazeljs/core
```
Optional Dependencies (only if needed)
```bash
npm install pg redis  # For Postgres and Redis support
```
Quick Example: End-to-End ML Workflow
```typescript
import { FeatureStoreService, ExperimentService, DriftService } from '@hazeljs/ml';

// 1. Set up feature store
const featureStore = new FeatureStoreService();
featureStore.configure({
  online: { type: 'memory' },
  offline: { type: 'file', file: { directory: './features' } },
});

// 2. Track experiments
const experimentService = new ExperimentService();
const experiment = experimentService.createExperiment('my-model');
const run = experimentService.startRun(experiment.id, {
  params: { learningRate: 0.01 },
});

// 3. Train and log
// ... your training code ...
experimentService.logMetric(run.id, 'accuracy', 0.95);
experimentService.endRun(run.id, 'completed');

// 4. Monitor in production
const driftService = new DriftService();
driftService.setReferenceDistribution('feature1', trainingData);
const drift = driftService.detectDrift('feature1', productionData, {
  method: 'ks',
  threshold: 0.1,
});
```
Migration Guide
Good news: This release has zero breaking changes. All new features are additive.
To use the new features:
- Update to v0.3.0: `npm update @hazeljs/ml @hazeljs/data`
- Install optional dependencies if needed: `npm install pg redis`
- Import and use the new services as shown in the examples above
What's Next: The Roadmap
We're not stopping here. Phase 3 (v0.4.0) will include:
- S3 Connector — Cloud object storage integration
- Kafka Connector — Streaming data source/sink
- Model Serving Strategies — A/B testing, canary deployment, shadow mode
- Incremental Processing — Checkpoints and watermarks for ETL
Future considerations:
- MongoDB and Parquet connectors
- DAG pipelines with fan-out/fan-in
- Model explainability (SHAP-like values)
- Advanced anomaly detection
The Bottom Line
HazelJS v0.3.0 brings production-grade ML and data engineering to TypeScript. No more Python microservices. No more building everything from scratch. No more choosing between Node.js and ML best practices.
You can now:
- Store and serve ML features with point-in-time correctness
- Track experiments like MLflow, entirely in TypeScript
- Monitor models in production with statistical drift detection
- Enforce data contracts across your pipelines
- Connect to PostgreSQL and stream JSONL files
All in TypeScript. All in one framework. All production-ready.
Try It Today
```bash
npm install @hazeljs/ml @hazeljs/data
```
Resources:
Questions? Issues? Feature requests? Open an issue on GitHub.
Built with ❤️ by the HazelJS team. Making TypeScript a first-class citizen for ML and data engineering.