HazelJS v0.3.0: Feature Store, Experiment Tracking, and Drift Detection for TypeScript
Introducing the first TypeScript-native feature store, MLflow-style experiment tracking, and production ML monitoring—making HazelJS the only full-stack data + ML framework for Node.js
Author: HazelJS Team
Today, we're thrilled to announce HazelJS v0.3.0—our most ambitious release yet. This update introduces five major features that position HazelJS as the only full-stack data + ML framework for TypeScript, competing directly with Python tools like MLflow, Feast, dbt, and Dagster.
The Problem: TypeScript's ML Gap
If you've built ML-powered applications in Node.js, you've faced this reality: all the good ML tooling is in Python. Need a feature store? Use Feast (Python). Want experiment tracking? MLflow (Python). Production monitoring? Evidently AI (Python).
The result? TypeScript teams either:
- Build everything from scratch (expensive, time-consuming)
- Maintain a Python microservice alongside their Node.js stack (complex, fragile)
- Skip production ML best practices entirely (risky)
HazelJS v0.3.0 changes this.
What's New: 5 Game-Changing Features
1. Feature Store — The First for TypeScript
We've built the first and only TypeScript-native feature store, bringing Feast-like capabilities to Node.js without any Python dependencies.
Why it matters: Feature stores solve the "training-serving skew" problem—ensuring your model sees the same features in production as it did during training. Until now, TypeScript teams had no native solution.
What you get:
- Dual storage architecture — Online store (Redis/Memory) for low-latency serving, offline store (Postgres/File) for training
- Point-in-time correctness — Prevents data leakage by retrieving features as they existed at training time
- Decorator-driven API — Define features with `@Feature` and `@FeatureView`
- Zero infrastructure mode — Start with in-memory/file storage, scale to Redis/Postgres when ready
```typescript
@FeatureView({
  name: 'user-behavior',
  entities: ['user'],
  online: true,
  offline: true,
})
class UserBehaviorFeatures {
  @Feature({ valueType: 'number' })
  loginCount: number;

  @Feature({ valueType: 'number' })
  avgSessionDuration: number;
}

// Get features for real-time inference
const features = await featureStore.getOnlineFeatures(
  ['user123'],
  ['loginCount', 'avgSessionDuration']
);

// Get historical features for training (point-in-time correct)
const trainingFeatures = await featureStore.getOfflineFeatures(
  ['user123'],
  ['loginCount', 'avgSessionDuration'],
  new Date('2024-01-01') // Features as they were on this date
);
```
No Python. No Feast. Just TypeScript.
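Under the hood, point-in-time retrieval boils down to "the latest value recorded at or before the requested timestamp." Here is a minimal, self-contained sketch of that idea; the `FeatureRow` shape and `pointInTimeValue` helper are illustrative, not HazelJS internals:

```typescript
// Illustrative sketch: point-in-time lookup picks, for each entity,
// the latest feature value recorded at or before the requested timestamp.
// This prevents leakage: a training row dated 2024-01-01 never sees
// a value that was written after that date.
interface FeatureRow {
  entityId: string;
  value: number;
  recordedAt: Date;
}

function pointInTimeValue(
  rows: FeatureRow[],
  entityId: string,
  asOf: Date,
): number | undefined {
  return rows
    .filter((r) => r.entityId === entityId && r.recordedAt <= asOf)
    .sort((a, b) => b.recordedAt.getTime() - a.recordedAt.getTime())[0]?.value;
}

const rows: FeatureRow[] = [
  { entityId: 'user123', value: 5, recordedAt: new Date('2023-12-15') },
  { entityId: 'user123', value: 9, recordedAt: new Date('2024-02-01') },
];

// As of 2024-01-01, only the December value existed.
console.log(pointInTimeValue(rows, 'user123', new Date('2024-01-01'))); // 5
```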
2. Experiment Tracking — MLflow for Node.js
Track your ML experiments like the pros—runs, metrics, parameters, artifacts—all in TypeScript.
Why it matters: Without experiment tracking, you lose the ability to reproduce results, compare models, or understand what actually worked. MLflow is the gold standard, but it's Python-only.
What you get:
- Complete experiment lifecycle — Create experiments, start runs, log metrics/params/artifacts
- Run comparison — Find your best model by any metric
- Artifact storage — Save models, plots, logs alongside metrics
- File-based storage — Zero infrastructure required (SQLite/file backend)
- `@Experiment` decorator — Auto-track training runs
```typescript
const experiment = experimentService.createExperiment('sentiment-classifier');

const run = experimentService.startRun(experiment.id, {
  params: { learningRate: 0.01, epochs: 10, batchSize: 32 },
});

// Log metrics during training
experimentService.logMetric(run.id, 'accuracy', 0.95);
experimentService.logMetric(run.id, 'loss', 0.05);

// Save the model
experimentService.logArtifact(run.id, 'model', 'model', modelBuffer);

experimentService.endRun(run.id, 'completed');

// Find the best run
const bestRun = experimentService.getBestRun(experiment.id, 'accuracy', 'max');
console.log('Best accuracy:', bestRun.metrics.accuracy);
```
Your entire ML workflow, tracked and reproducible.
3. Drift Detection & Monitoring — Production ML Without Python
Monitor your models in production with five statistical drift detection methods, all implemented in pure TypeScript.
Why it matters: Models degrade over time. User behavior changes. Data distributions shift. Without monitoring, you won't know your model is failing until customers complain.
What you get:
- 5 statistical methods — PSI, Kolmogorov-Smirnov, Jensen-Shannon, Chi-square, Wasserstein
- Feature drift detection — Catch when input distributions change
- Prediction drift detection — Detect when model outputs shift
- Continuous monitoring — Set up alerts for accuracy degradation
- Pure TypeScript — No Python, no native dependencies
```typescript
const driftService = new DriftService();

// Set reference distribution from training data
driftService.setReferenceDistribution('age', trainingAges);

// Detect drift in production data
const result = driftService.detectDrift('age', productionAges, {
  method: 'ks', // Kolmogorov-Smirnov test
  threshold: 0.1,
});

if (result.driftDetected) {
  console.warn(`Drift detected: ${result.message}`);
  console.log(`KS statistic: ${result.score}, p-value: ${result.pValue}`);
}

// Set up continuous monitoring
const monitorService = new MonitorService(driftService);
monitorService.registerModel({
  modelName: 'credit-risk-model',
  featureDrift: { method: 'ks', threshold: 0.1 },
  accuracyMonitor: { threshold: 0.85, windowSize: 100 },
  checkIntervalMinutes: 60,
});

monitorService.onAlert(async (alert) => {
  // Send to Slack, PagerDuty, etc.
  console.error(`[${alert.severity}] ${alert.message}`);
});
```
Production ML monitoring, no Python required.
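For a feel of what these statistical tests do, here is a minimal sketch of one of them, the Population Stability Index (PSI), in plain TypeScript. This illustrates the technique itself, not the HazelJS implementation:

```typescript
// Illustrative sketch of the Population Stability Index (PSI):
// PSI = sum over bins of (actual% - expected%) * ln(actual% / expected%).
// A common rule of thumb: PSI < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 drifted.
function psi(reference: number[], production: number[], bins = 10): number {
  const min = Math.min(...reference);
  const max = Math.max(...reference);
  const width = (max - min) / bins || 1;

  const histogram = (data: number[]): number[] => {
    const counts: number[] = new Array(bins).fill(0);
    for (const x of data) {
      // Clamp out-of-range values into the first/last bin.
      const i = Math.min(bins - 1, Math.max(0, Math.floor((x - min) / width)));
      counts[i]++;
    }
    // Convert counts to proportions; epsilon avoids log(0) on empty bins.
    return counts.map((c) => Math.max(c / data.length, 1e-6));
  };

  const expected = histogram(reference);
  const actual = histogram(production);
  return expected.reduce(
    (sum, e, i) => sum + (actual[i] - e) * Math.log(actual[i] / e),
    0,
  );
}

const training = Array.from({ length: 1000 }, (_, i) => (i % 100) / 100);
const shifted = training.map((x) => x + 0.5); // distribution has moved

console.log(psi(training, training)); // 0: identical distributions
console.log(psi(training, shifted) > 0.25); // true: significant drift
```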
4. Data Contracts — Schema Agreements for Data Pipelines
Implement the hottest trend in data engineering: data contracts that define schema agreements between producers and consumers.
Why it matters: Data pipelines break when schemas change unexpectedly. Data contracts prevent this by making schema agreements explicit, versioned, and validated.
What you get:
- `@DataContract` decorator — Define schemas, SLAs, owners, and consumers
- Breaking change detection — Automatically detect field removals and type changes
- SLA tracking — Monitor freshness, completeness, quality, and availability
- Version management — Support multiple contract versions with semver
- Violation recording — Track and query contract violations over time
```typescript
@DataContract({
  name: 'user-events',
  version: '1.0.0',
  owner: 'analytics-team',
  schema: {
    userId: { type: 'string', required: true },
    eventType: { type: 'string', required: true },
    timestamp: { type: 'date', required: true },
  },
  sla: {
    freshness: { maxDelayMinutes: 5 },
    completeness: {
      minCompleteness: 0.95,
      requiredFields: ['userId', 'eventType'],
    },
  },
  consumers: ['recommendation-service', 'analytics-dashboard'],
})
@Pipeline('user-events-pipeline')
class UserEventsPipeline extends PipelineBase {
  // Pipeline automatically validates against contract
}

// Validate data
const validation = registry.validate('user-events', data);
if (!validation.valid) {
  console.error('Contract violations:', validation.violations);
}

// Detect breaking changes between versions
const diff = registry.diff('user-events', '1.0.0', '2.0.0');
if (diff.isBreaking) {
  console.warn('Breaking changes:', diff.breakingChanges);
}
```
Data quality and reliability, built into your pipelines.
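To make "breaking change detection" concrete, here is a rough sketch of the kind of comparison a contract registry performs between schema versions. The `Schema` type and `breakingChanges` helper are illustrative, not the HazelJS internals:

```typescript
// Illustrative sketch: a change is breaking for consumers if a field they
// relied on is removed or changes type; adding an optional field is safe,
// while adding a *required* field breaks existing producers.
type Schema = Record<string, { type: string; required?: boolean }>;

function breakingChanges(oldSchema: Schema, newSchema: Schema): string[] {
  const changes: string[] = [];
  for (const [field, spec] of Object.entries(oldSchema)) {
    if (!(field in newSchema)) {
      changes.push(`field removed: ${field}`);
    } else if (newSchema[field].type !== spec.type) {
      changes.push(`type changed: ${field} (${spec.type} -> ${newSchema[field].type})`);
    }
  }
  for (const [field, spec] of Object.entries(newSchema)) {
    if (!(field in oldSchema) && spec.required) {
      changes.push(`new required field: ${field}`);
    }
  }
  return changes;
}

const v1: Schema = {
  userId: { type: 'string', required: true },
  timestamp: { type: 'date' },
};
const v2: Schema = {
  userId: { type: 'number', required: true },
};

console.log(breakingChanges(v1, v2));
// [ 'type changed: userId (string -> number)', 'field removed: timestamp' ]
```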
5. Production Connectors — JSONL & PostgreSQL
Two essential connectors for production data pipelines.
JSONL Connector — Newline-delimited JSON for efficient streaming:
```typescript
const jsonlSource = new JsonlSource({ filePath: './data.jsonl' });
await jsonlSource.open();

for await (const record of jsonlSource.read()) {
  console.log(record);
}
```
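For context, JSONL is simply one JSON object per line, which is why it streams so well: each line can be parsed independently without loading the whole file. A minimal illustration of the format itself:

```typescript
// Illustrative: parsing JSONL by hand. Each non-empty line is a complete,
// independently parseable JSON document.
const jsonl = '{"id":1,"name":"Ada"}\n{"id":2,"name":"Grace"}';

const records = jsonl
  .split('\n')
  .filter((line) => line.trim().length > 0)
  .map((line) => JSON.parse(line));

console.log(records.length); // 2
```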
PostgreSQL Connector — Full database integration with upsert support:
```typescript
const pgSink = new PostgresSink({
  host: 'localhost',
  database: 'mydb',
  table: 'users',
  columns: ['id', 'name', 'email'],
  conflictColumn: 'id', // enables upserts
});

await pgSink.writeBatch(records);
```
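In Postgres, upserts are expressed with the standard `INSERT ... ON CONFLICT ... DO UPDATE` clause. The sketch below shows the kind of SQL a sink configured with `conflictColumn` could generate; this is an assumption about internals, not the exact statement HazelJS emits:

```typescript
// Illustrative: building a parameterized Postgres upsert statement.
// EXCLUDED refers to the row that failed to insert due to the conflict.
function upsertSql(table: string, columns: string[], conflictColumn: string): string {
  const placeholders = columns.map((_, i) => `$${i + 1}`).join(', ');
  const updates = columns
    .filter((c) => c !== conflictColumn)
    .map((c) => `${c} = EXCLUDED.${c}`)
    .join(', ');
  return (
    `INSERT INTO ${table} (${columns.join(', ')}) VALUES (${placeholders}) ` +
    `ON CONFLICT (${conflictColumn}) DO UPDATE SET ${updates}`
  );
}

console.log(upsertSql('users', ['id', 'name', 'email'], 'id'));
// INSERT INTO users (id, name, email) VALUES ($1, $2, $3) ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email
```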
Why This Matters: The Competitive Landscape
Let's be clear: this combination of features doesn't exist in any TypeScript framework.
| Feature | Python Tools | TypeScript (Before) | HazelJS v0.3.0 |
|---|---|---|---|
| Feature Store | Feast, Tecton | ❌ None | ✅ Native |
| Experiment Tracking | MLflow, W&B | ❌ None | ✅ Native |
| Drift Detection | Evidently AI, WhyLabs | ❌ None | ✅ Native |
| Data Contracts | dbt, Great Expectations | ❌ None | ✅ Native |
Result: HazelJS is now the only full-stack data + ML framework for TypeScript—not just an ORM or pipeline runner, but a complete platform.
Architecture Decisions: Built for Production
We made five key architectural decisions to ensure HazelJS scales with your needs:
1. Zero Dependencies by Default
All new connectors (Postgres, Redis) are optional peer dependencies. The core functionality works with in-memory/file storage—no databases required.
2. Decorator-First API
Every feature includes decorators (`@Feature`, `@FeatureView`, `@Experiment`, `@DataContract`) for consistency with existing HazelJS patterns.
3. Pure TypeScript Implementation
All statistical tests (PSI, KS, JSD, Chi-square, Wasserstein) are implemented in pure TypeScript—no Python or native dependencies.
4. Pluggable Storage
- Feature store: Memory, Redis, File, Postgres
- Experiment tracking: File, SQLite
- Easy to extend with custom backends
5. Type Safety First
Full TypeScript type inference, strict null checks, and ESLint compliance with zero warnings.
Getting Started
Installation
```bash
npm install @hazeljs/ml @hazeljs/data @hazeljs/core
```
Optional Dependencies (only if needed)
```bash
npm install pg redis  # For Postgres and Redis support
```
Quick Example: End-to-End ML Workflow
```typescript
import { FeatureStoreService, ExperimentService, DriftService } from '@hazeljs/ml';

// 1. Set up feature store
const featureStore = new FeatureStoreService();
featureStore.configure({
  online: { type: 'memory' },
  offline: { type: 'file', file: { directory: './features' } },
});

// 2. Track experiments
const experimentService = new ExperimentService();
const experiment = experimentService.createExperiment('my-model');
const run = experimentService.startRun(experiment.id, {
  params: { learningRate: 0.01 },
});

// 3. Train and log
// ... your training code ...
experimentService.logMetric(run.id, 'accuracy', 0.95);
experimentService.endRun(run.id, 'completed');

// 4. Monitor in production
const driftService = new DriftService();
driftService.setReferenceDistribution('feature1', trainingData);
const drift = driftService.detectDrift('feature1', productionData, {
  method: 'ks',
  threshold: 0.1,
});
```
Migration Guide
Good news: This release has zero breaking changes. All new features are additive.
To use the new features:
- Update to v0.3.0: `npm update @hazeljs/ml @hazeljs/data`
- Install optional dependencies if needed: `npm install pg redis`
- Import and use the new services as shown in the examples above
What's Next: The Roadmap
We're not stopping here. Phase 3 (v0.4.0) will include:
- S3 Connector — Cloud object storage integration
- Kafka Connector — Streaming data source/sink
- Model Serving Strategies — A/B testing, canary deployment, shadow mode
- Incremental Processing — Checkpoints and watermarks for ETL
Future considerations:
- MongoDB and Parquet connectors
- DAG pipelines with fan-out/fan-in
- Model explainability (SHAP-like values)
- Advanced anomaly detection
The Bottom Line
HazelJS v0.3.0 brings production-grade ML and data engineering to TypeScript. No more Python microservices. No more building everything from scratch. No more choosing between Node.js and ML best practices.
You can now:
- Store and serve ML features with point-in-time correctness
- Track experiments like MLflow, entirely in TypeScript
- Monitor models in production with statistical drift detection
- Enforce data contracts across your pipelines
- Connect to PostgreSQL and stream JSONL files
All in TypeScript. All in one framework. All production-ready.
Try It Today
```bash
npm install @hazeljs/ml @hazeljs/data
```
Resources:
Questions? Issues? Feature requests? Open an issue on GitHub.
Built with ❤️ by the HazelJS team. Making TypeScript a first-class citizen for ML and data engineering.