
HazelJS ML Package


@hazeljs/ml provides machine learning model management for HazelJS with a model registry, decorator-based training/prediction APIs, batch inference, metrics tracking, feature store, experiment tracking, and drift detection.

Quick Reference

  • Purpose: @hazeljs/ml provides ML model lifecycle management: registration, training, prediction, batch inference, metrics, feature store, experiment tracking, and drift detection.
  • When to use: Use @hazeljs/ml for managing ML models (training, serving, versioning). Use @hazeljs/ai for LLM integration (OpenAI, Anthropic). Use @hazeljs/data for data preprocessing before ML.
  • Key concepts: Model registry, @Train decorator, @Predict decorator, batch inference, metrics tracking, feature store, experiment tracking, drift detection.
  • Dependencies: @hazeljs/core.
  • Common patterns: Register model in registry → train with @Train → serve predictions with @Predict → track metrics → detect drift.
  • Common mistakes: Not versioning models; not tracking experiment metrics; not monitoring for data drift in production; confusing @hazeljs/ml (classical ML) with @hazeljs/ai (LLM integration).

Purpose

Building ML-powered applications requires model registration, training pipelines, inference services, and evaluation metrics. The @hazeljs/ml package simplifies this by providing:

  • Model Registry – Register and discover models by name and version
  • Decorator-Based API – @Model, @Train, @Predict for declarative ML classes
  • Feature Store – TypeScript-native feature store with online/offline storage and point-in-time retrieval
  • Experiment Tracking – MLflow-style experiment and run tracking with metrics, params, and artifacts
  • Drift Detection – Production ML monitoring with statistical drift tests (PSI, KS, Jensen-Shannon, Chi-square, Wasserstein)
  • Training Pipeline – PipelineService for data preprocessing (normalize, filter)
  • Inference – PredictorService for single and batch predictions
  • Metrics – MetricsService for evaluation, A/B testing, and monitoring
  • Framework-Agnostic – Works with TensorFlow.js, ONNX, Transformers.js, or custom backends

Architecture

The package uses a registry-based architecture with decorator-driven model registration:

graph TD
  A["MLModule.forRoot()<br/>(Model Registration)"] --> B["MLModelBootstrap<br/>(Discovers @Train, @Predict)"]
  B --> C["ModelRegistry<br/>(Name/Version Lookup)"]
  
  D["@Model Decorator<br/>(Metadata)"] --> E["@Train / @Predict<br/>(Method Discovery)"]
  E --> B
  
  C --> F["TrainerService<br/>(Training)"]
  C --> G["PredictorService<br/>(Inference)"]
  C --> H["BatchService<br/>(Batch Predictions)"]
  C --> I["MetricsService<br/>(Evaluation)"]
  
  G --> J["Single / Batch Prediction"]
  F --> K["Training Pipeline"]
  
  style A fill:#3b82f6,stroke:#60a5fa,stroke-width:2px,color:#fff
  style B fill:#8b5cf6,stroke:#a78bfa,stroke-width:2px,color:#fff
  style C fill:#10b981,stroke:#34d399,stroke-width:2px,color:#fff
  style D fill:#8b5cf6,stroke:#a78bfa,stroke-width:2px,color:#fff

Key Components

  1. MLModule – Registers ModelRegistry, TrainerService, PredictorService, BatchService, MetricsService
  2. ModelRegistry – Stores and retrieves models by name and version
  3. TrainerService – Discovers and invokes @Train methods
  4. PredictorService – Discovers and invokes @Predict methods
  5. PipelineService – Data preprocessing for training
  6. MetricsService – Model evaluation and metrics tracking
  7. Decorators – @Model, @Train, @Predict for declarative ML
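Conceptually, the registry is a map keyed by model name and version, with "latest version" resolution when no version is requested. A minimal sketch of that lookup behavior (illustrative names only, not the actual ModelRegistry source):

```typescript
// Minimal registry sketch: store instances under "name@version" and
// resolve the highest registered version when none is requested.
type ModelEntry = { name: string; version: string; instance: unknown };

class MiniRegistry {
  private entries = new Map<string, ModelEntry>();

  register(name: string, version: string, instance: unknown): void {
    this.entries.set(`${name}@${version}`, { name, version, instance });
  }

  get(name: string, version?: string): ModelEntry | undefined {
    if (version) return this.entries.get(`${name}@${version}`);
    // No version given: pick the highest version among those registered,
    // using numeric-aware string comparison so '1.10.0' > '1.2.0'.
    const candidates = [...this.entries.values()].filter((e) => e.name === name);
    candidates.sort((a, b) =>
      a.version.localeCompare(b.version, undefined, { numeric: true })
    );
    return candidates.pop();
  }
}
```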

ML Decorators

Three decorators define an ML model and how it is trained and used. The registry and services discover them via reflection—no manual wiring.

@Model (class)

Attaches registry metadata so the model can be registered and looked up by name and version.

| Property | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique model id (e.g. sentiment-classifier) |
| version | string | Yes | Semver (e.g. 1.0.0) |
| framework | string | Yes | tensorflow, onnx, or custom |
| description | string | No | Human-readable description |
| tags | string[] | No | Tags for filtering (default: []) |

Use one @Model per class and add @Injectable() so the app can construct the model.

@Train (method)

Marks the single method that trains the model. TrainerService.train(modelName, data) invokes it.

| Option | Type | Default | Description |
|---|---|---|---|
| pipeline | string | default | Name of a registered PipelineService pipeline to run before training |
| batchSize | number | 32 | Hint for batching (optional) |
| epochs | number | 10 | Hint for epochs (optional) |

Exactly one @Train() method per model; it receives training data and can return TrainingResult (e.g. accuracy, loss).

@Predict (method)

Marks the single method that runs inference. PredictorService.predict(modelName, input) invokes it.

| Option | Type | Default | Description |
|---|---|---|---|
| batch | boolean | false | Hint that the method supports batch input |
| endpoint | string | /predict | Hint for route naming |

Exactly one @Predict() method per model; it receives one input and returns a prediction object (e.g. { sentiment, confidence }).

Rules

  • One model class = one @Model, one @Train method, one @Predict method.
  • Order: Apply @Model on the class, then @Train and @Predict on the methods. Use @Injectable() from @hazeljs/core.
  • Discovery: When you pass model classes to MLModule.forRoot({ models: [...] }), the bootstrap finds the decorated methods and registers the model.

Advantages

1. Declarative ML

Define models with decorators—training and prediction methods are discovered automatically.

2. Model Versioning

Register multiple versions of a model; the registry supports lookup by name and version.

3. Framework Flexibility

Use TensorFlow.js, ONNX, Transformers.js, or custom implementations—the package is backend-agnostic.

4. Batch Inference

BatchService for efficient batch predictions with configurable batch size. Results preserve input order.

5. Evaluation Built-In

MetricsService with evaluate() to run predictions on test data and compute accuracy, F1, precision, and recall. Supports custom label and prediction keys.

Installation

npm install @hazeljs/ml @hazeljs/core

Optional Peer Dependencies

# TensorFlow.js
npm install @tensorflow/tfjs-node

# ONNX Runtime
npm install onnxruntime-node

# Hugging Face Transformers (embeddings, sentiment)
npm install @huggingface/transformers

Quick Start

1. Import MLModule

import { HazelApp } from '@hazeljs/core';
import { MLModule } from '@hazeljs/ml';

const app = new HazelApp({
  imports: [
    MLModule.forRoot({
      models: [SentimentClassifier, SpamClassifier],
    }),
  ],
});

app.listen(3000);

2. Define a Model

import { Service } from '@hazeljs/core';
import { Model, Train, Predict, ModelRegistry } from '@hazeljs/ml';

@Model({ name: 'sentiment-classifier', version: '1.0.0', framework: 'custom' })
@Service()
export class SentimentClassifier {
  private labels = ['positive', 'negative', 'neutral'];
  private weights: Record<string, number[]> = {};

  constructor(private registry: ModelRegistry) {}

  @Train()
  async train(data: { text: string; label: string }[]): Promise<void> {
    // Your training logic – e.g. bag-of-words, embeddings.
    // buildVocabulary, computeWeights, and score are elided helpers.
    const vocab = this.buildVocabulary(data);
    this.weights = this.computeWeights(data, vocab);
  }

  @Predict()
  async predict(input: { text: string }): Promise<{ sentiment: string; confidence: number }> {
    const scores = this.score(input.text);
    const idx = scores.indexOf(Math.max(...scores));
    return {
      sentiment: this.labels[idx],
      confidence: scores[idx],
    };
  }
}

3. Predict from a Controller

import { Controller, Post, Body } from '@hazeljs/core';
import { PredictorService } from '@hazeljs/ml';

@Controller('ml')
export class MLController {
  constructor(private predictor: PredictorService) {}

  @Post('predict')
  async predict(@Body() body: { text: string; model?: string }) {
    const result = await this.predictor.predict(
      body.model ?? 'sentiment-classifier',
      body
    );
    return { result };
  }
}

Training Pipeline

Preprocess data before training with PipelineService. Use inline steps (no registration) or named pipelines:

import { PipelineService } from '@hazeljs/ml';

const pipeline = new PipelineService();

// Inline steps (no registration required)
const steps = [
  { name: 'normalize', transform: (d: unknown) => ({ ...(d as object), text: (d as { text: string }).text?.toLowerCase() }) },
  { name: 'filter', transform: (d: unknown) => (d as { text: string }).text?.length ? d : null },
];
const processed = await pipeline.run(data, steps);
await model.train(processed);

// Or register a named pipeline for reuse
pipeline.registerPipeline('default', steps);
const processed2 = await pipeline.run('default', data);
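Under the hood, the inline form only needs a step runner of the following shape: each step transforms every sample in order, and a step that returns null drops the sample. A self-contained sketch (illustrative, not the actual PipelineService source):

```typescript
// Sketch of an inline step runner: apply each step's transform to every
// sample in sequence, dropping samples for which a step returns null.
type Step = { name: string; transform: (d: unknown) => unknown | null };

async function runSteps(data: unknown[], steps: Step[]): Promise<unknown[]> {
  let current = data;
  for (const step of steps) {
    const next: unknown[] = [];
    for (const sample of current) {
      const out = await step.transform(sample); // transforms may also be async
      if (out !== null && out !== undefined) next.push(out);
    }
    current = next;
  }
  return current;
}
```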

Batch Predictions

BatchService processes inputs in batches with configurable concurrency. Results are returned in the same order as inputs.

import { BatchService } from '@hazeljs/ml';

const batchService = new BatchService(predictorService);
const results = await batchService.predictBatch('sentiment-classifier', items, {
  batchSize: 32,
  concurrency: 4,
});
// results[i] corresponds to items[i]
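Order-preserving batching with a concurrency cap can be sketched as follows, assuming a plain `predict(input)` function in place of the real PredictorService (names here are illustrative):

```typescript
// Sketch: split inputs into batches, run up to `concurrency` batches per
// wave, and write each result back to its original index so the output
// order matches the input order regardless of completion order.
async function predictBatch<I, O>(
  predict: (input: I) => Promise<O>,
  items: I[],
  { batchSize = 32, concurrency = 4 } = {}
): Promise<O[]> {
  const results = new Array<O>(items.length);
  const batches: { start: number; inputs: I[] }[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push({ start: i, inputs: items.slice(i, i + batchSize) });
  }
  // Process batches in waves of `concurrency`.
  for (let w = 0; w < batches.length; w += concurrency) {
    await Promise.all(
      batches.slice(w, w + concurrency).map(async ({ start, inputs }) => {
        const outs = await Promise.all(inputs.map(predict));
        outs.forEach((o, j) => (results[start + j] = o));
      })
    );
  }
  return results;
}
```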

Metrics and Evaluation

Inject MetricsService via MLModule (it receives PredictorService and ModelRegistry). Use evaluate() to run predictions on test data and compute metrics:

import { Injectable } from '@hazeljs/core';
import { MetricsService } from '@hazeljs/ml';

@Injectable()
class EvaluationService {
  constructor(private metricsService: MetricsService) {}

  async runEvaluation() {
    const testData = [
      { text: 'great product', label: 'positive' },
      { text: 'terrible', label: 'negative' },
    ];
    const evaluation = await this.metricsService.evaluate('sentiment-classifier', testData, {
      metrics: ['accuracy', 'f1', 'precision', 'recall'],
      labelKey: 'label',           // key in test sample for ground truth
      predictionKey: 'sentiment',  // key in prediction result (auto-detect: label, sentiment, class)
    });
    // evaluation.metrics: { accuracy, precision, recall, f1Score }
    // Result is automatically recorded via recordEvaluation()
  }
}
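The metrics themselves reduce to counting per-class agreements between ground truth and predictions. A self-contained sketch of accuracy plus macro-averaged precision/recall/F1 (illustrative, not the MetricsService source; the package's averaging strategy may differ):

```typescript
// Sketch: accuracy plus macro-averaged precision, recall, and F1 computed
// from parallel arrays of ground-truth labels and predicted labels.
function classificationMetrics(actual: string[], predicted: string[]) {
  const labels = [...new Set([...actual, ...predicted])];
  const per = labels.map((label) => {
    let tp = 0, fp = 0, fn = 0;
    actual.forEach((a, i) => {
      const p = predicted[i];
      if (a === label && p === label) tp++;
      else if (a !== label && p === label) fp++;
      else if (a === label && p !== label) fn++;
    });
    const precision = tp + fp ? tp / (tp + fp) : 0;
    const recall = tp + fn ? tp / (tp + fn) : 0;
    const f1 = precision + recall ? (2 * precision * recall) / (precision + recall) : 0;
    return { precision, recall, f1 };
  });
  const correct = actual.filter((a, i) => a === predicted[i]).length;
  const avg = (k: 'precision' | 'recall' | 'f1') =>
    per.reduce((s, m) => s + m[k], 0) / per.length;
  return {
    accuracy: correct / actual.length,
    precision: avg('precision'),
    recall: avg('recall'),
    f1Score: avg('f1'),
  };
}
```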

Manual Model Registration

When not using MLModule.forRoot({ models: [...] }):

import { registerMLModel, ModelRegistry, TrainerService, PredictorService } from '@hazeljs/ml';

registerMLModel(
  sentimentInstance,
  modelRegistry,
  trainerService,
  predictorService
);

Feature Store

TypeScript-native feature store for managing ML features with online and offline storage:

import {
  FeatureStoreService,
  Feature,
  FeatureView,
  MemoryOnlineStore,
  RedisOnlineStore,
  FileOfflineStore,
  PostgresOfflineStore,
} from '@hazeljs/ml';

// Define features with decorators
@FeatureView({
  name: 'user-behavior',
  entities: ['user'],
  description: 'Features derived from user behavior',
  online: true,
  offline: true,
})
class UserBehaviorFeatures {
  @Feature({ valueType: 'number', description: 'Total login count' })
  loginCount!: number;

  @Feature({ valueType: 'number', description: 'Average session duration in seconds' })
  avgSessionDuration!: number;

  @Feature({ valueType: 'string', tags: ['demographic'] })
  userSegment!: string;
}

// Configure feature store
const featureStore = new FeatureStoreService();
featureStore.configure({
  online: {
    type: 'redis',
    redis: { host: 'localhost', port: 6379 },
  },
  offline: {
    type: 'postgres',
    postgres: {
      host: 'localhost',
      port: 5432,
      database: 'features',
      user: 'user',
      password: 'pass',
    },
  },
  enablePointInTime: true, // Prevents data leakage in training
});

// Get features for online inference (low-latency)
const onlineFeatures = await featureStore.getOnlineFeatures(
  ['user123', 'user456'],
  ['loginCount', 'avgSessionDuration']
);

// Get historical features for training (point-in-time correct)
const trainingFeatures = await featureStore.getOfflineFeatures(
  ['user123'],
  ['loginCount', 'avgSessionDuration'],
  new Date('2024-01-01') // Features as they were on this date
);

// Push features to online store
await featureStore.pushOnlineFeatures('user123', {
  loginCount: 42,
  avgSessionDuration: 320,
});

// Write features to offline store
await featureStore.writeOfflineFeatures(
  'user123',
  { loginCount: 42, avgSessionDuration: 320 },
  new Date()
);

Feature Store Benefits

  • Point-in-Time Correctness – Prevents data leakage by retrieving features as they existed at training time
  • Dual Storage – Online store (Redis/Memory) for low-latency inference, offline store (Postgres/File) for training
  • Type-Safe – Decorator-driven feature definitions with TypeScript types
  • Zero Python Dependencies – Pure TypeScript implementation, no Feast or Python required
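Point-in-time correctness boils down to one rule: for each entity, return the latest feature value whose timestamp is at or before the requested date, so values written later can never leak into training. A minimal sketch of that rule (illustrative types, not the FeatureStoreService internals):

```typescript
// Sketch: given a timestamped feature log, return the most recent value
// recorded at or before `asOf` – later writes must not leak into training.
type FeatureRow = { entityId: string; value: number; timestamp: Date };

function pointInTimeValue(
  rows: FeatureRow[],
  entityId: string,
  asOf: Date
): number | undefined {
  const eligible = rows
    .filter((r) => r.entityId === entityId && r.timestamp.getTime() <= asOf.getTime())
    .sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime());
  return eligible.pop()?.value; // undefined if nothing existed yet
}
```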

Experiment Tracking

MLflow-style experiment tracking with runs, metrics, parameters, and artifacts:

import { ExperimentService, Experiment } from '@hazeljs/ml';

// Configure experiment service
const experimentService = new ExperimentService();
experimentService.configure({
  storage: 'file',
  file: { directory: './experiments' },
});

// Create an experiment
const experiment = experimentService.createExperiment('sentiment-classifier', {
  description: 'Training sentiment classification models',
  tags: ['nlp', 'classification'],
});

// Start a training run
const run = experimentService.startRun(experiment.id, {
  name: 'run-v1',
  params: { learningRate: 0.01, epochs: 10, batchSize: 32 },
  tags: ['baseline'],
});

// Log metrics during training
experimentService.logMetric(run.id, 'accuracy', 0.95);
experimentService.logMetric(run.id, 'loss', 0.05);
experimentService.logMetrics(run.id, {
  precision: 0.94,
  recall: 0.96,
  f1Score: 0.95,
});

// Log artifacts (models, plots, logs)
experimentService.logArtifact(
  run.id,
  'model',
  'model',
  modelBuffer,
  { framework: 'tensorflow', size: modelBuffer.length }
);

// End the run
experimentService.endRun(run.id, 'completed');

// Find best run by metric
const bestRun = experimentService.getBestRun(experiment.id, 'accuracy', 'max');
console.log('Best accuracy:', bestRun.metrics.accuracy);

// Compare runs
const comparison = experimentService.compareRuns([run1.id, run2.id, run3.id]);
// [{ runId, params, metrics, durationMs }, ...]
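Best-run selection is conceptually a reduce over the runs' metric values. A sketch of that logic with illustrative types (not the ExperimentService source):

```typescript
// Sketch: pick the run whose metric is highest ('max') or lowest ('min'),
// skipping runs that never logged that metric.
type Run = { id: string; metrics: Record<string, number> };

function bestRun(runs: Run[], metric: string, mode: 'max' | 'min' = 'max'): Run | undefined {
  const scored = runs.filter((r) => metric in r.metrics);
  if (!scored.length) return undefined;
  return scored.reduce((best, r) =>
    mode === 'max'
      ? (r.metrics[metric] > best.metrics[metric] ? r : best)
      : (r.metrics[metric] < best.metrics[metric] ? r : best)
  );
}
```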

Experiment Tracking with @Experiment Decorator

import { Injectable } from '@hazeljs/core';
import { Experiment, Model, Train } from '@hazeljs/ml';

@Experiment({
  name: 'sentiment-classifier',
  description: 'Training sentiment classification models',
  tags: ['nlp'],
  autoLogParams: true,
  autoLogMetrics: true,
})
@Model({ name: 'sentiment', version: '1.0.0', framework: 'custom' })
@Injectable()
class SentimentClassifier {
  @Train()
  async train(data: TrainingData) {
    // Training runs are automatically tracked
  }
}

Drift Detection & Monitoring

Production ML monitoring with statistical drift detection:

import { DriftService, MonitorService } from '@hazeljs/ml';

// Initialize drift service
const driftService = new DriftService();

// Set reference distribution from training data
driftService.setReferenceDistribution('age', trainingAges);
driftService.setReferenceDistribution('income', trainingIncomes);

// Detect drift in production data
const ageResult = driftService.detectDrift('age', productionAges, {
  method: 'ks', // Kolmogorov-Smirnov test
  threshold: 0.1,
});

if (ageResult.driftDetected) {
  console.warn(`Drift detected: ${ageResult.message}`);
  console.log(`KS statistic: ${ageResult.score}, p-value: ${ageResult.pValue}`);
}

// Run full drift report on multiple features
const report = driftService.detectDriftReport(
  {
    age: productionAges,
    income: productionIncomes,
    creditScore: productionCreditScores,
  },
  {
    method: 'psi', // Population Stability Index
    threshold: 0.25,
  }
);

console.log(`Drift detected in ${report.driftedFeatures}/${report.totalFeatures} features`);
console.log(`Overall drift: ${report.overallDrift}`);

// Detect prediction drift
const predDrift = driftService.detectPredictionDrift(
  trainingPredictions,
  productionPredictions
);

// Set up continuous monitoring
const monitorService = new MonitorService(driftService);

monitorService.registerModel({
  modelName: 'credit-risk-model',
  modelVersion: '1.0.0',
  featureDrift: {
    method: 'ks',
    threshold: 0.1,
  },
  accuracyMonitor: {
    threshold: 0.85,
    windowSize: 100,
  },
  checkIntervalMinutes: 60,
});

// Set up alert handler
monitorService.onAlert(async (alert) => {
  console.error(`[${alert.severity}] ${alert.alertType}: ${alert.message}`);
  // Send to Slack, PagerDuty, etc.
});

// Record accuracy for monitoring
monitorService.recordAccuracy('credit-risk-model', 0.92);

Drift Detection Methods

| Method | Use Case | Range |
|---|---|---|
| PSI (Population Stability Index) | Overall distribution shift | 0–∞ (>0.25 = significant) |
| KS (Kolmogorov-Smirnov) | Continuous features | 0–1 (D statistic + p-value) |
| JSD (Jensen-Shannon Divergence) | Symmetric distribution comparison | 0–0.693 |
| Chi-square | Categorical features | Chi² statistic + p-value |
| Wasserstein | Earth Mover's Distance | 0–∞ (normalized by std) |

All statistical tests are implemented in pure TypeScript with no Python dependencies.
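As a concrete example of what these tests compute, here is a minimal PSI implementation over equal-width bins. This is a sketch under stated assumptions (fixed bin count, a small epsilon to avoid log(0)); the package's actual binning strategy may differ:

```typescript
// Sketch: Population Stability Index over equal-width bins.
// PSI = Σ over bins of (actual% - expected%) * ln(actual% / expected%).
function psi(expected: number[], actual: number[], bins = 10): number {
  const min = Math.min(...expected, ...actual);
  const max = Math.max(...expected, ...actual);
  const width = (max - min) / bins || 1; // guard against zero-range data

  const proportions = (xs: number[]) => {
    const counts = new Array(bins).fill(0);
    for (const x of xs) {
      const i = Math.min(bins - 1, Math.floor((x - min) / width));
      counts[i]++;
    }
    // Epsilon keeps empty bins from producing log(0) or division by zero.
    return counts.map((c) => Math.max(c / xs.length, 1e-6));
  };

  const e = proportions(expected);
  const a = proportions(actual);
  return e.reduce((sum, ei, i) => sum + (a[i] - ei) * Math.log(a[i] / ei), 0);
}
```

Identical distributions score ~0; by the common rule of thumb cited above, values over 0.25 indicate a significant shift.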

Service Summary

| Service | Purpose |
|---|---|
| ModelRegistry | Register and look up models by name/version |
| TrainerService | Discover and invoke @Train methods |
| PredictorService | Discover and invoke @Predict methods |
| PipelineService | Data preprocessing (inline run(data, steps) or named pipelines) |
| BatchService | Batch prediction with configurable batch size (results in input order) |
| MetricsService | Model evaluation via evaluate() and metrics tracking |
| FeatureStoreService | Manage ML features with online/offline storage |
| ExperimentService | Track experiments, runs, metrics, and artifacts |
| DriftService | Statistical drift detection (PSI, KS, JSD, Chi², Wasserstein) |
| MonitorService | Continuous model monitoring with alerting |

Recipes

Recipe: Sentiment Analysis Model

// File: src/ml/sentiment.model.ts
import { Model, Train, Predict } from '@hazeljs/ml';
import { Service } from '@hazeljs/core';

@Model({ name: 'sentiment', version: '1.0.0', framework: 'custom' })
@Service()
export class SentimentModel {
  @Train()
  async train(data: { text: string; label: string }[]) {
    // Train on labeled sentiment data
    return { accuracy: 0.92, samples: data.length };
  }

  @Predict()
  async predict(input: { text: string }) {
    // Returns sentiment prediction
    return { label: 'positive', confidence: 0.87 };
  }
}

Recipe: Serve ML Predictions via REST

// File: src/ml/ml.controller.ts
import { Controller, Post, Body } from '@hazeljs/core';
import { ModelRegistry } from '@hazeljs/ml';

@Controller('ml')
export class MLController {
  constructor(private readonly registry: ModelRegistry) {}

  @Post('predict')
  async predict(@Body() body: { model: string; input: any }) {
    const model = this.registry.get(body.model);
    const prediction = await model.predict(body.input);
    return { model: body.model, prediction };
  }
}

Recipe: Feature Store with Online/Offline Access

// File: src/ml/features.service.ts
import { Service } from '@hazeljs/core';
import { FeatureStoreService } from '@hazeljs/ml';

@Service()
export class FeatureService {
  constructor(private readonly features: FeatureStoreService) {}

  async storeUserFeatures(userId: string, features: Record<string, number>) {
    await this.features.pushOnlineFeatures(userId, features);
  }

  async getUserFeatures(userId: string, featureNames: string[]) {
    return this.features.getOnlineFeatures([userId], featureNames);
  }
}
See Also

  • AI Package – LLM integration for hybrid AI/ML workflows
  • Cache Package – Cache model outputs and embeddings
  • Config Package – Model paths and API keys
  • hazeljs-ml-starter – Full app with sentiment, spam, intent classifiers, REST API, and scripts
  • example/src/ml – Minimal runnable example of @Model, @Train, @Predict (run: npm run ml:decorators)