
HazelJS ML Package


@hazeljs/ml provides machine learning model management for HazelJS with a model registry, decorator-based training/prediction APIs, batch inference, metrics tracking, feature store, experiment tracking, and drift detection.

Quick Reference

  • Purpose: @hazeljs/ml provides ML model lifecycle management: registration, training, prediction, batch inference, metrics, feature store, experiment tracking, and drift detection.
  • When to use: Use @hazeljs/ml for managing ML models (training, serving, versioning). Use @hazeljs/ai for LLM integration (OpenAI, Anthropic). Use @hazeljs/data for data preprocessing before ML.
  • Key concepts: Model registry, @Train decorator, @Predict decorator, batch inference, metrics tracking, feature store, experiment tracking, drift detection.
  • Dependencies: @hazeljs/core.
  • Common patterns: Register model in registry → train with @Train → serve predictions with @Predict → track metrics → detect drift.
  • Common mistakes: Not versioning models; not tracking experiment metrics; not monitoring for data drift in production; confusing @hazeljs/ml (classical ML) with @hazeljs/ai (LLM integration).

Purpose

Building ML-powered applications requires model registration, training pipelines, inference services, and evaluation metrics. The @hazeljs/ml package simplifies this by providing:

  • Model Registry – Register and discover models by name and version
  • Decorator-Based API – @Model, @Train, @Predict for declarative ML classes
  • Feature Store – TypeScript-native feature store with online/offline storage and point-in-time retrieval
  • Experiment Tracking – MLflow-style experiment and run tracking with metrics, params, and artifacts
  • Drift Detection – Production ML monitoring with statistical drift tests (PSI, KS, Jensen-Shannon, Chi-square, Wasserstein)
  • Training Pipeline – PipelineService for data preprocessing (normalize, filter)
  • Inference – PredictorService for single and batch predictions
  • Metrics – MetricsService for evaluation, A/B testing, and monitoring
  • Framework-Agnostic – Works with TensorFlow.js, ONNX, Transformers.js, or custom backends

Architecture

The package uses a registry-based architecture with decorator-driven model registration:

graph TD
  A["MLModule.forRoot()<br/>(Model Registration)"] --> B["MLModelBootstrap<br/>(Discovers @Train, @Predict)"]
  B --> C["ModelRegistry<br/>(Name/Version Lookup)"]
  
  D["@Model Decorator<br/>(Metadata)"] --> E["@Train / @Predict<br/>(Method Discovery)"]
  E --> B
  
  C --> F["TrainerService<br/>(Training)"]
  C --> G["PredictorService<br/>(Inference)"]
  C --> H["BatchService<br/>(Batch Predictions)"]
  C --> I["MetricsService<br/>(Evaluation)"]
  
  G --> J["Single / Batch Prediction"]
  F --> K["Training Pipeline"]
  
  style A fill:#3b82f6,stroke:#60a5fa,stroke-width:2px,color:#fff
  style B fill:#8b5cf6,stroke:#a78bfa,stroke-width:2px,color:#fff
  style C fill:#10b981,stroke:#34d399,stroke-width:2px,color:#fff
  style D fill:#8b5cf6,stroke:#a78bfa,stroke-width:2px,color:#fff

Key Components

  1. MLModule – Registers ModelRegistry, TrainerService, PredictorService, BatchService, MetricsService
  2. ModelRegistry – Stores and retrieves models by name and version
  3. TrainerService – Discovers and invokes @Train methods
  4. PredictorService – Discovers and invokes @Predict methods
  5. PipelineService – Data preprocessing for training
  6. MetricsService – Model evaluation and metrics tracking
  7. Decorators – @Model, @Train, @Predict for declarative ML
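Conceptually, the registry is a map keyed by model name and version, with "latest version" resolution when no version is requested. A minimal sketch of that lookup behavior (illustrative names only, not the actual ModelRegistry source):

```typescript
// Minimal registry sketch: store instances under "name@version" and
// resolve the highest registered version when none is requested.
type ModelEntry = { name: string; version: string; instance: unknown };

class MiniRegistry {
  private entries = new Map<string, ModelEntry>();

  register(name: string, version: string, instance: unknown): void {
    this.entries.set(`${name}@${version}`, { name, version, instance });
  }

  get(name: string, version?: string): ModelEntry | undefined {
    if (version) return this.entries.get(`${name}@${version}`);
    // No version given: pick the highest version among those registered,
    // using numeric-aware string comparison so '1.10.0' > '1.2.0'.
    const candidates = [...this.entries.values()].filter((e) => e.name === name);
    candidates.sort((a, b) =>
      a.version.localeCompare(b.version, undefined, { numeric: true })
    );
    return candidates.pop();
  }
}
```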

ML Decorators

Three decorators define an ML model and how it is trained and used. The registry and services discover them via reflection—no manual wiring.

@Model (class)

Attaches registry metadata so the model can be registered and looked up by name and version.

| Property | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique model id (e.g. sentiment-classifier) |
| version | string | Yes | Semver (e.g. 1.0.0) |
| framework | string | Yes | tensorflow, onnx, or custom |
| description | string | No | Human-readable description |
| tags | string[] | No | Tags for filtering (default: []) |

Use one @Model per class and add @Injectable() so the app can construct the model.

@Train (method)

Marks the single method that trains the model. TrainerService.train(modelName, data) invokes it.

| Option | Type | Default | Description |
|---|---|---|---|
| pipeline | string | default | Name of a registered PipelineService pipeline to run before training |
| batchSize | number | 32 | Hint for batching (optional) |
| epochs | number | 10 | Hint for epochs (optional) |

Exactly one @Train() method per model; it receives training data and can return TrainingResult (e.g. accuracy, loss).

@Predict (method)

Marks the single method that runs inference. PredictorService.predict(modelName, input) invokes it.

| Option | Type | Default | Description |
|---|---|---|---|
| batch | boolean | false | Hint that the method supports batch input |
| endpoint | string | /predict | Hint for route naming |

Exactly one @Predict() method per model; it receives one input and returns a prediction object (e.g. { sentiment, confidence }).

Rules

  • One model class = one @Model, one @Train method, one @Predict method.
  • Order: Apply @Model on the class, then @Train and @Predict on the methods. Use @Injectable() from @hazeljs/core.
  • Discovery: When you pass model classes to MLModule.forRoot({ models: [...] }), the bootstrap finds the decorated methods and registers the model.

Advantages

1. Declarative ML

Define models with decorators—training and prediction methods are discovered automatically.

2. Model Versioning

Register multiple versions of a model; the registry supports lookup by name and version.

3. Framework Flexibility

Use TensorFlow.js, ONNX, Transformers.js, or custom implementations—the package is backend-agnostic.

4. Batch Inference

BatchService for efficient batch predictions with configurable batch size. Results preserve input order.

5. Evaluation Built-In

MetricsService with evaluate() to run predictions on test data and compute accuracy, F1, precision, and recall. Supports custom label and prediction keys.

Installation

npm install @hazeljs/ml @hazeljs/core

Optional Peer Dependencies

# TensorFlow.js
npm install @tensorflow/tfjs-node

# ONNX Runtime
npm install onnxruntime-node

# Hugging Face Transformers (embeddings, sentiment)
npm install @huggingface/transformers

Quick Start

1. Import MLModule

import { HazelApp } from '@hazeljs/core';
import { MLModule } from '@hazeljs/ml';

const app = new HazelApp({
  imports: [
    MLModule.forRoot({
      models: [SentimentClassifier, SpamClassifier],
    }),
  ],
});

app.listen(3000);

2. Define a Model

import { Service } from '@hazeljs/core';
import { Model, Train, Predict, ModelRegistry } from '@hazeljs/ml';

@Model({ name: 'sentiment-classifier', version: '1.0.0', framework: 'custom' })
@Service()
export class SentimentClassifier {
  private labels = ['positive', 'negative', 'neutral'];
  private weights: Record<string, number[]> = {};

  constructor(private registry: ModelRegistry) {}

  @Train()
  async train(data: { text: string; label: string }[]): Promise<void> {
    // Your training logic – e.g. bag-of-words, embeddings.
    // buildVocabulary, computeWeights, and score are elided helpers.
    const vocab = this.buildVocabulary(data);
    this.weights = this.computeWeights(data, vocab);
  }

  @Predict()
  async predict(input: { text: string }): Promise<{ sentiment: string; confidence: number }> {
    const scores = this.score(input.text);
    const idx = scores.indexOf(Math.max(...scores));
    return {
      sentiment: this.labels[idx],
      confidence: scores[idx],
    };
  }
}

3. Predict from a Controller

import { Controller, Post, Body } from '@hazeljs/core';
import { PredictorService } from '@hazeljs/ml';

@Controller('ml')
export class MLController {
  constructor(private predictor: PredictorService) {}

  @Post('predict')
  async predict(@Body() body: { text: string; model?: string }) {
    const result = await this.predictor.predict(
      body.model ?? 'sentiment-classifier',
      body
    );
    return { result };
  }
}

Training Pipeline

Preprocess data before training with PipelineService. Use inline steps (no registration) or named pipelines:

import { PipelineService } from '@hazeljs/ml';

const pipeline = new PipelineService();

// Inline steps (no registration required)
const steps = [
  { name: 'normalize', transform: (d: unknown) => ({ ...(d as object), text: (d as { text: string }).text?.toLowerCase() }) },
  { name: 'filter', transform: (d: unknown) => (d as { text: string }).text?.length ? d : null },
];
const processed = await pipeline.run(data, steps);
await model.train(processed);

// Or register a named pipeline for reuse
pipeline.registerPipeline('default', steps);
const processed2 = await pipeline.run('default', data);
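Under the hood, the inline form only needs a step runner of the following shape: each step transforms every sample in order, and a step that returns null drops the sample. A self-contained sketch (illustrative, not the actual PipelineService source):

```typescript
// Sketch of an inline step runner: apply each step's transform to every
// sample in sequence, dropping samples for which a step returns null.
type Step = { name: string; transform: (d: unknown) => unknown | null };

async function runSteps(data: unknown[], steps: Step[]): Promise<unknown[]> {
  let current = data;
  for (const step of steps) {
    const next: unknown[] = [];
    for (const sample of current) {
      const out = await step.transform(sample); // transforms may also be async
      if (out !== null && out !== undefined) next.push(out);
    }
    current = next;
  }
  return current;
}
```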

Batch Predictions

BatchService processes inputs in batches with configurable concurrency. Results are returned in the same order as inputs.

import { BatchService } from '@hazeljs/ml';

const batchService = new BatchService(predictorService);
const results = await batchService.predictBatch('sentiment-classifier', items, {
  batchSize: 32,
  concurrency: 4,
});
// results[i] corresponds to items[i]
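Order-preserving batching with a concurrency cap can be sketched as follows, assuming a plain `predict(input)` function in place of the real PredictorService (names here are illustrative):

```typescript
// Sketch: split inputs into batches, run up to `concurrency` batches per
// wave, and write each result back to its original index so the output
// order matches the input order regardless of completion order.
async function predictBatch<I, O>(
  predict: (input: I) => Promise<O>,
  items: I[],
  { batchSize = 32, concurrency = 4 } = {}
): Promise<O[]> {
  const results = new Array<O>(items.length);
  const batches: { start: number; inputs: I[] }[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push({ start: i, inputs: items.slice(i, i + batchSize) });
  }
  // Process batches in waves of `concurrency`.
  for (let w = 0; w < batches.length; w += concurrency) {
    await Promise.all(
      batches.slice(w, w + concurrency).map(async ({ start, inputs }) => {
        const outs = await Promise.all(inputs.map(predict));
        outs.forEach((o, j) => (results[start + j] = o));
      })
    );
  }
  return results;
}
```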

Metrics and Evaluation

Inject MetricsService via MLModule (it receives PredictorService and ModelRegistry). Use evaluate() to run predictions on test data and compute metrics:

import { Injectable } from '@hazeljs/core';
import { MetricsService } from '@hazeljs/ml';

@Injectable()
class EvaluationService {
  constructor(private metricsService: MetricsService) {}

  async runEvaluation() {
    const testData = [
      { text: 'great product', label: 'positive' },
      { text: 'terrible', label: 'negative' },
    ];
    const evaluation = await this.metricsService.evaluate('sentiment-classifier', testData, {
      metrics: ['accuracy', 'f1', 'precision', 'recall'],
      labelKey: 'label',           // key in test sample for ground truth
      predictionKey: 'sentiment',  // key in prediction result (auto-detect: label, sentiment, class)
    });
    // evaluation.metrics: { accuracy, precision, recall, f1Score }
    // Result is automatically recorded via recordEvaluation()
  }
}
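The metrics themselves reduce to counting per-class agreements between ground truth and predictions. A self-contained sketch of accuracy plus macro-averaged precision/recall/F1 (illustrative, not the MetricsService source; the package's averaging strategy may differ):

```typescript
// Sketch: accuracy plus macro-averaged precision, recall, and F1 computed
// from parallel arrays of ground-truth labels and predicted labels.
function classificationMetrics(actual: string[], predicted: string[]) {
  const labels = [...new Set([...actual, ...predicted])];
  const per = labels.map((label) => {
    let tp = 0, fp = 0, fn = 0;
    actual.forEach((a, i) => {
      const p = predicted[i];
      if (a === label && p === label) tp++;
      else if (a !== label && p === label) fp++;
      else if (a === label && p !== label) fn++;
    });
    const precision = tp + fp ? tp / (tp + fp) : 0;
    const recall = tp + fn ? tp / (tp + fn) : 0;
    const f1 = precision + recall ? (2 * precision * recall) / (precision + recall) : 0;
    return { precision, recall, f1 };
  });
  const correct = actual.filter((a, i) => a === predicted[i]).length;
  const avg = (k: 'precision' | 'recall' | 'f1') =>
    per.reduce((s, m) => s + m[k], 0) / per.length;
  return {
    accuracy: correct / actual.length,
    precision: avg('precision'),
    recall: avg('recall'),
    f1Score: avg('f1'),
  };
}
```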

Manual Model Registration

When not using MLModule.forRoot({ models: [...] }):

import { registerMLModel, ModelRegistry, TrainerService, PredictorService } from '@hazeljs/ml';

registerMLModel(
  sentimentInstance,
  modelRegistry,
  trainerService,
  predictorService
);

Feature Store

TypeScript-native feature store for managing ML features with online and offline storage:

import {
  FeatureStoreService,
  Feature,
  FeatureView,
  MemoryOnlineStore,
  RedisOnlineStore,
  FileOfflineStore,
  PostgresOfflineStore,
} from '@hazeljs/ml';

// Define features with decorators
@FeatureView({
  name: 'user-behavior',
  entities: ['user'],
  description: 'Features derived from user behavior',
  online: true,
  offline: true,
})
class UserBehaviorFeatures {
  @Feature({ valueType: 'number', description: 'Total login count' })
  loginCount!: number;

  @Feature({ valueType: 'number', description: 'Average session duration in seconds' })
  avgSessionDuration!: number;

  @Feature({ valueType: 'string', tags: ['demographic'] })
  userSegment!: string;
}

// Configure feature store
const featureStore = new FeatureStoreService();
featureStore.configure({
  online: {
    type: 'redis',
    redis: { host: 'localhost', port: 6379 },
  },
  offline: {
    type: 'postgres',
    postgres: {
      host: 'localhost',
      port: 5432,
      database: 'features',
      user: 'user',
      password: 'pass',
    },
  },
  enablePointInTime: true, // Prevents data leakage in training
});

// Get features for online inference (low-latency)
const onlineFeatures = await featureStore.getOnlineFeatures(
  ['user123', 'user456'],
  ['loginCount', 'avgSessionDuration']
);

// Get historical features for training (point-in-time correct)
const trainingFeatures = await featureStore.getOfflineFeatures(
  ['user123'],
  ['loginCount', 'avgSessionDuration'],
  new Date('2024-01-01') // Features as they were on this date
);

// Push features to online store
await featureStore.pushOnlineFeatures('user123', {
  loginCount: 42,
  avgSessionDuration: 320,
});

// Write features to offline store
await featureStore.writeOfflineFeatures(
  'user123',
  { loginCount: 42, avgSessionDuration: 320 },
  new Date()
);

Feature Store Benefits

  • Point-in-Time Correctness – Prevents data leakage by retrieving features as they existed at training time
  • Dual Storage – Online store (Redis/Memory) for low-latency inference, offline store (Postgres/File) for training
  • Type-Safe – Decorator-driven feature definitions with TypeScript types
  • Zero Python Dependencies – Pure TypeScript implementation, no Feast or Python required
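Point-in-time correctness boils down to one rule: for each entity, return the latest feature value whose timestamp is at or before the requested date, so values written later can never leak into training. A minimal sketch of that rule (illustrative types, not the FeatureStoreService internals):

```typescript
// Sketch: given a timestamped feature log, return the most recent value
// recorded at or before `asOf` – later writes must not leak into training.
type FeatureRow = { entityId: string; value: number; timestamp: Date };

function pointInTimeValue(
  rows: FeatureRow[],
  entityId: string,
  asOf: Date
): number | undefined {
  const eligible = rows
    .filter((r) => r.entityId === entityId && r.timestamp.getTime() <= asOf.getTime())
    .sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime());
  return eligible.pop()?.value; // undefined if nothing existed yet
}
```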

Experiment Tracking

MLflow-style experiment tracking with runs, metrics, parameters, and artifacts:

import { ExperimentService, Experiment } from '@hazeljs/ml';

// Configure experiment service
const experimentService = new ExperimentService();
experimentService.configure({
  storage: 'file',
  file: { directory: './experiments' },
});

// Create an experiment
const experiment = experimentService.createExperiment('sentiment-classifier', {
  description: 'Training sentiment classification models',
  tags: ['nlp', 'classification'],
});

// Start a training run
const run = experimentService.startRun(experiment.id, {
  name: 'run-v1',
  params: { learningRate: 0.01, epochs: 10, batchSize: 32 },
  tags: ['baseline'],
});

// Log metrics during training
experimentService.logMetric(run.id, 'accuracy', 0.95);
experimentService.logMetric(run.id, 'loss', 0.05);
experimentService.logMetrics(run.id, {
  precision: 0.94,
  recall: 0.96,
  f1Score: 0.95,
});

// Log artifacts (models, plots, logs)
experimentService.logArtifact(
  run.id,
  'model',
  'model',
  modelBuffer,
  { framework: 'tensorflow', size: modelBuffer.length }
);

// End the run
experimentService.endRun(run.id, 'completed');

// Find best run by metric
const bestRun = experimentService.getBestRun(experiment.id, 'accuracy', 'max');
console.log('Best accuracy:', bestRun.metrics.accuracy);

// Compare runs
const comparison = experimentService.compareRuns([run1.id, run2.id, run3.id]);
// [{ runId, params, metrics, durationMs }, ...]
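Best-run selection is conceptually a reduce over the runs' metric values. A sketch of that logic with illustrative types (not the ExperimentService source):

```typescript
// Sketch: pick the run whose metric is highest ('max') or lowest ('min'),
// skipping runs that never logged that metric.
type Run = { id: string; metrics: Record<string, number> };

function bestRun(runs: Run[], metric: string, mode: 'max' | 'min' = 'max'): Run | undefined {
  const scored = runs.filter((r) => metric in r.metrics);
  if (!scored.length) return undefined;
  return scored.reduce((best, r) =>
    mode === 'max'
      ? (r.metrics[metric] > best.metrics[metric] ? r : best)
      : (r.metrics[metric] < best.metrics[metric] ? r : best)
  );
}
```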

Experiment Tracking with @Experiment Decorator

import { Injectable } from '@hazeljs/core';
import { Experiment, Model, Train } from '@hazeljs/ml';

@Experiment({
  name: 'sentiment-classifier',
  description: 'Training sentiment classification models',
  tags: ['nlp'],
  autoLogParams: true,
  autoLogMetrics: true,
})
@Model({ name: 'sentiment', version: '1.0.0', framework: 'custom' })
@Injectable()
class SentimentClassifier {
  @Train()
  async train(data: TrainingData) {
    // Training runs are automatically tracked
  }
}

Drift Detection & Monitoring

Production ML monitoring with statistical drift detection:

import { DriftService, MonitorService } from '@hazeljs/ml';

// Initialize drift service
const driftService = new DriftService();

// Set reference distribution from training data
driftService.setReferenceDistribution('age', trainingAges);
driftService.setReferenceDistribution('income', trainingIncomes);

// Detect drift in production data
const ageResult = driftService.detectDrift('age', productionAges, {
  method: 'ks', // Kolmogorov-Smirnov test
  threshold: 0.1,
});

if (ageResult.driftDetected) {
  console.warn(`Drift detected: ${ageResult.message}`);
  console.log(`KS statistic: ${ageResult.score}, p-value: ${ageResult.pValue}`);
}

// Run full drift report on multiple features
const report = driftService.detectDriftReport(
  {
    age: productionAges,
    income: productionIncomes,
    creditScore: productionCreditScores,
  },
  {
    method: 'psi', // Population Stability Index
    threshold: 0.25,
  }
);

console.log(`Drift detected in ${report.driftedFeatures}/${report.totalFeatures} features`);
console.log(`Overall drift: ${report.overallDrift}`);

// Detect prediction drift
const predDrift = driftService.detectPredictionDrift(
  trainingPredictions,
  productionPredictions
);

// Set up continuous monitoring
const monitorService = new MonitorService(driftService);

monitorService.registerModel({
  modelName: 'credit-risk-model',
  modelVersion: '1.0.0',
  featureDrift: {
    method: 'ks',
    threshold: 0.1,
  },
  accuracyMonitor: {
    threshold: 0.85,
    windowSize: 100,
  },
  checkIntervalMinutes: 60,
});

// Set up alert handler
monitorService.onAlert(async (alert) => {
  console.error(`[${alert.severity}] ${alert.alertType}: ${alert.message}`);
  // Send to Slack, PagerDuty, etc.
});

// Record accuracy for monitoring
monitorService.recordAccuracy('credit-risk-model', 0.92);

Drift Detection Methods

| Method | Use Case | Range |
|---|---|---|
| PSI (Population Stability Index) | Overall distribution shift | 0–∞ (>0.25 = significant) |
| KS (Kolmogorov-Smirnov) | Continuous features | 0–1 (D statistic + p-value) |
| JSD (Jensen-Shannon Divergence) | Symmetric distribution comparison | 0–0.693 |
| Chi-square | Categorical features | Chi² statistic + p-value |
| Wasserstein | Earth Mover's Distance | 0–∞ (normalized by std) |

All statistical tests are implemented in pure TypeScript with no Python dependencies.
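As a concrete example of what these tests compute, here is a minimal PSI implementation over equal-width bins. This is a sketch under stated assumptions (fixed bin count, a small epsilon to avoid log(0)); the package's actual binning strategy may differ:

```typescript
// Sketch: Population Stability Index over equal-width bins.
// PSI = Σ over bins of (actual% - expected%) * ln(actual% / expected%).
function psi(expected: number[], actual: number[], bins = 10): number {
  const min = Math.min(...expected, ...actual);
  const max = Math.max(...expected, ...actual);
  const width = (max - min) / bins || 1; // guard against zero-range data

  const proportions = (xs: number[]) => {
    const counts = new Array(bins).fill(0);
    for (const x of xs) {
      const i = Math.min(bins - 1, Math.floor((x - min) / width));
      counts[i]++;
    }
    // Epsilon keeps empty bins from producing log(0) or division by zero.
    return counts.map((c) => Math.max(c / xs.length, 1e-6));
  };

  const e = proportions(expected);
  const a = proportions(actual);
  return e.reduce((sum, ei, i) => sum + (a[i] - ei) * Math.log(a[i] / ei), 0);
}
```

Identical distributions score ~0; by the common rule of thumb cited above, values over 0.25 indicate a significant shift.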

Service Summary

| Service | Purpose |
|---|---|
| ModelRegistry | Register and look up models by name/version |
| TrainerService | Discover and invoke @Train methods |
| PredictorService | Discover and invoke @Predict methods |
| PipelineService | Data preprocessing (inline run(data, steps) or named pipelines) |
| BatchService | Batch prediction with configurable batch size (results in input order) |
| MetricsService | Model evaluation via evaluate() and metrics tracking |
| FeatureStoreService | Manage ML features with online/offline storage |
| ExperimentService | Track experiments, runs, metrics, and artifacts |
| DriftService | Statistical drift detection (PSI, KS, JSD, Chi², Wasserstein) |
| MonitorService | Continuous model monitoring with alerting |

Recipes

Recipe: Sentiment Analysis Model

// File: src/ml/sentiment.model.ts
import { Model, Train, Predict } from '@hazeljs/ml';
import { Service } from '@hazeljs/core';

@Model({ name: 'sentiment', version: '1.0.0', framework: 'custom' })
@Service()
export class SentimentModel {
  @Train()
  async train(data: { text: string; label: string }[]) {
    // Train on labeled sentiment data
    return { accuracy: 0.92, samples: data.length };
  }

  @Predict()
  async predict(input: { text: string }) {
    // Returns sentiment prediction
    return { label: 'positive', confidence: 0.87 };
  }
}

Recipe: Serve ML Predictions via REST

// File: src/ml/ml.controller.ts
import { Controller, Post, Body } from '@hazeljs/core';
import { ModelRegistry } from '@hazeljs/ml';

@Controller('ml')
export class MLController {
  constructor(private readonly registry: ModelRegistry) {}

  @Post('predict')
  async predict(@Body() body: { model: string; input: any }) {
    const model = this.registry.get(body.model);
    const prediction = await model.predict(body.input);
    return { model: body.model, prediction };
  }
}

Recipe: Feature Store with Online/Offline Access

// File: src/ml/features.service.ts
import { Service } from '@hazeljs/core';
import { FeatureStoreService } from '@hazeljs/ml';

@Service()
export class FeatureService {
  constructor(private readonly features: FeatureStoreService) {}

  async storeUserFeatures(userId: string, features: Record<string, number>) {
    await this.features.pushOnlineFeatures(userId, features);
  }

  async getUserFeatures(userId: string, featureNames: string[]) {
    return this.features.getOnlineFeatures([userId], featureNames);
  }
}
See Also

  • AI Package – LLM integration for hybrid AI/ML workflows
  • Cache Package – Cache model outputs and embeddings
  • Config Package – Model paths and API keys
  • hazeljs-ml-starter – Full app with sentiment, spam, intent classifiers, REST API, and scripts
  • example/src/ml – Minimal runnable example of @Model, @Train, @Predict (run: npm run ml:decorators)