HazelJS Memory System
The HazelJS Memory System provides persistent context and conversation management for AI applications: conversation history, entities, facts, and working memory. Memory lives in `@hazeljs/rag` (`BufferMemory`, `VectorMemory`, `HybridMemory`) and can optionally be backed by `@hazeljs/memory` for a unified, multi-store model shared by RAG and agents.
Quick Reference
- Purpose: Manages conversation history, entity tracking, facts, and working memory for AI applications, providing persistent context across multi-turn interactions for both RAG pipelines and AI agents.
- When to use: When AI agents or RAG pipelines need to remember conversation history across turns, track entities (people, companies), store facts, or maintain working memory for complex multi-step reasoning.
- Key concepts: `MemoryManager`, 5 memory types (Conversation, Entity, Fact, Event, Working), 3 storage strategies (`BufferMemory`, `VectorMemory`, `HybridMemory`), semantic search over memories, auto-summarization, importance scoring, memory decay, shared memory between RAG and agents, `@hazeljs/memory` backend adapter.
- Inputs: User messages, assistant responses, entity data, facts, `sessionId` for per-session isolation.
- Outputs: Retrieved conversation context, relevant memories, entity profiles, fact lists, memory statistics.
- Dependencies: `@hazeljs/rag` (primary), optionally `@hazeljs/memory` for persistent backends (Prisma, Redis).
- Common patterns: Create a `MemoryManager` with storage config → pass it to both `RagService` and `AgentRuntime` for shared context → use `sessionId` to isolate conversations → use `HybridMemory` for best retrieval quality.
- Common mistakes: Not sharing the same `MemoryManager` instance between RAG and agents (leads to fragmented context); using `BufferMemory` alone for long conversations (loses old context); not setting a `sessionId` (mixes conversations); forgetting to configure embeddings for `VectorMemory`.
When to Use: Storage Strategy Decision Guide
| Strategy | Speed | Retrieval quality | Use when |
|---|---|---|---|
| `BufferMemory` | Fastest | Recency-only (FIFO) | Short conversations, prototyping, low-latency requirements |
| `VectorMemory` | Slower (embedding lookups) | Best semantic relevance | Long conversations, fact-heavy workloads, agent reasoning |
| `HybridMemory` | Medium | Best overall (recent + semantic) | Production applications that need both recent and relevant memories |
Overview
The HazelJS Memory System provides:
- 5 Memory Types: Conversation, Entity, Fact, Event, and Working memory
- 3 Storage Strategies: BufferMemory (fast), VectorMemory (semantic), HybridMemory (best of both)
- Semantic Search: Find relevant memories using embeddings
- Auto-Summarization: Compress old conversations automatically
- Entity Tracking: Remember people, companies, and relationships
- Importance Scoring: Prioritize relevant information
- RAG Integration: Combine document retrieval with conversation context
- Shared memory (RAG + Agent): One `MemoryManager` in-process, passed to both RAG and every `AgentRuntime` so they share the same conversation and context (see Shared memory: RAG + Agent)
- Optional `@hazeljs/memory` backend: Use the Memory package (in-memory, Prisma, Redis) and the adapter from `@hazeljs/rag/memory-hazel` to back RAG/agent memory with one store
Architecture
```mermaid
graph TD
    A["User Message"] --> B["Memory Manager"]
    B --> C["Conversation Memory"]
    B --> D["Entity Memory"]
    B --> E["Fact Memory"]
    B --> F["Working Memory"]
    C --> G["Buffer Store<br/>(Recent)"]
    C --> H["Vector Store<br/>(Long-term)"]
    I["Query"] --> J["Memory Search"]
    J --> G
    J --> H
    H --> K["Semantic Search"]
    K --> L["Relevant Memories"]
    M["RAG Pipeline"] --> N["Document Retrieval"]
    M --> L
    N --> O["Enhanced Context"]
    L --> O
    O --> P["LLM Response"]
    style A fill:#3b82f6,stroke:#60a5fa,stroke-width:2px,color:#fff
    style B fill:#3b82f6,stroke:#60a5fa,stroke-width:2px,color:#fff
    style M fill:#10b981,stroke:#34d399,stroke-width:2px,color:#fff
    style P fill:#8b5cf6,stroke:#a78bfa,stroke-width:2px,color:#fff
```
Memory Types
Conversation Memory
Track multi-turn conversations with automatic summarization.
```typescript
import { MemoryManager, BufferMemory } from '@hazeljs/rag';

const memoryStore = new BufferMemory({ maxSize: 100 });
const memoryManager = new MemoryManager(memoryStore, {
  maxConversationLength: 20,
  summarizeAfter: 50,
});
await memoryManager.initialize();

// Add messages
await memoryManager.addMessage(
  { role: 'user', content: 'What is HazelJS?' },
  'session-123'
);
await memoryManager.addMessage(
  { role: 'assistant', content: 'HazelJS is an AI-native framework...' },
  'session-123'
);

// Get history
const history = await memoryManager.getConversationHistory('session-123', 10);

// Summarize
const summary = await memoryManager.summarizeConversation('session-123');
```
Features:
- Sliding window for recent messages
- Automatic summarization of old conversations
- Token-aware context management
- Multi-session support
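The sliding-window part of this behavior can be sketched in plain TypeScript. The `SlidingWindow` class below is illustrative only, not the package's internals; the real `MemoryManager` additionally summarizes evicted messages and respects token budgets.

```typescript
// Illustrative sliding window over conversation messages: keep at most
// maxLength messages, evicting the oldest first.
interface Message {
  role: 'user' | 'assistant';
  content: string;
}

class SlidingWindow {
  private messages: Message[] = [];
  constructor(private maxLength: number) {}

  add(message: Message): void {
    this.messages.push(message);
    // Evict the oldest messages once the window is full
    if (this.messages.length > this.maxLength) {
      this.messages.splice(0, this.messages.length - this.maxLength);
    }
  }

  history(limit?: number): Message[] {
    return limit ? this.messages.slice(-limit) : [...this.messages];
  }
}

const convo = new SlidingWindow(3);
for (const content of ['a', 'b', 'c', 'd']) {
  convo.add({ role: 'user', content });
}
// 'a' has been evicted; only the three most recent messages remain
```

In the real system, summarization (covered below) is what keeps the evicted messages from being lost entirely.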
Entity Memory
Track entities (people, companies, concepts) mentioned in conversations.
```typescript
// Track an entity
await memoryManager.trackEntity({
  name: 'Alice',
  type: 'person',
  attributes: {
    role: 'engineer',
    company: 'TechCorp',
  },
  relationships: [
    { type: 'works_at', target: 'TechCorp' },
  ],
  firstSeen: new Date(),
  lastSeen: new Date(),
  mentions: 1,
});

// Retrieve entity (returns null if not found)
const alice = await memoryManager.getEntity('Alice');

// Update entity
await memoryManager.updateEntity('Alice', {
  attributes: { ...alice?.attributes, status: 'premium' },
});

// Get all entities
const entities = await memoryManager.getAllEntities('session-123');
```
Use Cases:
- Customer relationship management
- Personalized recommendations
- Knowledge graph construction
- Context-aware responses
Semantic Memory (Facts)
Store and recall facts with semantic understanding.
```typescript
// Store facts
await memoryManager.storeFact(
  'User prefers dark mode',
  { userId: 'user-123', category: 'preference' }
);
await memoryManager.storeFact(
  'HazelJS supports TypeScript decorators',
  { category: 'framework-feature' }
);

// Recall facts semantically
const facts = await memoryManager.recallFacts('user preferences', {
  topK: 5,
  minScore: 0.7,
});

// Update a fact (storeFact returns the fact's id)
await memoryManager.updateFact(factId, 'User prefers light mode');
```
Features:
- Semantic search across facts
- Time-based relevance
- Conflict detection
- Automatic consolidation
Working Memory
Temporary scratchpad for current task context.
```typescript
// Set context
await memoryManager.setContext('current_task', 'checkout', 'session-123');
await memoryManager.setContext('cart_items', ['item1', 'item2'], 'session-123');

// Get context
const task = await memoryManager.getContext('current_task', 'session-123');
const items = await memoryManager.getContext('cart_items', 'session-123');

// Clear context
await memoryManager.clearContext('session-123');
```
Use Cases:
- Multi-step workflows
- State management
- Temporary calculations
- Task coordination
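Conceptually, working memory is a per-session key/value scratchpad. A minimal sketch, with illustrative names rather than the package's implementation:

```typescript
// Per-session scratchpad mirroring setContext/getContext/clearContext.
// Each session gets its own isolated Map of keys to values.
class WorkingMemory {
  private sessions = new Map<string, Map<string, unknown>>();

  setContext(key: string, value: unknown, sessionId: string): void {
    if (!this.sessions.has(sessionId)) this.sessions.set(sessionId, new Map());
    this.sessions.get(sessionId)!.set(key, value);
  }

  getContext(key: string, sessionId: string): unknown {
    return this.sessions.get(sessionId)?.get(key);
  }

  clearContext(sessionId: string): void {
    this.sessions.delete(sessionId);
  }
}

const wm = new WorkingMemory();
wm.setContext('current_task', 'checkout', 'session-123');
wm.setContext('current_task', 'browse', 'session-456'); // isolated per session
```

The session map is why a consistent `sessionId` matters: two different ids never see each other's context.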
Storage Strategies
BufferMemory
Fast FIFO in-memory buffer for recent memories.
```typescript
import { BufferMemory } from '@hazeljs/rag';

const buffer = new BufferMemory({
  maxSize: 100,
  ttl: 3600000, // 1 hour in milliseconds
});
```
Best For:
- Development and testing
- Recent conversation history
- Low-latency requirements
- Temporary context
Advantages:
- Extremely fast (in-memory)
- Zero setup
- No external dependencies
- Automatic TTL expiration
Limitations:
- Data lost on restart
- Limited capacity
- No semantic search
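The FIFO-plus-TTL behavior can be sketched as follows; the class and its lazy-expiration strategy are assumptions for illustration, not `BufferMemory`'s actual internals:

```typescript
// Bounded FIFO buffer with TTL: evict oldest on overflow, drop expired on read.
interface Entry { content: string; storedAt: number; }

class FifoBuffer {
  private entries: Entry[] = [];
  constructor(private maxSize: number, private ttlMs: number) {}

  add(content: string, now = Date.now()): void {
    this.entries.push({ content, storedAt: now });
    if (this.entries.length > this.maxSize) this.entries.shift(); // FIFO eviction
  }

  all(now = Date.now()): string[] {
    // TTL expiration happens lazily on read
    this.entries = this.entries.filter((e) => now - e.storedAt < this.ttlMs);
    return this.entries.map((e) => e.content);
  }
}

const buf = new FifoBuffer(2, 1000);
buf.add('first', 0);
buf.add('second', 0);
buf.add('third', 0); // 'first' evicted by maxSize
```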
VectorMemory
Stores memories as embeddings for semantic search.
```typescript
import { VectorMemory, MemoryVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});
const vectorStore = new MemoryVectorStore(embeddings);
const vectorMemory = new VectorMemory(vectorStore, embeddings, {
  collectionName: 'memories',
});
```
Best For:
- Long-term memory storage
- Semantic search requirements
- Production deployments
- Large memory volumes
Advantages:
- Semantic search
- Persistent storage
- Scalable
- Works with any vector store
Limitations:
- Slower than buffer
- Requires embeddings
- External dependencies
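Under the hood, semantic retrieval ranks stored memories by embedding similarity. The sketch below uses toy 2-d vectors and cosine similarity in place of a real embedding provider; the `topK`/`minScore` semantics mirror the search options shown elsewhere in this page:

```typescript
// Rank stored memories by cosine similarity to a query vector.
type Vec = number[];

function cosine(a: Vec, b: Vec): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: Vec) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

interface Stored { content: string; embedding: Vec; }

function search(memories: Stored[], query: Vec, topK: number, minScore: number) {
  return memories
    .map((m) => ({ content: m.content, score: cosine(m.embedding, query) }))
    .filter((r) => r.score >= minScore) // drop weak matches
    .sort((a, b) => b.score - a.score)  // best first
    .slice(0, topK);
}

const memories: Stored[] = [
  { content: 'user prefers dark mode', embedding: [1, 0] },
  { content: 'pricing discussed last week', embedding: [0, 1] },
  { content: 'user likes dark themes', embedding: [0.9, 0.1] },
];
const results = search(memories, [1, 0], 2, 0.7);
// Both dark-mode memories match; the pricing memory is filtered out
```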
HybridMemory
Combines buffer and vector storage for optimal performance.
```typescript
import { HybridMemory, BufferMemory, VectorMemory } from '@hazeljs/rag';

const buffer = new BufferMemory({ maxSize: 20 });
const vectorMemory = new VectorMemory(vectorStore, embeddings);
const hybrid = new HybridMemory(buffer, vectorMemory, {
  archiveThreshold: 15, // Archive after 15 messages
});
```
Best For:
- Production applications
- Balancing speed and persistence
- Large-scale deployments
- Best of both worlds
How It Works:
- Recent memories stay in fast buffer
- Old memories automatically archive to vector store
- Searches check both stores
- Deduplication ensures consistency
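The archive flow above can be sketched as two tiers, with keyword matching standing in for the vector store's semantic search; this is an illustration of the tiering idea, not the package's code:

```typescript
// Two-tier memory: recent items in a bounded buffer, overflow archived.
class HybridSketch {
  private buffer: string[] = [];
  private archive: string[] = [];
  constructor(private archiveThreshold: number) {}

  add(content: string): void {
    this.buffer.push(content);
    while (this.buffer.length > this.archiveThreshold) {
      // Oldest buffered memory is archived, not lost
      this.archive.push(this.buffer.shift()!);
    }
  }

  search(term: string): string[] {
    const hit = (m: string) => m.includes(term);
    // Search both tiers; deduplicate in case a memory exists in both
    return [...new Set([...this.buffer.filter(hit), ...this.archive.filter(hit)])];
  }
}

const hybrid = new HybridSketch(2);
['alpha one', 'beta one', 'alpha two'].forEach((m) => hybrid.add(m));
// 'alpha one' has been archived; search still finds it alongside 'alpha two'
```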
RAG Integration
Combine memory with document retrieval for context-aware responses.
```typescript
import {
  RAGPipelineWithMemory,
  MemoryManager,
  HybridMemory,
  BufferMemory,
  VectorMemory,
  MemoryVectorStore,
  OpenAIEmbeddings,
} from '@hazeljs/rag';

// Setup memory
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});
const buffer = new BufferMemory({ maxSize: 20 });
const memoryVectorStore = new MemoryVectorStore(embeddings);
const vectorMemory = new VectorMemory(memoryVectorStore, embeddings);
const hybridMemory = new HybridMemory(buffer, vectorMemory);

const memoryManager = new MemoryManager(hybridMemory, {
  maxConversationLength: 20,
  summarizeAfter: 50,
  entityExtraction: true,
});

// Setup RAG
const documentVectorStore = new MemoryVectorStore(embeddings);
const rag = new RAGPipelineWithMemory(
  {
    vectorStore: documentVectorStore,
    embeddingProvider: embeddings,
    topK: 5,
  },
  memoryManager,
  llmFunction
);
await rag.initialize();

// Add documents
await rag.addDocuments([
  {
    content: 'HazelJS is a modern TypeScript framework...',
    metadata: { source: 'docs' },
  },
]);

// Query with memory context
const response = await rag.queryWithMemory(
  'What did we discuss about pricing?',
  'session-123',
  'user-456'
);

console.log(response.answer);
console.log('Sources:', response.sources);
console.log('Memories:', response.memories);
console.log('History:', response.conversationHistory);
```
Enhanced Context
The RAG pipeline with memory combines three sources of context:
- Document Retrieval: Relevant documents from knowledge base
- Conversation History: Recent messages in the conversation
- Relevant Memories: Semantically similar past interactions
```typescript
// Automatic fact extraction
const response = await rag.queryWithLearning(
  'Tell me about HazelJS features',
  'session-123',
  'user-456'
);
// Facts from the response are automatically stored

// Get conversation summary
const summary = await rag.getConversationSummary('session-123');

// Recall specific facts
const facts = await rag.recallFacts('user preferences', 5);

// Memory statistics
const stats = await rag.getMemoryStats('session-123');
```
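How the three context sources might be merged into a single prompt can be sketched as follows; the section names and layout are invented for illustration and are not the library's actual prompt template:

```typescript
// Merge documents, history, and memories into one prompt string.
interface ContextSources {
  documents: string[];           // from document retrieval
  conversationHistory: string[]; // recent messages
  memories: string[];            // semantically relevant past interactions
}

function buildPrompt(query: string, ctx: ContextSources): string {
  const section = (title: string, items: string[]) =>
    items.length ? `${title}:\n${items.map((i) => `- ${i}`).join('\n')}` : '';
  return [
    section('Relevant documents', ctx.documents),
    section('Conversation so far', ctx.conversationHistory),
    section('Relevant memories', ctx.memories),
    `Question: ${query}`,
  ].filter(Boolean).join('\n\n');
}

const prompt = buildPrompt('What did we discuss about pricing?', {
  documents: ['Pricing tiers: free, pro, enterprise'],
  conversationHistory: ['user: How much is the pro plan?'],
  memories: ['User asked about discounts last session'],
});
```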
Shared memory: RAG + Agent
RAG and agents can share the same memory in one process: create one store and one `MemoryManager`, then pass that `MemoryManager` to both `RAGPipelineWithMemory` and every `AgentRuntime`. The same `sessionId` means the same conversation history and context for both.
- In-process — No separate memory service, no HTTP. Everything runs in the same Node.js process.
- Central provisioning — Create the store and `MemoryManager` once at app startup (e.g. in a module or bootstrap), then inject or pass the same instance into RAG and all agents.
```typescript
import { MemoryManager, RAGPipelineWithMemory, BufferMemory } from '@hazeljs/rag';
import { AgentRuntime } from '@hazeljs/agent';

// One store, one MemoryManager
const ragStore = new BufferMemory({ maxSize: 50 });
const memoryManager = new MemoryManager(ragStore, { maxConversationLength: 30 });
await memoryManager.initialize();

// Pass the same MemoryManager to RAG and to every agent
const rag = new RAGPipelineWithMemory(config, memoryManager, llmFunction);
const agentA = new AgentRuntime({ memoryManager, llmProvider: openAIProvider });
const agentB = new AgentRuntime({ memoryManager, llmProvider: openAIProvider });

// Same sessionId => same conversation and context for RAG and agents
const sessionId = 'user-123-session';
await rag.queryWithMemory('What is HazelJS?', sessionId, 'user-123');
await agentA.execute('my-agent', 'What did we just discuss?', { sessionId, userId: 'user-123', enableMemory: true });
// Agent sees the RAG conversation via memoryManager.getConversationHistory(sessionId)
```
Using @hazeljs/memory as the backend
To back RAG and agent memory with the Memory package (in-memory, Prisma, Redis, or composite), use the `HazelMemoryStoreAdapter` so RAG's `MemoryManager` talks to `@hazeljs/memory`:
- Install the optional peer: `npm install @hazeljs/memory`
- Create a store and `MemoryService` from `@hazeljs/memory`, wrap it with the adapter from `@hazeljs/rag/memory-hazel`, then create one `MemoryManager` and pass it to RAG and all agents.
```typescript
import { MemoryManager, RAGPipelineWithMemory } from '@hazeljs/rag';
import { createHazelMemoryStoreAdapter } from '@hazeljs/rag/memory-hazel';
import { MemoryService, createDefaultMemoryStore } from '@hazeljs/memory';
import { AgentRuntime } from '@hazeljs/agent';

const hazelStore = createDefaultMemoryStore();
const memoryService = new MemoryService(hazelStore);
const ragStore = createHazelMemoryStoreAdapter(memoryService);
const memoryManager = new MemoryManager(ragStore);

const rag = new RAGPipelineWithMemory(config, memoryManager, llmFunction);
const agent = new AgentRuntime({ memoryManager, llmProvider });
```
- In-process: RAG, agents, and memory run in the same Node process; no HTTP.
- Shared memory: One store and one `MemoryManager` created at app level and passed to RAG and every `AgentRuntime`.
- For Prisma or other backends, use the appropriate store factory from `@hazeljs/memory` (or `@hazeljs/memory/prisma`) before wrapping with `createHazelMemoryStoreAdapter`.
See the Memory Package for categories, multi-store options, and the full API.
Advanced Features
Memory Search
Search across all memories semantically:
```typescript
// MemoryType is assumed to be exported from @hazeljs/rag
import { MemoryType } from '@hazeljs/rag';

const relevantMemories = await memoryManager.relevantMemories(
  'pricing and discounts',
  {
    sessionId: 'session-123',
    types: [MemoryType.CONVERSATION, MemoryType.FACT],
    topK: 5,
    minScore: 0.7,
  }
);
```
Importance Scoring
Automatically calculate and use importance scores:
```typescript
const memoryManager = new MemoryManager(memoryStore, {
  importanceScoring: true, // Enable automatic scoring
});

// Memories with higher importance are retained longer.
// Questions and long content get higher scores.
```
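A heuristic consistent with that note can be sketched as follows; the weights are invented for illustration and are not the package's actual scoring function:

```typescript
// Toy importance heuristic: questions and long content score higher.
function importanceScore(content: string): number {
  let score = 0.5; // neutral baseline
  if (content.trim().endsWith('?')) score += 0.2; // questions matter more
  if (content.length > 100) score += 0.2;         // long content carries detail
  return Math.min(score, 1);
}

const short = importanceScore('ok');
const question = importanceScore('What discounts apply to the pro plan?');
const longNote = importanceScore('x'.repeat(150));
```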
Memory Decay
Time-based relevance scoring:
```typescript
const memoryManager = new MemoryManager(memoryStore, {
  memoryDecay: true,
  decayRate: 0.1, // 10% decay per time unit
});

// Older memories gradually become less relevant
```
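One common way to model this is exponential decay per elapsed time unit; whether the package uses exactly this curve is an assumption:

```typescript
// Relevance decays multiplicatively: each time unit keeps (1 - decayRate)
// of the previous score.
function decayedRelevance(baseScore: number, decayRate: number, elapsedUnits: number): number {
  return baseScore * Math.pow(1 - decayRate, elapsedUnits);
}

const fresh = decayedRelevance(1.0, 0.1, 0); // just stored
const aged = decayedRelevance(1.0, 0.1, 5);  // five time units later
```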
Memory Statistics
Monitor memory usage:
```typescript
const stats = await memoryManager.getStats('session-123');

console.log(`Total memories: ${stats.totalMemories}`);
console.log(`By type:`, stats.byType);
console.log(`Average importance: ${stats.averageImportance}`);
console.log(`Oldest: ${stats.oldestMemory}`);
console.log(`Newest: ${stats.newestMemory}`);
```
Memory Pruning
Clean up old or low-importance memories:
```typescript
// Prune memories older than 30 days
const prunedOld = await memoryManager.prune({
  olderThan: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000),
});

// Prune low-importance memories
const prunedLowImportance = await memoryManager.prune({
  minImportance: 0.5,
});
```
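The pruning semantics above amount to filtering by a cutoff date and an importance floor; a self-contained sketch under those assumptions:

```typescript
// Keep memories that are newer than the cutoff AND above the importance floor.
interface Mem { content: string; createdAt: Date; importance: number; }

function prune(
  memories: Mem[],
  opts: { olderThan?: Date; minImportance?: number }
): { kept: Mem[]; prunedCount: number } {
  const kept = memories.filter((m) => {
    if (opts.olderThan && m.createdAt < opts.olderThan) return false;
    if (opts.minImportance !== undefined && m.importance < opts.minImportance) return false;
    return true;
  });
  return { kept, prunedCount: memories.length - kept.length };
}

const sample: Mem[] = [
  { content: 'old chat', createdAt: new Date('2023-01-01'), importance: 0.9 },
  { content: 'recent note', createdAt: new Date('2024-06-01'), importance: 0.3 },
  { content: 'recent fact', createdAt: new Date('2024-06-01'), importance: 0.8 },
];
const result = prune(sample, { olderThan: new Date('2024-01-01'), minImportance: 0.5 });
// 'old chat' fails the date cutoff, 'recent note' fails the importance floor
```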
Configuration
Memory Manager Config
```typescript
const config = {
  maxConversationLength: 20, // Max messages in buffer
  summarizeAfter: 50,        // Summarize after N messages
  entityExtraction: true,    // Auto-extract entities
  importanceScoring: true,   // Calculate importance scores
  memoryDecay: false,        // Enable time-based decay
  decayRate: 0.1,            // Decay rate (if enabled)
  maxWorkingMemorySize: 10,  // Max working memory items
};

const memoryManager = new MemoryManager(memoryStore, config);
```
Buffer Memory Config
```typescript
const bufferConfig = {
  maxSize: 100, // Max memories in buffer
  ttl: 3600000, // Time to live (ms)
};

const buffer = new BufferMemory(bufferConfig);
```
Hybrid Memory Config
```typescript
const hybridConfig = {
  bufferSize: 20,       // Buffer size
  archiveThreshold: 15, // Archive after N messages
  ttl: 3600000,         // Buffer TTL
};

const hybrid = new HybridMemory(buffer, vectorMemory, hybridConfig);
```
Use Cases
Customer Support Bot
```typescript
// Remember customer information
await memoryManager.trackEntity({
  name: 'Jane Smith',
  type: 'customer',
  attributes: { tier: 'premium', accountId: 'ACC-123' },
  // ...
});

// Store support history
await memoryManager.storeFact(
  'Customer reported login issues on 2024-01-15',
  { customerId: 'ACC-123', category: 'support' }
);

// Context-aware responses
const response = await rag.queryWithMemory(
  'What was my previous issue?',
  'session-123',
  'ACC-123'
);
```
Personal AI Assistant
```typescript
// Remember preferences
await memoryManager.storeFact('User prefers concise responses');
await memoryManager.storeFact('User timezone is PST');

// Track tasks
await memoryManager.setContext('active_tasks', ['email', 'meeting'], 'session-123');

// Personalized responses
const response = await rag.queryWithMemory(
  'What should I focus on today?',
  'session-123'
);
```
Educational Tutor
```typescript
// Track learning progress
await memoryManager.trackEntity({
  name: 'Student-123',
  type: 'student',
  attributes: {
    level: 'intermediate',
    completedLessons: ['intro', 'basics'],
  },
  // ...
});

// Remember misconceptions
await memoryManager.storeFact(
  'Student confused about async/await',
  { studentId: 'Student-123', topic: 'javascript' }
);
```
Best Practices
Choose the Right Store
- Development: Use `BufferMemory` for fast iteration
- Production: Use `HybridMemory` for best performance
- Semantic Search: Use `VectorMemory` when search is critical
Set Appropriate Limits
- Configure `maxConversationLength` based on LLM token limits
- Set `archiveThreshold` to balance performance and memory
- Use `summarizeAfter` to compress long conversations
Enable Features Selectively
- `entityExtraction`: For tracking people and things
- `importanceScoring`: For prioritization
- `memoryDecay`: For time-based relevance
Monitor Memory Usage
```typescript
// Regular monitoring
const stats = await memoryManager.getStats();
console.log(`Memory usage: ${stats.totalMemories}`);

// Periodic pruning (recompute the cutoff on each run)
const thirtyDaysAgo = () => new Date(Date.now() - 30 * 24 * 60 * 60 * 1000);
setInterval(async () => {
  await memoryManager.prune({ olderThan: thirtyDaysAgo() });
}, 24 * 60 * 60 * 1000); // Daily
```
Session Management
```typescript
// Use one session ID per conversation, generated once at session start
const sessionId = `user-${userId}-${Date.now()}`;

// Clear sessions when done
await memoryManager.clearConversation(sessionId);
await memoryManager.clearContext(sessionId);
```
Examples
Check out the memory examples for complete working code:
- Basic Memory: Core features and memory types
- RAG with Memory: Integration with document retrieval
- Chatbot with Memory: Complete context-aware chatbot
- Shared Memory (RAG + Agent): One `MemoryManager` shared by RAG and Agent in-process (`npm run memory:shared` in the example app)
API Reference
MemoryManager
```typescript
class MemoryManager {
  // Conversation
  addMessage(message: Message, sessionId: string): Promise<string>;
  getConversationHistory(sessionId: string, limit?: number): Promise<Message[]>;
  summarizeConversation(sessionId: string): Promise<string>;
  clearConversation(sessionId: string): Promise<void>;

  // Entity
  trackEntity(entity: Entity): Promise<void>;
  getEntity(name: string): Promise<Entity | null>;
  updateEntity(name: string, updates: Partial<Entity>): Promise<void>;
  getAllEntities(sessionId?: string): Promise<Entity[]>;

  // Facts
  storeFact(fact: string, metadata?: Record<string, any>): Promise<string>;
  recallFacts(query: string, options?: MemorySearchOptions): Promise<string[]>;
  updateFact(id: string, newContent: string): Promise<void>;

  // Working Memory
  setContext(key: string, value: any, sessionId: string): Promise<void>;
  getContext(key: string, sessionId: string): Promise<any>;
  clearContext(sessionId: string): Promise<void>;

  // Search & Stats
  relevantMemories(query: string, options: MemorySearchOptions): Promise<Memory[]>;
  getStats(sessionId?: string): Promise<MemoryStats>;
}
```
RAGPipelineWithMemory
```typescript
class RAGPipelineWithMemory extends RAGPipeline {
  queryWithMemory(
    query: string,
    sessionId: string,
    userId?: string,
    options?: RAGQueryOptions
  ): Promise<RAGResponseWithMemory>;

  queryWithLearning(
    query: string,
    sessionId: string,
    userId?: string,
    options?: RAGQueryOptions
  ): Promise<RAGResponseWithMemory>;

  clearSessionMemory(sessionId: string): Promise<void>;
  getConversationSummary(sessionId: string): Promise<string>;
  storeFact(fact: string, sessionId?: string, userId?: string): Promise<string>;
  recallFacts(query: string, topK?: number): Promise<string[]>;
  getMemoryStats(sessionId?: string): Promise<MemoryStats>;
}
```
Next Steps
- Memory Package — Multi-store memory with in-memory default, categories, and optional Prisma/Redis/vector
- Shared memory — One MemoryManager for RAG and agents; @hazeljs/memory as backend
- Explore RAG Patterns for advanced retrieval strategies
- Check out Vector Stores for storage options
- See the RAG Package for complete API reference