Ops Agent Package
@hazeljs/ops-agent is a production-ready AI-powered DevOps assistant. Describe an incident in plain English and the agent creates Jira tickets, posts Slack notifications, and coordinates response — with optional human approval before any action runs.
What is an Ops Agent?
Traditional incident workflows require engineers to manually:
- Triage the alert
- Open a browser and create a Jira ticket
- Copy/paste context into a Slack channel
- Follow up with comments as the situation evolves
The Ops Agent collapses all of that into a single natural-language instruction. It understands context, decides which tools to use, executes them in the right order, and asks for approval before taking potentially disruptive actions.
Why @hazeljs/ops-agent?
| Challenge | Manual Workflow | With ops-agent |
|---|---|---|
| Creating Jira tickets | Copy context to browser, fill form | Single natural language command |
| Slack notifications | Manually compose and post | Agent posts with full incident context |
| Follow-up comments | Back to Jira, find ticket, add comment | "Add comment to the ticket" in the same session |
| Human oversight | No gates — actions taken immediately | requiresApproval: true for destructive ops |
| Incident context | Lost between tools | sessionId keeps memory across turns |
| Development/testing | Needs real Jira/Slack credentials | Mock responses when env vars not set |
Architecture
The runtime wires together the LLM, agent loop, and tool adapters. Natural language input is processed by the AI service, which decides which tools to call. Each tool call may require human approval before executing:
stateDiagram-v2
  [*] --> Idle
  Idle --> Thinking: runOpsAgent(input)
  Thinking --> CallingTool: LLM selects tool
  CallingTool --> WaitingApproval: requiresApproval=true
  WaitingApproval --> CallingTool: approveToolExecution()
  WaitingApproval --> Idle: rejectToolExecution()
  CallingTool --> Thinking: tool result returned
  Thinking --> Completed: LLM produces final response
  Completed --> [*]
  style Thinking fill:#6366f1,color:#fff
  style CallingTool fill:#f59e0b,color:#fff
  style WaitingApproval fill:#ec4899,color:#fff
  style Completed fill:#10b981,color:#fff
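The transitions in the diagram can be modelled as a small lookup table. The sketch below is only an illustration of the state machine, not the package's internal implementation; the event names (`run`, `toolSelected`, and so on) are hypothetical labels chosen for this example.

```typescript
// States taken from the diagram above.
type AgentState = 'Idle' | 'Thinking' | 'CallingTool' | 'WaitingApproval' | 'Completed';

// Hypothetical event names, one per labelled arrow in the diagram.
type AgentEvent =
  | 'run'
  | 'toolSelected'
  | 'approvalRequired'
  | 'approved'
  | 'rejected'
  | 'toolResult'
  | 'finalResponse';

// Allowed transitions: state -> event -> next state.
const transitions: Record<AgentState, Partial<Record<AgentEvent, AgentState>>> = {
  Idle: { run: 'Thinking' },
  Thinking: { toolSelected: 'CallingTool', finalResponse: 'Completed' },
  CallingTool: { approvalRequired: 'WaitingApproval', toolResult: 'Thinking' },
  WaitingApproval: { approved: 'CallingTool', rejected: 'Idle' },
  Completed: {},
};

// Returns the next state, or throws if the diagram has no such arrow.
function nextState(state: AgentState, event: AgentEvent): AgentState {
  const next = transitions[state][event];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${state} on ${event}`);
  }
  return next;
}

// Walk the approval path: run, pick a tool, wait for approval, execute, finish.
const approvalPath: AgentEvent[] = [
  'run',
  'toolSelected',
  'approvalRequired',
  'approved',
  'toolResult',
  'finalResponse',
];
const endState = approvalPath.reduce<AgentState>(nextState, 'Idle');
console.log(endState); // Completed
```

Note that a rejection returns the agent to Idle rather than to Thinking, so a rejected tool call ends the current run.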
Key Components
- createOpsRuntime: Assembles the AI service, agent runtime, Jira tool, and Slack tool into a ready-to-run runtime
- runOpsAgent: Runs the agent with a natural language input and session ID; returns the LLM response
- createJiraTool: Adapter for Jira Cloud API (create, comment, get) with env-var configuration
- createSlackTool: Adapter for Slack Web API (post to channel or thread) with env-var configuration
- Human-in-the-loop: create_jira_ticket and post_to_slack require approval before executing
- Memory: In-memory BufferMemory per session; the same sessionId preserves context across turns
Installation
npm install @hazeljs/ops-agent @hazeljs/ai @hazeljs/agent @hazeljs/rag
Configuration
Environment Variables
| Variable | Description | Required |
|---|---|---|
| OPENAI_API_KEY | OpenAI API key (or other LLM provider) | Yes |
| JIRA_HOST | Jira host, e.g. https://your-domain.atlassian.net | No* |
| JIRA_EMAIL | Email for Basic auth | No* |
| JIRA_API_TOKEN | API token from Atlassian | No* |
| SLACK_BOT_TOKEN | Slack bot token (xoxb-...) from app OAuth | No* |
*When not configured, tools return placeholder responses so you can develop and test without real credentials.
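To see at startup which tools will run against real services, a small environment check can help. The sketch below is an assumption-laden illustration: `checkOpsEnv` is a hypothetical helper, not part of the package, and it assumes the Jira tool needs all three JIRA_* variables to leave placeholder mode, which the package may decide differently.

```typescript
type Env = Record<string, string | undefined>;
type ToolMode = 'live' | 'mock';

interface OpsEnvReport {
  missingRequired: string[]; // required variables that are unset or blank
  jira: ToolMode;
  slack: ToolMode;
}

// Assumption: all three variables are needed for real Jira calls.
const JIRA_VARS = ['JIRA_HOST', 'JIRA_EMAIL', 'JIRA_API_TOKEN'];

// Hypothetical helper: reports which tools would use real credentials.
function checkOpsEnv(env: Env): OpsEnvReport {
  const isSet = (name: string): boolean => (env[name] ?? '').trim() !== '';
  return {
    // If you use another LLM provider, check that provider's key instead.
    missingRequired: isSet('OPENAI_API_KEY') ? [] : ['OPENAI_API_KEY'],
    jira: JIRA_VARS.every(isSet) ? 'live' : 'mock',
    slack: isSet('SLACK_BOT_TOKEN') ? 'live' : 'mock',
  };
}

// Example with a literal environment; in an app you would pass process.env.
const report = checkOpsEnv({ OPENAI_API_KEY: 'sk-test', SLACK_BOT_TOKEN: 'xoxb-test' });
console.log(report); // { missingRequired: [], jira: 'mock', slack: 'live' }
```

Logging this report once at boot makes it obvious when a deployment is silently running with placeholder responses.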
Quick Start
import { AIEnhancedService } from '@hazeljs/ai';
import { createOpsRuntime, runOpsAgent, createJiraTool, createSlackTool } from '@hazeljs/ops-agent';

const runtime = createOpsRuntime({
  aiService: new AIEnhancedService(),
  tools: {
    jira: createJiraTool(),   // reads JIRA_HOST, JIRA_EMAIL, JIRA_API_TOKEN
    slack: createSlackTool(), // reads SLACK_BOT_TOKEN
  },
  model: 'gpt-4',
});

const result = await runOpsAgent(runtime, {
  input: 'Payment API is returning 504s in prod. Create a P1 Jira ticket in OPS and post to #incidents.',
  sessionId: 'incident-2025-001',
});

console.log(result.response);
// => "I've created OPS-1234 'Payment API 504 errors in prod' with priority P1
//     and posted a summary to #incidents."
Available Tools
| Tool | Description | Requires Approval |
|---|---|---|
| create_jira_ticket | Create issue with project, summary, description, and issue type | Yes |
| add_jira_comment | Add a comment to an existing Jira issue by key | No |
| get_jira_ticket | Fetch issue details including status, priority, and assignee | No |
| post_to_slack | Post message to a channel; optionally in a thread | Yes |
Tools marked Requires Approval pause execution and emit a tool.approval.requested event. The agent resumes only after approveToolExecution() is called; if rejectToolExecution() is called instead, the tool call is cancelled and never executes.
Human-in-the-Loop
For any tool with requiresApproval: true, the agent pauses and waits for an explicit approval signal. Subscribe to the event and implement your approval logic (Slack DM, web UI, email — anything):
runtime.on('tool.approval.requested', async (event) => {
  const { requestId, toolName, parameters } = event.data;
  console.log(`Approval requested: ${toolName}`);
  console.log('Parameters:', JSON.stringify(parameters, null, 2));
  // In production: send a DM to the on-call engineer and wait for the response.
  // askOnCallEngineer is a helper you implement yourself; it is not part of the package.
  const approved = await askOnCallEngineer(toolName, parameters);
  if (approved) {
    runtime.approveToolExecution(requestId, 'on-call-engineer');
  } else {
    runtime.rejectToolExecution(requestId);
  }
});
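A helper such as askOnCallEngineer needs some way to wait until a human responds. One possible building block is a registry of pending approvals that resolves a promise when a decision arrives, or resolves to false after a timeout. The `PendingApprovals` class below is a hypothetical sketch for illustration, not part of @hazeljs/ops-agent; how the decision reaches `settle` (Slack interaction, web UI, email link) is left to your application.

```typescript
// Hypothetical registry: one pending promise per approval requestId.
class PendingApprovals {
  private readonly pending = new Map<string, (approved: boolean) => void>();

  // Resolves with the human decision, or with false once timeoutMs elapses.
  wait(requestId: string, timeoutMs: number): Promise<boolean> {
    // If the same id is already waiting, settle the older request as rejected.
    this.settle(requestId, false);
    return new Promise<boolean>((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(requestId);
        resolve(false); // no answer in time counts as a rejection
      }, timeoutMs);
      this.pending.set(requestId, (approved) => {
        clearTimeout(timer);
        this.pending.delete(requestId);
        resolve(approved);
      });
    });
  }

  // Call this when the human answers. Returns false if the id is unknown or expired.
  settle(requestId: string, approved: boolean): boolean {
    const finish = this.pending.get(requestId);
    if (finish === undefined) return false;
    finish(approved);
    return true;
  }

  // Number of approvals still waiting for a decision.
  get size(): number {
    return this.pending.size;
  }
}

// Usage idea (names are illustrative):
//   const approvals = new PendingApprovals();
//   inside the approval handler:   const approved = await approvals.wait(requestId, 300000);
//   inside your UI/webhook route:  approvals.settle(requestId, true);
```

Treating a timeout as a rejection keeps the default safe: nothing is created or posted unless someone explicitly approves.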
Complete Example: Incident Response Bot
A full Express HTTP endpoint that accepts incident descriptions and manages the full ops workflow:
import express from 'express';
import { AIEnhancedService } from '@hazeljs/ai';
import { createOpsRuntime, runOpsAgent, createJiraTool, createSlackTool } from '@hazeljs/ops-agent';

// ─── Runtime (create once, reuse) ────────────────────────────────────────────
const runtime = createOpsRuntime({
  aiService: new AIEnhancedService(),
  tools: {
    jira: createJiraTool(),
    slack: createSlackTool(),
  },
  model: 'gpt-4',
});

// ─── Route approval requests for write tools to #ops-approvals ───────────────
runtime.on('tool.approval.requested', async (event) => {
  const { requestId, toolName, parameters } = event.data;
  // Read-only tools never reach this handler; only create_jira_ticket and post_to_slack do.
  // sendApprovalRequest is a helper you implement yourself (not part of the package):
  // it notifies #ops-approvals and resolves to true or false once someone reacts.
  const approved = await sendApprovalRequest({
    channel: '#ops-approvals',
    text: `*Approval needed:* \`${toolName}\`\n\`\`\`${JSON.stringify(parameters, null, 2)}\`\`\``,
    requestId,
  });
  if (approved) {
    runtime.approveToolExecution(requestId, 'ops-team');
  } else {
    runtime.rejectToolExecution(requestId);
  }
});

// ─── Express API ─────────────────────────────────────────────────────────────
const app = express();
app.use(express.json());

// POST /incident { input, sessionId? }
app.post('/incident', async (req, res) => {
  const { input, sessionId = `incident-${Date.now()}` } = req.body;
  if (!input) {
    return res.status(400).json({ error: 'input is required' });
  }
  try {
    const result = await runOpsAgent(runtime, { input, sessionId });
    res.json({
      response: result.response,
      sessionId, // return so client can continue the conversation
    });
  } catch (err) {
    console.error('Ops agent error:', err);
    res.status(500).json({ error: 'Agent failed to process the request' });
  }
});

// POST /incident/followup { input, sessionId }: continue an existing incident
app.post('/incident/followup', async (req, res) => {
  const { input, sessionId } = req.body;
  if (!input) return res.status(400).json({ error: 'input is required' });
  if (!sessionId) return res.status(400).json({ error: 'sessionId is required for followup' });
  try {
    // Same sessionId = agent remembers the previous context
    const result = await runOpsAgent(runtime, { input, sessionId });
    res.json({ response: result.response });
  } catch (err) {
    // Without this catch, a rejected promise would leave the request hanging
    console.error('Ops agent error:', err);
    res.status(500).json({ error: 'Agent failed to process the request' });
  }
});

app.listen(4000, () => console.log('Ops agent API on :4000'));
Example interactions in the same session
POST /incident
{ "input": "Create a P1 Jira in OPS for 'Redis OOM in prod'", "sessionId": "incident-42" }
POST /incident/followup
{ "input": "Post the ticket link to #sre-incidents", "sessionId": "incident-42" }
POST /incident/followup
{ "input": "Add a comment that we're scaling the Redis cluster", "sessionId": "incident-42" }
Each follow-up uses the same sessionId — the agent knows which ticket was created and which Slack channel was used.
Multi-Turn Memory
createOpsRuntime configures the agent with BufferMemory by default. Every turn for a given sessionId is stored and included in subsequent prompts. This means:
- "Post the ticket link to #incidents" works after "Create the ticket" — the agent remembers the ticket key
- "Add a comment that the fix is deployed" works without specifying the ticket key again
- Context is kept in-memory — restart the process and memory resets (plug in a persistent memory backend for cross-restart memory)
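The idea behind per-session buffer memory can be sketched in a few lines. This is a conceptual illustration only, not the actual BufferMemory implementation; the `SessionBuffer` class, the `Turn` shape, and the `maxTurns` limit are all assumptions made for the example.

```typescript
interface Turn {
  role: 'user' | 'assistant';
  content: string;
}

// Hypothetical per-session buffer: each sessionId has its own list of turns.
class SessionBuffer {
  private readonly sessions = new Map<string, Turn[]>();
  private readonly maxTurns: number;

  constructor(maxTurns = 20) {
    // Keep at least one turn so the slice below always trims from the front.
    this.maxTurns = Math.max(1, maxTurns);
  }

  // Stores a turn and drops the oldest ones beyond maxTurns.
  append(sessionId: string, turn: Turn): void {
    const turns = [...(this.sessions.get(sessionId) ?? []), turn];
    this.sessions.set(sessionId, turns.slice(-this.maxTurns));
  }

  // Turns that would be included in the next prompt for this session.
  history(sessionId: string): readonly Turn[] {
    return this.sessions.get(sessionId) ?? [];
  }
}

const memory = new SessionBuffer();
memory.append('incident-42', { role: 'user', content: "Create a P1 Jira in OPS for 'Redis OOM in prod'" });
memory.append('incident-42', { role: 'assistant', content: 'Created OPS-1234.' });
console.log(memory.history('incident-42').length); // 2
console.log(memory.history('incident-43').length); // 0
```

Because the map lives in process memory, a restart empties it, which is the behaviour described in the last bullet above.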
Tool Configuration Reference
createJiraTool
Reads credentials from environment variables by default. Override by passing config directly:
const jiraTool = createJiraTool({
  host: 'https://your-domain.atlassian.net', // or JIRA_HOST
  email: 'bot@company.com',                  // or JIRA_EMAIL
  apiToken: process.env.JIRA_API_TOKEN!,     // or JIRA_API_TOKEN
});
createSlackTool
const slackTool = createSlackTool({
  token: process.env.SLACK_BOT_TOKEN!, // or SLACK_BOT_TOKEN
});
createOpsRuntime Options
interface OpsRuntimeConfig {
  aiService: AIEnhancedService; // LLM service instance
  tools: {
    jira: JiraTool;
    slack: SlackTool;
  };
  model?: string;        // LLM model, default: 'gpt-4'
  systemPrompt?: string; // Override the default ops system prompt
}
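Since systemPrompt is a plain string, team conventions can be assembled into it before the runtime is created. The sketch below is one hypothetical way to do that; `buildOpsSystemPrompt`, its option names, and the prompt wording are assumptions for illustration, and the package's default prompt is not reproduced here.

```typescript
// Hypothetical options describing team defaults.
interface OpsPromptOptions {
  defaultProject: string;  // Jira project key used when the user names none
  incidentChannel: string; // Slack channel used when the user names none
}

// Builds a prompt string suitable for the systemPrompt option.
function buildOpsSystemPrompt(options: OpsPromptOptions): string {
  return [
    'You are an ops assistant for incident response.',
    `Create Jira tickets in project ${options.defaultProject} unless the user names another project.`,
    `Post incident updates to ${options.incidentChannel} unless the user names another channel.`,
    'Summarize every action you took in your final response.',
  ].join('\n');
}

const opsPrompt = buildOpsSystemPrompt({ defaultProject: 'OPS', incidentChannel: '#incidents' });
console.log(opsPrompt);
// Pass opsPrompt as the systemPrompt field of the config given to createOpsRuntime.
```

Overriding the prompt replaces the default one, so any instructions you still want from the default need to be restated in your own text.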
Best Practices
Use descriptive session IDs. A meaningful sessionId like incident-2025-001 makes logs easier to trace than a random UUID. Use one session per incident.
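A tiny helper keeps session IDs in a consistent, sortable shape. The function below is a hypothetical convenience, not part of the package; the year-plus-sequence scheme simply mirrors the incident-2025-001 example.

```typescript
// Hypothetical helper: builds IDs such as "incident-2025-001".
function incidentSessionId(year: number, sequence: number): string {
  if (!Number.isInteger(year) || !Number.isInteger(sequence) || sequence < 0) {
    throw new RangeError('year must be an integer and sequence a non-negative integer');
  }
  // Zero-pad the sequence to three digits so IDs sort correctly in logs.
  return `incident-${year}-${String(sequence).padStart(3, '0')}`;
}

console.log(incidentSessionId(2025, 1)); // incident-2025-001
```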
Wire up approvals before calling runOpsAgent. The agent may immediately request an approval — if no handler is registered, the approval times out.
Test without credentials. When JIRA_HOST or SLACK_BOT_TOKEN is not set, the corresponding tool returns placeholder responses. Use this in unit tests and local development.
Add RAG for runbooks. Pass a @hazeljs/rag retriever to createOpsRuntime so the agent can reference runbooks and post-mortems as part of its response.
Related
- Agent Package — The underlying agent runtime that powers ops-agent
- AI Package — LLM providers, prompt management, and AI-enhanced services
- RAG Package — Retrieval-augmented generation for runbook and document search
For full API reference, see the Ops Agent package on GitHub.