Ops Agent Package


@hazeljs/ops-agent is a production-ready AI-powered DevOps assistant. Describe an incident in plain English and the agent creates Jira tickets, posts Slack notifications, and coordinates response — with optional human approval before any action runs.

What is an Ops Agent?

Traditional incident workflows require engineers to manually:

  1. Triage the alert
  2. Open a browser and create a Jira ticket
  3. Copy/paste context into a Slack channel
  4. Follow up with comments as the situation evolves

The Ops Agent collapses all of that into a single natural-language instruction. It understands context, decides which tools to use, executes them in the right order, and asks for approval before taking potentially disruptive actions.

Why @hazeljs/ops-agent?

| Challenge | Manual Workflow | With ops-agent |
| --- | --- | --- |
| Creating Jira tickets | Copy context to browser, fill form | Single natural language command |
| Slack notifications | Manually compose and post | Agent posts with full incident context |
| Follow-up comments | Back to Jira, find ticket, add comment | "Add comment to the ticket" in the same session |
| Human oversight | No gates; actions taken immediately | requiresApproval: true for destructive ops |
| Incident context | Lost between tools | sessionId keeps memory across turns |
| Development/testing | Needs real Jira/Slack credentials | Mock responses when env vars not set |

Architecture

The runtime wires together the LLM, agent loop, and tool adapters. Natural language input is processed by the AI service, which decides which tools to call. Each tool call may require human approval before executing:

stateDiagram-v2
  [*] --> Idle
  Idle --> Thinking: runOpsAgent(input)
  Thinking --> CallingTool: LLM selects tool
  CallingTool --> WaitingApproval: requiresApproval=true
  WaitingApproval --> CallingTool: approveToolExecution()
  WaitingApproval --> Idle: rejectToolExecution()
  CallingTool --> Thinking: tool result returned
  Thinking --> Completed: LLM produces final response
  Completed --> [*]

  style Thinking fill:#6366f1,color:#fff
  style CallingTool fill:#f59e0b,color:#fff
  style WaitingApproval fill:#ec4899,color:#fff
  style Completed fill:#10b981,color:#fff
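
The lifecycle in the diagram can also be read as a small transition table. The sketch below only illustrates those transitions and is not the package's internal implementation; OpsState, OpsEvent, and nextState are hypothetical names chosen for this example.

```typescript
// Hypothetical model of the state diagram; not part of @hazeljs/ops-agent.
type OpsState = 'Idle' | 'Thinking' | 'CallingTool' | 'WaitingApproval' | 'Completed';

type OpsEvent =
  | 'run'               // runOpsAgent(input)
  | 'toolSelected'      // LLM selects tool
  | 'approvalRequired'  // requiresApproval=true
  | 'approved'          // approveToolExecution()
  | 'rejected'          // rejectToolExecution()
  | 'toolResult'        // tool result returned
  | 'finalResponse';    // LLM produces final response

// One row per state, one entry per edge drawn in the diagram.
const transitions: Record<OpsState, Partial<Record<OpsEvent, OpsState>>> = {
  Idle: { run: 'Thinking' },
  Thinking: { toolSelected: 'CallingTool', finalResponse: 'Completed' },
  CallingTool: { approvalRequired: 'WaitingApproval', toolResult: 'Thinking' },
  WaitingApproval: { approved: 'CallingTool', rejected: 'Idle' },
  Completed: {},
};

function nextState(state: OpsState, event: OpsEvent): OpsState {
  const target = transitions[state][event];
  if (target === undefined) {
    throw new Error(`Invalid transition: ${state} on ${event}`);
  }
  return target;
}

// Replay a run in which a single tool call needs approval.
const replayedEvents: OpsEvent[] = [
  'run', 'toolSelected', 'approvalRequired', 'approved', 'toolResult', 'finalResponse',
];
const finalState = replayedEvents.reduce<OpsState>(nextState, 'Idle');
console.log(finalState); // prints "Completed"
```

Any event that has no edge in the diagram, such as approving while Idle, throws instead of silently moving on.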

Key Components

  • createOpsRuntime: Assembles the AI service, agent runtime, Jira tool, and Slack tool into a ready-to-run runtime
  • runOpsAgent: Runs the agent with a natural language input and session ID; returns the LLM response
  • createJiraTool: Adapter for Jira Cloud API (create, comment, get) with env-var configuration
  • createSlackTool: Adapter for Slack Web API (post to channel or thread) with env-var configuration
  • Human-in-the-loop: create_jira_ticket and post_to_slack require approval before executing
  • Memory: In-memory BufferMemory per session; same sessionId preserves context across turns

Installation

npm install @hazeljs/ops-agent @hazeljs/ai @hazeljs/agent @hazeljs/rag

Configuration

Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| OPENAI_API_KEY | OpenAI API key (or other LLM provider) | Yes |
| JIRA_HOST | Jira host, e.g. https://your-domain.atlassian.net | No* |
| JIRA_EMAIL | Email for Basic auth | No* |
| JIRA_API_TOKEN | API token from Atlassian | No* |
| SLACK_BOT_TOKEN | Slack bot token (xoxb-...) from app OAuth | No* |

*When not configured, tools return placeholder responses so you can develop and test without real credentials.
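
A startup check can make it obvious whether the agent will talk to real services or fall back to placeholder responses. The helper below is a sketch, not part of the package: detectIntegrations is a hypothetical name, and it assumes the Jira tool needs all three JIRA_* variables before it goes live, which the table suggests but does not state outright.

```typescript
// Hypothetical startup check; not an API of @hazeljs/ops-agent.
type Env = Record<string, string | undefined>;

interface IntegrationStatus {
  jira: 'live' | 'placeholder';
  slack: 'live' | 'placeholder';
  missing: string[]; // variables that are unset or blank
}

const JIRA_VARS = ['JIRA_HOST', 'JIRA_EMAIL', 'JIRA_API_TOKEN'];
const SLACK_VARS = ['SLACK_BOT_TOKEN'];

function detectIntegrations(env: Env): IntegrationStatus {
  const isMissing = (variable: string): boolean => {
    const value = env[variable];
    return value === undefined || value.trim() === '';
  };
  const missingJira = JIRA_VARS.filter(isMissing);
  const missingSlack = SLACK_VARS.filter(isMissing);
  return {
    // Assumption: Jira counts as configured only when all three variables are present.
    jira: missingJira.length === 0 ? 'live' : 'placeholder',
    slack: missingSlack.length === 0 ? 'live' : 'placeholder',
    missing: [...missingJira, ...missingSlack],
  };
}

// An application would pass process.env; a literal keeps the sketch self-contained.
const integrationStatus = detectIntegrations({
  JIRA_HOST: 'https://your-domain.atlassian.net',
  JIRA_EMAIL: 'bot@company.com',
  SLACK_BOT_TOKEN: 'xoxb-example',
});
console.log(integrationStatus);
// jira is 'placeholder' (JIRA_API_TOKEN is missing), slack is 'live'
```

Logging the result once at boot avoids the situation where a production deployment quietly runs against placeholder responses.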

Quick Start

import { AIEnhancedService } from '@hazeljs/ai';
import { createOpsRuntime, runOpsAgent, createJiraTool, createSlackTool } from '@hazeljs/ops-agent';

const runtime = createOpsRuntime({
  aiService: new AIEnhancedService(),
  tools: {
    jira: createJiraTool(),   // reads JIRA_HOST, JIRA_EMAIL, JIRA_API_TOKEN
    slack: createSlackTool(), // reads SLACK_BOT_TOKEN
  },
  model: 'gpt-4',
});

const result = await runOpsAgent(runtime, {
  input: 'Payment API is returning 504s in prod. Create a P1 Jira ticket in OPS and post to #incidents.',
  sessionId: 'incident-2025-001',
});

console.log(result.response);
// => "I've created OPS-1234 'Payment API 504 errors in prod' with priority P1
//    and posted a summary to #incidents."

Available Tools

| Tool | Description | Requires Approval |
| --- | --- | --- |
| create_jira_ticket | Create issue with project, summary, description, and issue type | Yes |
| add_jira_comment | Add a comment to an existing Jira issue by key | No |
| get_jira_ticket | Fetch issue details including status, priority, and assignee | No |
| post_to_slack | Post message to a channel; optionally in a thread | Yes |

Tools marked Requires Approval pause execution and emit a tool.approval.requested event. The agent resumes the tool call only after approveToolExecution() is called; if rejectToolExecution() is called instead, the tool call fails and is not executed.

Human-in-the-Loop

For any tool with requiresApproval: true, the agent pauses and waits for an explicit approval signal. Subscribe to the event and implement your approval logic (Slack DM, web UI, email — anything):

runtime.on('tool.approval.requested', async (event) => {
  const { requestId, toolName, parameters } = event.data;

  console.log(`Approval requested: ${toolName}`);
  console.log('Parameters:', JSON.stringify(parameters, null, 2));

  // In production: send DM to on-call engineer, wait for response
  const approved = await askOnCallEngineer(toolName, parameters);

  if (approved) {
    runtime.approveToolExecution(requestId, 'on-call-engineer');
  } else {
    runtime.rejectToolExecution(requestId);
  }
});
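
The askOnCallEngineer function above is left to the application. One way to build it is a small broker that parks each request as a pending promise and settles it when a human responds or a deadline passes. The sketch below assumes that treating a timeout as a rejection is acceptable for your workflow; createApprovalBroker, waitForDecision, decide, and pendingCount are hypothetical names, not package APIs.

```typescript
// Hypothetical building block for askOnCallEngineer; not part of @hazeljs/ops-agent.
interface ApprovalBroker {
  // Resolves with the human decision, or with false once the timeout elapses.
  waitForDecision(requestId: string): Promise<boolean>;
  // Records a decision; returns false if the request is unknown or already settled.
  decide(requestId: string, approved: boolean): boolean;
  pendingCount(): number;
}

function createApprovalBroker(timeoutMs: number): ApprovalBroker {
  const pending = new Map<string, (approved: boolean) => void>();

  return {
    waitForDecision(requestId: string): Promise<boolean> {
      return new Promise<boolean>((resolve) => {
        const timer = setTimeout(() => {
          // Fail closed: no answer in time counts as a rejection.
          if (pending.delete(requestId)) resolve(false);
        }, timeoutMs);
        pending.set(requestId, (approved: boolean) => {
          clearTimeout(timer);
          resolve(approved);
        });
      });
    },
    decide(requestId: string, approved: boolean): boolean {
      const settle = pending.get(requestId);
      if (!settle) return false;
      pending.delete(requestId);
      settle(approved);
      return true;
    },
    pendingCount: (): number => pending.size,
  };
}

// Usage: whatever receives the engineer's answer (webhook, chat handler) calls decide().
const broker = createApprovalBroker(50);
broker.waitForDecision('req-1').then((approved) => console.log('req-1 approved:', approved));
broker.decide('req-1', true);
```

Inside the event handler, askOnCallEngineer could notify the engineer and then return broker.waitForDecision(requestId), so a missing answer becomes an explicit rejection instead of a run that hangs.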

Complete Example: Incident Response Bot

A complete Express HTTP API that accepts incident descriptions, handles follow-up requests, and routes approvals through a Slack channel:

import express from 'express';
import { AIEnhancedService } from '@hazeljs/ai';
import { createOpsRuntime, runOpsAgent, createJiraTool, createSlackTool } from '@hazeljs/ops-agent';

// ─── Runtime (create once, reuse) ────────────────────────────────────────────

const runtime = createOpsRuntime({
  aiService: new AIEnhancedService(),
  tools: {
    jira: createJiraTool(),
    slack: createSlackTool(),
  },
  model: 'gpt-4',
});

// ─── Route approval requests to #ops-approvals ───────────────────────────────

runtime.on('tool.approval.requested', async (event) => {
  const { requestId, toolName, parameters } = event.data;

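  // Only create_jira_ticket and post_to_slack reach this handler; the other tools run without approval.
  // sendApprovalRequest is an application-defined helper (not shown) that posts to #ops-approvals
  // and resolves to true or false based on the reviewer's reaction.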
  const approved = await sendApprovalRequest({
    channel: '#ops-approvals',
    text: `*Approval needed:* \`${toolName}\`\n\`\`\`${JSON.stringify(parameters, null, 2)}\`\`\``,
    requestId,
  });

  if (approved) {
    runtime.approveToolExecution(requestId, 'ops-team');
  } else {
    runtime.rejectToolExecution(requestId);
  }
});

// ─── Express API ─────────────────────────────────────────────────────────────

const app = express();
app.use(express.json());

// POST /incident  { input, sessionId? }
app.post('/incident', async (req, res) => {
  const { input, sessionId = `incident-${Date.now()}` } = req.body;

  if (!input) {
    return res.status(400).json({ error: 'input is required' });
  }

  try {
    const result = await runOpsAgent(runtime, { input, sessionId });
    res.json({
      response: result.response,
      sessionId, // return so client can continue the conversation
    });
  } catch (err) {
    console.error('Ops agent error:', err);
    res.status(500).json({ error: 'Agent failed to process the request' });
  }
});

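// POST /incident/followup  { input, sessionId }  (continue an existing incident)
app.post('/incident/followup', async (req, res) => {
  const { input, sessionId } = req.body;

  if (!input) return res.status(400).json({ error: 'input is required' });
  if (!sessionId) return res.status(400).json({ error: 'sessionId is required for followup' });

  try {
    // Same sessionId = agent remembers the previous context
    const result = await runOpsAgent(runtime, { input, sessionId });
    res.json({ response: result.response });
  } catch (err) {
    // Without this catch, a rejected promise would leave the request hanging
    console.error('Ops agent error:', err);
    res.status(500).json({ error: 'Agent failed to process the request' });
  }
});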

app.listen(4000, () => console.log('Ops agent API on :4000'));

Example interactions in the same session

POST /incident
{ "input": "Create a P1 Jira in OPS for 'Redis OOM in prod'", "sessionId": "incident-42" }

POST /incident/followup
{ "input": "Post the ticket link to #sre-incidents", "sessionId": "incident-42" }

POST /incident/followup
{ "input": "Add a comment that we're scaling the Redis cluster", "sessionId": "incident-42" }

Each follow-up uses the same sessionId — the agent knows which ticket was created and which Slack channel was used.

Multi-Turn Memory

createOpsRuntime configures the agent with BufferMemory by default. Every turn for a given sessionId is stored and included in subsequent prompts. This means:

  • "Post the ticket link to #incidents" works after "Create the ticket" — the agent remembers the ticket key
  • "Add a comment that the fix is deployed" works without specifying the ticket key again
  • Context is kept in-memory — restart the process and memory resets (plug in a persistent memory backend for cross-restart memory)
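
To make the memory model concrete, the sketch below shows the general idea of a per-session buffer: each sessionId maps to an ordered list of turns that is replayed into the next prompt. It illustrates the concept only and is not the BufferMemory implementation; SessionBuffer, its methods, and the maxTurns cap are all hypothetical.

```typescript
// Conceptual sketch of per-session buffer memory; not the real BufferMemory class.
interface Turn {
  role: 'user' | 'assistant';
  content: string;
}

class SessionBuffer {
  private sessions = new Map<string, Turn[]>();
  private maxTurns: number;

  // maxTurns is a hypothetical cap that keeps prompts from growing without bound.
  constructor(maxTurns: number = 20) {
    this.maxTurns = maxTurns;
  }

  append(sessionId: string, turn: Turn): void {
    const turns = this.sessions.get(sessionId) ?? [];
    turns.push(turn);
    // Keep only the most recent maxTurns entries.
    this.sessions.set(sessionId, turns.slice(-this.maxTurns));
  }

  // Turns to include in the next prompt for this session (returned as a copy).
  context(sessionId: string): Turn[] {
    return [...(this.sessions.get(sessionId) ?? [])];
  }
}

const memory = new SessionBuffer();
memory.append('incident-42', { role: 'user', content: "Create a P1 Jira in OPS for 'Redis OOM in prod'" });
memory.append('incident-42', { role: 'assistant', content: 'Created OPS-1234.' });
memory.append('incident-42', { role: 'user', content: 'Post the ticket link to #sre-incidents' });

// The follow-up prompt carries the earlier turns, so the ticket key is available.
console.log(memory.context('incident-42').length); // 3
// A different sessionId starts with no shared context.
console.log(memory.context('incident-99').length); // 0
```

Because the map lives in process memory, everything in it disappears on restart, which is the limitation the last bullet describes.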

Tool Configuration Reference

createJiraTool

Reads credentials from environment variables by default. Override by passing config directly:

const jiraTool = createJiraTool({
  host: 'https://your-domain.atlassian.net',  // or JIRA_HOST
  email: 'bot@company.com',                   // or JIRA_EMAIL
  apiToken: process.env.JIRA_API_TOKEN!,      // or JIRA_API_TOKEN
});

createSlackTool

const slackTool = createSlackTool({
  token: process.env.SLACK_BOT_TOKEN!, // or SLACK_BOT_TOKEN
});
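
For orientation, posting "to a channel or thread" corresponds to the channel, text, and thread_ts fields of Slack's chat.postMessage Web API method. The sketch below builds such a request body; it assumes the Slack tool sends something similar, which the package documentation does not confirm, and buildSlackMessage is a hypothetical helper.

```typescript
// Shape of the JSON body accepted by Slack's chat.postMessage method (subset of fields).
interface SlackMessage {
  channel: string;     // channel ID or name, e.g. '#incidents'
  text: string;
  thread_ts?: string;  // timestamp of the parent message; set it to reply in a thread
}

// buildSlackMessage is a hypothetical helper, not an export of @hazeljs/ops-agent.
function buildSlackMessage(channel: string, text: string, threadTs?: string): SlackMessage {
  const message: SlackMessage = { channel, text };
  if (threadTs) message.thread_ts = threadTs;
  return message;
}

const channelPost = buildSlackMessage('#incidents', 'Payment API is returning 504s in prod');
const threadReply = buildSlackMessage('#incidents', 'Jira ticket created', '1700000000.000100');

console.log(JSON.stringify(channelPost));
console.log(JSON.stringify(threadReply));
```

The thread timestamp used here is a made-up value; in practice it comes from the response to the original channel post.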

createOpsRuntime Options

interface OpsRuntimeConfig {
  aiService: AIEnhancedService;     // LLM service instance
  tools: {
    jira: JiraTool;
    slack: SlackTool;
  };
  model?: string;                   // LLM model, default: 'gpt-4'
  systemPrompt?: string;            // Override the default ops system prompt
}

Best Practices

Use descriptive session IDs. A meaningful sessionId like incident-2025-001 makes logs easier to trace than a random UUID. Use one session per incident.
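
A tiny helper keeps the naming scheme consistent across services. This is a sketch under the assumption that incidents are numbered per year; incidentSessionId is a hypothetical function, not part of the package.

```typescript
// Hypothetical helper for building session IDs in the incident-YYYY-NNN style.
function incidentSessionId(year: number, sequence: number): string {
  // Zero-pad the sequence so IDs sort correctly in logs.
  return `incident-${year}-${String(sequence).padStart(3, '0')}`;
}

console.log(incidentSessionId(2025, 1)); // "incident-2025-001"
```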

Wire up approvals before calling runOpsAgent. The agent may immediately request an approval — if no handler is registered, the approval times out.

Test without credentials. When the Jira or Slack credentials are not set, the corresponding tools return placeholder responses. Use this in unit tests and local development.

Add RAG for runbooks. Pass a @hazeljs/rag retriever to createOpsRuntime so the agent can reference runbooks and post-mortems as part of its response.

Related

  • Agent Package — The underlying agent runtime that powers ops-agent
  • AI Package — LLM providers, prompt management, and AI-enhanced services
  • RAG Package — Retrieval-augmented generation for runbook and document search

For the full API reference, see the Ops Agent package on GitHub.