The Observability system automatically collects execution traces for analysis and debugging, and provides a track() API for emitting custom business events that power real-time dashboards.

Automatic Tracing (Agent)

// Traces are collected automatically
const agent = new Agent({
  name: 'Support Agent',
  instructions: 'Help customers.',
  model: openai('gpt-4o'),
});

// Each execution automatically generates traces
await agent.process({
  message: 'Help me',
  companyId: 'company_123',   // Optional
  sessionId: 'session_456',   // Optional
  executionId: 'exec_123',    // Optional
  threadId: 'thread_789',     // Optional
});

Automatic Tracing (Workflow)

Workflows automatically trace every step with full hierarchy:
const workflow = flow({
  id: 'validate-order',
  name: 'Order Validation',
  inputSchema: z.object({ orderId: z.string() }),
  outputSchema: z.any(),
})
  .step('fetch', async (input) => db.getOrder(input.orderId))
  .step('validate', async (input) => ({ valid: input.status === 'active' }))
  .agent('respond', responseAgent)
  .build();

await workflow.execute({ orderId: 'ORD-123' });
// Generates trace hierarchy:
//   workflow_execution > "Order Validation"
//     ├── workflow_step > "fetch" [function]
//     ├── workflow_step > "validate" [function]
//     └── workflow_step > "respond" [agent]
//         └── agent_execution
//             └── llm_call
Each step type (function, agent, connector, condition, switch, foreach, parallel) is traced with its own color and label in the portal.

Verbose Tracing Mode

Control how much data is saved in traces. Works identically for Agent and Workflow. Modes:
  • full: Complete data including prompts and responses (default)
  • standard: Balanced metadata with truncation
  • minimal: Disables tracing entirely (no traces sent)
Simple API (string preset):
const agent = new Agent({
  name: 'My Agent',
  model: openai('gpt-4o'),
  observability: 'minimal'  // Disable traces completely
});

const workflow = flow({
  id: 'my-workflow',
  inputSchema, outputSchema,
  observability: 'standard'  // Truncate large inputs/outputs
}).step('a', handler).build();
Granular Control (object config):
const agent = new Agent({
  name: 'My Agent',
  model: openai('gpt-4o'),
  observability: {
    mode: 'standard',           // Base mode
    verboseLLM: true,            // Override: save complete prompts
    verboseMemory: false,        // Override: keep memory minimal
    verboseTools: true,          // Override: save tool data (default)
    maxInputLength: 5000,        // Truncate large inputs
    maxOutputLength: 5000,       // Truncate large outputs
  }
});

Trace Interceptor (onTrace)

Intercept, modify, or cancel traces before they are sent. Available in Agent, Workflow, and standalone logging.
// Agent
const agent = new Agent({
  name: 'My Agent',
  model: openai('gpt-4o'),
  observability: {
    onTrace: (trace) => {
      // Remove sensitive data
      if (trace.input?.cpf) delete trace.input.cpf;
      if (trace.input?.password) delete trace.input.password;
      return trace;
    }
  }
});

// Workflow
const workflow = flow({
  id: 'pipeline',
  inputSchema, outputSchema,
  observability: {
    onTrace: (trace) => {
      // Cancel LLM traces (only keep step-level)
      if (trace.type === 'llm_call') return null;
      return trace;
    }
  }
}).step('a', handler).build();

// Standalone logging
import { configureLogging } from '@runflow-ai/sdk';

configureLogging({
  onTrace: (trace) => {
    // Send to external system
    datadog.sendTrace(trace);
    return trace;
  }
});
Return values:
  • Return the trace (modified or not) to send it
  • Return null to cancel (trace is not sent)
  • Return nothing (void / undefined) to send the trace unchanged
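The three return values can be illustrated with a small, SDK-free sketch. The `Trace` shape, the `OnTrace` type, and the `applyInterceptor` helper below are assumptions made for illustration only; the SDK's real types may carry more fields, and its internal dispatch is not documented here.

```typescript
// Assumed trace shape for illustration; the SDK's real type may differ.
interface Trace {
  type: string;
  input?: Record<string, unknown>;
  output?: unknown;
}

// "void" is modeled as undefined so the result can be inspected.
type OnTrace = (trace: Trace) => Trace | null | undefined;

const SENSITIVE_KEYS = ['cpf', 'password'];

const redactAndFilter: OnTrace = (trace) => {
  if (trace.type === 'llm_call') return null; // cancel: trace is not sent
  if (!trace.input) return undefined;         // nothing returned: sent unchanged

  // Work on a copy so the caller's object is left untouched.
  const input: Record<string, unknown> = { ...trace.input };
  for (const key of SENSITIVE_KEYS) delete input[key];
  return { ...trace, input };                 // modified trace is sent
};

// Hypothetical helper that mirrors the documented rules:
// null cancels, undefined passes the original through, a trace replaces it.
function applyInterceptor(trace: Trace, onTrace: OnTrace): Trace | null {
  const result = onTrace(trace);
  if (result === null) return null;
  if (result === undefined) return trace;
  return result;
}
```

Returning a copy rather than mutating the incoming trace is a design choice in this sketch; the examples above mutate in place, which the documentation also shows as valid.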

Trace Hierarchy (startSpan)

Create parent-child relationships between custom logs for structured traces:
import { startSpan, log } from '@runflow-ai/sdk';

async function processBatch(items: any[]) {
  // Create a parent span
  const batch = startSpan('process-batch');

  for (const item of items) {
    // Child logs grouped under the parent
    log('process-item', {
      input: { id: item.id },
      output: { status: 'ok' }
    }, { parentId: batch.traceId });
  }

  // Close the parent span
  batch.end({ output: { total: items.length } });
}
This produces a hierarchical trace in the portal:
custom_event > "process-batch"
  ├── custom_event > "process-item"
  ├── custom_event > "process-item"
  └── custom_event > "process-item"
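
To make the parent-child relationship concrete, the sketch below rebuilds such a tree from a flat list of traces by following `parentId` links. The `FlatTrace` shape and the `buildTraceTree` function are hypothetical and only illustrate the grouping idea; they are not part of the SDK.

```typescript
// Assumed minimal shape: each trace has its own traceId and an optional parentId.
interface FlatTrace {
  traceId: string;
  name: string;
  parentId?: string;
}

interface TraceNode extends FlatTrace {
  children: TraceNode[];
}

// Groups a flat list into a tree by following parentId links.
function buildTraceTree(traces: FlatTrace[]): TraceNode[] {
  const byId = new Map<string, TraceNode>();
  const ordered: TraceNode[] = traces.map((t) => {
    const node: TraceNode = { ...t, children: [] };
    byId.set(t.traceId, node);
    return node;
  });

  const roots: TraceNode[] = [];
  for (const node of ordered) {
    const parent = node.parentId !== undefined ? byId.get(node.parentId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // no parent, or parent not in the list: treat as root
  }
  return roots;
}
```

Given one `process-batch` trace and three `process-item` traces whose `parentId` is the batch's `traceId`, this returns a single root with three children, matching the tree shown above.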

Custom Executions (Non-Agent Flows)

For scenarios without agent.process() (document analysis, batch processing, etc.):
import { identify, startExecution } from '@runflow-ai/sdk/observability';
import { LLM } from '@runflow-ai/sdk'; // assumed export path for LLM; adjust to your SDK version

export async function analyzeDocument(docId: string) {
  // 1. Identify context
  identify({ type: 'document', value: docId });

  // 2. Start custom execution
  const exec = startExecution({
    name: 'document-analysis',
    input: { documentId: docId }
  });

  try {
    // 3. Process with LLM calls
    const llm = LLM.openai('gpt-4o');

    const text = await llm.chat("Extract text from document...");
    exec.log('text_extracted', { length: text.length });

    const category = await llm.chat(`Classify this: ${text}`);
    exec.log('document_classified', { category });
    // exec.log() automatically parents to the execution span

    const summary = await llm.chat(`Summarize: ${text}`);

    // 4. Finish with custom output
    await exec.end({
      output: {
        summary,
        category,
        documentId: docId
      }
    });

    return { summary, category };

  } catch (error) {
    exec.setError(error);
    await exec.end();
    throw error;
  }
}

Custom Logging

Log custom events within any execution:
import { log, logEvent, logError } from '@runflow-ai/sdk/observability';

// Simple log
log('cache_hit', { key: 'user_123' });

// Structured log attached to a parent span (parentSpan comes from startSpan())
log('step_completed', {
  input: { orderId: '123' },
  output: { valid: true },
}, { parentId: parentSpan.traceId });

// Structured event with metadata
logEvent('validation', {
  input: { orderId: '123', amount: 100 },
  output: { valid: true, score: 0.95 },
  metadata: { rule: 'fraud_detection' }
});

// Error log
try {
  await riskyOperation();
} catch (error) {
  logError('operation_failed', error);
  throw error;
}

Conversation Messages

Available since @runflow-ai/sdk@1.1.10.
Use message() to record a turn of a conversation. Each call emits a conversation_message trace that the Runflow portal renders as a chat bubble: user inbound on the left, assistant outbound on the right. The portal switches automatically to chat view when an execution has at least one conversation_message trace — no flag, no channel hint. The thread sidebar preview also updates to show the latest user/assistant text instead of raw envelope JSON.
import { message } from '@runflow-ai/sdk/observability';

message({ role: 'user',      content: 'What accommodations do you have?' });
message({ role: 'assistant', content: 'We offer two options...' });

When to use

  • Custom workflows (WhatsApp handlers, webhook routers) where you control the message flow without agent.process().
  • LLM agents when you want to also expose the conversation as chat (wrap agent.process() calls).
  • Anywhere you want the execution to render as a conversation in the portal.
If you never call message(), nothing changes — your existing traces and rendering keep working exactly as before.

Wrapping an agent call (LLM)

import { Agent, message, openai } from '@runflow-ai/sdk';

const agent = new Agent({ name: 'concierge', model: openai('gpt-4o'), instructions });

export async function main(input: AgentInput) {
  message({ role: 'user', content: input.message });

  const reply = await agent.process({ message: input.message, sessionId: input.sessionId });

  message({ role: 'assistant', content: reply.message });
  return reply;
}

Custom workflow (no LLM)

import { message, startSpan, log } from '@runflow-ai/sdk/observability';

export async function handleWebhook(input: AgentInput) {
  const turn = startSpan('turn');

  try {
    message({ role: 'user', content: input.message });          // "48337725826"

    const verify = startSpan('verify_cpf');
    const result = await api.searchUser(input.message);
    log('cpf_verified', { output: result });
    verify.end({});

    const reply = result.user_exists
      ? `Welcome back, ${result.name}!`
      : 'CPF not found. We will register you now. What is your name?';

    message({ role: 'assistant', content: reply });
    return { message: reply };
  } finally {
    turn.end({});
  }
}
The startSpan call is optional but recommended — it groups the technical traces under the turn so the drill-down drawer in the portal stays organized.

Multiple assistant messages per turn

A turn can emit any number of assistant messages — they render as consecutive bubbles in chronological order, exactly like WhatsApp:
message({ role: 'user', content: 'Hi' });

message({ role: 'assistant', content: 'Hi! How are you?' });
message({ role: 'assistant', content: 'How can I help you?' });

Structured content (buttons, audio, image)

content accepts a string OR an object with a type field. The portal renders text natively and falls back to a JSON view for structured content (buttons / media renderers are on the roadmap):
message({
  role: 'assistant',
  content: {
    type: 'buttons',
    text: 'What would you like?',
    items: [
      { id: 'support', label: 'Support' },
      { id: 'sales',   label: 'Sales' },
    ],
  },
});
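
The string-or-object rule can be expressed as a small union type. The type names and the `previewContent` function below are hypothetical; they only mirror the behavior described above (text shown as-is, structured content shown as JSON) and are not the portal's actual renderer.

```typescript
// Assumed shapes: the text above only requires a `type` field on objects.
type StructuredContent = { type: string; [key: string]: unknown };
type MessageContent = string | StructuredContent;

// Text is returned as-is; structured content falls back to formatted JSON.
function previewContent(content: MessageContent): string {
  return typeof content === 'string' ? content : JSON.stringify(content, null, 2);
}
```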

Hierarchy and grouping

Messages follow the same parenting rules as log() and startSpan():
| You write | Where the message goes |
| --- | --- |
| message({...}) inside a startSpan() block | Child of the active span |
| message({..., parentId: span.traceId }) | Child of that span explicitly |
| message({...}) with no active span | Root (no grouping) |
In the portal, clicking any bubble of a turn opens the drill-down with all traces of that execution — the hierarchy you create only affects how the trace tree looks in the drawer.
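
The parenting rules can be read as a simple fallback chain. The sketch below is an assumption about the resolution order, not the SDK's implementation: in particular, it assumes `options.parentId` wins over `data.parentId` when both are set, which this page does not state explicitly. `resolveParentId` and the interfaces are hypothetical names.

```typescript
interface MessageData {
  role: string;
  content: string | { type: string; [key: string]: unknown };
  parentId?: string;
}

interface MessageOptions {
  parentId?: string;
}

// Assumed order: explicit options, then explicit data, then the active span.
// Returning undefined means the message is a root trace (no grouping).
function resolveParentId(
  data: MessageData,
  options: MessageOptions = {},
  activeSpanId?: string,
): string | undefined {
  if (options.parentId !== undefined) return options.parentId;
  if (data.parentId !== undefined) return data.parentId;
  return activeSpanId;
}
```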

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data.role | string | Yes | 'user', 'assistant', 'system', 'tool', or any custom string |
| data.content | string \| object | Yes | Text (renders natively) or structured object with a type field |
| data.metadata | Record<string, any> | No | Extra fields (citations, confidence, custom flags) |
| data.parentId | string | No | Explicit parent span’s traceId |
| options.parentId | string | No | Same as data.parentId (positional override) |

Business Event Tracking

Use track() to emit custom business events from your agent. These events power the Metrics dashboard in the portal, where you can build KPI cards, charts, and real-time feeds without writing any backend code.
import { track } from '@runflow-ai/sdk/observability';

// Inside your agent's tool or logic
track('alert_received', {
  company: 'NW Telecom',
  severity: 'High',
  source: 'Zabbix',
});

track('ticket_resolved', {
  duration: 45,
  answered: true,
  category: 'network',
});
Events are buffered and sent in batches automatically (up to 50 events or every 2 seconds). No manual flushing needed during normal execution.

How It Works

  1. Call track(eventName, properties) anywhere in your agent code
  2. The SDK buffers events and sends them in batches to the Runflow API
  3. Open the Metrics tab in the portal to create dashboard cards
  4. Cards auto-discover your event names and properties — no configuration needed
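
The buffering in step 2 can be pictured with the sketch below. It is not the SDK's implementation: `TrackBuffer`, its `send` callback, and the `tick` method are hypothetical, and time is passed in explicitly so the behavior is easy to follow. The limits mirror the documented values of 50 events or 2 seconds.

```typescript
interface TrackEvent {
  name: string;
  properties: Record<string, unknown>;
  timestamp: number;
}

// Illustrative buffer: flushes when it holds maxBatch events, or when tick()
// is called and maxAgeMs has elapsed since the last flush.
class TrackBuffer {
  private events: TrackEvent[] = [];
  private lastFlush: number;
  private send: (batch: TrackEvent[]) => void;
  private maxBatch: number;
  private maxAgeMs: number;

  constructor(send: (batch: TrackEvent[]) => void, maxBatch = 50, maxAgeMs = 2000, now = Date.now()) {
    this.send = send;
    this.maxBatch = maxBatch;
    this.maxAgeMs = maxAgeMs;
    this.lastFlush = now;
  }

  track(name: string, properties: Record<string, unknown> = {}, now = Date.now()): void {
    this.events.push({ name, properties, timestamp: now });
    if (this.events.length >= this.maxBatch) this.flush(now);
  }

  // Meant to be called periodically, for example from a timer.
  tick(now = Date.now()): void {
    if (this.events.length > 0 && now - this.lastFlush >= this.maxAgeMs) this.flush(now);
  }

  flush(now = Date.now()): void {
    this.lastFlush = now;
    if (this.events.length === 0) return;
    const batch = this.events;
    this.events = [];
    this.send(batch);
  }
}
```

A short-lived script that exits before the next `tick` would lose whatever is still buffered, which is why an explicit flush before exit matters for scripts and CLI tools.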

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| eventName | string | Yes | Name of the event (e.g. 'alert_received', 'order_placed') |
| properties | Record<string, any> | No | Key-value pairs with event data |
| options | TrackOptions | No | Override threadId, executionId, or timestamp |

Options

track('payment_processed', { amount: 150.00 }, {
  threadId: 'custom-thread-123',     // Override auto-resolved thread
  executionId: 'custom-exec-456',    // Override auto-resolved execution
  timestamp: '2026-01-15T10:30:00Z', // Custom timestamp
});

Flushing Before Exit

For short-lived scripts or CLI tools, call flushTrackEvents() before exiting to ensure all events are sent:
import { track, flushTrackEvents } from '@runflow-ai/sdk/observability';

track('batch_completed', { total: 500, errors: 2 });

// Ensure events are sent before process exits
await flushTrackEvents();

Dashboard Cards

In the portal, navigate to your agent’s Metrics tab to create cards:
  • Number — KPI with a single aggregated value (count, sum, avg)
  • Rate — Percentage based on a filtered property value
  • Line / Bar — Time-series charts grouped by hour, day, week, or month
  • Pie — Distribution chart over time periods
Cards support drag-to-resize, custom colors, and multiple aggregation types:
| Aggregation | Description | Example |
| --- | --- | --- |
| count | Total events | Total alerts received |
| rate | Percentage where property matches a value | % of alerts answered |
| sum | Sum of a numeric property | Total revenue |
| avg | Average of a numeric property | Average response time |
| distinct_count | Count of unique property values | Unique companies served |
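
As a rough mental model, the aggregations correspond to the in-memory calculations sketched below. This is an illustration under assumptions, not the portal's query logic: for example, it assumes `rate` divides by all events of that name and that non-numeric values are skipped by `sum` and `avg`. All function names are hypothetical.

```typescript
interface StoredEvent {
  name: string;
  properties: Record<string, unknown>;
}

// Collects the numeric values of one property, skipping missing or non-numeric ones.
function numericValues(events: StoredEvent[], key: string): number[] {
  const values: number[] = [];
  for (const e of events) {
    const v = e.properties[key];
    if (typeof v === 'number') values.push(v);
  }
  return values;
}

function count(events: StoredEvent[]): number {
  return events.length;
}

function sum(events: StoredEvent[], key: string): number {
  return numericValues(events, key).reduce((a, b) => a + b, 0);
}

function avg(events: StoredEvent[], key: string): number {
  const values = numericValues(events, key);
  return values.length === 0 ? 0 : sum(events, key) / values.length;
}

// Percentage of events whose property equals the given value.
function rate(events: StoredEvent[], key: string, value: unknown): number {
  if (events.length === 0) return 0;
  const matching = events.filter((e) => e.properties[key] === value).length;
  return (matching / events.length) * 100;
}

function distinctCount(events: StoredEvent[], key: string): number {
  const seen = new Set<unknown>();
  for (const e of events) {
    const v = e.properties[key];
    if (v !== undefined) seen.add(v);
  }
  return seen.size;
}
```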

Best Practices

Use snake_case names that describe what happened: alert_received, ticket_resolved, payment_processed. Avoid generic names like event or action.
Properties are stored as JSON and queried via keys. Flat key-value pairs work best for dashboard aggregations:
// Good
track('order_placed', { amount: 99.90, category: 'electronics', customer_id: 'c_123' });

// Avoid nested objects
track('order_placed', { order: { amount: 99.90, details: { category: 'electronics' } } });
Keep the same property as the same type across events. If duration is a number in one event, don’t send it as a string in another — aggregations like sum and avg rely on numeric values.
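
If your data is already nested, one option is to flatten it before calling track(). The helper below is a hypothetical sketch, not an SDK function; the underscore-joined key naming is an assumption chosen to stay consistent with snake_case property names.

```typescript
type FlatValue = string | number | boolean | null;
type FlatProperties = Record<string, FlatValue>;

// Flattens nested objects into underscore-joined keys.
// Arrays and other non-primitive values are kept as JSON strings.
function flattenProperties(
  value: Record<string, unknown>,
  prefix = '',
  out: FlatProperties = {},
): FlatProperties {
  for (const key of Object.keys(value)) {
    const v = value[key];
    const path = prefix ? `${prefix}_${key}` : key;
    if (v === undefined) continue;
    if (v === null || typeof v === 'string' || typeof v === 'number' || typeof v === 'boolean') {
      out[path] = v;
    } else if (typeof v === 'object' && !Array.isArray(v)) {
      flattenProperties(v as Record<string, unknown>, path, out);
    } else {
      out[path] = JSON.stringify(v);
    }
  }
  return out;
}
```

For the nested example above, this yields the keys `order_amount` and `order_details_category`, each holding a primitive value that dashboard aggregations can work with.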

Execution Reviews

Available since @runflow-ai/sdk@1.1.13. Requires an API client that exposes the reviews namespace.
Reviews exposes the execution-review feedback loop programmatically — the same surface used by the portal QA queue and the MCP tools (create_execution_review, list_execution_reviews, …). Use it from LLM-judge agents, KB curators, or scheduled jobs to flag bad executions, triage them, and feed corrected outputs back into training datasets.
import { Reviews } from '@runflow-ai/sdk';

const reviews = new Reviews();

// Flag a bad execution
const { reviewId } = await reviews.create({
  executionId: 'exec-uuid',
  agentId: 'agent-uuid',
  rating: 'bad',
  comment: 'Bot gave the wrong business hours',
  priority: 'high',
  tags: ['hours_wrong'],
});

// Resolve it with a corrected answer (auto-stamps resolvedBy / resolvedAt)
await reviews.update(reviewId, {
  status: 'resolved',
  actionTaken: 'knowledge_base_updated',
  correctedOutput: 'We are open 9–18, Monday to Friday.',
});
Each execution can have only one review — call checkHasReview first if you don’t want a 409 surfaced to the caller. Authentication uses your RUNFLOW_API_KEY; reviewedBy is auto-populated from the API key label on the backend.

Methods

| Method | Description |
| --- | --- |
| create(args) | Create a review (executionId + agentId + rating + comment ≥ 10 chars). |
| checkHasReview(executionId) | Idempotent check before create(). |
| list(filters?) | List reviews; filter by agentId, status, rating, priority, dateFrom/dateTo, search (PT full-text). |
| get(reviewId) | Fetch a single review. |
| update(reviewId, args) | Update status, actionTaken, resolutionNotes, correctedOutput, etc. |
| delete(reviewId) | Hard delete. |
| stats({ agentId }) | Aggregated counts and avgResolutionHours. |
| exportForTraining({ agentId, status?, rating? }) | Export resolved reviews as OpenAI-conversational fine-tuning examples. |

Feedback loop pattern

import * as fs from 'node:fs';

// Nightly job: pull resolved reviews into a training set
const dataset = await reviews.exportForTraining({
  agentId,
  status: 'resolved',
});

// Each `dataset.training_examples[]` is an OpenAI conversational
// example with `messages: [{ role, content }, …]` and metadata
// (review_id, execution_id, rating, was_corrected).
fs.writeFileSync('training.jsonl', dataset.training_examples
  .map((ex) => JSON.stringify(ex))
  .join('\n'));

Errors

Reviews throws typed errors so callers can branch cleanly:
  • ReviewAlreadyExistsError (HTTP 409) — the execution already has a review.
  • ReviewNotFoundError (HTTP 404) — the reviewId doesn’t exist or belongs to another tenant.
  • ReviewsError (any other status) — generic error with status + body.
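
Branching on these errors might look like the sketch below. The three classes are local stand-ins so the snippet runs without the SDK; in real code you would use the classes the SDK throws. Their constructors, the `status` field, and the assumption that the specific errors extend `ReviewsError` are illustrative guesses, as is the `classifyReviewError` helper.

```typescript
// Local stand-ins for the SDK's error classes (assumed hierarchy and fields).
class ReviewsError extends Error {
  status: number;
  constructor(message: string, status: number) {
    super(message);
    this.name = 'ReviewsError';
    this.status = status;
  }
}

class ReviewAlreadyExistsError extends ReviewsError {
  constructor() {
    super('execution already has a review', 409);
    this.name = 'ReviewAlreadyExistsError';
  }
}

class ReviewNotFoundError extends ReviewsError {
  constructor() {
    super('review not found', 404);
    this.name = 'ReviewNotFoundError';
  }
}

type ReviewOutcome = 'already_reviewed' | 'not_found' | 'api_error' | 'unexpected';

// Check the most specific classes first, then the generic one.
function classifyReviewError(err: unknown): ReviewOutcome {
  if (err instanceof ReviewAlreadyExistsError) return 'already_reviewed';
  if (err instanceof ReviewNotFoundError) return 'not_found';
  if (err instanceof ReviewsError) return 'api_error';
  return 'unexpected';
}
```

A caller would typically wrap `reviews.create(...)` in try/catch, treat `already_reviewed` as a no-op, and rethrow anything classified as `unexpected`.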

Reading traces from the SDK (with pagination)

The cross-agent SDK’s Executions.getDetails(executionId) returns the same hierarchical trace tree the portal shows on the “execution detail” page — but bounded to protect the DB. See Cross-Agent SDK → Executions for the full surface. Two modes, depending on the caller:
| Caller | Mode | Per-request limit | Why |
| --- | --- | --- | --- |
| Portal (/api/v1/observability/executions/:id) | Bulk | 10 000 (hard cap) | UI renders aggregations from the trace array. The cap is a safety net; normal traffic never hits it. |
| SDK runtime (/api/v1/runtime/v1/observability/executions/:id/details) | Paginated | 500 default, 1 000 max per page | Forces SDK consumers to walk pages instead of pulling everything. |
import { Executions } from '@runflow-ai/sdk/executions';

const executions = new Executions();

// Default: 500 traces per page, with pagination metadata
const { execution, traces, tracesTotal, tracesHasMore } =
  await executions.getDetails(executionId);

// Iterate every page without writing the loop yourself
for await (const page of executions.iterateTraces(executionId, { pageSize: 500 })) {
  console.log(`offset=${page.tracesOffset} / total=${page.tracesTotal}`);
  for (const root of page.traces) audit(root);
}
The portal behavior is fully backwards-compatible — the existing ObservabilityController.getExecutionDetails keeps returning the entire trace array, with the new pagination fields ignored. The 10 000 cap only kicks in if an execution actually generates that many traces (which would be a bug to investigate, not a regression).
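
For readers who want to see what walking pages involves, here is a generic sketch driven by the pagination fields shown above (`tracesOffset`, `tracesTotal`, `tracesHasMore`). The `fetchPage` callback and its `(offset, limit)` signature are hypothetical stand-ins for a paginated call; this page does not document how an offset is passed to the SDK, so prefer `iterateTraces` in real code.

```typescript
interface TracePage<T> {
  traces: T[];
  tracesTotal: number;
  tracesOffset: number;
  tracesHasMore: boolean;
}

// Hypothetical page fetcher; not an SDK signature.
type FetchPage<T> = (offset: number, limit: number) => Promise<TracePage<T>>;

// Walks pages until the server reports there is nothing more to read.
async function collectAllTraces<T>(fetchPage: FetchPage<T>, pageSize = 500): Promise<T[]> {
  const limit = Math.min(pageSize, 1000); // documented per-page maximum
  const all: T[] = [];
  let offset = 0;

  for (;;) {
    const page = await fetchPage(offset, limit);
    all.push(...page.traces);
    // Stop on the last page, and also on an empty page to avoid looping forever.
    if (!page.tracesHasMore || page.traces.length === 0) break;
    offset = page.tracesOffset + page.traces.length;
  }
  return all;
}
```

Collecting everything into one array is only reasonable for moderately sized executions; for very large ones, processing each page as it arrives keeps memory bounded.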

Observability Comparison

| Feature | Agent | Workflow | Standalone (log) |
| --- | --- | --- | --- |
| Automatic tracing | Yes | Yes | Manual |
| Mode (full/standard/minimal) | Yes | Yes | |
| onTrace interceptor | Yes | Yes | configureLogging() |
| Truncation control | Yes | Yes | |
| Trace hierarchy | Automatic | Automatic (steps) | startSpan + parentId |
| Chat rendering in portal | Via message() wrapper | Via message() wrapper | Via message() |
| Default mode | full | full | |

Next Steps

  • Workflows: Workflow tracing and step types
  • Configuration: Configure observability