Cross-Agent SDK

The Cross-Agent SDK lets one agent operate on another agent’s data within the same tenant. Read executions, walk conversation threads, write reviews against an agent’s work, and curate its memory — all from inside agent code. Without these primitives, client.chat() always targets the current agent. The cross-agent surface closes that gap and unlocks three common patterns:

Reviewer agents — judge another agent’s recent executions and write reviews automatically
Follow-up agents — find conversations idle for >24h and ping the original agent to re-engage
Metrics / curator agents — emit custom domain events from another agent’s logs, or curate its long-term memory

All operations are tenant-scoped via the runtime API key. Cross-tenant references return 404 (not 403) so existence is never leaked across tenants.

The four modules

Agents

Invoke other agents (invoke / invokeAsync), list, get.

Executions

Read execution rows and the hierarchical trace tree, with pagination.

Threads

Walk grouped conversations by entity (phone, email, contact). getFullThread returns thread + executions + traces in one call.

MemoryAdmin

Cross-agent memory: get / set / append / clear / search / list / summarize on another agent’s slots.

Universal agent reference

Every cross-agent endpoint accepts the same three identifier forms for the target agent. Resolution is server-side; the database always sees the canonical UUID.

UUID — always works

Keyed on (id, tenantId, ACTIVE). Single-match.

Slug — recommended

Exact match on (tenantId, slug, ACTIVE). Slugs are URL-friendly identifiers, unique per tenant. Auto-generated from the agent name on creation when not set explicitly.

Name — case-insensitive

Must be unambiguous. If two active agents share the name, the call returns 409 with a “use UUID or assign each a unique slug” hint.

// All three work; pick whatever reads best at the call site:
await agents.invoke('a1b2c3d4-...', { message: 'hi' });   // UUID
await agents.invoke('customer-support', { message: 'hi' }); // slug
await agents.invoke('Customer Support', { message: 'hi' }); // name

This applies to Agents.*, Reviews.{create,list,stats,exportForTraining}, Executions.list({ agentId }), Threads.list({ agentId }), and MemoryAdmin.*.
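To make the three identifier forms concrete, here is a minimal in-memory sketch of how such a resolver could behave. It is not the backend's implementation: the `AgentRow` shape, the `resolveAgentRef` name, and the order of checks (UUID, then slug, then name) are assumptions for illustration. Only the status codes and the hint text come from the description above.

```typescript
// Hypothetical row shape; the real schema is not documented here.
interface AgentRow {
  id: string;
  tenantId: string;
  slug: string;
  name: string;
  active: boolean;
}

type Resolution =
  | { status: 200; agent: AgentRow }
  | { status: 404 }
  | { status: 409; hint: string };

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function resolveAgentRef(rows: AgentRow[], tenantId: string, ref: string): Resolution {
  // Only active agents in the caller's tenant are ever considered,
  // so a cross-tenant reference looks exactly like a miss.
  const candidates = rows.filter((r) => r.tenantId === tenantId && r.active);

  // 1. UUID: single match on id.
  if (UUID_RE.test(ref)) {
    const byId = candidates.find((r) => r.id === ref);
    return byId ? { status: 200, agent: byId } : { status: 404 };
  }

  // 2. Slug: exact match, unique per tenant.
  const bySlug = candidates.find((r) => r.slug === ref);
  if (bySlug) return { status: 200, agent: bySlug };

  // 3. Name: case-insensitive, must be unambiguous.
  const byName = candidates.filter((r) => r.name.toLowerCase() === ref.toLowerCase());
  if (byName.length > 1) {
    return { status: 409, hint: 'use UUID or assign each a unique slug' };
  }
  const [only] = byName;
  return only ? { status: 200, agent: only } : { status: 404 };
}
```

Because the tenant filter runs before any matching, a slug that exists in another tenant resolves to the same 404 as a slug that does not exist at all.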

The slug field on the agent is optional in the create modal. Leave it blank and the backend derives one from the name (Customer Support Bot → customer-support-bot). Manual edits make the slug “sticky” — renaming the agent won’t overwrite a slug you customized.
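The backend's exact slug rules are not spelled out here, so the following is only a sketch of one derivation that reproduces the example above. `deriveSlug`, `slugAfterRename`, and the `slugCustomized` flag are hypothetical names, and regenerating a non-customized slug on rename is an assumption.

```typescript
// Derive a URL-friendly slug from an agent name (assumed rules).
function deriveSlug(name: string): string {
  return name
    .normalize('NFKD')               // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, '') // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')     // collapse every run of other characters into one hyphen
    .replace(/^-+|-+$/g, '');        // trim leading and trailing hyphens
}

// "Sticky" behaviour: a slug the user customized survives a rename.
function slugAfterRename(
  agent: { slug: string; slugCustomized: boolean },
  newName: string,
): string {
  return agent.slugCustomized ? agent.slug : deriveSlug(newName);
}
```

With these rules, `deriveSlug('Customer Support Bot')` yields `customer-support-bot`, matching the example in the text.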

Agents

Cross-agent invocation and discovery.

import { Agents } from '@runflow-ai/sdk/agents';

const agents = new Agents();

// Sync — DEFAULT. Wait for the target agent to finish (timeout 60s default).
const result = await agents.invoke('customer-support', {
  message: 'Summary of the last 24h',
  userId: 'reviewer-agent',
  channel: 'review',
});
console.log(result.output);

// Async — fire and forget; backend returns the executionId immediately.
const { executionId } = await agents.invokeAsync('customer-support', {
  message: 'Hi! I noticed we have not talked in 24h. Is everything OK?',
  userId: '+5511999999999',
  channel: 'follow-up',
});

// Discovery
await agents.list({ limit: 50 });
await agents.get('customer-support');

Sync vs async — when to pick which

| Use case | Method |
| --- | --- |
| Reviewer / metrics / eval agent (needs output) | `invoke()` |
| Follow-up agent (ping and forget) | `invokeAsync()` |
| Fire side-effect (audit log, webhook) | `invokeAsync()` |
| RAG/judge chain that needs result | `invoke()` |

Flexible input — anything goes to `request.*`

The entire input object is forwarded to the Go executor and exposed under request.* in the target’s handler. message is not required — pass arbitrary structured payloads.

await agents.invoke('order-processor', {
  orderId: 'ord-123',
  action: 'fulfill',
  items: [{ sku: 'A', qty: 2 }],
  metadata: { source: 'reviewer' },
});

// Inside order-processor:
export async function main(input) {
  const { orderId, action, items } = input.request;
  // ...
}
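Since any payload shape can arrive under `request.*`, the target handler may want to validate it before use. The sketch below shows one way to do that without extra libraries. `OrderRequest` and `parseOrderRequest` are hypothetical names modeled on the payload above, not part of the SDK.

```typescript
interface OrderRequest {
  orderId: string;
  action: string;
  items: { sku: string; qty: number }[];
}

// Narrow an untrusted `input.request` value into a typed payload, or throw.
function parseOrderRequest(request: unknown): OrderRequest {
  if (typeof request !== 'object' || request === null) {
    throw new Error('request must be an object');
  }
  const r = request as Record<string, unknown>;

  const orderId = r['orderId'];
  if (typeof orderId !== 'string') throw new Error('orderId must be a string');

  const action = r['action'];
  if (typeof action !== 'string') throw new Error('action must be a string');

  const rawItems = r['items'];
  if (!Array.isArray(rawItems)) throw new Error('items must be an array');

  const items = rawItems.map((item: unknown, i: number) => {
    const it = item as Record<string, unknown> | null | undefined;
    const sku = it ? it['sku'] : undefined;
    const qty = it ? it['qty'] : undefined;
    if (typeof sku !== 'string' || typeof qty !== 'number') {
      throw new Error(`items[${i}] must be { sku: string, qty: number }`);
    }
    return { sku, qty };
  });

  return { orderId, action, items };
}
```

Inside the handler this would be called as `parseOrderRequest(input.request)`, so a malformed cross-agent call fails fast with a readable message.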

Executions

Read execution rows across agents in the caller’s tenant.

import { Executions } from '@runflow-ai/sdk/executions';

const executions = new Executions();

const { data } = await executions.list({ agentId: 'customer-support', limit: 100 });
for (const exec of data) {
  const detail = await executions.get(exec.id);
  // detail.input, detail.output, detail.duration, detail.cost, ...
}

`getDetails` — execution + full trace tree

Returns the execution row plus the hierarchical trace tree (LLM calls, tool calls, sub-spans). What you see on the “execution detail” page in the portal.

const { execution, traces, tracesTotal, tracesHasMore } =
  await executions.getDetails(executionId);

walkTrace(traces);   // each trace node has .children
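`walkTrace` in the snippet above is not an SDK export; it stands for whatever traversal you need. A minimal depth-first version could look like this. The `TraceNode` fields other than `children` are assumptions.

```typescript
// Assumed node shape: only `.children` is confirmed by the text above.
interface TraceNode {
  id: string;
  type: string; // e.g. 'agent_execution', 'llm_call', 'tool_call'
  children?: TraceNode[];
}

// Depth-first, pre-order walk: visit a node, then its children.
function walkTrace(
  nodes: TraceNode[],
  visit: (node: TraceNode, depth: number) => void = (n, d) =>
    console.log(`${'  '.repeat(d)}${n.type} (${n.id})`),
  depth = 0,
): void {
  for (const node of nodes) {
    visit(node, depth);
    walkTrace(node.children ?? [], visit, depth + 1);
  }
}
```

Called with no visitor, it prints an indented outline of the tree. Pass your own `visit` callback to aggregate cost, count tool calls, and so on.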

Trace pagination — protect the DB

A pathological execution (deep workflow, tool loop, RAG-heavy turn) can produce thousands of traces. Before pagination, one bad execution could pin Postgres and return a multi-megabyte payload.

| Caller | Mode | Per-request limit | Why |
| --- | --- | --- | --- |
| Portal | Bulk | 10 000 (hard cap) | UI renders aggregations from the trace array. Cap is a safety net; normal traffic never hits it. |
| SDK runtime | Paginated | 500 default, 1 000 max | Forces SDK consumers to walk pages instead of pulling everything. |

// Page 1
const page1 = await executions.getDetails(execId, { traceLimit: 500, traceOffset: 0 });
if (page1.tracesHasMore) {
  // Page 2
  const page2 = await executions.getDetails(execId, { traceLimit: 500, traceOffset: 500 });
}
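The offset arithmetic behind manual paging can be sketched as follows. The 500 default and 1 000 maximum come from the table above. Clamping an out-of-range value (rather than rejecting it) is an assumption, and both function names are hypothetical.

```typescript
const DEFAULT_TRACE_LIMIT = 500;
const MAX_TRACE_LIMIT = 1000;

// Assumed behaviour: fall back to the default, clamp into [1, 1000].
function clampTraceLimit(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) return DEFAULT_TRACE_LIMIT;
  return Math.min(MAX_TRACE_LIMIT, Math.max(1, Math.floor(requested)));
}

// List the { traceLimit, traceOffset } pairs needed to cover `tracesTotal` traces.
function tracePageRequests(
  tracesTotal: number,
  requestedLimit?: number,
): { traceLimit: number; traceOffset: number }[] {
  const traceLimit = clampTraceLimit(requestedLimit);
  const pages: { traceLimit: number; traceOffset: number }[] = [];
  for (let traceOffset = 0; traceOffset < tracesTotal; traceOffset += traceLimit) {
    pages.push({ traceLimit, traceOffset });
  }
  return pages;
}
```

An execution with 1 200 traces needs three requests at the default page size (offsets 0, 500, 1 000) and two at the maximum.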

`iterateTraces` — walk every page without boilerplate

Async generator that walks pages until tracesHasMore is false. Yields one page at a time so memory doesn’t spike on huge executions.

for await (const page of executions.iterateTraces(execId, { pageSize: 500 })) {
  console.log(`offset=${page.tracesOffset} / total=${page.tracesTotal}`);
  for (const root of page.traces) audit(root);
}

iterateTraces has a hard safety cap of 100 pages (about 100k traces at the maximum page size of 1 000, or 50k at the default of 500). If you hit it, something is wrong upstream: investigate the agent, don’t crank the limit.
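For intuition, a paging generator with a page cap can be written in a few lines. This is a sketch of the pattern, not the SDK's source: `iteratePages` and `FetchPage` are hypothetical names, and throwing when the cap is reached is an assumption (the SDK may stop silently instead).

```typescript
interface TracePage<T> {
  traces: T[];
  tracesOffset: number;
  tracesTotal: number;
  tracesHasMore: boolean;
}

// Stand-in for a call such as executions.getDetails(execId, { traceLimit, traceOffset }).
type FetchPage<T> = (opts: { traceLimit: number; traceOffset: number }) => Promise<TracePage<T>>;

const MAX_PAGES = 100; // safety cap from the text above

async function* iteratePages<T>(
  fetchPage: FetchPage<T>,
  pageSize = 500,
): AsyncGenerator<TracePage<T>> {
  let traceOffset = 0;
  for (let pageIndex = 0; pageIndex < MAX_PAGES; pageIndex++) {
    const page = await fetchPage({ traceLimit: pageSize, traceOffset });
    yield page; // one page in memory at a time
    if (!page.tracesHasMore) return;
    traceOffset += pageSize;
  }
  // Assumed behaviour on hitting the cap.
  throw new Error(`gave up after ${MAX_PAGES} pages; inspect the agent that produced this execution`);
}
```

Yielding pages rather than individual traces keeps the consumer in control of batching: each `for await` iteration corresponds to exactly one request.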

Threads

Threads group executions by entity (phone, email, contact). One conversation that spans multiple executions over time is one thread.

import { Threads } from '@runflow-ai/sdk/threads';

const threads = new Threads();

// Threads idle for >24h, ordered by last activity
const cutoff = new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();
const { threads: list } = await threads.list({
  agentId: 'customer-support',
  dateTo: cutoff,
  limit: 50,
});

// Drill into one conversation
const { executions: timeline } = await threads.getExecutions(list[0].thread_id, {
  limit: 50,
  order: 'asc',
});
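The idle cutoff is plain date arithmetic, so it can be pulled into a small helper if several agents use different idle windows. `idleCutoffIso` is a hypothetical name; the injectable `now` parameter is there only to make the function testable.

```typescript
// ISO timestamp for "now minus idleHours"; pass it as `dateTo` to threads.list.
function idleCutoffIso(idleHours: number, now: Date = new Date()): string {
  return new Date(now.getTime() - idleHours * 60 * 60 * 1000).toISOString();
}
```

For example, `idleCutoffIso(24)` reproduces the `cutoff` value computed inline above.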

Granularity map

Thread       conversation between user Y and agent X       (1 per entity)
 └─ Execution   one turn in that conversation               (1 per user message)
     └─ Trace      step inside that turn                    (many per execution)
          ├─ agent_execution
          ├─ llm_call            (GPT/Claude call)
          ├─ tool_call           (knowledge_search, connector, ...)
          └─ tool_call (etc)
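The same hierarchy can be expressed as types, which helps when writing code that aggregates over it. The interfaces below are a simplified illustration with assumed field names, not the SDK's response types.

```typescript
// Simplified, assumed shapes mirroring Thread > Execution > Trace.
interface Trace {
  type: string; // 'agent_execution' | 'llm_call' | 'tool_call' | ...
  children: Trace[];
}

interface ExecutionWithTraces {
  executionId: string;
  traces: Trace[];
}

interface ThreadView {
  threadId: string;
  executions: ExecutionWithTraces[];
}

// Count every trace node in a forest, including nested sub-spans.
function countTraces(traces: Trace[]): number {
  return traces.reduce((sum, t) => sum + 1 + countTraces(t.children), 0);
}

// One row per level of the granularity map.
function summarizeThread(thread: ThreadView): { executions: number; traces: number } {
  return {
    executions: thread.executions.length,
    traces: thread.executions.reduce((sum, e) => sum + countTraces(e.traces), 0),
  };
}
```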

`getFullThread` — thread + executions + traces in one call

const full = await threads.getFullThread(threadId, {
  agentId: 'customer-support',
  maxTraces: 20,         // cap on executions to deep-fetch (default 20)
  traceLimit: 500,        // per-execution trace page size
});

// full.threadId
// full.total                              ← total executions
// full.executions[].execution             ← row (input/output/cost/duration)
// full.executions[].traces[]              ← trace tree
// full.executions[].traces[].children[]   ← sub-spans

getFullThread fetches in two stages: it lists the thread’s executions, then calls executions.getDetails for each one (concurrency capped at 5). Individual failures are silently dropped so that a single bad execution doesn’t break the whole batch.
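The "capped concurrency, drop failures" pattern described here can be sketched with a small worker pool. This is an illustration of the technique only; `mapSettledWithLimit` is a hypothetical helper, not something the SDK exports.

```typescript
// Run `fn` over `items` with at most `limit` calls in flight.
// Rejected calls are dropped; surviving results keep their input order.
async function mapSettledWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: (R | undefined)[] = new Array(items.length);
  const ok: boolean[] = new Array(items.length).fill(false);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim the next item (single-threaded, so no race)
      try {
        results[index] = await fn(items[index] as T);
        ok[index] = true;
      } catch {
        // Dropped on purpose: one bad item must not fail the batch.
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results.filter((_, i) => ok[i]) as R[];
}
```

The trade-off of silent dropping is that the caller cannot tell a short result from a short thread, so comparing the number of returned executions with the reported total is a cheap sanity check.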

Memory Admin

The default Memory module is scoped to the caller’s agent. To curate another agent’s memory (audit messages, inject system context, clear stale sessions, summarize), use MemoryAdmin.

import { MemoryAdmin } from '@runflow-ai/sdk/memory-admin';

const admin = new MemoryAdmin();

// Inventory — list every memory slot owned by the target agent
const { sessions } = await admin.list('customer-support');

// Filter by activity window
const since = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString();
const recent = await admin.list('customer-support', {
  dateFrom: since,
  dateField: 'updated_at',
  limit: 50,
});

// Read another agent's memory
const data = await admin.get('customer-support', 'phone:+5511999999999');

// Inject a system message (curator agent)
await admin.append('customer-support', 'phone:+5511999999999', {
  role: 'system',
  content: 'IMPORTANT: priority customer, respond in <2min.',
});

// Search within a time window
const hits = await admin.search(
  'customer-support',
  'phone:+5511999999999',
  'error',
  {
    dateFrom: new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString(),
    limit: 100,
  },
);

// Summarize the slot — generates an LLM summary and persists it
const { summary } = await admin.summarize(
  'customer-support',
  'phone:+5511999999999',
  { prompt: 'Summarize in up to 5 bullets, in Portuguese:' },
);

// Clear (deletes the session)
await admin.clear('customer-support', 'phone:+5511999999999');

Memory keys are prefixed with the target agent’s id (not the caller’s), so the existing data isolation model stays intact. Each agent has its own namespace; MemoryAdmin just flips which namespace you target.
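As a mental model for the namespacing, think of the storage key as the agent id joined to the slot name. The `agentId:slot` format and the `memoryKey` helper below are assumptions for illustration; the actual key layout is internal to the backend.

```typescript
// Assumed layout: "<agentId>:<slot>". The real format may differ.
function memoryKey(agentId: string, slot: string): string {
  return `${agentId}:${slot}`;
}

// Same slot name, different agents: two distinct keys, so no collision.
const callerKey = memoryKey('reviewer-agent-id', 'phone:+5511999999999');
const targetKey = memoryKey('customer-support-id', 'phone:+5511999999999');
console.log(callerKey !== targetKey); // true
```

Under this model, the default Memory module always fills in the caller's id, while MemoryAdmin fills in the id of the resolved target agent.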

Reviews — full lifecycle

The existing Reviews module covers production execution reviews. It now has two new ergonomic helpers, resolve and dismiss, for the common verdict transitions:

import { Reviews } from '@runflow-ai/sdk/reviews';

const reviews = new Reviews();

// Create (auto-judge case)
await reviews.create({
  executionId,
  agentId: 'customer-support',       // UUID, slug, or name
  rating: 'bad',
  comment: 'Bot gave the wrong opening hours.',
  priority: 'high',
  tags: ['hours-wrong'],
});

// List, filter, stats — accept slug/name too
await reviews.list({ agentId: 'customer-support', status: 'pending_review', limit: 100 });
await reviews.stats({ agentId: 'customer-support' });

// NEW — resolve a review (= update with status='resolved')
await reviews.resolve(reviewId, {
  actionTaken: 'knowledge_base_updated',
  correctedOutput: 'The correct hours are 9:00-18:00, Mon-Fri.',
  resolutionNotes: 'Added the missing doc to the vector store.',
});

// NEW — dismiss a review (= update with status='wont_fix')
await reviews.dismiss(reviewId, { resolutionNotes: 'edge case, ignoring' });

Reviews created by SDK callers show up in the UI with reviewedBy: apikey:<name>, which makes them easy to tell apart from human reviews.
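If you process review rows in code, the same prefix can be used to separate automated reviews from human ones. The sketch below assumes each row exposes a `reviewedBy` string as shown in the UI. `ReviewRow` and `splitByReviewer` are hypothetical names.

```typescript
// Assumed minimal row shape; real review rows carry more fields.
interface ReviewRow {
  id: string;
  reviewedBy: string;
}

const SDK_REVIEWER_PREFIX = 'apikey:';

function splitByReviewer(rows: ReviewRow[]): { sdk: ReviewRow[]; human: ReviewRow[] } {
  const sdk: ReviewRow[] = [];
  const human: ReviewRow[] = [];
  for (const row of rows) {
    (row.reviewedBy.startsWith(SDK_REVIEWER_PREFIX) ? sdk : human).push(row);
  }
  return { sdk, human };
}
```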

Recipes

Reviewer agent — automated quality control

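// Assumes `executions` and `reviews` are the Executions and Reviews instances
// constructed in the sections above, and TARGET is the target agent's UUID, slug, or name.
const recent = await executions.list({ agentId: TARGET, limit: 100 });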

for (const exec of recent.data) {
  const { exists } = await reviews.checkHasReview(exec.id);
  if (exists) continue;

  const detail = await executions.get(exec.id);
  const verdict = await runJudge(detail);   // your LLM judge

  await reviews.create({
    executionId: exec.id,
    agentId: TARGET,
    rating:    verdict.rating,
    comment:   verdict.reason,
    tags:      ['auto-judge'],
  });
}

Wire it to a daily CRON trigger and humans only see the bad reviews. See Auto-reviewer agent for the full walkthrough.

Follow-up agent — decoupled from the conversation flow

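// Assumes `threads` and `agents` are the Threads and Agents instances
// constructed in the sections above, and TARGET is the target agent's UUID, slug, or name.
const cutoff = new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();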
const { threads: stale } = await threads.list({
  agentId: TARGET,
  dateTo:  cutoff,
  limit:   50,
});

for (const t of stale) {
  await agents.invokeAsync(TARGET, {
    message:  'Hi! I noticed we have not talked in 24h. Is everything OK?',
    userId:   t.entity_value,
    channel:  'follow-up',
    metadata: { followUpFor: t.thread_id, reason: 'idle-24h' },
  });
}

The follow-up agent is just a normal agent with its own cron trigger — none of the logic leaks into the original conversation.

Metrics agent — custom KPIs from execution logs

import { track } from '@runflow-ai/sdk/observability';

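// Assumes `executions` is the Executions instance constructed above;
// looksLikeBooking is your own domain heuristic.
const recent = await executions.list({ agentId: TARGET, limit: 500 });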
for (const exec of recent.data) {
  const detail = await executions.get(exec.id);
  if (looksLikeBooking(detail.output)) {
    track('booking_completed', {
      agentId:    TARGET,
      threadId:   detail.threadId,
      durationMs: detail.duration,
      cost:       detail.cost,
    });
  }
}

Dashboards (or external BI) get a real booking_completed event without touching the target agent at all.
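`looksLikeBooking` in the recipe is a placeholder for your own domain logic. As a starting point, a keyword heuristic might look like the sketch below. The keywords are arbitrary examples, and a production version would more likely rely on structured output or an LLM classifier.

```typescript
// Naive heuristic: a booking-like noun followed later by a confirmation verb.
function looksLikeBooking(output: unknown): boolean {
  const text =
    typeof output === 'string' ? output : JSON.stringify(output ?? '') ?? '';
  return /\b(booking|appointment|reservation)\b[\s\S]*\b(confirmed|scheduled|booked)\b/i.test(text);
}
```

Non-string outputs are serialized first, so the check also works when the target agent returns a structured object.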

Security model

All cross-agent endpoints sit behind RuntimeAuthGuard or SdkAuthGuard and resolve tenantId from the credential — never from the body or query.

| Endpoint | How the tenant is enforced |
| --- | --- |
| `POST /runtime/agents/:ref/invoke[-async]` | Agent lookup keyed on `(id\|slug\|name, tenantId, ACTIVE)`. Ambiguous name → 409, miss → 404 |
| `GET /runtime/v1/observability/threads` | Service-layer `WHERE tenant_id = $1` in raw SQL |
| `GET .../threads/:id/executions` | Same tenant filter at service layer |
| `GET .../executions/:id/details` | Controller-side `tenantId !== sdkContext.tenantId` → 404 |
| `POST .../executions/:id/reviews` | `req.sdkContext.tenantId` carried into the service |
| `/runtime/v1/agents/:ref/memory/*` | Resolves target via the same agent gate; memory keys prefixed with the target’s id |

Cross-tenant access returns 404 (not 403) — existence is never leaked.
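The "404, not 403" rule boils down to treating a row from another tenant exactly like a missing row. A minimal sketch of that check, using a hypothetical in-memory store and function name:

```typescript
interface ExecutionRow {
  id: string;
  tenantId: string;
}

type Lookup = { status: 200; execution: ExecutionRow } | { status: 404 };

// `callerTenantId` must come from the credential, never from the request body or query.
function getExecutionForTenant(
  rows: Map<string, ExecutionRow>,
  callerTenantId: string,
  executionId: string,
): Lookup {
  const row = rows.get(executionId);
  // Missing and cross-tenant collapse into the same response,
  // so a caller cannot probe which ids exist in other tenants.
  if (!row || row.tenantId !== callerTenantId) return { status: 404 };
  return { status: 200, execution: row };
}
```

Returning 403 for the cross-tenant case would confirm that the id exists, which is exactly the leak this design avoids.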

SDK version requirement

Requires @runflow-ai/sdk >= 1.2.0. Older versions lack the new namespaces on the API client and throw a clear “namespace missing” error at construction time.

Next steps

Auto-reviewer agent

Full walkthrough of an automated quality-control agent.

Memory

Single-agent memory module (the default).

Observability

Tracing and business events.

Standalone Modules

All standalone SDK exports.
