Streaming

Stream responses in real time using `processStream()`. Streams carry content chunks, thinking/reasoning, tool-call progress, and memory persistence events.
Basic Streaming
```typescript
const stream = await agent.processStream({
  message: 'Tell me a story',
  sessionId: 'session_123',
});

for await (const chunk of stream) {
  if (chunk.type === 'content') {
    process.stdout.write(chunk.data.content);
  }
}
```
Chunk Types
A stream can emit the following chunk types:
| Type | Description | Data |
|---|---|---|
| `content` | Text response from the model | `{ content: string, done: boolean }` |
| `thinking` | Reasoning/thinking content | `{ content: string, done: boolean }` |
| `internal_process` | Tool call start/complete, memory load/save | `{ processType, status, process }` |
| `done` | Stream complete | `{ message, metadata }` |
| `error` | Error occurred | `{ error: string }` |
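For reference, the chunk union can be modeled roughly as below. This is an illustrative sketch derived from the table, not the library's exported types; the `unknown` fields are assumptions:

```typescript
// Illustrative model of the chunk union, derived from the table above.
// Not the library's exported types; `unknown` fields are assumptions.
type StreamChunk =
  | { type: 'content'; data: { content: string; done: boolean } }
  | { type: 'thinking'; data: { content: string; done: boolean } }
  | {
      type: 'internal_process';
      // The examples below also read a `result` field on this shape.
      data: { processType: string; status: string; process?: unknown; result?: unknown };
    }
  | { type: 'done'; data: { message: unknown; metadata: unknown } }
  | { type: 'error'; data: { error: string } };
```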
Streaming with Thinking
When thinking is enabled, reasoning content arrives as separate chunks before the final response:
```typescript
const agent = new Agent({
  name: 'Analyst',
  model: anthropic('claude-sonnet-4-6'),
  modelConfig: {
    thinking: { type: 'enabled', budgetTokens: 5000 },
  },
});

const stream = await agent.processStream({ message: 'Why is the sky blue?' });

for await (const chunk of stream) {
  switch (chunk.type) {
    case 'thinking':
      console.log('[Thinking]', chunk.data.content);
      break;
    case 'content':
      process.stdout.write(chunk.data.content);
      break;
    case 'internal_process':
      if (chunk.data.status === 'started') {
        console.log(`[${chunk.data.processType}] started...`);
      }
      break;
  }
}
```
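To render reasoning apart from the answer (the way Prompt Studio shows it in a collapsible block, described below), buffer the thinking chunks instead of printing them inline. A minimal sketch reusing the same agent:

```typescript
// Collect reasoning and answer text into separate buffers (sketch).
const stream = await agent.processStream({ message: 'Why is the sky blue?' });

let reasoning = '';
let answer = '';
for await (const chunk of stream) {
  if (chunk.type === 'thinking') reasoning += chunk.data.content;
  if (chunk.type === 'content') answer += chunk.data.content;
}

console.log('[Reasoning]\n' + reasoning);
console.log('[Answer]\n' + answer);
```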
Streaming with Tool Calls

Tool calls are reported as `internal_process` chunks. The agent handles the tool loop automatically:
```typescript
const agent = new Agent({
  name: 'Assistant',
  model: openai('gpt-4o'),
  tools: {
    get_weather: {
      name: 'get_weather',
      description: 'Get weather for a location',
      parameters: {
        location: { type: 'string', description: 'City name', required: true },
      },
      execute: async ({ location }) => {
        // Stubbed result; a real tool would look up weather for `location`.
        return { temp: 22, condition: 'sunny' };
      },
    },
  },
});
```
```typescript
const stream = await agent.processStream({ message: 'Weather in Tokyo?' });

for await (const chunk of stream) {
  if (chunk.type === 'content') {
    process.stdout.write(chunk.data.content);
  } else if (chunk.type === 'internal_process') {
    const proc = chunk.data;
    if (proc.processType === 'tool_call' && proc.status === 'started') {
      console.log(`\nCalling tool: ${proc.process.name}`);
    }
    if (proc.processType === 'tool_call' && proc.status === 'completed') {
      console.log(`Tool result: ${JSON.stringify(proc.result)}`);
    }
  }
}
```
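The chunk-type table above also lists `done` and `error` chunks. A minimal sketch of handling them, using the shapes from that table:

```typescript
// Handle terminal chunks (shapes as listed in the chunk-type table).
const stream = await agent.processStream({ message: 'Weather in Tokyo?' });

for await (const chunk of stream) {
  if (chunk.type === 'content') {
    process.stdout.write(chunk.data.content);
  } else if (chunk.type === 'error') {
    console.error('\nStream failed:', chunk.data.error);
    break;
  } else if (chunk.type === 'done') {
    console.log('\nFinished. Metadata:', JSON.stringify(chunk.data.metadata));
  }
}
```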
Streaming with Memory
Memory is loaded automatically before streaming starts and saved after it completes:
```typescript
const agent = new Agent({
  name: 'Chat',
  model: openai('gpt-4o'),
  memory: { maxTurns: 20 },
});

// Memory chunks appear as internal_process
const stream = await agent.processStream({
  message: 'Continue our conversation',
  sessionId: 'session_abc',
});

for await (const chunk of stream) {
  if (chunk.type === 'internal_process' && chunk.data.processType === 'memory_load') {
    console.log('Memory loaded:', chunk.data.result?.messagesCount, 'messages');
  }
  if (chunk.type === 'content') {
    process.stdout.write(chunk.data.content);
  }
}
```
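The chunk-type table describes `internal_process` as covering memory load and save, so a corresponding save event should arrive once the turn is persisted. A sketch of an extra branch for the same loop; the exact `memory_save` process type is an assumption based on that description:

```typescript
// Assumed counterpart to memory_load (the table mentions "memory load/save");
// verify the exact processType against your SDK version.
if (chunk.type === 'internal_process' && chunk.data.processType === 'memory_save') {
  console.log('Memory saved for session.');
}
```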
LLM Standalone Streaming
Direct LLM streaming without agents:
```typescript
const llm = LLM.anthropic('claude-sonnet-4-6', {
  thinking: { type: 'enabled', budgetTokens: 3000 },
});

for await (const chunk of llm.generateStream('Explain quantum computing')) {
  if (chunk.thinking) {
    console.log('[Think]', chunk.thinking);
  }
  if (chunk.text) {
    process.stdout.write(chunk.text);
  }
}
```
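To keep the full response after the loop finishes, accumulate the text as it streams. A minimal sketch using the same chunk fields:

```typescript
// Accumulate streamed text into a single string (same chunk fields as above).
let full = '';
for await (const chunk of llm.generateStream('Explain quantum computing')) {
  if (chunk.text) full += chunk.text;
}
console.log(full);
```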
Testing in Prompt Studio
You can test streaming behavior directly in the Portal’s Prompt Studio:
- Open Prompts and select or create a prompt
- Click the config icon and enable Thinking
- Send a message. The thinking content appears as a collapsible block above the response.
Next Steps

- Reasoning: Extended thinking for complex tasks
- Memory: Conversation persistence