> **Documentation index:** https://docs.runflow.ai/llms.txt. Fetch this file to discover all available pages before exploring further.
The LLM module lets you call language models directly — no agents, no memory, no tools. Use it when you need a single LLM call for tasks like classification, data extraction, translation, or content generation.
## When to Use LLM vs Agent

| Scenario | Use |
| --- | --- |
| Conversation with memory and tools | Agent |
| Single classification or categorization | LLM Standalone |
| Extract structured data from text | LLM Standalone |
| Generate content (emails, summaries) | LLM Standalone |
| Translate text | LLM Standalone |
| Pre-process input before an agent | LLM Standalone |
## Basic Usage

```typescript
import { LLM } from '@runflow-ai/sdk';

const llm = LLM.openai('gpt-4o', {
  temperature: 0.7,
  maxTokens: 2000,
});

const response = await llm.generate('What is the capital of Brazil?');
console.log(response.text);
console.log('Tokens:', response.usage);
```
## With System Prompt

Use a system prompt to control the LLM's behavior:

```typescript
const response = await llm.generate(
  'The product arrived broken and I want my money back.',
  {
    system: `Classify the customer message into exactly one category:
- REFUND_REQUEST
- TECHNICAL_ISSUE
- GENERAL_QUESTION
- COMPLAINT
- PRAISE

Respond with ONLY the category name, nothing else.`,
    temperature: 0,
  }
);

console.log(response.text); // "REFUND_REQUEST"
```
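Even at temperature 0, the model can occasionally return extra whitespace, different casing, or an unexpected label. A minimal defensive-parsing sketch (the `parseCategory` helper and its `GENERAL_QUESTION` fallback are illustrative, not part of the SDK):

```typescript
const CATEGORIES = [
  'REFUND_REQUEST',
  'TECHNICAL_ISSUE',
  'GENERAL_QUESTION',
  'COMPLAINT',
  'PRAISE',
] as const;

type Category = (typeof CATEGORIES)[number];

// Normalize the raw model reply; fall back to GENERAL_QUESTION when the
// reply is not one of the expected labels.
function parseCategory(raw: string): Category {
  const cleaned = raw.trim().toUpperCase();
  return (CATEGORIES as readonly string[]).includes(cleaned)
    ? (cleaned as Category)
    : 'GENERAL_QUESTION';
}
```

This keeps downstream code working with a closed set of values even when the model misbehaves.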
## With Messages

For multi-turn prompts or few-shot examples:

```typescript
const response = await llm.generate([
  {
    role: 'system',
    content: `Extract structured data from customer messages. Return valid JSON only.`,
  },
  {
    role: 'user',
    content: 'My name is João, email joao@test.com, I need help with order ORD-789',
  },
]);

const data = JSON.parse(response.text);
// { name: "João", email: "joao@test.com", orderId: "ORD-789" }
```
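Note that `JSON.parse` throws if the model wraps its reply in a markdown code fence or returns malformed JSON. A tolerant-parsing sketch (the `safeParseJson` helper is illustrative, not part of the SDK):

```typescript
// Strip an optional markdown fence before parsing; return null on failure
// instead of throwing.
function safeParseJson<T>(text: string): T | null {
  const cleaned = text
    .trim()
    .replace(/^```(?:json)?\s*/i, '')
    .replace(/\s*```$/, '');
  try {
    return JSON.parse(cleaned) as T;
  } catch {
    return null;
  }
}
```

Callers can then branch on `null` instead of wrapping every parse in try/catch.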
## Streaming

For real-time output (long responses, content generation):

```typescript
const stream = llm.generateStream('Write a product description for a wireless headphone');

for await (const chunk of stream) {
  if (!chunk.done) {
    process.stdout.write(chunk.text);
  }
}
```
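If you also need the complete text after streaming finishes (for logging or storage), you can accumulate chunks as they arrive. A sketch assuming the `{ done, text }` chunk shape shown above (the `collectStream` helper is illustrative, not an SDK function):

```typescript
// Accumulate streamed chunks into the full text, optionally echoing each
// chunk as it arrives.
async function collectStream(
  stream: AsyncIterable<{ done: boolean; text: string }>,
  onChunk?: (text: string) => void,
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    if (!chunk.done) {
      full += chunk.text;
      onChunk?.(chunk.text);
    }
  }
  return full;
}
```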
## Available Models

```typescript
import { LLM } from '@runflow-ai/sdk';

// OpenAI
const gpt4 = LLM.openai('gpt-4o', { temperature: 0.7 });
const gpt4mini = LLM.openai('gpt-4o-mini', { temperature: 0.3 });

// Anthropic (Claude)
const claude = LLM.anthropic('claude-sonnet-4-20250514', {
  temperature: 0.9,
  maxTokens: 4000,
});

// AWS Bedrock
const bedrockClaude = LLM.bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0', {
  temperature: 0.8,
});

// Groq (ultra-fast inference)
const fast = LLM.groq('llama-3.3-70b-versatile', { temperature: 0.3 });

// Google Gemini
const flash = LLM.gemini('gemini-2.5-flash', { temperature: 0.5 });

// xAI (Grok)
const research = LLM.xai('grok-4-1-fast-reasoning', { temperature: 0.3 });

// Custom (OpenAI-compatible: Ollama, vLLM, LiteLLM, etc.)
const local = LLM.custom('llama3', 'Ollama Local', { temperature: 0.7 });
```

See LLM Providers for all supported providers and configuration options.
## Structured Output

Force responses into valid JSON using `responseFormat`:

```typescript
const extractor = LLM.openai('gpt-4o', {
  responseFormat: { type: 'json_object' },
});

const result = await extractor.generate('List 3 colors with hex codes', {
  system: 'Respond with valid JSON only.',
});

const data = JSON.parse(result.text);
```

For schema-validated JSON:

```typescript
const extractor = LLM.openai('gpt-4o', {
  responseFormat: {
    type: 'json_schema',
    json_schema: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        age: { type: 'integer' },
      },
      required: ['name', 'age'],
      additionalProperties: false,
    },
  },
});
```

See Structured Output for full provider support details.
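On the TypeScript side, the parsed value is still untyped until you check it at runtime. A sketch of a guard that mirrors the schema above (the `Person` interface name is an assumption for illustration, not an SDK type):

```typescript
// Runtime counterpart of the json_schema above: string name, integer age.
interface Person {
  name: string;
  age: number;
}

function isPerson(value: unknown): value is Person {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as Person).name === 'string' &&
    Number.isInteger((value as Person).age)
  );
}
```

After `isPerson(data)` returns true, TypeScript narrows `data` to `Person`, so property access is type-safe without a cast.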
## Thinking / Reasoning

Enable extended thinking for complex tasks:

```typescript
const thinker = LLM.anthropic('claude-sonnet-4-6', {
  thinking: { type: 'enabled', budgetTokens: 10000 },
});

const result = await thinker.generate('What is 17! / 15!?');
```

Or use reasoning models that think natively:

```typescript
const reasoner = LLM.openai('o4-mini');
const researcher = LLM.xai('grok-4-1-fast-reasoning');
```

See Reasoning for all provider options.
## Real-World Example: Intent Classification in a Tool

A common pattern is using LLM Standalone inside a tool to classify intent before the agent decides what to do:

```typescript
import { createTool, LLM } from '@runflow-ai/sdk';
import { z } from 'zod';

const classifier = LLM.openai('gpt-4o-mini', { temperature: 0 });

export const classifyIntentTool = createTool({
  id: 'classify-intent',
  description: 'Classify customer message intent',
  inputSchema: z.object({
    message: z.string().describe('The customer message to classify'),
  }),
  execute: async (params) => {
    try {
      const response = await classifier.generate(params.message, {
        system: `Classify the message into one category:
- ORDER_STATUS: asking about an order, delivery, or tracking
- REFUND: requesting money back or return
- TECHNICAL: product issue or bug report
- BILLING: payment, invoice, or charge question
- GENERAL: anything else

Respond with JSON: { "intent": "CATEGORY", "confidence": 0.0-1.0 }`,
      });
      return JSON.parse(response.text);
    } catch (error) {
      return { intent: 'GENERAL', confidence: 0 };
    }
  },
});
```
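Downstream code may want to act on the classified intent only when the model is reasonably sure. A small illustrative helper (the 0.7 threshold and the `GENERAL` fallback are assumptions, chosen to mirror the tool's own catch branch):

```typescript
interface IntentResult {
  intent: string;
  confidence: number;
}

// Accept the classified intent only when confidence clears the threshold;
// otherwise fall back to GENERAL, matching the tool's error fallback.
function resolveIntent(result: IntentResult, threshold = 0.7): string {
  return result.confidence >= threshold ? result.intent : 'GENERAL';
}
```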
## Real-World Example: Pre-Processing in main.ts

Use LLM Standalone to pre-process or enrich input before passing it to your agent:

```typescript
import { LLM } from '@runflow-ai/sdk';
import { identify, track } from '@runflow-ai/sdk/observability';
import { supportAgent } from './agent';

const classifier = LLM.openai('gpt-4o-mini', { temperature: 0 });

async function detectLanguage(text: string): Promise<string> {
  const response = await classifier.generate(text, {
    system: 'Detect the language of this text. Respond with only the ISO 639-1 code (e.g., "pt", "en", "es").',
  });
  return response.text.trim().toLowerCase();
}

export async function main(input: any) {
  if (!input?.message) {
    return { error: 'message is required' };
  }

  identify(input.email || input.phone || 'anonymous');

  // Pre-process: detect language
  const language = await detectLanguage(input.message);

  const result = await supportAgent.process({
    message: input.message,
    sessionId: input.sessionId,
  });

  track('message_processed', { language });

  return { message: result.message, language };
}
```
## Real-World Example: Content Generation

Generate structured content without needing an agent:

```typescript
import { createTool, LLM } from '@runflow-ai/sdk';
import { z } from 'zod';

const writer = LLM.openai('gpt-4o', { temperature: 0.7 });

export const generateEmailTool = createTool({
  id: 'generate-email',
  description: 'Generate a professional email based on context',
  inputSchema: z.object({
    to: z.string().describe('Recipient name'),
    subject: z.string().describe('Email subject'),
    context: z.string().describe('What the email should communicate'),
    tone: z.enum(['formal', 'friendly', 'urgent']).describe('Email tone'),
  }),
  execute: async (params) => {
    try {
      const response = await writer.generate(
        `Write an email to ${params.to} about: ${params.context}`,
        {
          system: `You are a professional email writer.
Tone: ${params.tone}
Subject: ${params.subject}

Write the email body only (no subject line, no "From/To" headers).
Keep it concise — 2-3 paragraphs max.`,
        }
      );
      return {
        success: true,
        subject: params.subject,
        body: response.text,
        tokensUsed: response.usage?.totalTokens,
      };
    } catch (error) {
      return {
        success: false,
        error: error instanceof Error ? error.message : 'Failed to generate email',
      };
    }
  },
});
```
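Provider calls can fail transiently (rate limits, timeouts), so rather than returning `success: false` on the first error, you may want to retry. A sketch of an exponential-backoff wrapper you could put around `writer.generate` (the `withRetry` helper is illustrative, not part of the SDK):

```typescript
// Retry an async operation with exponential backoff: baseMs, 2*baseMs,
// 4*baseMs, ... between attempts; rethrow the last error if all fail.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

Usage inside the tool's `execute` would look like `await withRetry(() => writer.generate(prompt, options))`; in a real deployment you would retry only on errors known to be transient.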
## Next Steps

- **Agents**: when you need memory and tools
- **Tools**: use LLM inside tools
- **Media Processing**: process audio and images
- **Best Practices**: tips for effective agents