
Multi-Modal (Images)

// Create an agent backed by a vision-capable model.
const agent = new Agent({
  name: 'Vision Agent',
  instructions: 'You can analyze images.',
  model: openai('gpt-4o'),
});

// Send one user message whose content combines a text part and an image part.
await agent.process({
  message: 'What is in this image?',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        {
          type: 'image_url',
          image_url: { url: 'https://example.com/image.jpg' },
        },
      ],
    },
  ],
});
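
When a request carries several images, or the image URL is computed at runtime, it can help to build the `content` array with a small helper. The sketch below is illustrative only: `buildImageMessage`, `toDataUrl`, and the type names are hypothetical and not part of the library. It reproduces the message shape from the example above (`type: 'text'` and `type: 'image_url'` parts). Whether the agent accepts base64 data URLs in `image_url.url` depends on the underlying model provider, so treat that part as an assumption.

```typescript
// Hypothetical types mirroring the message shape used in the example above.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image_url'; image_url: { url: string } };
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> };

// Hypothetical helper: builds one user message containing the prompt
// followed by one image part per URL, in the order given.
function buildImageMessage(prompt: string, imageUrls: string[]): UserMessage {
  const imageParts: ImagePart[] = imageUrls.map((url) => ({
    type: 'image_url',
    image_url: { url },
  }));
  return {
    role: 'user',
    content: [{ type: 'text', text: prompt }, ...imageParts],
  };
}

// Hypothetical helper: wraps already base64-encoded image data in a data URL.
// Assumption: the model provider accepts data URLs in place of https URLs.
function toDataUrl(base64: string, mimeType: string): string {
  return `data:${mimeType};base64,${base64}`;
}

// Builds the same message as the example above.
const message = buildImageMessage('What is in this image?', [
  'https://example.com/image.jpg',
]);
```

The resulting `message` can then be passed as the single entry of `messages`, exactly as in the call above.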

Next Steps