RPA (Robotic Process Automation) lets your agents navigate websites, fill forms, click buttons, extract data, and download files using a real browser. It is built on top of Playwright.

Installation

Playwright is an optional peer dependency. Install it alongside the browser binaries:
npm install playwright
npx playwright install chromium
Or use the CLI shortcut:
rf rpa install
Check your setup:
rf rpa status

Quick Start

The simplest way to use RPA is createBrowserTool, which manages the browser lifecycle automatically.
import { createBrowserTool } from '@runflow-ai/sdk/rpa';
import { z } from 'zod';

const scrapeTool = createBrowserTool({
  id: 'scrape-products',
  description: 'Scrape product listings from a website',
  inputSchema: z.object({
    url: z.string().url(),
  }),
  browser: {
    headless: true,
    screenshotsDir: './screenshots',
  },
  execute: async ({ context, browser }) => {
    const page = browser.page;
    await page.goto(context.url);
    
    const products = await page.$$eval('.product', (els) =>
      els.map((el) => ({
        name: el.querySelector('.name')?.textContent?.trim(),
        price: el.querySelector('.price')?.textContent?.trim(),
      }))
    );

    await browser.screenshot('products-page');
    return { products };
  },
});
Then add it to your agent:
import { Agent } from '@runflow-ai/sdk';
import { openai } from '@runflow-ai/sdk/models';

const agent = new Agent({
  name: 'scraper',
  instructions: 'You scrape product data from websites.',
  model: openai('gpt-4o'),
  tools: { scrapeTool },
});

createBrowserTool

Factory function that wraps your browser logic into a Runflow tool with automatic lifecycle management.
import { createBrowserTool } from '@runflow-ai/sdk/rpa';
import { z } from 'zod';

const myTool = createBrowserTool({
  id: 'tool-id',
  description: 'What this tool does (shown to LLM)',
  inputSchema: z.object({ /* ... */ }),
  outputSchema: z.object({ /* ... */ }),  // optional
  browser: {
    headless: true,
    viewport: { width: 1440, height: 900 },
    timeout: 30000,
    screenshotsDir: './screenshots',
  },
  execute: async ({ context, browser, projectId, companyId, userId, sessionId }) => {
    const page = browser.page;
    // your automation logic
    return { /* result */ };
  },
});
What it handles for you:
  • Launches the browser before your execute runs
  • Closes the browser afterward, even if execute throws
  • Takes an error screenshot automatically if screenshotsDir is configured
  • Attaches RPA trace data to the output for observability
  • Validates input/output with Zod schemas
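
The lifecycle guarantees in the list above can be pictured as a try/catch/finally wrapper. The following is a minimal sketch, not the SDK's implementation: SessionLike, FakeSession, and withBrowser are hypothetical names, and the in-memory session only records the order of calls.

```typescript
// Hypothetical minimal session shape; the real BrowserSession has more members.
interface SessionLike {
  launch(): Promise<void>;
  screenshot(name: string): Promise<string>;
  close(): Promise<void>;
}

// Illustrative wrapper: launch, run the callback, screenshot on error, always close.
async function withBrowser<T>(
  session: SessionLike,
  run: (session: SessionLike) => Promise<T>,
): Promise<T> {
  await session.launch();
  try {
    return await run(session);
  } catch (err) {
    // Best-effort error screenshot; a failure here must not mask the original error.
    await session.screenshot('error').catch(() => undefined);
    throw err;
  } finally {
    // Runs on both the success path and the failure path.
    await session.close();
  }
}

// In-memory stand-in that records the calls it receives, so the order is observable.
class FakeSession implements SessionLike {
  readonly calls: string[] = [];
  async launch(): Promise<void> {
    this.calls.push('launch');
  }
  async screenshot(name: string): Promise<string> {
    this.calls.push(`screenshot:${name}`);
    return `./screenshots/${name}.png`;
  }
  async close(): Promise<void> {
    this.calls.push('close');
  }
}
```

Putting close in finally rather than after the callback is what makes the "even on errors" guarantee hold: the browser process is released whether execute resolves or throws.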

Browser Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| headless | boolean | true | Run browser without visible window |
| viewport | { width, height } | 1440x900 | Browser viewport size |
| timeout | number | 30000 | Default timeout in milliseconds |
| acceptDownloads | boolean | true | Allow file downloads |
| slowMo | number | - | Slow down actions by N ms (useful for debugging) |
| screenshotsDir | string | - | Directory to save screenshots |
| userAgent | string | - | Custom user agent string |
| locale | string | - | Browser locale (e.g., pt-BR) |
| timezoneId | string | - | Timezone (e.g., America/Sao_Paulo) |
| extraHTTPHeaders | Record<string, string> | - | Custom HTTP headers |
| launchArgs | string[] | - | Extra Chromium launch arguments |
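
One way to read the Default column is as a base object that your own options are merged over. The sketch below assumes a shallow merge; BrowserConfigSketch and resolveBrowserConfig are hypothetical names, and the SDK's actual merge behavior is not documented here.

```typescript
// Subset of the options from the table above; the type name is hypothetical.
interface BrowserConfigSketch {
  headless: boolean;
  viewport: { width: number; height: number };
  timeout: number;
  acceptDownloads: boolean;
  slowMo?: number;
  screenshotsDir?: string;
}

// Defaults copied from the table above.
const DEFAULTS: BrowserConfigSketch = {
  headless: true,
  viewport: { width: 1440, height: 900 },
  timeout: 30000,
  acceptDownloads: true,
};

// Shallow merge: any option the caller sets replaces the default as a whole.
function resolveBrowserConfig(
  overrides: Partial<BrowserConfigSketch> = {},
): BrowserConfigSketch {
  return { ...DEFAULTS, ...overrides };
}

const debugConfig = resolveBrowserConfig({ headless: false, slowMo: 500 });
// debugConfig.headless is false; debugConfig.timeout is still 30000
```

Under this assumption, passing a viewport replaces both width and height together, so set both values when overriding it.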

BrowserSession

For more control, use BrowserSession directly. This is useful when you need multiple pages, custom lifecycle, or manual tracing.
import { BrowserSession } from '@runflow-ai/sdk/rpa';

const session = new BrowserSession({
  headless: true,
  viewport: { width: 1920, height: 1080 },
  screenshotsDir: './screenshots',
});

await session.launch();

const page = session.page;
await page.goto('https://example.com');

// Take a screenshot
await session.screenshot('home-page');

// Traced action (appears in observability)
const title = await session.traced('get-title', async () => {
  return page.title();
});

await session.close();

Key Methods

| Method | Description |
| --- | --- |
| launch(config?) | Start the browser |
| close() | Close browser and cleanup |
| screenshot(name) | Take a full-page screenshot, returns file path |
| waitForNavigation(urlPattern, timeout?) | Wait for URL to match a RegExp |
| waitForSelector(selector, timeout?) | Wait for a CSS selector to appear |
| getContent() | Get page HTML content |
| newPage() | Open a new page/tab |
| traced(action, fn, meta?) | Wrap an operation in a traced span |

Properties

| Property | Type | Description |
| --- | --- | --- |
| page | Page | Current Playwright page |
| isLaunched | boolean | Whether the browser is running |
| artifacts | BrowserSessionArtifact[] | Screenshots, downloads, PDFs created |
| actionSpans | BrowserActionSpan[] | All traced actions |
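
To picture how traced and actionSpans relate, the sketch below times an async operation and records one span per call. It is an illustration under stated assumptions, not the SDK's code: SpanRecorder and SpanSketch are hypothetical, and the clock is injectable only so the timing can be checked deterministically.

```typescript
// Hypothetical span shape; the real BrowserActionSpan may carry more fields.
interface SpanSketch {
  action: string;
  durationMs: number;
  ok: boolean;
}

// Hypothetical recorder that mimics a traced(action, fn) helper.
class SpanRecorder {
  readonly spans: SpanSketch[] = [];
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async traced<T>(action: string, fn: () => Promise<T>): Promise<T> {
    const start = this.now();
    let ok = false;
    try {
      const result = await fn();
      ok = true;
      return result;
    } finally {
      // A span is recorded whether fn resolved or threw.
      this.spans.push({ action, durationMs: this.now() - start, ok });
    }
  }
}
```

Because the wrapper returns whatever fn returns, wrapping a step in traced does not change how you consume its result.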

Observability

Every BrowserSession tracks actions and artifacts. Get a summary with:
const trace = session.getTraceSummary();
// {
//   totalActions: 5,
//   totalDurationMs: 3200,
//   actions: [{ action: 'login', durationMs: 1200, ... }, ...],
//   artifacts: [{ type: 'screenshot', name: 'home', path: '...', ... }],
//   errors: []
// }
When using createBrowserTool, this trace is automatically attached to the tool output as _rpaTrace.
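
The summary fields shown above can be derived from a list of spans. The sketch below is a hypothetical reconstruction for illustration: only totalActions, totalDurationMs, actions, and errors come from the example output, while ActionSpanSketch, its error field, and summarize are assumptions.

```typescript
// Hypothetical span shape; fields beyond `action` and `durationMs` are assumptions.
interface ActionSpanSketch {
  action: string;
  durationMs: number;
  error?: string;
}

interface TraceSummarySketch {
  totalActions: number;
  totalDurationMs: number;
  actions: ActionSpanSketch[];
  errors: string[];
}

// Aggregates spans into the counters shown in the example output.
function summarize(spans: ActionSpanSketch[]): TraceSummarySketch {
  return {
    totalActions: spans.length,
    totalDurationMs: spans.reduce((sum, s) => sum + s.durationMs, 0),
    actions: spans,
    errors: spans
      .filter((s) => s.error !== undefined)
      .map((s) => `${s.action}: ${s.error}`),
  };
}

const summary = summarize([
  { action: 'login', durationMs: 1200 },
  { action: 'search', durationMs: 800 },
  { action: 'extract', durationMs: 1200, error: 'table not found' },
]);
// summary.totalActions === 3, summary.totalDurationMs === 3200
```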

High-Level Actions

Helper functions for common browser patterns. Import from @runflow-ai/sdk/rpa.

login

Automate login flows with smart field detection.
import { login } from '@runflow-ai/sdk/rpa';

await login(page, {
  url: 'https://app.example.com/login',
  username: 'user@example.com',
  password: 'secret123',
  waitAfterLogin: /dashboard/,  // wait until URL matches
});
Options:
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | required | Login page URL |
| username | string | required | Username/email value |
| password | string | required | Password value |
| usernameSelector | string | auto-detect | CSS selector for username field |
| passwordSelector | string | auto-detect | CSS selector for password field |
| submitSelector | string | auto-detect | CSS selector for submit button |
| waitAfterLogin | RegExp \| string | - | URL pattern to wait for after login |
| timeout | number | 30000 | Timeout in ms |
Auto-detection works for most login pages. It finds the first text input for username, the password input, and the submit button by common labels (Login, Entrar, Sign In, etc.).
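
Label-based detection of the submit button can be sketched as a normalized text comparison. This is an assumption about the general approach, not the SDK's matching logic: SUBMIT_LABELS contains only the three labels named above, and looksLikeSubmit and findSubmitIndex are hypothetical helpers.

```typescript
// Labels taken from the text above; the real detection list may be longer.
const SUBMIT_LABELS = ['Login', 'Entrar', 'Sign In'];

// Case-insensitive, whitespace-tolerant match of a button's visible text.
function looksLikeSubmit(buttonText: string): boolean {
  const normalized = buttonText.trim().replace(/\s+/g, ' ').toLowerCase();
  return SUBMIT_LABELS.some((label) => normalized === label.toLowerCase());
}

// Index of the first candidate that looks like a submit button, or -1.
function findSubmitIndex(buttonTexts: string[]): number {
  return buttonTexts.findIndex(looksLikeSubmit);
}

const index = findSubmitIndex(['Forgot password?', '  sign   in ', 'Register']);
// index === 1
```

When a page uses a label outside the common set, or has several text inputs before the username field, pass usernameSelector, passwordSelector, and submitSelector explicitly instead of relying on detection.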

fillForm

Fill multiple form fields with flexible locators.
import { fillForm } from '@runflow-ai/sdk/rpa';

await fillForm(page, [
  { selector: '#name', value: 'John Doe' },
  { label: 'Email', value: 'john@example.com' },
  { role: 'combobox', label: 'Country', value: 'Brazil', type: 'select' },
  { selector: '#terms', value: 'true', type: 'check' },
]);
FormField options:
| Option | Type | Description |
| --- | --- | --- |
| selector | string | CSS selector |
| role | string | Aria role (textbox, combobox, checkbox, etc.) |
| label | string | Accessible label or placeholder |
| nth | number | Index when multiple elements match |
| value | string | Value to set |
| type | 'fill' \| 'select' \| 'check' \| 'uncheck' | Action type (default: fill) |
Locator priority: selector > role + label > role > label.
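
The priority order above can be expressed as a chain of checks. The sketch below only decides which strategy would apply to a field; FieldLocatorSketch and pickStrategy are hypothetical names, and how the SDK then builds the Playwright locator is not shown.

```typescript
// Hypothetical mirror of the FormField locator options.
interface FieldLocatorSketch {
  selector?: string;
  role?: string;
  label?: string;
}

type Strategy = 'selector' | 'role+label' | 'role' | 'label';

// Applies the documented order: selector > role + label > role > label.
function pickStrategy(field: FieldLocatorSketch): Strategy {
  if (field.selector) return 'selector';
  if (field.role && field.label) return 'role+label';
  if (field.role) return 'role';
  if (field.label) return 'label';
  throw new Error('Field needs a selector, role, or label');
}

const strategies = [
  pickStrategy({ selector: '#name', label: 'Name' }),
  pickStrategy({ role: 'combobox', label: 'Country' }),
  pickStrategy({ role: 'checkbox' }),
  pickStrategy({ label: 'Email' }),
];
// ['selector', 'role+label', 'role', 'label']
```

In practice this means a selector always wins: if a field sets both selector and label, the label is ignored for locating the element.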

clickButton

Click a button by its visible label.
import { clickButton } from '@runflow-ai/sdk/rpa';

await clickButton(page, 'Submit');

waitAndClick

Wait for an element to appear, then click it.
import { waitAndClick } from '@runflow-ai/sdk/rpa';

await waitAndClick(page, '.modal-confirm-button', 5000);

extractTable

Extract an HTML table into structured data.
import { extractTable } from '@runflow-ai/sdk/rpa';

const rows = await extractTable(page, 'table.results');
// [
//   { "Name": "Product A", "Price": "$10", "Stock": "42" },
//   { "Name": "Product B", "Price": "$25", "Stock": "7" },
// ]
Returns an array of objects where keys are column headers.
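
The header-keyed shape can be illustrated with a pure function that pairs each cell with its column header. This is a sketch of the data transformation only, with no browser involved; rowsToRecords is a hypothetical helper, and filling missing cells with empty strings is an assumption.

```typescript
// Turns a header row plus body rows into one record per body row.
function rowsToRecords(
  headers: string[],
  rows: string[][],
): Record<string, string>[] {
  return rows.map((cells) => {
    const record: Record<string, string> = {};
    headers.forEach((header, i) => {
      // Missing cells become empty strings so every record has every key.
      record[header] = (cells[i] ?? '').trim();
    });
    return record;
  });
}

const records = rowsToRecords(
  ['Name', 'Price', 'Stock'],
  [
    ['Product A', '$10', '42'],
    ['Product B', '$25'],
  ],
);
// records[1] is { Name: 'Product B', Price: '$25', Stock: '' }
```

All values are strings, as in the example output above, so convert numeric columns such as Stock with Number(...) before doing arithmetic on them.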

extractText

Extract text content from matching elements.
import { extractText } from '@runflow-ai/sdk/rpa';

const titles = await extractText(page, 'h2.title');
// ["First Title", "Second Title", "Third Title"]

downloadFile

Click an element to trigger a download and wait for it to complete.
import { downloadFile } from '@runflow-ai/sdk/rpa';

const filePath = await downloadFile(page, '#export-btn', './downloads');
// "./downloads/report.xlsx"

screenshotPage

Take a full-page screenshot.
import { screenshotPage } from '@runflow-ai/sdk/rpa';

const path = await screenshotPage(page, './screenshots/page.png');

waitForResponse

Wait for a network response matching a URL pattern.
import { waitForResponse } from '@runflow-ai/sdk/rpa';

const response = await waitForResponse(page, /api\/products/, 10000);

Full Example: CRM Login + Data Extraction

import { Agent } from '@runflow-ai/sdk';
import { openai } from '@runflow-ai/sdk/models';
import { createBrowserTool, login, extractTable } from '@runflow-ai/sdk/rpa';
import { z } from 'zod';

const crmScrapeTool = createBrowserTool({
  id: 'crm-contacts',
  description: 'Login to CRM and extract contact list',
  inputSchema: z.object({
    searchTerm: z.string().describe('Term to search in CRM'),
  }),
  browser: {
    headless: true,
    screenshotsDir: './screenshots',
    locale: 'pt-BR',
    timezoneId: 'America/Sao_Paulo',
  },
  execute: async ({ context, browser }) => {
    const page = browser.page;

    // 1. Login
    await browser.traced('login', () =>
      login(page, {
        url: 'https://crm.example.com/login',
        username: process.env.CRM_USER!,
        password: process.env.CRM_PASS!,
        waitAfterLogin: /contacts/,
      })
    );

    // 2. Search
    await browser.traced('search', async () => {
      await page.fill('#search-input', context.searchTerm);
      await page.click('#search-button');
      await page.waitForSelector('table.contacts');
    });

    // 3. Extract data
    const contacts = await browser.traced('extract', () =>
      extractTable(page, 'table.contacts')
    );

    await browser.screenshot('results');

    return { contacts, total: contacts.length };
  },
});

const agent = new Agent({
  name: 'crm-agent',
  instructions: 'You extract contact data from the CRM system.',
  model: openai('gpt-4o'),
  tools: { crmScrapeTool },
});

export default agent;

Agent-Level RPA Config

You can also configure RPA at the agent level:
const agent = new Agent({
  name: 'scraper',
  instructions: '...',
  model: openai('gpt-4o'),
  tools: { scrapeTool },
  rpa: {
    enabled: true,
    browser: {
      headless: true,
      viewport: { width: 1440, height: 900 },
    },
    screenshotOnError: true,
    artifactsDir: './rpa-artifacts',
  },
});
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable RPA capability for this agent |
| browser | BrowserSessionConfig | - | Default browser config for all tools |
| maxConcurrentPages | number | - | Limit concurrent browser pages |
| screenshotOnError | boolean | false | Auto-screenshot on errors |
| artifactsDir | string | - | Directory for all RPA artifacts |
Agents with RPA tools are automatically detected during deploy and receive the rpa capability flag. This routes them to RPA-enabled workers with Chromium pre-installed.

Debugging

Use slowMo and headless: false during development to watch the browser:
const tool = createBrowserTool({
  // ...
  browser: {
    headless: false,
    slowMo: 500,  // 500ms delay between actions
    screenshotsDir: './debug-screenshots',
  },
  // ...
});
Use rf test to run your agent locally with a visible browser.