Web automation has always promised to save us time and effort, but the reality is often a tangled mess of brittle scripts. Traditional tools force us to rely on fragile CSS selectors and XPath queries that break with the slightest website update. Building a simple workflow, like logging into a site and downloading a report, can turn into a maintenance nightmare. What if you could just describe what you wanted to do, and an AI agent would handle the rest?
At browse.do, we're turning that "what if" into a reality. Our AI-powered agent lets you transform complex browser actions into simple API calls. You provide a high-level objective, and our agent navigates, clicks, types, and extracts data just like a human would.
Today, we're moving beyond single commands to explore one of the most powerful features of browse.do: building robust, multi-step workflows.
Think about the classic approach to automating a multi-step task, like checking an order status on an e-commerce site. Your script would be a chain of selector lookups, explicit waits, and clicks, with each step depending on the one before it.
This entire chain is a house of cards. If a developer changes a class name from .btn-primary to .btn-main, your script breaks. If a new marketing modal pops up, your "wait" logic fails. This isn't automation; it's a constant, reactive chore.
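To make that fragility concrete, here is a toy model rather than a real browser script. It assumes a "page" is nothing more than the set of selectors that currently match something. Apart from `.btn-primary` and `.btn-main`, every selector name is hypothetical and chosen only for illustration.

```typescript
// Toy model: a "page" is just the set of CSS selectors that currently match an element.
type Page = Set<string>;

// Each step of a traditional script is pinned to one exact selector.
// All names except ".btn-primary" are hypothetical.
const orderStatusSteps: string[] = [
  "#login-email",
  "#login-password",
  ".btn-primary",
  "a.orders-link",
  ".order-status",
];

// Runs the steps in order and reports the first selector that no longer matches.
function runScript(page: Page, steps: string[]): { ok: boolean; failedAt?: string } {
  for (const selector of steps) {
    if (!page.has(selector)) {
      return { ok: false, failedAt: selector };
    }
  }
  return { ok: true };
}

// The page as it looked when the script was written.
const originalPage: Page = new Set(orderStatusSteps);

// The same page after a redesign renames a single class.
const redesignedPage: Page = new Set(
  orderStatusSteps.map((s) => (s === ".btn-primary" ? ".btn-main" : s)),
);

console.log(runScript(originalPage, orderStatusSteps));   // { ok: true }
console.log(runScript(redesignedPage, orderStatusSteps)); // { ok: false, failedAt: '.btn-primary' }
```

One renamed class stops the run at the third step, and nothing after it executes. A real selector-based script tends to fail the same way, just with a timeout instead of a tidy return value.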
With browse.do, you stop thinking in selectors and start thinking in objectives. Instead of providing a rigid set of instructions, you tell the AI a story about what you want to accomplish. The agent understands context, handles interruptions, and intelligently finds the elements it needs to complete the task.
Let's see how we can chain actions together into a single, powerful command.
Imagine you want to automate the process of getting a quote from a competitor's website. The workflow involves multiple steps: navigating to the pricing page, selecting a plan, filling out a form, and capturing the confirmation.
With traditional tools, this is a complex, multi-part script. With browse.do, it's a single objective.
Here’s how you'd do it with our API:
import { browse } from "@do-inc/agents";

async function getSaaSQuote() {
  const objective = `
    Go to the pricing page, find the "Enterprise" plan, and click
    the button to get a quote. On the contact form, fill in the
    following details:
    - Company Name: ACME Industries
    - Number of Employees: 500+
    - Work Email: lead@acme-industries.com
    After submitting the form, find the confirmation message and
    return its text.
  `;

  const result = await browse.do({
    url: "https://fictional-saas-website.com",
    objective: objective,
  });

  // The AI agent returns structured data based on your request.
  console.log(result.data);
  // Expected output:
  // {
  //   "confirmationMessage": "Thank you for your request! Our sales team will be in touch shortly."
  // }

  return result.data;
}

getSaaSQuote();
That’s it. You described the entire multi-step workflow in plain English.
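If you run the same workflow for many leads, you may prefer to generate the objective from structured data instead of editing a string by hand. The sketch below is one way to do that. `buildQuoteObjective` and `FormFields` are hypothetical names of our own, not part of the browse.do API, and the function only assembles text. It never calls the agent.

```typescript
// Hypothetical helper (not part of the browse.do API): it only builds the objective string.
type FormFields = Record<string, string>;

function buildQuoteObjective(plan: string, fields: FormFields): string {
  // One "- Label: value" line per form field, in insertion order.
  const fieldLines = Object.entries(fields)
    .map(([label, value]) => `- ${label}: ${value}`)
    .join("\n");

  return [
    `Go to the pricing page, find the "${plan}" plan, and click the button to get a quote.`,
    "On the contact form, fill in the following details:",
    fieldLines,
    "After submitting the form, find the confirmation message and return its text.",
  ].join("\n");
}

const generatedObjective = buildQuoteObjective("Enterprise", {
  "Company Name": "ACME Industries",
  "Number of Employees": "500+",
  "Work Email": "lead@acme-industries.com",
});

console.log(generatedObjective);
```

The resulting string can be passed as the `objective` field in the same call shown above. Keeping the wording in one function also means a change to the instructions applies to every run.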
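When browse.do receives this objective, it doesn't just look for keywords. Our AI agent works through each step of the workflow (navigating to the pricing page, selecting the plan, filling out the form, and capturing the confirmation) in a full, stateful headless browser environment.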
By chaining actions within a single, descriptive objective, you're not just automating a task. You're encapsulating an entire human workflow in one API call that doesn't depend on selectors. We believe this is where Robotic Process Automation (RPA) and data extraction are headed.
Ready to stop wrestling with brittle selectors and start building powerful web automation that just works?