Web automation has long been a powerful tool for developers, but it's often a double-edged sword. While it promises to streamline repetitive tasks, the reality has traditionally involved writing brittle scripts, wrestling with finicky CSS selectors, and performing constant maintenance as websites inevitably change their layout. What if you could bypass the fragile code and simply describe what you want to achieve in plain English?
Welcome to the new paradigm of web automation. With browse.do, you can build sophisticated automation and data extraction features directly into your application using an AI-powered agent that understands your objectives.
This guide will walk you through why and how you can integrate browse.do to create powerful, resilient, and scalable web automation services for your users.
Building a web automation feature from scratch means managing a complex infrastructure of headless browsers, proxies, and session handling. Tools like Puppeteer and Playwright are powerful, but they still require you to write code that's tightly coupled to a website's specific HTML structure.
browse.do offers a fundamentally different approach.
Let's look at a practical example. Imagine you want to get the title of the top story from Hacker News.
The Old Way (with a selector-based tool):
// This will break if the CSS class changes from 'titleline'
const puppeteer = require('puppeteer');

async function getTopStoryTheOldWay() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://news.ycombinator.com');

  const story = await page.evaluate(() => {
    const anchor = document.querySelector('.titleline > a');
    return {
      title: anchor.innerText,
      url: anchor.href
    };
  });

  await browser.close();
  return story;
}
This code is fragile: if the titleline class is ever renamed or the page structure shifts, querySelector returns null and the script throws.
The browse.do Way:
import { browse } from "@do-inc/agents";

async function getTopHackerNewsStory() {
  const result = await browse.do({
    url: "https://news.ycombinator.com",
    objective: "Find the title of the top story and its URL."
  });

  console.log(result.data);
  // Output: { "title": "...", "url": "..." }
  return result.data;
}

getTopHackerNewsStory();
Here, your instruction is the objective. The AI agent handles the navigation, element identification, and data extraction. It's concise, readable, and resilient.
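The result object also carries status information, which the server example later in this guide relies on. A slightly more defensive version of the call might look like the sketch below; treat the exact field names (status, data, error) as illustrative of that same shape:

// Defensive variant of the call above. The result fields (status, data,
// error) follow the shape used in the server example later in this guide.
import { browse } from "@do-inc/agents";

async function getTopStorySafely() {
  const result = await browse.do({
    url: "https://news.ycombinator.com",
    objective: "Find the title of the top story and its URL."
  });

  if (result.status !== "SUCCESS") {
    // Surface the agent's own error message instead of failing silently.
    throw new Error(`Automation failed: ${result.error}`);
  }

  return result.data; // e.g. { "title": "...", "url": "..." }
}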
Let's build a simple service that allows your users to extract information from any URL.
First, decide what you want to empower your users to do. The possibilities are nearly endless: extracting data from dynamic sites, filling out and submitting forms, monitoring competitor pages, running automated end-to-end tests, and more.
For our example, we'll build a service that lets a user provide a URL and a data extraction goal.
The browse.do SDK should be called from a secure backend server to protect your API key.
First, install the agent library:
npm install @do-inc/agents
Next, set your API key as an environment variable in your project. This is more secure than hardcoding it.
.env file:
BROWSE_DO_API_KEY=your_api_key_here
Now, create a backend endpoint that will receive requests from your application's frontend. We'll use a simple Express.js server for this example.
// server.js
const express = require('express');
const { browse } = require('@do-inc/agents');
require('dotenv').config();

const app = express();
app.use(express.json());

// Configure the agent with your API key
browse.config({ apiKey: process.env.BROWSE_DO_API_KEY });

app.post('/api/automate', async (req, res) => {
  const { url, objective } = req.body;

  if (!url || !objective) {
    return res.status(400).json({ error: 'URL and objective are required.' });
  }

  try {
    console.log(`Starting job for ${url} with objective: "${objective}"`);
    const result = await browse.do({ url, objective });

    if (result.status === "SUCCESS") {
      res.json(result.data);
    } else {
      res.status(500).json({ error: 'Automation failed.', details: result.error });
    }
  } catch (error) {
    console.error(error);
    res.status(500).json({ error: 'An unexpected error occurred.' });
  }
});

const PORT = process.env.PORT || 3001;
app.listen(PORT, () => console.log(`Server running on port ${PORT}`));
This code sets up a /api/automate route that validates the incoming url and objective, hands the job to the browse.do agent, and returns either the extracted data or a structured error as JSON.
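Before building a UI, you can smoke-test the endpoint from a second terminal. The sketch below assumes Node 18+ (for the built-in global fetch) and the server running locally on the default port from the example above:

// test-endpoint.js: a quick smoke test for the /api/automate route.
// Assumes Node 18+ (built-in fetch) and the server running on port 3001.
async function testAutomate() {
  const response = await fetch('http://localhost:3001/api/automate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      url: 'https://news.ycombinator.com',
      objective: 'Find the title of the top story and its URL.'
    })
  });
  console.log(response.status, await response.json());
}

testAutomate();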
On your frontend, create a form that allows users to input their target URL and objective. When submitted, this form will make a POST request to the backend endpoint you just created.
// Example frontend function
async function runAutomation() {
  const url = document.getElementById('url-input').value;
  const objective = document.getElementById('objective-input').value;

  const response = await fetch('/api/automate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, objective })
  });

  const data = await response.json();

  // Display the result to the user
  document.getElementById('results').textContent = JSON.stringify(data, null, 2);
}
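For completeness, here is a minimal markup sketch the function above can hook into; the element IDs match the ones referenced in the script:

<!-- Minimal form markup; IDs match those used in runAutomation(). -->
<input id="url-input" placeholder="https://example.com" />
<input id="objective-input" placeholder="What should the agent do?" />
<button onclick="runAutomation()">Run</button>
<pre id="results"></pre>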
With these few steps, you've successfully created a web automation service powered by browse.do. Your users can now perform complex web interactions through a simple interface in your application.
Q: What kind of web tasks can I automate with browse.do?
A: You can automate any task a human can perform in a browser, including data extraction from dynamic sites, filling out and submitting forms, monitoring competitor websites, and performing automated end-to-end testing.
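To make that concrete, each of those tasks boils down to a different objective string passed to the same browse.do call. The URLs and wording below are made-up examples, not a prescribed format:

// Illustrative objectives for the task types above. The URLs and the
// objective wording are made-up examples, not a prescribed format.
import { browse } from "@do-inc/agents";

async function examples() {
  // Data extraction from a dynamic pricing page:
  const plans = await browse.do({
    url: "https://example.com/pricing",
    objective: "Extract the name and monthly price of every plan."
  });

  // Filling out and submitting a form:
  await browse.do({
    url: "https://example.com/contact",
    objective: "Fill out the contact form as 'Jane Doe' and submit it."
  });

  return plans.data;
}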
Q: How does browse.do handle dynamic sites and logins?
A: Our AI-powered agent uses a full, headless browser environment capable of rendering JavaScript, handling cookies, and managing sessions. This allows it to interact with complex Single-Page Applications (SPAs) just like a real user.
Q: Can the agent perform multi-step actions on a website?
A: Yes. The agent is designed to understand context and make decisions. You can provide high-level objectives like "log in with username 'user@example.com' and password 'secure_password', then navigate to the dashboard," and it will find the necessary fields, input credentials, and handle the multi-step process seamlessly.
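In code, a multi-step objective is simply a longer instruction string passed to the same call. A minimal sketch, using the placeholder credentials from the answer above and an illustrative login URL (in practice, load real credentials from environment variables):

// The multi-step login objective from the answer above, as a single call.
// The credentials are placeholders and the URL is illustrative; load real
// credentials from environment variables rather than hardcoding them.
import { browse } from "@do-inc/agents";

async function logInAndOpenDashboard() {
  return browse.do({
    url: "https://example.com/login",
    objective: "Log in with username 'user@example.com' and password " +
      "'secure_password', then navigate to the dashboard."
  });
}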
Q: How is this different from traditional web scraping tools?
A: Instead of writing fragile CSS selectors or XPath queries, you simply describe what you want in natural language. The agent intelligently locates the correct elements, extracts the data, and returns it in a structured JSON format, adapting to minor UI changes automatically. This saves you countless hours of maintenance.
Ready to stop writing brittle scrapers and start building intelligent automations? Get your API key and explore the documentation at browse.do to integrate AI-powered web navigation into your application today.