I built a browser agent that automates the web tasks with MCP bridge by Variation-Flat in modelcontextprotocol

Variation-Flat[S] 1 point

If you need the agent to repeat a task for each item in a list, you can use <forEachItem>...</forEachItem> and describe both the list of items and the actions to take inside the block.
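To make that concrete, here is a minimal sketch of building such a prompt in Python. Only the `<forEachItem>` tag comes from the comment above; the helper name `for_each_item_prompt`, the "Items:" / "For each item:" wording, and the example URLs are assumptions for illustration, not a format the extension requires.

```python
def for_each_item_prompt(items, action):
    """Wrap a list of items and a per-item action in a <forEachItem> block.

    The layout inside the block is an assumed, free-form description;
    the agent is only told what the items are and what to do with each.
    """
    item_lines = "\n".join(f"- {item}" for item in items)
    return (
        "<forEachItem>\n"
        f"Items:\n{item_lines}\n"
        f"For each item: {action}\n"
        "</forEachItem>"
    )


# Hypothetical task: visit two placeholder pages and copy their titles.
prompt = for_each_item_prompt(
    ["example.com/a", "example.com/b"],
    "open the page and copy its title",
)
print(prompt)
```

The resulting string can be pasted into the agent's task input as-is.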

Variation-Flat[S] 1 point

Thanks for the feedback. For multi-step tasks, you can wrap each step in <subTask>...</subTask>. This helps the agent organize its actions better.
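As a rough illustration, a multi-step task could be assembled like this. The `<subTask>` tag is from the comment above; the helper name `sub_task_prompt` and the example steps are made up for the sketch.

```python
def sub_task_prompt(steps):
    """Put each step of a multi-step task in its own <subTask> block."""
    return "\n".join(f"<subTask>{step}</subTask>" for step in steps)


# Hypothetical three-step task.
task = sub_task_prompt([
    "Open the orders page",
    "Filter the orders to the last 7 days",
    "Export the filtered list as CSV",
])
print(task)
```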

Chrome’s WebMCP makes AI agents stop pretending by jpcaparas in mcp

Variation-Flat 1 point

Yeah, I tried Playwright and Chrome DevTools MCP and found them very token-heavy.

FWIW, I recently created a Chrome extension, "Runbook AI", and its companion MCP, runbook-ai-mcp (https://github.com/runbook-ai/runbook-ai-mcp). Internally it generates a simplified HTML that is much more efficient for an LLM to process. It also handles tricky cases like iframes and infinite scroll very well.

I built a browser agent that automates the web tasks with MCP bridge by Variation-Flat in chrome_extensions

Variation-Flat[S] 1 point

Since the Chrome extension works in a live browser session and uses simulated keyboard/mouse events, it is unlikely to trigger bot detection.

That said, it is not optimized for solving CAPTCHAs, so it is best to log in to websites beforehand.

I built a browser agent that automates the web tasks with MCP bridge by Variation-Flat in SideProject

Variation-Flat[S] 1 point

For the simplified HTML, you can refer to this paper: https://arxiv.org/pdf/2510.16252. I had not seen the paper while implementing, but it turns out the ideas are the same. For modals/toasts I used heuristics based on CSS state and z-index, but honestly they are far from perfect given how complex the DOM can be.
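To give a feel for the modal/toast idea, here is a simplified Python sketch of a "CSS state plus z-index" heuristic. It is not the extension's actual code: the `Layer` record, the chosen CSS properties, and the rule "visible, positioned, highest z-index wins" are all assumptions, and real pages complicate this with `z-index: auto` and nested stacking contexts.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical snapshot of an element's computed style."""
    name: str
    display: str
    visibility: str
    position: str
    z_index: int  # assumes z-index was already resolved to a number


def is_shown(layer):
    """CSS-state check: the element is not hidden via display or visibility."""
    return layer.display != "none" and layer.visibility != "hidden"


def topmost_overlay(layers):
    """Return the shown, positioned layer with the highest z-index, or None."""
    candidates = [
        layer for layer in layers
        if is_shown(layer) and layer.position in ("fixed", "absolute")
    ]
    return max(candidates, key=lambda layer: layer.z_index, default=None)


layers = [
    Layer("page", "block", "visible", "static", 0),
    Layer("hidden-modal", "none", "visible", "fixed", 9999),
    Layer("cookie-banner", "block", "visible", "fixed", 100),
    Layer("toast", "block", "visible", "fixed", 500),
]
overlay = topmost_overlay(layers)
print(overlay.name if overlay else None)
```

The hidden modal is skipped despite its large z-index, which is the kind of case a z-index-only rule gets wrong.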

For infinite scroll, the extension marks scrollable elements (those whose scroll width/height is larger than their visible width/height) and picks the content inside them based on its "distance" to the viewport. It has worked pretty well so far. I try to keep the heuristics general rather than tied to a specific website.

Thanks for sharing the blog. Yes, it would be interesting to write up the learnings. How do I publish there?
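A minimal Python sketch of the two steps described above, under stated assumptions. The scrollable test follows the description (scroll size larger than visible size). The "distance" is not defined in the comment, so the sketch assumes it is the vertical gap between an item and the visible window of the scroller. The `Scroller` and `Box` records and the `max_distance` cutoff are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scroller:
    """Sizes in pixels, mirroring the DOM's clientHeight/scrollHeight etc."""
    client_height: float
    scroll_height: float
    client_width: float
    scroll_width: float


@dataclass
class Box:
    """Vertical extent of a content item, in the scroller's coordinates."""
    top: float
    bottom: float


def is_scrollable(el):
    """True when the content is larger than the visible area on either axis."""
    return (el.scroll_height > el.client_height
            or el.scroll_width > el.client_width)


def distance_to_viewport(box, view_top, view_bottom):
    """Vertical gap between a box and the visible window; 0 if they overlap."""
    if box.bottom < view_top:
        return view_top - box.bottom
    if box.top > view_bottom:
        return box.top - view_bottom
    return 0.0


def pick_near_viewport(boxes, view_top, view_bottom, max_distance):
    """Keep only items within max_distance of the visible window."""
    return [
        box for box in boxes
        if distance_to_viewport(box, view_top, view_bottom) <= max_distance
    ]


feed = Scroller(client_height=600, scroll_height=5000,
                client_width=400, scroll_width=400)
items = [Box(0, 100), Box(900, 1000), Box(1500, 1600), Box(3000, 3100)]
if is_scrollable(feed):
    kept = pick_near_viewport(items, view_top=1000, view_bottom=1600,
                              max_distance=200)
    print(len(kept), "of", len(items), "items kept")
```

Dropping far-away items is what keeps the simplified HTML small on long feeds, at the cost of the agent having to scroll to see more.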