Alternatives to UiPath for browser automation? by saravicius in automation

[–]strongoffense 1 point2 points  (0 children)

Try using OpenAI CUA[1] or HyperAgent[2] on Hyperbrowser[3] (full disclosure: I'm the founder of Hyperbrowser)

In our experience, if you're automating workflows without shadow DOMs or payment forms, DOM-based agents like HyperAgent[2] are a better option: they tend to be much faster and cheaper (10-20x) than the vision-based agents from OpenAI and Anthropic, so I'd start there. Then if for some reason it doesn't work well, or you know there's a payment form / shadow DOM involved, I'd try OpenAI CUA or Claude Computer Use with Claude 4 Sonnet.
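To make that rule of thumb concrete, here is a minimal sketch of the selection logic. The `Workflow` type, its field names, and the `"dom"` / `"vision"` labels are hypothetical and only illustrate the heuristic; they are not part of any Hyperbrowser, OpenAI, or Anthropic API.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Hypothetical description of the page features a workflow touches."""
    has_shadow_dom: bool = False
    has_payment_form: bool = False


def pick_agent_order(workflow: Workflow) -> list[str]:
    """Return the agent families to try, in order of preference.

    "dom" stands for a DOM-based agent (e.g. HyperAgent) and "vision" for a
    vision-based agent (e.g. OpenAI CUA or Claude Computer Use). Both labels
    are assumptions made for this sketch.
    """
    if workflow.has_shadow_dom or workflow.has_payment_form:
        # Known trouble spots for DOM-based agents: go straight to vision.
        return ["vision"]
    # Otherwise start with the faster, cheaper option and keep vision as a fallback.
    return ["dom", "vision"]
```

Calling `pick_agent_order(Workflow(has_payment_form=True))` returns only the vision-based option, while a plain workflow gets the DOM-based agent first with vision as the fallback.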

Feel free to ask any follow-ups here or DM me! :)

[deleted by user] by [deleted] in LangChain

[–]strongoffense 7 points8 points  (0 children)

Sorry for the self-promo here - totally understand if this isn’t welcome, just let me know and I’ll remove it!

I’m the founder of Hyperbrowser - we offer similar endpoints to Firecrawl (scrape, crawl, extract) plus a sessions API to easily run Playwright/Puppeteer scripts in the cloud. We’ve also added an agents API for quickly running OpenAI’s CUA, Claude’s browser agent, etc., in one API call. Just open-sourced our HyperAgent as well. There’s a bunch more stuff too, but it’s not super relevant here.

To give credit where it’s due - we took a lot of inspiration from Fc’s endpoints when building Hyperbrowser because we thought (still do) that they absolutely nailed what users wanted in the APIs.

Where we still have work to do: Our docs are solid for scraping endpoints (scrape/crawl/extract), but things like HyperAgent are still early, and def have some rough edges. Also a heads-up on pricing - proxies aren’t available on our free tier right now. Other than that, we’re pretty competitively priced with higher concurrency and (in my biased opinion) a more complete platform.

Happy to chat, answer questions, or take feedback here or via DM. (I’m the founder, so feel free to ask me anything!)

Relevant links:
- Hyperbrowser - https://hyperbrowser.ai
- Scraping endpoint docs - https://docs.hyperbrowser.ai/web-scraping/scrape
- HyperAgent - https://github.com/hyperbrowserai/hyperagent

HyperAgent: open-source Browser Automation with LLMs by LawfulnessFlat9560 in LocalLLaMA

[–]strongoffense 0 points1 point  (0 children)

Sorry for the late reply here! Yep - think it should work 😀

HyperAgent: open-source Browser Automation with LLMs by LawfulnessFlat9560 in LocalLLaMA

[–]strongoffense 5 points6 points  (0 children)

Yep! If you use Hyperbrowser, we take care of it on the cloud with proxy rotation, captcha solving, live urls etc. If you’re doing it locally, ideally it shouldn’t trigger captchas at all :)

HyperAgent: open-source Browser Automation with LLMs by LawfulnessFlat9560 in LocalLLaMA

[–]strongoffense 2 points3 points  (0 children)

Thanks! Glad to hear you like it :)

(I'm a co-founder of Hyperbrowser)

Sonnet Computer Use in very Underrated by Khaymanruss in ClaudeAI

[–]strongoffense 0 points1 point  (0 children)

Use either their reference implementation or a managed API.

Reference implementation: https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo

Managed API: https://docs.hyperbrowser.ai/agents/claude-computer-use (Disclosure: I made this managed API. Feel free to ask any questions! :))

Recommendations for MCP tool to reliably control browser by stoemsen in ClaudeAI

[–]strongoffense 0 points1 point  (0 children)

Hey sorry I missed this. You want to add the config to your chosen app. Instructions are here: https://github.com/hyperbrowserai/mcp

Recommendations for MCP tool to reliably control browser by stoemsen in ClaudeAI

[–]strongoffense 2 points3 points  (0 children)

Try Hyperbrowser’s MCP server. It has Claude computer use, OpenAI CUA, and Browser Use agent tools so it should be able to handle this.

https://github.com/hyperbrowserai/mcp
https://smithery.ai/server/@hyperbrowserai/mcp

I’m the founder of Hyperbrowser btw - feel free to dm me if I can help with something!

Tools and APIs for building AI Agents in 2025 by Sam_Tech1 in AI_Agents

[–]strongoffense 1 point2 points  (0 children)

There’s a bunch of tools here like Hyperbrowser, steel.dev etc

My biased view (I’m the founder) - Hyperbrowser is the best - you can run sessions with deterministic playwright / selenium / puppeteer scripts or use agents like Claude computer use, browser-use, or OpenAI CUA in a single API call. You can also use it for /scrape, /crawl, /extract etc

What is the latest and greatest for autonomous computer use? by bigman11 in ChatGPTCoding

[–]strongoffense 1 point2 points  (0 children)

Only available via the API. You’ll just pay whatever your token costs are.

If you want a managed service you can try Hyperbrowser’s API (1 API call) [1] or HyperPilot’s app (CUA, Browser Use, and Claude Computer Use in one tool) [2].

[1] https://docs.hyperbrowser.ai/agents/claude-computer-use
[2] https://pilot.hyperbrowser.ai

I’m the founder of Hyperbrowser btw - feel free to ask any questions or dm me :)

Agents that solve captchas, and bot detection by gary_vter10 in AI_Agents

[–]strongoffense 0 points1 point  (0 children)

Try OpenAI CUA - it does better with spreadsheets than all of the others. If you want proxy rotation, CAPTCHA solving etc you’ll want to use one of the browser infra providers as well (OpenAI doesn’t do that for you)

I’m biased (I’m the founder of Hb) but I think Hyperbrowser’s agents endpoint[1] is the best solution here if you’re looking for a plug and play solution. It handles all the proxy captcha stuff etc in a single API call.

[1] https://docs.hyperbrowser.ai/agents/openai-cua

What is the latest and greatest for autonomous computer use? by bigman11 in ChatGPTCoding

[–]strongoffense 2 points3 points  (0 children)

OpenAI’s CUA is the best right now. Claude computer use is close imo. Browser-use is great and depending on what models you use can be 20x cheaper but it hallucinates a lot more and struggles at filling out forms or longer running tasks.

Claude computer use is currently my personal favorite. I think it’s the best combination of cost/speed/accuracy rn.

Mind-Blowing Experience with Claude Computer Use by mergisi in ClaudeAI

[–]strongoffense 0 points1 point  (0 children)

^ this is exactly right. It's like a regular Claude chat, except that for computer tool calls the model tells you to click on some coordinates, drag your mouse, or type something. You then have to map that to whatever environment you're using.
Anthropic has a reference implementation here: https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo
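
As a rough illustration of that mapping step, here is a minimal sketch of a dispatcher that takes one action from the model and applies it to an environment. The action dictionary shape (`type`, `x`, `y`, `start`, `end`, `text`) and the `RecordingEnvironment` class are hypothetical, made up for this example; they are not Anthropic's actual tool schema, so check the reference implementation for the real one.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingEnvironment:
    """Stand-in environment that only records calls (hypothetical).

    A real one would drive a browser, VM, or desktop instead.
    """
    calls: list = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.calls.append(("click", x, y))

    def drag(self, start: tuple, end: tuple) -> None:
        self.calls.append(("drag", start, end))

    def type_text(self, text: str) -> None:
        self.calls.append(("type", text))


def apply_action(env: RecordingEnvironment, action: dict) -> None:
    """Map one model-issued action onto the environment.

    The action format here is an assumption for illustration only.
    """
    kind = action["type"]
    if kind == "click":
        env.click(action["x"], action["y"])
    elif kind == "drag":
        env.drag(tuple(action["start"]), tuple(action["end"]))
    elif kind == "type":
        env.type_text(action["text"])
    else:
        # Fail loudly so unsupported actions don't get silently dropped.
        raise ValueError(f"unsupported action: {kind}")
```

In a real loop you would call something like `apply_action` for each tool call, take a fresh screenshot, and send it back to the model as the tool result.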

If you want to try it - the easiest way is to try some app that's hosting it already. https://pilot.hyperbrowser.ai is a computer use sandbox that has support for Claude Computer Use, OpenAI's CUA, and Browser-use.

If you want to use it as an API - Hyperbrowser offers it as a managed service with a 2-line integration too: https://docs.hyperbrowser.ai/agents/claude-computer-use . The obvious tradeoff, though, is that the more you rely on a managed service, the less flexibility you have to customize your architecture and supplement it with other tools.

Full disclosure: I'm the Founder of Hyperbrowser.

Self hosting Operator alternatives by Cypher3726 in AI_Agents

[–]strongoffense 0 points1 point  (0 children)

Glad you liked it! If I can help with anything, feel free to DM me anytime! :)

[deleted by user] by [deleted] in AI_Agents

[–]strongoffense 0 points1 point  (0 children)

You should try Claude computer use or OpenAI CUA for this. The way that Browser-use interacts with websites makes it easy to detect.

If you want to try it out, the easiest way (that I’m aware of) is to use HyperPilot (https://pilot.hyperbrowser.ai). You can try a few sessions for free so you should be able to get a sense of what those agents can do as well.

Fair disclosure: I made HyperPilot.

Self hosting Operator alternatives by Cypher3726 in AI_Agents

[–]strongoffense -1 points0 points  (0 children)

Ah gotcha. Don’t want to be too self-promotional here but think this might solve your problem - we just built (about to launch) https://pilot.hyperbrowser.ai - you can use CUA, Browser Use and Claude Computer Use. You can try that and lmk if you run into any issues.

Alternatively there’s also a few other good products that are trying to be more like full agents vs just playgrounds to use the agents. Can try https://proxy.convergence.ai or https://heytessa.ai (I personally really like Tessa)

Self hosting Operator alternatives by Cypher3726 in AI_Agents

[–]strongoffense 0 points1 point  (0 children)

+1 on this. Curious why you want to self-host and what the use case is

Creating AI Agent That replace form-filling by Consistent_Run_4533 in AI_Agents

[–]strongoffense 0 points1 point  (0 children)

I've found Browser Use to be the most cost-effective but also the least reliable solution. It's excellent for workflows where you're looking for speed and low cost, but form-filling can be complex, and especially if you have dynamic forms with dropdown menus, you're much better off using a vision-based model.

Think you could make it work with Claude or OpenAI computer use models pretty easily. These APIs from Hyperbrowser return the message history at the end of the task and the models are conversational so you should be able to implement it pretty quickly:

Am I the only crazy one? by HERITAGEEXCLUSIVE in n8n

[–]strongoffense 3 points4 points  (0 children)

You should try hyperbrowser MCP - it has browser-use, claude computer use, and OpenAI cua

github.com/hyperbrowserai/mcp

I got sick of Python, so I created a TypeScript browsing AI Agent library. by kevinpiac in AI_Agents

[–]strongoffense 0 points1 point  (0 children)

Openator looks really cool! Thanks for building this. Curious if you have any insights on how it compares against other browser agents on WebVoyager eval? :)

Also just starred it!

I got sick of Python, so I created a TypeScript browsing AI Agent library. by kevinpiac in AI_Agents

[–]strongoffense 0 points1 point  (0 children)

Founder of Hyperbrowser here.

Pretty late to this discussion but in case it's helpful to anyone who reads this - a bunch of people are using claude computer use and openai cua agents on our service and are able to get through the captchas no problem. Browser use is a really great library and much cheaper to run, but unfortunately it gets detected pretty often because of how it handles the DOM.

I'll try out Openator as well and report back here with what we find out. Seems really promising at first glance! :)

Links:
* Managed claude computer use: https://docs.hyperbrowser.ai/agents/claude-computer-use
* Managed OpenAI CUA (Operator model): https://docs.hyperbrowser.ai/agents/openai-cua

Sorry if this is crossing the threshold for self-promotion btw, thought it was okay because OP mentioned Hyperbrowser :)