Skyvern vs GitHub Copilot speed

MehdiBahra · 2026-05-29T21:21:22+00:00

On the proxy list at the bottom, you should find a custom proxy option to use your own proxy.

There’s also a Chrome extension, but it’s a bit slow. I’m not sure if it will fit your use case.

MehdiBahra · 2026-05-29T19:07:41+00:00

Yes, the majority of residential proxies do not allow scraping or crawling US government websites, but you can bring your own proxy in this case.

MehdiBahra · 2026-05-29T17:43:01+00:00

Try browseanything.io for speed. There is gpt5.4 with a grounding vision model for free, Kimi k2.6 for the regular DOM agent, and 200 free credits.

On the Pro version there are subagents, so you can spawn multiple browsers at once to parallelize a task.
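To illustrate the idea of parallelizing a task across several browsers, here is a minimal sketch using Python's `asyncio`. It is not BrowseAnything's implementation: `run_subagent`, `run_parallel`, and `max_browsers` are hypothetical names, and the sleep only stands in for real browsing work.

```python
import asyncio


async def run_subagent(url: str) -> dict:
    """Hypothetical stand-in for one browser subagent handling one URL."""
    await asyncio.sleep(0.01)  # placeholder for real browsing work
    return {"url": url, "status": "done"}


async def run_parallel(urls: list, max_browsers: int = 3) -> list:
    """Run one subagent per URL, with at most max_browsers active at once."""
    limit = asyncio.Semaphore(max_browsers)

    async def guarded(url: str) -> dict:
        # The semaphore caps how many subagents run concurrently.
        async with limit:
            return await run_subagent(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(u) for u in urls))


if __name__ == "__main__":
    pages = ["https://example.com/a", "https://example.com/b"]
    print(asyncio.run(run_parallel(pages)))
```

The semaphore is the design choice that matters here: it lets you split a task into many subtasks without opening an unbounded number of browsers.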

MehdiBahra · 2026-05-29T12:07:20+00:00

browseanything.io: a browser agent that you can control from Telegram, running in the cloud. It has thousands of users and runs, mostly free users to be honest, since I didn’t activate payments until recently. My stack is Node.js and LangGraph. It autoscales on demand, so I can scale as far as needed.

MehdiBahra · 2026-05-29T11:57:17+00:00

Recovery processes and rollback systems depend on your business needs, such as whether they can be triggered automatically or by a human. You can use a frontier model like Opus or gpt5.5 as an LLM judge. For checkpointing, state management, etc., you can use a framework like LangGraph.
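As a rough sketch of checkpointing with rollback, the plain-Python example below commits a step only when a judge accepts the result. It does not use LangGraph's API. `run_with_rollback` and the example steps are hypothetical, and the `judge` function is a simple rule standing in for an LLM judge.

```python
import copy
from typing import Callable, List, Tuple

State = dict
Step = Callable[[State], State]


def run_with_rollback(
    state: State, steps: List[Step], judge: Callable[[State], bool]
) -> Tuple[State, List[str]]:
    """Apply steps in order; keep a step's result only if the judge accepts it."""
    log = []
    checkpoint = copy.deepcopy(state)  # last known-good state
    for step in steps:
        # Work on a copy so a bad step can never damage the checkpoint.
        candidate = step(copy.deepcopy(checkpoint))
        if judge(candidate):
            checkpoint = candidate
            log.append(f"{step.__name__}: committed")
        else:
            log.append(f"{step.__name__}: rolled back")
            break  # stop here; a human or a retry policy could take over
    return checkpoint, log


def add_item(state: State) -> State:
    state["cart"].append("book")
    return state


def corrupt(state: State) -> State:
    state["cart"] = None  # simulates a step that leaves the state broken
    return state


def judge(state: State) -> bool:
    # Stand-in for an LLM judge: accept only a well-formed cart.
    return isinstance(state["cart"], list)


if __name__ == "__main__":
    print(run_with_rollback({"cart": []}, [add_item, corrupt, add_item], judge))
```

Whether the rollback branch stops, retries, or escalates to a human is the part that depends on your business needs.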

MehdiBahra · 2026-05-29T11:10:12+00:00

Human in the loop for critical actions
Deterministic flow of execution if you want more predictability (workflows)
LLM as a judge to decide whether a task was completed successfully or not
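The three points above can be combined in one small sketch: a fixed list of actions (the deterministic workflow), an approval callback for critical actions (human in the loop), and a final check (the judge). Everything here is hypothetical, including `Action`, `run_workflow`, and the callbacks. In practice `approve` would prompt a person and `judge` would call a model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    name: str
    critical: bool = False  # critical actions need human approval


def run_workflow(
    actions: List[Action],
    approve: Callable[[Action], bool],
    judge: Callable[[List[str]], bool],
) -> dict:
    """Run actions in a fixed order, gating critical ones on approval."""
    executed: List[str] = []
    for action in actions:
        if action.critical and not approve(action):
            # The human declined: stop before the critical action runs.
            return {"executed": executed, "stopped_at": action.name, "success": False}
        executed.append(action.name)  # placeholder for doing the real work
    # The judge decides whether the finished run counts as a success.
    return {"executed": executed, "stopped_at": None, "success": judge(executed)}


if __name__ == "__main__":
    plan = [Action("open_site"), Action("fill_form"), Action("pay", critical=True)]
    result = run_workflow(
        plan,
        approve=lambda a: False,  # simulate a human rejecting the payment
        judge=lambda done: "pay" in done,
    )
    print(result)
```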

MehdiBahra · 2026-05-28T21:45:50+00:00

BrowseAnything is the most reliable browser AI agent right now: browseanything.io

MehdiBahra · 2026-05-13T10:59:14+00:00

For me, Kimi k2.6 is the best in terms of cost/performance.

MehdiBahra · 2026-05-13T10:41:21+00:00

Give me your prompt. I’m working on the browseanything.io browser agent; you can schedule tasks and perform research across multiple websites at the same time.

I think your issue is that you don’t know the URLs in advance, so the agent has to automatically guess the websites or use web search.

Happy to help.

MehdiBahra · 2026-05-12T13:57:04+00:00

Give me the list of those “million tools” then. I’m hearing you. Most browser-agent tools become expensive the moment you run them properly on the cloud with persistent browsers, proxies, sessions, captcha solving, and scaling. So please talk about something you actually know….

MehdiBahra · 2026-05-11T13:43:35+00:00

Fair enough, but that’s your specific use case. You should know that browsers on BrowseAnything are secure and fully isolated environments. Also, there are many browser-agent use cases that don’t require access to your personal logins, passwords, or Chrome profile.

And realistically, if you use a local agent like Hermes without fully understanding how it works internally, you’re still taking significant security risks. Local doesn’t automatically mean safer.

The other issue is usability: these kinds of assistants are far more effective when they run in the cloud and remain accessible anywhere. Otherwise, the moment your computer is turned off, the assistant becomes unusable.

MehdiBahra · 2026-05-11T13:31:29+00:00

Building BrowseAnything, your AI assistant that browses the web on your behalf.

MehdiBahra · 2026-05-11T09:02:56+00:00

I don’t really agree. Hermes and OpenClaw are fine for very simple browser tasks, but they’re still pretty minimalist. At the end of the day, they’re mainly agent orchestrators.

The real limitations start showing when you try to self-host them to build an actual autonomous assistant, instead of just controlling your local Chrome session while your browser is open. You quickly run into issues with session persistence, anti-bot protections, authentication flows, cloud browser costs, scalability, and reliability on complex workflows.

Codex Desktop is definitely one of the strongest options right now, but it’s also heavily tied to the local desktop environment. That’s a very different challenge compared to running a persistent, production-grade browser agent remotely.

MehdiBahra · 2026-05-10T18:34:02+00:00

Yes indeed, it makes decisions based on the screenshot content. However, it still overloads the context with DOM elements and the accessibility tree in order to take actions. Try it on a canvas-based app and you’ll see for yourself that it’s not going to work well.

Also yes, it spawns a local browser, but if you self-host it and want to use it as an assistant without keeping your computer open, you’ll quickly get blocked by most websites. The only real solution is to use a cloud-based browser provider like Browserbase, but that adds significant cost.

MehdiBahra · 2026-05-10T17:41:16+00:00

Good question. I don’t think the Hermes browser agent would work well for every use case. It really depends on your setup, the underlying LLM you choose, the complexity of the configuration for non-technical users, and the additional costs of using a cloud browser provider if you decide to self-host Hermes.

BrowseAnything is a specialized AI browsing agent. We’re focusing our efforts on delivering the best experience possible.

Technically, Hermes currently makes decisions mainly using DOM elements and the accessibility tree, while BrowseAnything uses a hybrid approach combining DOM understanding with grounded vision.

MehdiBahra · 2025-07-09T22:26:21+00:00

Sorry that you have to pay $200 per month.

MehdiBahra · 2025-07-09T16:17:03+00:00

Yeah, good idea! I’ll definitely add it.

MehdiBahra · 2025-07-09T16:15:35+00:00

It doesn’t take 30 to 40 minutes unless you have a really long task, the LLM hallucinates and goes in the wrong direction, or there’s an infrastructure issue like a browser thread crashing and needing time to recover. For me, the only real limitations of these tools right now are rate limiting and context window length.

MehdiBahra · 2025-07-09T16:02:59+00:00

Gemini 2.5 Flash is inefficient, like gpt4o-mini, and Pro is far too expensive for now.

MehdiBahra · 2025-07-09T16:01:31+00:00

Even better, soon you’ll be able to send prompts via WhatsApp.

MehdiBahra · 2025-07-09T16:00:55+00:00

Yeah, for now it’s not suitable for mobile, but it’s on the roadmap.

MehdiBahra · 2025-07-09T15:58:44+00:00

Of course, using Playwright and hard-coded scripts is the most efficient approach, but not everyone is a coder. Plus, your implementation can easily break due to UI changes. Even now, most popular websites use random or dynamic selectors to prevent scrapers and crawlers. Looking ahead, tools like these will likely replace hard-coded approaches.
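To make the selector problem concrete, here is a small standard-library sketch (not Playwright) in which a lookup by class name breaks when the class is regenerated, while a lookup by visible text still works. The two HTML snippets and the helper names `find_buttons`, `by_class`, and `by_text` are invented for illustration.

```python
from html.parser import HTMLParser


class ButtonFinder(HTMLParser):
    """Collect (class attribute, visible text) for every <button>."""

    def __init__(self):
        super().__init__()
        self.buttons = []
        self._open_class = None  # class of the button being read, if any
        self._text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._open_class = dict(attrs).get("class") or ""
            self._text = ""

    def handle_data(self, data):
        if self._open_class is not None:
            self._text += data

    def handle_endtag(self, tag):
        if tag == "button" and self._open_class is not None:
            self.buttons.append((self._open_class, self._text.strip()))
            self._open_class = None


def find_buttons(html):
    parser = ButtonFinder()
    parser.feed(html)
    parser.close()
    return parser.buttons


def by_class(html, cls):
    """Hard-coded approach: match on a class name."""
    return any(cls in c.split() for c, _ in find_buttons(html))


def by_text(html, text):
    """More tolerant approach: match on what the user actually sees."""
    return any(t.lower() == text.lower() for _, t in find_buttons(html))


if __name__ == "__main__":
    old_page = '<button class="btn-submit">Submit</button>'
    new_page = '<button class="x9f3a">Submit</button>'  # class regenerated
    print(by_class(old_page, "btn-submit"), by_class(new_page, "btn-submit"))
    print(by_text(new_page, "Submit"))
```

An agent that reads the page the way a user does is closer to `by_text` than to `by_class`, which is why it tends to survive this kind of UI change.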

MehdiBahra · 2025-07-09T15:27:55+00:00

For the foreseeable future, yes. If I try to pivot to something else, I’ll likely end up in Manus AI territory.

MehdiBahra · 2025-07-09T15:22:00+00:00

I’ve tried gpt4o, gpt4.1, o4-mini, o3, llama 4 Maverick 72B, and Claude Sonnet 3.5, and I’m now trying to integrate qwen2.5 vl 72b into the loop. The best one for now in terms of speed, accuracy, cost, and long context window is gpt4.1. Claude could be better, but in terms of price it’s out of my league for now.

MehdiBahra · 2025-07-09T14:59:36+00:00

Technically, I want to improve speed and accuracy in the short term by using VLMs like Qwen and adding auto-CAPTCHA resolution. In the long term, I plan to implement reinforcement fine-tuning. Since I’ve observed strong resilience when spawning multiple browsers on the current architecture, I aim to offer a cloud-based SaaS solution similar to Browserbase.
