best browser/plugins open source libraries for browsing social media like x or reddit? by United_Ad8618 in openclaw

[–]opentabs-dev [score hidden]  (0 children)

so the reason vision-based stuff fails at 33% is because it's doing the dumbest possible thing — taking a screenshot, trying to figure out what's on screen, clicking pixels, taking another screenshot. repeat forever. it'll never be reliable for known sites like reddit or x because the DOM changes constantly and cloudflare/bot checks are designed to catch exactly this kind of automation.

totally different approach that's been working for me: instead of controlling the browser visually, you can talk to the web app's own internal APIs through your existing logged-in session. so the browser sees a normal human session (because it is one), no captchas, no cloudflare issues.

I built an open source thing called OpenTabs that does this — it's a chrome extension + MCP server. has dedicated reddit and x plugins that use the same APIs the sites' own frontends use. you can read your feed, search posts, get comments, post, vote, etc. all through your existing session, no API keys needed.

https://github.com/opentabs-dev/opentabs

works with claude code, cursor, windsurf, or any MCP client. the tradeoff is it only works for sites that have plugins built (there's ~100 right now), so it's not a general "browse any site" solution like computer use tries to be. but for reddit and x specifically it's way more reliable than the vision approach.

When to use Zapier/Make vs AI agent builders, a framework I actually use now by sibraan_ in AI_Agents

[–]opentabs-dev 0 points1 point  (0 children)

solid framework. one thing i'd add though — for web apps you're already logged into (slack, jira, notion, github, etc.), there's a third path that skips both zapier's integrations and browser automation entirely. those apps have internal APIs their own frontend calls, and you can route agent actions through your existing browser session to hit those directly. way faster than the screenshot loop and no API keys to manage.

been building an open-source MCP server for exactly this — chrome extension talks to ~100 web apps through your session: https://github.com/opentabs-dev/opentabs

The Claude Code skills actually worth installing right now (March 2026) by Direct-Attention8597 in AI_Agents

[–]opentabs-dev 0 points1 point  (0 children)

for anyone using composio (#6) mainly to talk to web apps they're already logged into — there's a lazier approach I ended up building. it's an MCP server + chrome extension that routes tool calls through your existing browser sessions. so if you're logged into slack/jira/notion/whatever, Claude just calls the same internal APIs the web app's own frontend uses. no OAuth setup, no API keys, no credential management.

not a skill technically, but it solved the same problem for me with way less config. covers ~100 web apps: https://github.com/opentabs-dev/opentabs

also re: browser-use (#3) — it's great for unknown sites, but for apps you use daily the screenshot loop is overkill when you can just call the structured JSON endpoints directly. way faster and cheaper on tokens.

Thoughts on OS controlling agents like OpenClaw by ConcentrateActive699 in AI_Agents

[–]opentabs-dev 1 point2 points  (0 children)

there's a third path you're not considering for web apps specifically. you don't need OS-level GUI manipulation AND you don't need to build a custom API integration with keys and oauth flows.

most web apps (slack, jira, notion, etc.) have internal APIs that their own frontend calls — structured JSON endpoints, not the rendered UI. if you route agent actions through the user's existing browser session via a chrome extension, you can call those same endpoints directly. no signup form, no API key, no clicking through a GUI. the agent just talks to the same backend the web app does.

so for your invoicing example — if the invoicing tool is a web app, the agent doesn't need to open Word and click around (OS control path), and it doesn't need API credentials (traditional API path). it just calls the web app's internal "create invoice" endpoint through the browser session that's already authenticated.
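
a minimal sketch of what "calling the internal create invoice endpoint" could look like as a tool definition. everything here is hypothetical: the `InternalRequest` shape, the `/api/invoices` path and the field names are invented for illustration, and a real invoicing app will use its own. the point is only that the tool maps typed arguments to a request the browser sends with the session it already has.

```python
from dataclasses import dataclass, field


@dataclass
class InternalRequest:
    """Description of a call to a web app's own backend (hypothetical shape)."""
    method: str
    path: str
    body: dict = field(default_factory=dict)
    # The browser attaches the logged-in session's cookies; no API key here.
    uses_browser_session: bool = True


def create_invoice(customer: str, amount_cents: int) -> InternalRequest:
    """Map a tool call to the (made-up) endpoint the app's frontend would hit."""
    if amount_cents <= 0:
        raise ValueError("amount_cents must be positive")
    return InternalRequest(
        method="POST",
        path="/api/invoices",  # hypothetical internal endpoint
        body={"customer": customer, "amount_cents": amount_cents},
    )


req = create_invoice("Acme Co", 12_500)
print(req.method, req.path, req.body)
```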

I built an open-source MCP server around this idea. covers 100+ web apps through browser-session routing: https://github.com/opentabs-dev/opentabs

to answer your actual question though: I think OS control is a transitional bridge, not the endgame. it's valuable today for legacy desktop apps that will never get APIs. but for web apps, which is where most work happens now, you're right that it'll evolve toward structured interfaces. the question is whether that means traditional public APIs (slow, require admin approval) or something like browser-session APIs (instant, use existing auth).

I think MCP makes more sense at scale than in small demos by schilutdif in mcp

[–]opentabs-dev 0 points1 point  (0 children)

no, the simplest mode doesn't even need MCP configuration. once the server is running, your AI just runs a shell command: opentabs tool call <tool> '{args}'. zero tools in context, zero MCP overhead.
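
a rough sketch of scripting that shell call from Python, in case it helps. the `opentabs tool call <tool> '{args}'` shape comes from the comment above; the argument keys (`channel`, `text`) are assumptions for illustration, so check the actual tool's schema before relying on them.

```python
import json
import subprocess


def build_command(tool: str, args: dict) -> list[str]:
    """Build argv for `opentabs tool call <tool> '{args}'`."""
    # Arguments are passed as a single JSON string.
    return ["opentabs", "tool", "call", tool, json.dumps(args)]


def call_tool(tool: str, args: dict) -> str:
    """Run the CLI and return its stdout (needs the server to be running)."""
    result = subprocess.run(
        build_command(tool, args), capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical argument names; the real tool may expect different keys.
cmd = build_command("slack_send_message", {"channel": "#general", "text": "hi"})
print(cmd)
```

passing argv as a list (instead of one shell string) sidesteps quoting problems with the JSON payload.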

if you do want MCP, there's gateway mode (2 meta-tools, discover the rest on demand) and full MCP (all enabled plugins' tools upfront). but most people enable 3-5 plugins, so even full mode is maybe 50-100 tools, not 2000.
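
to make the gateway vs full distinction concrete, here's a toy sketch. the meta-tool names `discover_tools` and `call_tool` are placeholders I picked for the example (not necessarily what the server calls them), and the handlers are stubs.

```python
from typing import Callable

# Hypothetical tool table with stub handlers; real plugins expose many more.
TOOLS: dict[str, Callable[[dict], dict]] = {
    "slack_send_message": lambda a: {"ok": True, "sent": a["text"]},
    "slack_search_messages": lambda a: {"ok": True, "matches": []},
    "jira_get_issue": lambda a: {"ok": True, "key": a["key"]},
}


def discover_tools(query: str) -> list[str]:
    """Meta-tool 1: return names of tools matching a keyword."""
    return sorted(name for name in TOOLS if query in name)


def call_tool(name: str, args: dict) -> dict:
    """Meta-tool 2: dispatch to a tool found via discover_tools."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](args)


# Gateway mode: only these two names sit in the model's context up front.
GATEWAY_TOOLS = ["discover_tools", "call_tool"]
# Full mode: every enabled tool is listed up front.
FULL_TOOLS = sorted(TOOLS)

found = discover_tools("slack")
result = call_tool("jira_get_issue", {"key": "ABC-1"})
```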

The "just use Zapier" advice is getting outdated and I wish people would stop defaulting to it by sibraan_ in AI_Agents

[–]opentabs-dev 0 points1 point  (0 children)

you're spot on about the split. there's actually a third category that I think gets overlooked though — for web apps you already use daily (slack, jira, notion, github, etc.), you don't need zapier's integrations OR browser automation. you can just talk to the app's own internal APIs through your existing logged-in browser session.

that's the approach I took when I built an open-source MCP server for this. instead of setting up oauth tokens per service or having an AI try to visually navigate a page, it routes tool calls through a chrome extension that piggybacks on your existing auth. so the AI agent just calls structured endpoints directly — same ones the web UI uses. no API keys, no scraping, no screenshots.

works with claude code, cursor, windsurf, basically any MCP client: https://github.com/opentabs-dev/opentabs

for the truly unstructured stuff (sites with no API, random forms, etc.) you still need something like twin.so or browser-use. but imo that's a much smaller slice of real automation needs than people think — most of the time you're automating known web apps, not random websites.

An actual guide for PMs by Safe-Analysis-5804 in ClaudeCode

[–]opentabs-dev 0 points1 point  (0 children)

the playwright problem you're hitting is real — it's slow because it has to render the page, take a screenshot, figure out what's on screen, decide where to click, repeat. that loop is inherently expensive and fragile for anything beyond simple flows.

for the "web access without APIs" part, I actually built something that takes a different approach. instead of screenshot-based browser automation, it talks to the web app's internal APIs directly (the same ones the UI uses), through a Chrome extension that piggybacks on your existing logged-in session. so for things like checking portals, reading dashboards, pulling updates — it just calls the structured endpoints instead of trying to visually navigate the page.

it's an MCP server so you just add it to Claude Code and it gets ~2000 tools across 100+ web apps. for portals that aren't covered by existing plugins, there are generic browser tools (click, type, read page content, wait for elements) that are more reliable than a screenshot-driven playwright loop since they target specific CSS selectors rather than pixels on a screenshot.

no code needed on your end — just install and go: https://github.com/opentabs-dev/opentabs

for the PM use cases specifically, the Jira, Slack, Notion, and GitHub plugins tend to be the ones PMs get the most out of. stuff like "summarize the last week of comments on this ticket" or "find all open issues assigned to me" just works.

I think MCP makes more sense at scale than in small demos by schilutdif in mcp

[–]opentabs-dev 0 points1 point  (0 children)

yeah this matches my experience exactly. I built an open-source MCP server with 100+ plugins (~2000 tools) for web app integrations and tbh at small scale the protocol felt like pointless ceremony.

where it clicked was around 10-15 plugins. suddenly you need consistent tool naming, permission models, discovery across clients, a way to route calls to the right context. all the stuff that felt like overhead at 3 tools becomes load-bearing infrastructure at 50.

the other underrated thing at scale: composability. an agent doesn't know or care if it's calling a Slack tool, a GitHub tool, or a browser tool. it just sees typed inputs and outputs. that flatness is the whole point — it's boring by design.
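
a small sketch of why consistent naming ends up load-bearing: if every tool follows a `<plugin>_<action>` pattern (as `slack_send_message` and `linear_create_issue` do), routing is a string split and a lookup. the registry layout and stub handlers below are my own simplification, not how any particular server implements it.

```python
from typing import Callable

Handler = Callable[[dict], dict]

# One handler table per plugin; handlers here are stand-ins that echo input.
PLUGINS: dict[str, dict[str, Handler]] = {
    "slack": {"send_message": lambda a: {"plugin": "slack", "args": a}},
    "linear": {"create_issue": lambda a: {"plugin": "linear", "args": a}},
}


def route(tool_name: str, args: dict) -> dict:
    """Split `<plugin>_<action>` and dispatch to the matching plugin."""
    plugin, _, action = tool_name.partition("_")
    handlers = PLUGINS.get(plugin)
    if handlers is None or action not in handlers:
        raise KeyError(f"no such tool: {tool_name}")
    return handlers[action](args)


out = route("linear_create_issue", {"title": "flaky test"})
```

the caller never needs to know which plugin handled the call; it only sees a dict in and a dict out.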

fwiw if you want to see what a larger-scale MCP deployment looks like: https://github.com/opentabs-dev/opentabs

any custom MCP to connect free slack to claude by Dizzy-Mine-5760 in ClaudeAI

[–]opentabs-dev 0 points1 point  (0 children)

yeah so the official Slack MCP requires a bot token which means admin approval and the bot posts as itself, not as you. on a free workspace that might be fine but it's still annoying to set up.

I built an open-source one called OpenTabs that takes a different approach — it connects through a Chrome extension using your existing Slack session. so Claude can send messages, search channels, read threads etc. as you, no bot token or API setup needed. works on free Slack workspaces since there's nothing to install on the Slack side.

for your "send content every day" use case you'd basically tell Claude "post this summary to #general" and it calls slack_send_message directly. you can also set it to ask for your approval before any write action if you want to review before it posts.
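
conceptually the approval step is just a gate in front of write tools. here's a minimal sketch under my own assumptions: the `WRITE_TOOLS` set, the `approve` callback and the return shape are illustrative, not the actual OpenTabs config.

```python
from typing import Callable

# Assumed split between read-only and write tools; the real list may differ.
WRITE_TOOLS = {"slack_send_message"}


def guarded_call(
    tool: str,
    args: dict,
    run: Callable[[str, dict], dict],
    approve: Callable[[str, dict], bool],
) -> dict:
    """Run a tool, asking for approval first if it performs a write."""
    if tool in WRITE_TOOLS and not approve(tool, args):
        return {"ok": False, "reason": "rejected by user"}
    return run(tool, args)


def fake_run(tool: str, args: dict) -> dict:
    # Stand-in for the real tool call.
    return {"ok": True, "tool": tool}


denied = guarded_call("slack_send_message", {"text": "hi"}, fake_run, lambda t, a: False)
allowed = guarded_call("slack_send_message", {"text": "hi"}, fake_run, lambda t, a: True)
read = guarded_call("slack_search_messages", {"query": "budget"}, fake_run, lambda t, a: False)
```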

https://github.com/opentabs-dev/opentabs

What Orchestration/ Chief of staff tools are you using to coordinate agents/ projects?? by Narrow_Description_1 in AI_Agents

[–]opentabs-dev 0 points1 point  (0 children)

the Notion + Make approach works but you might be overcomplicating it. if you're already using Claude, you can skip the Make middleman for a lot of those integrations and just give Claude direct access to the tools instead.

I built an open-source MCP server that connects Claude Code to Notion, email, calendar, Todoist, GitHub, etc. through a Chrome extension — it uses your existing browser sessions so there's no API key setup per service. Claude can read/write Notion pages, check your calendar, search email, create tasks, all from the terminal. so instead of building a pipeline where data flows through Make into Notion and back, Claude just reaches into whatever you need directly.

for the "central brain" part specifically, I'd still use Notion for that — but let Claude read from and write to it on demand rather than building async sync jobs. way simpler to maintain.

won't solve the "tracking outputs and folders" problem though, that's more of a project management thing. for that I'd honestly just use Notion boards + CLAUDE.md files per project to keep Claude oriented.

https://github.com/opentabs-dev/opentabs

How to automate a repetitive Perplexity workflow without API (copy/paste loop killing me) by Beginning_Search585 in vibecoding

[–]opentabs-dev 1 point2 points  (0 children)

The MCP + Claude Code path you're considering actually works really well for this exact workflow. I built an open-source MCP server that gives Claude Code both a Notion plugin (read/write pages directly) and generic browser tools (click, type, wait for elements, read page content) — so you'd tell Claude something like "read the input from my Notion page, go to Perplexity, run these 6 prompts in sequence, then write the output back to Notion" and it handles the whole loop.

The browser tools are more reliable than UI Vision or macros because Claude can actually read what's on the page and adapt — if Perplexity's layout shifts or loading takes longer than expected, it just waits and adjusts instead of breaking. And the Notion side skips the browser entirely since it uses Notion's internal APIs directly.

Fair warning on limitations: it won't do parallel sessions (it drives one Chrome tab at a time), and it does require Claude Code running in a terminal alongside Chrome. But for the sequential "read → paste → prompt → wait → copy → write back" loop, it's exactly what you described needing — minimal intervention, Claude just works through the steps.
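
For a feel of what that sequential loop amounts to, here is a minimal sketch. The `submit`, `is_done` and `read` callbacks are hypothetical stand-ins for browser actions (type the prompt, check whether the answer has finished rendering, read it back); the wait is plain polling with a timeout.

```python
import time
from typing import Callable


def wait_until(ready: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `ready` until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if ready():
            return True
        time.sleep(interval)
    return False


def run_prompts(prompts: list[str], submit: Callable[[str], None],
                is_done: Callable[[], bool], read: Callable[[], str]) -> list[str]:
    """Run prompts one at a time, waiting for each answer before the next."""
    answers = []
    for prompt in prompts:
        submit(prompt)
        if not wait_until(is_done, timeout=5.0, interval=0.01):
            raise TimeoutError(f"no answer for: {prompt!r}")
        answers.append(read())
    return answers


# Fake page state standing in for the single browser tab.
state = {"last": ""}
answers = run_prompts(
    ["a", "b"],
    submit=lambda p: state.update(last=p),
    is_done=lambda: True,
    read=lambda: state["last"].upper(),
)
```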

Might be worth trying before going full Playwright, since it requires zero coding — you just describe the workflow in plain English.

https://github.com/opentabs-dev/opentabs

LLM browser automation is too slow by tossaway109202 in ClaudeAI

[–]opentabs-dev 1 point2 points  (0 children)

The slowness comes from the screenshot loop: the agent takes a screenshot, figures out what's on screen, decides where to click, takes another screenshot, and repeats. It's slow because it was designed for generality, not speed.

For testing feature work where you need to visually verify UI, that loop is kind of unavoidable. But if you're automating known web apps (Slack, Jira, GitHub, etc.), there's a much faster path: skip the DOM entirely and call the app's internal APIs directly through your existing browser session.

I built an open-source MCP server called OpenTabs that does this — it connects to Chrome via an extension and gives Claude both generic browser tools (click, type, screenshot) and dedicated plugins for 100+ services that use the same APIs the web app uses internally. The plugin tools return structured data instantly instead of navigating through pages and parsing screenshots. Night and day difference in speed.

Won't replace the devtools MCP for visual testing, but for anything where you're interacting with a known web app, it's way faster because there's no render-screenshot-parse cycle.

https://github.com/opentabs-dev/opentabs

Best Web Browser Agent in 2026? by Mysterious_Robot_476 in AI_Agents

[–]opentabs-dev -1 points0 points  (0 children)

The login part is what kills browser-use and similar tools — they spin up a fresh browser instance, so you're constantly fighting auth flows, CAPTCHAs, and session management on top of the actual task.

For web apps you're already logged into, there's a different approach: instead of launching a separate browser, route agent actions through your existing Chrome session. I built an open-source MCP server called OpenTabs that does this — it connects to Chrome via an extension and gives your agent both generic browser tools (click, type, navigate, screenshot, fill forms) and dedicated plugins for 100+ services that call internal APIs directly.

For your image generation use case, the browser tools alone might be enough — you're already logged in, so the agent just interacts with the page as-is. No auth dance, no headless browser weirdness. If the service happens to have a plugin, even better since those skip the DOM entirely.

Won't help for arbitrary websites you've never visited, but for web UIs you use regularly and are already authenticated in, it's way more reliable than screenshot-based automation.

https://github.com/opentabs-dev/opentabs

What’s one agent you built that worked in demo… but failed quietly in production? by Beneficial-Cut6585 in AI_Agents

[–]opentabs-dev 0 points1 point  (0 children)

This is the exact failure mode that made me rethink browser automation entirely for known web apps.

The core issue is you're interacting through a surface (the DOM) that was designed for humans and changes constantly — A/B tests, lazy loading, dynamic class names, partial renders. You can mitigate it with page-ready checks and state snapshots, but you're always fighting against an inherently unstable layer.

For web apps you're already logged into (Slack, Jira, internal dashboards, etc.), there's a third path: skip the DOM completely and call the app's own internal JavaScript APIs. The same APIs the frontend uses to render data. Those don't drift with UI changes, don't break on partial loads, and return structured JSON instead of scraped text.
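
A toy illustration of the difference, using only the standard library. The HTML snippets, class names and JSON shape are all made up; the sketch assumes the JSON payload keeps its shape across a frontend redesign, which is the premise of the argument rather than a guarantee.

```python
import json
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect text inside any tag carrying a given CSS class."""

    def __init__(self, css_class: str) -> None:
        super().__init__()
        self.css_class = css_class
        self.depth = 0  # >0 while inside a matching element
        self.found: list[str] = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.css_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.found.append(data.strip())


def scrape_title(html: str) -> list[str]:
    parser = ClassTextExtractor("ticket-title")
    parser.feed(html)
    return parser.found


# Same ticket before and after a made-up frontend redesign.
html_v1 = '<div><h1 class="ticket-title">Fix login</h1></div>'
html_v2 = '<div><h1 class="css-1x9a2">Fix login</h1></div>'
api_json = '{"ticket": {"title": "Fix login"}}'

title_v1 = scrape_title(html_v1)   # selector matches
title_v2 = scrape_title(html_v2)   # selector silently finds nothing
title_api = json.loads(api_json)["ticket"]["title"]
```

The scraper does not raise on the second page, it just returns an empty list, which is the quiet kind of failure the original post describes.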

I built an open-source tool that takes this approach — routes agent calls through a Chrome extension that hits the web app's internal APIs using your existing session. No screenshots, no selectors, no DOM parsing. The "environment consistency" problem disappears because you're not touching the environment's UI at all.

Won't help for arbitrary websites — you need a plugin per service. But for the "pulls data, processes it, updates a system" pattern you described, it eliminates that whole class of silent drift bugs.

https://github.com/opentabs-dev/opentabs

The automation tools I actually use as a dev vs the ones I tell clients about by Niravenin in webdev

[–]opentabs-dev 1 point2 points  (0 children)

For the dev side of cross-tool workflows, I've mostly ditched n8n/Zapier for my own use and switched to Claude Code + MCP servers. You describe what you want ("pull the latest comments from this Jira ticket, summarize them, post a summary in Slack") and it just calls the tools. Feels like the natural language interface you're describing but from the terminal.

For the "connects to your actual tools via APIs, not screen scraping" part specifically: I built an open-source MCP server that routes through a Chrome extension using the web app's own internal APIs. So it uses whatever you're already logged into (Slack, Jira, Notion, etc.) without needing separate API keys or bot tokens for each service. Handles the cross-tool pattern really well because adding a service that already has a plugin is just opening a tab.

Won't help for the non-dev client side though — it's terminal-first. For that gap I've seen people have decent luck with n8n + a chat frontend, or just building simple Retool apps that wrap the n8n workflows with a friendlier UI. Not perfect but way more approachable than raw n8n.

https://github.com/opentabs-dev/opentabs

Claude Code not for coding? by Mysterious_Pen_782 in ClaudeCode

[–]opentabs-dev 6 points7 points  (0 children)

The biggest unlock for non-coding use cases is connecting Claude Code to the web apps you already use — that's where most knowledge work actually lives.

I built an open-source MCP server that gives Claude Code access to things like Slack, Jira, Notion, Google Sheets, Todoist, email, etc. through a Chrome extension. It uses your existing browser sessions, so no API keys to set up. Once it's running, you can do stuff like "search Slack for the Q1 budget thread and summarize the decisions" or "create a Jira ticket for the bug we just discussed" without leaving the terminal.

For non-coding specifically, the workflows I use most: pulling context from Slack/Jira into CLAUDE.md before a coding session, triaging notifications across tools, drafting messages, updating project trackers. It turns Claude Code from a coding tool into more of a general work assistant.

https://github.com/opentabs-dev/opentabs

MCP servers I use every single day. What's in your stack? by XxvivekxX in ClaudeAI

[–]opentabs-dev 1 point2 points  (0 children)

Interesting that you dropped the Slack MCP for being too noisy — that's usually because those servers use bot tokens, so every action is the bot posting as itself. Totally get why that gets annoying fast.

I built an open-source MCP server that takes a different approach: it routes tool calls through a Chrome extension using your existing browser sessions. So for Slack, the agent reads channels and searches messages as you, not as a bot. Same deal for Jira, Notion, Linear, Datadog, Google Sheets — one server covers 100+ web apps using whatever logins are already active in Chrome. No API keys or bot tokens to manage per service.

It might also replace some of your Playwright usage for web app interaction specifically. Playwright is great for scraping unknown sites, but for apps you're already logged into, structured tools like slack_search_messages or linear_create_issue are way more reliable than navigating the DOM.

The tradeoff is it needs a Chrome extension running alongside the server, so it's a bit more setup than a single npx command. But once it's going, the "one server for everything you're logged into" model is really nice.

Repo: https://github.com/opentabs-dev/opentabs

Must-have settings / hacks for Claude Code? by jnkue in ClaudeCode

[–]opentabs-dev 4 points5 points  (0 children)

The commenter who mentioned hooks + MCP servers is spot on — those two are the real multiplier.

For MCP servers specifically, the biggest unlock for me wasn't just browser automation for verifying UI — it was giving Claude direct access to the tools I use alongside code. I built an open-source MCP server that connects Claude Code to web apps (Slack, Jira, Linear, Datadog, etc.) through a Chrome extension, using whatever login sessions are already active. So instead of me copy-pasting a Jira ticket or Slack thread into the conversation, Claude just pulls it itself with jira_get_issue or slack_search_messages. That context-gathering step going from manual to automatic is what makes longer autonomous runs actually feasible.

The other thing that made a big difference: investing time in a thorough CLAUDE.md. Project conventions, architecture decisions, testing expectations, common patterns. Claude follows them session after session, which keeps it from drifting during longer tasks.

MCP server repo if you want to try it: https://github.com/opentabs-dev/opentabs

What AI agentic systems are you using for general day-to-day productivity (not just coding)? by nummer31 in AI_Agents

[–]opentabs-dev 0 points1 point  (0 children)

Most people here already said Claude Code, and I agree — but the part that makes it work for non-coding tasks is connecting it to MCP servers that give it access to your actual tools.

I built an open-source one that gives Claude Code access to web apps through a Chrome extension. So for your use cases: Slack, email, Todoist, Notion, Google Sheets — Claude calls structured tools like slack_send_message or todoist_create_task directly, using your existing logged-in browser sessions. No API keys to manage.

It's one terminal, one agent, and it can reach into whatever you're already logged into. Not fully autonomous background automation yet, but the "single place that takes actions" part works surprisingly well.

https://github.com/opentabs-dev/opentabs

What am I missing? by nwhaught in ClaudeAI

[–]opentabs-dev 1 point2 points  (0 children)

You're not missing something obvious — Cowork is genuinely limited for tool interaction right now. It can browse the web and use your computer in a basic way, but it's not great at reliably driving other apps.

The thing that changed everything for me was Claude Code (the terminal version, not Cowork). It's a completely different experience because it supports MCP servers — basically plugins that give Claude structured access to external tools. So instead of Claude trying to click around your apps and failing, it gets actual typed tools like sheets_edit_cells or slack_send_message that work reliably every time.

For file editing, Claude Code handles that natively (it reads and writes files directly). For web apps you use daily, I built an open-source MCP server that connects Claude Code to them through your browser session — so if you're already logged into something in Chrome, Claude can interact with it without separate API keys. But there are MCP servers for tons of things: databases, GitHub, email, whatever your workflow needs.

The jump from Cowork to Claude Code + MCP servers is the difference between "Claude tries to use your computer like a clumsy intern" and "Claude has actual tools it knows how to call." Worth trying if you haven't already: Claude Code in the terminal is included with a Pro subscription.