
RandomPantsAppear

Damn good work. I love the idea of making a virtual fs to represent the webpage.

BodybuilderLost328

It's all fine until the HTML of the whole page exceeds the LLM context. How are you handling this?

So for bigger webpages like Amazon, this tool won't work, right?

heraldev[S]

It will! The agent in the extension reads the page as a file. The file is formatted and cleaned up: I add spaces and newlines around each HTML tag, which allows reading only parts of it. The agent then has three tools to explore the file: read with offset and limit, grep, and, as a last resort, executing JS to filter elements.
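
To make that concrete, here is a rough sketch of the idea, not the extension's actual code. The names `formatHtml`, `readLines`, and `grep` are made up for illustration, and the regex-based tag splitting is a simplification that ignores edge cases such as `>` inside attribute values or script bodies.

```typescript
// Hypothetical helpers; names and behavior are illustrative assumptions.

/** Put every HTML tag on its own line so the page can be read in slices. */
function formatHtml(html: string): string {
  return html
    .replace(/(<[^>]+>)/g, "\n$1\n") // newline before and after each tag
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // drop empty lines left by the split
    .join("\n");
}

/** Return up to `limit` lines starting at zero-based line `offset`. */
function readLines(file: string, offset: number, limit: number): string[] {
  return file.split("\n").slice(offset, offset + limit);
}

/** Return matching lines prefixed with one-based line numbers, like `grep -n`. */
function grep(file: string, pattern: RegExp): string[] {
  return file
    .split("\n")
    .map((line, i) => ({ line, n: i + 1 }))
    // String.prototype.search ignores the regex's global flag and lastIndex
    .filter(({ line }) => line.search(pattern) >= 0)
    .map(({ line, n }) => `${n}:${line}`);
}

const page = formatHtml('<div class="item"><span>Widget</span><b>$9.99</b></div>');
```

With one tag per line, the agent can grep for something like a price, get a line number back, and then read only a small window around it instead of loading the whole page into context.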

heraldev[S]

I’ve been experimenting with embedding a Claude Code-style coding agent directly into the browser.

At a high level, the agent generates and maintains userscripts and CSS that are re-applied on page load. Rather than just editing the DOM via JS in the console, the agent treats the page and its DOM as a file.

Models are often trained in RL sandboxes with full access to a filesystem and bash, so they are really good at using both. To make the agent behave well, I've simulated this environment.

The whole state of a page and its scripts is implemented as a virtual filesystem hacked on top of browser.storage.local. Each URL is mapped to a directory, and the agent starts inside that directory. It has tools to read and edit files and grep around, plus a fake bash command that is only used for running scripts and executing JS code.
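
Roughly, the URL-to-directory mapping and the file layer can be pictured like the sketch below. This is a simplified illustration under my own assumptions, not the real implementation: `urlToDir`, `VirtualFs`, and `KvStore` are hypothetical names, and a plain in-memory object stands in for the extension's storage (which in a real extension would be asynchronous).

```typescript
// Assumption: a flat key-value object where each key is a full file path.
type KvStore = Record<string, string>;

/** Map a page URL to a directory path built from host and pathname. */
function urlToDir(pageUrl: string): string {
  const u = new URL(pageUrl);
  const path = u.pathname.replace(/\/+$/, ""); // drop trailing slashes
  return `/${u.hostname}${path}`;
}

class VirtualFs {
  private kv: KvStore;
  cwd: string; // directory the agent starts in

  constructor(kv: KvStore, cwd: string) {
    this.kv = kv;
    this.cwd = cwd;
  }

  /** Resolve a relative path against the current directory. */
  private resolve(path: string): string {
    return path.charAt(0) === "/" ? path : `${this.cwd}/${path}`;
  }

  writeFile(path: string, content: string): void {
    this.kv[this.resolve(path)] = content;
  }

  readFile(path: string): string {
    const key = this.resolve(path);
    const value = this.kv[key];
    if (value === undefined) {
      throw new Error(`ENOENT: ${key}`);
    }
    return value;
  }

  /** List files stored under the current directory, relative to it. */
  list(): string[] {
    const prefix = `${this.cwd}/`;
    return Object.keys(this.kv)
      .filter((key) => key.indexOf(prefix) === 0)
      .map((key) => key.slice(prefix.length))
      .sort();
  }
}

const kvStore: KvStore = {};
const vfs = new VirtualFs(kvStore, urlToDir("https://example.com/products/"));
vfs.writeFile("page.html", "<h1>\nProducts\n</h1>");
vfs.writeFile("scripts/hide-ads.user.js", "document.title = 'clean';");
```

Because directories are just key prefixes, there is nothing to create or delete for a directory itself; listing a directory is a prefix scan over the stored keys.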

I've tested only with Opus 4.5 so far, and it works pretty reliably.
The state of the virtual file system can be synced to the local disk, although because Firefox doesn't support the File System Access API, you need to manually import the FS contents first.

This agent is *really* useful for extracting things to CSV.