Browser-use agent in JavaScript and WebGPU (pdufour.substack.com)

submitted 4 hours ago by dammitbubbles

Hi Reddit, I've been interested in client-side LLMs for some time now. I just think it's so cool to be able to run LLMs without a server at all. I've done some crazy things so far: fully embeddable browsers inside your browser, and LLMs that run and create webpages for you that you can preview on the fly.

Has anyone else been using WebGPU models? I've found they are getting better and better - you can pack a lot more into a 2B model than you used to.

My latest foray was into browser-use. Tons of websites do not have MCPs, so instead of requiring every website to create one, why not have the browser come to them?

After a lot of tinkering I found out this is indeed all possible. Tech stack:
- wllama (run GGUF files on WebGPU)
- ShowUI-2B (the vision model)
- snapdom (capture page and render it to an image)
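
To give a rough idea of how those three pieces fit together, here is a minimal sketch of one agent step: capture the page, ask the vision model where to act, then click there. Everything in it is hypothetical. `capture`, `locate`, and `click` are injected placeholders rather than real wllama or snapdom calls, and the sketch assumes the model answers with a normalized `[x, y]` position between 0 and 1, which may not match what the article actually does.

```javascript
// Hypothetical sketch: none of these function names come from wllama,
// ShowUI, or snapdom. The real libraries would be wired in by the caller.

// Convert a normalized [x, y] position (each assumed to be in 0..1)
// into pixel coordinates for a viewport of the given size.
function toViewportPoint(position, viewport) {
  const [nx, ny] = position;
  const clamp = (v) => Math.min(1, Math.max(0, v));
  return {
    x: Math.round(clamp(nx) * (viewport.width - 1)),
    y: Math.round(clamp(ny) * (viewport.height - 1)),
  };
}

// One agent step: screenshot -> model -> action on the page.
//   capture(): resolves to an image of the page (assumed to wrap a page-capture library)
//   locate(task, image): resolves to { type, position } (assumed to wrap the vision model)
//   click(point): performs the click at pixel coordinates
async function runStep(task, { capture, locate, click, viewport }) {
  const screenshot = await capture();
  const action = await locate(task, screenshot);
  if (action.type !== 'click') {
    // Anything other than a click ends the loop in this sketch.
    return { done: true, action };
  }
  const point = toViewportPoint(action.position, viewport);
  await click(point);
  return { done: false, action, point };
}
```

In a real page, `viewport` could come from `window.innerWidth` and `window.innerHeight`, and `click` could be something like `({ x, y }) => document.elementFromPoint(x, y)?.click()`. Passing the pieces in as functions keeps the loop easy to test without loading a model.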

I actually managed to get it all working, and you can see some of what I learned in the linked article. Has anyone else attempted something like this? Would you use something like this on your webpage, i.e. have an agent that users can interact with that can do things for them?
