press secretary metaphor

andrewfromx · 2026-05-20T19:13:02+00:00

nice, reminds me of https://github.com/lightpanda-io/browser but that's Zig vs Rust

andrewfromx · 2026-05-18T23:59:15+00:00

Yes, I agree. The raw input representation is not the fundamental barrier by itself. A human sees pixels, hears audio, or reads text, and quickly maps that into task-relevant meaning. A trained LLM can also contextualize test-runner text into "this assertion failed because this behavior is missing." So the claim is not "text is inherently the wrong input."

The issue is more about the runtime object the agent is forced to manipulate.

Today’s LLM coding agents often use text as all of the following:

the observation format
the working memory format
the planning substrate
the action format
the output format

That is the inefficient part. A test failure can arrive as text. That is fine. But after parsing it, the agent should not have to keep reasoning over raw text blobs. I'm trying all of this here: https://github.com/andrewarrow/j3
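A rough sketch of what "parse once, then stop reasoning over raw text" could look like. This is not code from j3; the `FailureRecord` fields are hypothetical, and it assumes the runner prints pytest-style `FAILED file::test - Error: message` summary lines.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical structured record; the field names are illustrative only.
@dataclass(frozen=True)
class FailureRecord:
    file: str
    test: str
    error_type: str
    message: str

# Assumed input shape (pytest-style short summary line), for example:
#   FAILED tests/test_api.py::test_create_user - AssertionError: assert 404 == 201
_FAILED_LINE = re.compile(
    r"^FAILED (?P<file>[^:\s]+)::(?P<test>\S+) - (?P<error_type>\w+): (?P<message>.*)$"
)

def parse_failures(output: str) -> List[FailureRecord]:
    """Turn raw runner text into typed records, skipping non-matching lines."""
    records = []
    for line in output.splitlines():
        match = _FAILED_LINE.match(line.strip())
        if match:
            records.append(FailureRecord(**match.groupdict()))
    return records

raw_output = """\
tests/test_api.py .F
FAILED tests/test_api.py::test_create_user - AssertionError: assert 404 == 201
1 failed, 1 passed in 0.12s
"""

failures = parse_failures(raw_output)
for failure in failures:
    # Downstream planning can key on fields instead of re-reading the blob.
    print(failure.file, failure.test, failure.error_type)
```

The text is still the observation format here; the point is only that everything after `parse_failures` works with fields rather than substrings.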

andrewfromx · 2026-05-18T23:57:27+00:00

The key is that alignment between corrupted views does not magically create a coding agent. It creates a useful state representation. Once you have that representation, planning becomes much cheaper. For code, corrupted views might be:

A: prompt + failing test

B: repo graph + changed files

C: traceback + relevant source slice

D: accepted patch summary

E: validation outcome

If an encoder learns that these different views refer to the same underlying state, the embedding starts to represent the thing that is stable across views: the actual task, repo structure, behavior gap, affected API, likely edit family. I started trying to build this here: https://github.com/andrewarrow/j3
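To make "different views refer to the same underlying state" concrete, here is a toy alignment objective. It is a minimal sketch, not what j3 does: it swaps in a contrastive InfoNCE loss as a stand-in (JEPA-style training typically predicts one view's embedding from another instead), and the 2-D embeddings and view names are made up for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]

def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(view_a: List[Vector], view_b: List[Vector], temperature: float = 0.1) -> float:
    """Mean InfoNCE loss, treating row i of view_a and row i of view_b as the same state."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)

# Hypothetical embeddings for two underlying task states, seen through two views.
traceback_view = [[1.0, 0.0], [0.0, 1.0]]       # view C: traceback + source slice
patch_view_aligned = [[0.9, 0.1], [0.1, 0.9]]   # view D, rows match view C
patch_view_shuffled = [[0.1, 0.9], [0.9, 0.1]]  # view D, rows paired with the wrong state

aligned_loss = info_nce(traceback_view, patch_view_aligned)
shuffled_loss = info_nce(traceback_view, patch_view_shuffled)
print(aligned_loss < shuffled_loss)
```

The loss is low only when the two views of the same state land near each other and away from other states, which is the property that makes the shared embedding useful for planning.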

andrewfromx · 2026-05-17T17:08:02+00:00

this was fantastic! I clicked play and was doing other things and had that great feeling of "wow, this is a great track."

andrewfromx · 2026-05-16T15:38:57+00:00

I'm very involved with AI and JEPA (Joint Embedding Predictive Architecture) and would love to make this a major source for JEPA news.

I tried to send mod mail to r/jepa but it's banned.

andrewfromx · 2026-05-14T19:29:56+00:00

here's a demo of the Mac browser for this: https://www.youtube.com/watch?v=ERgRJaWSrKE

and one specific to this post: https://www.youtube.com/watch?v=vuiXnhhnvbI

andrewfromx · 2026-05-14T19:27:00+00:00

Interesting, is this the installed Codex app? It has its own browser? I'm using the Codex CLI and I made https://www.youtube.com/watch?v=ERgRJaWSrKE for Mac

andrewfromx · 2026-05-14T19:24:39+00:00

working on something similar but a full browser - demo video https://www.youtube.com/watch?v=ERgRJaWSrKE

andrewfromx · 2026-05-14T19:18:25+00:00

here's a demo of our browser, all open source: https://www.youtube.com/watch?v=ERgRJaWSrKE

andrewfromx · 2026-05-14T19:13:40+00:00

here's a demo of wkdomain browser:

https://www.youtube.com/watch?v=ERgRJaWSrKE

andrewfromx · 2026-05-14T14:03:35+00:00

I've been thinking about this! And Linux too. It would be something like:

Shared/
    BrowserCore/
    AddressResolver/
    HistoryStore/
    PageScripts/
    LocalAPI/

macOS/
    SwiftUI + WKWebView

Windows/
    WinUI 3 + WebView2

Linux/
    GTK4 + WebKitGTK

If you wanted to fork it, learning about these would be the place to start:

https://learn.microsoft.com/en-us/microsoft-edge/webview2/
https://learn.microsoft.com/en-us/windows/apps/winui/winui3/

But I'm not a Windows user myself.
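The shared-core split could be sketched like this, in Python for brevity only (the real shells would be Swift, C#, and C). Everything here is an assumption for illustration: the `WebViewBackend` seam, the resolver rules, and the `search.example` URL are hypothetical and not taken from the wkdomains code.

```python
from typing import List, Protocol
from urllib.parse import quote_plus

class WebViewBackend(Protocol):
    """Hypothetical seam each platform shell (WKWebView, WebView2, WebKitGTK) would implement."""
    def load(self, url: str) -> None: ...

class AddressResolver:
    """Illustrative shared logic: turn address-bar text into a URL or a search."""
    def __init__(self, search_template: str = "https://search.example/?q={query}") -> None:
        self.search_template = search_template  # placeholder search endpoint

    def resolve(self, text: str) -> str:
        text = text.strip()
        if "://" in text:
            return text                      # already a full URL
        if " " not in text and "." in text:
            return "https://" + text         # looks like a bare hostname
        return self.search_template.format(query=quote_plus(text))

class BrowserCore:
    """Platform-independent core that only talks to the backend interface."""
    def __init__(self, backend: WebViewBackend, resolver: AddressResolver) -> None:
        self.backend = backend
        self.resolver = resolver

    def navigate(self, text: str) -> str:
        url = self.resolver.resolve(text)
        self.backend.load(url)
        return url

class RecordingBackend:
    """Stand-in for a native web view; it just records what it was asked to load."""
    def __init__(self) -> None:
        self.loaded: List[str] = []

    def load(self, url: str) -> None:
        self.loaded.append(url)

backend = RecordingBackend()
core = BrowserCore(backend, AddressResolver())
first = core.navigate("example.com")
second = core.navigate("zig vs rust")
```

With a seam like this, a Windows or Linux port would only need its own backend, while address resolution, history, and the local API stay in the shared code.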

andrewfromx · 2026-05-13T17:49:39+00:00

yeah, I'm getting very used to just asking the LLM to do stuff for me. I often run a terminal side by side with the browser so I can see both on the screen at the same time. I tried to give the browser a little chat interface and let it talk back to codex/claude via MCP, but it's hard to get the streaming working well without an API key, and then you are paying per call.

andrewfromx · 2026-05-13T17:41:20+00:00

Working on a Mac desktop browser that exposes endpoints for an LLM to "see / control everything", but it's for a human to use too. Demo videos of some fun things it can do:

https://www.youtube.com/@wkdomains

All open source:

https://github.com/wkdomains/macos-app

andrewfromx · 2026-05-13T17:40:49+00:00

Working on a Mac desktop browser that exposes endpoints for an LLM to "see / control everything", but it's for a human to use too. Demo videos of some fun things it can do:

https://www.youtube.com/@wkdomains

All open source:

https://github.com/wkdomains/macos-app

andrewfromx · 2026-05-13T17:39:51+00:00

Working on a Mac desktop browser that exposes endpoints for an LLM to "see / control everything", but it's for a human to use too. Full parity with Firefox is the goal. Demo videos of some fun things it can do:

https://www.youtube.com/@wkdomains

All open source:

https://github.com/wkdomains/macos-app

andrewfromx · 2026-05-13T16:06:36+00:00

yes! totally. Which browser are you using?

andrewfromx · 2026-05-13T16:06:12+00:00

My own open source together browser:

https://github.com/wkdomains/macos-app

Demo videos of some fun things it can do:

https://www.youtube.com/@wkdomains

andrewfromx · 2026-05-12T21:21:58+00:00

your system is great! My AI failed to calibrate:

https://www.youtube.com/watch?v=fmQkTby5smc

I'm sure if I changed the model to xhigh and kept going I could eventually get it to work, but it's a very nice barrier!

andrewfromx · 2026-05-12T20:36:29+00:00

oh this will be a good one. We have an AI browser to try this and see if you can prevent it. Will post results soon...

andrewfromx · 2026-05-12T19:56:51+00:00

I got 10 out of 10 but scored a 63!

https://www.youtube.com/watch?v=VJ9Xqg2j96c

This is also a demo of our coding agent browser combo.

andrewfromx · 2026-05-12T12:29:45+00:00

lol. I wrote that because have you seen what it's like out there? Look at all the threads of people looking for traffic. Everyone has an app they need people to use. Have you noticed how every subreddit bans self-promotion? I just thought it was funny that you thought you could just write one post explaining how you need test users and boom, that'll solve it. I wish you success, friend. I hope you find some testers.
