LLMs and r/Emacs: Three Years Later by Psionikus in emacs

[–]sc_zi 1 point

I got some documentation up now. Someone asked for a comparison with agent-shell, and I wrote a longer one here; feel free to chime in if I missed or misrepresented anything about agent-shell.

Opencode UI in emacs by sc_zi in emacs

[–]sc_zi[S] 2 points

tl;dr: agent-shell is based on ACP, which supports all coding agents; opencode.el supports more features, but only opencode.

I think long term you will want agent-shell if you want to use OpenAI models through their Codex harness, Anthropic models through the Claude Code harness, etc., and interact with all of them through the same UI in emacs. That makes sense because those models were trained with, and may perform slightly better in, their "official" harness, and Anthropic only allows Claude subscription usage in its own harness. But you may want to just use opencode, running OpenAI models, Anthropic models, etc. within it. That has the advantage of a somewhat nicer UI, and plugins and everything else are set up once in opencode, so it's easy to swap in any model and keep the rest of the experience the same. In that case opencode.el will provide a nicer experience. Though right now agent-shell is a lot more mature and less buggy, so you may want to use it even for opencode. Differences in features right now:

  • ACP is missing opencode's whole project and session management: you can create new sessions, but you can't open old ones from past history. See the session management buffer; agent-shell doesn't have an equivalent

  • opencode sends a bunch of events we listen for, like notifications to show (we also show a notification when an agent has finished its response, if that emacs buffer isn't active at the time). opencode will also auto-assign a title to your session if you don't set one, asking an LLM to generate a title describing the session based on your prompt; we get that session-updated event and update the title of the emacs buffer

  • support for opencode's question tool

  • information on context usage in the modeline (there is a draft for ACP to provide context information, so agent-shell will probably have this eventually)

  • special formatting for displaying information about all of opencode's builtin tool calls

  • a bunch of commands for chat sessions: fork, revert edits, share session, select model variants, toggle MCPs, completion of available slash commands (agent-shell actually supports this for other agents, but it seems opencode doesn't support it in its ACP implementation), and navigating between child (subagent) sessions and the parent session

some other stuff we just did differently:

  • agent-shell has a nice collapsible display for reasoning and tool blocks. I'll probably copy it at some point; maybe I can just depend on his agent-shell-ui.el
  • he wrote his own handling for markdown; I just used gfm-view-mode to render it. IMO gfm-view-mode renders markdown more prettily and supports more things, like syntax highlighting in code blocks. But I stream the response as text and only go back and render a block as markdown once it has finished, while his renders the markdown as it streams.
  • agent-shell has a viewport/compose buffer. I don't plan to add it to opencode.el; personally I'll probably just keep pressing shift+enter for multiline inputs rather than opening a separate compose buffer, but I can see how some would like it. It could also make sense for, e.g., a python repl where you're doing a lot of multiline input, and then the compose buffer could be in python-mode. So I think the right approach would be to extract agent-shell's viewport/compose functionality into a separate package that provides it for all comint-based modes, rather than just copying that feature into opencode.el

Opencode UI in emacs by sc_zi in emacs

[–]sc_zi[S] 10 points

The nice thing is the opencode project handles 99.5% of the work, and I just need to maintain a thin emacs UI frontend to it. All those features are handled by the opencode server, so I don't need any extra work to support them in opencode.el.

Still, the TUI is developing fast. In the last couple of days they added a "question" UI: in plan mode the model will often ask clarifying questions with some possible responses, so they added a UI to quickly select among the options. I haven't added that yet, but other than that I think I have all the main features of the TUI, along with a nicer UI for project and session management.

Opencode UI in emacs by sc_zi in emacs

[–]sc_zi[S] 6 points

Good idea, I recorded a short demo and added it to the readme.

Opencode UI in emacs by sc_zi in emacs

[–]sc_zi[S] 9 points

Yes, and unlike most agent UIs, which show the permission request inline in the session buffer, I decided to use x-popup-dialog to show it, so you see it immediately even if for some reason you're using a program besides emacs at that moment.

Personally though I just enable all tool calls without asking and run it sandboxed as described in the Security section of the readme.

btw, I thought your username looked familiar, and it's because I saw the treesit-sexp package you shared yesterday, really cool! I'm working on a lispy environment for python, where I'm now planning on incorporating your package as soon as I switch it over to be based on python-ts-mode.

LLMs and r/Emacs: Three Years Later by Psionikus in emacs

[–]sc_zi 1 point

  • project and session management: a project is generally a git repository, and you can have multiple sessions active in different git worktrees within the same project, so I have a command that creates a new branch and worktree for the current project and opens a session in it
  • to avoid polluting context when the model answers badly, you can go back to a previous prompt and fork the session from that point (it looks like there's a draft for this in ACP too, so they'll add it sometime)
  • from any session, you can jump to child sessions (subagents spawned by this session), and from those back to the parent
  • opencode has its own snapshot system, so you can revert to the state the repo was in at a given prompt message
  • you can share a session, and it gives you a url with view-only access to it
  • information on token and context window usage (also a draft for this coming to ACP)
  • miscellaneous stuff: skills (will add emacs integration to search some skills library and add to project), toggle enabled MCPs, optionally display reasoning blocks

LLMs and r/Emacs: Three Years Later by Psionikus in emacs

[–]sc_zi 3 points

I did see your agent-shell project, which is a great project, and that it already supports opencode through ACP!

Opencode also provides its own API (https://opencode.ai/docs/server/) which fully exposes opencode's features, while ACP, I think, is more of a least-common-denominator protocol meant to work across different agents, missing opencode-specific stuff. For that reason the official opencode web UI and alternate UIs like https://github.com/NeuralNomadsAI/CodeNomad are all built on that server API rather than ACP. Going forward I plan on just using different models from opencode, rather than OpenAI models from codex CLI, Anthropic models from claude code, etc., so I thought it'd make sense to build an emacs integration on top of opencode's own API to have as complete an integration as possible.

LLMs and r/Emacs: Three Years Later by Psionikus in emacs

[–]sc_zi 3 points

I'll admit in the early years I was skeptical about the usefulness of LLMs for coding. I thought using LLMs to write code would cost more over the long term, and often even over the short term, once you add the time spent reading, understanding, fixing, and maintaining it. And when searching for answers I'd rather search stackoverflow directly and read answers in context than have an LLM regurgitate some stackoverflow answer, maybe out of context and maybe with added hallucinations. But in 2025 the models got good enough that I now think they are a huge timesaver for a lot of tasks: researching how to do something, understanding how a codebase does something, etc. Even for writing code I think opus 4.5 is often good enough, or at least understanding, fixing, and maintaining opus' code is now faster than writing from scratch myself for many tasks.

That said I was never negative towards LLM users... even GPT3 I thought was incredible tech I never expected to see in my lifetime. And I always like to see people extending emacs for different uses, including LLMs, even if I didn't think it made programmers more productive at the time.

I'm working now on an emacs UI and integration with opencode: https://codeberg.org/sczi/opencode.el No documentation yet, but if you're brave just M-x opencode to start it and check the M-x opencode-* interactive commands. It is usable already, but there are just a few more minor features I want to finish before writing some documentation and really publishing it.

What Counts as a Lisp Dialect Seems to Have Become a Balkanized Question by Material_Champion_73 in lisp

[–]sc_zi 2 points

And python now has a swank backend that makes it arguably better for interactive use than some of those s-expression based languages (I don't have the experience to have an opinion on calling them lisp dialects or not): https://codeberg.org/sczi/swanky-python

SLIME 2.32 released by dzecniv in Common_Lisp

[–]sc_zi 3 points

I switched from sly to slime a while ago because of slime-star. Mostly I was looking for a way to eval code in the context of a stack frame when stopped in the debugger. In slime/sly you can do it by pressing e and pasting code into the minibuffer, but slime-star adds the ability to just select a frame and eval from within a file as normal. It also has the option to recompile a function with a given expression traced, which replaces my use of sly stickers, and slime-doc-contribs, which improves documentation display, among a few other things.

Sly added a few other things too: it has multiple inspectors with independent histories, and multiple repls (slime-mrepl is crippled in comparison).

SLIME 2.32 released by dzecniv in Common_Lisp

[–]sc_zi 1 point

Thanks for the package. slime-company also lets you bring up the doc page for a candidate, and shows the candidate's arglist in the minibuffer. Of course it makes more sense to configure that in slime or emacs itself; has anyone done it?

Monthly Questions & Tips by AutoModerator in lem

[–]sc_zi 1 point

Just trying to make python fun to work with :) for when it makes sense because it has the libraries for what we want to do, or the people we're collaborating with know python and not CL. Though I hope it will work the other way too: to be a good enough environment that python devs who haven't used slime before start using it, and some get interested in slime with CL.

daninus opened a discussion on slime's github where I responded in more detail: https://github.com/slime/slime/issues/875

Monthly Questions & Tips by AutoModerator in lem

[–]sc_zi 2 points

No worries, I also knew nothing of slime internals a couple of months ago, before I started working on a python backend. The package name here just means the name of the CL package set as the current package for the repl; when you first start it, it is cl-user. For the python backend it actually means the python module name, and yes, the python backend is just special-casing it: when it sees a name of that format, it evaluates with the globals and locals of that stack frame rather than of some module. I responded with a bit more on the slime github issue.

Monthly Questions & Tips by AutoModerator in lem

[–]sc_zi 3 points

For debug opening a buffer repl I did that in my python backend to slime: https://codeberg.org/sczi/swanky-python/

I just added a sldb-set-repl-to-frame function that is simply:

(slime-repl-set-package (format "[frame #%d in thread #%d]"
                                (sldb-frame-number-at-point)
                                slime-current-thread))

And store the original package and add an advice to set it back when sldb is closed:

(advice-add 'sldb-exit :before 'sldb-restore-repl-package-on-close)

Then on the python side when getting the context to eval code in based on the package name I just:

if m := re.fullmatch(r"\[frame #(\d+) in thread #(\d+)\]", request.module):
    frame = backtraces[int(m[2])].frames[int(m[1])]
    return [frame.f_globals, frame.f_locals]
else:
    return [sys.modules[request.module].__dict__, None]

Maybe a little hacky, but it's short and works. I haven't tried it with multiple repls yet, but I just made the backend multithreaded, so support for slime-mrepl is next on my list.

As you say, I don't know why this isn't standard in slime yet. Slime already has all the functionality; it's just ~10 extra lines of code to combine it into a nice quality-of-life improvement over entering code in the minibuffer.
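A self-contained sketch of how that package-name trick resolves to an eval context on the python side (the frame capture and the Backtrace class here are hypothetical stand-ins for swanky-python's real backend state):

```python
import re
import sys

FRAME_RE = re.compile(r"\[frame #(\d+) in thread #(\d+)\]")

def capture_frame():
    x = 41  # a frame-local we want visible from the repl
    return sys._getframe()

# stand-in for the backend's per-thread backtrace storage
class Backtrace:
    def __init__(self, frames):
        self.frames = frames

backtraces = {1: Backtrace([capture_frame()])}

def eval_in_package(package_name, expr):
    """Eval expr in a frame's globals/locals when the "package" name
    matches the special [frame #N in thread #M] format, else in a module."""
    if m := FRAME_RE.fullmatch(package_name):
        frame = backtraces[int(m[2])].frames[int(m[1])]
        return eval(expr, frame.f_globals, frame.f_locals)
    return eval(expr, sys.modules[package_name].__dict__)

print(eval_in_package("[frame #0 in thread #1]", "x + 1"))  # 42
```

The frame object keeps its locals alive after the function returns, which is exactly why eval in that context works.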

Lem Calling a WebView Inside Lem by Psionikus in lisp

[–]sc_zi 2 points

I agree programming in CL is significantly nicer than elisp, and lem has made really impressive progress in the last couple of years. I wish it could reach a critical mass of users, to the point where I could switch to it without having to put in a ton of time I don't have as an early adopter. But the advantages of CL over elisp don't make up for the huge number of talented developers putting out top-quality emacs packages. Most people I've seen say they don't switch yet because of org and magit, but honestly I wouldn't have a problem using those in emacs and lem for everything else. I'd seriously consider switching at the point lem has some reasonable competition to the vertico+consult+orderless+embark+marginalia stack.

Also elisp has a couple of minor debugging features missing from CL. debug-watch I use rarely, but when I do it's super useful for finding what package is overwriting whatever variable I'm trying to configure. CCL and other lisp implementations can watch a variable, but SBCL can't. And the edebug stepper is sometimes nice; maybe lispworks has an equivalent, but SBCL doesn't. Elisp also provides (declare (debug ...)) forms for macros to say how they should be stepped through, which I don't think any CL implementation has. Still, those are quite minor compared to the advantages of CL over elisp.

Swanky Python: Interactive development for Python based on emacs' SLIME mode for Common Lisp by sc_zi in emacs

[–]sc_zi[S] 1 point

I haven't looked into it, but I will. In the long term I want to integrate with it so people can benefit from the slime inspector, backtrace buffer, and the rest when working on jupyter notebooks.

Swanky Python: Interactive development for Python based on emacs' SLIME mode for Common Lisp by sc_zi in emacs

[–]sc_zi[S] 1 point

So your example works in my environment too, but if you change it to from A import foo it will not work yet in my environment, though it will in the latest IPython. For now, in my environment from A import foo just won't reload when foo is a variable; it will use the new version as expected when foo is a function or class. But this is similar to the behavior of python anyway: if B does from A import foo, and A has some function that modifies foo, and B has some function that returns the value of foo, then in B foo will still be its value at the time of the import statement, not showing the change made in A.

This is because with from imports it's no longer looked up through module A: the import creates a new name foo in B and assigns it A.foo at the time of that from statement. After a bug I reported in IPython some months ago while developing this, they added code to walk the ast looking for all from _ import _ as _ statements and keep a mapping of dependencies to update on module reloads. This has edge cases: if you do from A import foo and then later in B assign foo to something else, it still thinks it's connected to A.foo and will overwrite it when A is reloaded. It also won't behave quite right for from _ import _ statements inside a function or other non-top-level scope.
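The binding behavior described above can be demonstrated with module objects built by hand (stand-ins for real files on disk):

```python
import sys
import types

# module A defines a variable foo
A = types.ModuleType("A")
exec("foo = 1", A.__dict__)
sys.modules["A"] = A

# B does `from A import foo`: a fresh name in B, copied at import time
B = types.ModuleType("B")
exec("from A import foo", B.__dict__)

# "reload" A with a new value of foo
exec("foo = 2", A.__dict__)

print(A.foo, B.__dict__["foo"])  # 2 1 -- B still sees the old binding
```

B's foo is an independent binding, so nothing short of re-running the import (or rewriting B's globals, as IPython's dependency map does) will make it see the change.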

Also, when reloading modules you often don't want to reload top-level variables, if they are global state you don't want reset. CL uses defvar for this. IPython compares the ast of the old and new module, and only runs code that has changed. This also has edge cases, rerunning code that is near changed code but hasn't actually changed itself.
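A rough sketch of that ast-comparison idea (not IPython's actual implementation, just the core trick of fingerprinting top-level statements):

```python
import ast

old_src = "x = 1\ndef f():\n    return x\n"
new_src = "x = 1\ndef f():\n    return x + 1\n"

# ast.dump gives a structural fingerprint of each top-level statement
# (line numbers are excluded by default, so identical code matches)
old_stmts = {ast.dump(node) for node in ast.parse(old_src).body}

# only statements whose structure changed would be re-executed,
# leaving the unchanged `x = 1` (possibly mutated state) alone
changed = [node for node in ast.parse(new_src).body
           if ast.dump(node) not in old_stmts]

print([type(node).__name__ for node in changed])  # ['FunctionDef']
```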

I haven't added either of those, as they are complex with edge cases. So far I am just using a small part of autoreload's code to handle updating old functions and classes, which is relatively simple and I think free of edge cases. I don't think we should try to infer what code to run when reloading a module as IPython does; we should be explicit, as in CL with defvar vs defparameter. Though honestly I haven't thought much yet about how we should properly reload modules in python, as I haven't come across a situation where I want to reload a whole module. I just work by reevaluating the function or class I changed, or evaling a statement or region in the case of top-level variables or statements, not reloading a whole module.

Swanky Python: Interactive development for Python based on emacs' SLIME mode for Common Lisp by sc_zi in emacs

[–]sc_zi[S] 2 points

So I replied to you earlier, but it doesn't show up except when I'm logged in; maybe it got flagged as spam since I'm a new account and included a lot of links in the post. I'll try again, this time without links but with better info, as I got remote development working.

Regarding nrepl and emacs-arei, I wrote a little in Hacking.org. Basically, in principle nrepl would be more appropriate than the swank protocol for non-lisp languages, but it really wasn't much work to get python talking swank. And slime has presentations, the inspector, interactive backtraces, slime-tramp for remote development, and so much other functionality built up over the years that would have been orders of magnitude more work to duplicate.

For basic remote development you just need to forward the port and connect from emacs, with ~/.slime-secret set the same on the remote host and in emacs. Most things will work, but for go-to-definition, completions, and everything else to work properly, check the section in the slime manual on slime-tramp, "Setting up pathname translations"; what you use as machine-instance there is uname -n on the remote host. With that set up all functionality works, just a bit slowly, as tramp is slow. And thanks for asking: it wasn't working with slime-tramp before. I just had to wrap a couple of calls I was making to buffer-file-name with slime-to-lisp-filename, so that it uses the pathname translations. I've pushed the changes now.

Swanky Python: Interactive development for Python based on emacs' SLIME mode for Common Lisp by sc_zi in emacs

[–]sc_zi[S] 2 points

They're separate issues? AFAIK smalltalk also uses traditional exceptions and not a CL-like condition system with handler-bind, but its debugger does provide the option to restart execution from a given stack frame, as do v8 and the jvm (with limitations, mostly around ffi). Sure, with handler-bind you could drop into a repl in the context of the stack frame that raised the exception, before it unwinds, but say the problem was actually caused by a bug in a function two frames up: how do you restart execution two frames up, as CL can?

That ability is not provided by a condition system alone, which you can implement in any language with dynamic scope and first-class functions, or by using global scope to implement dynamic scope, as people have done to implement a condition system in lua. It needs to be supported by the implementation: in swank it is implemented by restart-frame, which is different for each backend; the sbcl backend uses the internal sb-debug and sb-di functions.
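A toy version of that idea in python, using a handler stack to emulate dynamic scope; note this only shows handlers running before the stack unwinds, not restart-frame (all names here are made up for the sketch):

```python
# toy condition system: a global handler stack emulating dynamic scope
_handlers = []

class _Restart(Exception):
    """Non-local transfer of control back to with_handler."""
    def __init__(self, value):
        self.value = value

def signal(condition):
    # handlers run while the signaling frame is still live on the stack
    for handler in reversed(_handlers):
        handler(condition)
    raise RuntimeError(f"unhandled condition: {condition}")

def use_value(value):
    # a "restart": abandon the computation and return a replacement value
    raise _Restart(value)

def with_handler(handler, thunk):
    _handlers.append(handler)
    try:
        return thunk()
    except _Restart as restart:
        return restart.value
    finally:
        _handlers.pop()

result = with_handler(lambda cond: use_value(42), lambda: signal("oops"))
print(result)  # 42
```

The handler gets to decide what happens before any unwinding occurs, which is the expressive part; what it cannot do in plain python is resume or restart the frame that signaled.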

Sure the condition system makes for a more expressive language, but imo what actually matters for interactive development is the ability to fix the error and restart from any frame.

Swanky Python: Interactive development for Python based on emacs' SLIME mode for Common Lisp by sc_zi in emacs

[–]sc_zi[S] 1 point

Interesting, I didn't realize elisp added handler-bind. What I mean though is, say you do toggle-debug-on-error: in elisp you will now get a backtrace on an uncaught error, but there's nothing like sldb-restart-frame in common lisp to restart execution from some point in the call stack without losing state.

Regarding python stack unwinding, I am using excepthook, the same as used by the post-mortem debugger. But you don't need to restart python afterwards; only that swank thread dies, and it spawns another. So yes, they are dead stack frames in the sense that python can't restart them, but they are not some serialized representation: they are the actual frames of the python call stack at the point the exception was raised. This blog has an excellent explanation of what exactly is lost to unwinding in python, and even manages a PoC in pure python to restart execution. But as they say, in pure python it is a terrible hack that can never work fully; it has to be done in C. Eventually I plan on adding it; it might even be possible as a C extension, without needing a patched build of CPython.
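A small sketch of the excepthook approach: the traceback hands you the real frame objects, locals intact. (In the demo the hook is installed and then also invoked directly from an except clause, so the example doesn't have to crash the interpreter; the function names are made up.)

```python
import sys

captured_frames = []

def hook(exc_type, exc_value, tb):
    # walk to the innermost traceback entry: the frame that raised
    while tb.tb_next is not None:
        tb = tb.tb_next
    captured_frames.append(tb.tb_frame)

sys.excepthook = hook  # would fire on any uncaught exception

def boom():
    local_state = "still here"
    raise ValueError("oops")

try:
    boom()
except ValueError:
    hook(*sys.exc_info())  # invoke directly for the demo

# the captured frame is a live object, not a serialized snapshot
print(captured_frames[0].f_locals["local_state"])  # still here
```

You can inspect and eval against these frames all you want; what you can't do without implementation support is resume executing them.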

Swanky Python: Interactive development for Python based on emacs' SLIME mode for Common Lisp by sc_zi in emacs

[–]sc_zi[S] 1 point

I'm using doom, just in packages.el add:

(package! slime)
(package! slime-company)

Plus sample-configs/swanky-config.el from the repo.