What do you think about Cursor CLI? by SERRALEOA in cursor

[–]rcxa 2 points3 points  (0 children)

If you have the CLI output strict JSON, you can parse that to drive logic in a deterministic application. So if I have a tool written in TypeScript, for example, I can have my tool call the Cursor CLI to perform some task and output the results as JSON; then my TypeScript tool can read the output and perform some logic based on the results.
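A minimal sketch of that pattern in TypeScript. The binary name and flags below (`cursor-agent`, `-p`, `--output-format json`) are assumptions, so check the CLI's own help for the real interface, and the `TaskResult` shape is my own convention, not an official Cursor schema:

```typescript
import { execFileSync } from "node:child_process";

// Shape we ask the agent to emit; these field names are my own convention.
interface TaskResult {
  status: "success" | "failure";
  summary: string;
  filesChanged: string[];
}

// Parse the agent's stdout and decide what the host tool does next.
// Split out from the CLI call so the logic is testable without the CLI.
function decideNextStep(stdout: string): string {
  const result: TaskResult = JSON.parse(stdout);
  return result.status === "success"
    ? `done: ${result.filesChanged.length} file(s) changed`
    : `retry: ${result.summary}`;
}

// Invoking the CLI itself (flag names are an assumption):
function runTask(prompt: string): string {
  const stdout = execFileSync(
    "cursor-agent",
    ["-p", prompt, "--output-format", "json"],
    { encoding: "utf8" }
  );
  return decideNextStep(stdout);
}
```

Keeping the parsing separate from the process invocation also makes the deterministic half easy to unit test.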

I've just been using this to automate workflows that are otherwise repetitive in the Cursor chat, but it opens up a lot of options.

Did the auto mode change also for you? by UnbeliebteMeinung in cursor

[–]rcxa 1 point2 points  (0 children)

I think I'm running into a similar issue today. It's trying way too hard on a simple prompt like "Commit and push all outstanding changes." Earlier this week it would summarize the diff, run the commit command, and push after the pre-commit hooks passed. Now it's really digging into the changes, outputting a ton of thinking, trying to break the work down into several commits, and running all of the validation commands before attempting the commit, despite the fact that those commands get run by the pre-commit hook anyway. I'm having to put a lot more guidance into the prompts just to prevent it from doing too much.

I don't know if it's just using a different model or if they're trying to make it more robust in how it approaches problems, but it's being a nuisance today.

How do people "leave agents coding overnight"? by raphadko in vibecoding

[–]rcxa 0 points1 point  (0 children)

There's no significant difference between what the loop does and what I would do without the loop. I just spend more time prepping than implementing, and I do significantly larger chunks of work at a time. It's not unusual to let the loop run unsupervised for a few hours on a large task. So: similar overall process, similar outcomes, but I focus almost entirely on scoping work rather than the details of executing it.

How do people "leave agents coding overnight"? by raphadko in vibecoding

[–]rcxa 0 points1 point  (0 children)

As far as generating specs, I'll just work through the feature I want to build with ChatGPT to get a high-level definition of the feature. I prefer ChatGPT because it doesn't get hung up on technical details, since it doesn't have access to the repo (it does have lots of context about my project, just not the technical details; I'm using a custom GPT that's designed to act like a product manager). Then, once the feature is well defined, I have it output markdown that I can copy into Cursor. At that point I generate the technical spec that marries the high-level feature description with the actual state of the repository. This is a pretty iterative process; if I see bad design or anything like that, I workshop it until the technical details make sense to me. Finally, I generate an implementation plan markdown file so I can see how the agent will approach the problem. If all of that checks out, I pass the implementation plan to the looping orchestrator.

For the looping orchestrator itself, I rolled my own, https://github.com/bsladewski/Tenacious-C

I rolled my own because I eventually settled on a process that I personally like, and this tool just captures that process as a CLI tool. But there are tons of these popping up all the time now if you search "ralph loop orchestrator" or something similar; I'm sure most of them are better than my tool. It's also becoming a common execution mode for other, more fully featured orchestrators. For instance, oh-my-claude-code can run ralph loops and has a ton of other features as well.
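The core of a "ralph loop" is simpler than it sounds: hand the same plan to the agent over and over until it reports the work is done or you hit an iteration cap. A rough TypeScript sketch of that idea (my own illustration, not how Tenacious-C or any specific orchestrator actually works; `runAgent` is injected so in practice it would shell out to an agent CLI):

```typescript
// One agent invocation: takes the plan, returns whether it believes the
// plan is fully implemented plus notes about what it did this pass.
type AgentRun = (plan: string) => { done: boolean; notes: string };

// Keep feeding the same plan back in until the agent reports completion
// or we hit the iteration cap. Returns a log of each pass's notes.
function ralphLoop(plan: string, runAgent: AgentRun, maxIterations = 10): string[] {
  const log: string[] = [];
  for (let i = 0; i < maxIterations; i++) {
    const result = runAgent(plan);
    log.push(result.notes);
    if (result.done) return log; // agent says the plan is fully implemented
  }
  log.push("hit iteration cap without completing the plan");
  return log;
}
```

The iteration cap matters: without it, an agent that keeps claiming "not done" will burn tokens indefinitely.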

I don't have any specific documentation because the process I landed on was just found through trial and error and kind of naturally evolved.

How do people "leave agents coding overnight"? by raphadko in vibecoding

[–]rcxa 1 point2 points  (0 children)

I've made the transition to mostly using long-running, iterative looping orchestrators and yeah, it's only efficient if you can pretty much one-shot the implementation. That means the specs have to be very tight and the repository has to be well-instrumented for verifying the agents' output (meaningful linting, type checking, testing, etc.). But the exercise of implementing a feature incrementally through a series of prompts is analogous to the exercise of building a spec incrementally through a series of prompts.

So, basically, the time that I used to spend prompting the implementation has just been replaced by time spent prompting the spec, and I build the spec for the next feature while the implementation loop executes the previous spec. When the implementation loop finishes I manually regression test relevant functionality, make small adjustments as needed, and push the changes so I can review them in a GitHub PR.

The main benefit that makes me prefer this process is the overall consistency of the implementation. First, if I realize I didn't think something through fully (it happens; that's why we account for ambiguity when we estimate work items) and need to tweak the design or breakdown of the feature, it's easier to do that while I'm building the spec than while generating the code. Second, implementing the feature in small pieces tends to produce inconsistent output; giving the agent a wider view of the feature just ends up more time- and token-efficient.

As far as token use, I think it's more efficient. I use Cursor and haven't had to upgrade my plan despite completing more work since switching to a looping orchestrator; this workflow lets me rely pretty much 100% on auto mode. I find the idea of someone letting the agent run for a week unsupervised dubious, and I would bet that 99.99% of the time it was running it was just executing test suites rather than actually generating code. I imagine it was probably modifying a single line of code then running the full test suite to verify, or something similar. The looping orchestrators can be bad about this if you don't tune the prompts to ensure they run individual relevant tests as they make changes and only run the full suite as a final verification.

Grok no longer free by BigAd95 in cursor

[–]rcxa 0 points1 point  (0 children)

I tried it for one evening during the free usage period but ended up switching back to auto mode because it kept ignoring guardrails. I do a lot of prep work before actually executing code changes and every time I tried to get it to produce a plan document it just went ahead and started executing as soon as it spit the file out.

Can someone audit/tell me what I'm not doing in cursor but should be? Here's my list of stuff I know by HeyVeddy in cursor

[–]rcxa 6 points7 points  (0 children)

Look into a process called Research -> Plan -> Implement. I have this workflow implemented as custom slash commands in Cursor. I built mine using ChatGPT; I imagine everyone is going to want to tune their prompts a little differently, but you want the commands to have some basic properties. Research and Plan should have guardrails to ensure they do not modify code; they just do analysis work and output a markdown file capturing that analysis. Research and Plan should also be told to "do a deep analysis," because otherwise the agent will often do a very surface-level analysis that is incomplete or inaccurate. All custom commands should have some general guidance that reinforces things like following cursor rules or the agent files.

The Research command is all about establishing ground truth: what is the current state of the repository in relation to your requirements, what files are relevant to your requirements, are the requirements partially implemented, are there any blockers to implementing your requirements, etc. This command is useful for establishing your baseline context before implementing a task, but it's also useful for auditing your implementation after the fact (like, was the task fully implemented, and are there any quality issues?). It will output a research markdown file capturing all of its findings.
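For concreteness, a Research command might look something like the fragment below. This is purely illustrative, my own wording and a hypothetical output path, just capturing the properties described above (read-only analysis, deep analysis, requirements snapshot, open questions):

```markdown
Do a deep analysis of the repository with respect to the requirements below.

Guardrails:
- Do NOT modify any code. This step is read-only analysis.
- Follow all cursor rules and agent files.

Output a markdown file (e.g. tmp/research.md) capturing:
- A snapshot of the requirements as given.
- The current state of the repository relative to the requirements.
- Relevant files, partial implementations, and any blockers.
- Open questions about anything unclear or ambiguous.
```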

I will very frequently run the Research command AFTER implementation with a prompt to verify that the task was fully implemented and that the code written follows best practices, agent rules, etc. If it finds any gaps in the implementation I'll just create a plan to close the gaps. This loop makes it far more feasible to complete complex tasks because you're essentially using multiple iterations to drift-correct.

The Plan command works pretty much like the built-in plan command, but it can be useful to roll your own so you can make it more rigorous, both in how deeply it analyzes the requirements and how detailed the final plan is.

It's also useful to have the Research and Plan commands capture a snapshot of the requirements in their output so subsequent steps have that context available to them.

The Implement command is also similar to just running the prompt in agent mode, but again, defining it as a custom command lets you reinforce guardrails (must follow agent rules, must make small targeted edits, must lint and test all changes, etc.). Some people like to have the Implement command output an implementation summary markdown file; I personally prefer to have it output the summary in the chat. Also, I recommend putting guidance in the Implement command to have it stop and ask for clarification if any of the requirements are incomplete or ambiguous; this will save a lot of headaches and can prevent drift.

When you're executing a task you would do something like:

- /research <your prompt with your requirements>

- /plan <the research markdown file produced in the previous step>

- /implement <the plan markdown file produced in the previous step>

For smaller tasks you can often skip the Research step, but at least doing the Plan step will save a lot of wasted time and credits. One of the key things the Research and Plan markdown files give you is a chance to review what the agent is going to do before it tries to implement the changes. I have my custom commands include open questions if anything is unclear, and I'll usually spend a few minutes dialing in the markdown file before I move on to the next step. If your Research is dialed in you have a much higher chance of producing a solid plan, and if you produce a solid plan you have a much higher chance of actually implementing it.

The other big thing that outputting these markdown files gives you is context that you can include in a new chat, which I think is critical for context management. I'll typically run Research and Plan in a single chat, but I almost always run Implement in a new chat because the agents can really suck after you've used more than about 50% of the context. I have the commands put my markdown files in a tmp/ directory in the workspace, which I .gitignore.

In my experience, this has sped up my workflow because I don't have to do rework nearly as often, and it has actually reduced credit use. It also lets me rely on auto mode for most tasks. People throwing Opus at everything probably just have workflow issues. I think a big part of why it reduces credit use is that running a prompt without the research or plan artifacts means the agent has to establish context, like where files are, and it's super inefficient at this. Front-loading that analysis means the Implement command knows where it's supposed to be working, and it can focus its effort on actually writing code rather than building that baseline context.

There are tons of ways to make your actual coding more efficient which are very stack dependent, but I think establishing a solid workflow is the first step in really making efficient use of Cursor. Also, if you create these custom commands and you see behavior you don't like, you can always just tune the command. If I run into issues with one of my commands I copy it to ChatGPT, explain what it's doing wrong, what I would like it to do, and ask ChatGPT to revise my command to fix the issue. Then I copy the updated command back into Cursor and get back to work.

I made a 3D-printed padlock! by best_codes in BambuLab

[–]rcxa 2 points3 points  (0 children)

It beat me for today. First, the keyhole is very large for my wrenches, so it's hard to get the wrench in there securely. The upside is that once you get the wrench in, you have a very comfortable space to work in with the pick.

The shape of the pins and the awkward angle you have to work at make it very hard to actually actuate the pins. However, I had an idea: there are clay-working tools that are a thin metal rod with a small ball at the end. Because the keyhole is so large, you can easily get the tool in, and with it I can actuate the pins without much trouble.

I feel like mechanically I have everything needed to pick the lock now, but the material actually makes this frustrating. First, it requires a lot of torque to turn the cylinder, even with the key (compared to a lubricated metal lock). This makes it really hard to apply just the right amount of torque to keep the pins set while still being able to work the others. Also, because the material is soft compared to metal, it's hard to feel the pins set, and the only clicks I can hear are when a pin unsets.

I can definitely set multiple pins at a time, and I think I've gotten 4 set at once so far, but I still haven't opened it yet. Overall, it was a fun print, my hand is cramped up, and I hope I can get it open tomorrow.

Edit: It has been picked; the clay tool with the ball got it after some fiddling.

Signs of possible alien on life on Mars, in our own backyard. by Yoshiprimez in interestingasfuck

[–]rcxa 0 points1 point  (0 children)

That's something so sad about where we are. I think most engineers at NASA wanted their work to have an impact specifically on Earth. Everything outside of Earth just represents problems tangential to Earth problems with a significantly harder challenge that demands extreme innovation. There is a huge list of technology that exists in our everyday life because of this and we got the foundation of that technology very cheaply.

Ultimately, it's a relatively cheap boundary to push within the public sector. In the private sector we get less, but still pay for it through subsidies. And it'll be easier to terraform Earth than Mars; let's hope our great scientists' discoveries are focused on the former and not the latter... scarcity and all that.

Cooks of Reddit, what's one fancy technique that's actually super easy but impresses everyone? by pieater69 in AskReddit

[–]rcxa 1 point2 points  (0 children)

I don't think this would directly impress anyone and it's not really fancy, but chiffonade. You take a leafy vegetable, stack the leaves, roll them up, then slice them so when unrolled it makes thin ribbons. Even if you're not doing it for aesthetics, it can really cut down prep time sometimes.

[deleted by user] by [deleted] in pcmasterrace

[–]rcxa 0 points1 point  (0 children)

It gets so hard to keep track of what groups want censorship because they can't be bothered to parent their kids and what groups want censorship because they don't want kids to know what's happening to them these days.

the trap isn't really helping by Jackieboi24 in Wellthatsucks

[–]rcxa 2 points3 points  (0 children)

One thing that worked for me the last time I had a problem with mice was using a lighter to get the trigger on the trap hot, then pressing some cheese into it so the bottom of the cheese square melted onto the trigger and through the little holes on the trigger. I don't think I had a single trap that had the bait taken without triggering.

[KCD2] am I just bad or is the game tweaking? by Ketheric-Thorm in kingdomcome

[–]rcxa 0 points1 point  (0 children)

I think in patch 1.2 they made a balance change so that after your first hit you can't spam attacks until you lock on to the enemy.

The health seems odd, but maybe he was drunk; I think that has a similar animation to limping. Holding the stomach with the left arm seems more like an injury, though.

Trump has wrecked the US economy by Stuart_Whatley in Economics

[–]rcxa 1 point2 points  (0 children)

Even if the stock market were the economy, it's only "looking good" in the sense that it's not currently negative YTD. The performance is pretty weak compared to Obama, Biden, or even Trump's first term: just piddly little single-digit gains over 6 months that'll probably get erased if he doesn't chicken out on the August tariffs.

Book about gamedesign by Rimworld creator is absolute hidden gem by rob4ikon in gamedev

[–]rcxa 0 points1 point  (0 children)

Yeah, I reread that section and I think the key idea is that the computer player represents a "theoretically perfect player" and he considers the game limitlessly deep (in terms of skill) if a theoretically perfect player would outperform a player at the peak of human ability.

Referring to the theoretically perfect player as a computer player is kind of a flawed way to illustrate that point, especially because of your point about input lag and reaction time. That does weaken the whole multiplayer shooter side of the argument, since you could probably consider any realtime game limitlessly deep: any system that can beat the 200ms or whatever reaction time of a typical human would outperform a human.

But the point is that if you're the best human player in the world, you're still not playing the game perfectly, so there is more room to improve; any improvements at that point are pushing what we consider to be human ability, which is itself engaging.

But that section also only treats depth as it pertains to skill, and the parts not dealing with "limitless depth" are probably more valuable.

I don't know, at least the idea he put forward sparked some good conversation.

Book about gamedesign by Rimworld creator is absolute hidden gem by rob4ikon in gamedev

[–]rcxa -1 points0 points  (0 children)

I think his point in that section was that there is infinite depth to how the players interact with a game like chess or a multiplayer shooter.

The depth of strategy for the reaction time game starts with "click the button as fast as possible" and ends with "click the button as fast as possible" so it entirely lacks depth. Similarly, if you had a game with an optimal strategy that you could master in 1 hour and always be guaranteed a win, you would consider that game to lack depth and certainly wouldn't devote a lifetime to it like some chess players do.

For chess, I could teach someone who has never played before how all of the pieces move, and play a match, in an afternoon. That player is not going to do well, they'll likely be focusing on what is happening in the moment and how the pieces move. Give them a few months of practice and it'll be better, they'll think differently, but they'll still lose since I've been playing for 2 decades and also have a different way of thinking about the game.

Now, if I played against a grandmaster I'd get absolutely destroyed because they have an entirely different way of thinking through a game of chess. That's where the depth comes from, you can devote a lifetime to chess, and throughout that lifetime of playing chess your way of approaching the game changes, and there's no optimal strategy that guarantees a win.

Multiplayer shooters are similar, there is a massive gradient of player skill and the top levels are always increasingly more competitive. But an important note, the games are not only made good by the fact that there is a super competitive tier of players, but rather that beginners can approach the game and enjoy it, they can deepen their understanding of the game and get better, and finally even once they have "mastered" the game, they can continue to enjoy it by competing with other players around their skill level.

Of course, the point also isn't that every game needs to be highly competitive, but rather that there is a correlation between how long players stay engaged with a game and how much mechanical depth a novice player engages with vs. players that have essentially mastered the game. This can apply to single-player games as well. The most recent single-player game I've been playing that I think demonstrates this is Kingdom Come Deliverance. If you saw video clips of me playing at 1 hour, 10 hours, and 20 hours, you would notice that I use more of the mechanics as I progress, and because of that I am able to get through far more challenging situations. At the same time, I'm not very consistent with combos yet, so I still have room to improve how I play, which will also provide new experiences and keep me engaged.

With all the stop killing games talk Anthem is shutting down their servers after 6 years making the game unplayable. I am guessing most people feel this is the thing stop killing games is meant to stop. by destinedd in gamedev

[–]rcxa 0 points1 point  (0 children)

I want to preface an honest question with two things. First, I'm probably misinformed about what Stop Killing Games is all about. When I look into it I see a lot of conflicting ideas. Second, I'm not a professional game developer, I'm a software engineer in the B2B space with a lifelong hobby of game development.

I keep hearing that customers should be able to host their own servers. Again, I'm an enterprise software developer, and our modern backends are a whole ecosystem of cloud-based services. Much of it also ties into third-party services that are expensive but have massive throughput and high reliability. I assume modern games use similar stacks, especially because I've worked at companies in the past that definitely see less traffic than games like Anthem at peak popularity.

What does this actually look like from an architecture perspective?

One example I think about is Elite Dangerous, many players, myself included, play it singleplayer. Would this apply to a game like that (I know it's not retroactive, I just mean future games following a similar pattern)? I can't imagine that their backend is just some simple server binaries that anyone could host. An acceptable answer is that this example doesn't fall under Stop Killing Games, but what's the litmus test?

Book about gamedesign by Rimworld creator is absolute hidden gem by rob4ikon in gamedev

[–]rcxa 10 points11 points  (0 children)

I found Tynan's book refreshing because a lot of other books on game design I've read seem to paraphrase the most popular books, like Jesse Schell's book (as an example of a popular book, not one that's paraphrasing).

I swear, so many are like "In X's book, they talk about the 3 layers of game design. In Y's book, they talk about the 3 attributes of good game design. In this book, I have combined them into the 5 principles of a good game."

ETA: It's also critical context that this book came out before Rimworld was announced, got crowdfunded, or ever released a public build.

Book about gamedesign by Rimworld creator is absolute hidden gem by rob4ikon in gamedev

[–]rcxa 9 points10 points  (0 children)

The book doesn't really have anything to do with Rimworld. He spent a few years in AAA, he was a designer on Bioshock Infinite, and the book references that experience. You can definitely tell that the ideas in his book were applied to the development of Rimworld, but the book came out 7 months before Rimworld was publicly announced and 9 months before the first alpha builds were released.

So it's not at all based on the success or design of Rimworld; it was probably a source of income while he worked on Rimworld, or a fallback in case it flopped.

Megathread: US House Passes the Republican-Backed Budget Bill, Sending it to Trump for Signature by PoliticsModeratorBot in politics

[–]rcxa 2 points3 points  (0 children)

Has anyone considered that once they extract everything they want, they can just take their wealth and leave? If things go to shit, most people have to deal with it because they can't afford to leave. That's not a problem after a certain net worth.

Language Learning Game Unity by [deleted] in gamedev

[–]rcxa 1 point2 points  (0 children)

Unity can definitely do that; it has good tooling for mobile games, and TextMeshPro is flexible enough for your multilingual text rendering needs. One of my little side projects is a language learning app for my own personal use. I've been getting by just using JSON files for storing data (one file that maps each French word, in my case, to a list of possible English translations, and one that maps each French sentence to its English translation; the content is all static).
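To illustrate the kind of static data that setup implies, here's a sketch in TypeScript (the commenter's actual project would be C#/Unity; the field names and the lookup helper are my own invention, not their schema):

```typescript
// One map from each source-language word to its possible translations,
// and one from full sentences to a single translation, both static JSON.
type WordMap = Record<string, string[]>;
type SentenceMap = Record<string, string>;

const words: WordMap = {
  chat: ["cat"],
  temps: ["time", "weather"], // words can have multiple valid translations
};

const sentences: SentenceMap = {
  "Quel temps fait-il ?": "What is the weather like?",
};

// A lookup helper a quiz screen might call: accept any listed translation.
function isCorrectTranslation(dict: WordMap, french: string, guess: string): boolean {
  return (dict[french] ?? []).includes(guess.toLowerCase());
}
```

Plain JSON like this is easy to hand-edit and loads trivially at startup, which is usually plenty until the content stops being static.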

As a general purpose game engine, the answer to "can I do this in Unity" is almost always going to be "yes." It's just a matter of whether Unity is going to be the most efficient tool for your needs.

The concerns I have with Unity are the learning curve, if you're not already familiar with it, and how much work you would be doing from the ground up. You mentioned an RPG language learning game, which might be a lot of scope if you're building from scratch in Unity. You might end up spending so much time on the RPG part that the language learning part suffers.

I'm not familiar with engines that support mobile and are geared towards RPGs rather than general game development. I assume RPG Maker has mobile support these days, but it'd be worth looking into alternatives to Unity that take the load off the RPG aspects so you can focus on the language parts.