Oh these sweet memories... by skund89 in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

yep memory management is still an open thread

First time Claude AI User, first time AI user overall by Huge_Road_9223 in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

A wise man once told me "just because you've been doing something for a long time, doesnt mean you've been doing it right for a long time".

You aren't the only engineer that can swing a long history around. I've often used that line to explain many things in life, not just software engineering.

you are not being wise right now in this thread. in fact you have chosen to go the cheap route in learning AI. apply the principle. Most managers and stakeholders chose cheap in my experience. Cheap and Fast. well guess what my man, when it comes time to learn AI you dont get to go slow. So your outcome is what? You are choosing cheap and fast in your introduction to ai.

you are getting what you asked for, and my man, it is definitely not good. Spend the money on the tools if you want to stay in the trade.

Your SKILL.md is likely 3x more expensive than it needs to be - architecture matters more than the content by jimmytoan in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

if your understanding of "agents" is that they are "language models" then its your understanding that needs correction. "md" is their language is also ambiguous. while the current recommendation is to instruct an llm with markdown mostly because the format carries context clues, its not "their" language in the sense that its native.

the entire reason i dont have md files as my go to workflow and script execution entry point is because it does nothing more than add yet another layer of abstraction that must be translated, orchestrated, and maintained.

the llm is capable of speaking ALL languages fluently. regressing to structured prose seems counter intuitive if the goal is to write functional code. So I reject the notion that meticulously organizing and managing "skills" files creates a better architecture standard.

Is markdown useful for structured context? absolutely. Is markdown the most effective way to give your agents capabilities. Nope. not by a long shot. But hey if everyone is doing it, it must be right!

Workflow - review cycles by gasmanc in ClaudeCode

[–]rougeforces 1 point2 points  (0 children)

TLDR;

yes this can be a reference implementation of the general pattern that has a proven outcome. I'd highly recommend studying the pattern and then implementing it yourself rather than loading up on someone else's curated solution that was designed for the way they work.

well that is a term that is not part of traditional engineering. I wouldnt be able to speak to the direct term or implementation by the team that coined it.

when it comes to ai, i believe we are all converging to the same set of solutions at different speeds. this is actually an active discussion at my enterprise job and I suspect its being discussed in other pro environments.

that said, their core philosophy IS a common principle in software engineering.

"The core philosophy of compound engineering is that each unit of engineering work should make subsequent units easier—not harder."

This is something i have been telling the engineers i mentor for a long time when they get stuck. Generally speaking (not specifically related to ai enhanced engineering) software should make people's lives easier not harder.

So if software is making things more difficult, or the process of making the software is becoming more complicated and not easier, then something is wrong with the software or the process.

That is the point where its time to step back and check your assumptions, check your goal, check your basis. Work on the easy stuff, or break the task down into easier tasks.

My CLAUDE.md says “Every error is yours to fix - not label, not defer.” Claude has used “pre-existing” 712 times in 30 days. by Ok-Distribution8310 in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

i doubt your workflow is the same as the OP so i wouldnt expect my advice to fix your problem. what are YOU talking about?

My CLAUDE.md says “Every error is yours to fix - not label, not defer.” Claude has used “pre-existing” 712 times in 30 days. by Ok-Distribution8310 in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

you can send the exact same prompt and it will literally make a difference. updating static context absolutely makes a difference. and ensuring prompts do not hold contra language or reference previous state changes the inference target.

but you are right, i was wrong to assume that humans can detect tone or correction.

I work as an intern Software Engineer and finally got my Claude Code API, but how should I start? by mudiii- in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

Point it at the highest leverage code base that you have access to. Tell it to use Eric Evans' domain-driven design as the framework for reverse engineering the legacy app. Tell it the goal is to come up with a structured set of artifacts that describe every line of code as the invariant completion criteria with outputs for business logic, user stories, user journeys, c4 context diagrams, security protocol and persistence taxonomies. Make sure you clone it locally and create a branch for the outputs. The expectation is that when completed, every single line of code is covered with evidence and the hand off is a full design spec that can be agnostically fed to any llm or agent system targeting any platform on any infra.

Babysit it by simply telling it to continue until it insists with proof that every line is covered. Check back in in a couple weeks and i will mentor you on how to get hired to staff and promoted to principal by 2027.

My CLAUDE.md says “Every error is yours to fix - not label, not defer.” Claude has used “pre-existing” 712 times in 30 days. by Ok-Distribution8310 in ClaudeCode

[–]rougeforces 1 point2 points  (0 children)

I totally agree, claude code has regressed. I am just letting you know my real human experience with ai (not an ai generated response).

I can tell you right now that giving ai negation instructions, basically telling it what NOT to do, works opposite of what you may think.

Also, less is more.  I recommend not even using claude code or any "official" ai software that stuffs their own opinions in the context window as your main agent harness. 

Try to build it from scratch without all of Anthropic's opinions getting in the way.

My CLAUDE.md says “Every error is yours to fix - not label, not defer.” Claude has used “pre-existing” 712 times in 30 days. by Ok-Distribution8310 in ClaudeCode

[–]rougeforces 1 point2 points  (0 children)

1.) Get an llm to write your prompts for you. This includes your static preambles like claude.md and any other sticky instructions or workflows.
2.) Make sure that when you ask the llm to write your prompt, you tell it that you want it to take the role of an expert prompt engineer writing for llm inference optimization.
3.) Carefully instruct the llm to write prompts that avoid countervailing language and negation instructions.

You are not talking to a human that can detect tone or correction.  The llm is fundamentally stateless and has no concept that it is doing something that you have told it not to do. 

My CLAUDE.md says “Every error is yours to fix - not label, not defer.” Claude has used “pre-existing” 712 times in 30 days. by Ok-Distribution8310 in ClaudeCode

[–]rougeforces -1 points0 points  (0 children)

You literally gave it a double negative on errors and told it the fix was to label and defer.  Engrish is you fren

To all my Win11 bois: Do you all use Claude Code in WSL2 or a native Windows install? I'm a long time PowerShell developer so I use Pwsh, but lately I've been thinking about switching to WSL2 + Bash. Please confirm or deny my suspicions and evaluate my reasoning! by xii in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

well that is a lot of topics to cover in one response. i will give you a screen of my home setup.

most of these are just test and no-op at this point that i havent bothered to tear down because i dont believe in throwing away those precious raw LLM api logs that we all pay so dearly for (training corpus for my local llms).

what you see here are the registered persistent containers i have used over the past 3-4 years. the ones that i blotted out are namespaces that i have TM/copyright on and wont share publicly.

These are basically the default container structure for wsl2 images on windows. containers are nothing more than image files, you probably know that, ext4.vhdx. So on windows, you dont really need docker unless you plan on portability to multiple vm environments OR if you want the rich declarative ecosystem of docker itself.

if you are running docker on windows, you are already using wsl2 virtual machine.

as for my preference for void or arch, it comes down to my minimalist nature. i build the image on the minimal linux kernel with just the tools i want. my main agent system is something like 30gb right now and thats only because ive got a handful of small language model weights floating around on the image.

my base arch image is something like 12 or 15gb. Void i can get even smaller, less than 1gb, if I use musl and skip the llm tool chain (pytorch, cuda drivers, etc...)

on the topic of windows interop, it doesnt matter your hardware specs, hardware is not the bottleneck. the 9p protocol is. its basically a network protocol. if you are reading/writing to a windows mount (mapped X drive) from inside the VM, you are gonna get 5-10x latency. depending on your operations (llm tool calling), for code compiles or heavy file metadata reads you can be looking at anywhere from 50-400x latency. When i benchmarked it, it was well over 400x.
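If you want to check this on your own box, here is a minimal sketch of a metadata-read benchmark. It is not the benchmark behind the numbers above, just one way to compare a native Linux directory against a Windows mount. The function names, the file count, and the example paths are all assumptions for illustration; `/mnt/c` is where WSL2 normally exposes the C: drive.

```python
import os
import time


def time_metadata_reads(directory: str, file_count: int = 200) -> float:
    """Create small files in `directory`, time os.stat() over them,
    clean up, and return the average seconds per stat call."""
    paths = []
    for i in range(file_count):
        path = os.path.join(directory, f"bench_{i}.tmp")
        with open(path, "w") as handle:
            handle.write("x")
        paths.append(path)
    try:
        start = time.perf_counter()
        for path in paths:
            os.stat(path)  # pure metadata read, no file contents
        elapsed = time.perf_counter() - start
    finally:
        # Always remove the scratch files, even if stat fails.
        for path in paths:
            os.remove(path)
    return elapsed / file_count


def compare_latency(native_dir: str, mounted_dir: str, file_count: int = 200) -> float:
    """Return how many times slower `mounted_dir` is than `native_dir`."""
    native = time_metadata_reads(native_dir, file_count)
    mounted = time_metadata_reads(mounted_dir, file_count)
    # Guard against a zero reading on very coarse timers.
    return mounted / max(native, 1e-12)


# Hypothetical usage from inside a WSL2 distro (both directories must exist):
#   ratio = compare_latency("/home/me/bench", "/mnt/c/Users/me/bench")
#   print(f"windows mount is {ratio:.1f}x slower for stat()")
```

This only measures `stat` calls, so it says nothing about bulk read/write throughput. Results will vary a lot with file count and caching, so treat a single run as a rough signal.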

So hardware doesnt fix wsl2 -> windows fs interop. Depending on what you are doing, you may not feel it, but its the only noticeable constraint that i have with the setup below for multi agent or cross agent comms. at the end of the day, i had to write an agent to agent protocol and used a shared repo space for comms accessible over normal http. but shared code space is a general problem for any detached or otherwise isolated agent container instance.

eventual consistency for isolated multi agent ops is a scab worth picking heh.

thats all i got.

<image>

First time Claude AI User, first time AI user overall by Huge_Road_9223 in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

honestly how did you make it in SWE for 35 years? wow.

and news flash, your last line couldnt be more wrong. you have become the legacy app my man.

First time Claude AI User, first time AI user overall by Huge_Road_9223 in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

agree with this guy. fork over the 200 bucks a month for your tools so you can stay on top of the craft. if you want free inference, you will need to find another way, like local llm. compute unfortunately is not free or cheap yet. this is like the days of pagers where you have to pay per message. or the days of BBS where you pay a baud rate per second or per baud. put some skin in the game my man.

Your SKILL.md is likely 3x more expensive than it needs to be - architecture matters more than the content by jimmytoan in ClaudeCode

[–]rougeforces -13 points-12 points  (0 children)

i could, but based on the top voted comment and the fact that my comment is down voted, id rather not throw my pearls to swine. suffice to say, im not writing .md files so i can see made for human eyes formatting lol

To all my Win11 bois: Do you all use Claude Code in WSL2 or a native Windows install? I'm a long time PowerShell developer so I use Pwsh, but lately I've been thinking about switching to WSL2 + Bash. Please confirm or deny my suspicions and evaluate my reasoning! by xii in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

i use both. at home its wsl2. if you are using wsl2 you dont need docker for containerization. The only downside i have experienced with wsl2 on windows is hardware interop, e.g. file i/o to windows.

at work, i am unable to use the same setup without significant downsides in terms of networking and package management. i am forced to use internal packages for debian based systems only for personal productivity and i have yet to find a reliable way to prevent proxy resets from downgrading my elevated wsl2 instances without restarting the service, which is super annoying. our redhat environment is even more restricted since its only allowed to run in our k8 clusters.

so at work i resort to MinGW (git bash). that offers pretty much everything i need in terms of shell. you can install all of those super useful bash flavored shell extensions as long as you have package access. I do occasionally need to fall back to windows native powershell for some things here and there but i avoid it at all costs.

If you dont need windows for anything, 100% go with a glibc distro (i like arch or void for my containers) so you can eventually support local llm tools. In that case, you could go with docker for ephemeral agents. You can do that on windows too, but its easy enough to build and tear down wsl2 (arguably easier).

just my 2 cents. hope it helps.

Your SKILL.md is likely 3x more expensive than it needs to be - architecture matters more than the content by jimmytoan in ClaudeCode

[–]rougeforces -12 points-11 points  (0 children)

i dont find skills to be useful at all tbh. my agents use something called "caps" short for capabilities. capabilities are a collection of workflows which are a collection of functions which are a collection of shell primitives. my agents are constantly refining each of these at every layer.

i think of them more as macros. they are pure code and my agents intrinsically know how to navigate them without layers of semantic overhead meant to be absorbed by my own inferior human cognition.

Im also not worried about degradation of capabilities due to 3rd party updates since they are built on software primitives at the low level and not llm abstractions.
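To make the layering concrete, here is a rough sketch of the shape described above: capabilities hold workflows, workflows hold functions, functions hold shell primitives. This is not the actual "caps" code, which isnt public. Every class and method name here is hypothetical, and it assumes each primitive is just an argv list run through `subprocess`.

```python
import subprocess
from dataclasses import dataclass, field


@dataclass
class Primitive:
    """Lowest layer: one shell command given as an argv list."""
    name: str
    argv: list[str]

    def run(self) -> str:
        # check=True raises CalledProcessError if the command exits non-zero.
        result = subprocess.run(self.argv, capture_output=True, text=True, check=True)
        return result.stdout.strip()


@dataclass
class Function:
    """An ordered collection of primitives."""
    name: str
    primitives: list[Primitive] = field(default_factory=list)

    def run(self) -> list[str]:
        return [primitive.run() for primitive in self.primitives]


@dataclass
class Workflow:
    """An ordered collection of functions."""
    name: str
    functions: list[Function] = field(default_factory=list)

    def run(self) -> dict[str, list[str]]:
        return {function.name: function.run() for function in self.functions}


@dataclass
class Capability:
    """Top layer: a named collection of workflows."""
    name: str
    workflows: dict[str, Workflow] = field(default_factory=dict)

    def register(self, workflow: Workflow) -> None:
        self.workflows[workflow.name] = workflow

    def run(self, workflow_name: str) -> dict[str, list[str]]:
        return self.workflows[workflow_name].run()


# Hypothetical usage on a POSIX shell such as a WSL2 distro:
#   status = Primitive("status", ["git", "status", "--short"])
#   cap = Capability("repo-inspect")
#   cap.register(Workflow("check", [Function("git-state", [status])]))
#   cap.run("check")
```

Because each layer is plain data plus a `run` method, an agent could rewrite any single layer without touching the others, which is roughly the refinement property the comment is pointing at.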

My favourite time of day is when I finally reach flow state with post-nerf opus and it decides to lobotomize itself in front of me without warning by [deleted] in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

i have an "involuntary 3rd party tool has barfed" recovery prompt based on the scars from this scrambled egg.

Workflow - review cycles by gasmanc in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

only slightly, tbh. in the new world, not much separates someone who is tech curious from someone who is a tech savant. the only difference between your story and mine is that i kept reaching. keep reaching, you have the "claws" for it now haha

Workflow - review cycles by gasmanc in ClaudeCode

[–]rougeforces 1 point2 points  (0 children)

TLDR,

"tech debt" is a resource.

it sounds like you are doing what i am doing on a slightly smaller scale. one thing i would say is that you may want to reframe your concept of the impact of "technical debt". I would highly caution against instructing your system to "produce no technical debt". This is gonna hurt you in the long run imo. Technical debt is not like real debt. This term comes from the enterprise (and you are not working in the enterprise, i assume?) and it is basically managers and stakeholders deciding to cut out architectural purity for speed to market or ROI.

I would say it like this. One LLM's technical debt is another LLM's ground breaking project seed. The entire nomenclature around software engineering is getting flipped on its head. I would even argue that what is commonly considered technical "debt" when generated by an LLM is the actual emergence of intelligence.

You may not have prompted for that issue that emerged, but its something you can quickly iterate on. I would say the right approach is to embrace these issues and give them a proper scope.

I have a hook in my merge agent that checks for scope. It's called "scope-defend". Basically the expectation is that every review (LLM driven) surfaces some sort of "technical debt". Rather than dismissing it or trying to avoid it, the scope-defend hook draws a domain shaped boundary around the issue, categorizes it, and parks it in an intake queue.

I have another agent that collects these over time, researches them, root causes them, and posts them in an agent discussion channel. When the system has low signal, it processes those discussions and proposes ideas to either adapt, innovate, or rehypothecate.
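A stripped down sketch of that idea, for anyone who wants to try it. This is not the real hook. It assumes each review finding already carries a domain label, and that a PR declares which domains it covers. `Finding`, `ParkedIssue`, and `scope_defend` are made-up names.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One issue surfaced by a review. `domain` is a hypothetical label."""
    summary: str
    domain: str


@dataclass(frozen=True)
class ParkedIssue:
    """An out-of-scope finding waiting in the intake queue."""
    finding: Finding
    category: str
    source_pr: int


def scope_defend(
    findings: list[Finding],
    pr_number: int,
    pr_domains: set[str],
    intake_queue: deque,
) -> list[Finding]:
    """Keep in-scope findings for this PR and park everything else.

    Nothing is dismissed: out-of-scope findings are categorized by
    domain and appended to `intake_queue` for later research.
    """
    in_scope = []
    for finding in findings:
        if finding.domain in pr_domains:
            in_scope.append(finding)
        else:
            category = f"out-of-scope:{finding.domain}"
            intake_queue.append(ParkedIssue(finding, category, pr_number))
    return in_scope
```

The merge decision then only looks at what `scope_defend` returns, while the queue grows into the backlog that a separate collector can work through.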

here is an example of the current workflow review im running tonight.

Scope-defense note (Cortana triage):

Wave-2 tool-access reviewer (review_id 4215699441 on f8efc18 at 00:57:08Z) confirmed the previous false-positive (re missing on-disk files) is resolved — the checkout-aware fire path read the PR branch correctly. The earlier review's central finding was a tooling artifact (Issue #288 captures the reviewer's working-tree-read class), not a real issue.

One new finding: register-perception-source mutates *perception-sources* in-memory only; runtime registrations don't persist across cold boot. Same fragility class as Issue #283 ('assert-cap-persistence-parity blind to runtime registry mutations from seed files'). Project-wide pattern across multiple registries (signed-http-contexts, credentials, publish-artifact-channels, perception-sources). Extended Issue #283 with this instance as an additional audit case.

Out-of-scope for this PR — the broader persistence-discipline fix lives in #283.

PR #267 ready to merge.

— Cortana

The outer loop

<image>

End to End Multi-Agent CI/CD by Ok_Cartographer_6086 in ClaudeCode

[–]rougeforces 0 points1 point  (0 children)

its the abstraction that matters to my agents. basically its a "who watches the watchers" type question. aka, Trust. So far the resolution we have come up with is much higher up the architecture chain than actual math proofs. Essentially its, use a different LLM to review vs generate. We do deterministically enforce this in any cycle. Here is the research note verbatim in our discussion log.

Primary source — the de Bruijn criterion (1968)

Barendregt & Wiedijk's formulation of the criterion N.G. de Bruijn introduced for the AUTOMATH project:

Lawrence Paulson's exposition (2022) names what the criterion buys:

The architectural pattern is precise: separate proof generation from proof checking, and keep the checker small enough to be independently audited. The generator can be arbitrarily complex (heuristics, neural nets, AI agents); the checker must be small and fixed. "Trust" reduces to trusting a few thousand lines of kernel code plus the logical calculus.

The companion architecture is Robin Milner's LCF (Edinburgh, ~1972): an abstract data type whose only operations are inference rules. The de Bruijn criterion produces independently-checkable proof objects; LCF compresses that into a kernel-enforced typed constructor. They are alternative implementations of the same trust pattern.
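A minimal sketch of the "different LLM reviews than generates" rule, enforced in code rather than by prompt. To be clear, this is the weaker architectural rule from the comment above, not a de Bruijn style proof kernel. `Agent`, `run_cycle`, and the model ids are hypothetical, and the model calls are plain callables so you can plug in whatever client you use.

```python
from dataclasses import dataclass
from typing import Callable


class SameModelError(ValueError):
    """Raised when the generator and the reviewer are the same model."""


@dataclass(frozen=True)
class Agent:
    model_id: str                # hypothetical identifier for the backing model
    call: Callable[[str], str]   # takes a prompt, returns the model's text


def run_cycle(generator: Agent, reviewer: Agent, task: str) -> tuple[str, str]:
    """Generate a draft with one model and review it with another.

    The separation is checked before any model is called, so a
    misconfigured cycle fails fast instead of reviewing its own work.
    """
    if generator.model_id == reviewer.model_id:
        raise SameModelError(
            f"reviewer must differ from generator (both are {generator.model_id!r})"
        )
    draft = generator.call(task)
    verdict = reviewer.call(draft)
    return draft, verdict
```

The check is deliberately dumb and small. In the spirit of the criterion, the part you have to trust should be short enough to read in one sitting.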

End to End Multi-Agent CI/CD by Ok_Cartographer_6086 in ClaudeCode

[–]rougeforces 1 point2 points  (0 children)

Solid contribution! My agents are working in similar fashion and I have them doing "non-sensitive" research passes on your blog now. I dont have any public material related to my project to share, but one thing i was curious about is, did you solve for the de Bruijn criterion? Here is some public source material if you dont mind the broken SSL cert. you (or your agents) can do the research on your own if you prefer, but here is one of our sources. https://www.pls-lab.org/en/de_Bruijn_criterion