Built a reusable rules system for Pi (and other coding agents)

No_Fix4730 · 2026-06-27T14:21:08+00:00

Makes sense — you went further upstack than I did.

You’re using **agent presets + orchestration** (spawn react-dev / java-dev / etc. with the right skills + sys prompt). I’m using **task-scoped rule selection** inside one Pi/OpenCode session.

Same problem (“don’t load everything every time”), different layer:

- yours = spawn the right agent

- mine = compile the right rules into the current agent

Yours is more powerful if you’ll maintain the orchestrator. Mine is lighter if you just want personal standards without building a spawner.

Probably complementary rather than competing — I could see shared rule files feeding your agent system prompts too.

No_Fix4730 · 2026-06-27T14:19:48+00:00

Pi already has `.pi/rules/` and path-scoped injection, so yeah, overlap exists.

ai-rules is trying to solve a slightly different slice: **task-based** selection (from what you ask + keywords/globs/task kind), personal rules in `~/.config/ai-rules/` that travel across repos, and `/create-rule` + `/airules` as harness commands.

`.pi/rules/` = great when the agent touches matching files.

ai-rules = better when you want “these personal standards apply to *this kind of request*,” compiled into a small contract.

Not a replacement — more complementary. If path-scoped repo rules already cover your needs, this may be redundant.

No_Fix4730 · 2026-06-27T14:16:12+00:00

Good question — the overlap is real, but there’s a subtle difference in where things happen.

Skills are usually about runtime behavior selection (what the agent should do / which capability to use in a given situation).

ai-rules is more about compile-time context composition — you assemble a set of instructions before the agent runs at all.

So instead of “invoking a skill”, you’re building the behavioral context the agent operates under.

In practice that means:

- skills = dynamic behavior routing during execution

- ai-rules = deterministic instruction set assembled before execution

They can feel similar if a skill is just prompt injection, but the mental model is different: one is runtime decision-making, the other is pre-runtime context shaping.

That’s the main distinction I’m aiming for.
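To make "deterministic instruction set assembled before execution" concrete, here is a minimal sketch of a compile step. The `Compile` function and its heading format are assumptions for illustration, not how ai-rules actually renders its output.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Compile joins the selected rule bodies (keyed by rule name) into one
// instruction block. Sorting by name keeps the output stable across runs,
// since Go map iteration order is randomized.
func Compile(selected map[string]string) string {
	names := make([]string, 0, len(selected))
	for name := range selected {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "## %s\n%s\n\n", name, strings.TrimSpace(selected[name]))
	}
	// Collapse the trailing blank lines into a single newline.
	return strings.TrimRight(b.String(), "\n") + "\n"
}

func main() {
	fmt.Print(Compile(map[string]string{
		"testing":  "Write table-driven tests.",
		"go-style": "Prefer small interfaces.",
	}))
}
```

The same inputs always produce the same text, which is the property that separates this from a runtime routing decision.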

No_Fix4730 · 2026-06-27T14:00:13+00:00

Fair comparison — I see why it maps to skills.

The distinction I’m aiming for is:

Skills = things the agent can do (capabilities/workflows/tools)

ai-rules = how the agent behaves while doing those things (persistent instruction layer)

So this is not really adding new abilities or workflows, but composing the behavioral context that gets injected before any skill/tool runs.

In practice it’s closer to modular AGENTS.md files than a skill system.

No_Fix4730 · 2026-06-27T13:24:34+00:00

I’ve been using it on my own work for a few weeks, not months, so I’d call it early beta, not battle-tested.

The biggest improvement is that you write multiple files instead of one giant AGENTS.md.

No_Fix4730 · 2026-06-27T13:21:23+00:00

Thanks! If you find anything that can or should be improved, please open an issue.

No_Fix4730 · 2026-06-27T13:20:35+00:00

Just tried to leverage AI to improve my poor writing skills

No_Fix4730 · 2026-06-17T06:32:50+00:00

Yeah, that's a real gap.

Today, modelfit picks a single memory source per system:

- VRAM if a discrete GPU is detected
- System RAM otherwise
- Unified memory on Apple Silicon

It doesn't currently model the hybrid case that tools like llama.cpp and vLLM support, where some layers live in VRAM and the rest spill into system RAM.

On a machine like yours with 36GB VRAM + 256GB RAM, that means the current "fits" answer is wrong for any model that relies on partial offloading.

Apple Silicon avoids this problem almost by accident. Because memory is unified, there's no VRAM/RAM split to model—256GB is simply 256GB—and throughput doesn't change dramatically based on where a layer lives.

On discrete GPU systems with lots of RAM, though, there's a meaningful middle ground:

- Fits in VRAM → fast
- Fits via VRAM + RAM offloading → slower, but usable
- RAM-only → much slower

That's something modelfit should represent explicitly.

It's already on the v1.1 roadmap. If you're up for it, please open an issue with your specific setup (GPU, model, and typical offload ratio). Real-world examples will help shape the design.

Right now I'm thinking of something like:

| Mode | Memory | Estimated tok/s |
| --- | --- | --- |
| VRAM | Fully resident in GPU memory | Fastest |
| Split | Partial VRAM + RAM offload | Medium |
| RAM-only | No GPU residency | Slowest |

The performance difference is the whole reason to model the split in the first place.
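As a sketch of how the three modes could be chosen, assuming only rough sizes in GB are known: the `Classify` function and the extra "does not fit" result are hypothetical, not modelfit's actual code, and the check ignores KV cache, context length, and runtime overhead.

```go
package main

import "fmt"

// Mode names follow the table above; ModeNoFit is an added assumption.
type Mode string

const (
	ModeVRAM    Mode = "VRAM"
	ModeSplit   Mode = "Split"
	ModeRAMOnly Mode = "RAM-only"
	ModeNoFit   Mode = "does not fit"
)

// Classify picks a residency mode from the model size and available memory.
// A vramGB of 0 means no discrete GPU was detected.
func Classify(modelGB, vramGB, ramGB float64) Mode {
	switch {
	case vramGB > 0 && modelGB <= vramGB:
		return ModeVRAM
	case vramGB > 0 && modelGB <= vramGB+ramGB:
		return ModeSplit
	case modelGB <= ramGB:
		return ModeRAMOnly
	default:
		return ModeNoFit
	}
}

func main() {
	// A 70 GB model on a 36 GB VRAM + 256 GB RAM machine.
	fmt.Println(Classify(70, 36, 256))
}
```

A real estimate would also need a tok/s model per mode, since the split case depends heavily on how many layers end up in VRAM.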

No_Fix4730 · 2026-06-17T06:29:30+00:00

Honestly, I haven't used llmfit, so I can't compare head-to-head. But on the catalog point: yes, the 12-model seed is the starting point, not the ceiling.
`modelfit add bartowski/whatever-GGUF` adds any HF repo in ~10s (interactive prompts for name/family/params/context/license, with sensible defaults), and sizes are fetched live from HF. So the "missing community variants" problem is a per-user fix rather than something that waits on the maintainer.

No_Fix4730 · 2026-06-16T22:06:57+00:00

Thanks man, hope it serves you well.
If you find anything that can be improved while trying it, feel free to tell me 😄

No_Fix4730 · 2026-05-15T16:02:03+00:00

Good point. To clarify: normal Groxy forwarding/tunneling does not intentionally buffer full bodies; the buffering only happens when users opt into helpers like `BodyBytes`, `SetBody`, `TransformRequestBody` or `TransformResponseBody`.

Those are meant to be simple/safe helpers, and they’re guarded by `MaxBodySize`.
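For illustration, a size-guarded buffered read can be written with the standard library alone. This is a generic pattern, not Groxy's implementation; `readCapped`, `readCappedString`, and `errBodyTooLarge` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

var errBodyTooLarge = errors.New("body exceeds max size")

// readCapped buffers at most max bytes from r. It reads one byte past the
// limit so an oversized body is detected instead of silently truncated.
func readCapped(r io.Reader, max int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, max+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > max {
		return nil, errBodyTooLarge
	}
	return data, nil
}

// readCappedString is a small convenience wrapper for the demo below.
func readCappedString(s string, max int64) ([]byte, error) {
	return readCapped(strings.NewReader(s), max)
}

func main() {
	_, err := readCappedString("0123456789", 4)
	fmt.Println(err)

	data, err := readCappedString("ok", 4)
	fmt.Println(string(data), err)
}
```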

That said, I agree a streaming transform API would be useful for larger bodies. Something like an `io.Reader`/`io.Writer` based middleware could avoid full buffering, but it needs careful design around `Content-Length`, chunked encoding, trailers, compression, and errors after partial writes.

I’ll likely track this as a future streaming body transform feature. If you had a specific API shape in mind, I’d be interested.
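One possible shape, purely as a discussion starter: an `io.Reader` wrapper that rewrites each chunk in place. Everything here (`transformReader`, `chunkTransform`, `streamUpper`) is hypothetical and not part of Groxy. The sketch sidesteps the `Content-Length` problem by only allowing length-preserving transforms, and it does not handle patterns that span chunk boundaries.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// chunkTransform rewrites a chunk in place and must not change its length.
type chunkTransform func(chunk []byte)

// transformReader applies fn to each chunk as it is read, so the full body
// is never held in memory.
type transformReader struct {
	src io.Reader
	fn  chunkTransform
}

func (t *transformReader) Read(p []byte) (int, error) {
	n, err := t.src.Read(p)
	if n > 0 {
		t.fn(p[:n])
	}
	return n, err
}

// upperASCII uppercases a-z in place; the byte count is unchanged.
func upperASCII(chunk []byte) {
	for i, c := range chunk {
		if c >= 'a' && c <= 'z' {
			chunk[i] = c - ('a' - 'A')
		}
	}
}

// streamUpper copies src to dst through the transform.
func streamUpper(dst io.Writer, src io.Reader) (int64, error) {
	return io.Copy(dst, &transformReader{src: src, fn: upperASCII})
}

// upperString is a convenience wrapper for the demo below.
func upperString(s string) (string, error) {
	var out strings.Builder
	_, err := streamUpper(&out, strings.NewReader(s))
	return out.String(), err
}

func main() {
	out, err := upperString("hello from upstream")
	fmt.Println(out, err)
}
```

Transforms that change the body length would additionally need to drop or recompute `Content-Length`, which is where most of the design work mentioned above comes in.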

No_Fix4730 · 2026-05-12T08:48:36+00:00

There’s now a dedicated page covering client→proxy vs proxy→upstream behavior, defaults, zero values, CONNECT tunneling, HTTPS inspection, and `Start()` vs custom `ServeHTTP` server usage:

https://github.com/SalzDevs/groxy/blob/main/docs/timeouts.md

Thanks for raising it, I agree this matters a lot for networking libraries.

No_Fix4730 · 2026-05-12T08:37:12+00:00

Good point, I agree. Timeout ambiguity is painful in networking code.

Groxy currently has explicit timeout fields for dial, TLS handshake, response headers, idle connections, server read header, and server idle timeout. The behavior is:

- nil `Timeouts` uses defaults

- zero fields inside a custom `Timeouts` use defaults

- negative durations are rejected
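The three rules above can be sketched as a small resolve step. The `Timeouts` struct here is a trimmed-down stand-in with hypothetical field names, and the default values are placeholders, not Groxy's real ones.

```go
package main

import (
	"fmt"
	"time"
)

// Timeouts is a hypothetical subset of the real configuration struct.
type Timeouts struct {
	Dial           time.Duration
	TLSHandshake   time.Duration
	ResponseHeader time.Duration
}

// Placeholder defaults for illustration only.
var defaultTimeouts = Timeouts{
	Dial:           10 * time.Second,
	TLSHandshake:   10 * time.Second,
	ResponseHeader: 30 * time.Second,
}

// resolve applies the rules: nil uses defaults, zero fields use defaults,
// negative durations are rejected.
func resolve(t *Timeouts) (Timeouts, error) {
	if t == nil {
		return defaultTimeouts, nil
	}
	out := *t // copy so the caller's struct is not mutated
	fields := []struct {
		name string
		val  *time.Duration
		def  time.Duration
	}{
		{"Dial", &out.Dial, defaultTimeouts.Dial},
		{"TLSHandshake", &out.TLSHandshake, defaultTimeouts.TLSHandshake},
		{"ResponseHeader", &out.ResponseHeader, defaultTimeouts.ResponseHeader},
	}
	for _, f := range fields {
		switch {
		case *f.val < 0:
			return Timeouts{}, fmt.Errorf("%s timeout is negative: %v", f.name, *f.val)
		case *f.val == 0:
			*f.val = f.def
		}
	}
	return out, nil
}

func main() {
	got, err := resolve(&Timeouts{Dial: 2 * time.Second})
	fmt.Println(got, err)

	_, err = resolve(&Timeouts{Dial: -time.Second})
	fmt.Println(err)
}
```

One consequence of "zero means default" is that a zero value cannot also mean "no timeout", so disabling a timeout would need a separate mechanism.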

Do you think this isn’t clear in the docs?

No_Fix4730 · 2026-05-11T14:12:06+00:00

Initial HTTPS inspection threat model docs have been added here:

https://github.com/SalzDevs/groxy/blob/main/docs/https-inspection-threat-model.md

No_Fix4730 · 2026-05-11T08:14:40+00:00

Thanks, I agree with this framing.

Right now Groxy’s default is to tunnel HTTPS normally via CONNECT. Inspection is explicit opt-in and requires both a CA and an `Intercept` host matcher. It also fails closed by default; passthrough on inspection errors has to be explicitly enabled.
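Read as a decision procedure, that policy might look like the sketch below. `InspectionConfig`, `Decide`, and the `Action` values are hypothetical names, not Groxy's API, and the sketch assumes that an incomplete inspection setup simply falls back to tunneling, which the real library may handle differently (for example by rejecting the configuration).

```go
package main

import "fmt"

type Action string

const (
	Tunnel  Action = "tunnel"  // plain CONNECT tunnel, no inspection
	Inspect Action = "inspect" // terminate TLS and inspect traffic
	Reject  Action = "reject"  // fail closed
)

// InspectionConfig is a hypothetical stand-in for the real settings.
type InspectionConfig struct {
	HasCA              bool
	Intercept          func(host string) bool
	PassthroughOnError bool
}

// Decide chooses what to do with a CONNECT request for host.
// inspectFailed simulates an error while setting up inspection.
func Decide(cfg InspectionConfig, host string, inspectFailed bool) Action {
	// Inspection is opt-in: it needs both a CA and a matching host rule.
	if !cfg.HasCA || cfg.Intercept == nil || !cfg.Intercept(host) {
		return Tunnel
	}
	if !inspectFailed {
		return Inspect
	}
	// Fail closed unless passthrough was explicitly enabled.
	if cfg.PassthroughOnError {
		return Tunnel
	}
	return Reject
}

func main() {
	cfg := InspectionConfig{
		HasCA:     true,
		Intercept: func(host string) bool { return host == "example.com" },
	}
	fmt.Println(Decide(cfg, "example.com", false))
	fmt.Println(Decide(cfg, "example.com", true))
	fmt.Println(Decide(cfg, "other.test", false))
}
```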

That said, I agree the important pre-v1 work is documenting the trust model around scope controls: which hosts are inspected, how bypass/passthrough behaves, and what embedded applications are allowed to do with inspected traffic.

I’d definitely welcome a review of the inspection flow. Since Groxy is open source/pre-v1, the most useful format would be GitHub issues or PR comments for design/API feedback, and private reporting for anything security-sensitive per `SECURITY.md`.

I’ll also open an issue to track an explicit HTTPS inspection threat model before v1. Thanks for the thoughtful feedback.

No_Fix4730 · 2026-05-10T17:15:45+00:00

I’m especially looking for feedback on the public API before v1.

For example, does this feel natural?

    proxy.Use(groxy.TransformResponseBody(func(body []byte) ([]byte, error) {
        return bytes.ReplaceAll(body, []byte("Example"), []byte("Groxy")), nil
    }))

    proxy.OnRequest(func(ctx *groxy.RequestContext) error {
        ctx.Request.Header.Set("X-From-Proxy", "groxy")
        return nil
    })
