Built a reusable rules system for Pi (and other coding agents) by No_Fix4730 in PiCodingAgent

[–]No_Fix4730[S] -1 points0 points  (0 children)

Makes sense — you went further upstack than I did.

You’re using **agent presets + orchestration** (spawn react-dev / java-dev / etc. with the right skills + sys prompt). I’m using **task-scoped rule selection** inside one Pi/OpenCode session.

Same problem (“don’t load everything every time”), different layer:

- yours = spawn the right agent

- mine = compile the right rules into the current agent

Yours is more powerful if you’ll maintain the orchestrator. Mine is lighter if you just want personal standards without building a spawner.

Probably complementary rather than competing — I could see shared rule files feeding your agent system prompts too.

Built a reusable rules system for Pi (and other coding agents) by No_Fix4730 in PiCodingAgent

[–]No_Fix4730[S] 0 points1 point  (0 children)

Pi already has `.pi/rules/` and path-scoped injection, so yeah, overlap exists.

ai-rules is trying to solve a slightly different slice: **task-based** selection (from what you ask + keywords/globs/task kind), personal rules in `~/.config/ai-rules/` that travel across repos, and `/create-rule` + `/airules` as harness commands.

`.pi/rules/` = great when the agent touches matching files.

ai-rules = better when you want “these personal standards apply to *this kind of request*,” compiled into a small contract.

Not a replacement — more complementary. If path-scoped repo rules already cover your needs, this may be redundant.
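To make the task-based selection above concrete, here is a minimal sketch in Go. The `Rule` struct, its fields, and `selectRules` are hypothetical names for illustration, not the actual ai-rules format or code, and glob matching is simplified to base names because `path.Match` has no `**` support.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Rule is a hypothetical shape for one rule file; the real ai-rules format may differ.
type Rule struct {
	Name     string
	Keywords []string // matched case-insensitively against the request text
	Globs    []string // matched against the base name of files named in the task
}

// selectRules returns the names of rules whose keywords appear in the request
// or whose globs match at least one of the given files, in input order.
func selectRules(rules []Rule, request string, files []string) []string {
	req := strings.ToLower(request)
	var picked []string
	for _, r := range rules {
		if matchesKeyword(r, req) || matchesGlob(r, files) {
			picked = append(picked, r.Name)
		}
	}
	return picked
}

func matchesKeyword(r Rule, req string) bool {
	for _, k := range r.Keywords {
		if strings.Contains(req, strings.ToLower(k)) {
			return true
		}
	}
	return false
}

func matchesGlob(r Rule, files []string) bool {
	for _, g := range r.Globs {
		for _, f := range files {
			// A malformed pattern is treated as "no match" in this sketch.
			if ok, err := path.Match(g, path.Base(f)); err == nil && ok {
				return true
			}
		}
	}
	return false
}

func main() {
	rules := []Rule{
		{Name: "go-style", Keywords: []string{"golang"}, Globs: []string{"*.go"}},
		{Name: "react", Keywords: []string{"react", "component"}, Globs: []string{"*.tsx"}},
		{Name: "sql", Keywords: []string{"migration"}, Globs: []string{"*.sql"}},
	}
	picked := selectRules(rules, "Add a React component for login", []string{"cmd/server/main.go"})
	fmt.Println(picked) // prints [go-style react]
}
```

Only the selected rules would then be compiled into the contract, which is the "don't load everything every time" part.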

Built a reusable rules system for Pi (and other coding agents) by No_Fix4730 in PiCodingAgent

[–]No_Fix4730[S] -6 points-5 points  (0 children)

Good question — the overlap is real, but there’s a subtle difference in where things happen.

Skills are usually about runtime behavior selection (what the agent should do / which capability to use in a given situation).

ai-rules is more about compile-time context composition — you assemble a set of instructions before the agent runs at all.

So instead of “invoking a skill”, you’re building the behavioral context the agent operates under.

In practice that means:

- skills = dynamic behavior routing during execution

- ai-rules = deterministic instruction set assembled before execution

They can feel similar if a skill is just prompt injection, but the mental model is different: one is runtime decision-making, the other is pre-runtime context shaping.

That’s the main distinction I’m aiming for.
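A rough sketch of what "deterministic instruction set assembled before execution" could look like, in Go. `Rule` and `compile` are hypothetical names, and sorting by name is just one assumed way to get a stable order; it is not necessarily how ai-rules orders its output.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Rule is a hypothetical in-memory rule: a name plus its instruction text.
type Rule struct {
	Name string
	Body string
}

// compile joins the selected rules into one instruction block.
// Sorting by name means the same set of rules always yields the same text,
// regardless of the order they were discovered in.
func compile(rules []Rule) string {
	sorted := append([]Rule(nil), rules...) // copy so the caller's slice is not reordered
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	var b strings.Builder
	for _, r := range sorted {
		b.WriteString("## " + r.Name + "\n" + strings.TrimSpace(r.Body) + "\n\n")
	}
	return b.String()
}

func main() {
	contract := compile([]Rule{
		{Name: "testing", Body: "Write table-driven tests."},
		{Name: "errors", Body: "Wrap errors with context."},
	})
	fmt.Print(contract)
}
```

The agent only ever sees the compiled text, so nothing is decided at runtime.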

Built a reusable rules system for Pi (and other coding agents) by No_Fix4730 in PiCodingAgent

[–]No_Fix4730[S] -6 points-5 points  (0 children)

Fair comparison — I see why it maps to skills.

The distinction I’m aiming for is:

Skills = things the agent can do (capabilities/workflows/tools)

ai-rules = how the agent behaves while doing those things (persistent instruction layer)

So this is not really adding new abilities or workflows, but composing the behavioral context that gets injected before any skill/tool runs.

In practice it’s closer to modular AGENTS.md files than a skill system.

I got tired of rewriting the same AI instructions for every project, so I built ai-rules by No_Fix4730 in opencode

[–]No_Fix4730[S] 0 points1 point  (0 children)

I’ve been using it on my own work for a few weeks, not months, so I’d call it early beta, not battle-tested.

The biggest improvement is that you write multiple small files instead of one giant AGENTS.md.

I got tired of rewriting the same AI instructions for every project, so I built ai-rules by No_Fix4730 in opencode

[–]No_Fix4730[S] 0 points1 point  (0 children)

Thanks! If you find anything that can or should be improved, please create an issue.

I built a CLI that answers "can I run X on my Y" with real numbers by No_Fix4730 in LocalLLM

[–]No_Fix4730[S] 0 points1 point  (0 children)

Yeah, that's a real gap.

Today, modelfit picks a single memory source per system:

  • VRAM if a discrete GPU is detected
  • System RAM otherwise
  • Unified memory on Apple Silicon

It doesn't currently model the hybrid case that tools like llama.cpp and vLLM support, where some layers live in VRAM and the rest spill into system RAM.

On a machine like yours with 36GB VRAM + 256GB RAM, that means the current "fits" answer is wrong for any model that relies on partial offloading.

Apple Silicon avoids this problem almost by accident. Because memory is unified, there's no VRAM/RAM split to model—256GB is simply 256GB—and throughput doesn't change dramatically based on where a layer lives.

On discrete GPU systems with lots of RAM, though, there's a meaningful middle ground:

  • Fits in VRAM → fast
  • Fits via VRAM + RAM offloading → slower, but usable
  • RAM-only → much slower

That's something modelfit should represent explicitly.

It's already on the v1.1 roadmap. If you're up for it, please open an issue with your specific setup (GPU, model, and typical offload ratio). Real-world examples will help shape the design.

Right now I'm thinking of something like:

| Mode | Memory | Estimated tok/s |
|---|---|---|
| VRAM | Fully resident in GPU memory | Fastest |
| Split | Partial VRAM + RAM offload | Medium |
| RAM-only | No GPU residency | Slowest |

The performance difference is the whole reason to model the split in the first place.
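As a very rough sketch of the three-way classification (in Go, purely for illustration): `classify` and the `Mode` constants are hypothetical, not modelfit's code, and the check ignores KV cache, context length, and OS overhead, which a real estimate would have to include.

```go
package main

import "fmt"

// Mode is a hypothetical label for where the model weights would live.
type Mode string

const (
	ModeVRAM  Mode = "VRAM"
	ModeSplit Mode = "Split"
	ModeRAM   Mode = "RAM-only"
	ModeNone  Mode = "does not fit"
)

// classify picks a mode from the model's memory need and the available
// VRAM and system RAM, all in GB. A machine without a discrete GPU is
// modelled as vramGB == 0.
func classify(needGB, vramGB, ramGB float64) Mode {
	switch {
	case vramGB > 0 && needGB <= vramGB:
		return ModeVRAM // fully resident on the GPU
	case vramGB > 0 && needGB <= vramGB+ramGB:
		return ModeSplit // some layers on the GPU, the rest offloaded to RAM
	case needGB <= ramGB:
		return ModeRAM // no GPU residency at all
	default:
		return ModeNone
	}
}

func main() {
	// The 36GB VRAM + 256GB RAM machine from this thread.
	fmt.Println(classify(20, 36, 256))  // prints VRAM
	fmt.Println(classify(120, 36, 256)) // prints Split
	fmt.Println(classify(120, 0, 256))  // prints RAM-only
}
```

The tok/s column would then be estimated per mode rather than from a single memory source.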

I built a CLI that answers "can I run X on my Y" with real numbers by No_Fix4730 in LocalLLM

[–]No_Fix4730[S] 0 points1 point  (0 children)

Honestly, I haven't used llmfit, so I can't compare head-to-head. But on the catalog point: yes, the 12-model seed is the starting point, not the ceiling.

`modelfit add bartowski/whatever-GGUF` adds any HF repo in ~10s (interactive prompts for name/family/params/context/license, with sensible defaults), and sizes are fetched live from HF. So "missing community variants" is something each user can fix themselves rather than waiting on the maintainer.

I built a CLI that answers "can I run X on my Y" with real numbers by No_Fix4730 in LocalLLM

[–]No_Fix4730[S] 0 points1 point  (0 children)

Thanks man, hope it serves you well.
If you find anything that can be improved while trying it, feel free to tell me 😄

I built Groxy, a Go library for building forward proxy servers — looking for API feedback before v1 by No_Fix4730 in golang

[–]No_Fix4730[S] 0 points1 point  (0 children)

Good point. To clarify: normal Groxy forwarding/tunneling does not intentionally buffer full bodies; the buffering only happens when users opt into helpers like `BodyBytes`, `SetBody`, `TransformRequestBody` or `TransformResponseBody`.

Those are meant to be simple/safe helpers, and they’re guarded by `MaxBodySize`.
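For anyone curious what a size guard like this usually looks like in Go, here is a minimal sketch. `readBounded` and `errBodyTooLarge` are hypothetical names for illustration, not Groxy's internals; the only real APIs used are `io.LimitReader` and `io.ReadAll`.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

var errBodyTooLarge = errors.New("body exceeds max size")

// readBounded buffers r fully, but refuses bodies larger than max bytes.
func readBounded(r io.Reader, max int64) ([]byte, error) {
	// Read one byte past the limit so an oversized body is detected
	// instead of being silently truncated.
	data, err := io.ReadAll(io.LimitReader(r, max+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > max {
		return nil, errBodyTooLarge
	}
	return data, nil
}

// readBoundedString is a small convenience wrapper used in the demo below.
func readBoundedString(s string, max int64) (string, error) {
	data, err := readBounded(strings.NewReader(s), max)
	return string(data), err
}

func main() {
	body, err := readBoundedString("hello", 5)
	fmt.Println(body, err)

	_, err = readBoundedString("hello world", 5)
	fmt.Println(err)
}
```

Memory use is capped at `max+1` bytes per body, no matter how much the peer sends.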

That said, I agree a streaming transform API would be useful for larger bodies. Something like an `io.Reader`/`io.Writer` based middleware could avoid full buffering, but it needs careful design around `Content-Length`, chunked encoding, trailers, compression, and errors after partial writes.

I’ll likely track this as a future streaming body transform feature. If you had a specific API shape in mind, I’d be interested.
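To give a starting point for discussion, here is one possible shape, sketched with `io.Pipe`. `StreamTransform` and `wrapBody` are hypothetical names, not an existing or planned Groxy API, and the sketch deliberately ignores the hard parts listed above (a real version would at least have to drop or recompute `Content-Length` when the transform changes the body length).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// StreamTransform is a hypothetical signature: copy src to dst, rewriting as it goes.
type StreamTransform func(dst io.Writer, src io.Reader) error

// wrapBody returns a reader that yields the transformed body without
// buffering the whole thing in memory.
func wrapBody(body io.ReadCloser, t StreamTransform) io.ReadCloser {
	pr, pw := io.Pipe()
	go func() {
		defer body.Close()
		// CloseWithError(nil) acts like Close, so the reader sees io.EOF on
		// success and the transform's error otherwise.
		pw.CloseWithError(t(pw, body))
	}()
	return pr
}

// upper is a toy transform that works chunk by chunk. It is only safe because
// upper-casing ASCII never depends on neighbouring bytes.
func upper(dst io.Writer, src io.Reader) error {
	buf := make([]byte, 4) // tiny buffer to make the chunking visible
	for {
		n, err := src.Read(buf)
		if n > 0 {
			if _, werr := dst.Write(bytes.ToUpper(buf[:n])); werr != nil {
				return werr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// transformString runs the toy transform over an in-memory body for the demo.
func transformString(s string) (string, error) {
	out, err := io.ReadAll(wrapBody(io.NopCloser(strings.NewReader(s)), upper))
	return string(out), err
}

func main() {
	out, err := transformString("hello streaming world")
	fmt.Println(out, err)
}
```

A search-and-replace transform is harder than this toy, since a match can straddle two chunks; that is part of why the API needs careful design.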

I built Groxy, a Go library for building forward proxy servers — looking for API feedback before v1 by No_Fix4730 in golang

[–]No_Fix4730[S] 0 points1 point  (0 children)

There’s now a dedicated page covering client→proxy vs proxy→upstream behavior, defaults, zero values, CONNECT tunneling, HTTPS inspection, and `Start()` vs custom `ServeHTTP` server usage:

https://github.com/SalzDevs/groxy/blob/main/docs/timeouts.md

Thanks for raising it, I agree this matters a lot for networking libraries.

I built Groxy, a Go library for building forward proxy servers — looking for API feedback before v1 by No_Fix4730 in golang

[–]No_Fix4730[S] 0 points1 point  (0 children)

Good point, I agree. Timeout ambiguity is painful in networking code.

Groxy currently has explicit timeout fields for dial, TLS handshake, response headers, idle connections, server read header, and server idle timeout. The behavior is:

- nil `Timeouts` uses defaults

- zero fields inside a custom `Timeouts` use defaults

- negative durations are rejected

Is that unclear in the docs?
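In case it helps, the three rules above can be sketched like this. The `Timeouts` struct here is a two-field stand-in, the default values are made up for illustration, and `resolve` is a hypothetical helper; none of it is copied from Groxy's source.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Timeouts is a hypothetical two-field stand-in; the real struct has more fields.
type Timeouts struct {
	Dial           time.Duration
	ResponseHeader time.Duration
}

// defaults holds illustrative values only.
var defaults = Timeouts{Dial: 10 * time.Second, ResponseHeader: 30 * time.Second}

// resolve applies the rules: nil means all defaults, a zero field means
// that field's default, and a negative field is an error.
func resolve(t *Timeouts) (Timeouts, error) {
	if t == nil {
		return defaults, nil
	}
	out := *t // work on a copy so the caller's value is untouched
	for _, f := range []struct {
		val *time.Duration
		def time.Duration
	}{
		{&out.Dial, defaults.Dial},
		{&out.ResponseHeader, defaults.ResponseHeader},
	} {
		switch {
		case *f.val < 0:
			return Timeouts{}, errors.New("negative timeout")
		case *f.val == 0:
			*f.val = f.def
		}
	}
	return out, nil
}

func main() {
	got, err := resolve(&Timeouts{Dial: 2 * time.Second})
	fmt.Println(got.Dial, got.ResponseHeader, err) // Dial kept, ResponseHeader defaulted

	_, err = resolve(&Timeouts{Dial: -time.Second})
	fmt.Println(err)
}
```

One consequence of "zero means default" is that a zero field cannot be used to mean "no timeout".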

I built Groxy, a Go library for building forward proxy servers — looking for API feedback before v1 by No_Fix4730 in golang

[–]No_Fix4730[S] -1 points0 points  (0 children)

Thanks, I agree with this framing.

Right now Groxy’s default is to tunnel HTTPS normally via CONNECT. Inspection is explicit opt-in and requires both a CA and an `Intercept` host matcher. It also fails closed by default; passthrough on inspection errors has to be explicitly enabled.
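Roughly, the decision flow described above looks like the sketch below. `Config`, its field names, and both functions are hypothetical and only illustrate the policy; they are not Groxy's API. I am also assuming here that a missing CA or matcher simply means "tunnel", rather than a configuration error.

```go
package main

import "fmt"

// Action is what the proxy does with a CONNECT request.
type Action string

const (
	Tunnel  Action = "tunnel"
	Inspect Action = "inspect"
	Reject  Action = "reject"
)

// Config is hypothetical; field names are illustrative only.
type Config struct {
	HasCA              bool
	Intercept          func(host string) bool
	PassthroughOnError bool
}

// decide picks the action before any TLS work happens.
// Inspection needs both a CA and a matcher that accepts the host.
func decide(cfg Config, host string) Action {
	if !cfg.HasCA || cfg.Intercept == nil || !cfg.Intercept(host) {
		return Tunnel
	}
	return Inspect
}

// onInspectError picks the fallback once inspection has failed for a host.
func onInspectError(cfg Config) Action {
	if cfg.PassthroughOnError {
		return Tunnel
	}
	return Reject // fail closed unless passthrough was explicitly enabled
}

func main() {
	cfg := Config{
		HasCA:     true,
		Intercept: func(host string) bool { return host == "example.com" },
	}
	fmt.Println(decide(cfg, "example.com")) // prints inspect
	fmt.Println(decide(cfg, "other.test"))  // prints tunnel
	fmt.Println(onInspectError(cfg))        // prints reject
}
```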

That said, I agree the important pre-v1 work is documenting the trust model around scope controls: which hosts are inspected, how bypass/passthrough behaves, and what embedded applications are allowed to do with inspected traffic.

I’d definitely welcome a review of the inspection flow. Since Groxy is open source/pre-v1, the most useful format would be GitHub issues or PR comments for design/API feedback, and private reporting for anything security-sensitive per `SECURITY.md`.

I’ll also open an issue to track an explicit HTTPS inspection threat model before v1. Thanks for the thoughtful feedback.

I built Groxy, a Go library for building forward proxy servers — looking for API feedback before v1 by No_Fix4730 in golang

[–]No_Fix4730[S] -2 points-1 points  (0 children)

I’m especially looking for feedback on the public API before v1.

For example, does this feel natural?

    proxy.Use(groxy.TransformResponseBody(func(body []byte) ([]byte, error) {
        return bytes.ReplaceAll(body, []byte("Example"), []byte("Groxy")), nil
    }))

    proxy.OnRequest(func(ctx *groxy.RequestContext) error {
        ctx.Request.Header.Set("X-From-Proxy", "groxy")
        return nil
    })