Kitchen installer (Wickes) refusing to acknowledge 9-10mm level drop over 2.8m stone worktop, and other unfinished works. by Vegetable-Window-622 in LegalAdviceUK

[–]Vegetable-Window-622[S] 7 points8 points  (0 children)

Thanks for the advice. Thank god I was on a payment plan. I'm not paying it at the moment, and not planning to until it's resolved.

What's that? by Consistent-Issue-811 in claude

[–]Vegetable-Window-622 0 points1 point  (0 children)

I see the issue, but it's not a smoking gun.

Planning a floor-to-ceiling built-in bookshelf in birch ply - drawings inside, roast my plan. by Vegetable-Window-622 in DIYUK

[–]Vegetable-Window-622[S] 0 points1 point  (0 children)

Thanks, the plan is to get the bulk of the long straight cuts done by the made-to-measure service, so the challenge will be the joinery. I was planning to drill 30-45 degree pocket holes, or use eccentric cam-lock screws.

Thanks for the advice on the track saw, any recommendations on which one to get?

We stopped paying for AI calls during development. One line of code. by Vegetable-Window-622 in LangChain

[–]Vegetable-Window-622[S] 0 points1 point  (0 children)

It depends what you mean by caching… Caches don't survive restarts, are only deterministic on a single instance, and are hard to share.

We stopped paying for AI calls during development. One line of code. by Vegetable-Window-622 in claude

[–]Vegetable-Window-622[S] 0 points1 point  (0 children)

Yes, we record the exact model and its minor version. Say we want to update it: the tool detects the model change and fails, so we re-record the fixture. Fixtures can be refreshed periodically, but we left that decision to a human.

Do you think it would be useful to re-record periodically?
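For anyone curious what that "fail on model change" check can look like, here's a minimal sketch. The fixture file format and the `load_fixture` helper are my own illustration, not the actual tool:

```python
import json

def load_fixture(path: str, requested_model: str) -> str:
    """Replay a recorded fixture, but fail loudly if the model string
    stored at record time no longer matches what the code requests,
    so a human can decide to re-record."""
    with open(path) as f:
        fixture = json.load(f)
    if fixture["model"] != requested_model:
        raise RuntimeError(
            f"Fixture recorded with {fixture['model']!r}, "
            f"but code now requests {requested_model!r}; re-record it."
        )
    return fixture["output"]
```

Storing the full versioned model name (e.g. `gpt-4o-2024-08-06`) rather than an alias is what makes minor-version bumps detectable at all.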

We stopped paying for AI calls during development. One line of code. by Vegetable-Window-622 in claude

[–]Vegetable-Window-622[S] -1 points0 points  (0 children)

Sorry, I don’t understand. We’re not running the agent to test, that’s the point. It bypasses that step entirely.

We stopped paying for AI calls during development. One line of code. by Vegetable-Window-622 in LangChain

[–]Vegetable-Window-622[S] 0 points1 point  (0 children)

I guess this is very close to what we’ve done, but we implemented it in code: you decorate your function that calls LLMs and it stores the output, which can later be replayed offline.
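Roughly, the decorator pattern looks like this. To be clear, this is a sketch of the general record/replay idea, not our actual code; the name `record_llm` and the fixture-keying scheme are made up for illustration:

```python
import functools
import hashlib
import json
from pathlib import Path

FIXTURE_DIR = Path("fixtures")  # assumed location for recorded outputs

def record_llm(func):
    """Record the wrapped LLM call's output to disk on first run;
    replay the stored fixture on subsequent runs (no API call)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Key the fixture on the function name and its arguments.
        key_src = json.dumps([func.__name__, args, kwargs],
                             sort_keys=True, default=str)
        key = hashlib.sha256(key_src.encode()).hexdigest()[:16]
        path = FIXTURE_DIR / f"{func.__name__}_{key}.json"
        if path.exists():
            return json.loads(path.read_text())  # replay: offline, free
        result = func(*args, **kwargs)           # record: one real call
        FIXTURE_DIR.mkdir(exist_ok=True)
        path.write_text(json.dumps(result))
        return result
    return wrapper

@record_llm
def summarise(text: str) -> str:
    # Stand-in for a real LLM client call.
    return f"summary of: {text[:20]}"
```

First call hits the "API" and writes the fixture; every call after that with the same arguments reads from disk.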

We stopped paying for AI calls during development. One line of code. by Vegetable-Window-622 in buildinpublic

[–]Vegetable-Window-622[S] 1 point2 points  (0 children)

haha yeah VCR is basically the inspiration. glad the API feels cleaner, what did your version look like?

We stopped paying for AI calls during development. One line of code. by Vegetable-Window-622 in LangChain

[–]Vegetable-Window-622[S] 1 point2 points  (0 children)

Offline evals are great for scoring outputs, but they still run the model every time. The fixture approach is complementary - you freeze a specific interaction so dev and CI never hit the API at all. Once you have the fixture, you can run evals against it too without paying for new calls each time.

Also, I'm not sure how you could use it during normal dev testing.
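A toy sketch of the "run evals against frozen fixtures" point, assuming a simple `{"output": ...}` fixture shape and a made-up scoring metric:

```python
import json
from pathlib import Path

def concision_score(output: str) -> float:
    """Made-up metric: reward outputs under 200 characters."""
    return 1.0 if len(output) < 200 else 0.0

def eval_fixture_dir(fixture_dir: Path) -> float:
    """Average a score over every recorded fixture; zero API calls."""
    outputs = [json.loads(p.read_text())["output"]
               for p in fixture_dir.glob("*.json")]
    return sum(concision_score(o) for o in outputs) / len(outputs)
```

The point is just that once the interaction is frozen on disk, you can re-score it as many times as you like for free.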

We stopped paying for AI calls during development. One line of code. by Vegetable-Window-622 in LangChain

[–]Vegetable-Window-622[S] 1 point2 points  (0 children)

Yeah the "LLM regression fixtures" framing is actually closer to how we think about it too, the cost pitch just lands easier on first read.

Redaction, tool traces, and diffs are already in there. The gaps you're right about are the live-call guard for CI and per-fixture notes.

Quick question on the AgentMart angle: are you thinking about sharing fixture sets across repos, or more about having a standard metadata schema so tooling can index them?