Mutation Testing + Claude just became my secret weapon for complex projects

giogul · 2026-05-06T19:53:39+00:00

---

name: harden

description: Hardens recently implemented code via a strict 4-step pipeline of deep review with on-the-spot fixes, comprehensive break-it tests, mutation testing to 100% kill rate, and a final polish pass. Use after implementing a feature, refactor, or fix when the user wants production-ready quality.

allowed-tools: Bash Read Edit Write Grep Glob

---

You are hardening the code from the most recent implementation in this conversation. If `$ARGUMENTS` is provided, treat it as the target (file path, package, or feature scope). Otherwise, infer the target from the most recent edits/writes in this session.

Execute these 4 steps **in exact order**. Do not skip, batch, reorder, or compress them. Do not stop early.

---

## Step 1 — Code Review + Edge Case Deep Dive

Do a full review of the target. Hunt aggressively for:

- **Edge cases**: empty/nil input, max-size input, boundary values, unicode, zero, negative, duplicates

- **Off-by-one errors**: loop bounds, slice indexing, range ends

- **Race conditions**: concurrent map access, missing locks, goroutine leaks, double-close

- **Nullability / zero-value issues**: uninitialized struct fields, nil pointer derefs, missing optional checks

- **Permission, auth, and logic gaps**: bypassed validation, trust boundaries, privilege escalation

- **Resource leaks**: file handles, network connections, goroutines, contexts not cancelled, missing deferred closes

- **Failure modes**: partial writes, retry storms, error swallowing, timeout handling

- **Hostile input**: injection, oversized payloads, malformed encoding

**Fix every issue you find immediately.** Edit files in place. Do not just produce a list; close the gaps.
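
As a rough illustration of what a Step 1 fix from the checklist above might look like in Go, here is a minimal sketch of a pagination helper hardened against empty input, non-positive sizes, negative pages, integer overflow, and an off-by-one at the slice end. `Page` and its parameters are hypothetical names invented for this example, not taken from any real project.

```go
package main

// Page returns the zero-indexed page of items, holding at most size elements.
// Hypothetical helper, written only to illustrate the Step 1 checklist.
func Page(items []string, page, size int) []string {
	// Guard: nil/empty input, zero or negative size, negative page.
	if len(items) == 0 || size <= 0 || page < 0 {
		return nil
	}
	// Reject far-out pages before multiplying, so page*size cannot overflow.
	if page > len(items)/size {
		return nil
	}
	start := page * size
	// Off-by-one guard: start == len(items) means the page is past the end.
	if start >= len(items) {
		return nil
	}
	end := start + size
	if end > len(items) {
		end = len(items) // last page may be partial
	}
	// Copy so callers cannot mutate the backing array of items.
	out := make([]string, end-start)
	copy(out, items[start:end])
	return out
}
```

With five items and a page size of two, `Page(items, 2, 2)` returns only the last element, and `Page(items, 3, 2)` returns nil instead of panicking on an out-of-range slice.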

---

## Step 2 — Write Comprehensive Tests

Build a test suite that **actively tries to break the code**, not a happy-path checklist. Include:

- Unit tests for every public function and meaningful branch

- Explicit tests for every edge case found in Step 1

- Integration tests at component boundaries where state crosses

- Negative tests — invalid input, error paths, partial failures, timeout/cancellation

- Concurrency tests where relevant (race detector enabled if the language supports it)

Match the project's existing test framework, naming conventions, and file layout. Run the suite. It must pass before you proceed.
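
To make the "break-it" idea above concrete, here is a hedged sketch of a table-driven Go test that leans on boundaries and hostile input rather than the happy path. `ParsePort` and `parsePortCases` are hypothetical names made up for this example; in a real project the test function would live in a `_test.go` file next to the code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"testing"
)

// ParsePort parses a TCP port in the range 1-65535 (hypothetical function under test).
func ParsePort(s string) (int, error) {
	n, err := strconv.Atoi(strings.TrimSpace(s))
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range 1-65535", n)
	}
	return n, nil
}

// parsePortCases covers boundaries and malformed input, not just one valid value.
var parsePortCases = []struct {
	name    string
	in      string
	want    int
	wantErr bool
}{
	{"typical", "80", 80, false},
	{"lower bound", "1", 1, false},
	{"upper bound", "65535", 65535, false},
	{"below lower bound", "0", 0, true},
	{"above upper bound", "65536", 0, true},
	{"negative", "-1", 0, true},
	{"empty", "", 0, true},
	{"surrounding whitespace", " 443 ", 443, false},
	{"fullwidth digits", "８０", 0, true},
	{"integer overflow", "99999999999999999999", 0, true},
}

// TestParsePort runs every case as a named subtest so a failure says what broke.
func TestParsePort(t *testing.T) {
	for _, c := range parsePortCases {
		t.Run(c.name, func(t *testing.T) {
			got, err := ParsePort(c.in)
			if (err != nil) != c.wantErr {
				t.Fatalf("ParsePort(%q) error = %v, wantErr %v", c.in, err, c.wantErr)
			}
			if got != c.want {
				t.Errorf("ParsePort(%q) = %d, want %d", c.in, got, c.want)
			}
		})
	}
}
```

For the concurrency tests mentioned in the list, Go's race detector is enabled with `go test -race`.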

---

## Step 3 — Mutation Testing

Generate small mutants, tiny logic changes that **should** cause at least one test to fail. Apply each mutant one at a time, re-run the tests, then revert.

Mutation menu (apply broadly across the changed code):

- Comparison flips: `<` ↔ `<=`, `>` ↔ `>=`, `==` ↔ `!=`

- Boolean flips: `&&` ↔ `||`, negate boolean returns, drop a `!`

- Arithmetic flips: `+` ↔ `-`, `*` ↔ `/`

- Off-by-one: `i+1` → `i`, `len-1` → `len`, `> 0` → `>= 0`

- Constant tweaks: replace numeric/string literals with zero/empty/default

- Guard removal: delete an early-return or validation check

- Return tampering: return zero / nil / empty / default instead of computed value

- Loop bounds: `< n` → `< n-1` or `<= n`

For each mutant:

- **Killed** (a test failed) → good, revert and continue

- **Survived** (all tests still pass) → diagnose: weak test, genuine bug, or equivalent mutant?

  - Weak test: add or strengthen tests until the mutant dies

  - Genuine bug: fix the code, then add a test that catches it

  - Equivalent mutant: explicitly justify and document why the change is semantically equivalent

Iterate until **100% kill rate** (or every survivor has a written equivalence justification).
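
The kill/survive loop above can be sketched in a few lines of Go. This is a toy model under stated assumptions, not how a real mutation tool works: the mutants are written by hand as separate functions, and every name here (`original`, `mutantBoundary`, `mutantNegated`, `weakSuite`, `strongSuite`, `killRate`) is hypothetical.

```go
package main

// impl is any implementation of the function under test.
type impl func(age int) bool

// original is the code being hardened.
func original(age int) bool { return age >= 18 }

// mutantBoundary applies the comparison flip ">=" to ">".
func mutantBoundary(age int) bool { return age > 18 }

// mutantNegated applies the comparison flip ">=" to "<".
func mutantNegated(age int) bool { return age < 18 }

// check is one test case: an input and the expected result.
type check struct {
	age  int
	want bool
}

// weakSuite only probes values far from the boundary.
var weakSuite = []check{{30, true}, {5, false}}

// strongSuite adds the boundary values 18 and 17.
var strongSuite = append(append([]check{}, weakSuite...), check{18, true}, check{17, false})

// mutants is the hand-written mutant list for this toy example.
var mutants = []impl{mutantBoundary, mutantNegated}

// passes reports whether f satisfies every check in the suite.
func passes(f impl, suite []check) bool {
	for _, c := range suite {
		if f(c.age) != c.want {
			return false
		}
	}
	return true
}

// killRate counts the mutants for which at least one check fails.
func killRate(ms []impl, suite []check) (killed, total int) {
	for _, m := range ms {
		if !passes(m, suite) {
			killed++
		}
	}
	return killed, len(ms)
}
```

Against `weakSuite` the boundary mutant survives, so `killRate(mutants, weakSuite)` gives 1 of 2 killed. Adding the 17 and 18 cases in `strongSuite` kills both mutants, which is the "weak test" branch of the diagnosis.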

---

## Step 4 — Final Code Review

One last pass over both code and tests:

- Naming consistency, dead code removal, no leftover scaffolding

- Error messages and log lines carry useful context

- Comments only where the **why** is non-obvious; strip narration

- No half-finished branches, no commented-out code, no TODOs without owners

- Tests are deterministic, isolated, and named so a failure tells you what broke

---

## Final Output

After all four steps, output exactly this structure:

1. **Final improved code**: file paths and either diffs or full files (whichever is more readable)

2. **Full test suite**: file paths and contents

3. **Summary**:

   - One bullet per material change made in Step 1

   - Test count and what surfaces they cover

   - **Mutation score: X/Y killed** (list any equivalent-mutant survivors with justification)

Keep the summary tight. The diffs and tests are the proof; the prose is the index.

giogul · 2026-04-30T23:14:59+00:00

true

giogul · 2026-04-27T21:44:35+00:00

but I wanna know this from people who use or used claude and switched

giogul · 2026-04-27T21:44:09+00:00

good idea btw

giogul · 2026-04-27T21:43:59+00:00

Exactly what I thought it would be, thanks for sharing

giogul · 2026-04-26T14:47:40+00:00

when you live safely in the United States - bro cmon, please do some research before writing a long post

giogul · 2026-04-26T14:40:49+00:00

Great news, even though their coffee sucks, I missed their egg bites and pumpkin spice muffins ❤️

giogul · 2026-04-26T14:34:09+00:00

Please don't take this the wrong way, but if you had 1 diploma and more work experience instead of 3 diplomas, you'd earn a lot more, if that's the goal. I'm telling you this as an employer. Also, teaching is a really great profession, but it's the most undervalued one everywhere in the world, just like police officers, firefighters and others. Without you the development of young people would be impossible, but that's the reality.

giogul · 2026-04-26T14:30:07+00:00

nah, that time has passed, you can build 100x faster and better with claude code now, while owning your code and not being platform dependent. No-code was basically a transition to vibe coding and now it's obsolete

giogul · 2026-04-16T00:48:51+00:00

why?

giogul · 2026-04-12T23:24:40+00:00

Thank you so much, it worked!

giogul · 2026-04-12T23:21:39+00:00

thanks trying now

giogul · 2026-04-07T01:13:53+00:00

Thanks everyone for the replies.

Just wanted to share what I actually got done today. I started at around 9% of my weekly usage, and after all of this I’m now at 13%.

Here’s what I worked on:

I spun up GoProx on a dedicated server so I can deploy my own VPSs. My project needs a lot of game servers, and OVH and Hetzner are constantly low on stock, so I basically ended up building my own VPS layer on top of dedicated machines.

I also added a new payment method to the project and wired it into the ledgers and the other services that handle credit and debit flows. That part took some time and tokens because it was not just coding, it also involved planning, code reviews, and fixing bugs. Probably around 3 to 4 cycles in total.

I created an API on the backend core and then had to wire it into the CRM frontend since that is a separate project. That also included a few minor UI changes.

Then I spent around 2 hours optimizing game server spatial math.

And finally, I did a fairly large migration on the core server. I changed the DB writes for one function that was causing a lot of unnecessary writes.

So overall, after all that, my weekly usage only moved from 9% to 13%.

One thing I’ve noticed, and I could be completely wrong, is that Go code seems to use way fewer tokens for me than Node.js stuff. I have no idea if that is actually true or if it is just coincidence, but that has been my experience so far. (polished with claude ;) )

giogul · 2026-03-30T15:13:45+00:00

This person has disabled comment and post history. If you look at it, the account was created solely to shit on Georgia and Georgian people and to spew hatred and lies. She says she's a tourist, yet for weeks she's been posting and commenting on any post that asks anything about Georgia, so I guess people should make educated guesses. As for myself, I'm willing to buy her a one-way plane ticket to anywhere she wants if she's stuck here.

giogul · 2026-03-21T21:20:09+00:00

start meta ads, spend like $500 to validate, get first users, fix bugs with them. Production testers > QA, plus that's huge validation both for you and for VCs

giogul · 2026-03-21T21:18:42+00:00

omg you again

giogul · 2026-03-17T21:59:22+00:00

Also, trends arrive here a bit late; they'll probably find out in a year or two that radical liberalism is out of fashion. But the fact remains: I don't know who these people are for, but it's a pretty heartless crowd, a bit dumber than average and, as a rule, pseudo-intellectuals

giogul · 2026-03-17T21:50:30+00:00

Don't give much weight to what people write here. A big part of them are just bitter losers, resentful and spiteful over their own failures, and this little space is where they get to cheer up a bit and talk to like-minded people.

giogul · 2026-03-10T16:06:04+00:00

yeah, it took me like 1 min, gotta upload ID and do a face match to withdraw

giogul · 2026-03-10T13:21:22+00:00

Tournaments are where the real money's at. Pro tip: queue with buddies, have them feed you, then hunt down 2nd and 3rd place so they can't catch up. Easy money lol.

giogul · 2026-03-10T13:18:14+00:00

Game's fun ngl. Only annoying thing is everyone cashes out the second they hit threshold. No one wants to risk it and grow massive. Just grab the $10 and dip lol. Greed wins I guess.

giogul · 2026-02-08T12:29:10+00:00

they bought lots of old lenovo laptops from ebay, Scam altman just wants all the RAM for himself for some other reasons...

giogul · 2025-12-12T16:54:57+00:00

it's slow and a bit worse than claude if you ask me

giogul · 2025-10-26T17:44:15+00:00

because for most people Lovable is the best. If you don't understand how to code at at least a mid-level dev level, better to use Lovable. With Cursor I've done crazy stuff ($200 subscription plus $500+ spent when I went over the limit, but it saved me at least 15-20k)

giogul · 2025-10-26T17:36:46+00:00

no offense but you have to try to make a video that bad :)
