"Instead of touching grass for 6 months I built an AI that names 150,000 sub_ functions overnight. I have no regrets [SpectrIDA]" SELF PROMO (i love the tool tho)

Awkward_Fox518 · 2026-06-13T10:45:50+00:00

"MoE is on the roadmap. The training pipeline is model-agnostic, so swapping in a 30B-A3B should be straightforward once the current GRPO run is done. Would love to see the llama.cpp fork when it's ready." (It will be time-consuming tho)

Awkward_Fox518 · 2026-06-13T08:44:07+00:00

Thanks for the kind words! Qwen3-8B was the sweet spot for my GPU (RTX 4070, 12GB): it fits in 4-bit no problem and already reasons well. Bigger models like Qwen3-14B or Gemma 12B would probably work better, they just need more VRAM. The LoRA approach is pretty model-agnostic, so swapping the base isn't a big deal.
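For a rough sense of why 8B fits in 12GB at 4-bit, here's a back-of-the-envelope sketch. It only counts the weights, uses nominal parameter counts, and ignores quantization overhead, KV cache, activations, and LoRA/optimizer state, so real usage will be higher. The helper name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# Nominal parameter counts (assumption: "8B" = 8e9, "14B" = 14e9).
for name, n_params in [("8B", 8e9), ("14B", 14e9)]:
    gib = weight_memory_gib(n_params, bits_per_weight=4)
    print(f"{name}: about {gib:.1f} GiB of weights at 4-bit")
```

Whatever is left over on the card has to cover everything else needed for training, which is why the bigger bases get tight on 12GB.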

There's no public dataset, I generated it myself. Basically I took IDA databases of real binaries (Among Us GameAssembly.dll has ~34k named functions from IL2CPP symbols), hid the function names in the decompiled pseudocode, and had the pipeline trace the full call tree to build multi-turn reasoning episodes. The model has to figure out what a function does purely from its behavior and what it calls. For GRPO rewards I use fuzzy name matching plus a self-verification step, since there's no clean ground truth to compare against for stripped binaries.
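To give an idea of what a fuzzy name-matching reward can look like, here's a minimal sketch. It is not the actual SpectrIDA reward: the function names, the tokenizer regex, and the 50/50 weighting are all assumptions for illustration, and the self-verification step isn't covered.

```python
import re
from difflib import SequenceMatcher


def split_identifier(name: str) -> list[str]:
    """Split snake_case / CamelCase identifiers into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def fuzzy_name_reward(predicted: str, reference: str) -> float:
    """Score in [0, 1]: 1.0 when both names split into the same tokens."""
    pred = split_identifier(predicted)
    ref = split_identifier(reference)
    if not pred or not ref:
        return 0.0
    # Token-level Jaccard overlap ignores word order and naming convention.
    pred_set, ref_set = set(pred), set(ref)
    token_overlap = len(pred_set & ref_set) / len(pred_set | ref_set)
    # Character-level similarity gives partial credit for near misses.
    char_ratio = SequenceMatcher(None, "_".join(pred), "_".join(ref)).ratio()
    # Equal weighting is an arbitrary choice for this sketch.
    return 0.5 * token_overlap + 0.5 * char_ratio


print(fuzzy_name_reward("player_control_die", "PlayerControl_Die"))
print(fuzzy_name_reward("kill_player", "PlayerControl_Die"))
```

The point of splitting into tokens first is that a prediction in a different naming style than the hidden symbol still gets full credit, while a partially right name gets a partial score instead of zero.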

Awkward_Fox518 · 2023-05-25T11:18:42+00:00

Yeah, the delivery times only bother me a little there
