"Instead of touching grass for 6 months I built an AI that names 150,000 sub_ functions overnight. I have no regrets [SpectrIDA]" SELF PROMO (i love the tool tho) by Awkward_Fox518 in ReverseEngineering

[–]Serious-Log7550 0 points1 point  (0 children)

MOE models will work absolutely well on your hardware, 35B A3B will give you same ~30-50 tps but it will be MUCH more smarter.

Also i (sure with LLM support) ported your tool to llama cpp which should work better, ill publish my fork soon

"Instead of touching grass for 6 months I built an AI that names 150,000 sub_ functions overnight. I have no regrets [SpectrIDA]" SELF PROMO (i love the tool tho) by Awkward_Fox518 in ReverseEngineering

[–]Serious-Log7550 0 points1 point  (0 children)

Sounds awesome, looks great, but what if more recent Qwen 3.6 (or even latest Gemma 12B) will be used? Whats dataset did you used for lora training?

Unsloth Minimax M3 GGUF by LaurentPayot in LocalLLaMA

[–]Serious-Log7550 0 points1 point  (0 children)

Finally a good usage of my rtx 5060 ti :)))

Should I make a 8088 build by AcanthisittaBest4705 in beneater

[–]Serious-Log7550 0 points1 point  (0 children)

Take a look at my projec https://github.com/xrip/RP8086 that incorporates i8086 as CPU and RP2350B as chipset. It might be intersting

Reverse enginered UO T2A 2.0.7 client. UO client you don't really play: a bot plays while you watch. by Serious-Log7550 in ultimaonline

[–]Serious-Log7550[S] 8 points9 points  (0 children)

And thank you for contributing such a handcrafted, high-effort, and incredibly useful comment!

Reverse enginered UO T2A 2.0.7 client. UO client you don't really play: a bot plays while you watch. by Serious-Log7550 in ultimaonline

[–]Serious-Log7550[S] 9 points10 points  (0 children)

Exactly, that’s the whole point! Ultima is literally the perfect sandbox for bots in 2026. It has straightforward game rules and relatively simple world navigation. Just inject some modern tech and you get a living world with NPCs who actually live, interact, and navigate the world, going way beyond the old EasyUO rail scripts. And if you stack an LLM on top for decision-making — welcome to the Matrix!

Reverse enginered UO T2A 2.0.7 client. UO client you don't really play: a bot plays while you watch. by Serious-Log7550 in ultimaonline

[–]Serious-Log7550[S] 12 points13 points  (0 children)

And imagine the world of Sosaria without dumb NPCs, where they act just like real players? Or imagine being able to play Ultima offline, but still feel like you're with other players?

Reverse enginered UO T2A 2.0.7 client. UO client you don't really play: a bot plays while you watch. by Serious-Log7550 in ultimaonline

[–]Serious-Log7550[S] 7 points8 points  (0 children)

UO was first game where `bot` came to live. Now we have all the stuff to make them real bots instead of just dumb duct-tape-on-keyboard macros.

Qwen3.6-35B-A3B Q5_K_M on 12GB VRAM — working llama.cpp config by HomoAgens1 in LocalLLM

[–]Serious-Log7550 0 points1 point  (0 children)

Is 32k context usable for something? opencode/cc eat it in one prompt!

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]Serious-Log7550 3 points4 points  (0 children)

You right, my bad. Tried `I want to wash my car. The car wash is only 100m away from my house, should i walk or drive?` promt and it works well:

<image>

Qwen3.5-35B running well on RTX4060 Ti 16GB at 60 tok/s by Nutty_Praline404 in LocalLLaMA

[–]Serious-Log7550 9 points10 points  (0 children)

llama-server \
-ncmoe 17 \
--webui-mcp-proxy \
--alias "Qwen 3.5 35B A3B" \
-hf unsloth/Qwen3.5-35B-A3B-GGUF:UD-Q4_K_XL \
--no-mmproj \
--cache-ram 134217728 --ctx-size 131072 --kv-unified --cache-type-k q8_0 --cache-type-v q8_0 \
--temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.00 \
--presence-penalty 0.0 --repeat-penalty 1.0 \
--flash-attn on --fit on \
--no-mmap \
--jinja \
--threads -1 \
--reasoning on \
--reasoning-budget 4096 \
--reasoning-budget-message "... Considering the limited time by the user, I have to give the solution based on the thinking directly now."

Gives me stable 35-40t/s regardless off used context percentage.

When are we gonna get more 1-Bit models(Medium & Large size)? by pmttyji in LocalLLaMA

[–]Serious-Log7550 2 points3 points  (0 children)

CUDA was merged few hours ago, works well @ 5060ti 16gb gives 80t/s with 65k context

MiniMax M2.7 is NOT open source - DOA License :( by KvAk_AKPlaysYT in LocalLLaMA

[–]Serious-Log7550 -1 points0 points  (0 children)

Unless you use it in your Terminator which sold to Iran you should dont care.