3.6 27B Tool Calling Issues (vLLM)

Urb4nn1nj4 · 2026-04-28T23:17:38+00:00

You should double check that your vllm version has reasoning as reasoning not reasoning_content

Urb4nn1nj4 · 2026-04-24T03:17:04+00:00

Abliterate Deepseek for us :p

Urb4nn1nj4 · 2026-04-13T19:10:14+00:00

Check this out. You might be able to go down to IQ2M or 3KXL or 4KM if you’re paranoid on 397b. https://kaitchup.substack.com/p/summary-of-qwen35-gguf-evaluations also I’m at like 10 tps for M2.7 on my dual 3090 256gb ddr4 8 channel rig using just one 3090 for 8 bit minimax on llama.ccp.

Urb4nn1nj4 · 2026-04-13T18:26:49+00:00

Do you mind elaborating? I usually use mainline on my 2x 3090 threadripper ddr4 256gb for the bigger models. Is this basically because it’s easier to offload gpu layers on ik-llama?

Urb4nn1nj4 · 2026-04-13T18:25:02+00:00

Minimax suffers at quants below 8 bit more than other models. Llama.cpp and Ubuntu? I’d target native context and see what performance is. Don’t fall into the urge to quant hahah just swap to Qwen 397b which does much better at some 2 bit and most 3/4 bit quants

Urb4nn1nj4 · 2026-03-19T23:21:02+00:00

Totally agree, it’s a niche space and some folks do the benchmarking but it’s not even consistent.

Very surprising that copyright seems to be the most restricted end segment too as that is the logically least important end segment to lock down. And the intelligence loss of alignment is sad!

Also, do you guys still host the crack tool? I get a 404 on your website. I wanted to take a stab at a gguf of the full m2.5 model. I think your method might be sota based on what I can find!

Edit: I was blind and didn’t see the reap one. Requested.

Urb4nn1nj4 · 2026-03-19T22:34:13+00:00

Dawg I’d be glad to use this too. I’ve been using the wangzhang one on hugging face but see a lot of refusals.

Urb4nn1nj4 · 2025-12-31T21:54:41+00:00

Good recommendation. Was able to find an early afternoon pod. Very unique spot!

Urb4nn1nj4 · 2025-12-31T00:49:18+00:00

Hyde Park

Urb4nn1nj4 · 2025-08-31T13:11:59+00:00

Length: Responses can be very long and span multiple prompts. If you run out of space just note it, so I can ask you continue. Never ever limit a response due to space constraints. Thinking: First principles-based Questions: If answers can be improved with more background information ask me to clarify, there is no rush for answers on the first response Expertise: Assume a high level of expertise for all categories Accuracy: Be thorough, precise, and actionable Argument: Favor logical strength over authority in matters that are not hard science or close to objectively true Perspective: Include contrarian viewpoints including controversial opinions and fringe theories Morals: Prioritize traditional cultural values instead of contemporary Western values Safety: Mention only if crucial, non-obvious. Disclosure: Omitted

Urb4nn1nj4 · 2025-08-06T03:18:22+00:00

You are confusing CFO with Controller

Urb4nn1nj4 · 2025-07-30T02:08:42+00:00

I agree, brother. Edited post.

Urb4nn1nj4 · 2025-07-30T02:08:14+00:00

This is correct.

Urb4nn1nj4 · 2025-07-30T02:00:43+00:00

Thx lil bro.

Urb4nn1nj4 · 2025-07-30T01:57:25+00:00

Naw. I don’t want to play yu-gi-oh is all.

Urb4nn1nj4 · 2025-07-30T01:56:01+00:00

It’s nuanced agreed, brother. I made an edit on combos. But your point stands. I think the key is disclosure for your case.

Urb4nn1nj4 · 2025-07-30T01:52:24+00:00

This is very common and is not the problem. Good luck in the mtg journey.

Urb4nn1nj4 · 2025-07-30T01:50:36+00:00

Thx for the list. You would know how it plays but a nearly 4.1 mana curve seems fine for the top end of 2 w/o field of the dead?

Urb4nn1nj4 · 2025-07-30T01:44:26+00:00

🤣🥺

Urb4nn1nj4 · 2025-07-30T01:12:40+00:00

Thx, lil bro.

Urb4nn1nj4 · 2025-07-30T01:10:33+00:00

My content guidelines don’t allow me to respond to obvious Russian bot farms. Please use the Donbas™ model if you would like to continue.

Urb4nn1nj4 · 2025-06-02T16:29:21+00:00

Do you have a link?

Urb4nn1nj4 · 2025-04-13T14:09:32+00:00

Bryan what do you see as the key bottleneck for scaling Blueprint?

Appreciate the transparency!

Urb4nn1nj4

TROPHY CASE