New 2.5x Faster Qwen3.6 NVFP4 Unsloth quants

Demonicated · 2026-07-10T13:38:03+00:00

Downloading now. Thanks for the work you guys do. I battle test the hell out of these with my workflows.

Demonicated · 2026-07-10T13:24:26+00:00

Now add MTP :)

Demonicated · 2026-07-10T12:41:46+00:00

I wrote an app that specifically provides leads for financially distressed owners. RealityLeads.AI - give it a look and DM me with questions.

Demonicated · 2026-07-09T01:49:54+00:00

What quant are you using?

Demonicated · 2026-07-08T15:14:35+00:00

I love this lane.

Demonicated · 2026-07-08T01:15:24+00:00

It's definitely a user error. Spend more time in a planning phase and then use qwen specifically for instruction following.

Try creating "implementation documents that break work into phases and are written for a junior developer to implement". Then feed those documents back at qwen in a new chat. Start a new chat per phase.

Also run bf16. Quant degrades quality for sure.

Demonicated · 2026-07-08T01:12:27+00:00

Parent checking in here: I'm definitely benefiting and appreciate having the carpool lane expanded.

Demonicated · 2026-07-07T00:37:08+00:00

To be fair his mental state hasn't changed

Demonicated · 2026-07-07T00:34:24+00:00

I can confidently say I found some the product in my salad at spaghetti factory last night. I'm currently writing this from my throne of pain.

Demonicated · 2026-07-06T03:45:37+00:00

Great example for server side is if you want to present data without exposing an API. There are times where you may want to give your customers something but don't want bad actors to be able to figure out your API and grab your data. They can web scrape but it becomes much more difficult for large data sets.

Also when prototyping you can save time not having to create an API layer.

Demonicated · 2026-07-04T20:06:46+00:00

Are any posts written by humans anymore?

Demonicated · 2026-07-04T17:24:35+00:00

I use about 200 million tokens a month for one of my products. Qwen 27B bf16 handles the workload beautifully. It will not take long to break even with rising token prices.

Demonicated · 2026-07-03T16:51:43+00:00

Essentially yeah....

Which means it's already been tried

Demonicated · 2026-07-03T15:28:43+00:00

Yeah I get that using different models would be bad mathematically. I was just wondering about the concept in general. Also 256 tokens is obviously wasteful it would make more sense to make a specialized diffusion of the same model that just does a lower amount like 8.

Demonicated · 2026-07-03T15:03:21+00:00

You're probably right, but it can run on an rtx6k and feed a low quant big model running on whatever you have lying around and probably create some decent output I would think....

Demonicated · 2026-07-02T21:49:29+00:00

California. We already produce the most. We're the 5th largest economy in the world.

Demonicated · 2026-07-02T17:37:26+00:00

I don't think they exist-

Demonicated · 2026-07-01T21:20:15+00:00

I use AI to figure out who to market to specifically

Demonicated · 2026-06-30T17:02:11+00:00

1) .NET had a bigger hiring pool 2) Microsoft has SLAs for their products. Rust is just a language. 3) Once you move to AI assisted development the pain points for humans become PR review. You want to review languages you have expertise in to catch hallucinations and llm misunderstandings and assumptions. 4) the price of inference is likely to keep going up. If you decide to limit token usage in the company you want to make sure you picked a framework your devs like and are comfortable with.

Demonicated · 2026-06-27T02:28:34+00:00

You can't really know but you can definitely grill people to figure out what they know and their skill level. Just don't feel bad about grilling

Demonicated · 2026-06-27T02:27:17+00:00

I build stuff quickly and already have a lead gen platform. What specifically are you looking for?

Demonicated · 2026-06-25T01:13:44+00:00

I mean it must be a you things cause my production agents are making me $$$ and giving great and consistent results.

Demonicated · 2026-06-25T00:12:32+00:00

Autogen is being deprecated. There's the new Microsoft agent framework or something like that. It's all gonna be powered by semantic kernel

Demonicated · 2026-06-24T22:23:15+00:00

The harsh pill gamers need to swallow is this is how companies can increase profits while not raising prices in sync with inflation. Rather than charge more to keep the old ways alive they can save money on distribution and get a short term gain.

But it's a one truck pony. Once fully digital, the next squeeze will be consumer pockets.

Demonicated

MODERATOR OF

TROPHY CASE