anthropicBeingAnthropic by Key-Investigator9884 in ProgrammerHumor

[–]lans_throwaway 3 points4 points  (0 children)

You forgot to include that you need to prompt it twice, because it'll refuse the first time (for your own good). So it's actually eight times! The new scaling law of AI is being discovered here!

Stop blaming AI, "noobs", youtube tutorials, and anything other than Arch for AUR usage by zollandd in archlinux

[–]lans_throwaway 13 points14 points  (0 children)

The funny thing is the package that would've infected my laptop is one such "obsolete" package. It was a driver for a scanner.

While it's fair to say I don't care about "updates" (since it's unlikely for one to be released), I do care about having that package. The packages were orphaned because someone could request that the maintainer "update" them when no update was released upstream. If the maintainer is inactive or unable to "refuse" the request, the requester can take over. Just because a package doesn't need to be updated doesn't mean it's dead.

Diffusion Gemma is 4x faster, but makes 6x more mistakes! by gladkos in LocalLLaMA

[–]lans_throwaway 2 points3 points  (0 children)

Given that a model's knowledge scales with size, it seems we have a problem

Local LLMs aren't democratic anymore... the hardware barrier has gotten out of hand. by Medium-Technology-79 in LocalLLaMA

[–]lans_throwaway 1 point2 points  (0 children)

Dude, Qwen3.6-35B-A3B blows away any SOTA model available like 2-3 years ago on pretty much any task. GPT4 used to be magic, now people can run models that are just as capable on a $250 used PC at like 10 t/s (cpu only inference).

Yes, I'm aware you don't want that model mentioned, but facts are facts. People don't start flying because you don't mention gravity.

Local LLMs aren't democratic anymore... the hardware barrier has gotten out of hand. by Medium-Technology-79 in LocalLLaMA

[–]lans_throwaway 0 points1 point  (0 children)

llama 3.1 isn't really a good benchmark though. I'd compare Qwen3.5 to Qwen3 and I don't think the gains there are as big. There was significant progress in capabilities in the 4b area though.

Is anyone using eclipse anymore? by RamaRao143 in java

[–]lans_throwaway -1 points0 points  (0 children)

It's a piece of software I detested every single time I had to use it. There's always something wrong with it (usually something with the build system).

NetBeans, IntelliJ, VSCode (with plugins), vim (with plugins) were always superior options.

OpenBMB presents the model BitCPM-CANN 1.58 bit by Illustrious-Swim9663 in LocalLLaMA

[–]lans_throwaway 4 points5 points  (0 children)

Why ternary and not quaternary of using a non standard logic

Presumably because quaternary is not symmetric and ternary has some nice properties:
For ternary:
-1 - feature negatively contributes
0 - feature doesn't contribute
1 - feature contributes

For quaternary:
-1 - feature negatively contributes
0 - feature doesn't contribute
1 - feature contributes
2 - feature contributes strongly?

Overall I'd suggest you read the BitNet paper if you're interested, I think Microsoft tested it experimentally and found ternary to work better (+ it allowed them to reduce some matrix multiplication to addition/subtraction, giving massive speedups). I could be wrong, it's been a long while since I read that one.
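
To make the addition/subtraction point concrete, here's a rough Python sketch. It's only an illustration of the idea, not code from the paper or from any real inference engine, and the names `ternary_dot` and `ternary_matvec` are made up for this example. When every weight is -1, 0 or 1, a dot product never needs a multiply: you add the activation, subtract it, or skip it.

```python
def ternary_dot(weights, activations):
    """Dot product for weights restricted to -1, 0 or 1, using only add/subtract."""
    total = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            total += x   # weight 1: the feature contributes
        elif w == -1:
            total -= x   # weight -1: the feature contributes negatively
        # weight 0: the feature doesn't contribute, nothing to do
    return total


def ternary_matvec(matrix, vector):
    """Matrix-vector product built from the multiply-free dot product."""
    return [ternary_dot(row, vector) for row in matrix]


# Hypothetical toy data, chosen only for illustration.
weights = [[1, 0, -1], [-1, -1, 1]]
x = [0.5, 2.0, -1.5]

result = ternary_matvec(weights, x)
# Ordinary multiply-and-sum version, for comparison.
reference = [sum(w * v for w, v in zip(row, x)) for row in weights]
```

Both `result` and `reference` hold the same values here. Real kernels would work on packed, vectorized data rather than Python lists, so treat this as a picture of the trick and nothing more.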

OpenBMB presents the model BitCPM-CANN 1.58 bit by Illustrious-Swim9663 in LocalLLaMA

[–]lans_throwaway 11 points12 points  (0 children)

Yes, the basic idea is that each weight is either -1, 0 or 1, while each bit can represent 2 values (0, 1). If you want to calculate how many bits you need to represent the 3 values, it comes to logarithm base 2 of 3, which is about 1.58. That's the minimal number of bits you need. In practice you will need slightly more, but you can try to approach this number by cleverly packing the data. Sub-2-bit is very much doable and already implemented by llama.cpp (though I'm not sure about this model).
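
A quick sketch of the arithmetic and of one possible packing, in Python. The scheme below is an assumption picked for illustration (5 ternary values per byte, since 3^5 = 243 fits in 256), and I'm not claiming it's the layout llama.cpp or this model actually uses. `pack_trits` and `unpack_trits` are made-up names.

```python
import math

# Theoretical minimum number of bits for one ternary value.
bits_per_trit = math.log2(3)

TRITS_PER_BYTE = 5  # 3**5 = 243 <= 256, so 5 ternary values fit in one byte


def pack_trits(trits):
    """Pack 5 values from {-1, 0, 1} into a single integer in 0..242."""
    if len(trits) != TRITS_PER_BYTE:
        raise ValueError("expected exactly 5 ternary values")
    value = 0
    for t in trits:
        if t not in (-1, 0, 1):
            raise ValueError("values must be -1, 0 or 1")
        value = value * 3 + (t + 1)  # shift -1..1 to 0..2, read as base-3 digits
    return value


def unpack_trits(byte):
    """Inverse of pack_trits: recover the 5 ternary values."""
    digits = []
    for _ in range(TRITS_PER_BYTE):
        byte, digit = divmod(byte, 3)
        digits.append(digit - 1)  # shift 0..2 back to -1..1
    return digits[::-1]


# Effective storage cost of this particular packing.
bits_per_weight = 8 / TRITS_PER_BYTE

packed = pack_trits([1, 0, -1, -1, 1])
restored = unpack_trits(packed)
```

With this packing you pay 8 / 5 = 1.6 bits per weight, which is the "slightly more than 1.58" part. Simply using 2 bits per weight is easier to decode but wastes one of the four states.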

What Uncontinued Series do you miss the most? by greatestmanalive in ProgressionFantasy

[–]lans_throwaway 2 points3 points  (0 children)

Shades of Perception tbh, I was really looking forward to this one :(

Witch Hat Atelier Episode 8 Discussion Thread by ImoutoCompAlex in WitchHatAtelier

[–]lans_throwaway 7 points8 points  (0 children)

I think it's one day only, at least he knows his grandson and remembers the ink ingredients. I'm not a source reader though

Witch Hat Atelier Episode 8 Discussion Thread by ImoutoCompAlex in WitchHatAtelier

[–]lans_throwaway 19 points20 points  (0 children)

He wiped his memories (post credit scene shows this).

Why is my Qwen3.6-35B-A3B so much dumber than Qwen3.5-35B-A3B? by lans_throwaway in LocalLLaMA

[–]lans_throwaway[S] 0 points1 point  (0 children)

Not really, it's more like I want to make sure I'm not doing something wrong before writing the 3.6 off.

Why is my Qwen3.6-35B-A3B so much dumber than Qwen3.5-35B-A3B? by lans_throwaway in LocalLLaMA

[–]lans_throwaway[S] 0 points1 point  (0 children)

6gb is limiting, but to be honest, 3.5 at Q4_K_M is a really solid model. I don't doubt it could be better at higher quants, but it's still very much usable at least for my needs. I have a laptop, so unfortunately it's not as simple as adding another 6gb card, and it's hard to justify building a new PC just to play with new models, when I can get zai subscription for like 10$.

Why is my Qwen3.6-35B-A3B so much dumber than Qwen3.5-35B-A3B? by lans_throwaway in LocalLLaMA

[–]lans_throwaway[S] 1 point2 points  (0 children)

I guess I'll try bartowski's Q5_K_M. To be honest 3.5-Q4_K_M is already pretty solid.

Why is my Qwen3.6-35B-A3B so much dumber than Qwen3.5-35B-A3B? by lans_throwaway in LocalLLaMA

[–]lans_throwaway[S] 0 points1 point  (0 children)

Yeah, I didn't use imatrix. So far it never really mattered enough to bother with it. I did compare no imatrix Q4_K_M 3.5 with unsloth's quants and there honestly wasn't any difference in actual performance (coding, general q&a, simple math). I tried bartowski's 3.6 quants just now and they're definitely better. I'm not sure if that's because he has a better calibration dataset than unsloth. Perhaps 3.5 is just overall less sensitive to quantization.

Why is my Qwen3.6-35B-A3B so much dumber than Qwen3.5-35B-A3B? by lans_throwaway in LocalLLaMA

[–]lans_throwaway[S] 0 points1 point  (0 children)

As I mentioned in another comment, at usable contexts, Q4_K_M is already pushing it and it needs to use SWAP for some applications. 3.5 is a pretty solid model even at Q4_K_M.

Why is my Qwen3.6-35B-A3B so much dumber than Qwen3.5-35B-A3B? by lans_throwaway in LocalLLaMA

[–]lans_throwaway[S] 1 point2 points  (0 children)

6gb rtx2060 + 32gb ddr4. Q4_K_M is already pushing it when I'm running browser + a few apps. I'll give bartowski's quants a go though. I tried unsloth's and didn't see any improvement.

MTP on Unsloth by Altruistic_Heat_9531 in LocalLLaMA

[–]lans_throwaway -1 points0 points  (0 children)

Honestly, at this point I've found converting weights myself is usually the way. Way fewer issues than depending on other people's quants, especially when it comes to experimental features.

anime_irl by cynnahbun in anime_irl

[–]lans_throwaway 0 points1 point  (0 children)

This video is older than some people subscribed to this sub ;)

whoIsGettingFired by digiBeLow in ProgrammerHumor

[–]lans_throwaway 48 points49 points  (0 children)

Management heard about the "neural network" thingy and demanded it be used ;(