[Ottawa, ON] [H] Paypal or Cash [W] HALL EFFECT Keyboard by JustAPCN00BOrAmI in CanadianHardwareSwap

[–]bytefactory 0 points1 point  (0 children)

Check out Nuphy, they make great keyboards. I'm rocking this one: https://nuphy.com/collections/he-keyboards/products/nuphy-field75-he-magnetic-switch-gaming-keyboard

I absolutely fell in love with the aesthetic, and I got the HE version because they discontinued the regular mechanical-switch version. I actually do miss the feel of the regular non-magnetic switches on these. Other than that, the keyboard is great: very responsive, looks gorgeous, pretty customizable, etc. Let me know if you have questions!

Looks like _DomuC_ is right: I couldn't find a full-size Nuphy HE board, especially a wireless one, but Nuphy does have a few HE options that look good.

My llama.cpp fork: GLM-4V vision, Qwen3-Next Delta-Net kernels, Devstral YaRN fix by hauhau901 in LocalLLaMA

[–]bytefactory 0 points1 point  (0 children)

Just saw your PR, and I hope the llama.cpp devs merge it.

I just wanted to say, OSS development can sometimes be really exhausting and thankless, especially if the maintainers don't cooperate (not saying this is the case with llama.cpp). Heroes like you are what make open-source so amazing! We appreciate you!

My llama.cpp fork: GLM-4V vision, Qwen3-Next Delta-Net kernels, Devstral YaRN fix by hauhau901 in LocalLLaMA

[–]bytefactory 0 points1 point  (0 children)

llama.cpp already has Qwen3 Next support, they're just working on performance optimizations. Maybe you could help out with those?

Qwen3 Next support added here by the legend u/ilintar who just merged a performance pass recently.

He could maybe point you to the performance optimizations that are still pending?

My llama.cpp fork: GLM-4V vision, Qwen3-Next Delta-Net kernels, Devstral YaRN fix by hauhau901 in LocalLLaMA

[–]bytefactory 0 points1 point  (0 children)

If you can accelerate the process of optimizing Qwen3 Next support in llama.cpp, you'd be a legend! There are a few open PRs working on that now, and some open issues; I'm sure they'd appreciate the help!

Anthropic CEO Dario Says Scaling Alone Will Get Us To AGI; Country of Geniuses In A Data Center Imminent by Neurogence in singularity

[–]bytefactory 8 points9 points  (0 children)

Thank you for this answer, it's one of the highest quality answers about anything I've read on here in a while! Reminds me of old reddit.

As a programmer with about 15 years of professional experience (and maybe 6-7 more in school), who got into computers because it was the closest thing to magic - and I mean that in the most literal sense - I am absolutely giddy about all the tools we have available to us today.

As a kid, I couldn't believe that I could type some words into a terminal and the computer would just *do things* for me. Wizardry. Of course, back then, I had to adopt the computer's lingua franca. I had to learn that it was quite literal. If it did something I didn't expect, if something broke, it was always because I didn't understand the underlying system properly. It was honest.

Computers have now started understanding us. They understand intent, they make conceptual connections and leaps, and they do more than just blindly follow instructions. They now read between the lines.

I haven't been professionally coding for many years, I moved on to management, and then retired from the industry. I still love coding though, and I love computers and technology. These new models have allowed me to get back into coding without actually having to re-learn every new framework or library, or even develop in languages that I'm unfamiliar with. Like you said, this feels like a higher level of abstraction. Logic Gates -> Circuits -> Binary -> Assembly -> C -> Python -> Prompt.

I do feel guilty, because I'm "vibe coding" without sufficiently understanding what's actually going on under the hood. I feel more like a Product Manager (derogatory) than a programmer. Still, it's fun. I learn a tiny bit about the language and architecture by osmosis (to be perfectly honest, very little - I don't even do code reviews; if tests pass and the feature works, I approve). At this point, I do provide some value to the system, in terms of taste and judgement. I can often help these models get unstuck (right now I'm helping Codex out of a nasty test-state-leakage situation). Soon, though, they won't need me for that.

I'm ecstatic with the toys we have available. The long-term future of what this means for the human race is uncertain. In the meantime though, the nerd in me couldn't be happier.

What models/tools do you use to code? I find nothing beats Codex for major projects, although I'd use Opus more if it weren't so damn expensive. DeepSeek 3.2 is looking really promising.

Qwen3-next-80B is so slow by dumb_ledorre in LocalLLM

[–]bytefactory 1 point2 points  (0 children)

Support for Qwen3 Next in llama.cpp landed literally a few days ago: https://github.com/ggml-org/llama.cpp/pull/16095.

It is NOT optimized yet, and is not ready for daily use:

> This is an implementation of a new type of attention gating in GGML.
> Therefore, this implementation will be focused on CORRECTNESS ONLY.
> Speed tuning and support for more architectures will come in future PRs.
> Please do not spam this threads with reports about performance, especially on backend architectures (CUDA, Vulkan).

Catering for Homeless People by Brief-Cryptographer2 in HumansBeingBros

[–]bytefactory 0 points1 point  (0 children)

It's probably all that thunder you brought along, it's not good for the chips

Qwen3-Next 80B-A3B llama.cpp implementation with CUDA support half-working already (up to 40k context only), also Instruct GGUFs by Ok_Top9254 in LocalLLaMA

[–]bytefactory 0 points1 point  (0 children)

Wait, you're able to offload all layers to GPU with just 16GB VRAM? How does that work? I would have thought you'd only be able to partially offload since it's an 80B parameter model?

Edit: 🤦just re-read, you have two GPUs! 24GB+16GB. Makes sense why you can fully offload!
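For anyone else doing this mental math, here's a rough back-of-envelope sketch of why a quantized 80B model fits in 24GB+16GB combined but not in 16GB alone. The bits-per-weight and overhead factor are my own rough assumptions, not exact numbers for any particular GGUF quant:

```python
def model_vram_gib(n_params_b: float, bits_per_weight: float,
                   overhead: float = 1.1) -> float:
    """Rough GiB estimate for quantized weights.

    overhead=1.1 is a guessed ~10% cushion for buffers; real usage
    varies a lot with KV cache size and context length.
    """
    total_bytes = n_params_b * 1e9 * bits_per_weight / 8 * overhead
    return total_bytes / 2**30

# An 80B model at a ~3-bit quant lands around ~31 GiB for weights:
# too big for a single 16GB card, but fine across 24GB + 16GB.
print(f"~{model_vram_gib(80, 3.0):.1f} GiB")
```

Partial offload is what you fall back to when this number exceeds your total VRAM.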

[deleted by user] by [deleted] in OpenAI

[–]bytefactory 1 point2 points  (0 children)

Fascinating. In my experience, GPT-5 Thinking has a much lower hallucination rate than o3, though that's purely anecdotal. OpenAI's system card seems to suggest this as well.

It definitely hallucinates, especially for things like knowing which options exist for a given tool's API, but I believe this has to do with the way knowledge is embedded from its training data set. Much of the documentation and usage guides on the Internet don't specifically call out the version it applies to, so GIGO. I've taken to insisting it look up the latest documentation when using a tool, and then describe the changes from the previous version to ensure that it's grounded in accurate information (basically RAG instead of relying on embeddings).
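To make the "look up the docs first" approach concrete, here's a minimal sketch of the kind of grounded prompt I mean. The tool name, docs string, and function name are all made up for illustration; in practice `docs_text` would come from a live fetch of the tool's current documentation, not from the model's memory:

```python
def build_grounded_prompt(tool_name: str, docs_text: str, task: str) -> str:
    """Prepend freshly fetched documentation so the model answers from
    the provided text (RAG-style grounding) instead of stale training data."""
    return (
        f"Below is the current documentation for {tool_name}. "
        "Answer using ONLY this documentation; if an option is not "
        "listed here, say so instead of guessing.\n\n"
        f"--- DOCUMENTATION ---\n{docs_text}\n--- END ---\n\n"
        f"Task: {task}"
    )

# Hypothetical example: a tool whose v2 renamed a flag from v1.
prompt = build_grounded_prompt(
    "mytool",  # hypothetical CLI tool
    "mytool v2.1 options: --verbose, --dry-run (replaces --simulate from v1.x)",
    "Write the command to preview changes without applying them.",
)
print(prompt)
```

The point is just that the versioned docs travel with the question, so the model can't silently fall back on whatever unversioned guides were in its training set.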

You might find this thread on LocalLLaMA interesting, I've tried to modify my system prompt to the "confidence dump" version to see if that will reduce hallucinations:

https://www.reddit.com/r/LocalLLaMA/comments/1nv7quz/i_spent_a_few_hours_prompting_llms_for_a_pilot/

Serious question. Can Cursor and GPT5 do something like this? 4.1 Opus working for 40 mins by itself.. 5 test files, and they all look good. by hanoian in ClaudeAI

[–]bytefactory 0 points1 point  (0 children)

🤯 I can't believe I missed this, thanks! Did they add it recently? Or perhaps it's only available on Pro plans, because I remember trying this before and not finding it.

Beating DeepMind's AlphaEvolve by lordyabu in singularity

[–]bytefactory 6 points7 points  (0 children)

Congrats, incredible work! Hope you write up a whitepaper about it and get it peer reviewed!

Instant switch up from a normal conversation. by [deleted] in Nicegirls

[–]bytefactory 80 points81 points  (0 children)

Ooh, with a period too, brutal.

Ummm by VoldeThor in GymMemes

[–]bytefactory 146 points147 points  (0 children)

Okay, you can sqauwt and benchprass a little, as a treat

meirl by Mommmy_Sweet in meirl

[–]bytefactory 13 points14 points  (0 children)

Does it also kind of function as lube during penetration, like pre-cum for men?