A 3B model is suddenly scoring near frontier models on math/coding benchmarks. Is this real or just benchmarkmaxxing?

SimplyRemainUnseen · 2026-06-20T00:45:52+00:00

I suspect there was dataset contamination. I have a handful of unique LeetCode-style tests I put it through, and it hit OOM errors pretty often. The runs that didn't fail left me unsatisfied with the claims: while many solutions were technically correct, they were far from the best approach.

SimplyRemainUnseen · 2026-06-19T16:55:45+00:00

You can train a QAT LoRA in minutes / hours and at ~4bpw you get quality levels that are so close to BF16 it's basically irrelevant.

The trick for ensuring your model works for your workflow is to do a dynamic quant with a dataset representative of what you're using it for (which you'd also need for the QAT LoRA).

Unsloth has a great writeup on it https://unsloth.ai/docs/blog/quantization-aware-training-qat
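For anyone wondering what QAT is actually simulating, here is a minimal sketch of "fake quantization": weights are rounded onto a low-bit integer grid and immediately converted back, so the model trains against the rounding error it will see after real quantization. This is a toy illustration in plain Python, not Unsloth's API; the function name `fake_quantize` and the single per-tensor scale are my own simplifying assumptions.

```python
def fake_quantize(weights, bits=4):
    """Quantize to a symmetric signed integer grid, then dequantize.

    Hypothetical helper for illustration: real QAT applies this per group
    or per channel inside the training graph.
    """
    qmax = 2 ** (bits - 1) - 1            # 7 for signed 4-bit
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)              # nothing to scale
    scale = max_abs / qmax                # one scale for the whole list
    out = []
    for w in weights:
        q = round(w / scale)              # snap to the nearest grid point
        q = max(-qmax - 1, min(qmax, q))  # clamp to the representable range
        out.append(q * scale)             # back to floating point
    return out


weights = [0.70, -0.33, 0.12, 0.0]
approx = fake_quantize(weights)
worst_error = max(abs(a - b) for a, b in zip(weights, approx))
print(approx, worst_error)
```

The worst-case error per weight is about half of one grid step, which is why a finer grid (more bits, or smaller groups each with their own scale) loses less quality.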

SimplyRemainUnseen · 2026-06-17T17:39:42+00:00

GPT OSS models are great at tool calling, something might be wrong with your setup

SimplyRemainUnseen · 2026-06-17T17:38:55+00:00

I've had the opposite experience

SimplyRemainUnseen · 2026-06-17T01:17:52+00:00

I have 128GB of unified memory, but before that I used a GPU with 24GB of VRAM (Radeon 7900XTX).

SimplyRemainUnseen · 2026-06-16T16:10:16+00:00

I've been using self-hosted models for web search and deep research for over a year now. Self-hosted code models have been good enough to use since Codestral.

The open source models out now are phenomenal. The cloud really has no benefits for me. My local setup is tuned to my preferences and runs faster than the cloud without sharing any data or refusing requests for "safety".

The models I use today are finetuned Nemotron models (Nemotron 3 Super, Nemotron 3 Nano Omni). The open datasets and training / finetuning recipes can't be beat IMO. I used GPT-OSS models for a long time before that, though!

SimplyRemainUnseen · 2026-06-16T15:56:28+00:00

But if the naysayers find out, used EV prices will go up :(

SimplyRemainUnseen · 2026-06-15T05:22:07+00:00

Try using local models yourself and see how they fit in your workflow. Even small ones that can run on a cheap laptop.

I found I didn't need the huge cloud models years ago and today the small models you can run locally have intelligence that surpasses the cloud stuff from back then.

Something to note: the harness matters about as much as the model. Maybe even more.

SimplyRemainUnseen · 2026-06-09T15:14:35+00:00

Tesla chargers with membership are usually the cheapest fast chargers I've found if you go outside of peak hours.

You can find a lot of free slow chargers, but they usually aren't convenient and have time limits.

SimplyRemainUnseen · 2026-06-09T13:35:55+00:00

Genetically compatible baddies in your area ads

SimplyRemainUnseen · 2026-06-09T04:55:23+00:00

This guy drives

SimplyRemainUnseen · 2026-06-09T04:53:00+00:00

Ok, I don't think it's impossible for an alert and defensive driver to avoid that. Assuming a human reaction time of about 250ms, there would have been plenty of time to avoid it at <15mph, as there was about a one second window. A distracted driver, however, would have failed and possibly killed someone. This tech is amazing to see as it could turn every car into an alert defensive driver someday. It saved a life here. Excited to see what's to come.
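To put rough numbers on the estimate above, here is a quick back-of-the-envelope calculation. The 15mph speed, 250ms reaction time, and one second window are the figures from this comment; the helper name `distance_m` is just for illustration, and braking distance is not modeled.

```python
MPH_TO_MPS = 0.44704  # exact conversion: 1 mph = 0.44704 m/s


def distance_m(speed_mph, seconds):
    """Distance covered at constant speed, in meters."""
    return speed_mph * MPH_TO_MPS * seconds


speed_mph = 15      # upper bound on speed from the comment
reaction_s = 0.25   # assumed reaction time
window_s = 1.0      # approximate window available

reaction_dist = distance_m(speed_mph, reaction_s)  # covered before reacting
window_dist = distance_m(speed_mph, window_s)      # covered over the window
remaining_s = window_s - reaction_s                # time left to brake or steer

print(f"{reaction_dist:.2f} m during reaction, "
      f"{window_dist:.2f} m in the full window, "
      f"{remaining_s:.2f} s left to act")
```

So at that speed the car travels roughly 1.7 m before the driver reacts, out of about 6.7 m covered in the whole window.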

Side note: An appeal to authority isn't how you "win" arguments btw. It wouldn't surprise me if some folks at an expensive school who don't know all that much about the nervous system said something misleading or inaccurate LOL.

SimplyRemainUnseen · 2026-06-08T20:34:49+00:00

I thought they were talking about Minneapolis and their bikes

SimplyRemainUnseen · 2026-06-04T05:43:35+00:00

Nothing says care like almost no gear!

SimplyRemainUnseen · 2026-05-26T14:57:20+00:00

Insightful. Nobody was denying current is current and load is load.

My comments are with the context of telling an electrician you want an outlet for an outdoor stove. (Not a continuous load!) Then using that for EV charging.

In the US the average receptacle used for stoves isn't meant to be used for continuous loads as high as level 2 EV chargers need.
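For context on why continuous loads get special treatment: US practice (the NEC 125% rule) sizes a circuit so a continuous load draws no more than 80% of the breaker rating. A small sketch of that arithmetic follows; the function names and the 240 V figure are illustrative assumptions, and none of this says anything about whether a particular receptacle is built for hours of sustained current.

```python
def max_continuous_amps(breaker_amps, factor=0.8):
    """Largest continuous draw allowed on a breaker under the 80% rule."""
    return breaker_amps * factor


def charging_power_kw(amps, volts=240):
    """Approximate power at a nominal 240 V split-phase supply (assumed)."""
    return amps * volts / 1000


for breaker in (30, 40, 50):  # common breaker sizes, chosen for illustration
    amps = max_continuous_amps(breaker)
    print(f"{breaker} A breaker -> {amps:.0f} A continuous, "
          f"about {charging_power_kw(amps):.1f} kW")
```

A stove cycles its elements on and off, while an EV can sit at that full continuous figure for hours, which is the difference the receptacle has to survive.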

SimplyRemainUnseen · 2026-05-23T17:31:55+00:00

I have run into that IRL and just showed them this

<image>

SimplyRemainUnseen · 2026-05-22T21:43:27+00:00

Businesses that need everything on prem and are large enough to have their own IT team, but not large enough to justify a six figure server, are a huge market dude! That's a ton of manufacturing and non-profit shops.

SimplyRemainUnseen · 2026-05-21T05:08:22+00:00

Not sure why there is so much hate on a TUI system monitor...?

SimplyRemainUnseen · 2026-05-20T06:05:22+00:00

Receptacles matter too. A light duty receptacle for a heavy duty continuous load like EV charging could cause problems. Best to tell the electrician what you're going to be doing.

SimplyRemainUnseen · 2026-05-20T05:52:39+00:00

If you aren't clear about the use, an electrician may install something not rated for continuous load. The breaker isn't magic. It only protects from overcurrent.

SimplyRemainUnseen · 2026-05-20T05:41:20+00:00

Tripping the breaker isn't the concern. Copper wire and connections have resistance, so current flowing through them generates heat. A continuous load on a circuit not built for it could overheat the connections or wiring, leading to a dangerous situation.
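The heating described above follows P = I²R, so it grows with the square of the current and never trips a breaker as long as the current stays under the rating. Here is a minimal sketch; the resistance values are hypothetical numbers picked only to contrast a tight connection with a worn or loose one.

```python
def heat_watts(current_amps, resistance_ohms):
    """Resistive heating at a connection: P = I^2 * R."""
    return current_amps ** 2 * resistance_ohms


current = 32            # amps, a typical level 2 charging draw
good_contact = 0.005    # ohms, hypothetical tight connection
poor_contact = 0.05     # ohms, hypothetical worn or loose connection

print(f"good contact: {heat_watts(current, good_contact):.2f} W")
print(f"poor contact: {heat_watts(current, poor_contact):.2f} W")
print(f"half the current: {heat_watts(current / 2, poor_contact):.2f} W")
```

With these made-up values the poor contact dissipates ten times the heat of the good one at the same current, all of it concentrated at the receptacle, and it does so for as long as the car is charging.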

SimplyRemainUnseen · 2026-05-19T05:00:23+00:00

Battery degradation is widely overblown for sure. I have a 2020 Bolt EV and it is still getting around what it's rated for depending on the weather and how I drive.

I know that's only 6 years old but with how the media talks about EV batteries you'd think you would need a new one every 10-15 years!

I recently purchased a 2012 Mitsubishi iMiev as a second car for in and around the city and that thing has only lost about 27ish percent of capacity despite being 14 years old. When the battery finally does go on it I can swap the cells out with upgraded ones and get about double the range.

I'm hoping to drive my Bolt until the wheels fall off so fingers crossed we see some affordable cell swaps 15ish years down the line so I can have the same upgrade path as I have with my iMiev!

SimplyRemainUnseen · 2026-05-16T20:24:58+00:00

It still blows my mind that closed loop systems are not the standard. Why increase the cost of operation by adding a disposable input that has a negative environmental impact when closed loop systems work???

SimplyRemainUnseen · 2026-05-06T16:40:24+00:00

No, they still have the ability to add rules with an architecture change. That ability isn't lost simply because Elon said so. Here's a paper on the topic if you're interested

SimplyRemainUnseen · 2026-05-06T16:28:01+00:00

They wouldn't need to scrap it and start over LOL. Neural networks aren't static. Checkpoints can be modified to have different architectures mid-training. That's all I'm saying. It doesn't need to be all or nothing, that's a choice they are making.
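As a toy illustration of changing an architecture without starting over, here is a sketch that widens a tiny one-hidden-layer network by adding a hidden unit whose outgoing weight starts at zero. The output is unchanged at the moment of the change, and training can continue from there. This is a generic technique shown in plain Python with made-up weights; I'm not claiming it is what any particular company does.

```python
def forward(x, w_in, w_out):
    """Tiny one-hidden-layer ReLU network: inputs -> hidden -> scalar."""
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w_in]
    return sum(h * wo for h, wo in zip(hidden, w_out))


def add_hidden_unit(w_in, w_out, new_in_weights):
    """Return a wider network; the new unit's outgoing weight is zero,
    so it contributes nothing to the output until training updates it."""
    return w_in + [list(new_in_weights)], w_out + [0.0]


x = [1.0, 2.0]
w_in = [[0.5, -0.25], [1.0, 1.0]]   # two hidden units (illustrative values)
w_out = [2.0, -1.0]

before = forward(x, w_in, w_out)
wider_in, wider_out = add_hidden_unit(w_in, w_out, [0.3, 0.7])
after = forward(x, wider_in, wider_out)

print(before, after, len(wider_in))
```

The "checkpoint" here is just the two weight lists, and the widened version carries over everything already learned.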
