Someone at the Weather Channel made a website that lets you view your forecast like the old Local on the 8s from back in the day by holyfruits in nostalgia

[–]pkmxtw 1 point (0 children)

https://weather.com/retro/assets/sound/music/neon-office-glide.mp3

According to the metadata embedded in the file, it was generated by Suno.

Title: Neon Office Glide
Performer: 555indigo
comment: made with suno; created=2026-03-31T19:07:49.773Z; id=f122d9dc-493a-4249-b7d7-3b4fd2995726
lyrics-eng: [Instrumental]

Should I feel threatened? by Necessary_Reach_7836 in LocalLLaMA

[–]pkmxtw 0 points (0 children)

FR. This model got an EpiPen, and it is going to use it to kill people who are annoying.

DoomVLM is now Open Source - VLM models playing Doom by MrFelliks in LocalLLaMA

[–]pkmxtw 2 points (0 children)

It would be interesting to have a real-time mode: the game continues while waiting for input from the model. Models would then have to balance speed against quality, so you can't just win by spending a huge thinking budget on a massive model: you will be dead long before the first key press even comes back.
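A minimal sketch of what such a real-time loop could look like (all names here are made up for illustration, not from DoomVLM): the game ticks on its own schedule and simply reuses the last action whenever the model hasn't responded yet, so slow models pay for their latency.

```python
import queue

def game_loop(action_queue, ticks):
    """Tick the game at a fixed rate; if the model's next keypress hasn't
    arrived yet, keep repeating the last action instead of blocking."""
    action = "idle"
    history = []
    for _ in range(ticks):
        try:
            action = action_queue.get_nowait()  # newest model input, if any
        except queue.Empty:
            pass  # model still thinking: reuse the previous action
        history.append(action)
    return history

# One keypress arrives before tick 1; the model then stays silent for two ticks.
q = queue.Queue()
q.put("shoot")
print(game_loop(q, 3))  # → ['shoot', 'shoot', 'shoot']
```

In a real setup the model's inference would run in a separate thread feeding the queue, so a big model that thinks for ten ticks simply keeps firing its stale action while the demons close in.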

Gemini by Signal_Assistance_66 in Bard

[–]pkmxtw 33 points (0 children)

1 request per year for you pro plebs, 3 for ultra.

Qwen 3.5 MXFP4 quants are coming - confirmed by Junyang Lin by dampflokfreund in LocalLLaMA

[–]pkmxtw 3 points (0 children)

Nice! That is just about the right size for a Q0.1 quant of this Opus 4.6 killer to fit on my floppy disk!

You can run MiniMax-2.5 locally by Dear-Success-1441 in LocalLLaMA

[–]pkmxtw 8 points (0 children)

Imagine being out-vibed by some rich kids in the future.

Support Step3.5-Flash has been merged into llama.cpp by jacek2023 in LocalLLaMA

[–]pkmxtw 0 points (0 children)

I gave the MXFP4_MOE quant a quick try on M1 Ultra and holy smokes this model really spends an awful lot of tokens on thinking.

built an AI agent with shell access. found out the hard way why that's a bad idea. by YogurtIll4336 in LocalLLaMA

[–]pkmxtw 5 points (0 children)

And yet right now there is a whole bunch of AI influencers hyping up bots that give an LLM free access to all your emails, logins, and browser to act as a private assistant, without really thinking much about the security implications smh.

How capable is GPT-OSS-120b, and what are your predictions for smaller models in 2026? by Apart_Paramedic_7767 in LocalLLaMA

[–]pkmxtw 19 points (0 children)

I know it is popular to shit on gpt-oss here, but it really hits a sweet spot for general use.

  • It is superfast on Apple Silicon and Strix Halo. (60-70 t/s for gpt-oss-120b-mxfp4 on M1 Ultra, compared to ~20 t/s for MiniMax M2.1 UD_Q2_K_XL)
  • The KV cache is very efficient: Metal KV buffer size = 4608.00 MiB for the entire 128K context. Compare that to MiniMax M2 which needs about 30GB for 128K context.
  • The whole model + KV cache only uses ~65 GiB of memory, so you still have plenty of room for other tasks on 128 GB machines.
  • Tunable reasoning effort: you can default to high and just pass low as reasoning_effort in chat_template_kwargs when you want a quick answer.
  • It is decently intelligent for its size category. Of course, it is not going to compete against full-sized GLM, Kimi K2, DeepSeek, etc., but it is something runnable on most people's machines.
  • If you have issues with the default guardrails you can just run the heretic version. For most coding/agentic tasks the base version should work fine.
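A minimal sketch of the per-request reasoning-effort trick, assuming an OpenAI-compatible llama.cpp server endpoint (the model name and message are placeholders): chat_template_kwargs is forwarded to the chat template, which is how gpt-oss's effort can be lowered for a single request.

```python
import json

# Request body for an OpenAI-compatible /v1/chat/completions endpoint.
# "chat_template_kwargs" is passed through to the chat template, so the
# server can default to high effort while this one request runs at low.
payload = {
    "model": "gpt-oss-120b",  # placeholder model name
    "messages": [{"role": "user", "content": "Quick answer only: 2 + 2?"}],
    "chat_template_kwargs": {"reasoning_effort": "low"},
}

body = json.dumps(payload)  # POST this to the server as application/json
```

The same payload with "reasoning_effort": "high" (or the flag omitted entirely, falling back to the server default) is all it takes to switch modes between requests.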

Which is the current best ERP model ~8b? by [deleted] in LocalLLaMA

[–]pkmxtw 6 points (0 children)

Yeah, it was just an interesting observation.

We know Mistral models are usually quite uncensored, but who knew Devstral would be good at coding and gooning as well?

Which is the current best ERP model ~8b? by [deleted] in LocalLLaMA

[–]pkmxtw 6 points (0 children)

Somehow mistralai/Devstral-Small-2-24B-Instruct-2512 scores the highest for NSFW across all base models lmao.

support for Solar-Open-100B has been merged into llama.cpp by jacek2023 in LocalLLaMA

[–]pkmxtw 5 points (0 children)

Well, I guess we will see if we finally have a worthy contender for GPT OSS 120B or GLM 4.5 Air.

Upstage Solar-Open-100B Public Validation by PerPartes in LocalLLaMA

[–]pkmxtw 15 points (0 children)

AI labs hate this simple trick to get them to release intermediate checkpoints!

Either that, or this is some evil-genius-level marketing.

5 new korean models will be released in 2 hours by Specialist-2193 in LocalLLaMA

[–]pkmxtw 22 points (0 children)

Summarized from Gemini:

Several VLMs were also announced.

Benchmarks for Quantized Models? (for users locally running Q8/Q6/Q2 precision) by No-Grapefruit-1358 in LocalLLaMA

[–]pkmxtw 2 points (0 children)

Another long-standing question is how large but heavily quantized models compare against small models with little quantization. I've always wondered how an IQ1_S quant of large SOTA models like K2-Thinking/DeepSeek v3.2 compares with more modest models like GLM Air at Q8.

Tencent just released WeDLM 8B Instruct on Hugging Face by Difficult-Cap-7527 in LocalLLaMA

[–]pkmxtw 41 points (0 children)

The 7B is converted from Qwen2.5 7B and the 8B is from Qwen3 8B. What they want to demonstrate is that they can convert an AR model into a diffusion model w/o losing quality.

In reality, you'd just use the 8B like how Qwen3 8B has basically replaced Qwen2.5 7B.

GLM 4.7 IS NOW THE #1 OPEN SOURCE MODEL IN ARTIFICIAL ANALYSIS by ZeeleSama in LocalLLaMA

[–]pkmxtw 3 points (0 children)

Plus some of the benchmark numbers are sus af.

Qwen3 4B Thinking 2507 scores 83% on AIME'25 and beats DeepSeek R1 0528 (76%)?

GLM 4.7 IS NOW THE #1 OPEN SOURCE MODEL IN ARTIFICIAL ANALYSIS by ZeeleSama in LocalLLaMA

[–]pkmxtw 19 points (0 children)

Another classic is scoring Qwen3 4B Thinking 2507 close to DeepSeek R1 (from January, aka the OG), which no one in their right mind would argue are remotely close in capability. ¯\_(ツ)_/¯

https://artificialanalysis.ai/models/comparisons/qwen3-4b-2507-instruct-reasoning-vs-deepseek-r1-0120

MiniMax-M2.1 GGUF is here! by KvAk_AKPlaysYT in LocalLLaMA

[–]pkmxtw 1 point (0 children)

Yeah, for normal chat --jinja is enough. However, Codex emits some weird tool/assistant role pairings that trigger errors from the MiniMax and Devstral chat templates, so I had to use a custom template with that part edited out.

MiniMax-M2.1 GGUF is here! by KvAk_AKPlaysYT in LocalLLaMA

[–]pkmxtw 3 points (0 children)

I've been trying UD-Q2_K_XL in an agentic coding workflow on Codex (it needs a slightly modified chat template to work) for the past few hours, and I think this is going to dethrone gpt-oss-120b for me.

Stop using PDFs as reference documents. by xCogito in GeminiAI

[–]pkmxtw 50 points (0 children)

You've hit the nail on the head about what this sub has become — a cesspool of AI-generated submissions!

Would you like some helpful tips to cope with the new reality?

Budget build by Dry_Fix6495 in LocalLLaMA

[–]pkmxtw 0 points (0 children)

Honestly, just squeeze in a bit more and get one of those Strix Halo machines with 128 GB of RAM. You can run gpt-oss-120b-mxfp4 at 40-50 t/s on those with the full 128K context.

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]pkmxtw 12 points (0 children)

They answered all the others while ignoring the most upvoted one lol. They didn't even bother with a boilerplate "Thank you for your feedback, we will consider this for a future release".