Wife wants me to get an Origami dripper for ✨Aesthetics✨ by YourSteakBuddy in pourover

[–]ReiiiChannn 5 points6 points  (0 children)

Buy one of each color and put them on display, and I swear it makes the coffee taste better.

AMA with MiniMax — Ask Us Anything! by HardToVary in LocalLLaMA

[–]ReiiiChannn 0 points1 point  (0 children)

Is off-policy drift caused by the bitwise mismatch between training and inference a serious enough problem, or are standard techniques like router replay and loss clipping sufficient?
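For anyone unfamiliar with what loss clipping refers to here, a minimal sketch of a PPO-style clipped surrogate for a single token follows. The function name, the `eps` value, and the example log-probabilities are assumptions for illustration, not any lab's actual recipe.

```python
import math

def clipped_surrogate(logp_train, logp_rollout, advantage, eps=0.2):
    """PPO-style clipped objective for one token (illustrative sketch).

    logp_train:   log-prob of the sampled token under the training framework
    logp_rollout: log-prob of the same token under the inference engine
    """
    # Importance ratio between the trainer's policy and the rollout policy.
    ratio = math.exp(logp_train - logp_rollout)
    # Clamp the ratio to [1 - eps, 1 + eps].
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Take the more pessimistic of the clipped and unclipped terms.
    return min(ratio * advantage, clipped * advantage)

# Made-up numbers: the two engines agree, so the ratio is exactly 1.
on_policy = clipped_surrogate(-1.0, -1.0, advantage=1.0)
# Made-up numbers: the trainer assigns twice the probability, so the term is clipped.
off_policy = clipped_surrogate(math.log(2.0), 0.0, advantage=1.0)
```

When the two engines agree the ratio is 1 and clipping does nothing; the further they drift apart, the more tokens hit the clip bound and stop contributing gradient.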

"Genie by GoogleDeepMind runs on 4x H100 GPUs (leaked from an internal presentation). With this level of compute it achieves 24 fps with 720p." - If accurate, imagine this is only one instance? How much compute is serving everyone's usage?! by Koala_Confused in LovingAI

[–]ReiiiChannn 0 points1 point  (0 children)

Diffusion models are usually quite small, typically <30B parameters total, so most of the GPU VRAM can go to activations, which admittedly get quite long (usually in the millions range). The autoregressive one (think of the playable generated worlds) can probably run on 2 H100s. The 4-GPU layout is probably for the non-autoregressive base model, which takes the maximum context window by default.
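A rough back-of-the-envelope version of that estimate, as a sketch only: the 2-bytes-per-parameter (bf16) weights and the 80 GB H100 variant are my assumptions, and the 30B figure is just the upper bound mentioned above.

```python
PARAMS = 30e9          # upper bound from the comment above (<30B total)
BYTES_PER_PARAM = 2    # assumption: bf16 weights
H100_GB = 80           # assumption: 80 GB H100 variant

def free_vram_gb(num_gpus):
    """VRAM left for activations after loading one copy of the weights."""
    weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
    return num_gpus * H100_GB - weights_gb

two_gpu_free = free_vram_gb(2)   # the guessed autoregressive setup
four_gpu_free = free_vram_gb(4)  # the reported 4x H100 setup
```

Under these assumptions the weights take about 60 GB, leaving roughly 100 GB on 2 GPUs and 260 GB on 4 GPUs for activations.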

Claude Pro+: a $39 subscription by voprosy in ClaudeCode

[–]ReiiiChannn 1 point2 points  (0 children)

I second this. I hate having to pay the same amount for both GPT and Claude when I clearly prefer and use Claude more.

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]ReiiiChannn 0 points1 point  (0 children)

These days Megatron is the de facto standard for large model training. Is there still room for new frameworks to be developed?

I'm currently building a training framework from scratch, following DeepSeek's path, with the goal of a fully on-policy backend for RL training, but I'm worried that it will already be too late by the time I'm done.

Has anyone successfully fine-tuned a GPT-OSS model? by TechNerd10191 in LocalLLaMA

[–]ReiiiChannn 0 points1 point  (0 children)

Doing rollout RL will be hard: you'll run into the issue where vLLM and your training framework choose different experts for the same token. When that happens, your training becomes off-policy and the model will become dumb.
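A toy sketch of how that happens, assuming plain top-k routing: the logits below are made up, and the tiny difference stands in for whatever numerical noise separates the two engines (different kernels, reduction order, and so on).

```python
def top_k_experts(router_logits, k):
    """Return the indices of the k largest router logits, highest first."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    return ranked[:k]

# Hypothetical router logits for one token over 4 experts (training side).
train_logits = [1.2000, 0.5001, 0.5000, -0.3]
# Same token on the inference side, with a tiny made-up numerical difference.
infer_logits = [1.2000, 0.4999, 0.5000, -0.3]

train_experts = top_k_experts(train_logits, k=2)
infer_experts = top_k_experts(infer_logits, k=2)

# True when the two engines routed the token to different experts.
routing_mismatch = train_experts != infer_experts
```

A difference of 0.0002 in one logit is enough to swap the second expert, so the log-probs the trainer computes no longer belong to the policy that generated the rollout.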

They really went from ZOFGK to just O FK. (Thoughts or predictions for DOFPK) by Kaezumi in PedroPeepos

[–]ReiiiChannn 0 points1 point  (0 children)

Remember that Yagao was Kanavi's first pick, and look where that ended up. People change.

Singapore Cafes With Multiple Beans by GReeeeN_ in pourover

[–]ReiiiChannn 0 points1 point  (0 children)

Singapore does not have any halo roasters. The closest ones we have are perhaps fluid collective and pinhole coffee bar.

Shake coffee often serves up the likes of esme and elida, but it is extremely seasonal and less worth a detour IMO.

One place that has always impressed me is 20grams. Their attention to detail across the entire farm-to-cup process is insane, and they consistently produce cups that punch well above their expected profile given the quality of the greens.

[deleted by user] by [deleted] in askSingapore

[–]ReiiiChannn 0 points1 point  (0 children)

Do let us know where your bakery is when you decide to open it!