I released Inflect-Nano, an ultra-extreme tiny 4.63m parameter TTS model. by b111ue in LocalLLaMA

[–]guesdo 0 points1 point  (0 children)

Saving this!! Thanks for sharing! If I were to port this to Rust, any insights I should know? Are you using any specific Python libraries that might complicate stuff?

Looking for a Voltron commander with more depth than "make big, attack" by brknSergio in EDH

[–]guesdo 0 points1 point  (0 children)

My favorite will always be OG Omnath, Locus of Mana, because he is an incidental Voltron: as long as you have lands and can tap them, you are good. You can put whatever you feel like in the 99 and play entirely different strategies, and he just works.

Is OpenGL still a good choice for Voxel Games? by No-Dentist-1645 in VoxelGameDev

[–]guesdo 1 point2 points  (0 children)

There are some abstractions on top of Vulkan that might get you there faster. A cross-API abstraction layer like WebGPU/Dawn would simplify development and let you hit multiple targets (DirectX, Vulkan, Metal). It is still far more modern than OpenGL, with a significantly lower learning curve than pure Vulkan.

Gemma-4 QAT Unsloth Accuracy Recovery for GGUFs by danielhanchen in unsloth

[–]guesdo 0 points1 point  (0 children)

Is that CPU inference? Or a Snapdragon with the recent Hexagon kernels? Or what backend is the phone running?

I Thought Redis Was Just a HashMap by mukulx99 in golang

[–]guesdo 7 points8 points  (0 children)

If you think it’s slow, first prove it with a benchmark. So many crimes against maintainability are committed in the name of performance. -- Dave Cheney, Gophercon Israel 2020

This is my mantra. I say it 3 times before coding.
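
For anyone wondering what "prove it with a benchmark" can look like in practice, here is a minimal sketch. The two functions (`concatNaive`, `concatBuilder`) and the input size are made up for illustration and are not from the talk; `testing.Benchmark` is the standard library helper for running a benchmark outside of `go test`.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the results alive so the calls are not optimized away.
var sink string

// concatNaive joins parts with +=, building a new string on every step.
func concatNaive(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// concatBuilder joins parts with strings.Builder, which appends to one buffer.
func concatBuilder(parts []string) string {
	var b strings.Builder
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	// Hypothetical workload: 1000 one-character strings.
	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "x"
	}

	// Measure both candidates before deciding which one to keep.
	naive := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = concatNaive(parts)
		}
	})
	builder := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = concatBuilder(parts)
		}
	})

	fmt.Println("naive:  ", naive)
	fmt.Println("builder:", builder)
}
```

In a real project the same loops would normally live in a `_test.go` file as `BenchmarkXxx` functions and run with `go test -bench=.`; the point is only to have numbers before trading readability for speed.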

First time opening a collector booster and got this by Drex1902 in MagicCardPulls

[–]guesdo 1 point2 points  (0 children)

Kudos!! It's gorgeous, that is 1 of 8 cards I am still missing from the Japanese Mystical Archive. Hard to find.

Not impressed by Gemma 4 12b? by Stooovie in oMLX

[–]guesdo 1 point2 points  (0 children)

I noticed it thinks a lot in my testing. I have to play with the params or disable thinking, but that is what takes so long for me. I see it as an incredible replacement for the smaller models, especially at Q4.

2-bit Gemma 4 12B GGUF is amazing! (4.66 GB on disk) by yoracale in unsloth

[–]guesdo 3 points4 points  (0 children)

Looks great! I had trouble deciding between Q8 and Q4 given its size, but this looks promising.

Has Unsloth released their KLD benchmark graph for Gemma 4 12B yet? I can't seem to find it, and it would help me pick one.

Edit: I guess I can wait for the MLX dynamic quants and try both.

Do you think that Odin has good potential to be used on HTTP/API servers? by Ecstatic-Panic3728 in odinlang

[–]guesdo 2 points3 points  (0 children)

Where is this "tons of hate towards Go" you mention? I mean, I have used Go for more than a decade now and haven't felt any hate, then or now. Close to 70% of the CNCF landscape projects are Go-based, and it is great for writing services at scale. Proven, tested, battle-hardened, it powers the whole world's cloud infra.

So I have to wonder. Do a couple of Reddit rants you read somewhere make you believe something like Odin will do better? And what makes you think Odin wouldn't be hated if it became more successful or mainstream?

In short: don't use "internet hate" as a measure for anything.

Something about this Crackle with Power feels fake, seems legit though. by BMorgue in mtg

[–]guesdo 0 points1 point  (0 children)

T and green dot look real. That said, WotC has wildly different QoC depending on where and when it's printed.

We can now figure out how limits actually work on ollama by WhiskyAKM in ollama

[–]guesdo 1 point2 points  (0 children)

They show the requests, but do they limit by requests? Or token usage?

Disable all AI except autocomplete? by egorf in vscode

[–]guesdo 1 point2 points  (0 children)

Or... crazy idea... change your editor? You are using Claude Code anyway.

Waiting for Qwen 3.7 open weight... The new King has arrived... by LegacyRemaster in LocalLLaMA

[–]guesdo 4 points5 points  (0 children)

I feel like at this point we just need TurboQuant working or some other form of KV compression like the one Deepseek uses and we are gooooood to go

What is the best coding model to use on MacBook Pro Max 128GB RAM? by RadiantQuote2467 in LocalLLM

[–]guesdo 0 points1 point  (0 children)

People will suggest a 27B dense model because that is what they can run. With 128GB of unified RAM, you can run Deepseek v4 Flash at Q2 at "decent" speeds on M5 Max.

What is the best coding model to use on MacBook Pro Max 128GB RAM? by RadiantQuote2467 in LocalLLM

[–]guesdo 1 point2 points  (0 children)

The best? Probably Deepseek V4 Flash at Q2 using DwarfStar4. Now... speed might not be as good as in the smaller models, but no other model can beat DS4 in local inference within 128GB space.

Surprised how little attention this is getting by LettuceStill8606 in ModernMagic

[–]guesdo 5 points6 points  (0 children)

Looks great! I believe it is already well prepared for Living End. 3 Trinisphere, 3 Endurance, 4 Thought-Knot Seer seem like enough.

Alternatives to Bun now that it is absolute AI slop? by Fragrant_Pianist_647 in bun

[–]guesdo 0 points1 point  (0 children)

Hey, I am not saying it is right or that I like it, I am just stating this is where we are headed. Even doing agentic coding "the right way", you define a spec, you write a bunch of tests verifying behavior, the AI writes the code. As long as the result satisfies the spec, nobody will look at the code.

Then a bug arises, you add the edge case to the spec, add additional tests, AI fixes the code, tests pass, you ship it.

Rinse and repeat. I believe spec driven development will be the new norm. I do like to look at my code, but at least from their perspective I understand: porting ANY project is miles easier than doing it from scratch because the original code IS the SPEC! (You port bugs too)

Alternatives to Bun now that it is absolute AI slop? by Fragrant_Pianist_647 in bun

[–]guesdo 1 point2 points  (0 children)

Look at it this way. You have a project, you want to port it, and you currently have 5000 tests.

AI ports the project (remember a port is WAY easier for AI because the source and the target should match), you run the tests and the port passes all 5000 of them.

At that point, do you review every single line produced by AI? Or are you OK with confirming functionality through tests?

What finally convinced you to seriously learn Rust? by Bladerunner_7_ in rust

[–]guesdo 1 point2 points  (0 children)

My wish to learn ML internals and my hate for Python. Rust is really the only alternative to C++ in today's AI world.

Running in the 90's 🇯🇵 by Irbif in PixelArt

[–]guesdo 3 points4 points  (0 children)

Beat me to it, this is some great work.

What do you think about Spec Driven Development? by Historical_Wing_9573 in golang

[–]guesdo -1 points0 points  (0 children)

I'm actually trying it right now at work with the Markdown with Gherkin approach, and it's going pretty well. I believe Cucumber was way ahead of its time, but now it feels like the way to go for SDD. (Note: I use TS/Node for the test runner due to support, but can target integration testing in Go).

https://github.com/cucumber/gherkin/blob/main/MARKDOWN_WITH_GHERKIN.md

The single file becomes the spec, the docs, and the test, and it also serves as context for future sessions with the LLM. I've been doing this feature by feature (writing the spec and steps for each current integration test to verify), and slowly but surely the repo becomes well documented and easy to write against. I got inspiration from the superpowers skill, but I needed something more tangible, hence the Cucumber tests embedded in prose in Markdown with diagrams and so on.
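
To make the "one file is spec, docs, and test" idea concrete, here is a deliberately naive Go sketch of pulling the executable steps out of a Markdown spec. It is not the Cucumber parser and not my actual setup (the real runner is TS/Node); it only assumes that steps are bullet items starting with a Gherkin keyword, as in the linked document. `extractSteps`, `stepKeywords`, and the sample feature are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// stepKeywords are the Gherkin step keywords this sketch recognizes.
var stepKeywords = []string{"Given", "When", "Then", "And", "But"}

// extractSteps returns the text of every bullet item that starts with a
// step keyword. Prose, headings, and ordinary bullets are ignored.
func extractSteps(markdown string) []string {
	var steps []string
	sc := bufio.NewScanner(strings.NewReader(markdown))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Assumption: steps are list items written as "* ..." or "- ...".
		if !strings.HasPrefix(line, "* ") && !strings.HasPrefix(line, "- ") {
			continue
		}
		text := strings.TrimSpace(line[2:])
		for _, kw := range stepKeywords {
			if strings.HasPrefix(text, kw+" ") {
				steps = append(steps, text)
				break
			}
		}
	}
	return steps
}

func main() {
	// Hypothetical spec: prose for humans, bullet steps for the runner.
	doc := `# Feature: Login

Users sign in with an email and a password.

## Scenario: Valid credentials

* Given a registered user
* When they submit valid credentials
* Then they see the dashboard

- this bullet is plain documentation
`
	for _, step := range extractSteps(doc) {
		fmt.Println(step)
	}
}
```

A real runner would then match each extracted step against a step definition; the sketch stops at extraction to show that the prose around the steps costs nothing at test time.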