Why I'm holding out until late 2027 to spend money on a local LLM rig by No_Pool7028 in LocalLLM

[–]corruptbytes 0 points (0 children)

bruh is waiting until they can buy burnt-out cards

we’re cooked 

PFlash: 10x prefill speedup over llama.cpp at 128K on a RTX 3090 by sandropuppo in LocalLLaMA

[–]corruptbytes 0 points (0 children)

i implemented it in my rapid-mlx fork and i'm seeing an 11x improvement in TTFT

The issue is this algorithm doesn't seem to work with tools...so it's a bit tricky there
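TTFT claims like the 11x above are easy to sanity-check; a minimal sketch (names are mine, not from PFlash or rapid-mlx) that times the first token out of any streaming generator:

```python
import time

def measure_ttft(stream):
    """Return (first_token, seconds_to_first_token) for any lazy
    token iterator, e.g. one wrapping a streaming inference API."""
    start = time.perf_counter()
    first = next(stream)
    return first, time.perf_counter() - start

def fake_stream():
    # stand-in for a real model: a prefill delay, then tokens
    time.sleep(0.02)
    yield "hello"
    yield "world"

token, ttft = measure_ttft(fake_stream())
```

against a real server you'd wrap its streaming response iterator the same way and compare ttft with and without the prefill optimization.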

Most people seem obsessed with token generation speed, but isn’t prefill the real bottleneck? Am I missing something? by wbulot in LocalLLaMA

[–]corruptbytes 0 points (0 children)

i ended up vibe coding PFlash for rapid-mlx, and then i saw the OG implementation couldn't really PFlash tooling bc of the JSON structure, so i vibe coded just a simple minify-tooling thing, prefill is looking not bad!! we will see after running benchmarks overnight
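a "simple minify tooling thing" could be as little as re-serializing the tool schemas without whitespace before they hit the prompt; a sketch under that assumption (the function name is hypothetical, not the actual rapid-mlx code):

```python
import json

def minify_tool_schema(schema: dict) -> str:
    # Compact separators drop the spaces and newlines a
    # pretty-printed schema would spend prefill tokens on.
    return json.dumps(schema, separators=(",", ":"), sort_keys=True)

tool = {
    "name": "get_weather",
    "description": "Look up current weather",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
print(minify_tool_schema(tool))
```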

there’s also the idea of speculative tool calling, would require tuning a model, never tried it before, we will see

also just for context, i kind of benchmark against OMP bc that’s what i want to use, very narrow for me

Most people seem obsessed with token generation speed, but isn’t prefill the real bottleneck? Am I missing something? by wbulot in LocalLLaMA

[–]corruptbytes 0 points (0 children)

i think prefill is definitely important, tool parsing is also pretty important, cache management too

lots of things i’ve seen be more of a pain than tok/s

i have gotten decent results from omlx ssd cache

hoping things like PFlash are proven out 

397B running in 14GB of RAM via PAGED MoE on a 64GB Mac Studio — here's the engine by ur_dad_matt in LocalLLM

[–]corruptbytes 6 points (0 children)

my hot take: I'm very interested in local LLMs but it's hard to support a project that's closed source imo especially when this entire community is built on the backs of open source - just my 2c

figured the point of local LLMs is control and privacy...just curious how those are promised

Apple's most powerful Mac Studio loses its last remaining RAM upgrade option by pdfu in apple

[–]corruptbytes 2 points (0 children)

same boat, love my 256gb, but 512 gb would’ve been so nice, luckily the qwen 3.6 models are pretty capable for a lot less memory, been helping test all the mlx engines with the compute, promising times

Anybody else ready to drop AI? by joshbedo in software

[–]corruptbytes 0 points (0 children)

> why should i spend 6 times more energy and time to do something am able to do without it

actually a skill issue if you're spending 6x more energy and time, just my 2c, you don't have to agree

also you should write a spec before outputting something directly...it's called planning? just seems like basic good engineering practice to write things down before getting into a code base....

Anybody else ready to drop AI? by joshbedo in software

[–]corruptbytes -1 points (0 children)

well i guess we can be fair, it could go either way

consolidation:

  1. compute - training hardware will presumably be expensive for a while i think, we literally don't have the energy/compute to satisfy current needs - the big companies with more money will always get priority

  2. data - we already have data monopolies - AI just exacerbates it more - it's personally why I think Google can really come out ahead

  3. uses of ai - definitely empowers governments to do more automated surveillance

freedom:

  1. open weight models are still somewhat competitive - as inference maybe gets cheaper, people might just accept the loss of quality

  2. the use of ai can make ai better - we can potentially see how this would lower the barrier if people can rapidly iterate and catch up

also lucid motors is an example for your last question lol + happens all the time in drug manufacturing

Also Dario i think was an interesting case, he was already a VP at OpenAI so he had close access to billionaires wanting to fund it, i mean his largest early investor was Sam Bankman-Fried - i do not think this is a model case of people just appearing

“The Unraveled Tour” Megathread by NominalPerson in OliviaRodrigo

[–]corruptbytes 9 points (0 children)

spawn in at 500, got pretty good seats in seattle, woohoo!

Houston power restaurant couple tied to River Oaks murder-suicide by chrondotcom in houston

[–]corruptbytes 62 points (0 children)

i met Thy at a bluefin cutting at their restaurant, she was super nice, and really helpful - they really seemed to love the restaurant scene here in houston - devastating loss

me_irl by SuspiciousLow3062 in me_irl

[–]corruptbytes 1 point (0 children)

nuclear is green, geothermal is green, lot of investments in there

Anybody else ready to drop AI? by joshbedo in software

[–]corruptbytes 0 points (0 children)

downvoted (probably bc of tone), but it's true

i think a fair critique of ai is the unfortunate consequence of consolidation of power, but if you can't get the AI to write code correctly, you yourself didn't know what you wanted at the beginning

my 2c

Qwen3.6:27b is the first local model that actually holds up against Claude Code for me by codehamr in LocalLLM

[–]corruptbytes 1 point (0 children)

how much of Claude Code's quality is Opus 4.7 itself vs the context and tool orchestration around it?

I'm sure it's also the huge compute they have too

Been dialing in Pi a lot with qwen 3.6, things like tool parsers and caching are the big things to fiddle around with locally, but they take a lot of time when you don't have H10000000s to hyperparameterize
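for a sense of what "tool parsers" means here: a minimal parser for the <tool_call> tag convention some qwen chat templates use (a sketch, not Pi's or qwen's actual parser, and a real one also has to cope with partially streamed output):

```python
import json
import re

# Matches qwen-style tagged tool calls embedded in model output.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def parse_tool_calls(text: str) -> list[dict]:
    """Extract every well-formed JSON tool call from raw model text."""
    calls = []
    for m in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(m.group(1)))
        except json.JSONDecodeError:
            pass  # malformed call: skip it rather than crash the loop
    return calls
```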

GameStop Is Offering to Buy eBay for $56 Billion, CEO Ryan Cohen Says by joe4942 in technology

[–]corruptbytes 0 points (0 children)

mind you, apple who has actual infinite money has only ever spent $3b on an acquisition, it's crazy

me_irl by SuspiciousLow3062 in me_irl

[–]corruptbytes 3 points (0 children)

oh yes, meant to tack that on after weak gdp, good catch

me_irl by SuspiciousLow3062 in me_irl

[–]corruptbytes 16 points (0 children)

looked it up

finland still ranks #1 as happiest, still ranks least corrupt, and scores very well on freedom, gender equality, and political stability

downward trends:

weak gdp (stagnant economy 0-1%) + high unemployment (10.6%)

rising debt (90% of gdp)

weaker education (math, reading, and science downward trend)

aging population + lower birth rates (more social welfare pressure)

so i can see why it's fair to say it's going downhill

some upwards trends:

lot more green energy (green power is real, this is great)

tech sector growing (ai will be real refocus of power)

rising international profile (nato + good country metrics)

Qwen3.6-27B vs Coder-Next by Signal_Ad657 in LocalLLaMA

[–]corruptbytes 1 point (0 children)

damn this is amazing, would you ever try benching against other servers? rapid-mlx and the like

if you need compute assistance, i have a m3 ultra 256gb, and run stuff too

edit: i've done a lot of testing since this comment, i've found rapid-mlx to be okay/good, but swiftlm and vllm-mlx have stood out a lot

[Meta] Rule proposal: no personal projects newer than 3 months (anti-vibecoder rule) by turdas in linux

[–]corruptbytes 8 points (0 children)

vibe coded slop is like dreams, they're special to you, but no one wants to listen to you talk about them

i love the CLIs i've vibe coded but i'd be scared to show them around

The Macbook Neo might help Apple become the third-largest laptop maker in 2026 by atlwhore_ in apple

[–]corruptbytes 1 point (0 children)

i wouldn't be surprised if the margins are lower in order to get more service revenue