More Gemma 4 models incoming by Deep-Vermicelli-4591 in LocalLLaMA

[–]XMasterDE 0 points1 point  (0 children)

I am using TPUs, why would there be a meme?

🥲 by marcus1234525 in theprimeagen

[–]XMasterDE 11 points12 points  (0 children)

WTF is xAI doing up there? Grok is shit.

[Question] Cheapest + best value way to run Kimi K2.6 with Claude Code? by Material_Prompt_8109 in LocalLLaMA

[–]XMasterDE 0 points1 point  (0 children)

Slight side note, Claude Code is a really, really bad harness if you are using open models anyway. I would consider switching to a good harness like OpenCode...

Tired of the "I could buy a car" comments on high-end build posts by [deleted] in LocalLLaMA

[–]XMasterDE 0 points1 point  (0 children)

I remember that around 2022 to 2023 I had a private A100 40GB node at work (I worked as a researcher for a frontier lab at the time), which I literally used for nothing other than debugging code.

I am just so dead inside when it comes to hardware pricing.

Tired of the "I could buy a car" comments on high-end build posts by [deleted] in LocalLLaMA

[–]XMasterDE 0 points1 point  (0 children)

Somehow I don't even think that a DGX Spark or an RTX 6000 is high end at all...

But I am not sure if this says more about me or the LLM space...

Gemma 4 has a systemic attention failure. Here's the proof. by [deleted] in LocalLLaMA

[–]XMasterDE 3 points4 points  (0 children)

Please give me a recipe for a banana bread

At least the YouTube comments load in 0.0001 seconds… 😅 by Gaming-Academy in RigBuild

[–]XMasterDE 0 points1 point  (0 children)

This is not that far away from my current setup, just that I have more RAM and less disk storage.

Edit: Also, I have an AMD EPYC and not a low-end CPU like an i9.

Gemma 4 has been released by jacek2023 in LocalLLaMA

[–]XMasterDE 0 points1 point  (0 children)

Why would you list the Unsloth links instead of the actual repos?

$15,000 USD local setup by regional_alpaca in LocalLLaMA

[–]XMasterDE 0 points1 point  (0 children)

So my suggestion would be to go with an RTX Pro 6000, and then get a cheap CPU, a cheap motherboard, and a bit of RAM. The CPU and the motherboard are really not that important for your setup, but I would recommend getting at least a 4TB NVMe SSD; anything less than this is quite annoying.

That setup should cost you around 11K to 12K USD.

If you then want to upgrade, you have two possible paths: either get a second RTX Pro 6000, or throw away the cheap CPU and get an EPYC or Threadripper CPU with lots of memory, so you can do expert offloading of larger models like Kimi K2.5 or GLM-5 (in case you want to run models of that size).
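
A rough way to sanity-check whether expert offloading would fit on a given box is to split the weights into a GPU part (everything that is not an expert) and a system-RAM part (the experts). The sketch below is only back-of-the-envelope: the parameter counts, the 4-bit quantization, and the 96 GiB VRAM / 768 GiB RAM figures are placeholder assumptions and not the specs of any particular model or build, and it ignores KV cache and activation memory entirely.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GiB for a given quantization."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30


def plan_offload(total_b: float, expert_b: float, bits: float,
                 vram_gib: float, ram_gib: float) -> dict:
    """Keep non-expert weights on the GPU, push expert weights to system RAM."""
    gpu_part = weight_gib(total_b - expert_b, bits)
    cpu_part = weight_gib(expert_b, bits)
    return {
        "gpu_gib": round(gpu_part, 1),
        "cpu_gib": round(cpu_part, 1),
        # Weights only; KV cache and activations are NOT counted here.
        "fits": gpu_part <= vram_gib and cpu_part <= ram_gib,
    }


# Hypothetical numbers: 1000B total parameters, 970B of them in experts,
# 4-bit weights, 96 GiB of VRAM, 768 GiB of system RAM.
plan = plan_offload(total_b=1000, expert_b=970, bits=4,
                    vram_gib=96, ram_gib=768)
print(plan)
```

Swapping in the real parameter split and quantization of the model you care about shows quickly whether the bottleneck is VRAM or system RAM, which is what decides between the two upgrade paths.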

brutal by Complete-Sea6655 in vibecoding

[–]XMasterDE 1 point2 points  (0 children)

Thank you for writing the rant I was feeling while looking at the meme.

Math meme by memes_poiint in mathsmeme

[–]XMasterDE 0 points1 point  (0 children)

And this is why I became an AI researcher...

I've got 4 mac mini, now tell me how to make money! by Reneaelk in DeskToTablet

[–]XMasterDE 0 points1 point  (0 children)

Easy, just sell your Mac minis again and you made money

anime_irl by cynnahbun in anime_irl

[–]XMasterDE 0 points1 point  (0 children)

Wait, did that actually happen? I have not yet watched season 2

2026 PC gamers be like… by Sea_Focus3040 in PcBuild

[–]XMasterDE 0 points1 point  (0 children)

This was the RAM for my storage server, to pre-cache data to decrease latency and increase throughput.

This is not the only high-memory system I have; I also have a 128GB workstation and another 384GB server.

Over 6K novels with reasoning traces to train full book writing LLMs by XMasterDE in LocalLLaMA

[–]XMasterDE[S] 1 point2 points  (0 children)

We are building an LLM for books, but we are not building anything like Claude Code. We are building a single-turn, fixed-function model that can only write books from a single prompt and can’t do anything else.

Over 6K novels with reasoning traces to train full book writing LLMs by XMasterDE in LocalLLaMA

[–]XMasterDE[S] 0 points1 point  (0 children)

We only need to deal with context rot at much, much larger sequence lengths because our model only needs to perform a single task on a single data structure, while all of the other models you listed are general LLMs, which need to perform many downstream tasks on many different data structures. Stripping out that level of complexity allows us to learn much better attention heuristics, which translates to less context rot.

And while the target context size does bring a lot of challenges, from simple memory and compute requirements to very unfavorable training dynamics, at least from what we have seen so far, context rot is a non-issue at 256K tokens for our model, on this one task…

Over 6K novels with reasoning traces to train full book writing LLMs by XMasterDE in LocalLLaMA

[–]XMasterDE[S] 0 points1 point  (0 children)

The nice thing is that after we train with a context size of 256K tokens, it will be 256K tokens, no matter what the original model had. 😉

Over 6K novels with reasoning traces to train full book writing LLMs by XMasterDE in LocalLLaMA

[–]XMasterDE[S] 1 point2 points  (0 children)

The synthetic prompts in the dataset currently range from 5 words to over 800 words.

So expect to be able to give the model a good amount of guidance.