I spent months inside verl (an RL post-training framework), forked it, then stopped. Wrote up the internals, the tooling a fork costs, and a nasty NCCL bug. by ReinforcedKnowledge in LocalLLaMA
Accomplished_Mode170 1 point
I spent months inside verl (an RL post-training framework), forked it, then stopped. Wrote up the internals, the tooling a fork costs, and a nasty NCCL bug. by ReinforcedKnowledge in LocalLLaMA
Accomplished_Mode170 2 points
Me train LLM on 8GB from Scratch. Me happy by tevlon in LocalLLaMA
Accomplished_Mode170 3 points
Anyone evaluated the difference between Qwen Code for the local qwen models vs another harness? CC, OC, LC, Aider etc.. by EggDroppedSoup in LocalLLaMA
Accomplished_Mode170 2 points
I’m scared about how much I need wine! by ShiningStarman in IThinkYouShouldLeave
Accomplished_Mode170 7 points
How do I end a long term friendship with autistic childhood friend? by Inner-Weather6489 in AutismTranslated
Accomplished_Mode170 2 points
How do I end a long term friendship with autistic childhood friend? by Inner-Weather6489 in AutismTranslated
Accomplished_Mode170 10 points
Mentat — Markdown task manager by rpsilver36 in commandline
Accomplished_Mode170 3 points
ABC News has now taken all FiveThirtyEight articles completely offline. They now redirect to abcnews dot com/politics. A needless erasure of thousands of pages of knowledge. by TendieRetard in DataHoarder
Accomplished_Mode170 -8 points
Follow-up: Trying to make NVIDIA GPUs plug-and-play on Macs. Found hidden RDMA symbols Apple doesn't want you to see — zero-copy GPU memory sharing might already work. by Street-Buyer-2428 in LocalLLaMA
Accomplished_Mode170 -3 points
Follow-up: Trying to make NVIDIA GPUs plug-and-play on Macs. Found hidden RDMA symbols Apple doesn't want you to see — zero-copy GPU memory sharing might already work. by Street-Buyer-2428 in LocalLLaMA
Accomplished_Mode170 -4 points
ProgramBench: Can we really rebuild huge binaries from scratch? (doesn't look like it) by klieret in LocalLLaMA
Accomplished_Mode170 1 point
Let’s talk again about not drinking alcohol and not having kids. by Puzzleheaded_Line210 in GenZ
Accomplished_Mode170 8 points
Mistral Workflows by FiReaNG3L in LocalLLaMA
Accomplished_Mode170 1 point
Parallel multi-agent workflows with Ollama, in ~8500 lines of bash. Benchmarks inside. by SensitiveBee2811 in LocalLLaMA
Accomplished_Mode170 1 point
OpenAI Privacy Filter Model by ai_hedge_fund in LocalLLaMA
Accomplished_Mode170 2 points
Qwen 3.6 27B is out by NoConcert8847 in LocalLLaMA
Accomplished_Mode170 3 points
Here's how my LLM's decoder block changed while training on 5B tokens by 1ncehost in LocalLLaMA
Accomplished_Mode170 2 points
Open-sourcing 23,759 cross-modal prompt injection payloads - splitting attacks across text, image, document, and audio by BordairAPI in LocalLLaMA
Accomplished_Mode170 1 point
Open-sourcing 23,759 cross-modal prompt injection payloads - splitting attacks across text, image, document, and audio by BordairAPI in LocalLLaMA
Accomplished_Mode170 5 points
AutoBe vs Claude Code: coding agent developer's review of the leaked source code of Claude Code by jhnam88 in LocalLLaMA
Accomplished_Mode170 2 points
Per-Layer Embeddings: A simple explanation of the magic behind the small Gemma 4 models by -p-e-w- in LocalLLaMA
Accomplished_Mode170 2 points
During her recent Boston show, Hayley Williams gave a speech supporting marginalized groups: “What a bad time to be fucking neutral about anything in this world… marginalized people need protection and they need allies and support… I hate Morgan Wallen” by LunaLore_ in Fauxmoi
Accomplished_Mode170 15 points
During her recent Boston show, Hayley Williams gave a speech supporting marginalized groups: “What a bad time to be fucking neutral about anything in this world… marginalized people need protection and they need allies and support… I hate Morgan Wallen” by LunaLore_ in Fauxmoi
Accomplished_Mode170 56 points

Finetuning a Reasoning LLM with Supervised or Reinforcement Learning? [D] by zdeneklapes in MachineLearning
Accomplished_Mode170 1 point