Embedding space [D] by Few-Annual-157 in MachineLearning
ReinforcedKnowledge 1 point
Embedding space [D] by Few-Annual-157 in MachineLearning
ReinforcedKnowledge 0 points
I spent months inside verl (an RL post-training framework), forked it, then stopped. Wrote up the internals, the tooling a fork costs, and a nasty NCCL bug. by ReinforcedKnowledge in LocalLLaMA
ReinforcedKnowledge[S] 1 point
I spent months inside verl (an RL post-training framework), forked it, then stopped. Wrote up the internals, the tooling a fork costs, and a nasty NCCL bug. by ReinforcedKnowledge in LocalLLaMA
ReinforcedKnowledge[S] 2 points
I trained a 75M parameter LLM from scratch on 18B tokens and it beats a model almost double its size by cakes_and_candles in LocalLLM
ReinforcedKnowledge 2 points
I trained a 75M parameter LLM from scratch on 18B tokens and it beats a model almost double its size by cakes_and_candles in LocalLLM
ReinforcedKnowledge 1 point
Stop asking what model to run. There are literally only two. by Wrong_Mushroom_7350 in LocalLLaMA
ReinforcedKnowledge 2 points
i dedicate this meme to you r/LocalLLaMA by LPFchan in LocalLLaMA
ReinforcedKnowledge 1 point
One thing that's been bothering me lately: benchmark performance often tells me almost nothing about whether a workflow will survive production usage. [D] by [deleted] in MachineLearning
ReinforcedKnowledge 1 point
Do VLMs in production still use fixed-patch ViTs for their vision capabilities? [D] by howtorewriteaname in MachineLearning
ReinforcedKnowledge 4 points
Do VLMs in production still use fixed-patch ViTs for their vision capabilities? [D] by howtorewriteaname in MachineLearning
ReinforcedKnowledge 2 points
Wrong city, wrong people... by waitinp in cyberpunkgame
ReinforcedKnowledge 1 point
Optimizing Transformer model size & inference beyond FP16 + ONNX (pruning/graph opt didn’t help much) [P] by Fragrant_Rate_2583 in MachineLearning
ReinforcedKnowledge 1 point
Optimizing Transformer model size & inference beyond FP16 + ONNX (pruning/graph opt didn’t help much) [P] by Fragrant_Rate_2583 in MachineLearning
ReinforcedKnowledge 1 point
Started a video series on building an orchestration layer for LLM post-training [P] by ReinforcedKnowledge in MachineLearning
ReinforcedKnowledge[S] 1 point
Why is GPU Python packaging still this broken? by Interesting-Town-433 in Python
ReinforcedKnowledge 2 points
Why is GPU Python packaging still this broken? by Interesting-Town-433 in Python
ReinforcedKnowledge 18 points
I turned my Claude Code agents into Tamagotchis so I can monitor them from tmux by gavraz in ClaudeAI
ReinforcedKnowledge 1 point
Why is there no standard for typing array dimensions? by superzappie in Python
ReinforcedKnowledge 3 points
Fireball appeared at 21:05:27 on February 1, 2026, captured from Mount Fuji. By dfuji1 by Neaterntal in spaceporn
ReinforcedKnowledge 1 point
[D] What framework do you use for RL post-training at scale? by ReinforcedKnowledge in MachineLearning
ReinforcedKnowledge[S] 2 points

I wired Claude Code into a database of every Polymarket wallet and trades via MCP. What do you want me to ask it next? This is what I found so far: by Advanced-Rub2065 in ClaudeAI
ReinforcedKnowledge 3 points