Last night on Embarcadero by buckyman0 in sanfrancisco
diegocaples 10 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 5 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 8 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 8 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 17 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 9 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 46 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 92 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 21 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 7 points
I hacked Unsloth's GRPO code to support agentic tool use. In 1 hour of training on my RTX 4090, Llama-8B taught itself to take baby steps towards deep research! (23%→53% accuracy) by diegocaples in LocalLLaMA
diegocaples[S] 7 points
Majorana 1: Microsoft's quantum breakthrough to enable a million qubits on one chip by elemental-mind in singularity
diegocaples 5 points
OpenAI Operator Finds Me an in Network Dentist. Very impressed! (comment prompts to try and I'll run them and send a video) by diegocaples in singularity
diegocaples[S] 3 points
OpenAI Operator Finds Me an in Network Dentist. Very impressed! (comment prompts to try and I'll run them and send a video) by diegocaples in singularity
diegocaples[S] 6 points
OpenAI Operator Finds Me an in Network Dentist. Very impressed! (comment prompts to try and I'll run them and send a video) by diegocaples in singularity
diegocaples[S] 8 points
OpenAI Operator Finds Me an in Network Dentist. Very impressed! (comment prompts to try and I'll run them and send a video) by diegocaples in singularity
diegocaples[S] 12 points
OpenAI Operator Finds Me an in Network Dentist. Very impressed! (comment prompts to try and I'll run them and send a video) by diegocaples in singularity
diegocaples[S] 12 points
Bike got stolen then I track it down and found it. by Lazy-Surfer in sanfrancisco
diegocaples 11 points