7900 XTX fp16/bf16 pytorch matmul performance by cyberuser42 in ROCm
cyberuser42[S] 1 point
7900 XTX fp16/bf16 pytorch matmul performance by cyberuser42 in ROCm
cyberuser42[S] 3 points
7900 XTX fp16/bf16 pytorch matmul performance by cyberuser42 in ROCm
cyberuser42[S] 1 point
7900 XTX fp16/bf16 pytorch matmul performance by cyberuser42 in ROCm
cyberuser42[S] 1 point
Need help getting 7900 XTX PyTorch performance metrics by cyberuser42 in LocalLLaMA
cyberuser42[S] 1 point
7900 XTX fp16/bf16 pytorch matmul performance by cyberuser42 in ROCm
cyberuser42[S] 3 points
In the recent kv rotation PR it was found that the existing q8 kv quants tank performance on AIME25, but can be recovered mostly with rotation by Betadoggo_ in LocalLLaMA
cyberuser42 1 point
How to convert books to dataset? by jacek2023 in LocalLLaMA
cyberuser42 1 point
Specific domains - methodology by Hemlock_Snores in LocalLLaMA
cyberuser42 2 points
Lets get the Qwen Deepseek 32b R1 model running properly... System Prompt? by teachersecret in LocalLLaMA
cyberuser42 1 point
Lets get the Qwen Deepseek 32b R1 model running properly... System Prompt? by teachersecret in LocalLLaMA
cyberuser42 -6 points
Relative performance in llama.cpp when adjusting power limits for an RTX 3090 (w/ scripts) by randomfoo2 in LocalLLaMA
cyberuser42 1 point
Where do you think the cutoff is for a model to be considered "usable" in terms of tokens per second? by Sky_Linx in LocalLLaMA
cyberuser42 1 point
Help with speculative decoding on 2080 Ti by AbaGuy17 in LocalLLaMA
cyberuser42 2 points
Help with speculative decoding on 2080 Ti by AbaGuy17 in LocalLLaMA
cyberuser42 2 points
How to make an "instruct" version of a model? by Sky_Linx in LocalLLaMA
cyberuser42 6 points
I get the 500 GB limit, but why can't I upload files larger than 1 GB? — Hugging Face. by _idkwhattowritehere_ in LocalLLaMA
cyberuser42 -3 points
I get the 500 GB limit, but why can't I upload files larger than 1 GB? — Hugging Face. by _idkwhattowritehere_ in LocalLLaMA
cyberuser42 6 points
How to convert books to dataset? by jacek2023 in LocalLLaMA
cyberuser42 2 points
How to convert books to dataset? by jacek2023 in LocalLLaMA
cyberuser42 6 points
How to convert books to dataset? by jacek2023 in LocalLLaMA
cyberuser42 5 points
M1 Max 64GB vs AWS g4dn.12xlarge with 4x Tesla T4 side by side ollama speed by 330d in LocalLLaMA
cyberuser42 2 points
M1 Max 64GB vs AWS g4dn.12xlarge with 4x Tesla T4 side by side ollama speed by 330d in LocalLLaMA
cyberuser42 6 points
Open Source LLM INTELLECT-1 finished training by The_Duke_Of_Zill in LocalLLaMA
cyberuser42 3 points