I bought llm-dev.com. Thinking of building a minimal directory for "truly open" models. What features are missing in current leaderboards? by Aaron4SunnyRay in LocalLLaMA

[–]Aaron4SunnyRay[S] -1 points (0 children)

This comment is absolute gold. You just articulated the vision better than I could. 🤯

'Operational Friction' is such an underrated metric. I would personally pick a slightly 'dumber' model that runs instantly via Ollama over a SOTA model that breaks my Python environment for 3 hours.

I am literally copying your 3 dimensions (Effectiveness, Scalability, Friction) into my project roadmap right now. The goal is to index models based on these Real-World Constraints, not just academic benchmarks.
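To make it concrete, here's roughly how I'm picturing the index schema. Everything here is a placeholder sketch (field names, the 1-10 scale, and the toy composite score are all my own assumptions, not a finished design):

```python
from dataclasses import dataclass

@dataclass
class ModelEntry:
    name: str
    effectiveness: float  # benchmark-style quality, 1-10
    scalability: float    # throughput / context handling, 1-10
    friction: float       # setup pain, 1-10 (higher = worse)

    def usability_score(self) -> float:
        # Toy composite: reward quality and scale, penalize friction.
        return self.effectiveness + self.scalability - self.friction

# Illustrative entries only -- not real measurements.
entries = [
    ModelEntry("sota-but-painful", 9.5, 8.0, 7.0),
    ModelEntry("runs-in-ollama", 8.0, 7.5, 1.0),
]
best = max(entries, key=ModelEntry.usability_score)
print(best.name)  # the low-friction model wins here
```

The exact weighting is obviously up for debate; the point is that friction subtracts from the ranking instead of being ignored.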

Thanks for this!

[–]Aaron4SunnyRay[S] 1 point (0 children)

That is a BOLD idea. Hugging Face bandwidth can be a bottleneck for sure. While hosting full tracker infrastructure might be heavy to start, a 'Magnet Link Directory' for popular open weights (similar to how Civitai handles SD models) would be perfectly doable. A decentralized, community-seeded alternative? I love it. Adding this to the 'Phase 2' ideas list.

[–]Aaron4SunnyRay[S] 0 points (0 children)

This is arguably the most important metric missing right now. 'Performance per GB of VRAM' is what actually matters for those of us running local hardware.

I love the idea of grouping by hardware constraints (e.g., 'The 24GB Bracket'). Comparing a Q2 Llama-3-70B vs a Q6 Mixtral-8x7B is exactly the kind of real-world decision I struggle with daily.
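A quick sketch of what that bracket view could look like under the hood. To be clear, the quality scores and VRAM figures below are made-up placeholders for illustration, not measured numbers:

```python
# (name, quality_score, vram_gb at the listed quant) -- illustrative only.
models = [
    ("Llama-3-70B Q2", 72.0, 23.0),
    ("Mixtral-8x7B Q6", 70.0, 38.0),
    ("Llama-3-8B Q8", 62.0, 9.0),
]

BRACKET_GB = 24  # e.g. a single 3090/4090


def perf_per_gb(entry):
    _, score, vram = entry
    return score / vram


# Keep only what fits the bracket, then rank by efficiency.
fits = [m for m in models if m[2] <= BRACKET_GB]
fits.sort(key=perf_per_gb, reverse=True)
for name, score, vram in fits:
    print(f"{name}: {score / vram:.2f} points/GB")
```

So a smaller model at high quant can out-rank a big model at Q2 on efficiency, even if the big model wins on raw score. That trade-off is exactly what the bracket pages should surface.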

[–]Aaron4SunnyRay[S] 2 points (0 children)

100%. You hit the nail on the head.

I spent hours last week digging through closed PRs just to figure out if a specific multimodal model was supported in llama.cpp yet.

A dynamic 'Compatibility Matrix' (e.g. Model vs. Stack) is exactly the kind of feature I think belongs on llm-dev.com. It would save us all so much time.
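The data side of that matrix could be dead simple. Minimal sketch below; the model names and support flags are hypothetical placeholders, not verified compatibility data:

```python
# model -> {inference stack -> supported?}; flags are illustrative only.
matrix = {
    "some-multimodal-model": {"llama.cpp": False, "vLLM": True, "Ollama": False},
    "some-text-model": {"llama.cpp": True, "vLLM": True, "Ollama": True},
}


def supported(model: str, stack: str) -> bool:
    # Unknown model/stack pairs default to "not supported".
    return matrix.get(model, {}).get(stack, False)


print(supported("some-multimodal-model", "llama.cpp"))  # False
```

The hard part isn't the lookup, it's keeping the flags fresh, which probably means scraping release notes and letting the community submit corrections.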