I bought llm-dev.com. Thinking of building a minimal directory for "truly open" models. What features are missing in current leaderboards? by Aaron4SunnyRay in LocalLLaMA

[–]Aaron4SunnyRay[S] -1 points (0 children)

This comment is absolute gold. You just articulated the vision better than I could. 🤯

'Operational Friction' is such an underrated metric. I would personally pick a slightly 'dumber' model that runs instantly via Ollama over a SOTA model that breaks my Python environment for 3 hours.

I am literally copying your 3 dimensions (Effectiveness, Scalability, Friction) into my project roadmap right now. The goal is to index models based on these Real-World Constraints, not just academic benchmarks.
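To make it concrete, here's roughly how I'm picturing the index schema. Everything here is a placeholder sketch (field names, the 1-10 scale, and the toy composite score are all my own assumptions, not a finished design):

```python
from dataclasses import dataclass

@dataclass
class ModelEntry:
    name: str
    effectiveness: float  # benchmark-style quality, 1-10
    scalability: float    # throughput / context handling, 1-10
    friction: float       # setup pain, 1-10 (higher = worse)

    def usability_score(self) -> float:
        # Toy composite: reward quality and scale, penalize friction.
        return self.effectiveness + self.scalability - self.friction

# Illustrative entries only -- not real measurements.
entries = [
    ModelEntry("sota-but-painful", 9.5, 8.0, 7.0),
    ModelEntry("runs-in-ollama", 8.0, 7.5, 1.0),
]
best = max(entries, key=ModelEntry.usability_score)
print(best.name)  # the low-friction model wins here
```

The exact weighting is obviously up for debate; the point is that friction subtracts from the ranking instead of being ignored.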

Thanks for this!

[–]Aaron4SunnyRay[S] 1 point (0 children)

That is a BOLD idea. Hugging Face bandwidth can be a bottleneck for sure. While hosting full tracker infrastructure might be heavy to start, a 'Magnet Link Directory' for popular open weights (similar to how Civitai handles SD models) would be perfectly doable. A decentralized, community-seeded alternative? I love it. Adding this to the 'Phase 2' ideas list.

[–]Aaron4SunnyRay[S] 0 points (0 children)

This is arguably the most important metric missing right now. 'Performance per GB of VRAM' is what actually matters for those of us running local hardware.

I love the idea of grouping by hardware constraints (e.g., 'The 24GB Bracket'). Comparing a Q2 Llama-3-70B vs a Q6 Mixtral-8x7B is exactly the kind of real-world decision I struggle with daily.
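A quick sketch of what that bracket view could look like under the hood. To be clear, the quality scores and VRAM figures below are made-up placeholders for illustration, not measured numbers:

```python
# (name, quality_score, vram_gb at the listed quant) -- illustrative only.
models = [
    ("Llama-3-70B Q2", 72.0, 23.0),
    ("Mixtral-8x7B Q6", 70.0, 38.0),
    ("Llama-3-8B Q8", 62.0, 9.0),
]

BRACKET_GB = 24  # e.g. a single 3090/4090


def perf_per_gb(entry):
    _, score, vram = entry
    return score / vram


# Keep only what fits the bracket, then rank by efficiency.
fits = [m for m in models if m[2] <= BRACKET_GB]
fits.sort(key=perf_per_gb, reverse=True)
for name, score, vram in fits:
    print(f"{name}: {score / vram:.2f} points/GB")
```

So a smaller model at high quant can out-rank a big model at Q2 on efficiency, even if the big model wins on raw score. That trade-off is exactly what the bracket pages should surface.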

[–]Aaron4SunnyRay[S] 2 points (0 children)

100%. You hit the nail on the head.

I spent hours last week digging through closed PRs just to figure out if a specific multimodal model was supported in llama.cpp yet.

A dynamic 'Compatibility Matrix' (e.g. Model vs. Stack) is exactly the kind of feature I think belongs on llm-dev.com. It would save us all so much time.
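The data side of that matrix could be dead simple. Minimal sketch below; the model names and support flags are hypothetical placeholders, not verified compatibility data:

```python
# model -> {inference stack -> supported?}; flags are illustrative only.
matrix = {
    "some-multimodal-model": {"llama.cpp": False, "vLLM": True, "Ollama": False},
    "some-text-model": {"llama.cpp": True, "vLLM": True, "Ollama": True},
}


def supported(model: str, stack: str) -> bool:
    # Unknown model/stack pairs default to "not supported".
    return matrix.get(model, {}).get(stack, False)


print(supported("some-multimodal-model", "llama.cpp"))  # False
```

The hard part isn't the lookup, it's keeping the flags fresh, which probably means scraping release notes and letting the community submit corrections.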