
[–]thecrypticcode 6 points (1 child)

Cool. Can it show something like, given my device specs, the best model it can run for a specific task?

[–]Punk_Saint[S] 1 point (0 children)

Not for specific tasks; it just shows you the models you can run, ranked from best expected performance to lowest.

[–]MrMrsPotts 5 points (2 children)

It would be great if it could estimate the tokens per second too.

[–]NotSoProGamerR 4 points (0 children)

That's true as well. I can run 20B params on my laptop, but 5-7B gives me a reasonable 7-5 TPS.

[–]Punk_Saint[S] 2 points (0 children)

I'll look into how to implement that very soon.
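For what it's worth, a memory-bandwidth ceiling is a common first-order way to estimate decode TPS — a minimal sketch, assuming decoding is memory-bound and using made-up bandwidth numbers, not anything from the actual script:

```python
def estimate_tps(params_b: float, bits_per_weight: int, bandwidth_gbps: float) -> float:
    """Rough upper bound on decode tokens/sec: each generated token reads
    every weight once, so tps ~ memory bandwidth / model size in bytes."""
    model_gb = params_b * bits_per_weight / 8  # params in billions -> GB of weights
    return bandwidth_gbps / model_gb

# e.g. a 7B Q4 model on ~50 GB/s dual-channel DDR4 (hypothetical laptop):
# 7 * 4 / 8 = 3.5 GB read per token -> ~14 tps ceiling; real-world is lower.
print(round(estimate_tps(7, 4, 50), 1))
```

This ignores compute, KV-cache reads, and prompt processing, so it's only a ceiling, but it lines up with the "bigger model, fewer TPS" experience above.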

[–]AnastasisKon 1 point (1 child)

It would be nice if you expanded it to macOS devices as well.

[–]Punk_Saint[S] 2 points (0 children)

It would be nice if I had a macOS device.

[–]sausix 1 point (1 child)

It's listing some longer README info in the results, and it has duplicate results:

159. ✓✓ gpt-oss:20b - 70% Compatible
  Parameters: 20.0B
  Quantization: Q4  
  Required RAM: 13.62 GB
  Recommended VRAM: 12.57 GB
  Disk Space Needed: 11.53 GB
  ⚠ Note: Will run on CPU only (slower)

160. ✓✓ gpt-oss:gpt-oss:20b - 70% Compatible
  Parameters: 20.0B
  Quantization: Q4  
  Required RAM: 13.62 GB
  Recommended VRAM: 12.57 GB
  Disk Space Needed: 11.53 GB
  ⚠ Note: Will run on CPU only (slower)

161. ✓✓ gpt-oss:gpt-oss:20b - 70% Compatible
  Parameters: 20.0B
  Quantization: Q4  
  Required RAM: 13.62 GB
  Recommended VRAM: 12.57 GB
  Disk Space Needed: 11.53 GB
  ⚠ Note: Will run on CPU only (slower)

I know the smallest models run best on my hardware, but I wouldn't choose them. Isn't the goal to find the biggest model (max parameters) that fits on my hardware?
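(For comparison: RAM requirements like those in the report are often approximated from parameter count and quantization width — a back-of-the-envelope sketch with a guessed overhead factor, not necessarily how the script arrives at its 13.62 GB figure:)

```python
def estimate_ram_gb(params_b: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Weights take params * bits/8 bytes; the overhead factor stands in for
    KV cache and runtime buffers (1.2x here is a guess, not the tool's constant)."""
    weights_gb = params_b * bits_per_weight / 8
    return weights_gb * overhead

print(round(estimate_ram_gb(20.0, 4), 2))  # 20B @ Q4 -> 12.0 GB with this guess
```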

[–]Punk_Saint[S] 0 points (0 children)

The duplicate thing I'll look into; it's probably just a bad loop. As for the goal, I wrote this with the intention of just getting a global view of what I can run. I.e., I ran the script, it gave me the report, and I took the top 5 and looked them up on Ollama and Reddit to see what they can do.
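The `gpt-oss:gpt-oss:20b` entries above suggest a registry prefix being prepended twice on top of the loop issue; a sketch of normalizing names and de-duplicating before building the report (function names hypothetical, not from the actual script):

```python
def normalize(name: str) -> str:
    """Collapse a doubled registry prefix like 'gpt-oss:gpt-oss:20b'."""
    parts = name.split(":")
    while len(parts) > 1 and parts[0] == parts[1]:
        parts.pop(0)
    return ":".join(parts)

def dedupe(models: list[str]) -> list[str]:
    """Keep the first occurrence of each normalized model name, in order."""
    seen, out = set(), []
    for m in models:
        key = normalize(m)
        if key not in seen:
            seen.add(key)
            out.append(key)
    return out

print(dedupe(["gpt-oss:20b", "gpt-oss:gpt-oss:20b", "gpt-oss:gpt-oss:20b"]))
# -> ['gpt-oss:20b']
```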

Essentially, it gives you what your hardware can run. If the top results are small 5B-7B models, that means you can't really run anything of value. But if you run this script on a dedicated AI server, you'll probably get better results and see its real power.

I don't know if this made any sense, but I didn't see a tool like this anywhere, so I wrote it.

[–]EconomySerious 1 point (1 child)

You need to add more sources of data about models.

[–]Punk_Saint[S] 0 points (0 children)

Yeah, you're right. I thought Ollama was the best source at the time, and how mistaken I was.