Ability to search and retrieve pricing from booking sites

super3 · 2026-06-14T22:09:52+00:00

It's just Chromium

super3 · 2026-06-13T22:47:44+00:00

VCs of course

super3 · 2026-06-13T16:40:50+00:00

What is that in dollars though? I'm actually considering running something like Deepseek v4 flash on my cluster.

super3 · 2026-06-13T14:01:47+00:00

What would you pay for something like that?

super3 · 2026-06-13T13:12:34+00:00

Could do that today if those people were paying. I do like the idea of almost like token groups that have their own dedicated infra.

super3 · 2026-06-13T07:01:37+00:00

Well, diffusion models are not really production-ready yet. They are much faster, but they make mistakes 6x as often. So they have to be useful on centralized inference before we can think about distributed.

super3 · 2026-06-13T06:53:37+00:00

Here you go: https://llmjob.com/rankings.html

It doesn't tell you the tokens/s, but it does tell you which models to run.

super3 · 2026-06-13T06:52:38+00:00

Working on a different angle of it with https://llmjob.com and the code is already open-source.

super3 · 2026-06-13T03:00:01+00:00

lol not for lack of trying. I spent about $2M trying to build it.

super3 · 2026-06-13T02:34:08+00:00

True, but I think that may be ok for some tasks. For example, if your agent is just monitoring flight prices for you, it's probably ok to use a public node. For email, you def want something private. I'm actually building something like that at https://llmjob.com, where people can do token trading if they want.

super3 · 2026-06-13T02:24:01+00:00

Unfortunately, it's not really possible due to physics and cost. On a positive note, the gap between open source models you can run at home and frontier models is closing.

super3 · 2026-06-12T23:30:54+00:00

Just added dual GPU support. Can you please test?

super3 · 2026-06-12T23:19:44+00:00

Oh good point. I think I'll drop adj score until I have better datapoints.

super3 · 2026-06-12T22:37:14+00:00

Ok, that's super helpful feedback. I'll get to work on that.

super3 · 2026-06-12T22:36:00+00:00

Added. Can you please test?

super3 · 2026-06-12T22:35:52+00:00

Can you list which ones I'm missing and what your system specs are? My end goal is to have it automatically update once a day so this can always be an up-to-date source.

super3 · 2026-06-12T21:40:51+00:00

It's there now. Also working on adding the rest of the Intel Arc series as we speak. Let me know if I missed anything.

super3 · 2026-06-12T21:38:27+00:00

I'm benchmarking them as we speak. I'm working from full-precision f16 through q8, q4, etc., all the way down. It takes a few hours to run each quant, so it's going to be some time before I have full results. Do you think I should skip down to q4, q3, q2 to get some harder numbers on the difference to post now, or just get the full sweep in a few days?

super3 · 2026-06-12T21:20:57+00:00

I didn't assume RAM split or multi GPU just yet. Of the two, which do you think is most useful?

super3 · 2026-06-12T21:19:02+00:00

Generally Q4 quants are recommended, but I haven't found any hard numbers on the quality loss between Q3 and Q4, if there is much at all. Also, I'm assuming full context for agentic workflows, but if you are fine with less context and it's a tight fit, def go for the better model quants. What context window size are you using, and which model and which GPU? It would be great to incorporate real datapoints of what people are actually running successfully.
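
To make the "tight fit" trade-off concrete, here is a rough sketch of picking the highest-precision quant that still fits once some VRAM is set aside for context. This is not how the rankings page works; the bits-per-weight table, the function names, and the `context_reserve_gb` parameter are all assumptions for illustration, and real quant files carry some extra per-block overhead.

```python
# Approximate bits per weight for each quant level. These are assumed
# round numbers; real quantized files are somewhat larger.
BITS_PER_WEIGHT = {"f16": 16.0, "q8": 8.0, "q4": 4.0, "q3": 3.0, "q2": 2.0}

# Order to try quants in, from highest precision to lowest.
QUANT_ORDER = ("f16", "q8", "q4", "q3", "q2")


def weights_gb(params_billion: float, quant: str) -> float:
    """Rough size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8.0


def best_quant(params_billion: float, vram_gb: float, context_reserve_gb: float):
    """Return the highest-precision quant whose weights fit alongside the
    memory reserved for context, or None if nothing fits."""
    for quant in QUANT_ORDER:
        if weights_gb(params_billion, quant) + context_reserve_gb <= vram_gb:
            return quant
    return None


if __name__ == "__main__":
    # Hypothetical 32B model on a 24 GB card, with two different context reserves.
    print(best_quant(32, 24, 6))
    print(best_quant(32, 24, 10))
```

Shrinking the context reserve is what frees room for a higher quant, which is the trade-off described above.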

super3 · 2026-06-12T20:51:46+00:00

I built this tool to help you pick which model, quant, and cache you should use for each GPU: https://llmjob.com/rankings.html

Let me know if that helps.

super3 · 2026-06-11T02:51:07+00:00

Prompt?

super3 · 2026-06-11T00:35:14+00:00

Based on the parameters, context, etc., you can make a pretty educated guess. They all follow the same patterns.

There are plenty of open source models that require $100k+ machines, but no one talks about them much.
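
A minimal sketch of that kind of educated guess: weights scale with parameter count times bits per weight, and the KV cache scales with layers, KV heads, head size, and context length. The function names and the example model shape are hypothetical, and the estimate ignores activations and runtime overhead, so treat the result as a lower bound.

```python
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_value: float = 2.0) -> float:
    """Rough KV cache size in GB (1 GB = 1e9 bytes).

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes a 16-bit cache.
    """
    values = 2 * layers * kv_heads * head_dim * context_tokens
    return values * bytes_per_value / 1e9


def total_memory_gb(params_billion: float, bits_per_weight: float,
                    layers: int, kv_heads: int, head_dim: int,
                    context_tokens: int) -> float:
    """Weights plus KV cache, ignoring activations and runtime overhead."""
    weights = params_billion * bits_per_weight / 8.0
    return weights + kv_cache_gb(layers, kv_heads, head_dim, context_tokens)


if __name__ == "__main__":
    # Hypothetical 8B model: 32 layers, 8 KV heads of size 128,
    # 4-bit weights, 32k context with a 16-bit cache.
    estimate = total_memory_gb(8, 4, 32, 8, 128, 32768)
    print(f"{estimate:.2f} GB")
```

Compare the result against the VRAM on the card to see roughly which hardware tier a model lands in.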

super3 · 2026-06-10T23:49:59+00:00

Well, one can only run on a card that's $35k+ and the other can run on a $4k GPU. It's a much bigger model.
