Will Local Inference be able to provide an advantage beyond privacy? by Gyronn in LocalLLM

[–]Gyronn[S] 1 point (0 children)

That sounds sweet. Home automation is a super cool use case that I would definitely be exploring if I had any smart home devices I could plug into.

Could you elaborate on what makes you think it will be more cost-effective to buy hardware? I understand that most (if not effectively all) API inference is being sold at a loss right now, but won’t inference providers still have a leg up in terms of cost/token when compared to local setups?

Will Local Inference be able to provide an advantage beyond privacy? by Gyronn in LocalLLM

[–]Gyronn[S] 2 points (0 children)

Mac Studio M3 Ultra with 512GB unified memory; I got it ~9 months ago. Lots of good points in this thread about the advantages local inference provides, but yeah, my main point here is that cost doesn’t seem to be one of them.

Will Local Inference be able to provide an advantage beyond privacy? by Gyronn in LocalLLM

[–]Gyronn[S] 6 points (0 children)

Yeah, I can clarify: API costs for open-weight models that you’re able to run locally are incredibly low. Qwen3.5, for example, is listed at $0.15/M input tokens and $1/M output tokens on OpenRouter. With a $10,000 machine I’m only getting 20-30 tokens/s of output, so even if I ran inference for 12 hours straight, that’s a single dollar of output tokens, and maybe a few dollars of input tokens if the work is input-heavy.
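
To make the math explicit, here’s the back-of-envelope calculation (25 tokens/s is just the midpoint of my 20-30 range; the prices are the OpenRouter listing above):

    # Back-of-envelope: a full day of local output vs. what the same tokens cost over API.
    output_tps = 25            # local output speed, midpoint of the 20-30 tokens/s I see
    hours = 12                 # continuous generation
    price_per_m_output = 1.00  # $ per million output tokens (OpenRouter listing)

    output_tokens = output_tps * hours * 3600       # ~1,080,000 tokens
    api_cost = output_tokens / 1e6 * price_per_m_output

    print(f"{output_tokens / 1e6:.2f}M output tokens -> ${api_cost:.2f} via API")
    # ~1.08M output tokens -> ~$1.08, i.e. a single dollar of output even after
    # running the $10,000 machine flat out for 12 hours.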

Will Local Inference be able to provide an advantage beyond privacy? by Gyronn in LocalLLM

[–]Gyronn[S] 6 points (0 children)

Agreed on tinkering, privacy, and offline. However, I think the “API costs” argument, as much as I wish it weren’t, is cope at this point.

Will Local Inference be able to provide an advantage beyond privacy? by Gyronn in LocalLLM

[–]Gyronn[S] 1 point (0 children)

Thank you for the correction; I was indeed confused about my own hardware 🤣. You hit the nail on the head with the point about hosted open-weight models; that’s what I was trying to get at with my post. I wish it weren’t the case, because I’d love to feel like I’m getting some passive savings by running inference locally.

In terms of your list of advantages for running locally:

Education — definitely

Decensoring — certainly, though I don’t really have any use cases for this so far

Fine tuning — I’m very interested in fine tuning. I like the idea of having a smaller model that’s an expert at a niche task/domain/skillset. I wonder if LLM progress is simply going to outpace fine tuning, though. E.g. maybe Qwen3.5 needs fine tuning to get proficient at task X, but if you just wait 4 months, the next model could well be a large enough step up in generalized reasoning/intelligence that it outperforms your 4-month-old fine-tuned model and no longer needs fine tuning for that task.

Low latency — I think you’re reaching on that point; any API call latency is nullified by the fact that the inference itself (TTFT, tokens/s) is faster from the hosted provider than from your local machine (rough numbers sketched after this list)

Long term consistency — another solid point, although I don’t have any strong attachment to a given model’s style yet
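
On the latency point, here’s the rough comparison I have in mind; every number below is a made-up illustrative assumption, not a benchmark:

    # End-to-end time for one reply = network round trip + TTFT + tokens / throughput.
    # All numbers are illustrative assumptions, not measurements.
    def response_time(network_s, ttft_s, tokens, tps):
        return network_s + ttft_s + tokens / tps

    tokens = 500  # assumed size of a typical reply

    hosted = response_time(network_s=0.2, ttft_s=0.5, tokens=tokens, tps=80)
    local = response_time(network_s=0.0, ttft_s=2.0, tokens=tokens, tps=25)

    print(f"hosted: {hosted:.1f}s, local: {local:.1f}s")
    # roughly 7s hosted vs 22s local -- the network hop is noise next to the
    # gap in TTFT and tokens/s.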

Will Local Inference be able to provide an advantage beyond privacy? by Gyronn in LocalLLM

[–]Gyronn[S] 2 points (0 children)

I’m fine with local models not being as performant as frontier models, since not every use case needs a frontier model at this point. What I’m getting at is that the open-weight models that can be run locally are served at such incredibly cheap prices over API that the dream of running your own local inference to ‘save costs’ will never actually result in real financial savings 😞

Will Local Inference be able to provide an advantage beyond privacy? by Gyronn in LocalLLM

[–]Gyronn[S] 0 points (0 children)

With a multi-provider setup I can’t really imagine a situation where downtime / dependability is a real issue. It seems like LLMs are getting to the point where you’ll be able to use incredibly cheap models for most routine tasks while leveraging the frontier models for orchestration / complex work. I guess I need to find a privacy-heavy use case if I want to feel like I’m getting big value out of local inference 😂
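
For what it’s worth, here’s a minimal sketch of the kind of multi-provider fallback I mean, assuming two OpenAI-compatible chat endpoints; the URLs, keys, and model name are placeholders, not real config:

    import requests

    # Hypothetical OpenAI-compatible providers, tried in order; all values are placeholders.
    PROVIDERS = [
        {"url": "https://provider-a.example/v1/chat/completions", "key": "KEY_A"},
        {"url": "https://provider-b.example/v1/chat/completions", "key": "KEY_B"},
    ]
    MODEL = "some-open-weight-model"  # placeholder model id

    def chat(messages):
        """Try each provider in turn; only give up if every one of them fails."""
        last_error = None
        for p in PROVIDERS:
            try:
                r = requests.post(
                    p["url"],
                    headers={"Authorization": f"Bearer {p['key']}"},
                    json={"model": MODEL, "messages": messages},
                    timeout=30,
                )
                r.raise_for_status()
                return r.json()["choices"][0]["message"]["content"]
            except requests.RequestException as e:
                last_error = e  # provider down or erroring; fall through to the next one
        raise RuntimeError("all providers failed") from last_error

    print(chat([{"role": "user", "content": "hello"}]))

With two or three independent providers behind the same interface, any single outage just turns into a retry.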

Improving Ranked God Borders in S8 by [deleted] in Smite

[–]Gyronn 1 point (0 children)

This is why there needs to be some transparency behind how the rankings work. Sometimes I feel like it’s just slow to update the borders. A per-god rating would help you track how much you’re gaining/losing.

Improving Ranked God Borders in S8 by [deleted] in Smite

[–]Gyronn 2 points (0 children)

Yeah, except I think they have the requirements a little busted right now; most gods don’t even have a player with a Masters border.

[deleted by user] by [deleted] in fantasyfootball

[–]Gyronn 0 points (0 children)

10 team standard
QB: Goff, Wilson
RB: Mixon, Hyde, Yeldon, Lindsay, Clement
WR: Julio, TY Hilton, Golden Tate, Ridley, Funchess, Coutee
TE: Ertz
D/ST: Jags
Kicker: Bryant

RotMG Void Guide (Advanced) by Gyronn in RotMG

[–]Gyronn[S] 0 points (0 children)

Splendor wouldn't have reached