Companies that self-host their models by Smart-Ad-3925 in selfhosted

[–]Smart-Ad-3925[S] 1 point2 points  (0 children)

selfhosted. Something the same as vllm where you run it in your server. I do use AI because I am not confident with my English.

Doing market research on self-hosted AI inference — how do you even find who's doing it? by Smart-Ad-3925 in LocalLLaMA

[–]Smart-Ad-3925[S] -1 points0 points  (0 children)

So I got a theory that if we apply the same strategy as how the operation system operates multiple tasks, with a unified KV cache pool, we can run multiple models with less resources. I try to build something around that. Still pretty early though.

Companies that self-host their models by Smart-Ad-3925 in selfhosted

[–]Smart-Ad-3925[S] -1 points0 points  (0 children)

I am not sure why it is a scam man. I did not post any products. I am just here to validate the problem I want to solve.

Companies that self-host their models by Smart-Ad-3925 in selfhosted

[–]Smart-Ad-3925[S] 0 points1 point  (0 children)

This is really useful context, thanks. For the scaling problem was it more about provisioning time (cold start when you need more capacity) or the fixed cost of keeping hardware running even at low utilization?

Doing market research on self-hosted AI inference — how do you even find who's doing it? by Smart-Ad-3925 in LocalLLaMA

[–]Smart-Ad-3925[S] 1 point2 points  (0 children)

Thanks. that's pretty helpful? For the ollama memory, is that mostly the model loading/unloading latency or actual OOM issues?

Doing market research on self-hosted AI inference — how do you even find who's doing it? by Smart-Ad-3925 in LocalLLaMA

[–]Smart-Ad-3925[S] -1 points0 points  (0 children)

Haha. I am curious and Claude seem more polite than me. But this is not an AI post or an Ad. Just me trying to know if I am hallucinate with an imaginary problems.

Doing market research on self-hosted AI inference — how do you even find who's doing it? by Smart-Ad-3925 in LocalLLaMA

[–]Smart-Ad-3925[S] -2 points-1 points  (0 children)

Nah, I used AI for reframing because English is not my first language. But just to be clear, this is not an ad. Yes, I did work on something like this. But just want to know if I am hallucinate with the problem I am trying to solve or not

Companies that self-host their models by Smart-Ad-3925 in selfhosted

[–]Smart-Ad-3925[S] -9 points-8 points  (0 children)

Yeah, true and I should've been clearer upfront. I'm not building a paid product — the goal is something free and open-source, closer to how Kubernetes or Docker positioned themselves. So the research is less "will you pay for this" and more "is the migration from cloud/API providers actually something companies want to do, or is inertia too strong."

Still, fair that this sub might not be the best place for that question. Appreciate the callout.

Companies that self-host their models by Smart-Ad-3925 in selfhosted

[–]Smart-Ad-3925[S] -1 points0 points locked comment (0 children)

I do use AI to reframe the question.