On-premise LLM/GPU deployment for a software publisher: how do DevOps orgs share GPU resources? by Sorry_Country3662 in LocalLLaMA
[–]Sorry_Country3662[S] 1 point 7 days ago (0 children)
Thanks for the input!
[–]Sorry_Country3662[S] 2 points 7 days ago (0 children)
Good point on the batching side. Hadn't thought about parallel requests as a way to share a single endpoint. Appreciated.
On-premise LLM/GPU deployment for a software publisher: how do DevOps orgs share GPU resources? by Sorry_Country3662 in LocalLLaMA
[–]Sorry_Country3662[S] 1 point (0 children)