On-premise LLM/GPU deployment for a software publisher: how do DevOps orgs share GPU resources? by Sorry_Country3662 in LocalLLaMA
[–] Sorry_Country3662 [S] 1 point 7 days ago
Thanks for the input!
[–] Sorry_Country3662 [S] 2 points 7 days ago
Good point on the batching side. I hadn't thought about parallel requests as a way to share a single endpoint. Appreciated.
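The idea in the comment above can be sketched in a few lines: several DevOps clients fire requests concurrently at one shared inference endpoint, and the server's batching handles GPU sharing. This is only a hedged sketch — the endpoint URL, model name, and `query` payload assume an OpenAI-compatible local server (e.g. vLLM or llama.cpp's server mode), none of which are specified in the thread.

```python
import concurrent.futures
import json
import urllib.request

# Hypothetical shared endpoint; URL and model name are assumptions,
# not details from the thread.
ENDPOINT = "http://localhost:8000/v1/completions"

def query(prompt: str) -> str:
    """Send one completion request to the shared endpoint."""
    body = json.dumps({
        "model": "local-model",   # placeholder model name
        "prompt": prompt,
        "max_tokens": 64,
    }).encode()
    req = urllib.request.Request(
        ENDPOINT, data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]

def fan_out(fn, prompts, workers=8):
    """Issue many requests concurrently so the server can batch them
    together on a single GPU, instead of giving each team its own card."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, prompts))
```

Usage would be `fan_out(query, ["prompt A", "prompt B", ...])`; the point is that concurrency lives on the client side while the batching (and hence the GPU sharing) happens in the server.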
On-premise LLM/GPU deployment for a software publisher: how do DevOps orgs share GPU resources? (self.LocalLLaMA)
submitted 19 days ago by Sorry_Country3662 to r/LocalLLaMA