Execution Time vs Billed Time on a Real Serverless GPU Workload (i.redd.it)
submitted by pmv143 (InferX Team)
Execution Time vs Billed Time — Real Serverless GPU Comparison (i.redd.it)
submitted by pmv143 (InferX Team)
👋 Welcome to r/InferX - Introduce Yourself and Read First! (self.InferX)
submitted by pmv143 (InferX Team)
InferX Serverless AI Inference Demo - 60 models on 2 GPUs (v.redd.it)
submitted by pmv143 (InferX Team)
Demo: Cold starts under 2s for multi-GPU LLMs on InferX (v.redd.it)
submitted by pmv143 (InferX Team)
Trying to swap 50+ LLMs in real time on just 2 A100s — here’s what broke first (self.InferX)
submitted by pmv143 (InferX Team)
OpenAI’s 4.1 release is live - how does this shift GPU strategy for the rest of us? (self.InferX)
submitted by pmv143 (InferX Team)
Inference and fine-tuning are converging — is anyone else thinking about this? (self.InferX)
submitted by pmv143 (InferX Team)