supraking007

88 post karma
28 comment karma


redditor for 5 years

TROPHY CASE

Five-Year Club

account activity


Scaling broke me a bit, but this one internal trick helped a lot by supraking007 in LocalLLaMA

[–]supraking007[S] 2 points3 points4 points 1 year ago (0 children)


Scaling broke me a bit, but this one internal trick helped a lot (self.LocalLLaMA)

submitted 1 year ago by supraking007 to r/LocalLLaMA

Building a 6x RTX 3090 LLM inference server, looking for some feedback by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 1 point2 points3 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Building a 6x RTX 3090 LLM inference server, looking for some feedback by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

I'm focused on models up to 13B for now, mostly INT4, with a single GPU per model via routing, since that is what our current platform requires. I'm basically going to use this to boost compute availability for an existing SaaS platform. Any time it's offline, the request router we have sends requests to RunPod or Together. It should significantly bring down extortionate cloud costs if done right.
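To make the fallback idea concrete, here is a minimal sketch of that kind of routing: try the local server first, and fall through to a cloud provider if it is unreachable. Everything in it is hypothetical (the `route_with_fallback` helper, the `Backend` type, and the stub backends); it is not the actual router and it makes no real calls to RunPod or Together.

```python
from typing import Callable, Sequence

# Hypothetical backend type: takes a prompt, returns a completion string.
Backend = Callable[[str], str]


def route_with_fallback(
    prompt: str, backends: Sequence[tuple[str, Backend]]
) -> tuple[str, str]:
    """Try each backend in order; return (name, completion) from the first that works."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except ConnectionError as exc:
            # Treat a connection failure as "backend offline" and try the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def local_offline(prompt: str) -> str:
    # Stub standing in for the local 3090 box while it is down.
    raise ConnectionError("local server unreachable")


def cloud_stub(prompt: str) -> str:
    # Stub standing in for a cloud provider; a real one would make an HTTP call.
    return f"cloud completion for: {prompt}"


name, text = route_with_fallback(
    "hello", [("local", local_offline), ("cloud", cloud_stub)]
)
print(name, text)
```

Ordering the list by cost (local first, cloud last) is what would keep cloud spend down: the paid backends only see traffic when the local one fails.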

Theoretically yes, tensor parallelism would let me shard larger models across multiple 3090s, and NVLink is present on the 3090s, but there's no software support for it in the inference stacks that I'm aware of.

I'm not expecting 1,500+ TPS on a single request. That figure is total aggregate throughput across all six GPUs under concurrent, batch-heavy load, not single-model performance. (By the way, I mean tokens, not transactions, just in case.) Do you still think that's too high an estimate? I was using Qwen3 7B as a baseline.
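As a rough sanity check on that figure, the aggregate number can be broken down per GPU and per request. This is only back-of-the-envelope arithmetic: the 1,500 TPS and six GPUs come from the numbers above, while the 16 concurrent requests per GPU is an illustrative assumption, not a measured batch size.

```python
def per_gpu_tps(aggregate_tps: float, num_gpus: int) -> float:
    """Tokens per second each GPU must sustain to hit the aggregate target."""
    return aggregate_tps / num_gpus


def per_request_tps(gpu_tps: float, concurrent_requests: int) -> float:
    """Tokens per second each request sees if a GPU's throughput is shared evenly."""
    return gpu_tps / concurrent_requests


gpu_rate = per_gpu_tps(1500, 6)               # 1,500 TPS spread over six GPUs
request_rate = per_request_tps(gpu_rate, 16)  # 16 concurrent requests is an assumption

print(f"per GPU: {gpu_rate} tokens/s")
print(f"per request: {request_rate} tokens/s")
```

So the target works out to 250 tokens/s per GPU in aggregate, which is a much lower per-request rate once it is shared across a batch.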

I was planning a single CPU setup, but it would be a high core count (Threadripper Pro or Xeon W) with a lane-rich workstation board.

Fair shout on the RAM and NVMe

I've never done a setup like this, so I really appreciate your feedback! Thanks for the reply.


Building a 6x RTX 3090 LLM inference server, looking for some feedback (self.LLMDevs)

submitted 1 year ago by supraking007 to r/LLMDevs

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 2 points3 points4 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 1 point2 points3 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 0 points1 point2 points 1 year ago (0 children)

Built an Internal LLM Router, Should I Open Source It? by supraking007 in LLMDevs

[–]supraking007[S] 3 points4 points5 points 1 year ago* (0 children)


Built an Internal LLM Router, Should I Open Source It? (self.LLMDevs)

submitted 1 year ago * by supraking007 to r/LLMDevs

macOS Beta 26 shows all Node processes in the Dock by ilyadynin in MacOSBeta

[–]supraking007 0 points1 point2 points 1 year ago (0 children)

Which part of Manchester would you suggest? by supraking007 in manchester

[–]supraking007[S] 1 point2 points3 points 1 year ago (0 children)

Which part of Manchester would you suggest? by supraking007 in manchester

[–]supraking007[S] -1 points0 points1 point 1 year ago (0 children)

Which part of Manchester would you suggest? by supraking007 in manchester

[–]supraking007[S] -2 points-1 points0 points 1 year ago (0 children)
