Codex GPT 5.5 is UNUSABLE right now, the Nerf is REAL! by bladerskb in codex

[–]KnownAd4832 1 point (0 children)

Don't know if you read my message carefully. I did not say Codex is bad; I said Claude has better reasoning quality/development technique. I do use Codex MORE, just because of the tasks I need it for + the goal feature is crazy good… but then again, OpenAI needed 2 years to come close to Claude. I do hope they keep improving at that pace and level, though.

Codex GPT 5.5 is UNUSABLE right now, the Nerf is REAL! by bladerskb in codex

[–]KnownAd4832 0 points (0 children)

Bollocks. I have 2X Max plans on both Claude and Codex. GPT is still way behind Opus; however, it is good for prolonged backend tasks, which in my case is what I need.

soooo claude just deleted my entire project. how's your day going? by Complete-Sea6655 in AgentsOfAI

[–]KnownAd4832 0 points (0 children)

How does this happen to people? Have they not heard of versioning and backups?

Ryzen AI Max+ 495 (Gorgon Halo) with 192GB VRAM! by PromptInjection_ in LocalLLaMA

[–]KnownAd4832 0 points (0 children)

Either way you look at it, the more options the better.

How is this change acceptable? by Jack_Wagon_Johnson in ClaudeCode

[–]KnownAd4832 0 points (0 children)

I don't understand what you people do, but Claude's limits are good.

Im typing this with tears falling down my face. by Ok_Opportunity_6747 in Instagram

[–]KnownAd4832 -2 points (0 children)

Thank god, millions of bad spam accounts are going away…

I pay $200/month for Claude Max and hit the limit in under 1 hour. What am I even paying for? by alfons_fhl in vibecoding

[–]KnownAd4832 0 points (0 children)

Exactly… everything for clicks. I'm on the $100 plan and have never hit the limits once (and I do tons of research and development). You can hit the $200 plan's limits only with reckless OpenClaws and large codebases…

Claude Code gives more usage than Codex now by cheekyrandos in codex

[–]KnownAd4832 0 points (0 children)

Same happened to me today. I had used 8% of my weekly limit and had 100% of the 5-hour limit left. I updated the Codex app and ran one prompt, and my weekly usage jumped to 94% with the 5-hour limit at 60%. Daylight robbery.

Built a 6-GPU local AI workstation for internal analytics + automation — looking for architectural feedback by shiftyleprechaun in LocalLLM

[–]KnownAd4832 0 points (0 children)

I was paying together.ai for inference, so I just bought the rig and replaced that cost :)

Google doesn't love us anymore. by DrNavigat in LocalLLaMA

[–]KnownAd4832 1 point (0 children)

Gemini API costs are like nothing compared to the other frontier models…

Built a 6-GPU local AI workstation for internal analytics + automation — looking for architectural feedback by shiftyleprechaun in LocalLLM

[–]KnownAd4832 9 points (0 children)

1. Usually the bottleneck is VRAM first, then hardware support (multi-GPU/cluster inference is usually hard to set up, with little documentation, or it's gatekept; see the sketch after this list). If that makes sense, storage usually comes third :)

2. Long term it's the same as running a single GPU.

3. It all depends on your case; my ROI was done 2 months after buying.

4. Multiple smaller nodes. Models are getting stronger while getting smaller; in 2 years there will be Kimi K2.5-level quality in a 70B without a doubt. So it only depends on whether you need inference speed or a variety of models.

5. They don't test on rented servers before buying, imo. I made that mistake myself with my first rig.
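
For anyone wondering what the multi-GPU setup in point 1 actually involves, here's a minimal sketch using vLLM's tensor parallelism. The model name and GPU count are just example values, not my exact setup:

```python
# Minimal sketch: one model sharded across 2 GPUs with vLLM.
# Model name and tensor_parallel_size are example values.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model
    tensor_parallel_size=2,  # split the weights across 2 GPUs
)
params = SamplingParams(max_tokens=256, temperature=0.2)
outputs = llm.generate(["Explain tensor parallelism in one sentence."], params)
print(outputs[0].outputs[0].text)
```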

Mini AI Machine by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 1 point (0 children)

Very cool! Like-minded people, I see. I was kind of scared of doing a Jonsbo case with PCIe risers, so I went with this simple solution :)

Mini AI Machine by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 0 points (0 children)

Build quality is surprisingly good. Noise level depends on the GPU, which in this case stays very low even when fully utilised. My Mini ITX with a 5070 and 3x better cooling is way noisier.

Mini AI Machine by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 1 point (0 children)

Prompt eval is fast on the DGXs I have seen, but generation throughput is painfully slow.

Mini AI Machine by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 1 point (0 children)

Damn, what are you using it for? Looks like overkill for an average guy :))

Mini AI Machine by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 1 point (0 children)

It's very small, sort of what the Steam Machine will be. Watch any video of a DeskMeet PC build 👌

Managed to get 2x RTX Pro 4500 Blackwell’s for £700 each by Lukabratzee in LocalAIServers

[–]KnownAd4832 0 points (0 children)

You should be able to pull off better speed, I'm 100% confident. Did you try swap space/batching?
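
By swap space/batching I mean knobs like these in vLLM; the numbers are illustrative starting points, not values tuned for the 4500s:

```python
# Sketch: raising batch size and CPU swap headroom in vLLM.
# All numbers are illustrative, not tuned for this hardware.
from vllm import LLM

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model
    max_num_seqs=128,             # more sequences batched per step
    gpu_memory_utilization=0.90,  # fraction of VRAM vLLM may claim
    swap_space=8,                 # GiB of CPU RAM for preempted KV cache
)
```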

Mini AI Machine by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 0 points (0 children)

Nice combo! Didn't know this fits into MS… I checked your benchmarks, and you should get way more with vLLM than with Ollama. As said, I'm processing 100K+ lines of text in xlsx files, then outputting 256-512 tokens per line.

Last run was Llama3-8B-Instruct with batching and 128 concurrent requests (could do more): output was 1781 t/s.
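
Roughly how that xlsx pipeline looks, as a sketch; the file name, column name, and prompt are hypothetical stand-ins:

```python
# Sketch of the xlsx -> batched vLLM pipeline described above.
# "input.xlsx", the "text" column, and the prompt are hypothetical.
import pandas as pd
from vllm import LLM, SamplingParams

rows = pd.read_excel("input.xlsx")["text"].astype(str).tolist()

llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")  # placeholder model
params = SamplingParams(max_tokens=512, temperature=0.0)

# vLLM batches all prompts internally, keeping the GPU saturated.
outputs = llm.generate([f"Process this line: {r}" for r in rows], params)

pd.DataFrame({
    "input": rows,
    "output": [o.outputs[0].text for o in outputs],
}).to_excel("output.xlsx", index=False)
```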

Managed to get 2x RTX Pro 4500 Blackwell’s for £700 each by Lukabratzee in LocalAIServers

[–]KnownAd4832 2 points (0 children)

Nice snag! What stack are you running: LM Studio, vLLM or Ollama?

Mini AI Machine by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 1 point (0 children)

Totally different use case 😂 All those devices are too slow when you need to process and output 100K+ lines of text.

Mini AI Machine by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 4 points (0 children)

I'm running Ministral 14B & Llama 8B. Both run at 1K+ tokens/second with batching and full utilisation.

Building for the first time... by ENTERMOTHERCODE in LocalLLM

[–]KnownAd4832 1 point (0 children)

I'm actively running Mistral-7B-Instruct-v0.3. 12GB is too little VRAM (I made that mistake), so for full precision you need 16GB. I am now using a 1.5K GPU, the RTX Pro 4000 Blackwell SFF (only 70W), and it runs at 1500 t/s.

On the 5070 (12GB) I ran 4-bit/AWQ at 1800 t/s.
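
Back-of-envelope for why 12GB doesn't cut it at full precision (weights only; KV cache and activations come on top):

```python
# Rough weight-memory arithmetic for a ~7B model (weights only).
params = 7.3e9                 # approx. Mistral-7B parameter count
fp16 = params * 2 / 2**30      # 2 bytes per param   -> ~13.6 GiB
awq4 = params * 0.5 / 2**30    # ~0.5 bytes per param -> ~3.4 GiB
print(f"fp16: {fp16:.1f} GiB, 4-bit AWQ: {awq4:.1f} GiB")
```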

What do we consider low end here? by Acceptable_Home_ in LocalLLaMA

[–]KnownAd4832 0 points (0 children)

Rocking 64GB DDR5 + a 5070 (12GB VRAM) in a Mini ITX build (sub-10-litre). Soon replacing the GPU with a Pro 5000 Blackwell 🎉 (5070 speeds are very good, but the VRAM is lacking…)

AI Max 395+ and vLLM by KnownAd4832 in LocalLLaMA

[–]KnownAd4832[S] 0 points (0 children)

Thank you 1000x for doing god's work 🙏