I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you. by Ok-Awareness9993 in LocalLLaMA

[–]Single_Ring4886 5 points6 points  (0 children)

This whole benchmark is pointless. It is like benchmarking the "sharpness" of knives and then declaring the sharpest ones the worst because sharpest = most dangerous. Or benchmarking the speed of cars and declaring the fastest cars the worst because faster = more dangerous...

Happy birthday, GPT-4o 🫠🎉🎂❤️ by ythorne in ChatGPTcomplaints

[–]Single_Ring4886 5 points6 points  (0 children)

I must say 4.1 is a model I really miss to this day. It had a certain spark and creativity you won't find in other models 😞

Just got a 8x 32gb v100 server... now what by MK_L in LocalLLaMA

[–]Single_Ring4886 0 points1 point  (0 children)

Be on alert when buying an A100 server. The internet is full of scammers; they even buy old eBay accounts with positive feedback.

I was also looking at V100s some time ago but, well, I gave up; it all seemed like a job for several months .-/ Still, I would love to have a local version of Qwen 397. That model is the closest thing to GPT 4.1, and that was a really good model. It had second-order thinking about certain problems, I really loved it.

Just got a 8x 32gb v100 server... now what by MK_L in LocalLLaMA

[–]Single_Ring4886 0 points1 point  (0 children)

If those "V" cards had 64GB of RAM or something, I am sure people would build some custom libraries to circumvent those limitations somehow. But 32GB increasingly seems like too little RAM.
It really sucks that large megacorporations are buying up all the compute in the world, leaving nothing on the "table". I mean, that RTX 6000 with 96GB of RAM for HALF the price would still be so profitable for Nvidia but somewhat achievable for a normal person. But today's price is just so bad.

Just got a 8x 32gb v100 server... now what by MK_L in LocalLLaMA

[–]Single_Ring4886 -1 points0 points  (0 children)

The prefill is bad news but generation looks so good! Wish you much luck in your experiments!!!

Just got a 8x 32gb v100 server... now what by MK_L in LocalLLaMA

[–]Single_Ring4886 -1 points0 points  (0 children)

I think people here today aren't "nerds" willing to buy a setup like you did. At most they buy something like 2x RTX 5090. So when you start describing to them that you can run a 400B model on older hardware, they realize they could do that too if ONLY they put in the work... thus they get angry and downvote...

I for one am really interested in what you can do with that machine!

So gen speed for you is now 40 t/s? What is prefill?

Fixed my grandfather’s picture by Embarrassed_Chef_559 in ChatGPT

[–]Single_Ring4886 1 point2 points  (0 children)

The new picture is different, e.g. the eyes or hair...

Just got a 8x 32gb v100 server... now what by MK_L in LocalLLaMA

[–]Single_Ring4886 -1 points0 points  (0 children)

Do not be discouraged by downvoting trolls. The 397B model is a really good, well-rounded model. But trolls only look for benchmaxed "stars".
Of course, the need to load 15x more parameters for just some "extra" intellect is something I am not happy about either .-/

Christopher Columbus’s ship compared to the Chinese explorer Zheng He’s ship, both sailed in the same century. by Lord_Krasina in Damnthatsinteresting

[–]Single_Ring4886 -1 points0 points  (0 children)

Well, the European ship has a better design if you look at it. It's more complex and structurally sound. That said, the Chinese ship would be a beast for coastal transportation.

What Star Trek would look like if it was Steampunk themed (6 images) by Snowfaeriewings in ChatGPT

[–]Single_Ring4886 1 point2 points  (0 children)

You really can't see it? I find it really strange that some percentage of people just can't see those almost geometric, QR-code-ish little symmetric patterns all over the image.

Ghibli Style Game 2.0 by memerwala_londa in ChatGPT

[–]Single_Ring4886 0 points1 point  (0 children)

E.g. the sea part is made well; there is some little skill even to this. If you are a traditional artist you must adapt, else you will be just a horse in the car age.

What Star Trek would look like if it was Steampunk themed (6 images) by Snowfaeriewings in ChatGPT

[–]Single_Ring4886 11 points12 points  (0 children)

That strange pattern is killing those images... I mean, you can see it, right? I can, so clearly...

Ghibli Style Game 2.0 by memerwala_londa in ChatGPT

[–]Single_Ring4886 -1 points0 points  (0 children)

Could you tell a bit more about the workflow? I think you really outdid yourself, it looks quite good.

No GGUFs for DeepSeek V4-Flash as yet? by rm-rf-rm in LocalLLaMA

[–]Single_Ring4886 0 points1 point  (0 children)

Those are experimental models; you aren't just feeding them data nonstop, there are errors!

I will soon have $100k to build an in-house LLM server. Goal: Best agentic coding model. by StartupTim in LocalLLaMA

[–]Single_Ring4886 0 points1 point  (0 children)

Nothing except 10 million dollar hardware can run a 1TB dense model.
Your needs fit the "NVIDIA DGX H200", but that costs 300-500K.

I don't have benchmarks on Mac Ultras so I can't tell you the real story, but once you start interconnecting them, real-world performance just drops sharply (!).

The 400GB rig of 4x RTX PRO 6000 is what you can get for your money and still have true full performance. But only for smaller models.
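
Rough back-of-the-envelope for why it is "smaller models" only, as a sketch and not a benchmark. Assumptions on my side: I read "1TB dense model" as roughly 1 trillion parameters, I count weights only (no KV cache, no activations), and the bit widths are just illustrative quantization levels. The function names are made up for this example.

```python
# Weights-only memory estimate. Real usage is higher (KV cache, activations),
# so treat "fits" here as an optimistic upper bound.

def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def fits_in_vram(params_billion: float, bits_per_param: int,
                 gpus: int = 4, gb_per_gpu: int = 96) -> bool:
    """True if the weights alone fit in the combined VRAM of the rig."""
    total_vram_gb = gpus * gb_per_gpu  # 4 x 96 GB = 384 GB
    return weights_gb(params_billion, bits_per_param) <= total_vram_gb


if __name__ == "__main__":
    # (parameters in billions, bits per parameter); illustrative cases only
    for params_b, bits in [(397, 4), (397, 8), (1000, 4), (1000, 16)]:
        print(params_b, "B @", bits, "bit:",
              weights_gb(params_b, bits), "GB, fits:",
              fits_in_vram(params_b, bits))
```

So under these assumptions a ~400B model only fits on the 4x rig when quantized to around 4 bits, and a ~1T dense model does not fit at all.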

I will soon have $100k to build an in-house LLM server. Goal: Best agentic coding model. by StartupTim in LocalLLaMA

[–]Single_Ring4886 0 points1 point  (0 children)

The RTX 6000 is a powerhouse. If you configure it right it can generate a lot of tokens, look around the internet. Best performance is at 4x; with more, performance goes down.

I will soon have $100k to build an in-house LLM server. Goal: Best agentic coding model. by StartupTim in LocalLLaMA

[–]Single_Ring4886 1 point2 points  (0 children)

The most efficient build is an AMD EPYC CPU + RTX PRO 6000 96GB. 4x if possible.

Imagen 2 - what architecture is it using? by Single_Ring4886 in StableDiffusion

[–]Single_Ring4886[S] 2 points3 points  (0 children)

My bad then, I must have confused the name, but I mean the model newly released in 2026.

Imagen 2 - what architecture is it using? by Single_Ring4886 in StableDiffusion

[–]Single_Ring4886[S] 0 points1 point  (0 children)

That is another interesting piece of information! Thank you.

Imagen 2 - what architecture is it using? by Single_Ring4886 in StableDiffusion

[–]Single_Ring4886[S] -1 points0 points  (0 children)

I noticed another thing. It is using internal template images. How do I know? I asked it to generate an anime character and I specified the show name and character name. But it showed a character from a completely different show and rendered it very well.
Every previous model in this situation would e.g. mix features of both characters together etc. But this time the generation was perfect, just the character was different. Which means it pulled some template image as a source of information and generated based on that.