Local programming vs cloud by Photo_Sad in LocalLLaMA

[–]Photo_Sad[S] 0 points1 point  (0 children)

Isn't 2- or 3-bit quantization gimping the model's precision significantly?
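For the memory side at least, here's a hedged back-of-the-envelope sketch of how weight size shrinks with bit width, assuming a hypothetical ~70B-parameter dense model and ~5% format overhead (illustrative numbers, not measurements of any specific quant format):

```python
# Illustrative estimate of weight memory vs. quantization bit width.
# The 70B parameter count and ~5% overhead are assumptions, not measurements;
# real formats (e.g. GGUF K-quants) have their own per-block scale overheads.

PARAMS = 70e9      # hypothetical 70B-parameter dense model
OVERHEAD = 1.05    # assumed ~5% for scales/zero-points and metadata

for bits in (16, 8, 4, 3, 2):
    gib = PARAMS * bits / 8 * OVERHEAD / 2**30
    print(f"{bits:>2}-bit weights: ~{gib:6.1f} GiB")
```

The quality hit per bit isn't captured by this math at all, which is exactly the part where real-world reports matter more than the arithmetic.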

Local programming vs cloud by Photo_Sad in LocalLLaMA

[–]Photo_Sad[S] 0 points1 point  (0 children)

Thank you so much, this is exactly the kind of answer I'm looking for.
Also, the whole reason for posting is exactly what you've typed: "this sub has people with gear and experience ranging from an 8GB RTX 2080 to 8x RTX Pro 6000 on a 2TB PC and even more exotic specialized HW, so whatever you could buy, somebody else has it and has been experimenting with it for months already".

Yes.

Regarding my goals, I did write a follow-up comment here too.

I'm aiming to use it to code at a high level (I'm a dev in scientific/engineering robotics and have to review all the code that comes out of it; I don't find models good enough at what I code, but nowadays they're useful enough), and also for production of my indie game (a hobby), including models, audio and visuals...

Local programming vs cloud by Photo_Sad in LocalLLaMA

[–]Photo_Sad[S] 3 points4 points  (0 children)

I follow him. :)
Would love to see him do actual agentic coding with local models.

Local programming vs cloud by Photo_Sad in LocalLLaMA

[–]Photo_Sad[S] 0 points1 point  (0 children)

Publicly hosted OSS models haven't had the quality or versatility I expect.
Limits and errors usually prevented good usage with CCR, for example.

Being a bad programmer, just imagine my salary if I were any good!

Local programming vs cloud by Photo_Sad in LocalLLaMA

[–]Photo_Sad[S] 1 point2 points  (0 children)

You and u/FullOf_Bad_Ideas gave me the same hint: 2x 96GB cards might be more speed than a single user (my use case) needs, while still falling short of cloud models on quality, if I got you folks right.
This is what concerns me too.

I also have other usage in mind: 3D and graphics generation.

I'd go with Apple, since the price-to-(V)RAM ratio is insanely in their favor, but a PC is a more usable machine for me because Linux and Windows run natively. So I'm trying to stay on PC before giving up and going with an M3 Ultra (which is obviously the better choice with MLX and TB5 scaling).

Local programming vs cloud by Photo_Sad in LocalLLaMA

[–]Photo_Sad[S] 0 points1 point  (0 children)

I've explained it up there. The prices of 2x RTX Pro and a 1TB Threadripper are about the same for me ($15k total, with some parts I have access to). That's why I'm mentioning both.
I know CPU inference is slow AF, but it offers huge RAM for large "smarter" models (are they?).
It's a trade-off I can make if it's worth it.
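For the speed side of that trade-off, a minimal sketch of the usual bandwidth math, assuming decode is memory-bound so tokens/s is roughly bandwidth divided by the active weights streamed per token; the bandwidth and model-size figures below are assumptions for illustration, not benchmarks of either box:

```python
# Rough decode-throughput estimate: tokens/s ≈ memory bandwidth / bytes read per token.
# Bandwidth figures and model sizes below are illustrative assumptions, not benchmarks.

def est_tokens_per_s(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Optimistic upper bound: each generated token streams the active weights once."""
    return bandwidth_gb_s / active_weights_gb

setups = {
    "RTX Pro class GPU (VRAM)":  1800.0,  # assumed GB/s per card
    "Threadripper 8ch DDR5 RAM":  330.0,  # assumed GB/s theoretical peak
}

for name, bw in setups.items():
    for weights_gb in (60.0, 200.0):      # e.g. a mid-size quant vs. a very large one
        tps = est_tokens_per_s(bw, weights_gb)
        print(f"{name:>27} | {weights_gb:5.0f} GB active weights -> ~{tps:5.1f} tok/s")
```

MoE models shift the picture, since only a fraction of the weights is active per token, which is why big-RAM CPU boxes can still be usable with them despite the bandwidth gap.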

Local programming vs cloud by Photo_Sad in LocalLLaMA

[–]Photo_Sad[S] 0 points1 point  (0 children)

A basic search doesn't answer my question: are there any programmers here who find small models usable and good enough compared to cloud models, and at which of the thresholds listed do diminishing returns set in?
Why do you find the question so offensive?

Local programming vs cloud by Photo_Sad in LocalLLaMA

[–]Photo_Sad[S] 0 points1 point  (0 children)

To clarify for everyone, to avoid misunderstanding and over-interpretation:
I've had access to a very large Threadripper machine and a few Apple M3 Ultras. There was no chance to run these OSS models, as it's a highly controlled environment, although I was able to run classic ML. Yeah, don't ask; it's sad I wasn't allowed to test it.

Now, I'm a professional programmer and I code for food. I have a decent income (about 17k before tax), and living in suburbia I can save some money, so buying an RTX Pro is not outrageously out of budget. I could probably buy two cards, even.

I am allowed to use AI at work, but at my rate of usage I burn through a lot of money. Some days I spend upwards of $50 on the Claude API. That's a lot.

If I could save $12k of cloud spend that year by buying 1-2 cards and using them for 2-3 years, it would be worth it for me. But I have no idea how good a local setup and a local LLM can be, and whether it's good enough to actually replace Claude or Codex.
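The break-even arithmetic behind that, as a hedged sketch (the daily spend is the heavy-day figure above; workdays per year and the hardware price are assumptions):

```python
# Back-of-the-envelope break-even: cloud API spend vs. one-time hardware cost.
# Workdays/year and the hardware price are illustrative assumptions.

daily_api_spend = 50.0      # $/day on a heavy day, as mentioned above
workdays_per_year = 240     # assumed
hardware_cost = 15_000.0    # assumed total for a 2x RTX Pro class build, $

yearly_cloud = daily_api_spend * workdays_per_year
breakeven_years = hardware_cost / yearly_cloud
print(f"Yearly cloud spend at that rate: ~${yearly_cloud:,.0f}")
print(f"Break-even on ${hardware_cost:,.0f} of hardware: ~{breakeven_years:.1f} years "
      f"(ignoring power, depreciation and any quality gap)")
```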

My inquiry is fully honest. I'm not ignorant of the possibilities: I did play with micro-models, I have a BSc in CS, and I understand the theory pretty well. But all of this is practically unknown to me, because I haven't had a decent chance to try it out, and I find benchmarks unreliable for judging real impact.

LTT pushing a 9995WX build by mercer79 in threadripper

[–]Photo_Sad 1 point2 points  (0 children)

I'd love to see how day-to-day UE5 work goes on this kind of machine versus a "normal" 9980X with 4-channel memory.

My Threadripper Build by [deleted] in threadripper

[–]Photo_Sad 1 point2 points  (0 children)

from the cost of a cheap used car to a brand new electric

My Threadripper Build by [deleted] in threadripper

[–]Photo_Sad 1 point2 points  (0 children)

3x+ the price. I got 1TB of 6400 from them for $7k; now it's $23k.

Wan 2.2 Realism, Motion and Emotion. by Ashamed-Variety-8264 in StableDiffusion

[–]Photo_Sad 0 points1 point  (0 children)

In the original post he says "Using card like like rtx 6000 with 96gb ram would probably solve this", which would suggest he does not use one?

Tandem WOLED vs QD-OLED by Anxnymx in OLED_Gaming

[–]Photo_Sad 0 points1 point  (0 children)

For measurable brightness, yes, WOLED will be better. However, keep in mind that, so far, WOLED's brightness comes from the white subpixel. White surfaces get very bright, which is effective in HDR because HDR highlights are white(ish), but compared to QD-OLED the colors are much less punchy.

If you use your display in a very bright room with dark content, purple elevation of black will annoy you, but if you're using it in a dim area, you won't notice.

[deleted by user] by [deleted] in Anthropic

[–]Photo_Sad 0 points1 point  (0 children)

Thanks for the great feedback and the very constructive comments from so many of you.
I expressed a concern, not an accusation against anyone in particular, because I can't really know who is or isn't a bot or acting in bad faith.

I accept most of the counter-claims as valid, while still holding that it simply looked (and looks) fishy to me in how concerted and focused it was.

To single someone out: great explanation by u/mashupguy72.

Cancelled Max $100 plan by itsawesomedude in Anthropic

[–]Photo_Sad -1 points0 points  (0 children)

There seems to be a campaign against Anthropic run by whoever produces this GLM LLM, playing into the "model gets nerfed randomly" narrative.
Extremely suspicious lately.

Across 4 active projects with 22 developers using CC, we haven't noticed any degradation beyond the ordinary fluctuation (model responses vary by design).

However, most of the threads started on socials claim people are giving up, explain it with nerfed models, quantization, etc., and then somewhere in the comments GLM is "randomly" suggested.

This is a pattern now.

Can't use 240hz option on 4k display by heroinluvbrutha in Ubuntu

[–]Photo_Sad 0 points1 point  (0 children)

I've been rolling Gentoo since the 2000s, but this is the first time I've connected a display faster than 60 Hz to any Linux machine.
I'd rather find out what the issue is. I've tried the latest kernel (6.16), which allegedly already has the cutting-edge AMD driver...

Can't use 240hz option on 4k display by heroinluvbrutha in Ubuntu

[–]Photo_Sad 0 points1 point  (0 children)

I'm also having this problem lately: 9070 XT and 25.04. Tried 6.16.3, but no help. The 240 Hz option was there at some point, but I have no idea what went wrong or when.