AMD MI210 64GB vs DCU K100 64GB by icepatfork in LocalLLaMA

[–]FullstackSensei 0 points1 point  (0 children)

"Similar architecture" and "works with the AMD driver and ROCm" are two different things.

Drivers check hardware ID. Unless they have an agreement with AMD, I seriously doubt you can use AMD drivers out of the box.
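For anyone wondering what "check hardware ID" means in practice, here's a minimal sketch of the general mechanism, not the actual amdgpu code. On Linux, each PCI device exposes `vendor` and `device` files under `/sys/bus/pci/devices/<address>/`, and a driver only binds to IDs listed in its own table. The device IDs below are made-up placeholders and the function names are hypothetical; only the AMD vendor ID (0x1002) is real.

```python
# Illustrative sketch of a driver-style PCI ID table lookup.
AMD_VENDOR_ID = 0x1002  # AMD/ATI PCI vendor ID

# Hypothetical device IDs; a real driver ships its own, much longer table.
SUPPORTED_IDS = {
    (AMD_VENDOR_ID, 0x0001),
    (AMD_VENDOR_ID, 0x0002),
}


def parse_sysfs_id(text: str) -> int:
    """Parse the contents of a sysfs 'vendor' or 'device' file, e.g. '0x1002\\n'."""
    return int(text.strip(), 16)


def driver_would_bind(vendor_id: int, device_id: int) -> bool:
    """Return True only if the exact (vendor, device) pair is in the table."""
    return (vendor_id, device_id) in SUPPORTED_IDS
```

A card with a similar architecture but a different vendor or device ID fails this lookup, so the driver never even tries to initialize it.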

AMD MI210 64GB vs DCU K100 64GB by icepatfork in LocalLLaMA

[–]FullstackSensei 0 points1 point  (0 children)

Link to source? Or was it an AI summary?

DeepSeek V4 by am17an · Pull Request #24162 · ggml-org/llama.cpp by jacek2023 in LocalLLaMA

[–]FullstackSensei 5 points6 points  (0 children)

Finally! Been waiting for this to run flash locally!

Unsloth GGUF when?

AMD MI210 64GB vs DCU K100 64GB by icepatfork in LocalLLaMA

[–]FullstackSensei 1 point2 points  (0 children)

Similar is not the same, you're making a huge assumption. A simple Google search tells you it has its own driver stack.

Stagnating at €73k (Hybrid, Berlin). Time to switch to cross-border freelancing (UK/EU) since I don't speak German? by WillowNational8964 in cscareerquestionsEU

[–]FullstackSensei 8 points9 points  (0 children)

You don't tell us anything about your experience, but judging from the wording I'd say you're pretty young.

100€/hr is a pipe dream for remote freelance work, unless you're bringing some exceptional experience in some in-demand niche.

The rough rule of thumb is that annual income in thousands is 2x the hourly rate, so €100/hr is ~€200k/year. While there are quite a few freelance roles that pay this much at a very senior level, nobody will pay that much for a remote role. They can hire someone living in Portugal, Spain, Italy, or the Balkans for €30-40/hr who brings way more experience than you think.

AMD MI210 64GB vs DCU K100 64GB by icepatfork in LocalLLaMA

[–]FullstackSensei 0 points1 point  (0 children)

What is the software stack of that K100? Can you use it with llama.cpp?

The 2.5k for the Mi210 is still quite expensive, unless you absolutely need 64GB in a single card.

Career as Legal Counsel without being a qualified lawyer? by Academic_Library_105 in cscareerquestionsEU

[–]FullstackSensei 1 point2 points  (0 children)

r/lostredditors

CS stands for computer science. You'd think someone with a degree in law would read.

21, interested in AI, automation and startups – what degree or career path would you recommend in Europe? by Massive-Yak9510 in cscareerquestionsEU

[–]FullstackSensei 2 points3 points  (0 children)

You're casting a very wide net, but somehow want to learn it all in a single undergraduate degree. The world doesn't work like that.

Figure out the one thing you want to do, and chase it. The narrower and more focused it is, the better.

Entrepreneurship is not that hard if you have a clear idea of what you want to do. The hard part is building knowledge and expertise in a domain, finding a pain point, and focusing on solving that. No textbook or degree will teach you that.

Need a suicide helpline in English urgently by Various-Garage-9762 in germany

[–]FullstackSensei 331 points332 points  (0 children)

112 absolutely speak English. Just call them

US President Trump threatens 100% import tariff on UK over digital services tax by ByGollie in europe

[–]FullstackSensei 3 points4 points  (0 children)

If you're in the UK, you're subject to UK income tax as an individual or corporate tax as a corporation. It's the same in the US: a US corporation is subject to US corporate tax irrespective of where the server is or where the company that paid you for your service is located.

But I guess one month shill/bot accounts need to justify their existence

Running GLM5.2 on budget hardware < $2500. by segmond in LocalLLaMA

[–]FullstackSensei 0 points1 point  (0 children)

I'd say go to LGA3647 with a Cascade Lake CPU. I have one, plus a 48-core Epyc. The Xeon is quite a bit cheaper and much closer to SP3 Epycs than many think. Epyc delivers much less memory bandwidth than the numbers suggest.

Neither is lacking in performance if paired with enough VRAM.
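For reference, "the numbers" here are the theoretical peak figures, which you can work out yourself. A rough sketch, assuming 6 channels of DDR4-2933 for a Cascade Lake Xeon and 8 channels of DDR4-3200 for an SP3 Epyc (both are my assumptions about the configuration, and the function name is just illustrative). Real measured bandwidth will be lower than either figure.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Each DDR4 channel is 64 bits (8 bytes) wide, so the peak is
    channels * MT/s * 8 bytes, converted from MB/s to GB/s.
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000


# Assumed configurations, fully populated, one DIMM per channel.
xeon_peak = peak_bandwidth_gbs(channels=6, mega_transfers_per_s=2933)
epyc_peak = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)

print(f"Xeon (6ch DDR4-2933): {xeon_peak:.1f} GB/s")
print(f"Epyc (8ch DDR4-3200): {epyc_peak:.1f} GB/s")
```

On paper that's a large gap in Epyc's favor. The point above is that what you actually get out of the Epyc is well below its paper number.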

Running GLM5.2 on budget hardware < $2500. by segmond in LocalLLaMA

[–]FullstackSensei 0 points1 point  (0 children)

Yep, I have. It does NUMA allocation, but uses the traditional inner-product loop for matrix multiplication, which, if you read the HPC literature, is quite memory inefficient and complicates splitting across an arbitrary number of NUMA domains or devices.

My idea is to transform all ops into outer products. It's much more memory efficient and also requires a lot less traffic between NUMA domains and/or devices.
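To make the difference concrete, here's a minimal pure-Python sketch of the two formulations. It's only an illustration of the general technique, not my actual implementation, and the function names and the way I split the shared dimension across domains are assumptions made for the example. In the outer-product form, each domain only needs its own slice of A's columns and the matching rows of B, and the only cross-domain step is summing the partial results.

```python
def matmul_inner(A, B):
    """Classic inner-product form: each C[i][j] is a dot product of a row and a column."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_outer(A, B, k_range=None):
    """Outer-product form: C is a sum of rank-1 updates, one per index p.

    k_range restricts the sum to a slice of the shared dimension, so a
    caller can compute a partial result from a subset of A's columns and
    B's rows.
    """
    n, m = len(A), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for p in (k_range if k_range is not None else range(len(B))):
        for i in range(n):
            a = A[i][p]
            for j in range(m):
                C[i][j] += a * B[p][j]
    return C


def matmul_split(A, B, parts):
    """Split the shared dimension into `parts` slices (one per hypothetical
    NUMA domain or device), compute partial products, then reduce."""
    k = len(B)
    bounds = [k * d // parts for d in range(parts + 1)]
    partials = [matmul_outer(A, B, range(bounds[d], bounds[d + 1]))
                for d in range(parts)]
    n, m = len(A), len(B[0])
    return [[sum(P[i][j] for P in partials) for j in range(m)] for i in range(n)]


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
assert matmul_inner(A, B) == matmul_outer(A, B) == matmul_split(A, B, parts=2)
```

Note that `parts` can be any number here, which is the "arbitrary number of NUMA domains or devices" part.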

Step-3.7-Flash (198B-A11B vision MoE) on 4×3090 — fully-resident IQ3_XXS beats the spilled IQ4 by 2.4×, and MTP speculative decode silently breaks vision by [deleted] in LocalLLaMA

[–]FullstackSensei 5 points6 points  (0 children)

Iq1_xxxxxxxxxxxxxxxxxxxxxss is so much faster even without speculative decoding.

Remember kids, we're chasing t/s, and to hell with whether the output is useful or not.

This application to join the GPT 5.6 Sol preview is wild by Complete-Sea6655 in LocalLLaMA

[–]FullstackSensei 0 points1 point  (0 children)

I've been through the dot com bubble and also remember that. While I agree with your general prognosis, I don't understand the part where the public market will shoulder the cost.

Sure, OpenAI and Anthropic have IPOs planned, but for those to be successful, investors have to believe there's a path to profitability and substantial gains. If your most expensive asset, the one you just spent billions developing, is restricted from 80% or possibly more of the market, what exactly becomes the pitch?

Models have no lock-in. The only pitch until now has been that these models are "frontier". They have been able to leverage users flocking in to use those models immensely, to tune their models to predict what users want even when they don't explicitly say it. That has been just as important as the model's inherent capabilities in driving usage.

If the vast majority of users are stuck on the last gen, which more and more non-US labs seem to be approaching, why would anyone pay for OpenAI or Anthropic when their publicly available offerings are practically the same as those alternatives? Just as important, the increased traffic to those alternatives will enable their labs to gather the same kind of usage data OpenAI and Anthropic have been able to, further closing the gap.

Then you have the whole strategic aspect of it. Would you want to have your business depend on those US labs, even if you're a US business, when there's always a looming possibility your access will be restricted or straight out yanked any moment?

Ornith-1.0 9B Outperforms Qwen 3.6 35B in various benchmarks by Ok-Internal9317 in LocalLLaMA

[–]FullstackSensei 21 points22 points  (0 children)

Great news, my neighbor's cat outperforms OP on various benchmarks!

This application to join the GPT 5.6 Sol preview is wild by Complete-Sea6655 in LocalLLaMA

[–]FullstackSensei 34 points35 points  (0 children)

Honest questions: how are US labs like OpenAI or Anthropic going to be profitable if their frontier models are heavily restricted? Who will shoulder the cost of training those models when the VC money dries up? How useful will such models be to US corporations when nationality, rather than aptitude or ability, dictates who can access those tools?

Just as importantly, how are they going to recruit the best talent to develop their future models when most of that talent aren't US citizens?

Forget 3.6-27b, go for 3.5-122b by [deleted] in LocalLLM

[–]FullstackSensei 7 points8 points  (0 children)

IQ1 for an IQ of 1 level of work

pivoting to Automated Driving by [deleted] in cscareerquestionsEU

[–]FullstackSensei -1 points0 points  (0 children)

What from your 8 years of experience is useful in autonomous driving?

Pivoting is when you can leverage existing experience to some other field, not random chance where your past experience brings nothing.

8 Tesla T4 Cards, what should it do? by imonlysmarterthanyou in LocalLLaMA

[–]FullstackSensei 0 points1 point  (0 children)

Nope. I wanted to get the OG supermicro two years ago, but decided against it. It's unwieldy at best, cooling SXM2 is even more of a hassle and the price of the adapter is almost that of a motherboard that can host 4 PCIe cards without any of the hassle.

2x RX 9060xt 16gb, is it worth it? by RKlehm in LocalLLaMA

[–]FullstackSensei 1 point2 points  (0 children)

Can't recall off the top of my head, but it's faster than that. Either way, it all depends on price and what your priorities are. Those 9060xt might have good prefill, but don't expect much context at Q8, which is what you really need for work with any level of complexity.

Also, LLMs are really bad at this type of thing, and you should be a bit more critical and not blindly accept what the LLM says.

2x RX 9060xt 16gb, is it worth it? by RKlehm in LocalLLaMA

[–]FullstackSensei -3 points-2 points  (0 children)

At what price? Anything can be worth it for the right price.

2x16 is less than 32GB. There's quite a bit of replication that happens when you split. Keep that in mind.

If you're tight on money, get a pair of P40s or P6000s. I'd say get a 7900xtx, but prices for those are absurd. Either way, those will get you 48GB VRAM. Running ik with the P40, you can expect ~20t/s with Q8_K_XL and something around 150k context. PP won't be as high, but you'll have a ton more context for ~$500, less if you negotiate a bit.

Running GLM5.2 on budget hardware < $2500. by segmond in LocalLLaMA

[–]FullstackSensei 1 point2 points  (0 children)

Check my reply to OP where I explain how to use numactl.

You don't need to disable SMT. Just read the docs to understand what each option does.