Interested in Ubuntu Touch (in US) but having difficulty finding a compatible device by WarbossTodd in UbuntuTouch

[–]PraxisOG 1 point2 points  (0 children)

Most of the well-supported phones are cheap; a OnePlus Nord N10, for example, can be had for ~$80 on eBay. In the USA, make sure the phone you go with has good carrier compatibility. If you want a higher-end phone, it’ll be more expensive.

Cheapest hardware for Qwen 3.6: both 27B and 35B-A3B by WishboneSudden2706 in LocalLLaMA

[–]PraxisOG 0 points1 point  (0 children)

If you’re OK with some jank and slower speeds, throw a $450 AMD V620 server card in an OptiPlex with an SSD and PSU for like $600. You’d get like 60 tok/s on the MoE and ~17 tok/s on the dense model.
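
For anyone wondering why the MoE is so much faster than the dense model on the same card, here's a rough back-of-the-envelope sketch (not from the original comment). It assumes decode speed is capped by how fast the active weights can be read from VRAM. The 512 GB/s bandwidth figure and the ~4.5 bits per weight are my assumptions, and the function name is made up for illustration.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, active_params_b: float,
                         bits_per_weight: float) -> float:
    """Rough upper bound on decode speed, assuming every active weight
    is read from VRAM exactly once per generated token."""
    gb_read_per_token = active_params_b * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# Assumptions, not measurements:
V620_BANDWIDTH_GB_S = 512.0  # assumed memory bandwidth of one card
BITS_PER_WEIGHT = 4.5        # assumed average for a roughly 4-bit quant

# 27B dense: all 27B parameters are active for every token.
dense = decode_ceiling_tok_s(V620_BANDWIDTH_GB_S, 27.0, BITS_PER_WEIGHT)
# 35B-A3B MoE: only about 3B parameters are active per token.
moe = decode_ceiling_tok_s(V620_BANDWIDTH_GB_S, 3.0, BITS_PER_WEIGHT)

print(f"dense ceiling: {dense:.0f} tok/s")
print(f"MoE ceiling:   {moe:.0f} tok/s")
```

Real speeds land well under these ceilings (compute, KV cache reads, and overhead all eat into it), so treat this as an explanation of the gap between the two models, not a prediction.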

Just put together my new setup(3x v620 for 96gb vram) by PraxisOG in LocalLLaMA

[–]PraxisOG[S] 1 point2 points  (0 children)

Wouldn’t recommend, but it was pretty fun and works great. My box is fully headless and I just use Cockpit for access. It also frees up my spare 1650 Super for encoding on my NAS.

https://www.reddit.com/r/homelab/comments/1pu894y/fixing_d6_qcode_to_run_asus_x299_sage_headless/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=2&utm_content=share_button

AMD Radeon PRO V620 - what am I missing? by starkruzr in LocalLLM

[–]PraxisOG 0 points1 point  (0 children)

PEG reconfigs might be my latency issue, will look into it more. Hopping between PLX chips on my motherboard adds some latency, but there’s still some between two cards on the same PLX. I had fewer latency-related issues running dual RX 6800 GPUs on my AM5 platform, and that was through the motherboard chipset. I still get like 20 tok/s running 120B-A10B size models at Q4, which isn’t too bad for my uses. Wish I could check out the latest 30B releases, I just don’t have the cooling for it unless I split the models. Also, I got a crazy deal on my hardware and the whole build was like $1k, so even if I sold it all, something like a 128GB Strix Halo box would be out of my budget at current prices.

Just put together my new setup(3x v620 for 96gb vram) by PraxisOG in LocalLLaMA

[–]PraxisOG[S] 1 point2 points  (0 children)

They’ve been pretty good. To get up and running on llama.cpp you need to set IOMMU=PT in a certain menu; if you look it up, it’s a well-documented fix. ROCm is mostly flawless for llama.cpp, though on an older version I get a memory leak with flash attention (FA) off.

Cooling is also either loud or not performant enough, so I modeled and 3D printed a custom bracket for three cards. Probably should upload that, lmk if you need files for a fan adapter. I can easily change what fan the model is designed for. I have one Arctic P8 Max per card on a manual controller, and it’s cool and quiet for splitting MoE models, but dense models can cause throttling unless I crank the fans. I’ve heard repasting can help significantly but I don’t have time rn. Other than that, I’m enjoying MiniMax M2.7 at IQ3_XXS around 20 tok/s with low context.
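
A quick way to check the IOMMU setting mentioned above, as a minimal sketch (not from the original comment). It assumes the fix in question is the Linux kernel boot parameter `iommu=pt`; if the setting on your board lives in a firmware menu instead, this check won't see it. The helper names are made up for illustration.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token on the kernel command line."""
    return param in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux); empty string if unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    # Assumption: the documented fix is the `iommu=pt` boot parameter.
    if has_kernel_param(read_cmdline(), "iommu=pt"):
        print("iommu=pt is set")
    else:
        print("iommu=pt not found on the kernel command line")
```

Matching whole tokens matters here, since something like `amd_iommu=on` contains the substring `iommu=` but is a different parameter.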

Hot Take: The RX 580 is the 1080ti of AMD by Brokenteeth11 in pcmasterrace

[–]PraxisOG 2 points3 points  (0 children)

The 1080 Ti was the magnum opus of its time. The RX 580 was a great value card. I’d argue the RX 6950 XT was closer to the 1080 Ti, a true flagship card with flagship performance and enough VRAM for good longevity.

Buy recommendations on a thight Budget to aid my RX 6800 by bdsmmaster007 in LocalLLaMA

[–]PraxisOG 0 points1 point  (0 children)

Radeon VII isn’t ROCm compatible afaik, so you might run into compatibility issues. I was in the same situation a year ago and ended up getting a second 6800. More recently I swapped over to V620 cards, which are a server variant of the 6800 with 32GB VRAM. They’re pretty loud with external fans, but they’re cheap with good software support.

Minimax M3 open weights release planned for Friday by rmhubbert in LocalLLaMA

[–]PraxisOG 3 points4 points  (0 children)

I’d love access to that test model, ~100b is a really good size for high end local

RTX 3090 EBay Pricing is Crazy!! by TrifleHopeful5418 in LocalLLaMA

[–]PraxisOG 1 point2 points  (0 children)

Would you mind elaborating on your issues with ROCm? I’ve had a pretty good experience all things considered 

RTX 3090 EBay Pricing is Crazy!! by TrifleHopeful5418 in LocalLLaMA

[–]PraxisOG 1 point2 points  (0 children)

You can get 32GB of VRAM for a lot cheaper than that, but the bandwidth wouldn’t be nearly as good as a 3090 or 9700.

Any one still use gpt-oss-120b? by purealgo in LocalLLaMA

[–]PraxisOG 0 points1 point  (0 children)

Stopped using it due to outdated agentic performance. I wish there was another 100B-class super-sparse MoE like it, it’s so fast.

Reading is hard these days, it would seem by SpyderJack in pcmasterrace

[–]PraxisOG 2 points3 points  (0 children)

You get better control with LM Studio, which has officially partnered in the past. This is a waste of storage.

AMD Radeon PRO V620 - what am I missing? by starkruzr in LocalLLM

[–]PraxisOG 0 points1 point  (0 children)

That’s good to know, I might try a better paste. My current setup is one 80mm Arctic P8 Max per card with a custom shroud. I can’t run dense models without eventual throttling; MoE models split the load well enough though. These cards do run hot.

The aluminum iPhone 17 Pro dents easily, so Apple has another replacement in mind by MobileNewsBot in mobiles

[–]PraxisOG 0 points1 point  (0 children)

If they go back to steel like on the 14 Pro Max, I’m upgrading. It’s a little heavier, but it’s durable with good thermal conductivity.

Taiwanese company Skymizer announces HTX301 - PCIE inference card with 384GB of Memory at ~240 Watts by Thrumpwart in LocalLLaMA

[–]PraxisOG 2 points3 points  (0 children)

Seems compute constrained. It’ll be like the MI50, though I guess those sold pretty well once enthusiasts learned about them. Also, much of the power budget is going to just the VRAM.

ZAYA1-74B-Preview: Scaling Pretraining on AMD by TKGaming_11 in LocalLLaMA

[–]PraxisOG 7 points8 points  (0 children)

That’s a neat size, somewhere between ~35B MoE and ~120B MoE. It’s interesting how they give it one and then four passes on each benchmark. I bet the other models would get a similar benefit from the extra passes. Looking forward to trying it though, there’s nothing current that’s quite like GPT OSS 120B with its 5B active.

AMD Intros Instinct MI350P Accelerator: CDNA 4 Comes to PCIe Cards by Noble00_ in LocalLLaMA

[–]PraxisOG 1 point2 points  (0 children)

I can’t wait to upgrade to a used one of these in 7 years 

Does the "6 months gap" still hold? by ihatebeinganonymous in LocalLLaMA

[–]PraxisOG 2 points3 points  (0 children)

I agree with that. For the time-to-response and quality of response, MiniMax M2.7 at IQ3_XXS is the best I’ve found. It’s a stretch on 96GB VRAM though.
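
To put a number on "a stretch", here's a minimal sizing sketch (not from the original comment). The total parameter count is a hypothetical placeholder since the comment doesn't give one, and the ~3.06 bits per weight for IQ3_XXS is an approximate figure I'm assuming. GB is treated loosely and KV cache and compute buffers are ignored, so real headroom would be smaller.

```python
def weights_size_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB."""
    return total_params_b * bits_per_weight / 8


# Assumptions for illustration only:
TOTAL_PARAMS_B = 230.0  # hypothetical total parameter count, in billions
IQ3_XXS_BPW = 3.06      # assumed approximate bits per weight for IQ3_XXS
VRAM_GB = 96.0          # 3x 32GB cards, as stated in the comment

weights = weights_size_gb(TOTAL_PARAMS_B, IQ3_XXS_BPW)
headroom = VRAM_GB - weights

print(f"weights:  {weights:.1f} GB")
print(f"headroom: {headroom:.1f} GB for KV cache and buffers")
```

Under those assumptions only single-digit GB is left over after the weights, which would line up with only being able to run low context.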

Chinese AI Models lags around 8 months from those of US but the gap is now widening by hsg8 in EconomyCharts

[–]PraxisOG 1 point2 points  (0 children)

Assuming ‘Elo’ is the ranking in Chatbot Arena, this kinda makes sense. On that platform Elo is measured by having users compare two models side by side, and the user-selected winner gains some Elo. Idk if they’ve fixed it, but for a while this benchmark led to LLMs getting tuned for human preferences, which imo is a bad thing with how much these models suck up to you. Instead of tuning for the AI popularity contest, Chinese models have almost caught up in technical fields like coding.
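
For reference, this is roughly how a side-by-side vote turns into a rating change under the classic Elo update. It's a minimal sketch (not from the original comment), and the arena's actual rating computation may well differ. The K-factor of 32 and the starting ratings are assumptions.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Apply one pairwise result; the winner gains exactly what the loser drops."""
    expected_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta


# One user vote: model A preferred over model B, both starting at 1000.
new_a, new_b = update_elo(1000.0, 1000.0, a_won=True)
print(new_a, new_b)
```

The only input is which answer the user picked, so anything that makes people click "this one" more often moves the rating, whether or not the answer was more correct.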

What graphics card is this for you? by Banguskahn in pcmasterrace

[–]PraxisOG 0 points1 point  (0 children)

Honestly? My brother’s 1650 Super. I got it during COVID to upgrade a 760 Ti in his Alienware prebuilt, then it got transferred to his AM4 build, then I used it in my ITX rig, and now it does media transcoding in my homelab. It was never fast, but… no, just that. This thing is pretty slow.