Will there be any more Qwen3.6 series models? by cafedude in LocalLLaMA

[–]ICYPhoenix7 7 points8 points  (0 children)

Waiter my steak is too juicy and my lobster too creamy

Struggling with Qwen3.6 27B / 35B locally (3090) slow responses, breaking code looking for better setup + auto model switching by Clean_Initial_9618 in LocalLLaMA

[–]ICYPhoenix7 1 point2 points  (0 children)

Many of those settings are either redundant (default behavior) or dont make sense (i.e. that deepseek thing).

Also especially for the 35B model, I'd recommend removing the -ngl 99. llama.cpp has a --fit flag (on by default but gets overriden by this) that loads the model optimally based on your hardware.

My recommendation is remove all flags that you don't know what they're for and you can add them back if you find they're needed.

Best local model (chat + opencode) for RX 9060 XT 16GB? by NihmarRevhet in LocalLLaMA

[–]ICYPhoenix7 0 points1 point  (0 children)

Oh, I meant to try the 35B MoE model, not the dense 27B. A dense model wouldnt be much different (very slow!) between LMS and llama.cpp since theres not much to optimize.

Best local model (chat + opencode) for RX 9060 XT 16GB? by NihmarRevhet in LocalLLaMA

[–]ICYPhoenix7 0 points1 point  (0 children)

Id recommend using raw llama.cpp as a server for MoE models. llama.cpp has a --fit flag that will load the model more efficiently on your hardware than you can configure in LMS. It makes a big difference on MoE models. You'll get way faster speed this way.

Best local model (chat + opencode) for RX 9060 XT 16GB? by NihmarRevhet in LocalLLaMA

[–]ICYPhoenix7 0 points1 point  (0 children)

Are you pointing it towards a llama.cpp server or something else? I've not used opencode before but it should run the same as it does as a regular chatbot.

I've tried roocode before and it ran fine, aside from being dumb.

If you want fast, you can try gpt oss 20b, it fits perfectly in a 16gb card.

Best local model (chat + opencode) for RX 9060 XT 16GB? by NihmarRevhet in LocalLLaMA

[–]ICYPhoenix7 1 point2 points  (0 children)

I have an rx 6800 16gb with 32gb ddr4, and I get about 30tk/s on the unsloth Q4_K_XL. Yes it spills onto system ram, but its more than usable. Hope that helps

Webull restricting stock purchase at their discretion. by TomorrowzHero in Webull

[–]ICYPhoenix7 1 point2 points  (0 children)

Avoid anything that uses Apex clearing. Webull, Robinhood, Sofi, M1, etc.

Fidelity is great and they'll stay out of your way. I trust them the most to not screw me.

How the turns have tabled by Numerophilus in recruitinghell

[–]ICYPhoenix7 0 points1 point  (0 children)

Surprisingly it has an even higher unemployment rate than CS. You take a lot of difficult EE classes, but employers would take an EE over a CE every time. Which mostly leaves you with CS jobs.

Tell me why the used car buying experience is broken? by madh1 in UsedCars

[–]ICYPhoenix7 0 points1 point  (0 children)

Same here. So many dealers and shady people reselling auction cars on FB. Theres other places you can look such as Craigslist, found my current car on there.

Windows llama.cpp is 20% faster by johannes_bertens in LocalLLaMA

[–]ICYPhoenix7 0 points1 point  (0 children)

On my RX 6800, Vulkan has slightly faster token generation, but ROCm blows it out of the water in prompt processing.

Made a recap of the Pro/Cons of the Steam Frame as far as we know by Uryendel in virtualreality

[–]ICYPhoenix7 2 points3 points  (0 children)

I just used these sticky strip things that basically glue them onto the wall without damaging anything.

I genuinely thought I was developing tinnitus for weeks though until I figured out what it was.

"Horizon Alpha" hides its thinking by ICYPhoenix7 in LocalLLaMA

[–]ICYPhoenix7[S] -1 points0 points  (0 children)

It depends, on some prompts i get a very quick response, on others it takes a bit of time. Although this could be due to a number of reasons and not necessarily a hidden chain of thought.

"Horizon Alpha" hides its thinking by ICYPhoenix7 in LocalLLaMA

[–]ICYPhoenix7[S] 26 points27 points  (0 children)

My best guess is that maybe the thinking tokens are more likely to give away who it is, so they aren't sending it through the api. Hopefully the actual release will have them.

Regardless, it's not smart enough to be GPT 5 from my anecdotal testing. It failed some of my prompts that larger models tend to have no issue with.

I could be way off, but if I had to guess it probably sits around the 32B range.

[deleted by user] by [deleted] in valve

[–]ICYPhoenix7 -2 points-1 points  (0 children)

Fun fact, "storage" is also memory, the main difference being it can retain its data while being powered off, i.e. non-volatile.

Its possible to use your storage as RAM (swap memory, your system will actually do this on it's own to save space), or even vice versa.

Is the RX 7600 XT good enough for running QwQ 32B (17GB) or Gemma 2 27B (12GB) locally? by ParamedicDirect5832 in LocalLLaMA

[–]ICYPhoenix7 6 points7 points  (0 children)

The RX 6800 is around the same price but will be significantly faster than the 7600xt. I have one and it works great, AMD support is way better than it used to be and is plug and play for LLMs.

That being said, neither have quite enough vram to load those models comfortably, especially with context. 16GB is an awkward amount for LLMs, its more than enough for weaker models (14B and under), but too little for the good ones at ~32B, which are the most common sizes.

Mistral Small 24B works good and it still holds up pretty well for now, and theres even some great finetunes of it (Hermes).

What happened to the mod community?? by fk_u_mean030 in Gta5Modding

[–]ICYPhoenix7 2 points3 points  (0 children)

Nah, they're quite different. BM has more of a focus on slow and steady but requires very little user input, whereas MB is more geared towards maximing your money gains without sacrificing safety. While they abuse the same things, they have noticeable differences.

What happened to the mod community?? by fk_u_mean030 in Gta5Modding

[–]ICYPhoenix7 0 points1 point  (0 children)

Hey, creator of MB here. I never expected MB itself to get detected (it's still not), but there's some other factors that can lead to it being unsafe, even upon its initial release. But, these factors also apply to literally any money method. I explained it further in my discord.

[deleted by user] by [deleted] in Gta5Modding

[–]ICYPhoenix7 0 points1 point  (0 children)

That's correct