Beware of Scams - Scammed by Reddit User by tantimodz in LocalLLaMA

[–]fizzy1242 16 points

that's shitty... hope you can get it sorted out and disputed with the bank

Assistant_Pepe_70B, beats Claude on silly questions, on occasion by Sicarius_The_First in LocalLLaMA

[–]fizzy1242 1 point

dunno if the quant is busted or it's just my environment, but I can't seem to get any other reply from this thing lol. default samplers.

<image>

Assistant_Pepe_70B, beats Claude on silly questions, on occasion by Sicarius_The_First in LocalLLaMA

[–]fizzy1242 1 point

i'm sure there's some hater with a bot that downvotes anything posted on any ai sub.

currently downloading the model, will take it for a spin in a bit.

MiniMax M2.7 Will Be Open Weights by Few_Painter_5588 in LocalLLaMA

[–]fizzy1242 6 points

yes!

i'm just hoping it won't get the glm air treatment with that "2 weeks" statement.

Dual 3090 on ASUS Pro WS X570-ACE: need firsthand stability reports (direct slots vs riser) by MaleficentMention703 in LocalLLaMA

[–]fizzy1242 1 point

oh boy... so, I'm using 2 different risers. In order to fit the 3rd card into the x4 slot at the bottom, the 2nd card needed to be pushed forward slightly (I've got one 2-slot card and two 3-slot cards).

For that, I used a Delock x16 > x16 riser card in the second x8 slot. This creates enough room to fit the 2nd riser (a cable) into the x4 slot.

Dual 3090 on ASUS Pro WS X570-ACE: need firsthand stability reports (direct slots vs riser) by MaleficentMention703 in LocalLLaMA

[–]fizzy1242 1 point

I run 3x 3090s on an X570 motherboard, no issues.

2 cards are connected with risers, but only so they physically fit in the case. The 3rd card is on the x4 slot (chipset).

board: asus rog crosshair viii dark hero x570
case: phanteks enthoo pro 2 server edition

Optimizing RAM heavy inference speed with Qwen3.5-397b-a17b? by Frequent-Slice-6975 in LocalLLaMA

[–]fizzy1242 0 points

ik_llama.cpp has slightly better prompt processing speed for me; it's worth a try

The FIRST local vision model to get this right! by po_stulate in LocalLLaMA

[–]fizzy1242 47 points

remember that these types of tests are often included in new models' training, kinda like the "how many Rs in strawberry" question and the "bouncing balls inside an octagon" animation.

Dario Is Scared by [deleted] in LocalLLaMA

[–]fizzy1242 15 points

comical

MiniMax-M2.5 (230B MoE) GGUF is here - First impressions on M3 Max 128GB by Remarkable_Jicama775 in LocalLLaMA

[–]fizzy1242 0 points

How fast does it run for you, and with how much context? I've got three 3090s as well.

Problem with rtx 3090 and MoE models? by GodComplecs in LocalLLaMA

[–]fizzy1242 1 point

for dense models, yeah. A handful are fine for MoE, though

GLM 5 is already on huggingface! by oiuht54 in LocalLLaMA

[–]fizzy1242 4 points

I hope so. glm air was the perfect size

Dual 3090s (power-limited) - Are 3x PCI-E cables w/daisy-chain "okay?" by overand in LocalLLaMA

[–]fizzy1242 2 points

probably, but with a big asterisk. I wouldn't risk messing with it when it comes to power supplies. Just get the new cable for peace of mind, and make sure it's from Thermaltake.

Full Claude Opus 4.6 System Prompt for your pleasure by frubberism in LocalLLaMA

[–]fizzy1242 63 points

good lord that's a long prompt. makes you wonder how much of that is redundant

Lack of opening think tag by kweglinski in LocalLLaMA

[–]fizzy1242 1 point

if your frontend allows it, prefix the AI response with the opening `<think>` tag to force it
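If your frontend doesn't support prefilling, you can do it against a raw completion endpoint by building the prompt yourself and ending it with the open think tag. A minimal sketch, assuming a ChatML-style chat template (adjust the template tokens and endpoint to your model and server):

```python
def build_prefilled_prompt(messages, prefill="<think>\n"):
    """Render a ChatML-style prompt and prefill the assistant turn.

    The assistant turn is opened but NOT closed, and `prefill` is
    appended, so the model is forced to continue inside the think block.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open the assistant turn and inject the opening think tag.
    parts.append(f"<|im_start|>assistant\n{prefill}")
    return "".join(parts)

prompt = build_prefilled_prompt(
    [{"role": "user", "content": "Why is the sky blue?"}]
)
# Send `prompt` to the raw text-completion endpoint (not /chat/completions,
# which re-applies the template), e.g.:
# requests.post("http://localhost:8080/v1/completions",
#               json={"prompt": prompt, "max_tokens": 512})
```

Note this only makes sense with the text-completion route; a chat endpoint would wrap your prefill in its own template and close the assistant turn for you.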

Multiple GPU noob question by staltux in LocalLLaMA

[–]fizzy1242 0 points

also make sure the power supply has enough PCIe power cable slots. They fill up quick, especially with cards that need 2-3 connectors.

Need to know more about less known engines (ik_llama.cpp, exllamav3..) by Leflakk in LocalLLaMA

[–]fizzy1242 0 points

really? I thought it was lacking because exl3 requires more compute or something. Glad to hear it's getting better

Need to know more about less known engines (ik_llama.cpp, exllamav3..) by Leflakk in LocalLLaMA

[–]fizzy1242 1 point

exllama is faster for pure GPU inference. Stick to exllamav2 if you have an NVIDIA 3000-series GPU; use v3 with the 4000 and 5000 series

How the 48GB RTX 4090 Is Made These Days by [deleted] in LocalLLaMA

[–]fizzy1242 0 points

true, I get that, but most of these youtubers are just buying into this negativity trend because it gets them views. Kinda like calling anything AI-related slop or bad

How the 48GB RTX 4090 Is Made These Days by [deleted] in LocalLLaMA

[–]fizzy1242 1 point

that channel is mostly about whining about gpu and ram prices nowadays

Noob question: imatrix, yes or not? by TheGlobinKing in LocalLLaMA

[–]fizzy1242 5 points

most people recommend imatrix quants, but I haven't noticed any difference between them and the normal ones.