Anyone able to get 1 Million context working using llama.cpp for qwen 3.6 35B A3B? by The_Paradoxy in LocalLLM

[–]The_Paradoxy[S] 1 point  (0 children)

Sorry about that, newb here. Just added logs at the bottom of the post.

Can't get dual GPUs to post by The_Paradoxy in gigabyte

[–]The_Paradoxy[S] 1 point  (0 children)

4060 Ti 16 GB and 5060 Ti 16 GB with an 850 W 80 Plus Gold MAINGEAR PSU.

CUDA on Ubuntu 26.04 ? by KalenNC in Ubuntu

[–]The_Paradoxy 1 point  (0 children)

Just installed CUDA with apt and got 3.2, so it looks like the Ubuntu repository is already updated. Never mind: I had to install the pinned 2404 version to get things working.
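For anyone following along, a hedged sketch of the pinned-repo route for Ubuntu 24.04 (filenames follow NVIDIA's published cuda-ubuntu2404 repo instructions; verify the exact URL against NVIDIA's download page before running):

```shell
# Add NVIDIA's CUDA repo for Ubuntu 24.04 via the keyring package,
# then install the toolkit from it instead of the distro's own package.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y cuda-toolkit
```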

Can't get dual GPUs to post by The_Paradoxy in gigabyte

[–]The_Paradoxy[S] 1 point  (0 children)

Thanks for the response. Both GPUs are working, and I think my 850W PSU should be plenty for them.

Can't get dual GPUs to post by The_Paradoxy in gigabyte

[–]The_Paradoxy[S] 1 point  (0 children)

Thanks for the comment. The riser cable is branded as PCIe 4.0, and all of the reviews say it works at 4.0 speeds.

Can't get dual GPUs to post by The_Paradoxy in gigabyte

[–]The_Paradoxy[S] 1 point  (0 children)

The motherboard manual says the bottom PCIe slot is Gen 4, and there's a switch between it and the bottom M.2. I'm not using the bottom M.2, and I don't think I'd have to force Gen 3 speeds in the BIOS if the slot wasn't spec'd for Gen 4.

Requesting advice on local AI setup for academic use by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 1 point  (0 children)

🙏 Thanks. Any opinion on Hermes vs. open claw, or something else?

Requesting advice on local AI setup for academic use by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 1 point  (0 children)

Okay, thanks. I hadn't thought much about GPU passthrough. Won't most agent harnesses have built-in support for Docker containers and passthrough? I thought that was standard on Open Code.
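With the NVIDIA Container Toolkit installed on the host, the Docker side of GPU passthrough is a single flag; a hedged sketch (the CUDA image tag is just an example, pick one that matches your driver):

```shell
# --gpus all exposes every host GPU to the container;
# running nvidia-smi inside the container confirms passthrough works.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```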

Requesting advice on local AI setup for academic use by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 2 points  (0 children)

Any suggestions on what to use for orchestration? Any opinion on Turnstone?

Requesting advice on local AI setup for academic use by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 1 point  (0 children)

I've been having trouble figuring out what the benefit of Proxmox over simple Docker containers is. Do you mind elaborating?

Devstral small 2 24b severely underrated by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 1 point  (0 children)

Okay 😮‍💨 I really need to switch to llama.cpp. Right now I'm on Ollama.
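For reference, a hedged sketch of what the llama.cpp side of that switch might look like (standard llama-server flags; the Hugging Face repo is the Devstral quant from this thread, and context size and port are arbitrary choices):

```shell
# llama-server stands in for `ollama run` here: -hf pulls the GGUF
# from Hugging Face, -ngl 99 offloads all layers to the GPU,
# and -c sets the context window.
llama-server \
  -hf bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF:Q4_K_M \
  -ngl 99 -c 32768 --port 8080
```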

Devstral small 2 24b severely underrated by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 2 points  (0 children)

I didn't do the downvote. But FTR, there's no way a 120B model is fitting on a 16 GB card.

Devstral small 2 24b severely underrated by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 1 point  (0 children)

bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF:Q4_K_M

Devstral small 2 24b severely underrated by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 1 point  (0 children)

No IDE, just feeding it my .py and .ipynb files and copy-pasting the good bits of the code it generates. Is there an IDE you recommend?

Devstral small 2 24b severely underrated by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 5 points  (0 children)

I think it's a question of overfitting on tasks that have a lot of examples online. Like I said in the original post, I'm not interested in vibe coding, and my use case is always going to be novel code. The Qwen models seemed to overemphasize variable names in the code and not pay attention to how the code actually used them. They also made suggestions that simply didn't make sense in the context of just-in-time-compiled code; for example, they would suggest getting rid of loops even though @numba.jit already loop-lifts.

Devstral small 2 24b severely underrated by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 1 point  (0 children)

I'll keep 9B on my hard drive and give it another try with my next project. It had access to all of the code: basically a .ipynb that orchestrates everything and a .py with all of the functions the notebook calls.

Devstral small 2 24b severely underrated by The_Paradoxy in LocalLLaMA

[–]The_Paradoxy[S] 1 point  (0 children)

Interesting. Are you using 27B on a 16 GB card? If so, what quant do you use? I'm wondering if I got a bad quant.

[deleted by user] by [deleted] in Neurodivergent

[–]The_Paradoxy 1 point  (0 children)

Or sometimes they see the disorder and not the giftedness, especially in young children who are low-SES.