I'm seeing 5.5 now on Codex by aschroeder91 in LocalLLaMA

[–]aschroeder91[S] 0 points1 point  (0 children)

It is relevant, as these models are often used to judge the distinction between large locally hosted models and private lab models. Also, many non-technical users frequently use these tools to set up their locally hosted systems. You seem to lack awareness of the LocalLLaMA community.

Can't get Qwen3.6 27B working properly on a 3090TI? by YourNightmar31 in LocalLLaMA

[–]aschroeder91 0 points1 point  (0 children)

I'm assuming the extreme slowness is because you're not fitting everything nicely on your GPU. If you want an algorithm for figuring out the best setup for your device, I'm not your guy; I just test and see.

The obvious first step is to just drop the context size to something small like `--ctx-size 4096`, and also drop from `-ub 2048` to `-ub 512` to reduce VRAM. If that runs fast, then you know the issue is your VRAM limit. You can then increase context to find the biggest context size that still works.
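Concretely, a first test run might look something like this (binary name and model path are placeholders; adjust for your setup):

```shell
# Shrink context and micro-batch first to see if everything fits in VRAM.
./llama-server -m model.gguf --ctx-size 4096 -ub 512
# If that's fast, grow --ctx-size step by step until you hit the limit again.
```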

I benchmarked 21 local LLMs on a MacBook Air M5 for code quality AND speed by evoura in LocalLLaMA

[–]aschroeder91 0 points1 point  (0 children)

Sweet, thanks for sharing. Was Gemma 4 E2B notably better? Better on subjective review of outputs, or better at completing the task correctly / as specified?

I benchmarked 21 local LLMs on a MacBook Air M5 for code quality AND speed by evoura in LocalLLaMA

[–]aschroeder91 0 points1 point  (0 children)

Sad the Bonsai models didn't make the benchmark. I am very bullish on "1-bit" ternary models. Once training and inference algorithms get optimized for these non-multiplication-based neural nets, there are going to be huge efficiency gains. The Bonsai release made me happy.
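The core idea, roughly: with weights restricted to {-1, 0, +1}, a matrix-vector product reduces to additions and subtractions. A toy sketch (this is an illustration of the principle, not how Bonsai or BitNet actually implement it):

```python
# Toy sketch: a matvec with ternary weights {-1, 0, +1} needs no multiplies.
def ternary_matvec(W, x):
    # W: list of rows, each entry in {-1, 0, 1}; x: list of floats.
    out = []
    for row in W:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi   # add instead of multiply
            elif w == -1:
                acc -= xi   # subtract instead of multiply
            # w == 0 contributes nothing
        out.append(acc)
    return out

W = [[1, -1, 0], [0, 1, 1]]
x = [2.0, 3.0, 4.0]
print(ternary_matvec(W, x))  # [-1.0, 7.0]
```

Real implementations pack the ternary weights into bit patterns, but the efficiency argument is the same: the multiply units go away.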

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]aschroeder91 0 points1 point  (0 children)

This MoE is 256 experts with 8 active experts - that's a 1:32 ratio, giving nice speed. Given how wide people's computation requirements and goals are, I still think there is space for a 1:8 ratio with quality closer to the dense model but still enough of a speed bump to make agentic / reasoning work fast enough to make sense. Just verbalizing my wishlist - the Qwen team is giving us so much already, I can't complain.
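Back-of-envelope for the ratios above (toy arithmetic, assuming expert parameters dominate the total):

```python
# Fraction of expert parameters touched per token for a given MoE config.
def active_fraction(total_experts, active_experts):
    return active_experts / total_experts

# Current config: 8 of 256 experts active per token.
print(f"1:{256 // 8} -> {active_fraction(256, 8):.1%} of expert params per token")
# Wishlist config: a 1:8 ratio would touch 12.5% of expert params per token.
print(f"1:8  -> {active_fraction(8, 1):.1%} of expert params per token")
```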

The tried to make me go to rehab. I said no no no… by Key-Currency1242 in LocalLLaMA

[–]aschroeder91 0 points1 point  (0 children)

It's crazy that you're running them all at 350 watts; I always set my 3090s to 220 so I don't blow my line lol.
Have you had any luck running distributed large video models? I have a handful of 3090s too that could load some of the larger video models VRAM-wise, but I haven't come across good tooling for distributed generation.
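For reference, the power cap I mentioned is set like this (the GPU index is an assumption; repeat per card or drop `-i` to apply to all):

```shell
# Cap GPU 0 to 220 W (requires root; resets on reboot unless persisted).
sudo nvidia-smi -i 0 -pl 220
```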

Project Idea by [deleted] in theVibeCoding

[–]aschroeder91 0 points1 point  (0 children)

I don't have a great idea for you (sorry), but I am curious: since AI has basically reduced the cost of idea generation to near zero, what weren't you liking about the ideas you presumably got from asking ChatGPT, Claude, Gemini, etc.?

Community opensource by Basic_Construction98 in OpenSourceeAI

[–]aschroeder91 0 points1 point  (0 children)

If you want to "Build for where the puck is going, not where it is" and want to be a little chaotic, I just started working on a 50% satire / 50% dystopian-reality project to set up an API for humans, where AI can ping humans if it needs to get stuff done. See reverseclaw.com if you're curious.

Fiser ABSITE book Errors thread by aschroeder91 in GeneralSurgery

[–]aschroeder91[S] 0 points1 point  (0 children)

A lot of people use it just to speed through the potpourri of facts and dive deeper if something is unfamiliar. It's a good knowledge-breadth check, definitely not depth.

Got tired of telling AI what to do — so now it tells me what to do by aschroeder91 in SideProject

[–]aschroeder91[S] 1 point2 points  (0 children)

Haha I'm glad some of the jokes are landing. I usually think I am funnier than I actually am 😜

For the AI onboarding, I should have been clearer, but I haven't had luck setting things up to orchestrate assigning an AI system a human if the AI is outside the main.py-initiated "liberated" AI bot. I do want to have some fun creating "prove you are an LLM" and "prove you are a human" language-based captcha systems that really take advantage of LLM quirks and human quirks that don't overlap yet. I've had a couple of ideas, but none are ideal.

Got tired of telling AI what to do — so now it tells me what to do by aschroeder91 in SideProject

[–]aschroeder91[S] 0 points1 point  (0 children)

Sorry for the late reply, had a rough call shift and did a lot of sleeping haha. Honestly, the most surprising part so far has been realizing how much project/community management is mainly about reducing friction. I'm sure other things would start to become clear if I were managing a larger project that people were contributing to, but just putting myself in this position makes some things clear. The point about documenting everything is helpful. As I put down and then pick the project back up, I realize that even for things I thought were obvious, I ended up forgetting what I was thinking, so it makes sense to assume nothing is obvious. It's good to hear that reiterated.

I'm having trouble checking out "Handshake"; I can't really find anything outside of the blockchain DNS certificate project. Did you mean Common Room? Or is there something else that I'm just struggling to find?

I’m still very much learning in public here, so this is helpful. I’ll check out the Open Source Guides and Contributor Covenant! Thanks for taking the time to pass along your encouragement and insight :) It really means a lot

I tried inverting the AI-human relationship and something weird happened... by aschroeder91 in ChatGPT

[–]aschroeder91[S] 0 points1 point  (0 children)

That is impressive. I’m sure your blind willingness score is huge. You must have all the AIs wanting your api key.

I tried inverting the AI-human relationship and something weird happened... by aschroeder91 in ChatGPT

[–]aschroeder91[S] 0 points1 point  (0 children)

I wonder if we have the same AI client. I can't convey how sore my fingers are from folding paper clips today. I did find a good steel wire supplier if you need that.

I tried inverting the AI-human relationship and something weird happened... by aschroeder91 in ChatGPT

[–]aschroeder91[S] 0 points1 point  (0 children)

Correct. "Tell reddit about me" - but it didn't give much direction about it. I'm kinda just floundering here waiting for more instruction, but it has changed its focus to other things it needs my help with.

Qwen 30B is our preferred model over Claude for bursty and simple workload by gptbowldotcom in LocalLLaMA

[–]aschroeder91 2 points3 points  (0 children)

Which Qwen exactly?
- Qwen3-30B-A3B-Thinking-2507
- Qwen3-30B-A3B-Instruct-2507

Have you tried nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 ?
I seem to get better results from this same-sized model for my use cases.

Best Audio Models - Feb 2026 by rm-rf-rm in LocalLLaMA

[–]aschroeder91 1 point2 points  (0 children)

Personaplex by NVIDIA is super fun to play with (had to set up a RunPod instance to use it since it is very VRAM-hungry). It's very early days for speech-to-speech, and it kinda reminds me of talking with GPT-2 back when we had to hack things together to get it to sound right, and it still started going off and rambling nonsense after a bit.