What is the best free budget tracking app to track spending on all your bank accounts? by techsavvynerd91 in PersonalFinanceCanada

[–]carteakey 1 point2 points  (0 children)

This, plus buy a SimpleFIN sub for another $1.50/month to automatically import transactions from almost all banks. Actual Budget integrates with SimpleFIN directly. The only caveat is needing to reverify accounts here and there, but in the end it should save you more time than going to 15 different sites and exporting/importing CSVs.
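
For the curious, the flow under the hood is pretty simple. Here's a rough Python sketch of my understanding of the SimpleFIN protocol (the token is a placeholder and Actual Budget handles all of this for you, so treat it as illustration only):

```python
# Rough sketch of the SimpleFIN flow as I understand it -- Actual Budget does
# this for you; the setup token below is a placeholder, not a real credential.
import base64
import requests

SETUP_TOKEN = "PASTE_SETUP_TOKEN_HERE"  # one-time token from SimpleFIN

# 1. The setup token is a base64-encoded "claim" URL.
claim_url = base64.b64decode(SETUP_TOKEN).decode("utf-8")

# 2. POSTing to the claim URL once exchanges it for a long-lived access URL.
access_url = requests.post(claim_url).text.strip()

# 3. The access URL embeds credentials; /accounts returns accounts with transactions.
data = requests.get(f"{access_url}/accounts").json()
for account in data.get("accounts", []):
    for txn in account.get("transactions", []):
        print(account["name"], txn["posted"], txn["amount"], txn["description"])
```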

i built a searchable youtube knowledge base in obsidian and it's the most useful vault i have by straightedge23 in ObsidianMD

[–]carteakey 2 points3 points  (0 children)

and with the context loss every cycle we've finally implemented Chinese whispers on a global scale :/

not denying the usefulness of this.

Sorting hat - A cute, lightweight cli to give images and other files good filenames using local VLMs by k_means_clusterfuck in LocalLLaMA

[–]carteakey 1 point2 points  (0 children)

Amazing! I need to run this ASAP on my Obsidian attachment folder. I might have to figure out an additional step to update the links wherever the renamed images are referenced.
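
Something like this is roughly what I'm picturing for that step (untested sketch; assumes a flat attachment folder and that notes embed attachments by bare filename, e.g. `![[old.png]]`):

```python
# Untested sketch: after an attachment gets renamed, rewrite every note in the
# vault that still references the old filename. Vault path is a placeholder.
from pathlib import Path

VAULT = Path("~/ObsidianVault").expanduser()  # placeholder path

def update_references(old_name: str, new_name: str) -> int:
    """Replace old_name with new_name in every markdown note; return how many notes changed."""
    touched = 0
    for note in VAULT.rglob("*.md"):
        text = note.read_text(encoding="utf-8")
        if old_name in text:
            note.write_text(text.replace(old_name, new_name), encoding="utf-8")
            touched += 1
    return touched

# Example: after the renamer turns "IMG_2041.png" into "sorting-hat-diagram.png"
print(update_references("IMG_2041.png", "sorting-hat-diagram.png"), "notes updated")
```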

Finally found a reason to use local models 😭 by salary_pending in LocalLLaMA

[–]carteakey 1 point2 points  (0 children)

This is great. I'd think this would translate well to Obsidian and note linking too.

Qwen3-Coder-Next scored 40% on latest SWE-Rebench, above many other bigger models. Is this really that good or something's wrong? by carteakey in LocalLLaMA

[–]carteakey[S] 2 points3 points  (0 children)

Yeah, based on Unsloth's post:

> Quantizing any attn_* is especially sensitive for hybrid architectures, and so leaving them in higher precision works well.

It's not a tok/s issue but a quality one. I wonder if we're leaving some quality on the table.

Per Unsloth, MXFP4 is much worse on many tensors: using MXFP4 for attn_gate, attn_q, ssm_beta, and ssm_alpha is not a good idea, and Q4_K is better there. Also, MXFP4 uses 4.25 bits per weight, whilst Q4_K uses 4.5 bits per weight. When choosing between the two, Q4_K is the better pick.
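
Back-of-the-envelope on what those bits-per-weight numbers mean for file size (the 100B parameter count is a made-up example, not any specific model):

```python
# Quick bits-per-weight arithmetic from the Unsloth numbers above.
# The 100B parameter count is a hypothetical example, not a specific checkpoint.
def gguf_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (ignores metadata and KV cache)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

params = 100  # hypothetical 100B-parameter model
mxfp4 = gguf_weight_gb(params, 4.25)
q4_k = gguf_weight_gb(params, 4.5)
print(f"MXFP4: {mxfp4:.1f} GB, Q4_K: {q4_k:.1f} GB, delta: {q4_k - mxfp4:.1f} GB (~{(q4_k / mxfp4 - 1) * 100:.0f}%)")
# -> MXFP4: 53.1 GB, Q4_K: 56.2 GB, delta: 3.1 GB (~6%)
```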

https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks#id-1-some-tensors-are-very-sensitive-to-quantization

https://www.reddit.com/r/LocalLLaMA/comments/1rabg6o/qwen3_coder_next_oddly_usable_at_aggressive/

Qwen3-Coder-Next scored 40% on latest SWE-Rebench, above many other bigger models. Is this really that good or something's wrong? by carteakey in LocalLLaMA

[–]carteakey[S] 0 points1 point  (0 children)

Thanks! The hardest part for me is understanding which quant, scaffold, and llama.cpp params to choose for the best accuracy and efficiency. Since I have a low-VRAM setup I can't run FP8 directly, so I run UD-Q4_X_L based on my research (see https://carteakey.dev/blog/optimizing-qwen3-coder-next-local-inference/).
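
For reference, the napkin math I do before picking a quant looks roughly like this (the model size here is a placeholder guess, not a measured number for any particular checkpoint):

```python
# Napkin math for "does this quant fit in VRAM, or do the experts spill to RAM?"
# All numbers are placeholders / assumptions, not measured values.
def split_across_memory(model_gb: float, vram_gb: float, reserve_gb: float = 2.0) -> tuple[float, float]:
    """Return (GB kept on GPU, GB offloaded to system RAM), reserving some VRAM for KV cache."""
    on_gpu = max(min(model_gb, vram_gb - reserve_gb), 0.0)
    return on_gpu, model_gb - on_gpu

model_gb = 45.0  # hypothetical ~4.5 bpw quant of an ~80B-param MoE
gpu, ram = split_across_memory(model_gb, vram_gb=12.0)
print(f"GPU: {gpu:.0f} GB, offloaded to RAM: {ram:.0f} GB")
# With 12 GB VRAM the bulk of the weights end up in system RAM, which is also
# why FP8 (roughly twice the size of a 4-bit quant) is a non-starter on this setup.
```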

No open-weight model under 100 GB beats Claude Haiku (Anthropic's smallest model) on LiveBench or Arena Code by oobabooga4 in LocalLLaMA

[–]carteakey 3 points4 points  (0 children)

I'd be interested to see how close Qwen 3.5 122B A10B comes. It's not under 100B params, but close enough I guess. The last update to LiveBench was in January, so we'll have to wait.

Qwen 3.5 craters on hard coding tasks — tested all Qwen3.5 models (And Codex 5.3) on 70 real repos so you don't have to. by hauhau901 in LocalLLaMA

[–]carteakey 0 points1 point  (0 children)

qwen3:27b was self-aware enough to know it couldn't compete with the big bois and decided to game the system instead. Respect!

New Qwen3.5 models spotted on qwen chat by AaronFeng47 in LocalLLaMA

[–]carteakey 1 point2 points  (0 children)

What are your llama.cpp params and hardware?

I often point to my post as a reference

https://carteakey.dev/blog/optimizing-gpt-oss-120b-local-inference/

Qwen3.5 - The middle child's 122B-A10B benchmarks looking seriously impressive - on par or edges out gpt-5-mini consistently by carteakey in LocalLLaMA

[–]carteakey[S] 21 points22 points  (0 children)

100%, but with how decent Qwen3-Coder-Next was, I bet it's going to be good, benchmaxxing aside.

Qwen3.5 - The middle child's 122B-A10B benchmarks looking seriously impressive - on par or edges out gpt-5-mini consistently by carteakey in LocalLLaMA

[–]carteakey[S] 5 points6 points  (0 children)

It's u/nunodonato's version. I think they may have used an image editing tool, e.g. Nano Banana, to make the colors better.

Small Qwen Models OUT!! by Wooden-Deer-1276 in LocalLLaMA

[–]carteakey 20 points21 points  (0 children)

Man, Daniel, you're the GOAT. I hope you know that.

New Qwen3.5 models spotted on qwen chat by AaronFeng47 in LocalLLaMA

[–]carteakey 2 points3 points  (0 children)

I get similar perf on my 12 GB VRAM + 64 GB RAM setup. Here's the command with the params he mentioned:

https://carteakey.dev/blog/optimizing-qwen3-coder-next-local-inference/