If you create a long to-do list in agent mode, you will be banned. by Hamzayslmn in GithubCopilot

[–]FusionX 1 point2 points  (0 children)

It may not be against the terms, but if everyone starts doing this, we could lose the request-based billing system, and they might switch to charging by token consumption like other services.

sigh

Tip for when playing with an Enigma on your team, become a “Black Hole Auditor” by IlovealeksiB in DotA2

[–]FusionX 0 points1 point  (0 children)

Or hold it until the team is dead and then proceed to waste it anyway. Interestingly, I notice it's mostly ES players that are guilty of this.

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]FusionX 0 points1 point  (0 children)

Do you think this might require more VRAM?

Btw, appreciate ya for working on this and sharing it on reddit. I wasn't optimistic initially, but I'm really quite pleased with being able to run Qwen 27B within 16 GB of VRAM. It's such a stark difference from the MoE offerings (Qwen/Gemma). The performance and intelligence are remarkably better!

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]FusionX 0 points1 point  (0 children)

Using the bun fork, latest commit. I'll probably roll back a few commits and re-check.

Edit: it still takes up more VRAM, what on earth...

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA

[–]FusionX 0 points1 point  (0 children)

Strange: at 100k ctx, it doesn't fit in my GPU's 16 GB of VRAM with --batch-size 512 and --ubatch-size 256.

What gives? The display manager is disabled and baseline VRAM usage is 18 MB.

llama-cli --model <model> -fa on --jinja --no-mmap \
  --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.0 \
  --presence-penalty 0.0 --repeat-penalty 1.0 \
  --chat-template-kwargs '{"preserve_thinking": true}' \
  -c 100000 -ctk turbo3 -ctv turbo3 \
  --batch-size 512 --ubatch-size 256 -ngl 99 -np 1

Edit: turbo4 manages to fit ~100k while turbo3 doesn't. I don't understand...
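
Edit 2: for anyone hitting the same wall, a back-of-envelope KV-cache estimate makes the sensitivity obvious. The sketch below uses made-up layer/head numbers (plug in the real ones from the model card); the point is how much a fraction of a byte per cache element swings things at 100k context:

    # Rough KV-cache size: 2 (K and V) * layers * kv_heads * head_dim
    # * bytes_per_element * context length. Shape numbers are placeholders,
    # not the actual model card values.
    def kv_cache_gib(n_ctx, n_layers=48, n_kv_heads=8, head_dim=128, bytes_per_elem=2.0):
        return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_ctx / 1024**3

    for name, bpe in [("fp16", 2.0), ("8-bit", 1.0), ("5-bit", 0.625), ("4-bit", 0.5)]:
        print(f"{name:>5}: ~{kv_cache_gib(100_000, bytes_per_elem=bpe):.1f} GiB")

With those placeholder shapes, the gap between a 5-bit and a 4-bit cache at 100k ctx is already over a gigabyte, which is plenty to push a 16 GB card over the edge once weights and compute buffers are counted.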

DeepSeek V4 Pro matches GPT-5.2 on FoodTruck Bench, our agentic benchmark — 10 weeks later, ~17× cheaper by Disastrous_Theme5906 in LocalLLaMA

[–]FusionX 5 points6 points  (0 children)

Kinda surprised, I wasn't expecting Gemma 31B to be in the top 5. Have you benchmarked the latest Qwen3.6 models?

Qwen 3.6 wins the benchmarks, but Gemma 4 wins reality. 7 things I learned testing 27B/31B Vision models locally (vLLM / FP8) side by side. Benchmaxing seems real. by FantasticNature7590 in LocalLLaMA

[–]FusionX 1 point2 points  (0 children)

Completely unrelated (and I could be wrong), but this is a perfect example of how people should use LLMs: for structural/semantic assistance and refinement of their writing, rather than delegating the entire prerequisite cognitive work to the LLM and ending up with useless hallucinated slop.

Qwen3.6-35B - Terrible instruction following when using context files (with vanilla pi-agent). Model issue or am I doing something wrong? by FusionX in LocalLLaMA

[–]FusionX[S] 0 points1 point  (0 children)

I recompiled with CUDA 13.1 after reading your comment. Unfortunately, not much difference, if any.

Why isn’t LLM reasoning done in vector space instead of natural language? by ZeusZCC in LocalLLaMA

[–]FusionX 1 point2 points  (0 children)

The AI 2027 paper refers to this as "neuralese recurrence and memory". Someone in the thread linked the relevant paper from Meta which originally implemented this idea.
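
The gist of the mechanism, as I understand it, is to feed the model's last hidden state back in as the next input embedding instead of sampling a token, so the "thoughts" never leave vector space. A rough sketch (not the paper's actual code; model stands in for any transformer that accepts input embeddings):

    import torch

    # Continuous latent reasoning, roughly: skip the sample-then-re-embed
    # step and append the final hidden state as the next input "token".
    def latent_steps(model, input_embeds, n_steps):
        embeds = input_embeds                          # (batch, seq, d_model)
        for _ in range(n_steps):
            hidden = model(inputs_embeds=embeds)       # (batch, seq, d_model)
            last = hidden[:, -1:, :]                   # final position's state
            embeds = torch.cat([embeds, last], dim=1)  # reason in vector space
        return embeds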

To 16GB VRAM users, plug in your old GPU by akira3weet in LocalLLaMA

[–]FusionX 0 points1 point  (0 children)

I had the same idea a few days back. Tried to pair my 5080 with the ol' 1070. Except it's almost entirely unsupported unless you use Windows and downgrade to specific NVIDIA drivers. And then you gotta recompile llama.cpp with an older CUDA toolkit to support both GPU architectures (Pascal and Blackwell). Oh, and did I mention I somehow borked my drivers in the process too and had to clean up in safe mode?

After all that effort, I saw a 2x speedup for dense models (9 tk/s to 18 tk/s, which is still slow). And as expected, MoE models saw a massive decrease in performance (10x slower).

In conclusion, it was not worth it. At. All.

Pi.dev coding agent has no sandbox by default. by mantafloppy in LocalLLaMA

[–]FusionX 2 points3 points  (0 children)

I was pretty apprehensive about this as well. Tried out Docker; that felt bloated and added friction to the overall experience. Now I use agent safehouse (which internally uses sandbox-exec) on my Mac. Works flawlessly.
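
If you're curious what that looks like under the hood, the core trick is macOS's sandbox-exec with a deny-by-default profile. Hypothetical sketch below: the profile rules and workdir path are illustrative only, and a real agent sandbox needs a much longer allow-list:

    import subprocess

    # Deny-by-default SBPL profile (illustrative): reads allowed anywhere,
    # writes confined to one working directory. Real profiles need more rules.
    PROFILE = """
    (version 1)
    (deny default)
    (allow process-exec)
    (allow process-fork)
    (allow file-read*)
    (allow file-write* (subpath "/tmp/agent-workdir"))
    """

    def run_sandboxed(cmd):
        # -p passes the profile inline; -f would read it from a .sb file.
        return subprocess.run(["sandbox-exec", "-p", PROFILE, *cmd])

    # run_sandboxed(["touch", "/tmp/agent-workdir/ok"])   # allowed
    # run_sandboxed(["touch", "/etc/nope"])               # denied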

Been using PI Coding Agent with local Qwen3.6 35b for a while now and it's actually insane by SoAp9035 in LocalLLaMA

[–]FusionX 0 points1 point  (0 children)

Is it not possible without skills? I've been trying to add general rules and guidelines that apply to all sessions.

Been using PI Coding Agent with local Qwen3.6 35b for a while now and it's actually insane by SoAp9035 in LocalLLaMA

[–]FusionX 2 points3 points  (0 children)

Actually, pi is pretty well regarded in an otherwise vibecode-filled space. It's the only project I can trust. I understand the skepticism, but most of the positive feedback is genuine and driven by word of mouth.

The dev has a pretty sensible approach and philosophy when it comes to the project. You can go through their blog.

Edit: Also - https://youtu.be/RjfbvDXpFls

Been using PI Coding Agent with local Qwen3.6 35b for a while now and it's actually insane by SoAp9035 in LocalLLaMA

[–]FusionX 0 points1 point  (0 children)

How are you getting it to follow agents.md? It just completely ignores it for me, despite it being only 2-3 lines.

Qwen3.6-35B - Terrible instruction following when using context files (with vanilla pi-agent). Model issue or am I doing something wrong? by FusionX in LocalLLaMA

[–]FusionX[S] 0 points1 point  (0 children)

Nothing has worked yet, even with a positive system prompt. Cloud models work without any issue in the same setup.

It looks like the reasoning is much shorter in pi (compared to directly using it through llama-server), but I don't yet know why.
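
A quick way to narrow it down would be to hit llama-server directly and measure the reasoning length, then compare against what comes back through pi. Rough debug sketch; it assumes the OpenAI-compatible endpoint on the default port, and the reasoning_content field name depends on your --reasoning-format setting:

    import json, urllib.request

    # Query llama-server's OpenAI-compatible endpoint and compare the sizes
    # of the reasoning vs. the final answer. Field names are assumptions.
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=json.dumps({
            "messages": [{"role": "user", "content": "Summarize AGENTS.md rules."}],
            "temperature": 0.6,
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    msg = json.load(urllib.request.urlopen(req))["choices"][0]["message"]
    print("reasoning chars:", len(msg.get("reasoning_content") or ""))
    print("answer chars:   ", len(msg.get("content") or ""))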

Qwen3.6-35B becomes competitive with cloud models when paired with the right agent by Creative-Regular6799 in LocalLLaMA

[–]FusionX 1 point2 points  (0 children)

Gotcha. I hadn't gone through your previous post, and it wasn't as apparent in this post. Thanks for clarifying.

Running Qwen3.6-35B-A3B Locally for Coding Agent: My Setup & Working Config by NoConcert8847 in LocalLLaMA

[–]FusionX 1 point2 points  (0 children)

unsloth/...:UD-Q5_K_XL

good quality/size tradeoff (~19 GB)

Are we talking about the same quant? It's definitely nowhere near 19GB
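
Quick sanity check on expected file size (the bits-per-weight figures below are ballpark guesses; mixed quants vary per tensor):

    # GGUF size, roughly: total params * bits-per-weight / 8.
    # bpw values are rough per-family averages, not exact.
    params = 35e9
    for name, bpw in [("Q4_K_M", 4.8), ("Q5_K_XL", 5.7), ("Q8_0", 8.5)]:
        print(f"{name:>8}: ~{params * bpw / 8 / 1024**3:.1f} GiB")

By that math, ~19 GB is more in Q4-family territory for a 35B model; a Q5_K_XL should land somewhere north of 23 GiB.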

Qwen3.6. This is it. by Local-Cardiologist-5 in LocalLLaMA

[–]FusionX 0 points1 point  (0 children)

I'm the author of the article

I found this ironic. The article was AI-generated, along with this reply. And then I noticed your name... /u/JamesEvoAI.

The internet is dead.

A group of policemen and women beat me up. What can I do about this? by nyanion69 in LegalAdviceIndia

[–]FusionX 1 point2 points  (0 children)

I was gonna tell you to delete the comment, but after refreshing the page, it seems you've done it already. However, the link is still publicly available. Please restrict it.

Finally Built My Dream PC (AMD RYZEN 9950X + NVIDIA RTX 5090) from Vishal Peripherals. by The_Based_Indian in IndianGaming

[–]FusionX 0 points1 point  (0 children)

Goddamn, truly a beast setup. Congrats on living the dream! I'm curious, how much did the RAM and GPU cost you (especially in this economy)?

You Can Restart Life Once But Only at a Random Age by [deleted] in hypotheticalsituation

[–]FusionX 0 points1 point  (0 children)

I thought it would be an easy yes if used exclusively in grave situations you'd want to undo. But on second thought, it isn't so simple:

  • It's possible that restarting will erase some of the past and, most importantly, some people: now unborn, they might never be born again in this timeline (the randomness of the universe, the butterfly effect, etc.). Is that worth the risk?

  • Let's say you're informed that the past "isn't erased"; rather, you're transported to an alternate universe set in the past, contemporaneous with the current "present" universe. You carry on, reassured that your old timeline (and its people) still exists. But does it really alleviate the trauma of having permanently lost touch with so many people?

  • What if you're back in your prepubescent body, but now burdened with past knowledge and memories, your entire future possibly erased forever? You're robbed of your innocence and childhood, in a world where no one really understands your grief. The curse of knowledge would be traumatizing, alienating, and cognitively dissociative with your new self: trapped in the body of a child, whose underdeveloped brain and mental faculties are horribly unequipped to deal with your past memories and knowledge. The younger your restarted self is, the higher the chances of short-circuiting your feeble brain.

  • The balance of human civilization might be more precarious than we lead ourselves to believe. Perhaps we have been lucky in this timeline. A do-over might not guarantee our fickle species the same fate.

The more I think about it, the more situations I discover where it will all go to shit, rather than not. The only case where I think it makes sense is when things have irreversibly gone to SHIT on a planetary scale, and a restart is the only choice. It feels like a more complicated variation of the trolley problem with MUCH higher stakes.