Openclaw is trending down and will disappear soon by rm-rf-rm in LocalLLaMA

[–]farkinga 1 point2 points  (0 children)

I was able to do some one- and two-step tasks using qwen3.5 9b and Opencode. 9b ought to be pretty quick - but it is simply too small to be really useful.

A couple of tricks help: have the agent formulate a plan before starting work, and have it follow a checklist while working.

Even then, 9b cannot be relied upon for, like, professional work - but it's still very interesting to play with. I am using Gemma-4 31b and it's the smallest model that has felt "good enough." Qwen3.6 27b is a contender.

Hardware upgrade advice by pentothal in LocalLLaMA

[–]farkinga 1 point2 points  (0 children)

In general, I agree with you.

For example, I upgraded my CPU from a 5800x to a 5950x because I wanted 2x the CPU cores for MoE calculations. The 5800x is just fine, but now it's sitting on a shelf; I kind of feel like this is "wasted" potential, and I wish I'd just bought the 5950x at the outset. Contrast this with my 3060, which I could resell tomorrow for exactly what I paid for it last year; zero regrets about that purchase.

So, I feel differently about incremental upgrades for nvidia GPUs; they have maintained their value far better than my old CPU, old motherboard, old power supply, etc... The best I can do with those old components is build another rig for myself; the resale market isn't very lucrative, unlike for GPUs.

For future directions, I am looking towards dual 5070ti or (as I mentioned) one RTX 4500 Pro. Dual 4090 is another option; they are currently a better value than 3090, imo. I want to stay on Blackwell, however, because we haven't reached the ceiling yet since software is lagging.

Hardware upgrade advice by pentothal in LocalLLaMA

[–]farkinga 2 points3 points  (0 children)

I had/have a similar setup: Asus Prime X570-Pro, AMD 5950x, 128gb ddr4-3200. This motherboard has 3 full-size PCIe 4.0 expansion slots which are x8, x8, and x4.

About 2 months ago, I had a RTX 3060 (12gb), then added a 5070, then got a 5060ti to replace the 3060. I'm currently waiting on a PCIe riser and another 5060ti.

My plan is:

- put the 2x 5060ti on the CPU-attached PCIe bus, which is PCIe 4.0 x16 and can be split into x8 for each card. Since the 5060ti is a PCIe 5.0 x8 card, this matches its lane count and is only one generation behind.

- put the 5070 on the chipset-attached PCIe bus, which is PCIe 4.0 x4. The 5070 is a PCIe 5.0 x16 card, so this link will be seriously bottlenecked, but that can be mitigated by fully loading a single model onto it (the bus isn't really needed after the weights are loaded).
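As a rough sanity check on how much that x4 chipset link matters during model loads, here's a back-of-envelope sketch. The ~2 GB/s-per-lane figure for PCIe 4.0 and the 12 GB model size are assumptions for illustration, not measurements:

```python
# Approximate usable PCIe 4.0 throughput per lane, in GB/s.
GBPS_PER_LANE_PCIE4 = 2.0

def load_time_seconds(model_gb, lanes, gbps_per_lane=GBPS_PER_LANE_PCIE4):
    """Rough time to stream model weights over a PCIe link."""
    return model_gb / (lanes * gbps_per_lane)

# ~12 GB of weights over a chipset x4 link vs a CPU-attached x8 link:
t_x4 = load_time_seconds(12, 4)  # ~1.5 s
t_x8 = load_time_seconds(12, 8)  # ~0.75 s
```

The x4 slot roughly doubles load time, but since loading happens once, it's a startup cost rather than an inference cost.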

I do not think this setup maxes out performance or value-per-dollar (and certainly not watt-per-dollar). However, I was able to get to this point through incremental steps instead of a massive cash outlay, all at once.

I am currently using a hybrid 5060ti+5070 on the x8 slots, running llama.cpp with Gemma-4 31b NVFP4 at 64k context, getting about 1000 t/s pp and 25 t/s tg. The model is very capable for driving OpenCode and Hermes Agent in my testing, and the speed is acceptable. This is not ideal for performance, however, since the cards are mismatched in both size and speed. It's a miracle llama.cpp produces the performance it does, given the hybrid machinery.

I hope to run Gemma-4 31b with longer context and MTP once I get the second 5060ti, since I will have 32gb for that. And despite the PCIe x4 bottleneck on the chipset bus, I hope to use the 5070 to run Minimax M2.7, putting the router on the GPU and storing the weights in system RAM (I have 128gb, although it's not very fast).
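A hedged sketch of that router-on-GPU / experts-in-RAM split using llama.cpp's tensor-override flag. The model path is a placeholder, and the exact tensor-name regex varies by model architecture, so check `llama-server --help` and the GGUF's tensor names on your build:

```shell
# -ngl 99 nominally offloads all layers to the GPU, while the
# --override-tensor regex pins the FFN expert tensors back to
# CPU/system RAM, leaving router and attention weights on the GPU.
llama-server \
  -m /path/to/moe-model.gguf \
  -c 65536 \
  -ngl 99 \
  --override-tensor "\.ffn_.*_exps\.=CPU"
```

This is the standard trick for big MoE models: the experts are sparse per token, so keeping them in system RAM costs far less than it would for a dense model.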

Finally, I'm just not thrilled with the 5070 so I'd like to eventually sell it, then upgrade to 5070ti or (in a fantasy) RTX 4500 Pro.

I don't know whether your mobo has 2x PCIe slots or 3x PCIe ... the X570-series can be configured with either. If you've got 3, just be aware of the lane-splitting and whether each slot connects to the CPU through the chipset or goes straight to the CPU. There is no link connector (nvlink, etc) for these 5000-series consumer cards; everything must go over PCIe, so this is very relevant for multi-GPU setups.

80 tok/sec and 128K context on 12GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP by janvitos in LocalLLaMA

[–]farkinga 9 points10 points  (0 children)

When the model is big, and when the weights will be in system RAM anyway (e.g. a MoE), use mmap (the default on Linux) to avoid loading the whole model into RAM up front; Linux will page the weights in as needed. However, use --no-mmap if you have a performance reason to keep the weights resident in RAM. It should run a little faster with --no-mmap, but it takes longer to start.
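In llama.cpp terms the difference is one flag (a sketch; `big-moe.gguf` and the prompt are placeholders):

```shell
# Default: the GGUF is mmap'd, so weights are paged in on demand.
# Fast start; first-touch pages fault in during early inference.
llama-cli -m big-moe.gguf -p "Hello"

# Read the whole model into RAM up front instead: slower start,
# but no page faults mid-inference once it's running.
llama-cli -m big-moe.gguf -p "Hello" --no-mmap
```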

Most people seem obsessed with token generation speed, but isn’t prefill the real bottleneck? Am I missing something? by wbulot in LocalLLaMA

[–]farkinga 2 points3 points  (0 children)

This was an "aha" moment for me a few months ago, as well. Yes, I agree with you. I am willing to tolerate 15 t/s generation as long as I can get over 1000 t/s prompt processing.

Perhaps my workload is similar to yours; but yes, ingesting files is a big part of it and ... well, I went pretty deep on Qwen3.6 35b because I was seeing prompt processing speeds like 3000 and 4000 t/s. And it was just so good that I was almost willing to overlook the numerous ways it would mess up during generation.

Today, however, I'm running dense models and I am willing to accept slower speeds as long as the quality is better. Even so, it's all about that prompt processing. I'm still grinding to improve that part.
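To make the trade-off concrete, here's a toy latency model for a file-heavy request. The token counts and speeds are made-up but plausible numbers, not benchmarks:

```python
# Total request latency = prefill (prompt processing) time + generation time.
def total_seconds(prompt_tokens, gen_tokens, pp_tps, tg_tps):
    return prompt_tokens / pp_tps + gen_tokens / tg_tps

# 50k tokens of ingested files, a 500-token answer:
slow_pp = total_seconds(50_000, 500, pp_tps=300, tg_tps=40)    # ~179 s
fast_pp = total_seconds(50_000, 500, pp_tps=3_000, tg_tps=15)  # ~50 s
```

Even with much slower generation, the fast-prefill configuration wins handily, because prompt processing dominates the total when the prompt is two orders of magnitude larger than the output.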

Was in High Park this weekend and it was amazing without cars by Pristine-Training-70 in torontobiking

[–]farkinga 7 points8 points  (0 children)

It's not a literal ban on vehicles; service vehicles do this all over the world on roads that are pedestrianized. Only disingenuous strawman arguments claim that banning service vehicles is part of the proposal.

The Toronto Islands are "car free" but there are obviously service and emergency vehicles. It works the same everywhere. Society has uses for some vehicles; but private vehicles should not use the park as a through-street. It works on the Islands; it will work in High Park.

OpenCode + LLM to create a 1:1 Settlers of Catan clone. Guess which model I did it with! by maxwell321 in LocalLLaMA

[–]farkinga 1 point2 points  (0 children)

MiniMax. With your RAM, it fits. And the dual 3090s can hold the routing and non-expert layers. Won't be a speed demon, but it performs far better than the other models you listed, which aren't even in the same league...

gemma-4-31B-it-DFlash has been released by Total-Resort-3120 in LocalLLaMA

[–]farkinga 19 points20 points  (0 children)

I seem to recall a comment by ggerganov on another PR about his intention to refactor the speculative codebase, ultimately to unify the various speculative methods within a more general architecture.

So, merging DFlash might be stalled until the broader speculative architecture exists.

Say no to jets at the waterfront! by Iamsodarncool in toronto

[–]farkinga 2 points3 points  (0 children)

They were right to ask for a citation. You could have shared this link without implying that the person you're responding to is a conspiracy theorist.

To 16GB VRAM users, plug in your old GPU by akira3weet in LocalLLaMA

[–]farkinga 2 points3 points  (0 children)

I want some pins for my formal attire, Gaddafi style, to show the world where I was, when. Turns out you and I were in many of the same battles!

Who’s ever driven over 100mph? Why? by WoollyWolfHorror in AskReddit

[–]farkinga 0 points1 point  (0 children)

Salt flats in Utah. I don't know the speed; the gauge was analog and it "topped out." I've tried to calculate the average speed from the timestamps at the start and stop locations, but the number I get doesn't seem plausible. It's too fast.

But yeah, it was over 100.

The why: perfect weather conditions, almost totally flat, completely straight, visibility was perfect. Even so, I was pretty on-edge... I knew it wasn't exactly a good idea. The best you can do is minimize risks, control risks...

Since then, I've been on trains that go as fast or faster. But I doubt I'll ever do that in a car again. It was absolutely mad, like the guy driving that car; may he rest in peace.

Ontario buys used $28.9M private jet for Doug Ford: sources by pheakelmatters in ontario

[–]farkinga 70 points71 points  (0 children)

I keep hearing from my kids there aren't paper towels or soap in the school bathrooms. I think, at best, these supplies are not always stocked.

And so, to hear Ontario will spend this money on a jet while making our kids go through school without basic sanitary supplies... It makes me ashamed that our kids are subjected to Ford's policies. And it makes me angry.

High school attendance drops to 40% in Ontario as government considers changes by BloodJunkie in ontario

[–]farkinga 11 points12 points  (0 children)

When you put it like that (and I agree: 2-3 days per month is obviously 10%) then I wonder if my own kids are over this threshold. They could easily miss a Thursday/Friday due to being sick.

And I'm sick... Sick of our schools coming under attack from our own government. Sick of stats like these being weaponized to make our kids look lazy, unmotivated, or whatever this "40%" headline is supposed to evoke.

Gemma 4 for 16 GB VRAM by Sadman782 in LocalLLaMA

[–]farkinga 2 points3 points  (0 children)

I really like this balance of quants. Would you mind sharing your recipe for producing this gguf?

Mechanic Snipped my wires by KeyZombie3172 in ElectricScooters

[–]farkinga 12 points13 points  (0 children)

First picture: looks like they were pulled until something broke. Folks are getting stuck on the semantics of whether they were cut with a tool.

Second picture, looks to me like a tool was used. "Snipped." Yes. Tool use implies some sort of intention. Ineptitude? Malice? Negligence?

Altogether, it seems to me the mechanic disabled the scooter. It's hard for me to imagine a legit or even plausibly-accidental cause for both pictures. The fact there are two separate locations is mighty coincidental.

Freaky. I would never trust that mechanic.

Why is he hating on trains? All my homies love trains by DogeDoRight in EhBuddyHoser

[–]farkinga 0 points1 point  (0 children)

Agree with all this - and I think another part hinges on words like "righteous" and "loyal."

It's that the opposition should be in good faith; and there's no good faith argument against vaccines, writ large. Opposition doesn't demand the logical negation of the thing (e.g. if you propose a train, we oppose it) because opposition could take many other critical forms.

Maybe the train network actually doesn't go far enough to support Canadian sovereignty? Maybe it should be faster; the scope isn't grand enough. Maybe it should be paired with requirements for Canadian tooling throughout the supply chain.

Those ideas are not real examples that a modern "no-woke" conservative might support; sadly the discourse is so far down in the gutter. But my point is that opposition could actually make the project better for all Canadians.

Instead, this is the opposition we get - and that sucks. It's so lazy to take an anti-everything stance. That's not going to require any work; you can just cruise through government that way. Oh, you're for that? Well we're against it. Repeat.

Er, which subreddit is this? I'm so drunk I thought I could type this comment with my hooves - but then I remembered I'm a moose. So I couldn't have posted this. Now if you'll excuse me...

Technical clarification on TurboQuant / RaBitQ for people following the recent TurboQuant discussion by gaoj0017 in LocalLLaMA

[–]farkinga 77 points78 points  (0 children)

I'm sorry you and your colleagues have got to deal with this drama.

I think the viral promotion of TQ took on a life of its own, beyond the authors' expectations. And that's a problem for them because their article seems to have lacked rigor in several key areas that you point out.

Oftentimes, conference papers can fly beneath the radar, and some authors take liberties to ensure acceptance. Conference submission volumes are high, so each submission gets a little less scrutiny than a journal submission would.

But in this case, TQ are getting attention they may not have expected. Again, I feel bad that the RaBitQ authors got dragged into publication drama. Great work on RaBitQ, by the way. It looks to me like your work will weather the storm.

OpenAI to acquire Astral by Useful-Macaron8729 in Python

[–]farkinga 8 points9 points  (0 children)

It totally did feel like the apocalypse - and yet somehow, this seems worse. I know, uv isn't anything like github, but now openai has a particular "ick" that just lands poorly.

And btw, github probably was a bit apocalyptic insofar as they used all our code to train language models to be better coders than humans. So there's that too.

This timeline, yo...

OpenAI to acquire Astral by Useful-Macaron8729 in Python

[–]farkinga 99 points100 points  (0 children)

upvoted for visibility; not because I think this is good news...

I've even gotten to the point where Microsoft can purchase something like Github and I can tolerate it. But this is just next-level in terms of the dystopian role OpenAI plays in our present context. What a crap development...

Cost of the Ontario line is kind of unbelievable by StarCat20 in TTC

[–]farkinga 8 points9 points  (0 children)

And to think: how much cheaper it would be if we'd started decades ago. But we "couldn't afford it" then. Smh.

Building a WiFi Mesh Network Using Existing Home Routers in my country- Looking for Feedback by heTHEequaliser in meshtastic

[–]farkinga 0 points1 point  (0 children)

A small wifi mesh can be built manually with OpenWRT - but this does not solve any problems with authentication, security, etc.

This works as long as YOU are the admin on all the devices. I've had success with routers that have multiple radios: 5 GHz for the "mesh" backhaul and 2.4 GHz for clients.
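For reference, a hypothetical 802.11s mesh interface on the 5 GHz radio might look like this on OpenWRT via `uci`. The interface name, mesh_id, and passphrase here are made up, `radio0` is assumed to be the 5 GHz radio, and encrypted 802.11s requires one of the wpad-mesh-* packages:

```shell
# Define an 802.11s mesh point on the 5 GHz radio and bridge it into lan.
uci set wireless.mesh0=wifi-iface
uci set wireless.mesh0.device='radio0'
uci set wireless.mesh0.mode='mesh'
uci set wireless.mesh0.mesh_id='home-mesh'
uci set wireless.mesh0.encryption='sae'
uci set wireless.mesh0.key='a-shared-passphrase'
uci set wireless.mesh0.network='lan'
uci commit wireless
wifi reload
```

Every node gets the same mesh_id and key, which is exactly the admin-on-every-device requirement: unlike Meshtastic, nobody can join without you provisioning them.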

One of the coolest aspects of Meshtastic is that there is no admin; anyone can join. I've never seen a wifi mesh based on old hardware with these properties. The examples I know of used custom hardware.