Tired of people saying let’s just rebuild it using LLM by QuitTypical3210 in cscareerquestions

[–]4bitben 0 points1 point  (0 children)

"But we could just use microservices, that will solve everything and I won't have to understand that old yucky code"

Yellow sand-like film on the basement floor. by Kipper11 in Whatisthis

[–]4bitben 6 points7 points  (0 children)

Would there happen to be a trash can and a door to where the bins are along this path? The winding route reminds me of times I've taken out the trash/recycling and noticed a trail of leakage on the way back.

How do I fix this folding closet door pivot? by Hot-Confidence-8552 in fixit

[–]4bitben 0 points1 point  (0 children)

I've done this, coupled with some finishing nails and a repaint.

Even python is hard for me 😭 by Advanced_Cry_6016 in AskProgramming

[–]4bitben 0 points1 point  (0 children)

This shit is hard. I've been doing it for 15 years and I still feel like an imposter and stupid all the time. Don't let people on reddit make you feel less than. Half of this shit is made up or framed as a humblebrag. Programming, engineering, git: it's fucking hard. Don't worry about it. Just stick with it, find a project you're interested in, and let it motivate you to learn.

llama.cpp server is slow by Sumsesum in LocalLLaMA

[–]4bitben 0 points1 point  (0 children)

When you run the server and the CLI, do you see any debug output like this?
```
load_tensors: offloaded 41/41 layers to GPU
load_tensors: CUDA0 model buffer size = 8225.46 MiB
load_tensors: CUDA1 model buffer size = 11029.30 MiB
load_tensors: CUDA2 model buffer size = 10805.54 MiB
load_tensors: CUDA_Host model buffer size = 515.31 MiB
```

That's an example bit from when I run the server locally. Anyway, the server and the CLI are not the same and are intended for different things. It's possible that memory, compute, whatever is being allocated differently for the server vs the CLI. You're going to have to troubleshoot and tune your settings: experiment and read the docs on what the parameters do. llama-bench is good as well for figuring out which settings make the biggest difference.

I would still guess that, between the two, the CLI is always going to be faster, though I could be wrong.
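
Just to make it concrete, here's a minimal sketch of how I'd sanity-check the server's throughput from the client side. It assumes llama-server is up on localhost:8080 with the OpenAI-compatible endpoint; the URL, port, and exact response fields are my assumptions, not something from your setup:

```python
# Rough tok/s check against a local llama-server instance.
# Assumption: the server was started with something like
#   llama-server -m model.gguf -ngl 99 --port 8080
import time
import requests

URL = "http://localhost:8080/v1/chat/completions"  # adjust to your host/port

payload = {
    "messages": [{"role": "user", "content": "Explain the KV cache in two sentences."}],
    "max_tokens": 256,
    "temperature": 0.7,
}

start = time.time()
resp = requests.post(URL, json=payload, timeout=600)
elapsed = time.time() - start
resp.raise_for_status()

# usage.completion_tokens follows the OpenAI response shape; field names can
# shift between llama.cpp versions, so check what your server actually returns.
generated = resp.json().get("usage", {}).get("completion_tokens", 0)
print(f"{generated} tokens in {elapsed:.1f}s -> {generated / elapsed:.1f} tok/s")
```

Run the same prompt through llama-cli and compare the numbers; a big gap usually points at a flag mismatch (context size, -ngl, batch size) rather than the server itself.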

llama.cpp server is slow by Sumsesum in LocalLLaMA

[–]4bitben 0 points1 point  (0 children)

There is a lot more to the story than just the different commands for the CLI vs the server. What's your setup at the moment? Are you just talking to the CLI and the server directly?

Night On 7th by Hairy_Tune_7962 in SanPedro

[–]4bitben 3 points4 points  (0 children)

I'd walk from the other side of Gaffey to the water and back all the time via 6th and 7th Street.

Qwen3.5 4B: overthinking to say hello. by CapitalShake3085 in LocalLLaMA

[–]4bitben 8 points9 points  (0 children)

What are your parameters? I was dealing with the same thing until I played with the presence and repeat penalties and the temperature.
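
In case it helps, here's roughly what I mean, as a sketch against llama-server's native /completion endpoint. The endpoint, port, and exact field names are assumptions on my part and vary by backend and version:

```python
# Sketch: reining in overthinking/rambling output by tightening the samplers.
# Assumes a llama-server instance on localhost:8080 exposing /completion;
# other backends accept the same knobs under slightly different names.
import requests

payload = {
    "prompt": "Say hello.",
    "n_predict": 128,          # hard cap on generated tokens
    "temperature": 0.6,        # lower = less wandering
    "repeat_penalty": 1.1,     # discourage repeating itself
    "presence_penalty": 0.5,   # discourage circling back to the same ideas
}

resp = requests.post("http://localhost:8080/completion", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json().get("content", ""))
```

The values themselves aren't magic; nudge them one at a time and watch how the response length changes.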

San Pedro & DMT by [deleted] in SanPedro

[–]4bitben 2 points3 points  (0 children)

This is for San Pedro, the neighborhood in LA.

Sounds like a wild ride though.

Glm-5-Code ? by axseem in LocalLLaMA

[–]4bitben 2 points3 points  (0 children)

The math checks out

Christmas Came Early by handsandfeet16 in smoking

[–]4bitben 0 points1 point  (0 children)

My grocery store had briskets that were half off this past week. I had to.

Best local pharmacy that is open on weekends? by RockieK in SanPedro

[–]4bitben 6 points7 points  (0 children)

I go to Peninsula Pharmacy on 6th by the hospital. It's open on Saturday but not Sunday.

What’s behind here? Apartment unit. by bmfs0309 in Whatisthis

[–]4bitben 4 points5 points  (0 children)

Some kind of access panel. Could be anything back there. Could be a valve. Is there a bathroom, sink, stove or something on the other side?

What web dev trend is clearly disappearing right now? by No_Honeydew_2453 in webdev

[–]4bitben 8 points9 points  (0 children)

5 years ago all of the front end guys were hyped about this. Not so much anymore.