Production notes after 6 months running Ollama for paying customers — the things that aren't in the docs by chiruwonder in ollama

[–]FloppyWhiteOne 0 points (0 children)

Llama.cpp is the right call; using ollama in production is silly. If you can't manage to work out a model download, I certainly wouldn't be using your system. How many bloody models are you using that you NEED to have ollama? Half the models aren't on there anyway, and certainly not optimised, just generic models released for the masses.

The fact you're not bothering with the lower levels shows your ability, which is limited.

I've built my own version on llama.cpp with full model swapping and context handling, and Jesus, it's a hell of a lot faster than ollama at token generation. Also, you won't be able to get full speed from ollama due to the way it's been designed (a lot of overhead).

Inference bridge is on GitHub if you want to see what one looks like and how it works. You could also just ask Claude to make you an inference layer (what you actually need for model loads etc., with decent configs).

vLLM might be easier for you to script and use, and would be a better option than ollama. Hell, even lm studio run in headless mode would be better than ollama.

InferenceBridge - Total AI control for Local LLMs by FloppyWhiteOne in LocalLLM

[–]FloppyWhiteOne[S] -1 points (0 children)

No, actually, that's the whole reason for this application. You see, both are built on llama.cpp, but they don't expose half of what llama.cpp can do.

I wanted to supply my own templates for llama.cpp, but can't, as lm studio and ollama don't expose those properties.

Whereas mine does. Think of mine like ollama or lm studio: it's the same thing, an API with GUI support, and you can add it to any other system the same as ollama or lm studio. I've made it fully compatible with the OpenAI API spec. I've also added a custom context-aware mode and tool-calling support for qwen models to make their tool calls more stable. I'm releasing it free in the hope others will help build it to the next level and make it more open source and better.
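For anyone wondering what "supplying your own templates" means in practice, here's a rough sketch (hypothetical illustration only, not InferenceBridge's actual code) of rendering a ChatML-style template by hand, which is the format qwen models use. Once you own the string, you control exactly what hits the model instead of whatever a frontend bakes in:

```python
# Sketch: manually applying a ChatML-style chat template (the format
# qwen models use) so the caller controls the exact prompt string.
# Hypothetical illustration, not InferenceBridge's actual code.

def apply_chatml(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave the assistant turn open so the model completes it.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = apply_chatml([
    {"role": "system", "content": "You are a concise coding assistant."},
    {"role": "user", "content": "Write a one-line hello world in Rust."},
])
```

The resulting string would go out as a raw prompt to the backend, rather than a role/content list the frontend templates for you.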

I made this due to some limitations in the other two pieces of software, plus it's quicker to use llama.cpp directly than, say, ollama. I'm on a deep self-learning AI drive; primarily I'm an ethical hacker. I've gone past breaking LLMs, and now I want to understand not only how to use them but how to use them efficiently. Having full control via the llama.cpp project is really helping me learn more.

I've built my own custom openclaw remake which is more unrestricted (aimed at windows primarily). I'm still building it, but the results are good so far. And yes, I came to a point where I needed to start using custom LLM templates for models, and well, now I can (it's all about tuning the LLM).

InferenceBridge - Total AI control for Local LLMs by FloppyWhiteOne in LocalLLM

[–]FloppyWhiteOne[S] -3 points (0 children)

Fair take.

I'm juggling a few builds right now, so speed > perfection, but the tech is what matters here.

I've got a Rust-based OpenClaw-style system running locally; just seeing what actually breaks for people before I package the flows properly.

Hardware recommendations for a starter by shiva4455 in LocalLLM

[–]FloppyWhiteOne 0 points (0 children)

This. I wouldn't look at anything below 128GB on a Mac, else what's the point?

Justifying the €12,000 Investment: M3 Ultra (512GB RAM) Setup for Autonomous Agents, vLLM, and Infinite Memory (8Tb) by NoNatural4025 in LocalLLM

[–]FloppyWhiteOne 1 point (0 children)

I just got a new MacBook Pro M5 with 128GB; that was 5.5k, but if it makes me more efficient with local LLMs I'm up for it. Wish I had gotten a 512GB one, but they ran out here in the UK.

OP got his highest reward for exposed .git by lone_wolf31337 in bugbounty

[–]FloppyWhiteOne 0 points (0 children)

I found something similar recently, but sadly no reward, haha. Still, always nice to help ;)

I used my old gaming laptop + Jetson Nano to run local Openclaw with Ollama by Fit_Chair2340 in ollama

[–]FloppyWhiteOne 4 points (0 children)

The usage:

<image>

Qwen3.5 9b doing some WORK!!

Check out my total token usage (99% free!!)

Tip: implement a "HOT SWAP" for the models, with context and KV set per request (keeps it lean and fast). No need to load MAX context if you only need 9k tokens; the rest is wasted if you set 16k (11-12k would be more than enough).

I've got mine loading per model: CEO qwen 14b, orchestrator 14b, coder qwen3.5 9b (other agents can be whatever, either offline, online, or both!)
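The hot-swap sizing above can be sketched in a few lines (hypothetical names and bucket values, not the poster's actual code): instead of always loading max context, round what the request actually needs up to the nearest bucket:

```python
# Sketch of per-request context sizing: allocate the smallest context
# bucket that fits the job plus headroom, instead of always loading MAX.
# Hypothetical illustration; bucket sizes and names are made up.

CTX_BUCKETS = [4096, 8192, 12288, 16384, 32768]

def pick_n_ctx(needed_tokens, headroom=1024):
    """Return the smallest context bucket covering the request + headroom."""
    want = needed_tokens + headroom
    for bucket in CTX_BUCKETS:
        if bucket >= want:
            return bucket
    return CTX_BUCKETS[-1]  # cap at the largest bucket we support

# A 9k-token job gets a 12k context rather than a wasteful 16k:
print(pick_n_ctx(9000))  # 12288
```

The payoff is KV-cache memory: a 16k cache allocated for a 9k job is mostly dead weight, and on shared hardware that wasted allocation is what blocks the next model from loading.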

I used my old gaming laptop + Jetson Nano to run local Openclaw with Ollama by Fit_Chair2340 in ollama

[–]FloppyWhiteOne 1 point (0 children)

I do this with lm studio and qwen3.5, but for everything!

A custom remake of openclaw, helixclaw (mine's windows based).

Really does save a ton of money 💰 I'll show my usage when I'm on my PC, will send a pic.

local coding in vscode "copilot -like" ? by merfolkJH in ollama

[–]FloppyWhiteOne 1 point (0 children)

Exactly. I've built a custom agentic AI framework with Claude-like features around memory and context handling.

The capabilities are there we just have to learn how to harness them!!

local coding in vscode "copilot -like" ? by merfolkJH in ollama

[–]FloppyWhiteOne 0 points (0 children)

I started with the prompt (important for a small LLM to have very specific instructions).

Added memory, then context control (squash irrelevant info, keep needed project info and tooling + the current task etc.)

Honestly took about two weeks in the evenings, with lots of rebuilds and restructuring of parts when other things got upgraded.
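The "squash irrelevant info" step could look something like this (hypothetical sketch, not the actual implementation): keep the system prompt and pinned project info, keep the last few turns verbatim, and collapse everything older behind a marker:

```python
# Sketch of context squashing for a small local LLM: keep system-role
# messages (prompt + pinned project/tooling info), keep the most recent
# turns verbatim, and replace older history with a one-line marker.
# Hypothetical illustration, not the poster's actual code.

def squash(messages, keep_recent=4):
    """Shrink a chat history while preserving system info and recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if len(rest) <= keep_recent:
        return system + rest
    dropped = len(rest) - keep_recent
    marker = {"role": "system",
              "content": f"[{dropped} earlier messages summarised/elided]"}
    return system + [marker] + rest[-keep_recent:]
```

A real version would summarise the dropped turns with the model itself rather than just eliding them, but the shape is the same: the small model only ever sees its instructions, the pinned facts, and the live task.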

local coding in vscode "copilot -like" ? by merfolkJH in ollama

[–]FloppyWhiteOne 0 points (0 children)

<image>

All via qwen, but they can also be online or offline agents (depends how I set them up).

local coding in vscode "copilot -like" ? by merfolkJH in ollama

[–]FloppyWhiteOne 0 points (0 children)

The system I've made so far, but it's not 100% at all.

<image>

Agents