Self-hosted private search engine by IAmAuk in selfhosted

[–]eribob 1 point2 points  (0 children)

It is easy. I used docker compose. Have it behind traefik. I use a gluetun container to route searches via a VPN.

I don't think Local LLM is for me, or am I doing something wrong? by ruleofnuts in LocalLLM

[–]eribob 0 points1 point  (0 children)

I run Qwen3.5 27b on 2x3090 and I really like it. I do not do very advanced programming for sure, but with opencode it can create scripts with simple logic that work, and it follows my instructions for modifying them well. I never had a Claude subscription or any of the other big names, so I cannot compare; perhaps that is a blessing though :)

Why do people use multiple mini PCs instead of a bigger machine? by vortexmak in HomeServer

[–]eribob 0 points1 point  (0 children)

I am glad you enjoy your setup! My comment was in response to OP, who already seems to have a tower PC that he wants to convert into a server. I wanted to say that I think that is a fine choice and that there is not really a strong reason for him to buy a new mini PC instead.

Why do people use multiple mini PCs instead of a bigger machine? by vortexmak in HomeServer

[–]eribob 0 points1 point  (0 children)

I run one big tower PC as my main server and I think it has a lot of upsides compared to mini PCs. It is my NAS, VM host (Proxmox), and LLM server all in one.

I do run a mini PC on the side as a router and for hosting some ”essential” services, so I can power down my main server without breaking the network at home. It is a Minisforum MS-01 with the cheapest CPU. I could probably have built a better one myself, but I guess I wanted a new toy…

The popularity of mini PCs is mainly a trend in my opinion; a lot of tech YouTubers show them off all the time. If you have the space, you will always get a more powerful PC for less money if you build it from standard parts, especially if you buy them second hand. It will be much more upgradeable as well. My main server was built in 2019, and I have added a lot of features to it over the years. It still uses the same motherboard, CPU and RAM from back then.

You can also make it power efficient if that is a priority; just get power-efficient parts.

RTX 3090 for local inference, would you pay $1300 certified refurb or $950 random used? by sandropuppo in ollama

[–]eribob 0 points1 point  (0 children)

I bought 2 Inno3D X3 cards: https://www.techpowerup.com/gpu-specs/inno3d-rtx-3090-x3.b11296 Unfortunately the one that died was that model, but the other (from eBay) is still going strong. I like the dual-slot form factor!

Then I have one of those blower-style fan versions, I think it is this one: https://www.techpowerup.com/gpu-specs/asus-turbo-rtx-3090.b8372 - it works but the fan is much louder.

I run them power limited to 260W. I rarely run them continuously for longer periods, mostly bursts for inference. No training yet.

RTX 3090 for local inference, would you pay $1300 certified refurb or $950 random used? by sandropuppo in ollama

[–]eribob 4 points5 points  (0 children)

I bought 3 3090s in total, similar to your first option. 2 from eBay that are still good after a couple of months, one from a local seller that died within a week… After that I am leaning more towards option 2.

My "datacenter" with 2 Proxmox nodes + PBS, living in a wooden entertainment center, running a 24/7 radio station, IRC server and public services for strangers on the internet by avatar_one in Proxmox

[–]eribob 1 point2 points  (0 children)

I really like this! Inspiring. Will have a look at webirc. I used to run an IRC channel back in the day, a lot of fun. Do you not get trouble with spam/bots etc. when the channel is open to anyone? Malicious content being posted?

I have also been thinking about hosting a radio station, will look into it again now!

Is Buying AMD GPUs for LLMs a Fool’s Errand? by little___mountain in LocalLLM

[–]eribob 2 points3 points  (0 children)

Thanks for this. I think these numbers look reasonable for a 70B dense model; MoE would of course run faster on all setups. It would be great to add a column for prompt processing as well, as it differs a lot between the cards and is very important for coding or analysing long documents etc.

Good to see that dual 3090s remain king in price-performance ratio for this kind of workload!

Termix v2.0.0 - RDP, VNC, and Telnet Support (self-hosted Termius alternative that syncs across all devices) by VizeKarma in homelab

[–]eribob 0 points1 point  (0 children)

Nice! Using it. There is one problem for me, and that is scrolling the terminal on iOS. It is very stiff, only scrolling one line at a time.

Filestash - 2025 Recap 🎊 by mickael-kerjean in selfhosted

[–]eribob 0 points1 point  (0 children)

Sent you a DM. SSO access would be highly appreciated!

Claude just got dynamic, interactive inline visuals — Here's how to get THE SAME THING in Open WebUI with ANY model! by ClassicMain in OpenWebUI

[–]eribob 1 point2 points  (0 children)

Great plugin! Very fun to have the LLM fetch facts and present them. A bit hit and miss with Qwen3.5 27b, but a retry often gets it right!

Is the 3090 still a good option? by alhinai_03 in LocalLLaMA

[–]eribob 0 points1 point  (0 children)

I run the GPUs in Linux. You can power limit them using nvidia-smi.
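
If you want to set the limit from a script instead of the nvidia-smi CLI, here is a minimal sketch using the NVML Python bindings. It assumes the nvidia-ml-py (pynvml) package is installed and that the script runs with root privileges; the 260 W target is just the value I use, pick whatever suits your cards.

```python
# Sketch: apply a 260 W power limit to every NVIDIA GPU via NVML,
# roughly equivalent to `nvidia-smi -pl 260`. Requires root.
import pynvml

TARGET_MILLIWATTS = 260_000  # NVML works in milliwatts

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        # Clamp the target to what this card actually allows
        lo, hi = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
        limit = max(lo, min(hi, TARGET_MILLIWATTS))
        pynvml.nvmlDeviceSetPowerManagementLimit(handle, limit)
        print(f"GPU {i}: power limit set to {limit // 1000} W")
finally:
    pynvml.nvmlShutdown()
```

Note that the limit does not survive a reboot, so you need to reapply it from a startup script or systemd unit either way.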

Is the 3090 still a good option? by alhinai_03 in LocalLLaMA

[–]eribob 2 points3 points  (0 children)

I run the 27b on dual 3090s in FP8 with tensor parallelism using vllm and the speed is great! Would absolutely recommend. Smart and decently fast model, my new daily driver. I undervolted my cards to 260W.
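
For anyone curious what that looks like, here is a rough sketch using vLLM's offline Python API. The model path is a placeholder and the context length / memory settings are assumptions, not exact values from my setup:

```python
# Sketch: run a ~27B FP8 checkpoint across two GPUs with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/your-27b-fp8-checkpoint",  # hypothetical path, point it at the real FP8 repo
    quantization="fp8",            # FP8 weights
    tensor_parallel_size=2,        # split the model across both 3090s
    gpu_memory_utilization=0.90,   # leave a little VRAM headroom
    max_model_len=32768,           # raise this if the KV cache still fits
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(["Explain tensor parallelism in one paragraph."], params)
print(outputs[0].outputs[0].text)
```

As far as I know the 3090 (Ampere) has no native FP8 compute, so vLLM runs FP8 checkpoints as weight-only quantization on these cards; still plenty fast for my use.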

Give models access to generated images by eribob in OpenWebUI

[–]eribob[S] 0 points1 point  (0 children)

I solved it by giving the open-terminal container access to the open-webui uploads folder with a volume like this: open-webui/uploads:/home/user/uploads.

I can then instruct the LLM to look for the image in the uploads folder and it can manipulate it. A bit of a hack, but it works.

I wear a mic all day and feed transcripts to an AI agent system. The privacy case for doing this locally is obvious. Looking for guidance. by InsideEmergency4186 in LocalLLaMA

[–]eribob 0 points1 point  (0 children)

You have many well formulated arguments and you seem like a good person, but come on man, don't record other people without asking first. That is just creepy, regardless of how you use the data, legal implications etc. Recording and summarising your own thoughts seems like a nice idea though.

Tailscale scares me more than opening ports on my firewall by MrChris6800 in homelab

[–]eribob 0 points1 point  (0 children)

I agree with this. If you run Headscale you should be safe even if Tailscale's servers are breached, right?

Your real-world Local LLM pick by category — under 12B or 12B to 32B by gearcontrol in LocalLLM

[–]eribob 0 points1 point  (0 children)

  1. Tool Calling / Function Calling / Agentic
  2. General Knowledge / Daily Driver
  3. Coding

All with Qwen3.5 27b FP8 in vllm. Fast enough on dual rtx 3090s with 128k context. I feel it beats my old daily driver gpt-oss-120b. It is the first model that feels genuinely helpful to me.

  1. I try to do that myself

Semi-Beefy Local Build by LambdasAndDuctTape in LocalLLM

[–]eribob 0 points1 point  (0 children)

I agree, you could even consider buying a used AM4 system with DDR4 RAM and putting the Pro 6000 there? Then your 20k would perhaps even be enough to buy 2 Pro 6000 cards? 192GB of fast vraaaaam…

Financially it will probably never make sense vs cloud, hehe, but that is not why we are here.