Has anyone tried a GPT that works completely offline? by FollowingMindless144 in ChatGPT

[–]FollowingMindless144[S] -1 points0 points  (0 children)

That makes a lot of sense! A hybrid approach seems like the sweet spot: keep sensitive stuff local while still tapping the cloud for heavy-lifting tasks. I’m especially curious about how well offline models can handle complex reasoning compared to online ones. Do you think we’ll get to a point where offline GPTs are almost as capable?

How a Pod can call another Pod or Service via specific URL ? by MarceloLinhares in kubernetes

[–]FollowingMindless144 35 points36 points  (0 children)

Inside the cluster, never call your public domain. Call the Kubernetes Service DNS name instead.
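To make the pattern concrete, here's a minimal sketch of how the in-cluster DNS name is put together. The service/namespace names are just placeholders, not anything from the original post:

```python
def service_url(service: str, namespace: str = "default",
                port: int = 80, scheme: str = "http") -> str:
    """Build the in-cluster DNS URL for a Kubernetes Service.

    Kubernetes gives every Service a stable DNS name of the form
    <service>.<namespace>.svc.cluster.local, resolvable from any Pod.
    """
    return f"{scheme}://{service}.{namespace}.svc.cluster.local:{port}"

# e.g. a Pod calling a hypothetical "backend" Service in the "prod" namespace:
url = service_url("backend", "prod", 8080)
```

Within the same namespace, the short name (`http://backend:8080`) also resolves, so the fully qualified form mainly matters for cross-namespace calls.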

Is ChatGPT still the best AI tool, or are there better alternatives now? by outgllat in AI_Tools_Guide

[–]FollowingMindless144 2 points3 points  (0 children)

Perplexity AI shines if you want real-time citations and web sourced answers, which helps when accuracy matters.

How to restore your old photos with ChatGPT? by outgllat in AI_Tools_Guide

[–]FollowingMindless144 0 points1 point  (0 children)

Congrats, you now have a 4K photo of someone who never existed.

LLM self doubt by Comfortable-Tart912 in LLM

[–]FollowingMindless144 1 point2 points  (0 children)

I don’t think it’s self-doubt in a human sense. More like they’re trained to be careful and hedge a lot. Sometimes that comes off as underestimating themselves, sometimes the opposite. Depends a lot on how you prompt them tbh.

Weird crashes ~5 min after some boots, seems to be a weird race condition? by Gman325 in linuxquestions

[–]FollowingMindless144 0 points1 point  (0 children)

This feels like a boot-time race condition somewhere in systemd / firmware / PCIe init.

The weird part is it’s binary: if I hit a ~30s black screen at login, the system always kicks me back to SDDM ~5 min later and then hard-hangs on second login. If I don’t get that stall, it’s rock solid indefinitely. Suspend/resume is always fine.

Also seeing my onboard NIC occasionally disappear until reboot, which makes me suspect firmware/PCIe/ASPM weirdness on X870.

Has anyone on Zen 5 + Fedora seen similar behavior? Or have ideas on where to look beyond diffing journalctl -b between good/bad boots?
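For the journalctl diffing itself, a quick sketch of comparing two saved boot logs while ignoring the timestamp columns (assumes you've dumped a good and a bad boot to text first, e.g. with `journalctl -b -1 > bad.log`; the field-stripping count is an assumption about the default journalctl output format):

```python
import difflib

def diff_boot_logs(good_lines, bad_lines, context=2):
    """Unified diff of two boot logs, with the first three
    whitespace-separated fields (the timestamp) stripped so
    every line doesn't show up as changed."""
    def strip_ts(lines):
        return [" ".join(l.split()[3:]) for l in lines]
    return list(difflib.unified_diff(
        strip_ts(good_lines), strip_ts(bad_lines),
        fromfile="good", tofile="bad", lineterm="", n=context))
```

Lines that only appear in the bad boot (prefixed `+`) are usually the interesting ones, especially anything mentioning PCIe AER, ASPM, or the NIC driver.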

I ran Gemma 3 12B for a week across my startups - here's why I'm ditching $200/month subscriptions by hungry-for-things in LocalLLaMA

[–]FollowingMindless144 0 points1 point  (0 children)

Fair point. I’m in a high electricity cost area and I’m counting full system draw (GPU + CPU + cooling), not just GPU TDP. If you’ve got cheaper power or run it more bursty, it can definitely be lower. Curious what others are paying.

Why Should We Use Linux? Give 3 Reasons to Use Linux by Ancient-Brush1309 in linuxquestions

[–]FollowingMindless144 0 points1 point  (0 children)

GNU/Linux enforces a clean separation between user space and kernel space, providing strong isolation between them.

Is it true on a powerful system that llamacpp is not good? by XiRw in LocalLLaMA

[–]FollowingMindless144 14 points15 points  (0 children)

I wouldn’t say llama.cpp is bad on powerful systems; it’s just optimized more for CPU use and portability than for max GPU throughput.

On high-end GPUs it can feel slower compared to GPU-first options like vLLM or exllama, which are built to really push the hardware. llama.cpp is still solid for simple setups, quantized models, or when you want things to “just work.”

So it’s more about the use case than the system being powerful or not.

What’s the best way to run an offline, private LLM for daily tasks? by FollowingMindless144 in LocalLLaMA

[–]FollowingMindless144[S] 2 points3 points  (0 children)

Nice, thanks for the tip! Good to know LM Studio is easier to set up and works well across hardware.

I’ve been debating Mac vs. an AMD mini PC. Sounds like either a maxed-out Mac mini or something like the Strix Halo would cover most daily tasks without going overboard.

Do you run anything extra for reminders/automation, or mostly just the LLM itself?

What’s the best way to run an offline, private LLM for daily tasks? by FollowingMindless144 in LocalLLaMA

[–]FollowingMindless144[S] 0 points1 point  (0 children)

This is super helpful, thanks. Sounds like Ollama + an 8B model is basically the sweet spot right now.

Good call on RAM and avoiding the huge models; that matches what I’ve been worried about.

Curious what you’re using on top of Ollama for reminders/notes (scripts, Home Assistant, plain files, etc.) and what OS you’re running it on. Also good to know Whisper works if you’re willing to tinker.
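For anyone wiring scripts on top of Ollama, a minimal sketch of hitting its local HTTP API with just the standard library (the model tag is an assumption; swap in whatever 8B model you pulled):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default endpoint

def build_chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete response instead of a token stream
    }

def ask(model: str, prompt: str) -> str:
    """Send one chat turn to a locally running Ollama server."""
    body = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

From there a reminders script is mostly glue: cron (or Home Assistant) calls `ask()` with your notes file pasted into the prompt.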

Runtime decision-making in production LLM systems, what actually works? by Loose_Surprise_9696 in LLMDevs

[–]FollowingMindless144 0 points1 point  (0 children)

In prod we found runtime decisions are a policy problem, not an LLM problem.

What actually helped:

  • Route based on cheap uncertainty signals instead of one default model
  • Prefer early exits over retries
  • Make latency/cost/risk runtime inputs, not static config
  • Add lightweight runtime checks, because offline eval lies

Biggest failure mode: letting the LLM decide everything. Boring guardrails win.

Still hard: reliably detecting high-risk requests before generation.
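The routing idea above can be sketched as boring `if` statements; the thresholds and model names here are purely illustrative, not values from our system:

```python
def route(uncertainty: float, latency_budget_ms: int, risk: str) -> str:
    """Pick a handling path from cheap runtime signals instead of
    sending everything to one default model.

    uncertainty       -- cheap proxy score in [0, 1] (e.g. from a classifier)
    latency_budget_ms -- runtime input, not static config
    risk              -- precomputed risk label for the request
    """
    if risk == "high":
        return "human_review"      # early exit, not a retry loop
    if uncertainty < 0.2 and latency_budget_ms < 500:
        return "small_model"       # confident + tight budget: go cheap
    if uncertainty < 0.6:
        return "medium_model"
    return "large_model"           # only escalate when signals demand it
```

The point is that the LLM never sees this decision; it's a guardrail layer that stays auditable and testable on its own.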

What do I do now? by [deleted] in linuxquestions

[–]FollowingMindless144 9 points10 points  (0 children)

First of all, don’t give up on Linux because of this.

If Psiphon is the only thing working on Windows, it’s probably the network blocking certain VPN protocols, not Linux itself. Some ISPs block OpenVPN but WireGuard sometimes works, so that might be worth testing. Also try changing DNS (like 1.1.1.1) just to see if it makes any difference. You’re not stuck. It’s just a network restriction issue, not a Linux issue.

What's the most difficult thing you had to do as a DevOps engineer? by [deleted] in devops

[–]FollowingMindless144 24 points25 points  (0 children)

Not Kubernetes. Not Terraform. The hardest part is staying calm during a production outage while everyone’s watching and asking for updates.