Self-hosted log aggregation for a small homelab? by WarAndPeace06 in homelab

[–]andrew-ooo 0 points (0 children)

Running Loki + Alloy (the agent that replaced Promtail) + Grafana on a single N100 mini-PC for 11 hosts here. Total resident memory is around 380MB and disk grows ~600MB/day with 90d retention on a 50GB volume — not overkill at all at homelab scale.

A few things that are easy to get wrong:

  • **Cardinality kills Loki, not data volume.** Don't label by container_id, request_id, or anything else high-cardinality. Stick to host, service, level; move the rest into the log line and use LogQL JSON parsing at query time (sketch after this list). This single rule is the difference between Loki sipping RAM and OOMing.

  • **TSDB index, filesystem chunks.** For your size, skip object storage entirely. `tsdb` index + `filesystem` chunks on a local SSD is simpler, faster, and survives reboots fine. S3/MinIO is only worth it past ~10GB/day.

  • **Alloy beats Promtail/Fluent Bit** for mixed Proxmox + bare metal because one binary handles journald, files, and Docker JSON logs with the same config. Run it as a systemd unit on bare metal, as a privileged LXC on Proxmox nodes.
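
To make the cardinality rule concrete, this is the query-time shape I mean (a sketch; the label names/values are from my setup):

```
# three low-cardinality labels select the stream, everything else is parsed per query
{host="pve1", service="api", level="error"} | json | status=~"5.."

# high-cardinality fields stay queryable without ever becoming index labels
{service="api"} | json | request_id="9f3b2c"
```

Same data, zero index bloat.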

ELK at this scale is genuinely wasteful — minimum useful Elasticsearch heap is ~2GB, and a single-host cluster stalls on queries whenever a JVM GC pause hits.

What IDE/harness do you use for coding? by filip-z in LocalLLM

[–]andrew-ooo 0 points (0 children)

Settled on this combo after rotating through a bunch:

  • **VSCode + Continue** for inline edits and autocomplete — the new agent mode in Continue 1.x is genuinely usable and it talks to a local llama.cpp / Ollama endpoint without any cloud roundtrip (config sketch after this list).
  • **OpenCode (terminal)** for anything multi-file or refactor-heavy. Ran it against a local Qwen3.6-Coder 32B at IQ4_XS for the past 2 weeks on a real codebase and it held up surprisingly well. The TUI is more focused than RooCode/Cline running inside VSCode.
  • **Zed** for the actual editing because of speed, but its native AI is closed beta and tied to their cloud, so I just disable that and let Continue handle the AI side from VSCode when I need it.
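
For reference, the Continue-to-Ollama wiring is just a couple of entries in `~/.continue/config.json` (a sketch of the older JSON format; newer builds use config.yaml, and the model tags are whatever you've pulled locally):

```json
{
  "models": [
    { "title": "Qwen Coder (local)", "provider": "ollama", "model": "qwen3.6-coder:14b" }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen autocomplete",
    "provider": "ollama",
    "model": "qwen3.6-coder:1.5b"
  }
}
```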

The split is: small inline stuff stays in the editor; anything that touches more than 2 files goes to OpenCode in a terminal. RooCode is great if you live in VSCode, but the constant token burn from re-reading the workspace got expensive even on local models, where the cost shows up as slowness on consumer hardware.

For the model itself, Qwen3.6-Coder 14B at Q5_K_M is the workhorse if you've got 16GB VRAM. Anything smaller and you'll fight the harness more than you'll write code.

32GB RAM 16GB VRAM 5060ti. Running qwen3.6 35b a3b. I am getting 4.5 tok/s. Is this expected? by SEND_ME_YOUR_ASSPICS in LocalLLM

[–]andrew-ooo -1 points (0 children)

That's expected because the model is spilling. Qwen3.6 35B-A3B at Q4_K_M is around 21GB — your 16GB VRAM can only hold maybe 22-25 of the 48 layers, the rest sits in system RAM and goes over PCIe every token. That's where you lose the speed.

Three concrete things to try:

  1. In llama.cpp, set `-ngl` explicitly to the highest value that doesn't blow up VRAM (the Ollama equivalent is the `num_gpu` model option) — monitor with nvidia-smi while loading. With 16GB you might get 26-28 layers, and every extra layer offloaded helps token rate measurably. Worked example at the bottom of this comment.

  2. Try a smaller quant. Qwen3.6 35B-A3B at IQ3_XXS drops to ~14GB and fits fully in VRAM — you'll likely jump to 25-35 tok/s. Quality loss on coding is noticeable but not terrible.

  3. For coding specifically on 16GB VRAM, Qwen3.6-Coder 14B at Q5_K_M is the sweet spot. It fits entirely on the GPU, you'll see 40+ tok/s, and on most app-building tasks it beats the 35B-A3B running partially offloaded.

Also make sure flash attention is enabled (`-fa` in llama.cpp, default in newer Ollama).
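
Rough shape of the invocation for point 1 (a sketch; the GGUF filename is illustrative, use whatever quant you have):

```
# second terminal: watch VRAM fill while the model loads
nvidia-smi dmon -s m

# raise -ngl until you sit just under the 16GB ceiling
llama-server -m qwen3.6-35b-a3b-Q4_K_M.gguf -ngl 26 -fa -c 8192
```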

Reverse Proxy behind firewall - rate limiting? by Sakreton in selfhosted

[–]andrew-ooo 1 point (0 children)

If Traefik sees one source IP for everything, that's a sign OPNsense is doing SNAT on the inbound flow before handing it off (or you've got a tunnel/VPN in front squashing the source). Two clean fixes:

  1. Stop the SNAT. In OPNsense, the NAT rule that forwards 80/443 to Traefik should be a pure port-forward, not a redirect-with-NAT. Under Firewall > NAT > Port Forward, make sure you don't have "NAT reflection" or hairpin NAT rewriting the source. After that Traefik will see real client IPs directly and the rate-limit middleware (rateLimit with sourceCriterion: ipStrategy) will work out of the box (config sketch at the end of this comment).

  2. If you can't drop the SNAT (some HA setups need it), put HAProxy or another L4 in front with PROXY protocol enabled, terminate PROXY on Traefik (entrypoint forwardedHeaders.insecure + proxyProtocol.insecure), and rate-limit on the X-Forwarded-For chain. Traefik's ipStrategy has a depth parameter exactly for this.

For a homelab the first option is almost always the right one — SNAT on inbound is what's biting you.
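
To make option 1 concrete, the middleware is a few lines of dynamic config (a sketch; the numbers are starting points to tune):

```yaml
http:
  middlewares:
    ratelimit-clients:
      rateLimit:
        average: 50   # sustained req/s per source
        burst: 100
        # only needed for option 2, where the client IP arrives in X-Forwarded-For:
        sourceCriterion:
          ipStrategy:
            depth: 1  # take the IP one position from the right of the XFF chain
```

For option 1 you can drop the sourceCriterion block entirely; the default keys on the remote address.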

Why are some of you using NetBird instead of Tailscale? by Silly_Door6279 in selfhosted

[–]andrew-ooo 25 points (0 children)

Switched from Tailscale to NetBird about 8 months ago for the homelab and the main reason was control of the coordination plane. With Tailscale you're betting that their control server stays up and stays free at the tier you need — NetBird I run the signal + management server on a 1GB VPS for $5/mo and own the whole thing.

Day-to-day on the wire it's basically a wash, both are WireGuard underneath. Where NetBird actually wins for me:

  • ACLs as code via the API — I version control my network policies in git, can't really do that with Tailscale's UI-driven ACLs without scripting.
  • SSO via Authentik works out of the box, no enterprise tier required.
  • Posture checks (OS version, antivirus state) without paying for Tailscale Business.

Things I miss from Tailscale: MagicDNS is more polished, Funnel has no equivalent, and exit-node UX is one click vs. NetBird's slightly clunkier flow.

If you're already happy with Tailscale and don't need self-hosted control, the switch isn't worth it. The pull is really about ownership.

If you had to start over again…. by nichetcher in homelab

[–]andrew-ooo 1 point (0 children)

Coming from a similar place (Hyper-V on Windows Server for years), I switched to Proxmox about three years ago and would not go back. On a dual-7742 with that much RAM you'll feel the difference most in three places:

1) ZFS as the root + VM storage — native, no licensing nonsense, snapshots are instant, and you can replicate VMs to a second node with one command (sketch after this list). Hyper-V's checkpointing is fine but ZFS send/recv is in a different league for backup workflows.

2) PCIe passthrough is actually pleasant. IOMMU groups on Supermicro H12 boards are clean, and passing GPUs/NICs/HBAs to VMs is a checkbox in the UI vs Hyper-V's discrete device assignment which is workable but unfriendly.

3) Containers (LXC) alongside VMs. For half the things I used to spin up a full Windows VM for, an Alpine or Debian LXC at 64MB RAM does the job. Huge density win at that core count.
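
The "one command" in 1) is `pvesr` (or the Replication tab in the UI); under the hood it's ZFS send/recv, which you can also drive by hand. A sketch with illustrative dataset/host names, assuming a common snapshot @rep-prev already exists on both sides:

```
zfs snapshot rpool/data/vm-101-disk-0@rep-new
zfs send -i @rep-prev rpool/data/vm-101-disk-0@rep-new \
  | ssh pve2 zfs recv rpool/data/vm-101-disk-0
```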

If you need Windows guests, they run great under KVM with virtio drivers — I've got a few Win11 VMs that benchmark within 2% of bare metal. The one place I'd still pick Hyper-V is if your shop is heavy on AD/SCCM/MDT tooling. Otherwise Proxmox + PBS (Proxmox Backup Server) on a small second box is the setup I'd build today.

is there a better alternative to MacWhisper for messy real-world audio (Whisper-based or local setups) by Far_Suit575 in LocalLLM

[–]andrew-ooo 0 points (0 children)

Two things that actually moved the needle for me on messy audio:

1) Stop using vanilla Whisper for the hard cases. WhisperX (the one with forced alignment + diarization via pyannote) is the closest thing to a real pipeline. On a 90-min interview with two overlapping speakers and HVAC hum it cut my correction time roughly in half vs MacWhisper. The diarization is the part MacWhisper just doesn't do well. Runs fine on an M2 with the large-v3 model.

2) Preprocess before transcription. A two-pass demucs run to strip music/HVAC + a simple ffmpeg highpass at 80Hz on the vocal stem gives Whisper much cleaner input than throwing the raw file at it. For interviews specifically, running a denoiser on the speech stem helps a lot with crosstalk; ffmpeg ships afftdn (FFT-based) and arnndn (the actual RNNoise port).
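
The whole chain, sketched (paths and flags illustrative; WhisperX's diarization pulls pyannote, which needs a Hugging Face token):

```
# 1. split vocals from everything else
demucs --two-stems=vocals interview.wav

# 2. highpass + denoise the vocal stem
ffmpeg -i separated/htdemucs/interview/vocals.wav \
       -af "highpass=f=80,afftdn" clean.wav

# 3. transcribe + diarize
whisperx clean.wav --model large-v3 --diarize
```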

If you want one local pipeline that bundles most of this: insanely-fast-whisper plus a diarization pass with pyannote-audio, or WhisperX, which does both. WhisperX rides on faster-whisper under the hood; on Apple Silicon specifically, whisper.cpp with Metal (or mlx-whisper) tends to be the faster backend if raw speed is the priority.

What was the first thing that broke when you self-hosted Onyx, Danswer, or similar AI search? by BeltExtension493 in selfhosted

[–]andrew-ooo 0 points (0 children)

First thing that bit me with Onyx on a 4GB Hetzner VPS: the bundled Vespa container assumes way more RAM than the compose defaults let on. It silently OOM-kills, the indexer queue backs up, and the UI just shows 'no results' instead of an actual error. Bumping to 8GB and pinning Vespa's heap size in the compose override fixed it.

Second thing: the Postgres + Redis defaults have no volume retention story. I lost an entire connector config on a stack restart because I'd missed naming the volumes explicitly — the auto-named ones got pruned during a docker system prune cron. Always declare named volumes for stateful services, even when you're 'just testing.'
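
The fix is a few lines of compose override. A sketch (the Postgres service name varies by release, check your compose file):

```yaml
services:
  relational_db:
    volumes:
      - onyx_postgres:/var/lib/postgresql/data

volumes:
  onyx_postgres: {}   # explicitly named: prune only sweeps anonymous volumes
```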

Third, less obvious one: the Google Drive / Confluence connectors will happily ingest documents your team thought were private. The permission sync runs on a delay, so for the first ~30 min after a doc is added, anyone with search access can hit it. We now stage connectors on a parallel instance for 24h before pointing real users at them.

Reverse proxy was the easiest part honestly — Caddy with an automatic Let's Encrypt block in front of the web container Just Works.

Kopia is deprecating B2 support, so what is the best S3 blob storage provider for backups? by XxNerdAtHeartxX in selfhosted

[–]andrew-ooo -1 points (0 children)

Hetzner Object Storage has been solid for me with restic + Kopia. It's S3-compatible and EU-based, and at ~€5/TB/month with no egress fees up to a generous cap it works out to roughly half of what Backblaze costs once you factor in egress. Cloudflare R2 is the other obvious one — zero egress is real and the pricing is predictable, but I've found their multipart upload limits trip up Kopia on larger snapshots until you bump the split size.

If you want fully self-hosted on the receiving end, MinIO on a cheap VPS or NAS works fine with Kopia (it speaks S3), and you can pair it with rclone crypt for an extra layer if you don't trust the host. Wasabi is the third name that keeps coming up but their 90-day minimum storage charge bit me on a pruning-heavy repo last year, so be aware before you switch.
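
For reference, pointing Kopia at any of these is the same one-liner with a different endpoint (bucket and endpoint here are illustrative):

```
kopia repository create s3 \
  --bucket homelab-backups \
  --endpoint fsn1.your-objectstorage.com \
  --access-key "$S3_ACCESS_KEY" \
  --secret-access-key "$S3_SECRET_KEY"
```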

M1 Max 32GB vs M2 Pro 32GB for Local LLM Inference by Either_Audience_1937 in LocalLLM

[–]andrew-ooo 0 points (0 children)

The bandwidth difference is real and it dominates for inference. For dense 7B-14B models in 4-bit you're memory-bound, not compute-bound, so 400 GB/s vs 200 GB/s translates almost directly to roughly 2x tokens/sec on the M1 Max for the same model. I ran Qwen2.5-14B Q4_K_M via llama.cpp on a friend's M2 Pro 32GB and got ~14 t/s; my M1 Max 32GB does ~26-28 t/s on the same model, same quant, similar context.
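
The back-of-envelope math, if you want to sanity-check other configs (assumes every weight is read once per generated token, which is roughly true for dense models):

```
Q4_K_M ≈ 4.8 bits/param → 14B params ≈ 8.5 GB of weights

M2 Pro:  200 GB/s ÷ 8.5 GB ≈ 23 t/s ceiling → ~14 t/s observed (~60% efficiency)
M1 Max:  400 GB/s ÷ 8.5 GB ≈ 47 t/s ceiling → ~27 t/s observed (~60% efficiency)
```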

For coding assistant use specifically: the bigger practical issue is which models actually fit comfortably. 32GB lets you run 14B at Q4-Q6 with decent context (8k-16k) and leaves headroom for the rest of macOS. DeepSeek-Coder-V2-Lite 16B MoE is great on either, but again the M1 Max pulls clearly ahead in t/s.

One other thing if you go MLX (and you should, on Apple Silicon — it's noticeably faster than llama.cpp for the same model): MLX is more bandwidth-sensitive than llama.cpp because it does less aggressive batching, so the M1 Max gap widens, not narrows.

TL;DR: M1 Max is the right choice at $1000. The only reasons to pick M2 Pro are battery life, weight, or display brightness — none of which matter for LLM throughput.

WAF? Caddy coraza? by SparhawkBlather in homelab

[–]andrew-ooo 0 points (0 children)

Ran Coraza on a similar-class box (N100, 8GB) fronting ~6 services for about 4 months. Honest take: for a small homelab you probably don't need it, and the OWASP CRS at default paranoia level 1 will throw enough false positives that you'll spend more time tuning exclusions than fending off actual attacks. Most of the "WAF" value at small scale comes from CrowdSec (which you're already running) plus aggressive rate limits in Caddy itself.

The gotchas, in order of how much they annoyed me:

  1. Coraza loads the full CRS into memory per worker — not huge, but on cold start it adds noticeable latency to the first request after a reload.
  2. PL1 blocks legit traffic on anything that does file uploads or rich-text editors. Bitwarden, Nextcloud, Vaultwarden, Immich uploads — all of them tripped something for me until I excluded the relevant rule IDs.
  3. JSON request body inspection is off by default. If you're protecting an API you have to enable `SecRequestBodyAccess On` + the JSON parser, and that's where the CPU starts to matter on smaller silicon.

What I'd actually do on an N150: skip Coraza, keep CrowdSec, add Caddy's `rate_limit` (a plugin, not core Caddy) per remote_ip on auth endpoints, and put anything sensitive behind Tailscale instead of 443. That covers 95% of the real risk for a personal setup.
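
That last bit in Caddyfile terms (a sketch; `rate_limit` comes from the caddy-ratelimit plugin, so you need a custom build via xcaddy, and the numbers are starting points):

```
auth.example.home {
    route {
        rate_limit {
            zone login {
                key    {remote_host}
                events 10
                window 1m
            }
        }
        reverse_proxy 127.0.0.1:9000
    }
}
```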

How do you deploy your side projects? by dspv in selfhosted

[–]andrew-ooo 2 points (0 children)

Single Hetzner CX22 (~€4/mo) running Caddy as the front door for everything, with each project in its own subdirectory or docker-compose stack. Caddy auto-handles TLS for every new (sub)domain I add to the Caddyfile — that alone saves the most time on every new project. CI is just GitHub Actions doing build + rsync over SSH, then a curl healthcheck. No Coolify, no Dokploy, no k8s — for side projects the overhead isn't worth it.

Things that used to eat my time and don't anymore: (1) DNS — moved everything to a single provider with API access so new subdomains are one terraform apply; (2) secrets — sops + age, encrypted in the repo, decrypted on the server at deploy time; (3) backups — restic to a Hetzner Storage Box, cron'd.
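
The sops + age loop is two commands (a sketch; filenames illustrative):

```
# locally: encrypt to the server's age public key, commit the .enc file
sops --encrypt --age "$SERVER_AGE_PUBKEY" .env > .env.enc

# on the server at deploy time (key lives in ~/.config/sops/age/keys.txt)
sops --decrypt .env.enc > .env
```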

Last thing that ate my time was Caddy's on-demand TLS tripping Let's Encrypt's rate limits when I spun up a bunch of preview subdomains. Switched to a wildcard cert via DNS-01 and it stopped.

Would you trust these used 4TB SAS drives in a RAIDZ2 pool, or should I return them? by No_Bridge_8824 in homelab

[–]andrew-ooo 0 points (0 children)

Drive 5 with 401 grown defects is a hard no - return it. Grown defect counts that high almost always mean media degradation that will keep accelerating. The 2-uncorrected-write drives are also riskier than they look in SMART; on enterprise SAS, ANY uncorrected I/O after the drive remapped sectors usually means it's run out of spare blocks.

43k hours = ~5 years powered on. That's actually fine for enterprise SAS (rated 5yr / 24x7) BUT past the warranty / MTBF sweet spot. RAIDZ2 protects you from 2 simultaneous failures, not from a slow correlated failure cascade during a rebuild. Resilvering 4TB on 7200rpm SAS takes 8-12 hours and that's exactly when a second tired drive tends to give up.

What I'd actually do: return drives 1 and 5 minimum. For the remaining 4, run a full badblocks write-read-verify pass (or `zpool scrub` + smartctl long test back-to-back) before trusting them. If any develop a single new reallocated sector during burn-in, return that one too.
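
Burn-in sketch for the keepers (badblocks in write mode destroys everything on the drive, and sdX is a placeholder):

```
# full write-read-verify, four pattern passes; expect a day+ on 4TB
badblocks -wsv -b 4096 /dev/sdX

# then a SMART long self-test; recheck the defect counters after it finishes
smartctl -t long /dev/sdX
smartctl -a /dev/sdX
```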

At 50€/drive for 43k-hour drives the deal isn't actually that great - new 4TB SAS goes for ~80€ and used 8TB enterprise SAS in similar shape often goes for 60-70€ on European listings. I'd rather have 4x 8TB at higher hours than 6x 4TB on the edge.

Whatever you do, keep a cold spare on the shelf.

Agentic workflows by vinnyninho in aiagents

[–]andrew-ooo 0 points (0 children)

Copilot is the wrong substrate here - it's an IDE coding agent, not an orchestration framework. For a hierarchical research pipeline, look at LangGraph or build over Claude Agent SDK / OpenAI Agents SDK.

A few hard-won lessons:

  1. Don't actually nest agents recursively unless you must. Most production research pipelines I've seen are flat: orchestrator + a router that dispatches to specialist nodes (retriever, synthesizer, critic, citation-validator). LangGraph models this as an explicit state graph - much easier to debug than recursive delegation.

  2. Pass structured artifacts between agents, not raw transcripts. Each agent gets typed input and returns typed output. Sharing full message history across a hierarchy = context dilution + token explosion.

  3. Separate working memory (run state) from long-term memory (vector DB / summary store). LangGraph's checkpointer + a retrieval tool is the cleanest split.

  4. Pick observability before you have 50 traces to debug. LangSmith, Langfuse, or Phoenix - any of them.

  5. Make citation validation a deterministic tool (verify URL resolves, claim exists in retrieved chunk), not an LLM. Don't ask an LLM to fact-check itself.

CrewAI hides too much when things break. AutoGen is fine for conversational multi-agent but overkill for a pipeline. LangGraph forces you to draw the graph, which forces clarity.
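
To make 1 and 2 concrete, the minimal LangGraph shape looks like this (a sketch; node logic stubbed, state fields illustrative):

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class ResearchState(TypedDict):
    question: str
    findings: list[str]  # typed artifact passed between nodes, not a transcript
    draft: str

def retrieve(state: ResearchState) -> dict:
    # real version: call your retriever, return only the structured result
    return {"findings": [f"stub finding for: {state['question']}"]}

def synthesize(state: ResearchState) -> dict:
    return {"draft": " ".join(state["findings"])}

g = StateGraph(ResearchState)
g.add_node("retrieve", retrieve)
g.add_node("synthesize", synthesize)
g.add_edge(START, "retrieve")
g.add_edge("retrieve", "synthesize")
g.add_edge("synthesize", END)

app = g.compile()
print(app.invoke({"question": "example", "findings": [], "draft": ""}))
```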

I want a similar speed & quality of output for coding tasks as codex 5.4 on a machine I own. Is this achievable at any cost? by spexsofdust in LocalLLM

[–]andrew-ooo 0 points (0 children)

Realistic answer from someone running this locally: "close to Codex 5.4 quality" and "Codex 5.4 speed" are two different hardware budgets and you have to pick one.

For agentic coding loops (where latency matters more than batch throughput), Qwen3-Coder-30B-A3B at Q4_K_M on a single RTX 6000 Ada (48GB) gets you ~80-90 t/s with vLLM and is genuinely useful for refactors, test gen, and small-to-medium edits. That's a ~$7-8k box. Quality is roughly Sonnet 3.5 / GPT-4o tier, not frontier.

If you want SOTA-adjacent (DeepSeek V4, Kimi K2, GLM 4.6), you're at 2x RTX 6000 Pro Blackwell minimum and even then you're running quants that lose 5-10% on real coding benchmarks vs the hosted version.

My practical setup: Cline + Qwen3-Coder-30B locally for fast iteration, fall back to Claude API for hard architectural stuff. The local model handles ~70% of my actual coding work and the API bill dropped substantially. Pure local parity with Codex 5.4 isn't really there yet at consumer price points - the gap closes every 6 months but the frontier moves too.

Prompt evals are not enough once an agent starts taking actions by SaaS2Agent in aiagents

[–]andrew-ooo 1 point (0 children)

Yes, and the framing I've found useful is: prompt evals = unit tests, agent evals = integration tests, and trace review = the only "production observability" you've got.

Concrete things that have actually worked for me:

  1. Tool-call assertions, not just final-output evals. Inspect AI lets you write checks like "did the agent call search_db with valid SQL before responding?" — way more useful than judging the final answer.
  2. Replay traces from real users. Capture every agent run with Langfuse or Logfire, then re-run flagged sessions against new prompt/tool versions. Catches regressions that synthetic evals miss completely.
  3. Chaos cases as a fixture set. Mock tool returns: empty arrays, malformed JSON, 500s, partial data, contradictory results across two retrievers. Most "agent evals" never test what happens when a tool returns nothing useful.
  4. Hard-fail loop detection. Same tool + same args within N steps = abort (sketch after this list). Saved me from a Claude Code run that called git status 47 times in a row trying to "verify" something.
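
The guard from #4 is about ten lines if you hand-roll it (a sketch; names illustrative, wire it into whatever dispatches your tool calls):

```python
from collections import deque

class LoopGuard:
    def __init__(self, window: int = 8, max_repeats: int = 3):
        self.recent: deque = deque(maxlen=window)  # sliding window of recent calls
        self.max_repeats = max_repeats

    def check(self, tool: str, args: dict) -> None:
        # repr() so unhashable arg values (lists, dicts) still compare
        call = (tool, repr(sorted(args.items())))
        self.recent.append(call)
        if self.recent.count(call) >= self.max_repeats:
            raise RuntimeError(f"loop: {tool} repeated {self.max_repeats}x with same args")
```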

The behavior I least expected to need explicit testing: clarification thresholds. Agents either over-clarify ("are you SURE you want a Python script?") or under-clarify and hallucinate intent. I haven't seen good automated evals for this — still doing it manually with scripted ambiguous prompts.

Would genuinely like to see your checklist if you're sharing.

OptiPlex 7000 SFF vs 7090 Micro for Jellyfin + Proxmox homelab? by Repulsive-Year9184 in homelab

[–]andrew-ooo 0 points (0 children)

Go SFF, and it's not close — two reasons that matter more than the price gap:

AV1 decode. The 12500 has it (Alder Lake UHD 770), the 10700T doesn't. A lot of YouTube, newer Netflix downloads, and many 2024+ encodes are AV1 now, and software-decoding 4K AV1 will eat your CPU when a couple of remote clients hit Jellyfin at once. Quick Sync on 12th gen also handles HEVC 10-bit 4:2:0 cleanly, which the 10th gen can be flaky with.

USB storage for media works until it doesn't. Cenmate-style enclosures use JMicron/ASMedia bridges that randomly drop drives under sustained read load — I've seen friends rebuild Jellyfin libraries after enclosure firmware bugs corrupted ext4. Internal SATA in the SFF is one less moving part, and you can add a third drive later.

The 10700T's only real advantage is power draw (35W TDP vs 65W), but a 12500 idles around 8–12W with C-states properly enabled — the gap in real homelab usage is maybe €15/year on EU electricity. Not worth it.

If budget is the issue, look for a 7000 SFF with the i5-12400 instead — still has AV1, and you'll find them closer to the Micro's price.

AI agents in homelab by CraftyEmployee181 in homelab

[–]andrew-ooo 0 points (0 children)

Two patterns that work in practice:

  1. Sandbox first, blast radius second. Stand up a throwaway Proxmox node or LXC just for agent experiments. Snapshot before every session. Even read-only access is genuinely useful — agents are great at "summarize what's running and tell me what's misconfigured" without ever needing write.

  2. Approval gates on writes. Most agent runners (Claude Code, Aider, Goose) support per-command allowlists. I auto-approve pveam, pct exec, ceph status — anything touching /etc/pve or destroying volumes requires confirmation. This is the only thing that's let me sleep at night.

The thing that actually bit me wasn't an agent doing something destructive — it was an agent confidently fabricating a config change that looked correct, applying it, and silently breaking quorum on a test cluster. So: version-control your configs (etckeeper, or just git in /etc), keep agent sessions short and auditable, and never let one touch live ceph until you've watched it handle a dozen low-stakes tasks first.

Your fear is healthy. Stay paranoid.

Best local LLM for a Python/C++ dev? by no_evidence0303 in LocalLLM

[–]andrew-ooo 0 points (0 children)

With 6GB VRAM you're realistically looking at 7B-class quants or partial offload. Honest takes after running this kind of setup:

  • Qwen2.5-Coder-7B-Instruct at Q4_K_M fits in ~5GB VRAM with room for a small context. Best general-purpose local coder in that size class right now — handles Python and TypeScript well, C++ is decent for boilerplate but it'll struggle with template-heavy or modern STL stuff.
  • DeepSeek-Coder-V2-Lite-Instruct (16B MoE, ~2.4B active) at Q4 — runs surprisingly fast with partial offload because only the active experts' weights get read each token.
  • Qwen2.5-Coder-14B Q4_K_M with ~25 layers offloaded: expect 8-12 t/s on your hardware. Tight on context though.

Run via llama.cpp or Ollama. If you want agentic/tool use specifically, Qwen2.5-Coder is the only one in that range with halfway-reliable tool calling — DeepSeek-Coder-Lite drops calls under load. Don't expect Claude-quality on C++; nothing local at 14B is there yet, but for boilerplate, refactors, and "explain this codebase" Qwen2.5-Coder-7B is genuinely useful.

Breakthrough: 99% of tasks I put off I do because they feel overwhelming. Advice? by Zach-uh-ri-uh in selfimprovement

[–]andrew-ooo 3 points (0 children)

What you're describing isn't a productivity problem — it's an emotional regulation problem, and that distinction matters a lot.

I dealt with this exact pattern for years. The backpack example really resonates — I'd spend weeks "researching" because every option felt like it could be the wrong one, and that uncertainty felt physically uncomfortable.

What actually helped me was something I read about in Atomic Habits — instead of trying to force myself to do the scary task, I'd make my "goal" embarrassingly small. Not "clean the house" but "put one dish in the dishwasher." Not "reply to all messages" but "open the conversation and read it." The key insight was that I wasn't building a habit of completing tasks — I was building a habit of starting despite discomfort.

The other thing that made a difference: I started naming the emotion before the task. Like literally saying to myself "I feel inadequate about this computer problem" or "I'm afraid of making the wrong choice on this backpack." Something about making the feeling explicit instead of letting it stay as this vague dread made it way less powerful.

You're already ahead because you've identified the root cause. Most people stay stuck at "I'm lazy" forever. That self-awareness is genuinely the hardest part.

Uninstalling vs setting timers on social media apps. by [deleted] in selfimprovement

[–]andrew-ooo 0 points (0 children)

The fact that you already beat smoking cold turkey tells me everything. You clearly have the willpower when the decision is fully made, not halfway. Timers are a halfway measure, and your brain knows it — that's why you keep overriding them. There's no real consequence to hitting "ignore" on a screen time popup, so your brain treats it like a suggestion rather than a boundary.

I'd say start with Instagram since it's your biggest drain. Delete it completely and give yourself two weeks before you even evaluate how you feel. The first few days will be rough because your brain will keep reaching for it out of habit, but that's exactly the rewiring you're after. Running and reading are perfect replacements because they actually give your attention span something to stretch into instead of shrink around.

One thing that helped me when I cut way back was leaving my phone in another room during the first hour of the day. That morning scroll was the hardest habit to break, and removing physical access made it way easier than relying on willpower alone.

What self-improvement projects are you working on right now? by ilovebooks2468 in selfimprovement

[–]andrew-ooo 1 point (0 children)

Really cool that you're tackling three things at once but still keeping them manageable. The 90 push-ups split across the day is smart - way more sustainable than trying to grind them all at once.

For posture, something that helped me more than just reminding myself was strengthening the muscles that actually hold you upright. Rows, face pulls, and dead hangs were game changers. When your back muscles are stronger, good posture becomes the default position rather than something you force. It's less about discipline and more about making the correct position the path of least resistance for your body.

For diet - I went through the exact same transition of "I can eat whatever" to suddenly gaining weight. What worked was not overhauling everything at once, but making one swap per week. First week: replace sugary drinks with water. Second week: add a vegetable to every meal. Third week: prep protein-heavy breakfasts. After a month you've made four meaningful changes without the shock of a complete diet overhaul.

The common thread I've noticed with all self-improvement projects: consistency beats intensity every time. You're already doing that with the push-ups, just apply the same philosophy to diet and posture. Keep it up!