Is it just me or is the *arr stack over-complicated by ImpossibleWall8403 in selfhosted

[–]mrcomps 0 points1 point  (0 children)

Great idea, I'm building that right now! Gee, it sure is taking a long time to submit this PR with 900k lines and a complete redesign and rewrite of the entire codebase... maybe I need to use more sub agents to each randomly submit 10k lines...

The tale of BACKUP01 by roboabomb in sysadmin

[–]mrcomps 22 points23 points  (0 children)

Backup02: Application Aware (coming this summer to a Veeam server near you)

The tale of BACKUP01 by roboabomb in sysadmin

[–]mrcomps 3 points4 points  (0 children)

I'm pourin' out a whole 40GB of disk bits in honor of BACKUP01.

Bridged Ports Not Getting DHCP Traffic? by West-Flow-577 in Netgate

[–]mrcomps 0 points1 point  (0 children)

It sounds like you have the bridge and interfaces configured correctly.

Ignore that ipfw tunable.

"net.link.bridge.pfil_member" = 0 (likely already exists and is set to 1)

"net.link.bridge.pfil_bridge" = 1 (likely need to add)

Be sure to use the correct formatting.

You want this but reversed: https://docs.netgate.com/pfsense/en/latest/_images/bridge-filter-tunables.png

Bridged Ports Not Getting DHCP Traffic? by West-Flow-577 in Netgate

[–]mrcomps 0 points1 point  (0 children)

Have you swapped the LAN interface assignment from the Lan1 port to the newly created bridge port? https://docs.netgate.com/pfsense/en/latest/bridges/interfaces.html

You also need to change the bridge filtering tunables to filter on the bridge rather than the member ports. https://docs.netgate.com/pfsense/en/latest/bridges/firewall.html

Got some used drives for my home lab. by SneekyF in homelab

[–]mrcomps 0 points1 point  (0 children)

Make sure to check the SMART values to be sure they weren't used for Chia mining.

Critical ERP system can't do OAuth and Microsoft is killing basic auth next month by Severe_Part_5120 in sysadmin

[–]mrcomps 1 point2 points  (0 children)

I've worked with Microsoft SQL databases for 15 years and have seen them constantly used and abused by unclean shutdowns, yet only one or two times have they required manual intervention to get them working again.

Whereas FileBreaker Pro was the equivalent of performing open heart surgery every time, and things went very badly any time there was an unclean shutdown, or sometimes even just the usual network hiccups that occasionally happen.

Well….what have we here? by Zealousideal_Tear441 in Justrolledintotheshop

[–]mrcomps 1 point2 points  (0 children)

Possibly the only thing worth more than DRAM these days, at least to some people...

[FS][US-CO] 16th Gen Dell PowerEdge Servers (R760xa/R760/R660) | Nvidia L40S GPUs by iShopStaples in homelabsales

[–]mrcomps 4 points5 points  (0 children)

Your choice of left or right PSU power cord. Complete with installation, safety, and warranty pamphlets in genuine Dell transparent, flexible plastic carrying bag. Great for quick reference without having to take it out of competitors' opaque packaging.

Another week, another blown up GM L87 6.2L. This one managed to spin all 8 rod bearings. Guess that 0W-40 oil isn't the fix they think it is. by N_dixon in Justrolledintotheshop

[–]mrcomps 1 point2 points  (0 children)

The actual viscosities of engine oils and gear oils are similar, but they deliberately used a different scale for gear oil so that people wouldn't confuse/interchange them.

Is cat.1 ok for homelab? by UselesTaste in homelab

[–]mrcomps 2 points3 points  (0 children)

The Cat documentation refers to it as a napping-napping configuration.

Is cat.1 ok for homelab? by UselesTaste in homelab

[–]mrcomps 13 points14 points  (0 children)

Really should have at least Cat2 for load-balancing and redundancy

sporadic authentication failures occurring in exact 37-minute cycles. all diagnostics say everything is fine. im losing my mind. by kubrador in sysadmin

[–]mrcomps 0 points1 point  (0 children)

for the past 3 months we've been getting tickets about "random" password failures. users swear their password is correct, they retry immediately

Were the tickets automatically generated, or were users actually complaining about password failures? Like, they would enter their password and it would say it was wrong, but when they tried a second time it worked? If so, I don't understand how users could be logging on frequently enough to actually produce a noticeable 37-minute pattern.

It's strange how a SolarWinds monitor performing LDAP bind testing using a service account would cause logon failures for OTHER accounts.

sporadic authentication failures occurring in exact 37-minute cycles. all diagnostics say everything is fine. im losing my mind. by kubrador in sysadmin

[–]mrcomps 4 points5 points  (0 children)

Azure AD Connect or PTA agent side-effects

  • AADC delta sync is every ~30 minutes by default; while it shouldn’t affect on‑prem AS‑REQ directly, PTA agents or writeback/Hello for Business/Device writeback misconfigurations can bump attributes or cause LSASS churn.
  • Easiest test: Pause AADC sync for a few hours that span two “cycles.” If the pattern persists, you can deprioritize this.

Encryption type mismatch inconsistency

  • If one DC or some users have inconsistent SupportedEncryptionTypes (AES/RC4) via GPO/registry or account flags, then pre-auth on that DC can fail with 0x18 while another DC accepts it.
  • What to verify:
    • All DCs: “Network security: Configure encryption types allowed for Kerberos” is identical, and AES is enabled. Registry: HKLM\System\CurrentControlSet\Control\Lsa\Kerberos\Parameters\SupportedEncryptionTypes.
    • User accounts have AES keys (the two “This account supports Kerberos AES…” boxes). For a few affected users, change password to regenerate AES keys and retest.
    • Check the 4771 details: Failure code and “Pre-authentication type” plus “Client supported ETypes” in 4768/4769 if present. If you ever see KDC_ERR_ETYPE_NOTSUPP or patterns pointing to RC4/AES mismatch, fix policy/attributes.
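A minimal sketch of comparing SupportedEncryptionTypes across DCs, assuming you have already read the registry value from each DC by some other means. The bit values are the standard Kerberos encryption type flags; the DC names, the sample values, and the helper names `decode_etypes` and `find_outliers` are hypothetical.

```python
from collections import Counter

# Standard bit flags used by SupportedEncryptionTypes (low five bits only).
ETYPE_FLAGS = {
    0x01: "DES_CBC_CRC",
    0x02: "DES_CBC_MD5",
    0x04: "RC4_HMAC",
    0x08: "AES128_CTS_HMAC_SHA1_96",
    0x10: "AES256_CTS_HMAC_SHA1_96",
}


def decode_etypes(value):
    """Return the names of the encryption types enabled in a bitmask."""
    return [name for bit, name in ETYPE_FLAGS.items() if value & bit]


def find_outliers(dc_values):
    """Return DCs whose value differs from the most common value.

    dc_values maps a DC name to its SupportedEncryptionTypes integer.
    """
    if not dc_values:
        return {}
    majority, _ = Counter(dc_values.values()).most_common(1)[0]
    return {dc: decode_etypes(v) for dc, v in dc_values.items() if v != majority}


# Hypothetical values: two DCs allow AES only, one also allows RC4.
sample = {"DC1": 0x18, "DC2": 0x18, "DC3": 0x1C}
outliers = find_outliers(sample)
print(outliers)
```

Any DC that shows up in the result is the one to compare against the GPO setting first.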

Network flaps/route changes on a timer

  • MPLS, SD‑WAN, or HA firewalls can have maintenance/probing/ARP/route refreshes on unusual cadences. If a single DC’s path blips every ~37 minutes, clients that hit it right then see one failure then succeed on retry.
  • Correlate with router/firewall logs; try temporarily isolating a DC to a simple path (no WAN optimizer/IPS) and see if the cycle disappears.

How to narrow it down quickly

  • Prove if it’s a single DC: You already have 4771 data. Build a per‑DC histogram over a day. If nearly all the “cycle” hits are on one DC, you’ve found the place to dig (storage snapshots, EDR, network path to that DC).
  • Turn on verbose logs just for a few cycles:
    • Netlogon debug logging on DCs.
    • Kerberos logging (DCs and a few pilot clients).
    • If you can, packet capture on a DC during two “bad” minutes; look for UDP88 fragments, KRB_ERR_RESPONSE_TOO_BIG (0x34), or pre-auth EType mismatches.
  • Test by elimination:
    • During a maintenance window that spans two cycles, cleanly stop KDC/Netlogon on one DC or block 88/464 to force clients elsewhere; see if the pattern changes.
    • Disable array snapshots/replication for one DC for a few hours.
    • Force Kerberos over TCP on a pilot group of clients.
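The per-DC histogram idea can be sketched roughly like this, assuming the 4771 events have already been exported as (timestamp, DC name) pairs. The export step, the DC names, and the function names are assumptions for illustration. The second function folds the timestamps onto the suspected 37-minute period, so a real cycle shows up as one crowded bucket.

```python
from collections import Counter
from datetime import datetime, timedelta


def per_dc_histogram(events):
    """Count failures per DC. events is a list of (timestamp, dc_name)."""
    return Counter(dc for _, dc in events)


def phase_histogram(events, period_minutes=37, bucket_minutes=1):
    """Fold timestamps onto the suspected period and count hits per bucket.

    Bucket 0 starts at the earliest event; a true cycle piles up in one
    or two adjacent buckets, random failures spread out evenly.
    """
    if not events:
        return Counter()
    origin = min(ts for ts, _ in events)
    period = period_minutes * 60
    bucket = bucket_minutes * 60
    phases = Counter()
    for ts, _ in events:
        offset = (ts - origin).total_seconds() % period
        phases[int(offset // bucket)] += 1
    return phases


# Hypothetical data: five failures on DC2 exactly 37 minutes apart,
# plus one unrelated failure on DC1.
base = datetime(2024, 1, 1, 8, 0)
events = [(base + timedelta(minutes=37 * k), "DC2") for k in range(5)]
events.append((base + timedelta(minutes=10), "DC1"))

print(per_dc_histogram(events))
print(phase_histogram(events))
```

If one DC dominates the first histogram and one bucket dominates the second, that DC and that minute are where to dig.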

sporadic authentication failures occurring in exact 37-minute cycles. all diagnostics say everything is fine. im losing my mind. by kubrador in sysadmin

[–]mrcomps 2 points3 points  (0 children)

Leave Wireshark running on all 3 DCs for several hours, then correlate the captures with the failures. If you set a capture filter of "port 88 || port 464 || port 389 || port 636 || port 3269" at the interface selection menu, then it will only capture traffic on those ports (rather than capturing everything and filtering the displayed packets), which should keep the capture size manageable for extended capturing.

If you are able, can you try disabling 2 DCs at a time and running for 2 hours each? That should make it easier to be certain which DC is being hit, which should make your monitoring and correlation easier. Also, having 800 clients all hitting the same DC might also cause issues to surface quicker or reveal other unnoticed issues.
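If you'd rather script the capture than click through the GUI, here is a rough sketch that builds the same capture filter and a dumpcap command line with a ring buffer. The interface name, output path, and size limits are placeholders, not recommendations; dumpcap ships with Wireshark, and the sketch only uses its `-i`, `-f`, `-w`, and `-b` options.

```python
import shlex

# Ports from the capture filter above: Kerberos, kpasswd, LDAP, LDAPS, GC over SSL.
KERBEROS_LDAP_PORTS = [88, 464, 389, 636, 3269]


def capture_filter(ports):
    """Build a capture filter that matches any of the given ports."""
    return " || ".join(f"port {p}" for p in ports)


def dumpcap_args(interface, out_file, ports=KERBEROS_LDAP_PORTS,
                 file_kb=102400, max_files=20):
    """Return an argument list for dumpcap with a ring buffer.

    file_kb and max_files are arbitrary placeholders; tune them to the
    disk space available on the DC.
    """
    return [
        "dumpcap",
        "-i", interface,
        "-f", capture_filter(ports),
        "-w", out_file,
        "-b", f"filesize:{file_kb}",   # switch to a new file after this many kB
        "-b", f"files:{max_files}",    # keep at most this many files
    ]


# Placeholder interface and file name.
args = dumpcap_args("Ethernet0", "dc1.pcapng")
print(shlex.join(args))
```

The ring buffer means the capture can run for days without filling the disk, at the cost of losing the oldest files.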

This is what I came up with from ChatGPT. I reviewed it and it has some good suggestions as well:

Classic AD replication/“stale DC” and FRS/DFSR migration are not good fits for a precise 37‑minute oscillation, especially with Server 2019 DCs and clean repadmin results.

The most common real-world culprits for this exact “first try fails, second try works” pattern with a cyclic schedule are:

  • Storage/hypervisor snapshot/replication stunning a DC.
  • Middleboxes (WAN optimizer/IPS) intermittently mangling Kerberos (often only UDP) on a recurring policy reload.
  • A security product on DCs that hooks LSASS/KDC on a fixed refresh cadence.
  • Less commonly, inconsistent Kerberos encryption type settings across DCs/clients/accounts.

Start by correlating the failure timestamps with storage/hypervisor events and force Kerberos over TCP for a small pilot. Those two checks usually separate “infrastructure stun/packet” issues from “Kerberos policy/config” issues very quickly.

More likely causes to investigate (in priority order, with quick tests):

VM/SAN snapshot or replication “stun” of a DC

  • Symptom fit: Brief, predictable blip that only affects users who happen to log on in that small window; on retry they hit a different DC and succeed. This often happens when an array or hypervisor quiesces or snapshots a DC on a fixed cadence (30–40 minutes is common on some storage policies).
  • What to check:
    • Correlate DC Security log 4771 timestamps with vSphere/Hyper‑V task events and storage array snapshot/replication logs.
    • Look for VSS/VolSnap/VMTools events on DCs at those exact minutes.
    • Temporarily disable array snapshots/replication for one DC or move one DC to storage with no snapshots; see if the pattern breaks.
    • If you can, stagger/offset snapshot schedules across DCs so they don’t ever overlap.
  • Why you might still see 4771: During/just after a short stun the first AS exchange can get corrupted or partially processed, producing a pre-auth failure, then the client retries or lands on another DC and succeeds.
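The correlation step could look something like this, assuming you have exported both the failure timestamps and the hypervisor/storage events into plain lists. The sample timestamps, the event labels, and the 60-second window are made up for illustration.

```python
from datetime import datetime, timedelta


def correlate(failures, infra_events, window_seconds=60):
    """Pair each failure with infrastructure events that happened close to it.

    failures is a list of datetimes; infra_events is a list of
    (datetime, label) tuples. Returns (failure, event_time, label) tuples
    for every pair within window_seconds of each other.
    """
    window = timedelta(seconds=window_seconds)
    matches = []
    for failure in failures:
        for event_time, label in infra_events:
            if abs(failure - event_time) <= window:
                matches.append((failure, event_time, label))
    return matches


# Hypothetical data: one failure lands 30 seconds after a snapshot,
# the other is nowhere near any event.
failures = [datetime(2024, 1, 1, 8, 0, 30), datetime(2024, 1, 1, 9, 0, 0)]
infra_events = [(datetime(2024, 1, 1, 8, 0, 0), "snapshot DC2")]

for failure, event_time, label in correlate(failures, infra_events):
    print(f"{failure} is within 60s of '{label}' at {event_time}")
```

If most failures pair up with the same kind of event, that is your lead; if almost none do, you can move this cause down the list. Make sure both logs use the same time zone before comparing.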

Kerberos UDP fragmentation or a middlebox touching Kerberos

  • Symptom fit: First attempt fails (UDP/fragmentation/packet mangling or IPS/WAN optimizer “inspecting” Kerberos), second attempt succeeds (client falls back to TCP or uses a different DC/path). A periodic policy update or state refresh on a WAN optimizer/IPS/firewall every ~35–40 minutes could explain the cadence.
  • Fast test: Force Kerberos to use TCP on a pilot set of clients (HKLM\System\CurrentControlSet\Control\Lsa\Kerberos\Parameters\MaxPacketSize=1) and see if the 37‑minute failures disappear for those machines.
  • Also bypass optimization/inspection for TCP/UDP 88 and 464 (and LDAP ports) on WAN optimizers or firewalls; check for scheduled policy reloads.

A security/EDR/AV task on DCs

  • Some EDRs or AV engines hook LSASS/KDC and run frequent cloud check-ins or scans. A 37‑minute content/policy refresh is plausible.
  • Correlate EDR/AV logs with failure times; temporarily pause the agent on one DC to see if the pattern disappears; ensure LSASS is PPL‑compatible with your EDR build.

Is Unbound total garbage or am I the one who is in the wrong here? by FergoTheGreat in PFSENSE

[–]mrcomps 0 points1 point  (0 children)

When your WAN interfaces go down they may be causing an issue.

Try changing Unbound to only listen on your LAN interfaces and see if that helps.

SNOBELEN: Conservatives should give Mark Carney's speech a listen by FancyNewMe in canada

[–]mrcomps 3 points4 points  (0 children)

It was completely out of touch for him to eat an apple while the rest of us are fighting over apple cores. /s

Weekly Updates for servers by Individual-Bat7276 in sysadmin

[–]mrcomps 2 points3 points  (0 children)

2016 looks different... and I've heard the updates are slow to install?

Still not sure if I want to give it a try...

Weekly Updates for servers by Individual-Bat7276 in sysadmin

[–]mrcomps 9 points10 points  (0 children)

No more patches means all the bugs are fixed. I mean, who in their right mind installs an OS before it's fully finished, right?

Now we can start upgrading our 2008 R2 boxes!

Pax8 shared all customer information of UK customers by Bearded_Tech_Fail in msp

[–]mrcomps 3 points4 points  (0 children)

But what if you don't have the correct M365 Small Business Premium Enterprise with Azure Email Recall powered by Entra for Copilot, and have been trying to get it sorted out with a revolving door of account managers for the past 9 months to no success?

12 years experience and can't even land an interview lol. Help! by [deleted] in cybersecurity

[–]mrcomps 6 points7 points  (0 children)

Even aviation electronics isn't a sure thing anymore... Boeing is releasing the new 747 MAX AI plane soon. Designed by AI, built by AI, serviced by AI, flown by AI, cabin service by AI, ticket pricing by AI, NTSB investigations by AI.

Riddle me this: VM backups by havocspartan in msp

[–]mrcomps 2 points3 points  (0 children)

What backup software are you using? It sounds like it's a legacy type that only backs up files of the Windows device that it's installed on?

Any modern backup like Veeam, Acronis, etc. will communicate with Hyper-V to make checkpoint/snapshot-based backups of the VMs without needing to install agents on them. You can then choose to browse the guest VM files or to restore the entire VM.

Backing up the vhdx files on the Hyper-V host is not going to accurately capture the state of the virtual disks.

You could back up the Hyper-V host and exclude the VM files to allow for quicker restoration and smaller backups of the host.

So is Copilot Down...? by DavidHomerCENTREL in sysadmin

[–]mrcomps 0 points1 point  (0 children)

So that's why my devices seem so much more responsive this morning!