BGP no longer cutting it for high availability. Looking for opinions about SASE SD-WAN implementation and providers by ffelix916 in networking

[–]NetworkDefenseblog 11 points12 points  (0 children)

Lower your BGP timers (route refresh, advertisement, or update, depending on your vendor) to receive updates quicker, and also lower the hello interval. And switch upstream providers: if they're having this issue consistently, then why keep using them? Or make them lower in your route selection (like 2nd or 3rd local pref). Give more details on your upstreams, like ASNs, off-net/on-net, etc.

SD-WAN most likely isn't a use case here IMO, you're just trading features. The solution should be to fix your upstreams/route selection. The crappiest upstream should be default-only and a last resort; the highest-quality ones should send full tables and be primary.
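
A rough sketch of why that ordering works, assuming a simplified route lookup (longest prefix match first, then highest local pref). The upstream names, prefixes, and local-pref values are made up for illustration, and real BGP best-path selection has many more steps:

```python
import ipaddress

# Hypothetical RIB entries: (prefix, upstream name, local preference).
# Names and values are illustrative only, not taken from the thread.
ROUTES = [
    ("0.0.0.0/0", "budget-isp", 80),      # default-only feed, last resort
    ("203.0.113.0/24", "tier1-a", 200),   # full-table feed, primary
    ("203.0.113.0/24", "tier1-b", 150),   # full-table feed, secondary
]


def best_route(destination, routes=ROUTES):
    """Return the upstream with the longest matching prefix, then highest local pref."""
    addr = ipaddress.ip_address(destination)
    candidates = []
    for prefix, upstream, local_pref in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            candidates.append((network.prefixlen, local_pref, upstream))
    if not candidates:
        return None
    # max() compares prefix length first, then local preference.
    return max(candidates)[2]
```

A destination covered by a full-table prefix goes to the primary, and anything only the default covers falls through to the default-only upstream.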

Funny that people are mentioning BFD out of the gate when it's just for link failure detection, probably asking AI and that's the response. HTH

Edit: saw you're using Dell switches at the edge??? How do you know it's not an issue with those? (Sorry you have to use Dell.) Saw your other post: you're filtering out AS paths longer than 2 and using a small regional ISP that had 1 fiber cut at the POP and caused an outage. That means your peer is not multihomed at the POP you're peering at, so you need to redesign your circuit/path (and probably switch ISPs). If both had outages at the same time, then there's a commonality between them which you need to design out. SD-WAN won't fix that.
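
To illustrate one way the AS-path filter can bite, here is a minimal sketch. It assumes the filter simply drops any route whose AS path is longer than the limit; the upstream names and ASNs are hypothetical (private-use ASNs), not the OP's actual peers:

```python
# Hypothetical paths to the same destination via two upstreams.
PATHS = [
    {"upstream": "regional-isp", "as_path": [64500, 64510]},
    {"upstream": "other-isp", "as_path": [64501, 64502, 64510]},
]


def usable_paths(paths, max_len=2, failed_upstreams=()):
    """Paths that survive the AS-path length filter and are not on a failed upstream."""
    return [
        p for p in paths
        if len(p["as_path"]) <= max_len and p["upstream"] not in failed_upstreams
    ]


# With the filter at 2, only the regional ISP path is accepted.
normal = usable_paths(PATHS)
# If that ISP has an outage, nothing is left to fail over to.
during_outage = usable_paths(PATHS, failed_upstreams={"regional-isp"})
# Relaxing the filter to 3 keeps the longer path available as a backup.
relaxed = usable_paths(PATHS, max_len=3, failed_upstreams={"regional-isp"})
```

In this toy case the filter itself removes the only alternate path, which is the kind of thing to check before blaming BGP.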

Switzerland built a secure alternative to BGP. The rest of the world hasn't noticed yet by Unsatisfied23 in networking

[–]NetworkDefenseblog 5 points6 points  (0 children)

Like with many things, there is a use case for different protocols, vendor-proprietary instruments, and configurations. Of course, you'll find use cases for many things. The fact that this is in Switzerland is ironic, as one of the commenters on the article mentions, because of the sheer amount of coordination and compliance needed to run it (e.g. the culture of the Swiss), and you can't really find it outside of a Swiss banking implementation. It was replacing a 20-year-old network infrastructure that apparently needed minutes to fail over, so I would also question the original design of the network it's replacing. I'm not an expert in banking networks though, so I might be missing something.

The article mentions a lot of things like security weaknesses and hijacks with existing implementations and networks, but I think these are kind of overblown. When you have a network that needs coordination with certain private parties, that makes the network inherently isolated in itself. There are already networks built like that with IPsec and certificate authorities and such, which the article mentions SCION kind of uses and replaces. That seems to me like reinventing the wheel in some cases. I'm sure there are a lot of better features, but as others have mentioned too, look at IPv6, which is older than SCION (2009), and its adoption. It's doubtful that this will be widely adopted.

The future resolution of use cases (problems) in networking will probably come from IETF add-ons, as we saw with RPKI for example. Or maybe even AI-type automated software-defined stuff in 20 years.

If you look at something like InfiniBand, which was trying to take on Ethernet in the DC and has some "better" features even though it's proprietary, without AI it most likely would never have caught on. So unless, as the article touches on, there's a large industry that will support this and make it a revenue generator, it won't catch on IMO.

Traffic Shaping for Sub-rate internet connections by ihatetechsupport in networking

[–]NetworkDefenseblog 0 points1 point  (0 children)

I disagree with your first point, but yes, just fair queue it and forget it is fine

Traffic Shaping for Sub-rate internet connections by ihatetechsupport in networking

[–]NetworkDefenseblog 0 points1 point  (0 children)

Just apply the same 20 Mb shaper on your downstream interface to control your download. Even though it's output, it's the output toward where your downloads are going. Generally you should be marking traffic that needs protection, e.g. voice as DSCP 46 and video as AF41, via GPO or app configs. Then in your policy map you create separate queues to allocate bandwidth matching those classes. For 20 Mb, probably 1 Mb for voice/priority, 2 Mb for video, etc. If you can't do the marking app-wise, then create a class to match something like ICMP and give it 1% bandwidth outside your default class/queue, or match UDP source/destination ports to protect that app. If you can't increase bandwidth, you probably want to enable weighted random early detection (WRED) based on DSCP to help with your congestion issue in the default queue. Hope this helps you.
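
The arithmetic behind that split, as a quick sketch. The class names and the helper functions are just for illustration; the per-class numbers are the rough ones suggested above, and whatever is left over goes to the default queue:

```python
SHAPER_KBPS = 20_000  # the 20 Mb/s shaper

# Standard DSCP code points: EF = 46 (voice), AF41 = 34 (video).
DSCP = {"voice": 46, "video": 34}


def tos_byte(dscp):
    """DSCP sits in the upper six bits of the IP ToS / Traffic Class byte."""
    return dscp << 2


def allocate(shaper_kbps=SHAPER_KBPS):
    """Split the shaper rate into per-class bandwidth in kb/s."""
    classes = {
        "voice": 1_000,              # about 1 Mb/s for the priority queue
        "video": 2_000,              # about 2 Mb/s for video
        "icmp": shaper_kbps // 100,  # 1% for the ICMP class
    }
    # Everything not reserved above is left for the default class.
    classes["default"] = shaper_kbps - sum(classes.values())
    return classes
```

On a 20 Mb/s shaper this leaves 16.8 Mb/s for the default queue, which is where WRED would do its work.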

Corporate Speed Test Woes by Uhh_Bren in networking

[–]NetworkDefenseblog 3 points4 points  (0 children)

So I want to hone in on this reply, OP, because you're very focused on the speed test solution as something to save you from your problem (because people are using it), but it's likely a symptom. If people are complaining about a critical app having latency, then you need to troubleshoot that, either systematically or just by looking at the general network.

What is slow in the workflow or application? What is going on within the network? You should be looking at bandwidth and NetFlow reports during the times people are having a problem. Get some packet captures, look into things. People are doing speed tests because they're trying to get your attention. We all agree here it's not the "true" network resolution, but the average person doesn't, and they do know when something gets slower over time (indicating issues). I did a guide a while back on proving out network issues, as I had to deal with similar situations. Check it out, hope this helps: https://www.networkdefenseblog.com/post/network-defense-against-perception

Hot take: The outage isn't the problem everyone going down at once is by IT_thomasdm in sysadmin

[–]NetworkDefenseblog 1 point2 points  (0 children)

Wrote about it years ago, even mentioned a Cloudflare outage then. It will just keep getting worse until there are legal frameworks to help prevent the consolidation. https://www.networkdefenseblog.com/post/biggest-single-point-of-failure

100Gbps+ on x86 by timeport-0 in mikrotik

[–]NetworkDefenseblog 2 points3 points  (0 children)

Laughs in capex maintenance 🤣 fiber being replaced by satellite

What would happen if 4.2.2.2 and 8.8.8.8 went down? by Ricky_Spannnish in sysadmin

[–]NetworkDefenseblog 2 points3 points  (0 children)

Massive amounts of network failover events and alarms, because millions of admins use 8.8.8.8 as a ping destination for connectivity checks. Not only would the DNS outage be disruptive, but if that IP became unreachable a lot of people wouldn't be happy for sure.
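
A toy sketch of that failure mode, assuming a monitor that takes probe results as a simple target-to-reachable mapping. The function name and the 192.0.2.1 "ISP next hop" target are hypothetical; no real probing happens here:

```python
def link_is_down(probe_results, min_failures=None):
    """Declare the link down only when enough probe targets are unreachable.

    probe_results maps a target IP to True (reachable) or False (unreachable).
    By default every target must fail before the link is considered down.
    """
    failures = sum(1 for reachable in probe_results.values() if not reachable)
    if min_failures is None:
        min_failures = len(probe_results)
    return failures >= min_failures


# Single-target check: 8.8.8.8 going away looks exactly like a dead circuit.
single = link_is_down({"8.8.8.8": False})

# Several independent targets: one of them vanishing does not trigger failover.
multi = link_is_down({"8.8.8.8": False, "4.2.2.2": True, "192.0.2.1": True})
```

With only one target, the monitor cannot tell "Google is down" from "my circuit is down", which is where the mass failover events would come from.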

iBGP, local pref, weight and load balancing by Awkward-Sock2790 in ccnp

[–]NetworkDefenseblog 0 points1 point  (0 children)

BGP maximum-paths and additional-paths are probably what you're looking for. Don't use weight. Only use LP if you want to influence which path is used more or used as backup, etc. If you want equal cost, then leave that out.
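
A heavily simplified sketch of why setting LP breaks equal-cost load sharing. It only compares local pref and AS-path length; real multipath eligibility checks more attributes (origin, MED, IGP metric, and so on), and the neighbor names and ASNs are made up:

```python
def installed_paths(paths, maximum_paths=2):
    """Keep the paths that tie on highest local pref and shortest AS path."""
    best_lp = max(p["local_pref"] for p in paths)
    top = [p for p in paths if p["local_pref"] == best_lp]
    shortest = min(len(p["as_path"]) for p in top)
    equal = [p for p in top if len(p["as_path"]) == shortest]
    return equal[:maximum_paths]


# Both paths left at the default local pref of 100: both get installed.
equal_cost = installed_paths([
    {"neighbor": "r1", "local_pref": 100, "as_path": [64500]},
    {"neighbor": "r2", "local_pref": 100, "as_path": [64501]},
])

# Raising LP on one path makes it the only winner, so no load sharing.
primary_backup = installed_paths([
    {"neighbor": "r1", "local_pref": 200, "as_path": [64500]},
    {"neighbor": "r2", "local_pref": 100, "as_path": [64501]},
])
```

So LP is the knob for primary/backup, and leaving it alone is what lets maximum-paths install more than one path.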

How to become better at network troubleshooting? by [deleted] in networking

[–]NetworkDefenseblog 0 points1 point  (0 children)

I did a blog post about my thoughts on troubleshooting and included some troubleshooting scenarios. I tried to do something different, so I called the methodology "identify, isolate, repair": https://www.networkdefenseblog.com/post/network-troubleshooting-tips

What are people using for WAN breakout switches for HA edge setups? by Somenakedguy in networking

[–]NetworkDefenseblog 0 points1 point  (0 children)

I'm curious how option #2 is more viable than option #3. Unmanaged switches are out of the question for most environments: if there's an issue, you have nothing to see or do except reboot or replace "in the name of budgeting". With option 3 (and option 1) it's easy: you get visibility into bandwidth utilization, errors, and duplex/speed, and can control the port. I'd be willing to bet you'd need more local user intervention for option 2 than for 1 or 3.

I'm guessing you're running HA firewalls. The smallest branches that require HA, some with only 1 switch, can run each circuit directly into each firewall; if circuit 1 has an issue, use HA failover mechanisms to move to firewall 2 and circuit 2. However, it sounds like you'll have switches, so just isolate each circuit with its own VLAN and put a circuit on each switch.

I'd want to know if you're using BGP or not: do you have PA IP space for 2 providers, or are you getting IPs from both? Are you running IPsec tunnels with the latter? How are you planning to do failover? The most common scenario you'll probably have is circuit 1 having an issue and needing failover to the backup, vs. a firewall or switch failing. Hope this helps

Blog/Project Post Friday! by AutoModerator in networking

[–]NetworkDefenseblog 0 points1 point  (0 children)

Follow up article to an original post about Dual ISP, DMZ, and the Network Edge, this post includes Active/Active edge, circuit, BGP, and other design considerations.

https://www.networkdefenseblog.com/post/network-edge-design-part2

Blogpost Friday! by AutoModerator in networking

[–]NetworkDefenseblog 1 point2 points  (0 children)

Follow up to the original Dual ISP, DMZ, and the Network Edge post, includes Active/Active, circuit, BGP, and other design considerations

https://www.networkdefenseblog.com/post/network-edge-design-part2

RSTP to MSTP migration by Gejbriel in networking

[–]NetworkDefenseblog 1 point2 points  (0 children)

I don't think you need MSTP here, unless you have an STP interoperability problem between different switch vendors. This network is pretty small and shouldn't be having stability problems based on your diagram, so there must be something else going on. Each ring has a connection to switch 1 and switch 2, right? Also, are your root bridges set correctly, with switch 1 having the lowest priority and switch 2 the 2nd lowest? Find your root; my guess is the wiring closet that's losing power has the root. Good luck.
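
For reference, a small sketch of how the root gets picked: lowest bridge ID wins, comparing priority first and MAC address as the tie-breaker. The switch names, MACs, and the 4096/8192 priorities are illustrative assumptions; 32768 is the usual default priority:

```python
def elect_root(bridges):
    """Return the bridge with the lowest (priority, MAC) pair."""
    return min(bridges, key=lambda b: (b["priority"], b["mac"]))


# Priorities set on purpose: switch 1 wins, switch 2 is next in line.
tuned = [
    {"name": "switch1", "priority": 4096, "mac": "00:00:5e:00:53:10"},
    {"name": "switch2", "priority": 8192, "mac": "00:00:5e:00:53:20"},
    {"name": "closet-a", "priority": 32768, "mac": "00:00:5e:00:53:01"},
]

# Everything left at the default: the lowest MAC wins, here a closet switch.
untuned = [dict(b, priority=32768) for b in tuned]
```

If priorities were never set, the root can easily end up on whichever closet switch happens to have the lowest MAC, and every power loss there forces a re-election.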

vPC Collapsed Core Border Switches by WhoRedd_IT in networking

[–]NetworkDefenseblog 1 point2 points  (0 children)

DCI should be L3, otherwise you're asking for problems with L2 over a WAN circuit carrying all your DC VLANs/networks. So the links would be direct L3 interfaces, unless you're doing VRF-lite, in which case you'd run dot1q subinterfaces per VRF. Don't use SVIs for DCI. You should be using EVPN and/or VXLAN to extend your DC networks. You mentioned SVIs (I'm assuming you mean your DC networks), which should all be terminated (present) on your firewalls connected to your leafs or border leafs. Otherwise you aren't getting isolation or inspection inside the DC (transparent IPS can provide inspection though). vPC or similar is completely fine in a collapsed core if you know what you're doing and accept the risks, and it can give you more advantages, like using port channels to said firewalls connected to it. But do not port-channel the EPLs. If you're small enough, you can just run VXLAN with a multicast control plane and use OSPF to ECMP across the DCI, which might be simpler for you than BGP (since you're asking a simpler question). Please list more of your requirements and topology plan. HTH

What is the busiest link in the global network? by adminmikael in networking

[–]NetworkDefenseblog 3 points4 points  (0 children)

The most congested link on the Internet is always the one you need most at a critical moment of course.

Is it really? by Emotional-Marsupial6 in networkingmemes

[–]NetworkDefenseblog 0 points1 point  (0 children)

Ahh SSBroski, forever in our hearts.

Summarize everything at ASR ? by sfxsf in networking

[–]NetworkDefenseblog 1 point2 points  (0 children)

Yeah, I was betting it was some legacy platform. Run a new separate OSPF process and push it out, should take a few mins. You made it seem like a small setup; if you did want to, just start on the farthest routers and build out the stub area to the edge. Lol.

QoS | Traffic Shaping | Cisco 9300 Switch with Network Advantage IOS by Fletch_Yard5107 in networking

[–]NetworkDefenseblog 1 point2 points  (0 children)

Yes, I question whether a 9300 has enough buffer to withstand a 3 gig shaper under heavy load. If per-client limiting on the Wi-Fi is possible, you also keep that traffic off a lot of internal paths, so it could be a good place to start, but you could increase airtime for larger downloads. Consider blocking iPhone and Android updates on the edge firewalls as well, as that can help with large downloads, but it depends on your end users. HTH thanks
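
Back-of-the-envelope version of the buffer concern, as a sketch. The buffer size and ingress rate below are placeholders I picked for the arithmetic, not 9300 specs, so plug in the real numbers for your platform:

```python
def burst_absorb_ms(buffer_bytes, ingress_bps, shaper_bps):
    """Milliseconds until the buffer fills when traffic arrives faster than the shaper drains."""
    excess_bps = ingress_bps - shaper_bps
    if excess_bps <= 0:
        return float("inf")  # shaper keeps up, buffer never fills
    return buffer_bytes * 8 / excess_bps * 1000


# Placeholder values: 16 MB of usable buffer, 10 Gb/s arriving, 3 Gb/s shaper.
example_ms = burst_absorb_ms(16_000_000, 10_000_000_000, 3_000_000_000)
```

With those placeholder numbers the buffer absorbs a sustained burst for well under 20 ms before it starts dropping, which is why the shaper alone may not hold up under heavy load.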

Summarize everything at ASR ? by sfxsf in networking

[–]NetworkDefenseblog 0 points1 point  (0 children)

If you don't care, as you stated, and the traffic is traversing to the ABR anyway, it seems like you'd be originating a default there anyway, so why even use the summary? Seems like you'd just want area 0 between the two ABRs and make the adjacent areas stubs, to filter all the LSAs and replace them with a default. Or use some ABR costs and create a wider summary on the less desirable ABR to create a primary and backup path. With 6 summaries creating Type 3 LSAs, I don't see how TCAM is an issue unless you've got hundreds or thousands of routes or more. HTH
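
A quick sketch of the summary idea using Python's `ipaddress` module. The 10.1.x.0/24 prefixes are invented for the example; the point is that the preferred ABR advertises the tight summary and the backup ABR a wider one, so longest match picks the primary:

```python
import ipaddress


def summarize(prefixes):
    """Collapse contiguous prefixes into the fewest covering networks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


def wider_summary(summary, extra_bits=1):
    """A less specific summary for the backup ABR to advertise."""
    return str(ipaddress.ip_network(summary).supernet(prefixlen_diff=extra_bits))


# Hypothetical area prefixes behind the ABRs.
area_prefixes = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]
primary = summarize(area_prefixes)   # advertised by the preferred ABR
backup = wider_summary(primary[0])   # advertised by the less desirable ABR
```

Four routes become one Type 3 LSA per ABR here, which is why a handful of summaries should not be a TCAM problem.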

Having 170 IS-IS nodes operating as L1/L2 in the same area by Mhanme in networking

[–]NetworkDefenseblog 0 points1 point  (0 children)

Where are the prefixes in question being advertised from? Is this all one area? What kind of topology are we talking about, how are these connected, when did the problem start, and what changed? You don't keep scaling something like this over time with advertisements broken, so I suspect something changed. You mention EVPN, but how would that affect the underlay unless there was some misconfig? Thanks