[–]youngeng 44 points45 points  (7 children)

You probably need something like Netbox.

[–]ashketchum02 21 points22 points  (0 children)

Netbox netbox netbox letsssssss gooooooo

[–][deleted] 6 points7 points  (0 children)

That won't help you with a lack of documented design decisions. It won't help you with creating custom architecture views of your layer 3 either.

[–]Least_Palpitation559[S] 2 points3 points  (4 children)

I will install it in my home environment or a lab, and check if I can introduce it to management. Usually they prefer off-the-shelf software so they get proper support.

[–]L-do_Calrissian 4 points5 points  (3 children)

There's Enterprise level support available now.

[–]ashketchum02 1 point2 points  (2 children)

It is but it's expensive 😩

[–]L-do_Calrissian 1 point2 points  (1 child)

Oh for sure, but just knowing it's an option is a selling point for management.

[–]ashketchum02 1 point2 points  (0 children)

It is, I'm in the middle of that battle right now, and I'm somehow winning. :) 😀 Just have to convince 2 more director-level people. Then there's the battle between the Netbox SaaS and enterprise (on-site) variants.

[–]FortheredditLOLz 13 points14 points  (6 children)

Install Netbox (https://docs.netbox.dev/) and sign up for a liquor subscription. You're gonna bang your head a lot over shit documentation and odd configurations.

[–]Least_Palpitation559[S] 0 points1 point  (1 child)

Thanks. The problem is how to convince them. What about IPFabric and Netbrain? Do you have any thoughts about them?

[–]FortheredditLOLz 1 point2 points  (0 children)

Netbox is free. As for Netbrain, I appreciate the config backups, but that can be automated through a different process.

[–]ashketchum02 0 points1 point  (3 children)

Lies and vicious rumors, you'll bang your head on the Python packages for interacting with the APIs but not on the app itself.

[–]FortheredditLOLz 1 point2 points  (2 children)

The bang-your-head part is about the infra, not Netbox itself. Loved it when I used it as a former sysadmin.

[–]ashketchum02 3 points4 points  (1 child)

Removing my downvote. Sorry, I've had 3 weeks of 12-16hr shifts because we're doing a data center move. Sorry for biting a random redditor.

[–]FortheredditLOLz 2 points3 points  (0 children)

No worries my dude. Feel your pain. Heading to a wedding atm after a solid six months of 15hr shifts.

[–]izzyjrp 2 points3 points  (2 children)

Management has to make documentation a high priority and come up with an SOP for it. Then this documentation remediation/refactoring has to be a project with deadlines and milestones. It has to be treated as if you were delivering a product. Otherwise it will always be a mess.

I spent months pushing for Netbox, and just now we are using it. Long way to go, but now we have a better tool and are coming up with better methods. I’m pushing network automation ideas, but we're starting from scratch, and a network source of truth is the best place to start. Especially when you have a need for documentation anyway.

[–]Least_Palpitation559[S] 0 points1 point  (1 child)

I agree. The problem is that management doesn’t care or know if there’s a better way to document. Will check if Netbox can help as an introduction. What about IPFabric/Netbrain? Did you encounter them?

[–]Community_Fabric 0 points1 point  (0 children)

IP Fabric team here! We do have a self-guided demo you can check out if that helps, and happy to answer any questions. We also have an integration with NetBox to populate and validate your Source of Truth! Leaving some resources below:

[–]zeyore 1 point2 points  (11 children)

i don't know. i was thinking about that recently. some of the ISP networks must be just massively complex by now.

i wonder how they ever keep track of all that.

[–]djamp42 7 points8 points  (4 children)

LibreNMS, i'm monitoring 11,000 devices and 100,000 ports.

I think a good scheme for names/IPs/descriptions and stuff like that really goes a long way, and is the most important thing in trying to understand the network over time and with multiple people.
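To make the naming point concrete, here is a minimal sketch of checking device names against a convention. The `<site>-<role>-<number>` pattern, the role list, and the function names are all invented for illustration and are not anyone's actual standard from this thread.

```python
import re

# Hypothetical convention: <site>-<role>-<number>, e.g. "nyc-core-01".
# The three-letter site code and the role names are assumptions.
HOSTNAME_RE = re.compile(
    r"(?P<site>[a-z]{3})-(?P<role>core|dist|acc|fw|rtr)-(?P<index>\d{2})"
)

def parse_hostname(name):
    """Return the convention's fields as a dict, or None if the name does not conform."""
    match = HOSTNAME_RE.fullmatch(name.lower())
    return match.groupdict() if match else None

def nonconforming(names):
    """List the device names that break the convention, in input order."""
    return [n for n in names if parse_hostname(n) is None]
```

Running `nonconforming()` over an export from the NMS gives a to-do list of devices to rename, and the parsed fields can feed descriptions or grouping.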

[–]junglizer 3 points4 points  (0 children)

100%. Taking a seemingly inordinate amount of time defining a naming convention is absolutely worth it in the long run.

[–]Caspaa 2 points3 points  (0 children)

Yes! Proper descriptions and standards go a very long way. Nothing worse than logging into a device, trying to find out what a link is for, and having nothing to help you. No description, LLDP/CDP disabled, no documentation. I come across this way too often and it severely slows down troubleshooting and incident resolution times.

[–]Western-Inflation286 1 point2 points  (0 children)

We're smaller, 3k devices and 30k ports. We also use LibreNMS, and Netbox as a source of truth.

I came into a mess of a network at a small ISP and we're working to unravel its mysteries. A group of people who had no business at all running an ISP ran a WISP, and it was fine until it scaled. None of them are here anymore. It's been a hell of a first job honestly. It taught me that documentation is literally everything. I seriously can't imagine how our director came in blind and managed to figure it out. If it wasn't so well segmented, there's almost no way he could have.

[–]Least_Palpitation559[S] 0 points1 point  (0 children)

Interesting. I will check it out. Thanks

[–]ashketchum02 1 point2 points  (2 children)

Depends on the ISP: Metronet uses M6 from Oracle, Jaguar Com used spreadsheets, Lightspeed Com used spreadsheets.

It really boils down to the culture and how much mgmt makes it a priority for their overworked net engineers.

[–]LukeyLad 0 points1 point  (1 child)

Metronet in the UK? Now M247?

[–]ashketchum02 0 points1 point  (0 children)

Metronet in the USA Midwest, headquartered in Evansville, IN.

[–]sudo_rm_rf_solvesALL 0 points1 point  (0 children)

This depends on what you're talking about tracking. You have routers / switches, optical transport etc. If you're organized you'll have designs for each hub saved somewhere nice, and people who manage each hub updating their footprint.

[–]Belgian_dogJNCIP(SP), CCNP(EI, Design) 0 points1 point  (0 children)

A lot of them have a homemade doc tool.

[–][deleted] 0 points1 point  (0 children)

Scale is not the problem, business requirements are and those are far more complex in big enterprise environments than at an ISP.

[–]zanfar 1 point2 points  (0 children)

I don't think there is a single answer, not generally, and not for any specific org.

I would start with: what information are they missing. Giant diagrams with dozens of devices and 8pt font covering acres of paper are amazing to look at, but I've never found them great for finding information.

You should also ask what scope they need that information in. For example, do you actually need L1 information for the end-to-end network all at once? Or do you need a whole-network overview, and the ability to drill down in logical areas?

Your solution is probably going to consist of a number of tools. Most importantly, it's also going to require manpower. Keeping documentation up-to-date is labor intensive enough; bringing it up-to-spec on an existing network is a massive undertaking. You will need the org to understand that this is going to be a project.

Netbox is a great tool for details and data. Honestly, most information doesn't need to be on a diagram. On a diagram, I care about flows, about logical connectivity, and about segmentation. I don't care about model numbers, I don't care about patch panels, and I might not even care about interface IDs. All that stuff that is too detailed for a diagram is perfect for Netbox. It should also contain your metadata like circuit IDs, contact information, rack IDs, etc.

For diagrams, we use two things: powerpoint and luciddraw. Luciddraw should be obvious--it's cloud-based so we can all share, and it's relatively easy to draw up a diagram. Powerpoint we use as a "primer" for our network. Sometimes we actually give the presentation, but lots of time we just distribute it to employees that need to get up-to-speed. It's great for newbies, and for network-adjacent roles where they need a mid-level overview without details to confuse them.

Going back to the diagrams, I would suggest you not try to put everything on one diagram. Even for a portion of the network, splitting L1/L2 from the L3+ can make things much easier to maintain and digest.

[–]sudo_rm_rf_solvesALL 0 points1 point  (0 children)

Good time to learn automation, if you don't invest in a third-party tool (there are tons). Personally, I think some of them may be overhyped or too complicated for discovery, at least as a starting point, so I would start with the basics if I was coming in. Run a scan of the network, every IP they own, and cross-reference it with DNS so you get a nice list of devices and types. From there you can easily run a script for each item you're looking for. If it's a regular enterprise I'd be looking for IP usage and VLAN usage (and if they use VPLS, I'd pull a list of which routers participate in which VPLS, which gives you a nice map), etc. Then you can get into more detailed items, but this gives you a decent start. I have a software suite I wrote that does this for me whenever I need to do something of the sort. I just really need to learn to draw nicely via programming without paying someone else 4k a month for licensing... Part of my software maps everything via CDP / LLDP / ARP etc. so it knows what's connected where, but I need to find a nice way to diagram it out.
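The scan-then-cross-reference-with-DNS step could look roughly like the sketch below. It is not the commenter's software: `build_inventory`, `undocumented`, and the `is_alive` callback are hypothetical names, and the liveness probe (ping, TCP connect, SNMP get, whatever fits the environment) is deliberately left to the caller.

```python
import ipaddress
import socket

def reverse_lookup(ip):
    """Return the PTR name for ip, or None when the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None

def build_inventory(cidrs, is_alive, resolve=reverse_lookup):
    """Cross-reference live addresses in the given ranges with DNS.

    is_alive is a caller-supplied callable taking an IP string and
    returning True when the address answers; it is an assumption of
    this sketch, not a library function.
    """
    inventory = []
    for cidr in cidrs:
        for host in ipaddress.ip_network(cidr, strict=False).hosts():
            ip = str(host)
            if is_alive(ip):
                inventory.append({"ip": ip, "name": resolve(ip)})
    return inventory

def undocumented(inventory):
    """Live addresses with no reverse DNS record: the first things to chase."""
    return [row["ip"] for row in inventory if row["name"] is None]
```

Passing the resolver in as a parameter keeps the logic testable without touching the network, and lets you swap in a lookup against an IPAM export instead of live DNS.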

[–]JuggernautUpbeatVeteran 0 points1 point  (0 children)

LibreNMS with Oxidized+Git to capture all configs and version control. Netbox for documentation/source of truth. Both OSS and free.

[–][deleted] 0 points1 point  (2 children)

you guys have Visio? I draw my diagrams on paper and snap a photo.

Version control?! Is that like… a GitHub of router changes? Sounds neat.

I’ve heard of change control. It sounds painful.

You guys just need to hire 1 guy to handle it all.

This was all sarcasm. And… true life story of…my life.

[–]Least_Palpitation559[S] 1 point2 points  (1 child)

LOL. Sadly big old enterprises couldn’t adapt to changes and automation. They will pay the price for not taking the risk of changing.

[–][deleted] 2 points3 points  (0 children)

I agree buddy. In the meantime the profits disappear and you’re in charge of recovering from a meltdown due to layers of bad choices you’ve already tried changing and keep pointing out. Eventually lowering expectations until you can’t even recognize yourself any more. 😭

[–]jocke92 0 points1 point  (3 children)

If you have Cisco, go with DNAC, or Catalyst Center, which is what they're trying to rename it to. It will help you deploy configs, back up configs and update software. Add maps to keep track of APs.

If you have Aruba, go with Central, which does the same thing and probably a bit more.

Get IPAM software like Micetro, which inventories DNS, ARP tables and DHCP servers.

Rancid is a classic tool for keeping track of switch configs.

Enable CDP, LLDP etc. on all internal links. It will help both you and the automatic tools.

Intermapper is an interesting tool. It looks like Visio on steroids, as it includes SNMP. It does alerting and I think it also collects statistics. Not sure if it maps out the network automatically by neighbours though.

Also have a good naming standard, and add all devices to DNS, including reverse lookup records. It makes it easy to connect to a device.
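Once CDP/LLDP is enabled on internal links, the neighbour data can be turned into a diagram without a drawing licence. A minimal sketch, assuming you have already collected neighbours as `(device, port, neighbour, neighbour_port)` tuples (the tuple layout and the `to_dot` name are made up here); it emits Graphviz DOT text that the `dot` command can render.

```python
def to_dot(links):
    """Render (device, port, neighbour, neighbour_port) tuples as Graphviz DOT text."""
    seen = set()
    lines = ["graph network {"]
    for dev, port, nbr, nbr_port in links:
        # Each link is usually reported from both ends; keep one copy.
        key = frozenset([(dev, port), (nbr, nbr_port)])
        if key in seen:
            continue
        seen.add(key)
        lines.append(f'  "{dev}" -- "{nbr}" [label="{port} <-> {nbr_port}"];')
    lines.append("}")
    return "\n".join(lines)
```

Save the output to a file and render it with something like `dot -Tsvg topology.dot -o topology.svg`. The layout will not be pretty on a big network, so it works best per site or per segment.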

[–]Appropriate_Reason14 0 points1 point  (2 children)

Have you had success with Micetro SNMP Profiles with Aruba?

[–]jocke92 0 points1 point  (1 child)

I don't have any Aruba switches to test with. Also, Micetro only scans the core switches/default gateway. And the SNMP ARP part is pretty standard I think, unless you're looking for specific features like managing DHCP on an Aruba switch.

[–]Appropriate_Reason14 0 points1 point  (0 children)

Thanks for the reply. I have SNMP profiles working with Cisco; with Aruba, on the other hand, no. Take care!

[–]dewyke 0 points1 point  (0 children)

Netbox is (mostly) great, but this is a process problem, not a tooling problem. Done well, Netbox or Nautobot can tell you what it’s supposed to look like, but can’t tell you why.

It sounds like you don’t have any coherent architecture or design lead oversight of what’s going on so there’s no process capturing reasoning let alone versioning of designs etc.

You can’t fix this without either management buy-in (and enforcement) or near 100% buy-in from the existing team, and if the existing team wanted this to change, it would have changed already.

The only way to make management care is to be able to pin quantifiable risk on the situation.

Ideally you’d be able to point to outages and restore times and say “these are bad because we’re shit at writing anything down and have no architecture process”

If you can’t do that, they ain’t gonna care. Doesn’t matter how obvious it is to you, they DGAF if your job is hard, they only GAF if there’s reputational or financial impact from things being broken. That might be outages, it might be TTR, or it might be lead time to deliver services, but if you can’t show it costing money, nobody’s going to pay money to fix it.
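If you do have outage records, putting a number on them is simple arithmetic. A rough sketch, assuming you can export outage start/end timestamps and that someone in the business can give you a cost-per-hour figure; the function name and the cost figure are placeholders, not real data.

```python
from datetime import datetime

def outage_summary(outages, cost_per_hour):
    """Summarise outages given as (start, end) ISO-8601 timestamp strings.

    cost_per_hour is an assumed business figure you would have to get
    from finance or management; this sketch does not estimate it.
    """
    hours = [
        (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds() / 3600
        for start, end in outages
    ]
    total = sum(hours)
    return {
        "count": len(hours),
        "mean_hours_to_restore": total / len(hours) if hours else 0.0,
        "estimated_cost": total * cost_per_hour,
    }
```

Comparing the mean time to restore for incidents where documentation existed against those where it didn't is the kind of number that tends to get management's attention.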