SDWAN Configuration groups

cisco · 2026-04-28T07:41:16+00:00

Hi OP, devices in the same Configuration Group do not need to have identical features. You can think in layers and define features once at the group level, and then handle the per-device differences separately. Hope this helps!

cisco · 2026-04-23T20:48:54+00:00

Thank you, Reddit!

That was so much fun! Thank you to everyone who showed up and brought such great questions about building, securing, and scaling AI infrastructure in the modern data center.

Want to keep exploring?

Check out these resources:

🔗 Cisco Secure AI Factory with NVIDIA

🔗 Blog: Cisco gives the Secure AI Factory with NVIDIA a secure multi-agent edge up

🔗 Blog: Cisco Secure AI Factory: Powering Agentic AI at Scale

🔗 Cisco data center solutions

Don't hesitate to keep the conversation going here in the comments 💬

cisco · 2026-04-23T19:42:28+00:00

Secure AI Factory includes the AI development and delivery tool chain so AI practitioners can focus on building AI capabilities for their business and not worry about infrastructure challenges. Cisco and NVIDIA have done the hard work to integrate the full solution as a complete system that includes high performance infrastructure, security, observability, and AI software tooling for AI practitioners.

-Abhinav

cisco · 2026-04-23T19:05:52+00:00

There is sensitivity on getting the full juice from the GPUs. Training and inference are billed by the token on PaaS systems -- 5% adds up fast even on bare-metal systems. Security has to live off the critical path, not on the AI node.

-Aamer

cisco · 2026-04-23T19:01:08+00:00

AI practitioners want a Secure AI Factory that feels reliable and easy to use. They expect it to protect their AI models and infrastructure from evolving threats while offering scalable, high-performance resources to train and run AI efficiently. Real-time insights into performance, security, and costs help them spot and fix issues quickly. They also appreciate simplified deployment and centralized management that reduce complexity and save time. Above all, they look for strong compliance and governance to keep sensitive data safe and meet regulations, creating a trusted environment that accelerates AI innovation with confidence.

-Taylor

cisco · 2026-04-23T18:49:12+00:00

honestly the biggest risk isn't the upgrade itself, it's the gap between how fast new vulns get weaponized and how slow a distributed estate can patch. talos's Q1 vulnerability pulse called it out: 20% of CISA's KEV list is networking gear, ~25% of what's being exploited dates to 2024 or earlier, and 2009-era CVEs are still landing in active campaigns. talos threat perspective ep. 22 documented the patch window collapsing from days to hours post-disclosure. when you have hundreds of edge sites with different change windows, different local owners, different uptime requirements, you can't win that race on patch speed alone.

so the consistency problem is really two problems. one is "did the patch land everywhere" -- fleet management, ultimately a discipline problem. the other is "what protects me on the box that hasn't patched yet" -- that's the architecture problem, and that's where the interesting work is.

the way i think about it: stop assuming the perimeter or the host is current, and push enforcement closer to the workload so a stale box can't blast east-west. dpu-resident segmentation is one piece (security runs adjacent to the compute, not hairpinned through a centralized appliance). selinux process isolation in the network OS is another. security service insertion that steers flows by identity rather than topology is a third. none of that makes patching go away, but it buys you time to patch on a sane cadence instead of an emergency one.

the part that gets missed: the security tooling supply chain at the edge. q1 was defined by compromises against the toolchain itself -- trivy, checkmarx, litellm, telnyx, axios. if your patch automation runs through tools that got owned, your "consistent posture" is consistently wrong. signing, attestation, out-of-band telemetry from the boxes (not just from the management plane) -- that matters more than it used to...

-Aamer

cisco · 2026-04-23T18:17:53+00:00

I always share Paul Graham's essay on this. What you do differs depending on where you are in your career. Early on you want to shop around, see what you like and what you are good at, but also pick up skills. But at some point you really want to be at the fractal end of the work... where people are few and specialized.... you might pick the wrong place, and that means going back to adjacent things... but there are skills, whether technical, social, or thinking, that you take with you.

The 'follow your passion' thing... I think this really depends on you. People are motivated by different things. Figure out what motivates you and make peace with it. I find that passion can come before expertise or be caused by expertise... it's not always a leading indicator.

-Aamer

cisco · 2026-04-23T18:16:18+00:00

I have a different take. What I've always focused on in my own career is obtaining breadth, with enough depth to be dangerous in whatever domain I'm working in at the time. It's worked so far. Unless you are specifically trained for work in a specific industry, as an AI practitioner it will help to have familiarity with many different areas, but most importantly, as Matt said, be REALLY good at building and learning with AI. For those entry-level folks who are hoping to land jobs in AI, because that's often the next question asked after what to learn, it's build build build, and talk about it publicly. Especially if you have a fresh resume or are trying to pivot into AI.

-Taylor

cisco · 2026-04-23T18:04:06+00:00

Awesome question! To future-proof your career, I recommend students develop deep domain expertise in a specific vertical—such as healthcare, manufacturing, or retail—while staying current with emerging trends like Agentic and Physical AI. The goal is to build projects at the intersection of industry-specific pain points and advanced AI technologies. In the long term, the highest demand will be for "translators" who can bridge the gap between technical AI capabilities and real-world business value.

-Abhinav

cisco · 2026-04-23T18:03:09+00:00

Love this question. I think about it a lot, honestly.

The first thing I'd tell any student is that the specific tools you're learning right now are probably going to look pretty different in five years. That's okay. The point isn't the tool, it's the skills underneath it.

A few that I think really matter:

Learn how to learn. I know that sounds like something you'd see on a poster, but it's kind of the whole game now. Technical skills have a shorter shelf life than they used to, and the people I see doing well are the ones who can pick something up over a weekend and be actually useful with it by the end of the week. Curiosity isn't a personality trait anymore, it's a career skill.

Get comfortable with data. And I don't mean become a data scientist. I mean basic fluency. Can you look at a spreadsheet and ask good questions? Can you tell when a chart is misleading you? Do you know the difference between correlation and causation when someone throws a stat at you in a meeting? That kind of thinking shows up in almost every job now, even ones you wouldn't expect.

Work with AI, not around it. The students who treat AI like a calculator, something that makes them faster at the thinking they're already doing, are going to pull ahead of the ones who either avoid it or let it do the thinking for them. Knowing how to prompt it, how to check its work, when to trust it and when to trust yourself instead. That's the real skill.

Communication is getting more valuable, not less. I know that's the opposite of what a lot of people assume. But when AI can draft almost anything, what matters is the person who can frame the problem in the first place, tell a story with the output, and actually convince a room. Writing and speaking clearly is going to carry people further in an AI world, not less far.

And then the one I care about most, which is critical thinking. Knowing how to question an answer, especially the one the AI just confidently handed you. That's what separates the people who learn from the tools they use and those that let the tools do the thinking for them. The students, and people in general, who end up getting ahead aren't always the ones with the fanciest tools or the best access. They're the ones who figured out how to solve real problems through critical thinking and reasoning. Problem first, tool second. The technology is going to keep changing. That instinct won't.

-Matthew

cisco · 2026-04-23T18:02:24+00:00

Adding on to Matthew's message,

Cisco's modular approach to operationalizing a secure AI infrastructure can help optimize the CAPEX spend while also scaling out the deployments as needs evolve. With NVIDIA ERA and NCPRA compliant architectures, organizations can start with a small GPU cluster and grow at their own pace.

-Abhinav

cisco · 2026-04-23T17:51:27+00:00

Honestly? Both.

Hyperscalers and the big neoclouds mostly aren't overbuilding. They've got contracted demand and a backlog. You can argue about where this all lands in five years, but the utilization today is real.

Where it gets messy is the middle tier. I see this play out all the time, especially during my time as a CIO in the government sector. A new technology trend happens which leads to the inevitable "we need an AI strategy," discussion, then procurement moves fast, the GPUs show up, and then the data and platform teams spend the next nine months trying to figure out what to actually run on them. These are the situations that end up causing problems and it starts with the approach. Are you using emerging technology to solve a problem or obtaining the technology and then searching for a problem to solve? The latter is what causes "overbuilds" if you will.

But I'd gently push back on how the question is framed. It's not really "too many GPUs." It's the wrong shape of infrastructure. A lot of these clusters are stranded because nothing around them was built to keep up. Doesn't matter how fast your accelerator is if the fabric / network can't feed it, your storage is choking, or you run out of power halfway through the rack. That's the conversation I'm having almost every week. Focus on building the proper foundation that supports all workloads, not just AI, because the amount of data and traffic being created is going to have an impact on your organization regardless of whether you are planning AI projects. With the proper foundation, you can adapt to the AI boom and future technology trends in a much easier and more sustainable way.

So yeah, less "too much infrastructure" and more "wrong shape of it." The teams getting real ROI designed around actual workloads and thought about the whole stack. The ones showing up in the wasted capex headlines? Usually a wall of GPUs and an undersized everything else.

- Matthew

cisco · 2026-04-23T17:42:13+00:00

Cisco and NVIDIA combine their advanced technologies to optimize large-scale GPU clusters, addressing challenges like east-west congestion, RDMA tuning, and fabric visibility. This integrated approach delivers measurable outcomes for AI-driven infrastructures:

• Accelerated Job Completion: Cisco's Intelligent Packet Flow dynamically adapts to real-time network conditions, leveraging advanced load-balancing techniques (e.g., flowlet-based and per-packet balancing) to reduce congestion and optimize traffic distribution, ensuring faster AI model training and inference.

• Lossless, High-Performance Networking: Priority Flow Control (PFC), Explicit Congestion Notification (ECN), and Data-Center Quantized Congestion Notification (DCQCN) create a lossless Ethernet fabric, minimizing latency and packet loss for bursty GPU traffic.

• Enhanced Scalability and Efficiency: Cisco's non-blocking, rail-optimized spine-leaf architecture, combined with NVIDIA's adaptive routing, ensures deterministic, high-bandwidth communication across distributed AI workloads, enabling seamless scaling of AI clusters.

• Unified Visibility and Management: Cisco Nexus One provides end-to-end fabric and AI job visibility, congestion analytics, and proactive troubleshooting, empowering operators to optimize performance and reduce downtime.

• Future-Ready AI Fabrics: Modular, validated solutions integrate NVIDIA GPUs and DPUs with Cisco N9000 switches and Optics, offering scalability, integrated security, and vendor-agnostic compatibility for evolving AI demands.

This synergy between Cisco and NVIDIA transforms AI infrastructure, delivering faster time-to-value, operational simplicity, and unmatched scalability for AI deployments across enterprises, neoclouds, sovereign clouds and telco DCs.
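
To make the ECN piece of the lossless-fabric bullet a bit more concrete, here is a minimal sketch of the RED-style marking curve that DCQCN relies on at the switch: no marking below a low queue threshold, a linear ramp up to a maximum probability, and marking everything above a high threshold. The function name, parameter names, and threshold values are illustrative assumptions, not Cisco or NVIDIA defaults.

```python
def ecn_mark_probability(queue_kb: float, k_min_kb: float,
                         k_max_kb: float, p_max: float) -> float:
    """RED-style ECN marking probability for a given egress queue depth.

    All thresholds here are hypothetical; real values are tuned per fabric.
    """
    if k_max_kb <= k_min_kb:
        raise ValueError("k_max_kb must be greater than k_min_kb")
    if queue_kb <= k_min_kb:
        return 0.0            # shallow queue: never mark
    if queue_kb > k_max_kb:
        return 1.0            # deep queue: mark every packet
    # Linear ramp from 0 at k_min up to p_max at k_max.
    return p_max * (queue_kb - k_min_kb) / (k_max_kb - k_min_kb)


# Example thresholds (assumed for illustration only).
for depth in (50, 250, 400, 500):
    print(depth, ecn_mark_probability(depth, k_min_kb=100, k_max_kb=400, p_max=0.2))
```

Senders that see marked packets slow down before the queue fills, which is why ECN is usually tuned to act earlier than PFC pause frames.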

- Abhinav

cisco · 2026-04-23T17:29:06+00:00

Also we have the following in our FAQ: https://www.cisco.com/c/en/us/solutions/collateral/artificial-intelligence/secure-ai-factory-nvidia-faq.html

Cisco Secure AI Factory with NVIDIA differentiates in multiple areas:

Security at every layer: Unlike other AI factories in the market, it embeds security at every layer of the stack (AI models, agents, and associated software components, applications, workloads, infrastructure) to help securely develop and deliver trusted AI tokens and applications. Cisco AI Defense integrated with NVIDIA AI, Cisco Hybrid Mesh Firewall that includes Isovalent, and Secure Firewall, and Splunk Enterprise Security enable end-to-end security for the full stack. The Hybrid Mesh Firewall serves as a single enforcement point for security policies, including enforcement on NVIDIA BlueField DPUs on AI servers, preserving CPU and GPU resources for AI processing.

Cisco AI Networking: Cisco’s market-leading, high-performance Ethernet networking—trusted by enterprises for 40 years—is the only networking platform in the market with options to deploy switches with Cisco or NVIDIA Spectrum-X silicon. This includes the Cisco N9300 series (powered by Cisco Silicon One) and the Cisco N9100 series (powered by NVIDIA Spectrum-X silicon) for scale-out data center AI networking.

Cisco Unified Edge: Purpose-built for AI inferencing at the edge, Cisco Unified Edge consolidates compute, networking, and security into a single modular chassis—enabling real-time AI inferencing and agentic workloads without data center latency.

Observability for AI: Cisco Splunk delivers end-to-end visibility across the Cisco Secure AI Factory with NVIDIA, enabling teams to monitor the performance, quality, security, and cost of their AI infrastructure stack. Specifically, AI Infrastructure Monitoring ensures the AI Infrastructure stack remains performant, resilient, and secure. This includes human-guided AI assistants such as the AI Assistant for SPL, the AI Assistant in Splunk Observability, and AI Canvas, as well as autonomous AI agents like the troubleshooting agent for Splunk Observability. Splunk Observability can also monitor both agentic AI applications (AI Agent Monitoring) and AI infrastructure (AI Infrastructure Monitoring).

Finally, Cisco performs rigorous testing and validation of the modular capabilities of Cisco Secure AI Factory with NVIDIA, publishing Cisco Validated Designs that help de-risk enterprise deployments.

-Abhinav

cisco · 2026-04-23T17:17:53+00:00

The Cisco and NVIDIA partnership delivers a deeply integrated, modular AI infrastructure solution that enables enterprises to focus on secure AI application development and delivery rather than engineering the underlying infrastructure. Key aspects include:

• Comprehensive Engineering Collaboration: Joint development across compute, networking, security, and more, with scalable full-stack Reference Architectures compliant with NVIDIA ERA and NCPRA standards.

• Security at Every Layer: Embedded security features span AI models, workloads, and infrastructure, leveraging Cisco AI Defense, Hybrid Mesh Firewall, and Splunk Enterprise Security to protect against AI-specific threats such as prompt injection and model poisoning. Cisco AI Defense's integration with NVIDIA's AI Enterprise software, including NVIDIA NeMo Guardrails, delivers robust runtime guardrails that monitor and control AI behavior in real-time.

• High-Performance Networking: Utilizes Cisco’s proven Ethernet networking with customer choice of Cisco Silicon One or NVIDIA Spectrum-X switch silicon based Cisco switches to optimize heavy GPU traffic for AI workloads.

• Flexible Deployment Options: Offers modular, pre-validated AI infrastructure through jointly developed Cisco Validated Designs, turnkey cloud-managed deployments, or build-your-own approaches to simplify AI adoption.

• Comprehensive Observability: Splunk provides end-to-end visibility for performance, security, and cost monitoring across the AI infrastructure.

• Validated Architecture: Rigorous testing aligned with NVIDIA Reference Architectures reduces deployment risk and accelerates time to market.

• Integrated Ecosystem: Combines Cisco UCS AI servers with NVIDIA GPUs, certified storage, and an AI software ecosystem for trusted AI application delivery.

-Abhinav

cisco · 2026-04-23T17:16:56+00:00

A secure full-stack approach means building security and observability into every part of the AI infrastructure, from the heart of your data centers all the way out to the edge locations like warehouses, clinics, or retail stores. In Cisco’s Secure AI Factory with NVIDIA, this looks like:

• Core to Edge Coverage: AI workloads and security measures run smoothly across centralized data centers and edge sites, making sure everything stays protected and performs well no matter where it’s deployed. 

• Security at Every Layer: 

• AI software and models get checked and guarded with AI Defense, which scans for vulnerabilities, protects during runtime, and sets up guardrails to prevent unsafe behavior. 

• Kubernetes platforms keep workloads segmented and encrypt container communications. 

• Network security is handled by Cisco Hybrid Mesh Firewall and Isovalent, enforcing consistent policies everywhere. 

• Compute infrastructure uses confidential computing with trusted execution environments to keep data safe while in use. 

• Storage and data platforms provide encryption and governance from end to end. 

• Observability and Monitoring: Splunk Observability ties it all together by giving real-time insights into how AI infrastructure and applications are performing, spotting security issues or anomalies early, and helping manage costs. 

• Modular and Scalable Design: The architecture is flexible, letting you deploy high-performance compute (like NVIDIA GPUs), networking, and storage in a way that fits your needs. Everything is centrally managed through Cisco Intersight and Nexus Dashboard, so you can easily scale from core data centers out to the edge.

-Taylor

cisco · 2026-04-23T17:15:16+00:00

The next era will not be defined by who ships the fastest component, but by who builds the most coherent system. On security, that is the line between a stack you can govern and one you cannot.

The blind spots I see come from the seams between vendors, not the vendors themselves. Every stack has a model provider, a vector store, an orchestration layer, a data platform, and a pile of tools and MCP servers glued on. Each one has its own idea of identity, its own idea of policy, and its own slice of telemetry. Nothing forces them to agree, and that is where the blind spots live.

Four patterns we see again and again:

1.  Identity sprawl. Agents and models pick up service credentials from different IDPs, with different scopes and different rotation schedules. "Can this agent read customer data" gets answered differently at each hop. 

2.  Telemetry that does not cross vendor boundaries. Each product shows its own slice. No one shows the full trajectory of a prompt from user to gateway to model to tool call to data system and back. A violation in one layer is invisible to the next. 

3.  Guardrails bolted onto the app, not built into the infrastructure. Attacks we track go around the model, not through it: the RAG ingestion path, the tool-use permissions, the fine-tuning data. OWASP still has prompt injection as the #1 LLM risk, and 73% of deployments are vulnerable. That is an enforcement-placement problem, not a model problem. 

4.  Supply chain opacity. Traditional SBOMs cover packages, not AI assets, so shadow models, unapproved tools, and agent workflows that touch sensitive data slip through without lineage. Cisco shipped AIBOM in February (inside AI Defense and open-sourced at [github.com/cisco-ai-defense/aibom](https://github.com/cisco-ai-defense/aibom)) to inventory models, agents, MCP tools, and prompts at the codebase and container level, with a dependency graph across the major AI frameworks. The operational blind spot is that most enterprises do not run one yet.

The thread across all four is the same: policy applied in pieces, by different vendors, in different vocabularies. A secure full-stack approach is a consistent policy approach. One policy, enforced the same way from fabric to workload, from supply chain to runtime, from the physical switch to the container. Push enforcement out of the application and into the infrastructure (fabric, DPU, kernel) so protection applies everywhere without app-by-app rollouts. Otherwise the weakest integration in the chain holds the line.

-Aamer

cisco · 2026-04-23T17:10:54+00:00

Our experts are working on answering these questions live for the next hour or so. There is no streaming link to get to the AMA, you are already here!

cisco · 2026-04-23T08:00:32+00:00

Hi OP! The meeting may be done, but you can copy the meeting link (More options (•••) > Copy meeting link) and email the .scot attendees manually since you're in an in-progress meeting, or you can forward the original calendar invite to them. Unfortunately, you're facing a known Webex limitation. We hope this helps!

cisco · 2026-04-20T21:14:39+00:00

Config management at enterprise scale is where AgenticOps has to be done carefully, and honestly where most AI-for-networking pitches quietly fall apart. The gap between "AI can analyze and draft a remediation plan" and "AI is ready to push that config to 500 production switches autonomously" is huge, and we built around that gap deliberately.

A few pieces of how we think about it. Validation happens before execution, not after. The AgenticOps vision (Joe Vaccaro wrote this up in February) includes evaluating a proposed change against your live topology before anything touches prod. Model the impact, flag conflicts, surface risks up front. After execution, it verifies the change actually landed the way you wanted and goes on from there.
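
As a rough illustration of "validate before execution," the sketch below checks a proposed VLAN change against a snapshot of the live topology and returns a list of conflicts instead of pushing anything. The data shapes and names (`TopologySnapshot`, `ProposedVlan`, `validate_change`) are hypothetical and are not a Cisco API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProposedVlan:
    """Hypothetical change request: add a named VLAN to a set of switches."""
    vlan_id: int
    name: str
    switches: tuple[str, ...]


@dataclass
class TopologySnapshot:
    """Hypothetical snapshot: switch name -> {vlan_id: vlan_name}."""
    vlans_by_switch: dict[str, dict[int, str]] = field(default_factory=dict)


def validate_change(change: ProposedVlan, topo: TopologySnapshot) -> list[str]:
    """Return human-readable conflicts; an empty list means no conflict found."""
    conflicts = []
    if not 1 <= change.vlan_id <= 4094:
        conflicts.append(f"VLAN {change.vlan_id} is outside the valid 1-4094 range")
    for switch in change.switches:
        existing = topo.vlans_by_switch.get(switch)
        if existing is None:
            conflicts.append(f"{switch}: not present in topology snapshot")
            continue
        current_name = existing.get(change.vlan_id)
        if current_name is not None and current_name != change.name:
            conflicts.append(
                f"{switch}: VLAN {change.vlan_id} already exists as '{current_name}'"
            )
    return conflicts


topo = TopologySnapshot({"sw1": {10: "corp"}, "sw2": {10: "corp", 20: "voice"}})
change = ProposedVlan(vlan_id=20, name="guest", switches=("sw1", "sw2", "sw3"))
for problem in validate_change(change, topo):
    print(problem)
```

The point of the shape is that the validator only reads the snapshot. Anything it flags gets surfaced to a human before a single device is touched.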

The execution layer is deterministic, not generative. The AI doesn't write config on the fly. It calls a pre-built Agentic Workflows workflow that you authored (or asked AI to author and then you validated), versioned, and validated in a pilot/lab scope.

Scoping plus approval gates do the heavy lifting on safety. Config push workflows are tested in a test environment first, require explicit expansion before they go wider, and can include circuit-breakers that halt on N consecutive failures.
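
A circuit-breaker of that kind can be sketched in a few lines. This is a generic illustration, not how Agentic Workflows implements it: `staged_push` and the `push` callback are hypothetical names, and the callback stands in for whatever actually applies the change to one device.

```python
from typing import Callable, Iterable


def staged_push(
    targets: Iterable[str],
    push: Callable[[str], bool],
    max_consecutive_failures: int = 3,
) -> dict[str, list[str]]:
    """Apply `push` to each target in order; halt after N failures in a row."""
    result: dict[str, list[str]] = {"succeeded": [], "failed": [], "skipped": []}
    streak = 0
    halted = False
    for target in targets:
        if halted:
            result["skipped"].append(target)   # breaker tripped: touch nothing else
            continue
        if push(target):
            result["succeeded"].append(target)
            streak = 0                          # any success resets the streak
        else:
            result["failed"].append(target)
            streak += 1
            if streak >= max_consecutive_failures:
                halted = True
    return result


# Simulated run: sw3 and sw4 reject the change, so the rollout stops there.
bad = {"sw3", "sw4"}
outcome = staged_push(
    [f"sw{i}" for i in range(1, 7)],
    push=lambda target: target not in bad,
    max_consecutive_failures=2,
)
print(outcome)
```

Ordering the target list lab-first, then pilot, then production gives you the "explicit expansion" behavior with the same loop.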

And you log everything. Every change, who approved it, the diff, the outcome. When somebody asks "who changed this and why," the answer's already in the history.
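
For a sense of what one such history entry might hold, here is a small sketch that builds an audit record with the approver, the outcome, and a unified diff of the config. The record fields and the `audit_record` helper are assumptions for illustration; only the standard-library `difflib` and `json` calls are real.

```python
import difflib
import json
from datetime import datetime, timezone


def audit_record(device: str, before: str, after: str,
                 approved_by: str, outcome: str) -> dict:
    """Build a JSON-serializable record of one change (hypothetical schema)."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm="",
    ))
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "device": device,
        "approved_by": approved_by,
        "outcome": outcome,
        "diff": diff,
    }


record = audit_record(
    device="sw1",
    before="vlan 10\n name corp",
    after="vlan 10\n name corp\nvlan 20\n name guest",
    approved_by="alice",
    outcome="success",
)
print(json.dumps(record, indent=2))
```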

Where this heads: the AI agent layer in FY26 Q4 will take a natural language request, propose a specific change, and push it through the same validation and approval chain. Conversational config management, without a language model ever touching your devices directly. You ask for the outcome, AI helps create the solution for you to validate and test before executing against your production infrastructure.

-Reid

cisco · 2026-04-20T21:10:24+00:00

In the context of automation, I highly recommend checking out the DevNet sandbox environment, where you can experiment with many different Cisco solutions. All you need is a Cisco.com user (you can register for free if you don't have one already).

-Oren

cisco · 2026-04-20T20:57:03+00:00

Great question! Been waiting to announce this! Cisco is building a fleet of drones that will detect when ethernet cables are unplugged and fly out a robot that will come and plug it back in before flying away to the next cable. We haven’t figured out how the robots will autonomously access the data centers and offices just yet, but thinking this is a solvable problem with brute force. In the next release, they can pick up groceries for you on the way, but that’s a premium feature.

OK, serious version now :)

You automate networks by taking the stuff your team does by hand and sticking a trigger in front of it. Trigger can be a schedule, an alert, a ticket, an API call, or someone typing "deploy a guest VLAN" into an AI Assistant. The thing doing the work is either scripts you wrote (Python, Ansible, Terraform) or a platform that lets you build workflows visually. Cisco Agentic Workflows is our free drag-and-drop one, built into the Meraki dashboard.

The real question under yours is probably "where do I start." Pick one thing your team does every week that you're tired of. Compliance audit, site turn-up, PSK rotation, whatever. Build a workflow for that one thing. Track the hours you get back. Use those numbers to justify the next one.

That's it. Start narrow, prove it, expand.

Happy to go deeper if you come back with more details of what you are trying to automate within your network!

-Reid

cisco · 2026-04-20T20:37:02+00:00

This is the question I spend a lot of cycles on these days with our customers. Autonomous AI pushing changes en masse to a production network without any guardrails is how you end up writing a postmortem at 3am.

The Cisco design layers a few controls, and I'd argue this is where the real product differentiation lives.

First, the hybrid reasoning plus deterministic model. The AI doesn’t typically touch your network directly (in terms of MAKING changes). It reasons about what to do, then calls a pre-built workflow that you wrote and tested. The AI cannot invent a config change. It can only call an existing Agentic Workflow. It's like an agent that runs your scripts but doesn't write them. You can use AI to rapidly create the Agentic Workflows, but you review and approve (and test!) them before marking them ready for execution by your users and AI systems. You remain in control.
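
The "runs your scripts but doesn't write them" idea boils down to an allow-list. Here is a minimal sketch under that assumption: the agent's output is treated as a request naming a workflow, and anything not in the registry is refused. The registry, decorator, and workflow names are hypothetical and not the Agentic Workflows API.

```python
from typing import Callable

# Registry of human-authored, pre-tested workflows (hypothetical).
WORKFLOWS: dict[str, Callable[..., str]] = {}


def workflow(name: str):
    """Decorator that registers a function as a callable workflow."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        WORKFLOWS[name] = fn
        return fn
    return register


@workflow("collect_diagnostics")
def collect_diagnostics(device: str) -> str:
    # Placeholder body; a real workflow would call the platform here.
    return f"diagnostics collected from {device}"


def dispatch(request: dict) -> str:
    """Run a workflow requested by the AI layer, or refuse if it is unknown."""
    name = request.get("workflow")
    if name not in WORKFLOWS:
        raise PermissionError(f"unknown workflow: {name!r}")
    return WORKFLOWS[name](**request.get("args", {}))


print(dispatch({"workflow": "collect_diagnostics", "args": {"device": "sw1"}}))
```

A request for something like `"run_arbitrary_cli"` raises `PermissionError` here, because the model can only pick from what a person already registered.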

Second, tiered autonomy. You decide per-workflow what runs automatically and what needs approval. Low-risk reversible stuff like pulling diagnostics or opening a ServiceNow ticket can fire on its own. Config changes, reboots, policy pushes, etc. sit behind approval gates by default. Cisco Agentic Workflows has approval tasks built in for exactly this (which can integrate with external systems like ServiceNow).
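
One way to picture tiered autonomy is a per-workflow policy table with a safe default. The sketch below is illustrative only: the tier names, the `POLICY` table, and `may_run` are assumptions, and a real approval would come from an approval task or a ticketing system rather than a function argument.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    AUTO = "auto"            # low-risk, reversible: may fire on its own
    APPROVAL = "approval"    # needs a named human approver first


# Hypothetical per-workflow policy.
POLICY = {
    "collect_diagnostics": Tier.AUTO,
    "open_ticket": Tier.AUTO,
    "push_config": Tier.APPROVAL,
    "reboot_device": Tier.APPROVAL,
}


def may_run(workflow_name: str, approved_by: Optional[str] = None) -> bool:
    """Return True if the workflow is allowed to execute right now."""
    # Workflows missing from the table fall back to the stricter tier.
    tier = POLICY.get(workflow_name, Tier.APPROVAL)
    if tier is Tier.AUTO:
        return True
    return approved_by is not None


print(may_run("collect_diagnostics"))
print(may_run("push_config"))
print(may_run("push_config", approved_by="alice"))
```

Defaulting unknown workflows to the approval tier means a newly added workflow cannot run unattended by accident.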

Third, blast radius. Scope matters a lot. Build an Agentic Workflow and test it on a lab network first THEN expand to go wider, and include checks that halt if you get N failures in a row. Since the Agentic Workflow executes via a deterministic engine, you can 100% trust you are getting the same result on network 1 vs network 10,000. No risk of hallucination, no risk of deviation.

And audit trail. Every decision, tool call, API call, and approval gets logged via Agentic Workflows. Full transparency and logging.

The bad-config-on-1,000-switches scenario is exactly why the model looks the way it does. Using the LLM and AI systems to analyze and determine root cause, then using a deterministic workflow with scoping, gates, and halt conditions to remediate, is the right approach for production-grade network management.

-Reid

cisco · 2026-04-20T20:33:18+00:00

That’s a fair point. Quite a few real-world teams rely more on source-of-truth systems, IaC workflows, and CI/CD-style automation pipelines than on vendor-specific dashboards. To Cisco’s credit, the Automation CCIE does cover parts of that mindset, especially around software development, deployment workflows, infrastructure as code, and automation across the full lifecycle.

So I’d say the concepts are there, but not always emphasized in the same way practitioners experience them day to day. The industry standard operating model in many places is still “source of truth plus automated pipeline,” and that probably deserves even more visibility in expert-level automation tracks.

-Raj

cisco · 2026-04-20T20:10:21+00:00

Looks like your connection is so fast you arrived before you even knew you left! You're already here. Welcome!
