The Cisco Collector Architecture
At the core of the network operations layer is the Cisco collector — a purpose-built SSH subsystem that maintains authenticated sessions to every managed device in your inventory. The collector does not rely on SNMP polling or RESTCONF endpoints. It opens a direct SSH channel to each device, authenticates using locally stored credentials from an encrypted vault, and drops into an interactive IOS exec session just like an engineer would from a terminal.
Each collector instance manages a pool of persistent SSH connections. When you ask the system a question about a router, it does not spin up a new session from scratch. It picks a warm connection from the pool, sends the appropriate command, reads the output buffer until the prompt returns, and parses the result. Connection pooling keeps response times under two seconds for most show commands, even across inventories of several hundred devices.
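The pooling and read-until-prompt behavior described above can be sketched roughly as follows. This is a minimal illustration, not the collector's actual code: FakeChannel, ConnectionPool, run_command, and the prompt regex are all hypothetical names and assumptions, and a real channel would come from an SSH library rather than canned strings.

```python
import re
from collections import deque

# Assumed prompt shape: hostname, optional (config...) suffix, then '>' or '#'.
PROMPT_RE = re.compile(r"^[\w.\-]+(\([\w\-]+\))?[>#]\s*$")


class FakeChannel:
    """Stand-in for an interactive SSH channel that replays canned output."""

    def __init__(self, chunks):
        self._chunks = deque(chunks)
        self.sent = []

    def send(self, data):
        self.sent.append(data)

    def recv(self):
        return self._chunks.popleft() if self._chunks else ""


class ConnectionPool:
    """Keeps one warm channel per host and reuses it on later requests."""

    def __init__(self, connect):
        self._connect = connect  # callable(host) -> channel
        self._warm = {}

    def get(self, host):
        if host not in self._warm:
            self._warm[host] = self._connect(host)
        return self._warm[host]


def run_command(channel, command, max_reads=100):
    """Send a command and read until the last buffered line looks like a prompt."""
    channel.send(command + "\n")
    buffer = ""
    for _ in range(max_reads):
        chunk = channel.recv()
        if not chunk:
            break
        buffer += chunk
        lines = buffer.splitlines()
        if lines and PROMPT_RE.match(lines[-1]):
            body = lines[:-1]
            if body and body[0].strip() == command:
                body = body[1:]  # drop the echoed command
            return "\n".join(body)
    raise TimeoutError(f"prompt not seen after {command!r}")
```

A production version would likely match the device's known hostname rather than a generic pattern, since an output line ending in '#' could otherwise be mistaken for the prompt.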
Legacy Cipher and KEX Support
Anyone who has managed a fleet of Cisco devices knows the pain: that 3825 running IOS 15.1 in the branch closet does not speak diffie-hellman-group14-sha256. It needs diffie-hellman-group1-sha1 for key exchange and aes128-cbc as the cipher, both of which modern SSH clients reject by default. The Cisco collector handles this transparently. During device onboarding, the collector probes each host's algorithm negotiation and records the strongest mutually supported key exchange, cipher, and MAC algorithm. When it connects later, it presents exactly those parameters. No manual ssh -oKexAlgorithms=+diffie-hellman-group1-sha1 flags. No separate jump hosts for legacy gear. A single collector handles everything from a Catalyst 2960 running 12.2(55) to a Catalyst 9300 on IOS-XE 17.x.
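One way to picture the selection step of that probe: given the algorithm lists a device offers, walk the client's preference list from strongest to weakest and keep the first match in each category. The preference ordering, the pick and build_profile names, and the shape of the offer dictionary are assumptions made for this sketch; only the algorithm names themselves are standard SSH identifiers.

```python
# Client preference lists, strongest first. The ordering is an assumption
# for illustration, not the collector's documented policy.
CLIENT_KEX = [
    "curve25519-sha256",
    "diffie-hellman-group14-sha256",
    "diffie-hellman-group14-sha1",
    "diffie-hellman-group1-sha1",
]
CLIENT_CIPHERS = ["aes256-ctr", "aes128-ctr", "aes256-cbc", "aes128-cbc", "3des-cbc"]
CLIENT_MACS = ["hmac-sha2-256", "hmac-sha1"]


def pick(preferred, offered):
    """Return the first client-preferred algorithm the server also offers."""
    offered_set = set(offered)
    for algorithm in preferred:
        if algorithm in offered_set:
            return algorithm
    return None


def build_profile(server_offer):
    """Record the strongest mutually supported KEX, cipher, and MAC for a host."""
    profile = {
        "kex": pick(CLIENT_KEX, server_offer["kex"]),
        "cipher": pick(CLIENT_CIPHERS, server_offer["ciphers"]),
        "mac": pick(CLIENT_MACS, server_offer["macs"]),
    }
    missing = [name for name, value in profile.items() if value is None]
    if missing:
        raise ValueError(f"no mutually supported algorithm for: {missing}")
    return profile


# Hypothetical offer from a legacy router.
legacy_offer = {
    "kex": ["diffie-hellman-group1-sha1"],
    "ciphers": ["aes128-cbc", "3des-cbc"],
    "macs": ["hmac-sha1", "hmac-md5"],
}
legacy_profile = build_profile(legacy_offer)
```

Storing the result per host means a modern device still negotiates strong algorithms, while weak ones are enabled only for the hosts that need them.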
Why not NETCONF or RESTCONF? Many older IOS images ship without NETCONF subsystem support, and even when it is available, the XML schema coverage is inconsistent across platforms. SSH with screen-scraping is the lowest common denominator. Every Cisco device with a management IP speaks SSH (or at minimum, Telnet, which the collector also supports as a fallback). We chose the transport that works on 100% of deployed gear, not the one that works on the newest 20%.
Natural Language to IOS Command Translation
When an engineer types a request like "check OSPF neighbor status on core-rtr-01," the system does not perform a fuzzy keyword search against a static command map. The LLM interprets the intent, identifies the target device from the inventory by hostname or alias, selects the appropriate IOS command (in this case, show ip ospf neighbor), and dispatches it through the collector. The translation layer understands context: if you follow up with "what about its routing table," the system knows "its" refers to core-rtr-01 and sends show ip route to the same device without you re-specifying the hostname.
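The device-resolution half of this can be sketched without the LLM: resolve a mentioned hostname or alias against the inventory, and fall back to the last device discussed when a follow-up names none. Inventory, Conversation, and target_for are hypothetical names for illustration; intent interpretation and command selection are left to the model and are not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inventory:
    # Maps lowercase hostname or alias to the canonical hostname.
    aliases: dict

    def resolve(self, name: str) -> Optional[str]:
        return self.aliases.get(name.lower())


@dataclass
class Conversation:
    inventory: Inventory
    last_device: Optional[str] = None

    def target_for(self, mentioned: Optional[str]) -> str:
        """Pick the target device for a request, remembering it for follow-ups."""
        if mentioned is not None:
            host = self.inventory.resolve(mentioned)
            if host is None:
                raise LookupError(f"unknown device: {mentioned}")
            self.last_device = host
            return host
        if self.last_device is None:
            raise LookupError("no device in conversation context")
        return self.last_device


inventory = Inventory({"core-rtr-01": "core-rtr-01", "core1": "core-rtr-01"})
conversation = Conversation(inventory)
first = conversation.target_for("Core1")   # "check OSPF neighbor status on core1"
follow_up = conversation.target_for(None)  # "what about its routing table"
```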
This is not a brittle intent-matching engine. The LLM can decompose complex requests into multi-command sequences. Asking "is there an OSPF flap on the WAN edge?" might trigger show ip ospf neighbor, show ip ospf interface, and show logging | include OSPF across multiple devices, with the AI correlating the outputs into a single coherent answer.
Show Command Parsing and Routing Table Analysis
Raw IOS output is human-readable but machine-hostile. The output of show ip route varies between IOS versions, between VRF and global table contexts, and between IPv4 and IPv6 address families. The platform's parsing layer handles all of this. It does not rely on TextFSM templates alone — the LLM itself acts as a flexible parser, extracting structured data from semi-structured output and cross-referencing it against known topology.
When you ask "does core-rtr-01 have a route to 172.16.50.0/24," the system runs show ip route 172.16.50.0 255.255.255.0, parses the result, and returns a direct answer: the route exists via OSPF with next-hop 10.1.0.2, or the route is missing entirely. If the route is missing, the AI can proactively check the OSPF LSDB with show ip ospf database on the advertising router to determine whether the prefix is being originated, filtered, or simply not reachable.
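For the deterministic part of that lookup, a small regex parser is often enough. The sketch below assumes the common IOS layout for a single-prefix lookup; the sample text is illustrative rather than captured from a real device, and exact wording can differ between releases, which is where a more flexible parser earns its keep.

```python
import re

# Illustrative output for a route that is present.
SAMPLE_PRESENT = """\
Routing entry for 172.16.50.0/24
  Known via "ospf 1", distance 110, metric 20, type intra area
  Last update from 10.1.0.2 on GigabitEthernet0/1, 00:12:34 ago
  Routing Descriptor Blocks:
  * 10.1.0.2, from 10.255.0.2, 00:12:34 ago, via GigabitEthernet0/1
      Route metric is 20, traffic share count is 1
"""

# Illustrative output for a route that is absent.
SAMPLE_MISSING = "% Network not in table\n"

ENTRY_RE = re.compile(r"^Routing entry for (\S+)", re.M)
KNOWN_RE = re.compile(r'Known via "([^"]+)"')
HOP_RE = re.compile(r"^\s*\*\s+(\d+\.\d+\.\d+\.\d+)", re.M)


def parse_route(output):
    """Return prefix, source protocol, and next hops, or None if no entry exists."""
    entry = ENTRY_RE.search(output)
    if entry is None:
        return None
    known = KNOWN_RE.search(output)
    return {
        "prefix": entry.group(1),
        "source": known.group(1) if known else None,
        "next_hops": HOP_RE.findall(output),
    }


present = parse_route(SAMPLE_PRESENT)
missing = parse_route(SAMPLE_MISSING)
```

A None result is the cue to go look at the OSPF database on the advertising router.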
OSPF Verification Workflows
OSPF is the most commonly deployed IGP in enterprise networks, and it is also one of the most common sources of outages when adjacencies flap or areas are misconfigured. The platform provides deep OSPF verification that goes beyond checking neighbor state. It can:
- Verify area ID consistency between neighbors using show ip ospf interface on both ends of a link
- Compare hello and dead timers across adjacencies to identify timer mismatches before they cause a flap
- Check for MTU mismatches that prevent adjacencies from reaching FULL state by correlating show ip ospf interface detail with show interface output
- Detect stub/NSSA area misconfigurations by examining show ip ospf and comparing area types across all routers in the area
- Trace LSA propagation through the LSDB to determine why a prefix is or is not reachable from a given vantage point
The Approval Queue: Write Command Governance
Read commands are executed immediately. show, display, ping, traceroute — these are non-destructive and return results in real time. Write commands are a different story. Any command that would modify running-config, alter a routing process, shut down an interface, or change a VLAN assignment passes through the approval queue before execution.
Here is how it works: when the AI determines that fulfilling your request requires a configuration change, it generates the exact IOS command sequence (including the configure terminal entry, the specific commands, and the end), presents it to you in the chat for review, and places it in a pending state. You explicitly approve or reject the change. Only after approval does the collector push the commands into the device's config mode. There is no "auto-apply" mode. There is no way for the AI to bypass the gate.
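The gate can be thought of as a read/write classifier in front of a small state machine. The sketch below is one possible shape under stated assumptions: the verb list mirrors the read commands named above, and ApprovalQueue, ChangeRequest, and the push callable are hypothetical names, not the platform's API.

```python
from dataclasses import dataclass
from enum import Enum

# Read-only verbs taken from the description above. Anything else,
# including abbreviations such as "sh", is treated as a write (fail closed).
READ_ONLY_VERBS = {"show", "display", "ping", "traceroute"}


def is_read_only(command):
    tokens = command.strip().split()
    return bool(tokens) and tokens[0].lower() in READ_ONLY_VERBS


class State(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXECUTED = "executed"


@dataclass
class ChangeRequest:
    device: str
    commands: list
    state: State = State.PENDING


class ApprovalQueue:
    def __init__(self, push):
        self._push = push  # callable(device, commands) that talks to the device
        self.requests = []

    def submit(self, device, commands):
        request = ChangeRequest(device, list(commands))
        self.requests.append(request)
        return request

    def approve(self, request):
        self._require_pending(request)
        request.state = State.APPROVED

    def reject(self, request):
        self._require_pending(request)
        request.state = State.REJECTED

    def execute(self, request):
        # The only path to the device runs through an APPROVED request.
        if request.state is not State.APPROVED:
            raise PermissionError(f"request is {request.state.value}, not approved")
        self._push(request.device, request.commands)
        request.state = State.EXECUTED

    @staticmethod
    def _require_pending(request):
        if request.state is not State.PENDING:
            raise ValueError(f"request already {request.state.value}")
```

Because execute checks the state itself, a caller cannot skip the review step by forgetting to call approve.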
After execution, the system performs post-change verification. If you approved a command to add a network statement to OSPF, the system will automatically run show ip ospf interface and show ip ospf neighbor after the change to confirm the new network is being advertised and adjacencies are stable. If it detects a problem — say, the neighbor on that interface dropped to INIT state — it alerts you immediately and can propose a rollback.
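A post-change check of this kind might parse show ip ospf neighbor and flag anything that is not FULL. The sample output and the not_full helper are illustrative assumptions. Note that on broadcast segments two DROTHER routers legitimately stay in 2WAY with each other, so a real check would need to account for that rather than alert on every non-FULL entry.

```python
import re

# Illustrative output, including a point-to-point neighbor ("FULL/  -").
SAMPLE_NEIGHBORS = """\
Neighbor ID     Pri   State           Dead Time   Address         Interface
10.255.0.2        1   FULL/DR         00:00:35    10.1.0.2        GigabitEthernet0/1
10.255.0.3        1   INIT/DROTHER    00:00:31    10.2.0.2        GigabitEthernet0/2
10.255.0.4        0   FULL/  -        00:00:38    10.3.0.2        Serial0/0/0
"""

NEIGHBOR_RE = re.compile(
    r"^(?P<id>\d+\.\d+\.\d+\.\d+)\s+(?P<pri>\d+)\s+"
    r"(?P<state>[A-Z0-9]+)/\s*(?P<role>\S+)\s+"
    r"(?P<dead>\S+)\s+(?P<addr>\d+\.\d+\.\d+\.\d+)\s+(?P<intf>\S+)",
    re.M,
)


def parse_neighbors(output):
    """Return one dict per neighbor row; the header line does not match."""
    return [match.groupdict() for match in NEIGHBOR_RE.finditer(output)]


def not_full(output):
    """Return (neighbor ID, state, interface) for every neighbor not in FULL."""
    return [
        (n["id"], n["state"], n["intf"])
        for n in parse_neighbors(output)
        if n["state"] != "FULL"
    ]


anomalies = not_full(SAMPLE_NEIGHBORS)
```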
What the AI Actually Sees
Transparency matters. When the platform answers a question about your network, it shows you exactly which commands it ran and on which devices. There are no hidden API calls, no cached data from an hour ago presented as live. Every piece of information is traceable to a specific SSH session, a specific command, and a specific timestamp. If the AI says an interface is up, you can see the show interface output it based that conclusion on.
This traceability is also the backbone of the anti-fabrication engine. The AI is architecturally constrained from making claims about device state without first retrieving live data from the device. If the SSH connection to a device is down, the AI will tell you it cannot reach the device rather than guess at its state. There is no hallucination pathway because the system does not have a "make something up" fallback.
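One way to express that constraint in code is to make evidence a required ingredient of any answer about device state, with an explicit "cannot reach" result when retrieval fails. The Evidence and Answer types and the answer_from_device function below are hypothetical, a sketch of the idea rather than the engine itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    device: str
    command: str
    output: str
    collected_at: datetime


@dataclass(frozen=True)
class Answer:
    text: str
    evidence: Optional[Evidence]  # None only when the device was unreachable


def answer_from_device(device, command, run, summarize):
    """Answer only from freshly retrieved output; never guess when unreachable."""
    try:
        output = run(device, command)
    except ConnectionError:
        return Answer(f"Cannot reach {device}; its current state is unknown.", None)
    evidence = Evidence(device, command, output, datetime.now(timezone.utc))
    return Answer(summarize(output), evidence)
```

Here run stands in for the collector call and summarize for whatever turns raw output into prose; both are passed in so the function has no other source of facts to draw on.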
Multi-Vendor Reality
While the Cisco collector is the most mature integration, the same architectural pattern — persistent SSH, show command parsing, approval-gated writes, post-change verification — applies to the broader network operations roadmap. The collector framework is extensible. Adding a new vendor means defining the prompt patterns, command syntax, and output parsers for that platform. The approval queue and verification loops remain identical regardless of whether the target is an ISR 4431 or a different vendor's equipment.
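A vendor definition along those lines could be as small as a prompt pattern plus a map from logical operations to CLI syntax. The VendorProfile structure and the choice of Junos as the second vendor are assumptions for illustration only, since the roadmap above does not name one; output parsers are omitted for brevity.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorProfile:
    name: str
    prompt: re.Pattern  # how to recognize the end of command output
    commands: dict      # logical operation -> vendor CLI command


CISCO_IOS = VendorProfile(
    name="cisco_ios",
    prompt=re.compile(r"^[\w.\-]+(\([\w\-]+\))?[>#]\s*$"),
    commands={"ospf_neighbors": "show ip ospf neighbor", "routes": "show ip route"},
)

# Hypothetical second vendor, included only to show the extension point.
JUNOS = VendorProfile(
    name="junos",
    prompt=re.compile(r"^[\w.\-]+@[\w.\-]+[>#]\s*$"),
    commands={"ospf_neighbors": "show ospf neighbor", "routes": "show route"},
)

PROFILES = {profile.name: profile for profile in (CISCO_IOS, JUNOS)}


def command_for(vendor, operation):
    """Translate a logical operation into the CLI command for a given vendor."""
    return PROFILES[vendor].commands[operation]
```

The approval queue and verification loop would sit above this layer and call command_for without knowing which vendor is on the other end.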
Practical Scenarios
Day-to-day, engineers use the network operations module for tasks that would otherwise require opening four terminal windows and cross-referencing output manually:
- Morning health checks: "Show me any OSPF neighbors that are not in FULL state across all routers" triggers a sweep of every managed device and returns only the anomalies.
- Trunk troubleshooting: "Is VLAN 150 allowed on the trunk between dist-sw-01 and dist-sw-02?" pulls show interface trunk on both ends and compares the allowed VLAN lists.
- Change validation: After a maintenance window, "verify all OSPF adjacencies are stable and no interfaces are error-disabled" gives a fleet-wide health report in seconds.
- Capacity planning: "What is the current CPU and memory utilization on all 4000-series routers?" gathers show processes cpu and show platform resources across the relevant devices and summarizes the results.
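The trunk comparison in the list above comes down to expanding each side's allowed-VLAN string and testing membership on both. A minimal sketch, assuming the comma-and-range notation that show interface trunk prints (for example "1-100,150,200-300"); the function names and sample lists are illustrative.

```python
def expand_vlans(spec):
    """Expand an allowed-VLAN string such as '1-100,150' into a set of IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part or part.lower() == "none":
            continue
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def vlan_allowed_both_ends(vlan, spec_a, spec_b):
    """True only if the VLAN is in the allowed list on both sides of the trunk."""
    return vlan in expand_vlans(spec_a) and vlan in expand_vlans(spec_b)


# Hypothetical allowed lists from each switch.
dist_sw_01 = "1-100,150,200-300"
dist_sw_02 = "1-149,151-4094"
vlan_150_ok = vlan_allowed_both_ends(150, dist_sw_01, dist_sw_02)
```

In this example VLAN 150 is allowed on the first switch but missing from the second, which is exactly the kind of one-sided misconfiguration that is tedious to spot by eye.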
Every one of these workflows is a conversation. You ask a follow-up question, the system maintains context, and each answer links back to the raw device output. This is AI network troubleshooting built for engineers who need to trust the data, not a dashboard that hides the commands behind a pie chart.