The Challenge

“Configure OSPF on this router” sounds simple. It is one sentence, one clear intent. But in practice, engineers never say it the same way twice. We hear “add this subnet to OSPF,” “make sure R3 is in area 0,” “set up routing between these two interfaces,” “put the 10.0.3.0 network into the OSPF process,” and “enable OSPF on the link facing the distribution switch.” Every one of those requests maps to some variation of router ospf and a network statement, but the phrasing is wildly different each time.

Building a natural language interface for Cisco IOS means solving this mapping problem at scale — across hundreds of IOS command families, across show commands and configuration commands, across devices running everything from IOS 12.2 to IOS-XE 17.x. We could not use a static lookup table. The system needed to genuinely understand what the operator was asking and translate that intent into syntactically correct IOS commands. Here is how we built it.

Intent Mapping: Read vs. Write

The first and most critical classification the system makes is whether a request is a read operation or a write operation. This distinction drives everything downstream — execution path, approval requirements, and risk level.

Read operations are show commands. They are non-destructive, they do not modify running-config, and they can be executed immediately without human approval. When someone says “check the routing table,” “what are the OSPF neighbors,” or “is the WAN interface up,” the system identifies the intent as informational and maps it to the appropriate show command: show ip route, show ip ospf neighbor, show interface GigabitEthernet0/0.

Write operations are configuration commands: anything that enters configure terminal mode and changes device state. Examples include “Add a static route to 10.5.0.0/16,” “shut down interface Gi0/2,” and “change the OSPF hello timer to 5 seconds.” These are flagged immediately. The system generates the full command sequence but does not execute it; the sequence goes into the approval queue. We will get to that.

The intent classifier is not a keyword match. “Show me the OSPF config” is a read (“show” is informational, and the request maps to show running-config | section router ospf). “Set up OSPF on this router” is a write. “OSPF” appears in both sentences, so matching on the feature name alone cannot separate them. The LLM understands the difference because it reasons about the full request, not individual tokens.
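Whatever the classifier decides, the generated commands themselves can be checked before anything runs. The following is a minimal sketch of such a guard, not the system's actual implementation. The function name route_for and the show-only pattern are hypothetical.

```python
import re

# Assumption for this sketch: only commands that start with "show" count as reads.
READ_ONLY = re.compile(r"^show\b", re.IGNORECASE)


def route_for(commands):
    """Return "execute" only if every generated command is read-only.

    Anything else, including an empty list, goes to the approval queue.
    """
    cleaned = [c.strip() for c in commands if c.strip()]
    if cleaned and all(READ_ONLY.match(c) for c in cleaned):
        return "execute"
    return "approval_queue"
```

A request classified as a read whose generated commands include configure terminal would still land in the approval queue under this check.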

The SSH Layer

Natural language translation is useless if you cannot actually reach the device. The Cisco collector establishes SSH sessions to every managed router and switch, and this is where real-world networking gets ugly.

Modern SSH clients default to strong algorithms: aes256-gcm and chacha20-poly1305 for encryption, curve25519-sha256 for key exchange. That is great for your Catalyst 9300 running IOS-XE 17.9. It does not work on the ISR 2811 in the branch closet running IOS 15.1(4)M, which tops out at aes128-cbc and needs diffie-hellman-group1-sha1 for key exchange. Modern OpenSSH will refuse to connect to these devices without explicit override flags.

Our collector handles this transparently. During device onboarding, it probes each host's SSH capabilities and records the highest mutually supported key exchange algorithm, cipher, and MAC. When it connects later, it presents exactly those parameters. No manual -oKexAlgorithms=+diffie-hellman-group1-sha1 -oHostKeyAlgorithms=+ssh-rsa flags. No separate jump hosts for legacy gear. A single collector handles the entire fleet, from a 2960 on 12.2(55) to a 9500 on 17.x, negotiating 3des-cbc with the old gear and aes256-gcm with the new.
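The selection step described above can be sketched as a preference-ordered intersection. This is an illustration under stated assumptions, not the collector's code: the preference lists, pick, and build_profile are hypothetical, and the probe that discovers what a device offers is left out. MAC selection would follow the same pattern.

```python
# Preference order, strongest first. These lists are illustrative assumptions.
KEX_PREFERENCE = [
    "curve25519-sha256",
    "diffie-hellman-group14-sha1",
    "diffie-hellman-group1-sha1",
]
CIPHER_PREFERENCE = [
    "aes256-gcm@openssh.com",
    "aes256-ctr",
    "aes128-cbc",
    "3des-cbc",
]


def pick(preference, offered):
    """Return the first preferred algorithm that the device also offers."""
    offered_set = set(offered)
    for name in preference:
        if name in offered_set:
            return name
    raise ValueError("no mutually supported algorithm")


def build_profile(offered_kex, offered_ciphers):
    """Record the per-device parameters to present on later connections."""
    return {
        "kex": pick(KEX_PREFERENCE, offered_kex),
        "cipher": pick(CIPHER_PREFERENCE, offered_ciphers),
    }
```

Storing the result per device means the weak algorithms are only ever offered to hosts that cannot negotiate anything stronger.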

Connection management follows an on-demand model with keep-alive. The collector opens a session when a device is first queried, holds the connection warm with periodic keep-alive packets, and tears it down after an idle timeout. For inventories under a few hundred devices, this keeps response times under two seconds for most show commands while avoiding the overhead of holding a persistent session open to every device, including those that might never be queried.
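A minimal sketch of the on-demand model, assuming a caller-supplied connect function that returns an object with a close method. SessionPool, the 300-second default, and the injectable clock are hypothetical choices for illustration. Keep-alive traffic is not modeled here.

```python
import time


class SessionPool:
    """Open sessions on first use and close them after an idle timeout."""

    def __init__(self, connect, idle_timeout=300.0, clock=time.monotonic):
        self._connect = connect            # callable: host -> session with .close()
        self._idle_timeout = idle_timeout  # seconds; illustrative default
        self._clock = clock                # injectable so tests can control time
        self._sessions = {}                # host -> (session, last_used)

    def get(self, host):
        """Return a warm session for host, connecting only if none exists."""
        entry = self._sessions.get(host)
        session = entry[0] if entry else self._connect(host)
        self._sessions[host] = (session, self._clock())
        return session

    def reap_idle(self):
        """Close sessions idle for at least idle_timeout; return their hosts."""
        now = self._clock()
        closed = []
        for host, (session, last_used) in list(self._sessions.items()):
            if now - last_used >= self._idle_timeout:
                session.close()
                del self._sessions[host]
                closed.append(host)
        return closed
```

Calling reap_idle on a timer keeps the number of open sessions proportional to the devices actually being queried.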

Command Translation in Practice

Abstract architecture is less useful than concrete examples. Here is what natural language to IOS translation actually looks like in the system.

tli-platform
you → How are my routers doing?
 
[LIVE SSH] core-rtr-01# show version | include uptime
core-rtr-01 uptime is 142 days, 7 hours, 23 minutes
[LIVE SSH] core-rtr-01# show processes cpu | include CPU
CPU utilization for five seconds: 8%/2%; one minute: 7%; five minutes: 6%
[LIVE SSH] core-rtr-01# show ip ospf neighbor | count FULL
Number of lines which match regexp = 4
 
[LIVE SSH] wan-rtr-01# show version | include uptime
wan-rtr-01 uptime is 142 days, 7 hours, 19 minutes
[LIVE SSH] wan-rtr-01# show processes cpu | include CPU
CPU utilization for five seconds: 42%/12%; one minute: 38%; five minutes: 35%
 
AI → Both routers are online. core-rtr-01 is healthy: 142 days uptime, 6% CPU, 4 FULL OSPF adjacencies. wan-rtr-01 has elevated CPU at 35-42%. One-minute and five-minute averages are both at or above 35%, which suggests sustained load rather than a transient spike. Want me to pull the top processes on wan-rtr-01?

The vague question “how are my routers?” decomposed into show version and show processes cpu on every managed router, plus show ip ospf neighbor on core-rtr-01. The AI chose those commands because they cover the fundamentals: is it reachable, how long has it been up, is the CPU healthy, are routing adjacencies stable. A more specific question produces more targeted commands.
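The "sustained load rather than a transient spike" reasoning in the transcript can be made explicit. The sketch below parses the CPU line exactly as it appears above and compares the two longer averages against a threshold. The names parse_cpu and is_sustained and the 30% threshold are assumptions for illustration.

```python
import re

CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu(line):
    """Extract the four percentages from a 'show processes cpu' summary line."""
    match = CPU_LINE.search(line)
    if match is None:
        raise ValueError("not a CPU utilization line")
    five_sec, interrupt, one_min, five_min = map(int, match.groups())
    return {
        "five_sec": five_sec,
        "interrupt": interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


def is_sustained(cpu, threshold=30):
    """Sustained load: both longer averages sit at or above the threshold."""
    return cpu["one_min"] >= threshold and cpu["five_min"] >= threshold
```

A five-second spike alone does not trip the check, because only the one-minute and five-minute averages are compared.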

tli-platform
you → Add network 10.0.3.0/24 to OSPF area 0 on core-rtr-01
 
APPROVAL REQUIRED Write operation — the following commands will be sent to core-rtr-01:
 
configure terminal
router ospf 1
network 10.0.3.0 0.0.0.255 area 0
end
 
Risk: MEDIUM — Adds a network statement to an active OSPF process.
Target: core-rtr-01 (10.0.0.1) — OSPF Process ID 1

Notice the subnet-to-wildcard conversion. The operator said /24. The system generated 0.0.0.255. This is exactly the kind of translation that a natural language interface needs to handle correctly — Cisco IOS uses wildcard masks in OSPF network statements, not CIDR notation or subnet masks. Getting this wrong does not produce an error; it produces a wrong network match that silently breaks routing. The LLM knows IOS syntax conventions and generates the correct wildcard mask for any prefix length.
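The conversion itself is deterministic, so a generated network statement can be cross-checked in code. A minimal sketch using Python's standard ipaddress module follows. The function name ospf_network_statement is hypothetical, and this is a verification aid, not a description of how the system produces the mask.

```python
import ipaddress


def ospf_network_statement(prefix, area):
    """Build an OSPF network statement from CIDR notation.

    strict=True rejects input with host bits set, such as 10.0.3.1/24.
    """
    net = ipaddress.IPv4Network(prefix, strict=True)
    # hostmask is the inverse of the subnet mask, which is the wildcard mask.
    return f"network {net.network_address} {net.hostmask} area {area}"


print(ospf_network_statement("10.0.3.0/24", 0))
```

The same function covers prefix lengths that are easy to get wrong by hand, such as /22 or /30.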

The Approval Queue

We made an early architectural decision that we have never reconsidered: write commands always require human approval. There is no auto-execute mode for configuration changes. There is no “trust level” that bypasses the gate. If a command would modify running-config, it goes through the queue.

The approval flow works like this. The AI generates the exact IOS command sequence, including the configure terminal entry and the end. It presents this to the operator as an approval card that includes the full command preview, the target device hostname and management IP, a risk classification (low, medium, high based on command scope), and a brief explanation of what the commands will change. The operator reviews and explicitly approves or rejects.
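The gate described above can be sketched as a small state machine. The field names mirror the approval card contents listed in the text, but ApprovalCard, decide, and push are hypothetical names, and send stands in for whatever actually writes to the SSH session.

```python
from dataclasses import dataclass


@dataclass
class ApprovalCard:
    device: str
    mgmt_ip: str
    commands: list
    risk: str            # "low", "medium", or "high"
    explanation: str
    status: str = "pending"

    def decide(self, approved):
        """Record the operator's explicit decision exactly once."""
        if self.status != "pending":
            raise ValueError("card already decided")
        self.status = "approved" if approved else "rejected"


def push(card, send):
    """Send commands only for an approved card; there is no bypass path."""
    if card.status != "approved":
        raise PermissionError(f"{card.device}: commands not approved")
    for command in card.commands:
        send(card.device, command)
    return len(card.commands)
```

Because push checks the status itself, a caller cannot skip the gate by simply not asking for approval.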

Risk classification is not arbitrary. shutdown on an interface is high risk. Adding a network statement to OSPF is medium. Setting a description on an interface is low. But all three still require approval. The risk level affects how prominently the approval card is displayed and whether additional warnings are shown — it does not determine whether approval is required.
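The three examples above can be expressed as pattern rules. This sketch is an assumption about one way to encode them, not the system's rule set: RISK_RULES, CONTEXT, and classify_risk are hypothetical, and treating any unrecognized configuration line as high risk is a conservative choice made here.

```python
import re

# Rules drawn from the three examples in the text.
RISK_RULES = [
    ("high", re.compile(r"^shutdown\b")),
    ("medium", re.compile(r"^network\s+\S+\s+\S+\s+area\s+\S+")),
    ("low", re.compile(r"^description\b")),
]
ORDER = {"low": 0, "medium": 1, "high": 2}

# Lines that only change CLI context and carry no risk of their own.
CONTEXT = re.compile(r"^(configure terminal|end|exit|router\s+\S+.*|interface\s+\S+.*)$")


def classify_risk(commands):
    """Return the highest risk level among the configuration lines."""
    worst = "low"
    for raw in commands:
        line = raw.strip()
        if not line or CONTEXT.match(line):
            continue
        # Unrecognized lines default to "high" (assumption: fail conservative).
        level = next((lvl for lvl, pat in RISK_RULES if pat.match(line)), "high")
        if ORDER[level] > ORDER[worst]:
            worst = level
    return worst
```

The result only influences how the approval card is presented. It never decides whether a card is required.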

tli-platform
you → approved
 
[LIVE SSH] Commands applied to core-rtr-01.
 
[POST-CHECK] core-rtr-01# show ip ospf interface brief
Interface    PID   Area            IP Address/Mask    Cost  State Nbrs F/C
Gi0/0        1     0               10.0.1.1/24        1     DR    1/1
Gi0/1        1     0               10.0.2.1/24        1     BDR   1/1
Gi0/2        1     0               10.0.3.1/24        1     DR    0/0
 
AI → Verified. 10.0.3.0/24 is now active in OSPF area 0 on GigabitEthernet0/2, process ID 1, cost 1. The interface is in DR state with no neighbors yet — expected if the adjacent device has not been configured for OSPF on that segment. Neighbor discovery is in progress.

Post-change verification happens automatically. After pushing the approved commands, the system runs relevant show commands to confirm the change took effect. If it detects a problem — say, the OSPF process did not pick up the new network, or an existing adjacency dropped — it alerts immediately and can propose a rollback command sequence (which also goes through the approval queue).
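One of the checks mentioned above, detecting a dropped adjacency, reduces to comparing neighbor state before and after the change. The sketch assumes neighbor state has already been extracted into dictionaries keyed by neighbor ID. dropped_adjacencies and post_check are hypothetical names, and the neighbor IDs in the usage note are invented for illustration.

```python
def dropped_adjacencies(before, after):
    """Return neighbor IDs that were FULL before the change and are not FULL now.

    States are strings as IOS prints them, for example "FULL/DR" or "INIT/DROTHER".
    """
    return sorted(
        neighbor
        for neighbor, state in before.items()
        if state.startswith("FULL") and not after.get(neighbor, "").startswith("FULL")
    )


def post_check(before, after):
    """Summarize the comparison and flag whether a rollback should be proposed."""
    dropped = dropped_adjacencies(before, after)
    return {"ok": not dropped, "dropped": dropped, "propose_rollback": bool(dropped)}
```

For example, if a neighbor with the made-up ID 10.0.0.3 goes from FULL/BDR to INIT/DROTHER, post_check reports it as dropped and sets propose_rollback. The rollback itself would still go through the approval queue.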

What We Learned

The hardest part of this project was not generating IOS commands. The LLM is remarkably good at producing correct IOS syntax given a clear intent. The hard part was parsing the output that comes back.

IOS CLI output is designed for humans reading a terminal, not for machines parsing structured data. The output of show ip ospf neighbor uses space-aligned columns, but the column widths vary depending on the length of the interface name and the neighbor ID. show ip route uses indentation to indicate ECMP paths, but the indentation is not consistent between IOS versions. Captured output of show running-config picks up terminal artifacts such as --More-- pager prompts and backspace sequences unless paging is disabled with terminal length 0, and those artifacts have to be stripped before the text can be parsed.

We found that every device has slightly different formatting quirks. A Catalyst 3750 running 12.2(55) wraps long lines differently than a 4431 running 16.9. An ISR with a service module inserts additional output sections that do not exist on the base platform. VRF-aware show commands return different column headers than global table commands. Parsing all of this reliably required treating the LLM itself as a flexible parser — one that can handle formatting variation because it understands the meaning of the output, not just its column positions.

Key lesson: We initially tried to parse every show command output with structured templates (similar to TextFSM). This approach broke constantly as we added new device models and IOS versions to the inventory. Letting the LLM parse unstructured output into tagged, structured data — and then validating those tags against known schemas — turned out to be far more resilient. The AI handles formatting variation gracefully because it reads the output the same way a human engineer does: by understanding what the fields mean, not where the columns are.
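The validation half of that approach can be sketched without any reference to the model. Assuming the LLM returns each parsed row as a dictionary, a schema check catches missing, mistyped, or unexpected fields. NEIGHBOR_SCHEMA and validate are hypothetical, and the field set is an illustrative guess, not the system's schema.

```python
# Hypothetical schema for one row of parsed "show ip ospf neighbor" output.
NEIGHBOR_SCHEMA = {
    "neighbor_id": str,
    "state": str,
    "interface": str,
    "priority": int,
}


def validate(record, schema):
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    for key, expected in schema.items():
        if key not in record:
            errors.append(f"missing field: {key}")
        elif type(record[key]) is not expected:
            errors.append(f"wrong type for {key}: expected {expected.__name__}")
    for key in sorted(set(record) - set(schema)):
        errors.append(f"unexpected field: {key}")
    return errors
```

Records that fail validation can be rejected or re-parsed instead of being passed downstream as if they were trustworthy.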

The other major lesson was the importance of structured data tagging on every piece of information the AI returns. When the system tells you a router has 4 FULL OSPF adjacencies, that “4” is not just text in a chat message. It is a tagged data point linked to a specific show ip ospf neighbor output from a specific device at a specific timestamp. This tagging feeds the anti-fabrication engine: every claim is traceable to a real command output, so by design the AI cannot report network state without a command output behind it.
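A minimal sketch of the traceability idea, under the assumption that each claim carries the ID of the captured output it came from. Evidence, Claim, and unsupported_claims are hypothetical names, and the substring test is a deliberately crude stand-in for whatever matching the real engine performs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Evidence:
    device: str
    command: str
    captured_at: datetime
    raw_output: str


@dataclass(frozen=True)
class Claim:
    text: str          # what the AI says to the operator
    value: str         # the specific data point being asserted
    evidence_id: str   # key into the evidence store


def unsupported_claims(claims, evidence_store):
    """Return claims with no evidence, or whose value is absent from it."""
    unsupported = []
    for claim in claims:
        evidence = evidence_store.get(claim.evidence_id)
        if evidence is None or claim.value not in evidence.raw_output:
            unsupported.append(claim)
    return unsupported
```

Any claim this function returns has nothing behind it and can be withheld from the operator.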

Building a natural language interface for Cisco IOS is not about wrapping a chatbot around a command reference. It is about understanding engineer intent, producing syntactically correct IOS commands, navigating the mess of legacy SSH negotiation, parsing inconsistent CLI output, and enforcing the safety boundaries that production networks demand. We are still improving it — every new device model we onboard teaches us something about an edge case we had not seen before — but the core architecture has proven sound.