
On Fri, 1 Aug 2025 at 16:44, Mel Beckman via NANOG <nanog@lists.nanog.org> wrote:
> Also, non-management interfaces do packet processing in silicon at the ASIC level and don’t have the capacity to do anything more than statistical sampling of packets that require CPU-level processing to retrieve counters and generate SNMP responses. 62% is as good a sampling rate as any other.

Absolutely not. We expect to process 100% of legitimate control-plane traffic, e.g. BGP, ISIS, LDP, ARP, SNMP, etc.; 62% would be devastating. In fair weather this is easy. In bad weather you need hardware-based discrimination between expected good traffic and unexpected bad traffic.

Drew is right to expect functioning SNMP, and is experiencing a significant regression in behaviour compared to previous devices from the same vendor.

It would take a very long time to explain how to troubleshoot this, as it is an extremely complicated topic with a lot of nuance that even the best experts at Cisco are unaware of. I've regularly had TAC handwave problems away with 'sometimes it be like that' because they didn't want to do the work.

Once, our NOC spent months on a case where TAC was blaming our QoS configuration for BGP flaps. When I got on it, I escalated it to Xander, and initially even Xander agreed with TAC that we needed to look into the QoS configuration. Then I reminded him that LPTS is not subject to QoS or ACL (which is a terrible design choice, for reasons I'm happy to elaborate on), which immediately reminded him how LPTS works, and the TAC case finally got some traction.

This is a completely untenable situation: IOS-XR regularly has complicated problems that TAC is not equipped to solve, and the expectation is that the user has deep enough knowledge to rebuff them.

-- 
  ++ytti