Hi all.

I thought I'd share our recent experiences, per subject, just in case others run into the same problems.

So... we finally decided to try 17.3(4a)MD for the CSR1000v, after years of happy operation on older code. Good Lord, what a drama!

At first, we couldn't figure out why iBGP sessions to all Cisco boxes would not come up. Then we realized it was because IS-IS to them would not come up. Then we realized that was because BFD sessions would not come up.

But even after removing BFD, IS-IS remained down.

After 3 days of searching, we finally landed on CSCuz58508. In case you don't have CCO access, it is the same issue as described here:

    https://community.cisco.com/t5/cisco-cloud-service-router-csr/b00ocg4q4e-csr-1000v-16-3-1a-can-t-set-mtu-on-gig-interface/td-p/3054853

This was even more confusing for us, because our interface driver on VMware ESXi is vmxnet3.

The bug ID suggests the problem is fixed in 16.3(2) and 16.4(1). So to be safe, we tested 16.12(5)MD, which allowed us to enable jumbo frames, but that turned out to be only cosmetic. In the background, the box was simply dropping packets, silently. We found this out when we tried to copy files to the node, and the transfer would just hang without any feedback. Removing the jumbo frame configuration allowed the files to come through.
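
For anyone wanting to catch this kind of silent drop before trusting the configured MTU, here is a rough sketch of a don't-fragment ping probe. It is not what we ran at the time. It assumes a Linux host with iputils ping (the -M do, -s, -c and -W options), plain IPv4 with no IP options, and a placeholder documentation address of 192.0.2.1. The function names are made up for illustration.

```python
import subprocess

IPV4_HEADER = 20  # bytes, assuming an IPv4 header with no options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    if mtu <= IPV4_HEADER + ICMP_HEADER:
        raise ValueError("MTU too small for an ICMP echo")
    return mtu - IPV4_HEADER - ICMP_HEADER


def df_ping_command(host, mtu, count=3):
    """Build an iputils ping command with the don't-fragment bit set."""
    return ["ping", "-M", "do", "-c", str(count), "-W", "2",
            "-s", str(icmp_payload_for_mtu(mtu)), host]


def path_carries_mtu(host, mtu):
    """True if at least one full-size, unfragmented ping gets a reply."""
    result = subprocess.run(df_ping_command(host, mtu),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

Calling path_carries_mtu("192.0.2.1", 9000) from a host whose own interface MTU is at least 9000 should fail if something in the path accepts the jumbo setting but drops the large frames, while the same call with 1500 should still succeed. That is roughly the symptom we hit with the file copies.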

We noticed that nodes still running 3.17(0)S did not have any issues with IS-IS, BFD or MTU. However, this code was only ever released as an ED train (and to be fair, we've been having dodgy issues with it in recent years), so we decided to "downgrade" to 3.16(9)S. That is actually an upgrade from 3.17(0)S, since the 3.16 train is an MD release whose latest release was in March 2019, vs. July 2017 for 3.17(4)S ED.

With that, no more MTU issues, BFD and IS-IS are happy, iBGP is happy.

We definitely won't be wasting any more time trying to make Denali, Gibraltar, Fuji, Everest or Amsterdam work on our CSR1000v complement.

Needless to say, moving the ASR1000 platform to 17.3 has also come with its own avenue of pleasure, what with the mess that the ROMMON, CPLD and FPGA upgrades are. What the documentation says and what happens in real life are two very different things. It has taken us a week to come up with our own working procedure to upgrade just one box, and it's worse on a dual-RP system.

Mark.