We have a moderately dense deployment of 100-Gig LR4 (both DWDM lambdas and Juniper MX) around our WAN, and we don't clock any background input errors on our interfaces unless there is an ongoing problem.

That said, we have experienced issues with sub-millisecond link state changes between two endpoints that are physically cross-connected to one another with no intermediary Layer 1 (DWDM, etc.). There doesn't seem to be any rhyme or reason to this; we've looked at each lane extensively, and so far everything has been inconclusive.

We also experienced some code issues on Juniper MPC3D-NGs running 100-Gig and on our DWDM client ports, where timing would start to slip and eventually cause the link to fail. Both Juniper and the DWDM vendor found code variances, which they patched. We haven't had any such issues on Juniper MPC5s, MPC7s, or the 10003 line cards.

TL;DR: In my experience, 100-Gig might require some more TLC than 10-Gig to run clean and is more sensitive to variations in transport. Others' mileage may vary.

Best,

JJ Stonebraker | Associate Director
The University of Texas System | Office of Telecommunication Services
(512) 232-0888 | jjs@ots.utsystem.edu

________________________________
From: NANOG <nanog-bounces+jjs=ots.utsystem.edu@nanog.org> on behalf of Graham Johnston <johnston.grahamj@gmail.com>
Sent: Monday, July 19, 2021 12:19 PM
To: Saku Ytti <saku@ytti.fi>
Cc: nanog list <nanog@nanog.org>
Subject: Re: 100G, input errors and/or transceiver issues

Saku,

I don't at this point have long-term data collection compiled for the issues that we've faced. That said, we have two 100G transport links that have a regular background level of input errors: one link hovers between 0.00055 and 0.00383 PPS, and the other between none and 0.00135 PPS (that jumped to 0.03943 PPS over the weekend). The range usually reflects a difference between the two directions of a link rather than varying behavior in a single direction.
The data comes from the last 24 hours; the two referenced links are operated by different providers on very different paths (opposite directions).

Over shorter distances, we've definitely seen input errors that have affected PNI connections within a datacenter as well. In the case of the last PNI issue, the other party swapped their transceiver and we didn't even physically touch our side; I note this only to express that I don't think this is just a case of the transceivers that we are sourcing.

Comparatively, other than clear transport system issues, I don't recall this sort of thing at all with the 10G "wavelength" transport that we had purchased for years prior. I put "wavelength" in quotes there knowing that it may have been a while since our transport was a literal wavelength, as compared to being muxed into a 100G+ wavelength.

On Mon, 19 Jul 2021 at 12:01, Saku Ytti <saku@ytti.fi> wrote:

On Mon, 19 Jul 2021 at 19:47, Graham Johnston <johnston.grahamj@gmail.com> wrote:

Hey Graham,
How commonly do other operators experience input errors with 100G interfaces? How often do you find that you have to change a transceiver out, either for errors or for another reason? Do we collectively expect this to improve as 100G becomes more common and production volumes increase in the future?
New rule: share your own data before asking others to share theirs.

In the DC and SP markets, 100GE has dominated for several years now, so "more common" rings odd to many. 112G SerDes is shipping on the electrical side, and from a 100GE point of view there is nowhere more mature to go. The optical side, QSFP112, is really the only thing left to cost-optimise for 100GE.

We've had our share of MSA ambiguity issues with 100GE, but today 100GE looks mature to our eyes in failure rates and compatibility. 1GE is really hard to support and 10GE is becoming problematic, in terms of hardware procurement.

--
++ytti
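
For a sense of scale, the background input-error rates quoted earlier in the thread (0.00055 to 0.03943 PPS) can be converted to errors per day and to the average gap between errors. The following is only a minimal sketch of that arithmetic: the function names are hypothetical, and it assumes "PPS" means errored packets per second averaged over the sampling window.

```python
# Convert an average input-error rate (errored packets per second)
# into errors per day and the mean time between errors.
# Assumption: the PPS figures are averages over the sampling window.

SECONDS_PER_DAY = 24 * 60 * 60


def errors_per_day(pps: float) -> float:
    """Expected number of errored packets in 24 hours at a steady rate."""
    return pps * SECONDS_PER_DAY


def seconds_between_errors(pps: float) -> float:
    """Mean gap between errored packets; infinite if the rate is zero."""
    if pps <= 0:
        return float("inf")
    return 1.0 / pps


if __name__ == "__main__":
    # Rates quoted in the thread for the two transport links.
    for rate in (0.00055, 0.00383, 0.00135, 0.03943):
        print(
            f"{rate:.5f} PPS -> {errors_per_day(rate):8.1f} errors/day, "
            f"one every {seconds_between_errors(rate):7.1f} s on average"
        )
```

On these assumptions, the low end of the first link works out to roughly 48 errored packets per day, and the weekend jump on the second link to roughly 3,400 per day.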