[NANOG] OAM and multiple choice questions

Hi, all. A few months ago, I got really good guidance here to pursue OAM instead of trying to use BFD in unnatural ways. I've been reaching out to my vendors and through various searches, since the OAM space is wholly foreign to me, and I'm currently on four different paths:

* IEEE 802.1ag <https://www.ieee802.org/1/pages/802.1ag.html>, now part of IEEE 802.1Q-2022: CFM
* IEEE 802.3ah <https://www.ieee802.org/21/doctree/2006_Meeting_Docs/2006-11_meeting_docs/802.3ah-2004.pdf>: Ethernet OAM (aka Link OAM)
* ITU-T Y.1731 <https://www.itu.int/rec/T-REC-Y.1731>: OAM functions and mechanisms for Ethernet based networks
* MEF 17 <https://www.mef.net/resources/mef-17-service-oam-framework-and-requirements/>: Service OAM Framework and Requirements

One of my vendors pointed out to me that recent search guidance is probably a bit hard to find because this is mostly service provider space (not the enterprise space I live in), and they generally have solved for this in the past and moved on. So I feel like these are four hills I'm simultaneously trying to climb instead of knowing one or two better ones and focusing on those.

I'm curious for the folks here that have in fact solved for OAM on your networks, particularly in an MX480/MX304 Junos environment (I also have a bunch of Cat 8K/9K IOS-XE and some NCS5500 IOS-XR, but let's focus on Junos), which standard you went with. If you can also speak to why you chose that path, that would help a lot. Extra bonus points if you have some configuration that you can share (privately if you wish) that helps my decision process for craft/complexity.

-dp

On Saturday, 19 April 2025 at 02:57, David Zimmerman via NANOG <nanog@lists.nanog.org> wrote:
Hi, all. A few months ago, I got really good guidance here to pursue OAM instead of trying to use BFD in unnatural ways. I've been reaching out to my vendors and through various searches, since the OAM space is wholly foreign to me, and I'm currently on four different paths:
* IEEE 802.1ag <https://www.ieee802.org/1/pages/802.1ag.html>, now part of IEEE 802.1Q-2022: CFM
* IEEE 802.3ah <https://www.ieee802.org/21/doctree/2006_Meeting_Docs/2006-11_meeting_docs/80...>: Ethernet OAM (aka Link OAM)
* ITU-T Y.1731 <https://www.itu.int/rec/T-REC-Y.1731>: OAM functions and mechanisms for Ethernet based networks
* MEF 17 <https://www.mef.net/resources/mef-17-service-oam-framework-and-requirements/>: Service OAM Framework and Requirements
Hi David,

I was working on OAM stuff until circa 3 years ago, and I made some notes comparing the different technologies you mentioned here: https://null.53bits.co.uk/page/ethernet-oam-and-cfm-standards

You didn't say what your goal is (or maybe I missed it?). I was working for a service provider, providing wholesale point-to-point Layer 2 connectivity. This meant p2p pseudowires and/or multi-segment pseudowires. We wanted to check the one-way latency, two-way latency, and packet loss of the service and, in the case of multi-segment pseudowires, to check this e2e and for each segment. We also wanted to provide link-loss forwarding for the services. Finally, we wanted to allow our customers to also use CFM over our service, so we used levels 0-3 internally and passed levels 4-7 transparently over the L2 service.

If this is also your goal, you want Y.1731. Some vendors still call it IEEE 802.1ag even though they implement features not in IEEE 802.1ag; Y.1731 is kind of a superset of IEEE 802.1ag plus additional fault management and performance monitoring features.

If this is your goal, I don't have any Juniper configs I can share with you, but here is a tip: we had this running over 3 different vendors, and they mixed the terminology quite a bit (news at ten: inter-op is tricky, shocking!). We tested it extensively so it was all definitely working but, as an example, the A end of a pseudowire was using vendor A and the B end was using vendor B; on one end we configured a down MEP and on the other end an up MEP, which was just plain wrong for our topology, but if we configured the same direction at both ends the CFM session wouldn't come up. We made lots of packet captures and opened TAC cases, and eventually found the magic config needed to get interop working. Some vendors acknowledged bugs which were fixed in later software releases, but that was of no use to us because that would mean truck rolling everything, versus sticking with our working-but-slightly-misleading config (which was well documented anyway).

So if your goals are similar to what mine were, you want IEEE 802.1ag/Y.1731. IEEE 802.3ah is more for single physical segment monitoring and testing; we wanted L2 VPN testing, not single-segment testing. As for MEF, we didn't look into this because we had no desire to be MEF certified.

I hope that helps someone.

Cheers,
James.
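
To illustrate the level split James describes (levels 0-3 used internally, levels 4-7 passed transparently), here is a minimal Python sketch of how a MEP conventionally treats a CFM frame based on its maintenance domain level. It is a simplified model, not any vendor's implementation: it ignores up/down MEP direction and MIPs, and the names `Action`, `mep_action`, and `PROVIDER_MEP_LEVEL` are made up for the example.

```python
from enum import Enum


class Action(Enum):
    PROCESS = "process"   # frame is at the MEP's own level: terminate and handle it
    FORWARD = "forward"   # frame is at a higher level: pass it through untouched
    DROP = "drop"         # frame is at a lower level: it should not leak past this MEP


def mep_action(mep_level: int, frame_level: int) -> Action:
    """Decide what a MEP at mep_level does with a CFM frame at frame_level."""
    for level in (mep_level, frame_level):
        if not 0 <= level <= 7:
            raise ValueError(f"MD level must be 0-7, got {level}")
    if frame_level == mep_level:
        return Action.PROCESS
    if frame_level > mep_level:
        return Action.FORWARD
    return Action.DROP


# Assumption for illustration: the provider's outermost MEP sits at level 3,
# the top of the 0-3 range that the post says was used internally.
PROVIDER_MEP_LEVEL = 3

if __name__ == "__main__":
    for frame_level in range(8):
        print(frame_level, mep_action(PROVIDER_MEP_LEVEL, frame_level).value)
```

With the provider MEP at level 3, customer CFM frames at levels 4-7 come out as "forward", which matches the transparent pass-through behaviour described above.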

Hi, James.

My email earlier this year to NANOG questioned using BFD through a "how far can I turn this knob" lens, and the great feedback I got was "don't do that, look into OAM instead", so here we are.

My use case is pretty specific at the moment: frame loss measurement, reporting, alerting, and mitigation for single-segment L1 or L1.5 links, or for L2VPN links that are provided by someone else where I can't see underneath the abstraction. Put another way, my two ends of a link can already signal an actual link loss but are blind to losses or corruption of individual frames. Something that reads like synthetic link testing is what gets my attention.

That's a great page, thanks for the URL and for the tips below! I'm absorbing anything I can to lean in the right direction, and I particularly find the "Present in IEEE802.1ag" tagging helpful for understanding the standards' overlap. The part of your page that jumps out at me is "Ethernet Frame Loss Measurement (ETH-LM)".

I'm primarily looking for guidance on what others like you are doing. Shared config stanzas help a lot, but they aren't nearly as necessary as picking the right direction out of the gate. The feedback I've received does indeed seem to be leaning towards Y.1731 functionality.

-dp
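
Since the use case above is frame loss measurement and alerting, here is a minimal Python sketch of the counter arithmetic behind Y.1731 ETH-LM: loss over an interval is the growth of a transmit counter minus the growth of the matching receive counter between two samples. The counter fields in LMM/LMR PDUs are 4 octets wide, so the sketch handles wraparound. The function names and the alert threshold are hypothetical, and fetching the counters from a router is out of scope.

```python
COUNTER_MODULUS = 2 ** 32  # ETH-LM counter fields are 4 octets wide


def counter_delta(current: int, previous: int) -> int:
    """Growth of a 32-bit counter between two samples, tolerating one wrap."""
    return (current - previous) % COUNTER_MODULUS


def frame_loss(tx_now: int, tx_prev: int, rx_now: int, rx_prev: int) -> int:
    """Frames sent minus frames received over the sampling interval."""
    return counter_delta(tx_now, tx_prev) - counter_delta(rx_now, rx_prev)


def loss_ratio(tx_now: int, tx_prev: int, rx_now: int, rx_prev: int) -> float:
    """Lost frames as a fraction of frames sent; 0.0 when nothing was sent."""
    sent = counter_delta(tx_now, tx_prev)
    if sent == 0:
        return 0.0
    return frame_loss(tx_now, tx_prev, rx_now, rx_prev) / sent


# Hypothetical alerting threshold (0.1% loss); pick a value that suits the link.
ALERT_THRESHOLD = 0.001


def should_alert(ratio: float, threshold: float = ALERT_THRESHOLD) -> bool:
    return ratio > threshold


if __name__ == "__main__":
    # Sender counted 1000 frames out, receiver counted 995 frames in.
    ratio = loss_ratio(tx_now=2000, tx_prev=1000, rx_now=1985, rx_prev=990)
    print(f"loss ratio {ratio:.4f}, alert={should_alert(ratio)}")
```

For far-end loss the inputs would be the TxFCf and RxFCf values from successive LMRs; for near-end loss they would be TxFCb and the local receive counter (RxFCl).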

Take a look at TWAMP, which may solve your problems. Shane

Interesting talk. I've been wanting to dive deeper into this topic for a while.

Do you all know if I can easily enable these types of OAM on my MPLS pseudowire/VPLS services through my MPLS cloud in Juniper Junos?

Furthermore, can I also send some sort of OAM on my ENNI VLAN subinterfaces to pump synthetic OAM PDUs through neighboring operator clouds towards a remote NID I manage (Accedian)? This is similar to what someone previously mentioned about measuring third-party clouds that are abstracted from their point of view, where you can at least get some sort of measurement.

Currently, we plug in Accedian NIDs at various places in the network and run Y.1731 or RFC 2544 tests, but as I mentioned in the previous paragraph, I would like to insert measurement points along the way in my network, and through my neighboring operator clouds as well.

Aaron

Look at TWAMP, no NID required.
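
For readers weighing the TWAMP suggestion, here is a minimal Python sketch of the arithmetic a TWAMP-style sender can do with a reflected test packet: round-trip delay from the four timestamps, with the reflector's processing time subtracted out, and lost probes from gaps in sequence numbers. This is only the calculation, not a TWAMP implementation; the `Probe` type, its field names, and the use of integer microseconds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """Timestamps for one reflected test packet, in microseconds."""
    t1_us: int  # sender transmit time (sender clock)
    t2_us: int  # reflector receive time (reflector clock)
    t3_us: int  # reflector transmit time (reflector clock)
    t4_us: int  # sender receive time (sender clock)


def round_trip_delay_us(probe: Probe) -> int:
    """Round-trip delay excluding time spent inside the reflector.

    Each difference is taken on a single clock, so the offset between the
    sender's and reflector's clocks cancels out.
    """
    return (probe.t4_us - probe.t1_us) - (probe.t3_us - probe.t2_us)


def lost_sequence_numbers(sent: list[int], received: list[int]) -> list[int]:
    """Sequence numbers that were sent but never came back."""
    return sorted(set(sent) - set(received))


if __name__ == "__main__":
    probe = Probe(t1_us=0, t2_us=10_400, t3_us=10_500, t4_us=20_000)
    print("round trip (us):", round_trip_delay_us(probe))
    print("lost probes:", lost_sequence_numbers([1, 2, 3, 4, 5], [1, 2, 4, 5]))
```

Note that loss measured this way is loss of the synthetic probes themselves, which is a sampled estimate rather than a count of lost service frames.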

Thanks, Shane. As the OP, I can say that TWAMP (and STAMP) did cross my radar early on, but they didn't appear to be a home run for my loss-alerting use case (although they do help in other adjacent synthetic testing contexts).

-dp
participants (4)
- Aaron1
- David Zimmerman
- James Bensley
- sronan@ronan-online.com