Hey,
For those running BFD on your land-based point-to-point links, I’m interested in hearing about what factors you consider when deciding how to configure your timers and multiplier.
On paper, BFD between two devices over a local or metro dark fibre or wave seems pretty trivial: Assuming your gear can a) support echo mode b) hardware offloads echo processing c) automatically treats echos as vital and puts them into the appropriate high priority queue, then setting the timers down to their lowest possible values (3ms on some of the gear that I’ve seen) and some low multiplier seems more than reasonable. But?
From another angle, your link isn’t dark fibre or a wave but, for example, ethernet over some sort of IP based L2 Transport, and is still a low (sub 1ms) one-way latency local or metro link. How do you set your timers, and what do you base that on?
From yet another angle, what if your link is a long-haul wave, or for that matter a wave of any distance that imposes a one-way latency that is higher than the minimum tx and rx timers that are supported by your gear? We’ll assume an unprotected wave, because I’m sure if it’s protected, you have no choice but to consider the one-way latency of the longest of the two segments.
I made some assumptions above about support for echo mode and hardware offload, but what if (some of) your gear doesn’t support some or all of that stuff? How do you factor your configuration decisions?
Thanks!
In practice, the vendor's recommendations regarding Routing Engine HA provide a lower bound. I'm just starting out with 1000ms x 3 multiplier, but my network is not national or global. I believe I could go as low as 500ms to keep HA happy. On Wed, Mar 21, 2018 at 09:10:28AM -0400, Jason Lixfeld wrote:
For those running BFD on your land-based point-to-point links, I’m interested in hearing about what factors you consider when deciding how to configure your timers and multiplier.
Using 250ms x 3 on fiber connecting Pennsylvania to Florida... Best regards, Alex This message is intended solely for the designated recipient(s). It may contain confidential or proprietary information and may be subject to attorney-client privilege or other confidentiality protections. 
If you are not a designated recipient you may not review, copy or distribute this message. If you receive this in error, please notify the sender by reply e-mail and delete this message. Thank you.
Using 200 ms / 200 ms / x3 on either metro dark fiber or longhaul waves (Paris / Frankfurt / Amsterdam) successfully. Best regards. Y. 2018-03-21 16:11 GMT+01:00 Alex Lembesis <Alex.Lembesis@tevapharm.com>:
Using 250ms x 3 on fiber connecting Pennsylvania to Florida...
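For comparing the timer choices reported so far, it may help to spell out the detection time each one implies. The sketch below assumes asynchronous mode, where (per RFC 5880) the local detection time is the remote multiplier times the larger of the local required-min-rx and the remote desired-min-tx interval. It also assumes both ends are configured identically, which the posters did not state. The function and variable names are illustrative only.

```python
def detection_time_ms(local_min_rx_ms, remote_min_tx_ms, remote_multiplier):
    """Approximate BFD async-mode detection time at the local end.

    The remote system transmits no faster than the slower of what it wants
    to send and what we are willing to receive; the session is declared
    down after `remote_multiplier` intervals pass without a packet.
    """
    negotiated_interval = max(local_min_rx_ms, remote_min_tx_ms)
    return negotiated_interval * remote_multiplier


# Timer settings mentioned in the thread: (interval in ms, multiplier).
# Assumption: both ends use the same values.
reported = {
    "1000ms x 3": (1000, 3),
    "250ms x 3": (250, 3),
    "200ms x 3": (200, 3),
}

for label, (interval, mult) in reported.items():
    print(f"{label}: ~{detection_time_ms(interval, interval, mult)} ms to detect")
```

If the two ends are configured differently, the slower interval wins, so a 50 ms receiver paired with a 300 ms sender behaves like a 300 ms session.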
To speed up BGP routing convergence. The (2x) dark fiber links from PA to FL are being used as Layer3 datacenter interconnects, where each datacenter has its own AS. The DF is also carrying FCIP traffic, so we need failover to be as fast as possible. Best regards, Alex -----Original Message----- From: Job Snijders (External) [mailto:job@instituut.net] Sent: Wednesday, March 21, 2018 12:25 PM To: Youssef Bengelloun-Zahr Cc: Alex Lembesis; NANOG Subject: Re: How are you configuring BFD timers? Silly question perhaps, but why would you do BFD on dark fiber? Kind regards, Job
Wouldn't any tangible problem on a dark-fiber link result in an interface shutdown, ostensibly creating the trigger one would need to begin re-convergence? On 3/21/18 11:31 AM, Alex Lembesis wrote:
To speed up BGP routing convergence. The (2x) dark fiber links from PA to FL are being used as Layer3 datacenter interconnects, where each datacenter has its own AS. The DF is also carrying FCIP traffic, so we need failover to be as fast as possible.
A few years ago I did some testing and found that the time between the transceiver detecting LOS and the routing protocol (ISIS in this case) being informed that the link was down (triggering the recalculation) took longer than it took BFD to signal ISIS to recalculate.
On Mar 21, 2018, at 12:35 PM, Bryan Holloway <bryan@shout.net> wrote:
Wouldn't any tangible problem on a dark-fiber link result in an interface shutdown, ostensibly creating the trigger one would need to begin re-convergence?
Which platform? What context? Best regards.
On 21 March 2018 at 18:10, Jason Lixfeld <jason+nanog@lixfeld.ca> wrote:
A few years ago I did some testing and found that the time between the transceiver detecting LOS and the routing protocol (ISIS in this case) being informed that the link was down (triggering the recalculation) took longer than it took BFD to signal ISIS to recalculate.
They were ME3600s. AFAIR it was two of these in a lab, connected back to back with two links between them, one with a higher metric than the other. A traffic generator ran between the two, generating fixed-size UDP frames at an interval of some tens of milliseconds. I would yank the preferred link, count how many packets were lost, do some math, and correlate the result with various logs and debugs on the boxes.
On Mar 21, 2018, at 1:34 PM, Youssef Bengelloun-Zahr <bengelly@gmail.com> wrote:
Which platform ? What context ?
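The "counting lost packets and doing some math" step of the lab test described above can be sketched roughly as follows. This is an assumed reconstruction, not the script that was actually used: the probe interval and packet counts are hypothetical, and the estimate is only as precise as one probe interval.

```python
def estimate_outage_ms(packets_sent, packets_received, send_interval_ms):
    """Estimate traffic-loss duration from a fixed-rate probe stream.

    Each lost packet represents roughly one send interval during which
    no forwarding path was available, so the resolution of the estimate
    is one send interval.
    """
    if packets_received > packets_sent:
        raise ValueError("received more packets than were sent")
    lost = packets_sent - packets_received
    return lost * send_interval_ms


# Hypothetical numbers: 10 ms probe interval, 6000 sent, 5975 received.
outage = estimate_outage_ms(6000, 5975, 10)
print(f"approximate outage: {outage} ms (+/- one probe interval)")
```

Running the same pull test with and without BFD and comparing the two estimates is one way to reproduce the comparison between LOS-triggered and BFD-triggered reconvergence.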
On 21/Mar/18 19:10, Jason Lixfeld wrote:
A few years ago I did some testing and found that the time between the transceiver detecting LOS and the routing protocol (ISIS in this case) being informed that the link was down (triggering the recalculation) took longer than it took BFD to signal ISIS to recalculate.
You also have the issues of:
* Deciding whether you want to have a uniform standard when deploying BFD. If your standard is to skip BFD on dark links and run it on lit links, you can quickly run into an administrative scenario as your network grows where keeping track of which link does or doesn't run BFD becomes someone's untangling project 10 years later.
* Circuit providers delivering hybrid links and not telling you because they are either afraid to or don't fully understand the scope of their (very large) network. In this case, you're told the link is dark, but somewhere along the path is their active gear. (not so) Strange, but true.
Mark.
He's asking because if it was dark the interface would go down when the link was lost and the router would pull routes. But PA to FL would lead me to believe it'll be a wave from some type of DWDM gear which brings us to BFD. Luke Guillory Vice President – Technology and Innovation Tel: 985.536.1212 Fax: 985.536.0300 Email: lguillory@reservetele.com Reserve Telecommunications 100 RTC Dr Reserve, LA 70084 _________________________________________________________________________________________________ Disclaimer: The information transmitted, including attachments, is intended only for the person(s) or entity to which it is addressed and may contain confidential and/or privileged material which should not disseminate, distribute or be copied. Please notify Luke Guillory immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. Luke Guillory therefore does not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission.
Correct, Luke. Best regards, Alex
Right, a failure on a dark fiber link (should) be immediately detected, and the detecting end should send a cease/stop/whatever message to the remote peer to drop the neighbor relationship. BFD really comes into its own on a derived circuit (such as metro-E or a similar setup) where you can have an indirect failure (traffic does not pass, but the last-mile link remains up). Ken On Wed, Mar 21, 2018 at 10:44 AM, Alex Lembesis <Alex.Lembesis@tevapharm.com> wrote:
Correct, Luke.
On 21 March 2018 at 16:37, Luke Guillory <lguillory@reservetele.com> wrote:
He's asking because if it was dark the interface would go down when the link was lost and the router would pull routes. But PA to FL would lead me to believe it'll be a wave from some type of DWDM gear which brings us to BFD.
Could it not also help with a unidirectional failure of the fibre? E.g. Loss of signal in only one direction might not bring the link down, but if one BFD peer stops receiving BFD packets it'll bring the link down. On 21 March 2018 at 17:10, Jason Lixfeld <jason+nanog@lixfeld.ca> wrote:
A few years ago I did some testing and found that the time between the transceiver detecting LOS and the routing protocol (ISIS in this case) being informed that the link was down (triggering the recalculation) took longer than it took BFD to signal ISIS to recalculate.
Have you looked at testing and adding this command to your IOS devices:
ip routing protocol purge interface
Also have you tried to set the carrier delay to zero?
carrier-delay down 0
I haven't compared them to BFD explicitly, but I would expect the two commands together to have the same effect as you're seeing with BFD (the link down is signaled to the IGP ASAP).
Cheers, James.
On 22 March 2018 at 10:47, James Bensley <jwbensley@gmail.com> wrote:
Could it not also help with a unidirectional failure of the fibre? E.g. Loss of signal in only one direction might not bring the link down, but if one BFD peer stops receiving BFD packets it'll bring the link down.
Ethernet handles unidirectional failure natively through autonegotiation asserting RFI.
A======B
B sees loss of signal, A does not
B asserts RFI
A receives RFI and goes down
Some devices, such as those running JunOS, can also assert RFI for external factors. For example, you can ask JunOS to assert RFI to the client port when a pseudowire goes down, so the client experiences link down when you experience pseudowire down.
-- ++ytti
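The A/B sequence above can be walked through with a toy model. This is only a sketch of the steps as described in the message, not of any real PHY or vendor implementation; the `Port` class and its fields are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    rx_signal: bool = True      # is signal arriving at this port?
    remote_fault: bool = False  # has the far end signalled RFI to us?

    @property
    def link_up(self):
        # A port stays up only if it sees signal and no remote fault.
        return self.rx_signal and not self.remote_fault


def fail_one_direction(a, b):
    """Break only the A->B strand; the B->A strand still carries signal."""
    b.rx_signal = False          # B sees loss of signal, A does not
    if a.rx_signal:
        # B asserts RFI; A can only hear it while its receive side works.
        a.remote_fault = True


a, b = Port("A"), Port("B")
fail_one_direction(a, b)
print(a.name, "up" if a.link_up else "down")
print(b.name, "up" if b.link_up else "down")
```

In this model both ends end up down even though only one strand failed, which is the point being made about unidirectional failures.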
On 22 March 2018 at 09:59, Saku Ytti <saku@ytti.fi> wrote:
Ethernet handles unidirectional failure natively through autonego asserting RFI.
I was thinking about this as I wrote that post. I've not had a chance to test this across our various device types; I will have to try and find the time to test which devices support this and how effectively it works. For now we just use BFD because it's tested as working on all our device types. We've seen issues where LOS isn't working properly across third party DWDM platforms so for now, we have no BFD bugs and it catches these issues for us.
I think it might also be worth testing Ethernet RFI to check the delays that Jason was talking about; which is quicker to signal link down and begin the re-convergence process, RFI / LOS / BFD?
Cheers, James.
RFI is a corollary to LOS. Both are essentially speed-of-light limited; there is no uncertainty or detection margin. After one side sees LOS, the other side will see RFI, if it's not seeing LOS as well. I personally would consider BFD for radio, pseudowire, L2 switch and equivalent poorly failing links. But I wouldn't run it on dark, copper or wave. Purely an anecdote, but I have had far more BFD-caused problems than BFD-solved problems, spanning multiple vendors (CAT7600, ASR9k, MX). Regarding BFD echo and control mode, neither is guaranteed to be a HW or a SW implementation; either can be either. The standard intends echo mode to be HW, but does not guarantee it. On 22 March 2018 at 12:10, James Bensley <jwbensley@gmail.com> wrote:
I was thinking about this as I wrote that post. I've not had a chance to test this across our various devices types, I will have to try and find the time to test which devices support this and how effectively it works. For now we just use BFD because it's tested as working on all our device types. We've seen issues where LOS isn't working properly across third party DWDM platforms so for now, we have no BFD bugs and it catches these issues for us.
I think it might also be worth testing Ethernet RFI to check the delays that Jason was talking about; which is quicker to signal link down and begin the re-convergence process, RFI / LOS / BFD?
Cheers, James.
-- ++ytti
Here is what we do...
router isis xxxx
 interface TenGigabitEthernet0/0/0/0
  circuit-type level-2-only
  bfd minimum-interval 50
  bfd multiplier 5
  bfd fast-detect ipv4
We keep the same config for local and long haul core links. Works like a champ every time.
Also as an FYI, if you are running ASR9K, you are able to offload the BFD process from the linecard CPU to the NPU. This allows BFD timers down to 3.3 milliseconds.
https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k_r5-1/ro...
-----Original Message----- From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Mark Tinka Sent: Saturday, May 5, 2018 6:38 PM To: James Bensley <jwbensley@gmail.com>; NANOG <nanog@nanog.org> Subject: Re: How are you configuring BFD timers?
On 22/Mar/18 10:47, James Bensley wrote:
Have you looked at testing and adding this command to your IOS devices:
ip routing protocol purge interface
In all recent versions of IOS, this command is now standard and is elided from the running configuration. Mark. ________________________________ CONFIDENTIALITY NOTICE: This e-mail transmission, and any documents, files or previous e-mail messages attached to it may contain confidential information that is legally privileged. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, copying, distribution or use of any of the information contained in or attached to this transmission is STRICTLY PROHIBITED. If you have received this transmission in error please notify the sender immediately by replying to this e-mail. You must destroy the original transmission and its attachments without reading or saving in any manner. Thank you.
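To put the 50 ms and 3.3 ms intervals from the message above in perspective, the sketch below converts a BFD transmit interval into packets per second, which hints at why very small intervals tend to need hardware offload. The session count of 100 is a made-up example, not a figure from the thread.

```python
def bfd_packets_per_second(interval_ms, sessions=1):
    """Packets per second sent in one direction for a number of sessions."""
    if interval_ms <= 0:
        raise ValueError("interval must be positive")
    return sessions * 1000.0 / interval_ms


for interval in (50, 3.3):
    pps = bfd_packets_per_second(interval, sessions=100)  # 100 is hypothetical
    print(f"{interval} ms interval, 100 sessions: ~{pps:.0f} pps each way")
```

A single session at 50 ms is only 20 packets per second in each direction; the same session at 3.3 ms is roughly 300, and the load scales linearly with the number of sessions.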
On 6/May/18 07:41, Erik Sundberg wrote:
Here is what we do...
router isis xxxx
 interface TenGigabitEthernet0/0/0/0
  circuit-type level-2-only
  bfd minimum-interval 50
  bfd multiplier 5
  bfd fast-detect ipv4
We keep the same config for local and long haul core links. Works like a champ every time.
We use the same timers for all links, but different multipliers depending on the link length. We have links as short as 5km, all the way to 14,500km. Mark.
On 7/May/18 18:46, valdis.kletnieks@vt.edu wrote:
Any words of wisdom / battle scars regarding running links that are in the 10K+ distance?
Keep repair ships nearby :-).
From a submarine perspective, there are things that are out of scope here.
From an IP perspective, we've had good experience with 250ms * 5 for BFD. Actual RTT latency is 140ms, so there is enough headroom to account for false positives.
Mark.
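As a rough sanity check on the long-haul numbers, the sketch below estimates round-trip propagation delay from fibre length and compares it with the detection time of 250 ms x 5. It assumes light in fibre covers about 200 km per millisecond and ignores equipment delay. The 14,500 km length comes from an earlier message in the thread; whether that is the same link as the one with 140 ms RTT is not stated. The "headroom" figure is just a crude comparison, not something BFD itself computes.

```python
# Assumption: light in fibre covers roughly 200 km per millisecond
# (about 5 microseconds per km); real paths add equipment delay.
FIBRE_KM_PER_MS = 200.0


def fibre_rtt_ms(length_km):
    """Approximate round-trip propagation delay over a fibre path."""
    return 2 * length_km / FIBRE_KM_PER_MS


def detection_headroom_ms(interval_ms, multiplier, rtt_ms):
    """Crude slack figure: detection time minus round-trip time."""
    return interval_ms * multiplier - rtt_ms


rtt = fibre_rtt_ms(14_500)
print(f"estimated RTT for 14,500 km: {rtt:.0f} ms")
print(f"headroom with 250 ms x 5 and 140 ms RTT: "
      f"{detection_headroom_ms(250, 5, 140)} ms")
```

With these assumptions the detection time is well over a second while the RTT is a small fraction of that, which is consistent with the remark about having enough headroom.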
Why does DWDM imply a need for BFD? DWDM has no problem propagating loss-of-signal and asserting remote failure. It is of course possible to configure DWDM so that this does not happen; particularly if you buy a protected circuit, you might not want it to happen. But as per usual, you should test that the vendor delivers what you buy. I think the probability of not detecting a failure on waves, dark fibres and copper connections is smaller than the probability of having an issue caused by the presence of BFD. I think you need a very unreliably failing link to capitalise on BFD. Radio is a good candidate; liveness masked by a switch is a good candidate. Of course you also need to be sure you don't kill your BGP's fast external failover by using eBGP multihop on your point-to-point circuit. On 21 March 2018 at 18:37, Luke Guillory <lguillory@reservetele.com> wrote:
He's asking because if it was dark the interface would go down when the link was lost and the router would pull routes. But PA to FL would lead me to believe it'll be a wave from some type of DWDM gear which brings us to BFD.
Luke Guillory Vice President – Technology and Innovation
Tel: 985.536.1212 Fax: 985.536.0300 Email: lguillory@reservetele.com
Reserve Telecommunications 100 RTC Dr Reserve, LA 70084
-----Original Message----- From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Alex Lembesis Sent: Wednesday, March 21, 2018 11:31 AM To: Job Snijders (External); Youssef Bengelloun-Zahr Cc: NANOG Subject: RE: How are you configuring BFD timers?
To speed up BGP routing convergence. The (2x) dark fiber links from PA to FL are being used as Layer3 datacenter interconnects, where each datacenter has its own AS. The DF is also carrying FCIP traffic, so we need failover to be as fast as possible.
Best regards,
Alex
-----Original Message----- From: Job Snijders (External) [mailto:job@instituut.net] Sent: Wednesday, March 21, 2018 12:25 PM To: Youssef Bengelloun-Zahr Cc: Alex Lembesis; NANOG Subject: Re: How are you configuring BFD timers?
Silly question perhaps, but why would you do BFD on dark fiber?
Kind regards,
Job
-- ++ytti
On 2018-03-21 17:24, Job Snijders wrote:
Silly question perhaps, but why would you do BFD on dark fiber?
Simple paranoia perhaps? Just because you only have layer 1 equipment (transponders) right now, doesn't guarantee that you won't need to stick in a layer 2 switch in the path tomorrow, and if you already have BFD running, you won't forget to add it at that time.
As a real-life example: we have dark fiber from our main campus to another campus in the neighbouring city, and from there we have dark fiber to a partner a few kilometers further away. We run wavelengths the entire way from the main campus to our partner (being demuxed and muxed back at the sub-campus). At one point, the fiber needed to be rerouted, and then the attenuation became too high for one of the CWDM wavelengths. Solution: put an ethernet switch at our sub-campus to act as a kind of amplifier for that wavelength. (This was using 120km CWDM gigabit transceivers directly in the routers at each end. We have since retired those and use 10 gigabit DWDM with transponders and EDFA amplifiers.)
Yes, it was a duct-tape solution, but it was cheap and got the work done. :-)
/Thomas Bellman
Subject: Re: How are you configuring BFD timers? Date: Wed, Mar 21, 2018 at 04:24:47PM +0000 Quoting Job Snijders (job@instituut.net):
Silly question perhaps, but why would you do BFD on dark fiber?
Because Ethernet lacks the PRDI that real WAN protocols have.
-- Måns Nilsson primary/secondary/besserwisser/machina MN-1334-RIPE SA0XLR +46 705 989668 If I am elected no one will ever have to do their laundry again!
PS: Don't get me wrong. I'm all for Ethernet, it is cheap (or perhaps, SDH/SONET line cards were artificially expensive) and it makes networks faster more often, by virtue of interface cheapness. But one really needs to tack about half the signalling from SDH onto Ethernet (here, BFD) to get some predictability from it. Which is OK, it was made for NFS and Telnet on a LAN. It does really well considering that.
On 22 March 2018 at 22:41, Måns Nilsson <mansaxel@besserwisser.org> wrote:
Subject: Re: How are you configuring BFD timers? Date: Wed, Mar 21, 2018 at 04:24:47PM +0000 Quoting Job Snijders (job@instituut.net):
Silly question perhaps, but why would you do BFD on dark fiber?
Because Ethernet lacks the PRDI that real WAN protocols have.
Indeed, RFI on ethernet is rather modern addition, turning 20 this year. -- ++ytti
On 22 March 2018 23:45:16 +0200 Saku Ytti <saku@ytti.fi> wrote:
On 22 March 2018 at 22:41, Måns Nilsson <mansaxel@besserwisser.org> wrote:
Subject: Re: How are you configuring BFD timers? Date: Wed, Mar 21, 2018 at 04:24:47PM +0000 Quoting Job Snijders (job@instituut.net):
Silly question perhaps, but why would you do BFD on dark fiber?
Because Ethernet lacks the PRDI that real WAN protocols have.
Indeed, RFI on ethernet is rather modern addition, turning 20 this year.
(You just reminded me I've been doing some sort of WAN network ops for about 20 years.) That does indeed solve the problem for dark fibre, and those lucky WDM systems that actually reflect input status to output. Not always true, I'm afraid (just look at the Ethernet switch mid-span that Thomas Bellman wrote about; a fitting metaphor for all "ethernet-over-other.." models..). Ethernet still regards "no frames seen on the yellow coax" as an opportunity to send traffic rather than an error, if we're talking old things ;-). BFD solves that, and it is worthwhile to have one setup regardless of technology, if possible. -- Måns Nilsson primary/secondary/besserwisser/machina MN-1334-RIPE SA0XLR +46 705 989668 CHUBBY CHECKER just had a CHICKEN SANDWICH in downtown DULUTH!
Not directly related, but I wonder: how common is micro-BFD for detecting bundle member failures?
I'm not sure it's used by a large proportion of operators, but it is deployed in some volume in a number of networks that I'm aware of. During the development of implementations, we hosted inter-op testing/fixing at a previous employer. Rolling it out had started when I moved on, but I expect it is across their global deployments at this point. I haven't heard anything to say that it's causing any issues.
[I am still somewhat unable to reconcile myself with the use of BFD in this deployment; some Ethernet OAM seemed a reasonable per-member solution to me, but folks have a preference for a single protocol here.]
r.
On Wed, 28 Mar 2018 at 10:15 Arie Vayner <ariev@vayner.net> wrote:
Not directly related, but I wonder: how common is micro-BFD for detecting bundle member failures?
Going back to the original question;
From another angle, your link isn’t dark fibre or a wave but, for example, ethernet over some sort of IP based L2 Transport, and is still a low (sub 1ms) one-way latency local or metro link. How do you set your timers, and what do you base that on?
Personally I don't care if it's a wavelength, dark fibre or L2 VPN service. I don't treat them differently based on the underlying connectivity type. The SLAs are probably more important. But if we are paying for, say, 10G of capacity on a link which is, say, a 10G pseudowire from another carrier, I treat it the same as a dark fibre connected to 10G transceivers at each end. Wavelengths are generally more stable in my opinion; we did have a 10G L2 Ethernet circuit from a carrier that was essentially a pseudowire from them, and their PE was under a DDoS attack, so our L2 VPN service was affected (because the pseudowire was flapping up and down). But once the circuit has been up and running for a while, if you're regularly pushing somewhere near the max circuit bandwidth and monitoring circuit latency, you'll get a feel for "how good" the carrier is and can then adjust from there. Generally speaking though, if the carrier is "good" I treat DF/lambda/L2 circuits the same with regard to BFD/IGP tuning.
I made some assumptions above about support for echo mode and hardware offload, but what if (some of) your gear doesn’t support some or all of that stuff? How do you factor your configuration decisions?
Elsewhere in the thread you have mentioned that you are using Cisco ME3600 devices. If you disable BFD echo mode you will be able to get low timers on these devices. Echo mode is enabled by default on IOS when you enable BFD under an interface, which these devices don't support, so you need to explicitly disable it. See the min/max/avg BFD timers below between two ME devices when the interfaces are configured with "bfd interval 50 min_rx 50 multiplier 3":
  ME3600#show bfd neighbors interface te0/2 details
  ...
  Session state is UP and using echo function with 50 ms interval.
  Session Host: Software
  ...
  Rx Count: 72, Rx Interval (ms) min/max/avg: 1/4976/4323 last: 2348 ms ago
  Tx Count: 74, Tx Interval (ms) min/max/avg: 1/4968/4217 last: 1436 ms ago
If you add the command "no bfd echo" to the interface you should see the following min/max/avg BFD timers:
  ME3600#show bfd neighbors interface te0/2 details
  ...
  Session state is UP and not using echo function.
  Session Host: Software
  ...
  Rx Count: 3314443, Rx Interval (ms) min/max/avg: 1/72/47 last: 36 ms ago
  Tx Count: 3310865, Tx Interval (ms) min/max/avg: 1/72/47 last: 40 ms ago
We have a mixture of devices and they don't all support BFD echo mode. We have for example Cisco ASR9000s that support both echo / no echo mode, so it may have one interface towards a Juniper MX running BFD echo mode and one interface towards a Cisco ME which runs no echo mode. It's working fine for us.
Cheers, James.
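To check many interfaces for the symptom shown above, the Rx/Tx interval lines can be pulled out of pasted "show bfd neighbors ... details" output with a small Python sketch like the one below. The regular expression only targets the line format quoted in this message, and the function names and the 1.5x tolerance are hypothetical choices for illustration, not anything Cisco documents.

```python
import re

# Matches e.g. "Rx Interval (ms) min/max/avg: 1/72/47"
INTERVAL_RE = re.compile(r"(Rx|Tx) Interval \(ms\) min/max/avg: (\d+)/(\d+)/(\d+)")

# Interval lines copied from the two outputs quoted above.
ECHO_ON = """Rx Count: 72, Rx Interval (ms) min/max/avg: 1/4976/4323 last: 2348 ms ago
Tx Count: 74, Tx Interval (ms) min/max/avg: 1/4968/4217 last: 1436 ms ago"""
ECHO_OFF = """Rx Count: 3314443, Rx Interval (ms) min/max/avg: 1/72/47 last: 36 ms ago
Tx Count: 3310865, Tx Interval (ms) min/max/avg: 1/72/47 last: 40 ms ago"""


def parse_intervals(show_output: str) -> dict:
    """Return {'Rx': (min, max, avg), 'Tx': (min, max, avg)} in milliseconds."""
    return {m.group(1): tuple(int(x) for x in m.group(2, 3, 4))
            for m in INTERVAL_RE.finditer(show_output)}


def looks_healthy(stats: dict, configured_ms: int, tolerance: float = 1.5) -> bool:
    """HYPOTHETICAL rule of thumb: worst observed interval stays within
    tolerance x the configured interval in both directions."""
    return all(worst <= configured_ms * tolerance
               for (_best, worst, _avg) in stats.values())


if __name__ == "__main__":
    for label, text in (("echo", ECHO_ON), ("no echo", ECHO_OFF)):
        stats = parse_intervals(text)
        print(label, stats, "healthy:", looks_healthy(stats, configured_ms=50))
```

With the configured 50ms interval, the echo-mode sample is flagged (worst intervals near 5 seconds) while the "no bfd echo" sample passes.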
Thanks to everyone who has responded so far. Enlightening! My understanding around the origins of BFD is that it was developed in part to try and bring SONET-like switchover times to an Ethernet world. What I’m reading is that, of those who do run BFD, no one seems to be dialing it down to try and achieve those times. Some folks explained why they chose the values they did, but others didn’t. So my follow-up question is “Why don’t you dial them down?”. Is achieving those switchover times not important for your use case? Do you not trust that it will be reliable based on the gear you’re using, or the quality/reliability of the underlying circuit you’re trying to protect? Something else? Also, interesting to read about why some folks don’t care much about BFD at all.
participants (17)
- Alex Lembesis
- Arie Vayner
- Bryan Holloway
- Chuck Anderson
- Erik Sundberg
- James Bensley
- Jason Lixfeld
- Job Snijders
- Ken Matlock
- Luke Guillory
- Mark Tinka
- Måns Nilsson
- Rob Shakir
- Saku Ytti
- Thomas Bellman
- valdis.kletnieks@vt.edu
- Youssef Bengelloun-Zahr