Congestion/latency-aware routing for MPLS?
Using a mix of Juniper hardware...
The network provides VPLS to the customer, over MPLS (obviously), in a dual-redundant-ring radio topology. Each site is connected to one or more neighbors, generally with two radios, in two different bands, to *each* neighbor. So an ordinary node might have 4 radios, 2 pointing in each direction.
Every single radio link has different bandwidth, different latency, and different interference characteristics.
These radio links do run at 100% capacity at least some of the time.
It's possible to set each link's relative cost in OSPF or IS-IS, of course, but I haven't found a way to make the router react to latency changes on one link or the other. (Right now, I think costs are set equal so traffic will use both links.) This means interference in one band invisibly diminishes the Ethernet bandwidth available and silently increases the latency on that link, sometimes dramatically. This seems to do interestingly unpleasant things to the client's flows.
It's generally true that one band will be much more severely affected than the other in any interference event. Before anyone asks, I'm told the network is a mixture of licensed and unlicensed bands, and that's not changing anytime soon.
In a perfect world, I'd like the routers to dynamically adjust traffic balance, but even just temporarily halting use of the impaired link would be helpful (or so I believe right now, at least).
Is this a pipe dream? I'm not seeing anything in JunOS that could accomplish this... I'm not even sure if a mesh protocol could handle dual active links like this?
Ideas, comments, etc. all appreciated.
Also, I'm not the direct operator of the network. I'm involved, but mostly just trying to help them find better solutions. Nor am I an MPLS expert, as is obvious here.
Thanks,
-Adam
Adam Thompson
Consultant, Infrastructure Services
MERLIN
100 - 135 Innovation Drive
Winnipeg, MB R3T 6A8
(204) 977-6824 or 1-800-430-6404 (MB only)
https://www.merlin.mb.ca
Chat with me on Teams: https://teams.microsoft.com/l/chat/0/0?users=athompson@merlin.mb.ca
Hi Adam,
This sounds like a use case for MPLS-TE with TWAMP-Light. TWAMP-Light handles the latency concern and can encode your measured latency in IS-IS. Juniper docs: https://www.juniper.net/documentation/us/en/software/junos/is-is/topics/topi.... The configuration in steps 5 and 7 is all that's required (from a config standpoint) to get the data into IS-IS. Then, when building an RSVP LSP, you would specify a constraint for the latency. Alternatively, you can route by latency on its own by setting the metric to latency, but as you've alluded to, this can be pretty dangerous in environments with mixed bandwidth availability.
The other option, for your second point on traffic balance, is to use auto-bandwidth (https://www.juniper.net/documentation/us/en/software/junos/mpls/topics/topic... - see also https://archive.nanog.org/sites/default/files/tues.general.steenbergen.autob...).
Other vendors support this as well. SR supports the use of TWAMP-Light as well if you prefer that over RSVP, but it doesn't support auto-bandwidth.
_______________________
Jason R. Rokeach
m: 603.969.5549 e: jason@rokea.ch tg: jasonrokeach
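To make the "constraint for the latency" idea above a little more concrete, here is a minimal Python sketch of a delay-constrained shortest-path computation, roughly the kind of decision a TE path computation can make once measured delay is available in the IGP. It is only an illustration: the function name, the site and link names, the delay values, and the `max_delay_us` bound are all made up, and this is not how Junos implements CSPF.

```python
import heapq


def lowest_delay_path(links, src, dst, max_delay_us=None):
    """Dijkstra over measured link delay, with an optional upper bound.

    links: iterable of (from_node, to_node, link_name, delay_us) tuples.
           Links are directed, and parallel links between the same pair
           of nodes are allowed (e.g. two radios in different bands).
    Returns (total_delay_us, [link names]) or None if no path qualifies.
    """
    adjacency = {}
    for a, b, name, delay in links:
        adjacency.setdefault(a, []).append((b, name, delay))

    best = {src: 0}
    queue = [(0, src, [])]
    while queue:
        delay, node, path = heapq.heappop(queue)
        if node == dst:
            # The first time dst is popped, this is the lowest-delay path.
            if max_delay_us is not None and delay > max_delay_us:
                return None
            return delay, path
        if delay > best.get(node, float("inf")):
            continue  # stale queue entry, a better path was already found
        for neighbor, name, link_delay in adjacency.get(node, []):
            new_delay = delay + link_delay
            if new_delay < best.get(neighbor, float("inf")):
                best[neighbor] = new_delay
                heapq.heappush(queue, (new_delay, neighbor, path + [name]))
    return None


# Hypothetical ring segment with two parallel radios per hop.
links = [
    ("A", "B", "A-B licensed", 900),
    ("A", "B", "A-B unlicensed", 400),
    ("B", "C", "B-C licensed", 800),
    ("B", "C", "B-C unlicensed", 500),
]
print(lowest_delay_path(links, "A", "C"))
```

If interference pushes the measured delay on one radio up, re-running the computation with the new value moves the path onto the other radio, and a path that cannot meet `max_delay_us` at all is rejected rather than used.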
We have been hoping to find use cases for the Babel protocol's RTT metric, which builds on ideas from NTP and is primarily used today in overlay networks: https://datatracker.ietf.org/doc/draft-ietf-babel-rtt-extension/
-- Oct 30: https://netdevconf.info/0x17/news/the-maestro-and-the-music-bof.html Dave Täht CSO, LibreQos
Auto-bandwidth won't help here if the bandwidth reduction is 'silent', as stated in the first message. A 1G interface, as far as RSVP is concerned, is a 1G interface, even if radio interference across it means it's effectively a 500M link.
Theoretically, you could have some sort of automation in place that dynamically detects available bandwidth over the path and then reconfigures the RSVP bandwidth configured on the interface to reflect that, so the next auto-bandwidth calculation would take it into account. However, the efficacy of this would depend on the length of the RF disruption that caused the bandwidth reduction. Assuming your detection time was near instant (which is saying something), you'd still have to have very aggressive auto-BW timers to adjust to it quickly enough, and there are other downsides to doing that.
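The trade-off described above, between reacting quickly and churning on every short RF blip, can be sketched as a small damping filter. This is a hypothetical Python illustration only: the class name, the 20% threshold, and the three-sample hold are assumptions, and how capacity would actually be measured or how a new RSVP bandwidth would be pushed to a router is deliberately left out.

```python
class BandwidthDamper:
    """Only report a new bandwidth after a change has persisted.

    change_fraction: relative deviation from the advertised value that
                     counts as a real change (assumed 20%).
    hold_samples:    consecutive deviating samples required before the
                     advertised value is updated (assumed 3).
    """

    def __init__(self, initial_mbps, change_fraction=0.2, hold_samples=3):
        self.advertised_mbps = initial_mbps
        self.change_fraction = change_fraction
        self.hold_samples = hold_samples
        self._streak = 0

    def observe(self, measured_mbps):
        """Feed one capacity sample; return a value to advertise, or None."""
        baseline = max(self.advertised_mbps, 1.0)  # avoid dividing by zero
        deviation = abs(measured_mbps - self.advertised_mbps) / baseline
        if deviation < self.change_fraction:
            self._streak = 0  # back within tolerance, forget the blip
            return None
        self._streak += 1
        if self._streak < self.hold_samples:
            return None  # not persistent enough yet
        self._streak = 0
        self.advertised_mbps = measured_mbps
        return measured_mbps


damper = BandwidthDamper(initial_mbps=1000)
for sample in [1000, 500, 990, 500, 510, 490, 495]:
    update = damper.observe(sample)
    if update is not None:
        print("would reconfigure bandwidth to", update, "Mbps")
```

A larger `hold_samples` ignores more short disruptions but reacts later to long ones; a smaller one does the opposite, which is the same tension as the aggressive auto-BW timers mentioned above.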
On Wed, Oct 18, 2023 at 7:38 AM Tom Beecher <beecher@beecher.cc> wrote:
Auto-bandwidth won't help here if the bandwidth reduction is 'silent' as stated in the first message. A 1G interface , as far as RSVP is concerned, is a 1G interface, even if radio interference across it means it's effectively a 500M link.
Theoretically, you could have some sort of automation in place that dynamically detected available bandwidth over the path, and then re-configure the RSVP configured bandwidth for the interface to reflect that so the next auto-bandwidth calculation would take that into account. However, the efficacy of this would depend on the length of the RF disruption that caused BW reduction. Assuming your detection time was near instant ( which is saying something ) ,you'd still have to have very aggressive auto-BW timers to adjust to it quickly enough, and there are other downsides to doing that.
I have always been curious as to what extent RED is deployed on Junos (in, for example, MPLS networks)? I had some pretty bad results with some MX gear out of my control a while back; couldn't fix it, slapped CAKE on it, grumpily blogged, moved on. https://blog.cerowrt.org/post/juniper/ What kind of latency swings are observable today?
-- Oct 30: https://netdevconf.info/0x17/news/the-maestro-and-the-music-bof.html Dave Täht CSO, LibreQos
On Wed, 18 Oct 2023 at 17:39, Tom Beecher <beecher@beecher.cc> wrote:
Auto-bandwidth won't help here if the bandwidth reduction is 'silent' as stated in the first message. A 1G interface , as far as RSVP is concerned, is a 1G interface, even if radio interference across it means it's effectively a 500M link.
Jason also explained the TWAMP + latency solution, which is an active solution: it doesn't rely on the operator or on auto-bandwidth to provide the information. Instead, the network automatically measures latency and encodes it in IS-IS, allowing automatic traffic engineering to choose the lowest-latency path for the LSP. I believe Jason's proposal is exactly what OP is looking for.
-- ++ytti
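For readers unfamiliar with how an active measurement like this produces a latency figure, here is a minimal Python sketch of the usual four-timestamp calculation. The function names and sample values are made up for illustration; halving the round trip assumes a symmetric path, and the 24-bit microsecond clamp reflects my understanding of the IS-IS delay sub-TLV (RFC 8570), not anything stated in this thread.

```python
MAX_ADVERTISED_DELAY_US = 0xFFFFFF  # assumption: 24-bit microsecond field


def two_way_delay_us(t1, t2, t3, t4):
    """Round-trip delay with the reflector's processing time removed.

    t1: sender transmit time     (sender clock, microseconds)
    t2: reflector receive time   (reflector clock)
    t3: reflector transmit time  (reflector clock)
    t4: sender receive time      (sender clock)
    The two clocks do not need to be synchronised, because each
    difference is taken on a single clock.
    """
    return (t4 - t1) - (t3 - t2)


def average_one_way_delay_us(samples):
    """Average one-way delay over several probes, assuming symmetry."""
    one_way = [two_way_delay_us(*sample) / 2 for sample in samples]
    average = round(sum(one_way) / len(one_way))
    return min(average, MAX_ADVERTISED_DELAY_US)


# Two hypothetical probes; the reflector clock is offset by about one second.
probes = [
    (0, 1_000_450, 1_000_500, 950),
    (10_000, 1_010_700, 1_010_760, 11_360),
]
print(average_one_way_delay_us(probes))
```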
I believe Jason's proposal is exactly what OP is looking for.
I would agree.
In addition to RSVP, it may be worth using minimum modulation settings on the radios, if possible, i.e. so that links drop completely and you re-route rather than run with less bandwidth.
That's not a good option for bad weather, depending on the region. Rain fade and other effects at 24 GHz and above can degrade a set of links, and degraded links are sometimes better than having no links at all. The encoding and error-correcting capabilities play a crucial part in having a good connection.
Ryan
I remember some time back Juniper had a feature that would listen to SNMP (?) from the radios and adjust OSPF cost. My search-fu is failing right now, but I think they had a paper on the topic also.
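As a rough illustration of the "adjust OSPF cost from what the radio reports" idea, the Python sketch below derives a cost from a radio's current capacity using the familiar reference-bandwidth rule. It is not the Juniper feature mentioned above: the function name and the 10 Gbps reference are assumptions, and how the capacity figure is obtained from the radio (SNMP or otherwise) is not shown.

```python
OSPF_MAX_COST = 65535  # interface cost is a 16-bit value


def ospf_cost(capacity_mbps, reference_mbps=10_000):
    """Cost = reference bandwidth / current capacity, clamped to 1..65535.

    capacity_mbps is whatever the radio currently reports for its
    modulation state; a link reporting no capacity gets the worst cost.
    """
    if capacity_mbps <= 0:
        return OSPF_MAX_COST
    cost = round(reference_mbps / capacity_mbps)
    return max(1, min(OSPF_MAX_COST, cost))


# A radio that drops from 1000 Mbps to 500 Mbps doubles its cost.
for capacity in (1000, 500, 0):
    print(capacity, "Mbps ->", ospf_cost(capacity))
```

In practice some damping would be needed before applying a new cost, since every change triggers an SPF run across the area.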
participants (8)
- Adam Thompson
- Dave Taht
- Jason R. Rokeach
- Jerry Jones
- Mark Tees
- Ryan Hamel
- Saku Ytti
- Tom Beecher