best practice for advertising peering fabric routes
I have a connection to a peering fabric and I'm not distributing the peering fabric routes into my network.

I see three options
1. redistribute into my igp (OSPF)
2. configure ibgp and route them within that infrastructure. All the default routes go out through the POPs so iBGP would see packets destined for the peering fabric and route it that-a-way
3. leave it "as is", and let the outbound traffic go out my upstreams and the inbound traffic come back through the peering fabric

Advantages and disadvantages, pros and cons? Recommendations? Experiences, good and bad?

I have 5 POPs, 2 OSPF areas, and have not brought iBGP up between the POPs yet. That's another issue completely from a planning perspective.

thanks
Eric

On Jan 14, 2014 6:01 PM, "Eric A Louie" <elouie@yahoo.com> wrote:
> I have a connection to a peering fabric and I'm not distributing the
> peering fabric routes into my network.
>
> I see three options
> 1. redistribute into my igp (OSPF)
> 2. configure ibgp and route them within that infrastructure. All the default routes go out through the POPs so iBGP would see packets destined for the peering fabric and route it that-a-way
> 3. leave it "as is", and let the outbound traffic go out my upstreams and the inbound traffic come back through the peering fabric
>
> Advantages and disadvantages, pros and cons? Recommendations? Experiences, good and bad?
>
> I have 5 POPs, 2 OSPF areas, and have not brought iBGP up between the POPs yet. That's another issue completely from a planning perspective.
>
> thanks
> Eric

http://tools.ietf.org/html/rfc5963

I like no-export

On Tue, Jan 14, 2014 at 9:09 PM, Cb B <cb.list6@gmail.com> wrote:
> On Jan 14, 2014 6:01 PM, "Eric A Louie" <elouie@yahoo.com> wrote:
>> I have a connection to a peering fabric and I'm not distributing the
>> peering fabric routes into my network.

good plan.

>> I see three options
>> 1. redistribute into my igp (OSPF)
>> 2. configure ibgp and route them within that infrastructure. All the default routes go out through the POPs so iBGP would see packets destined for the peering fabric and route it that-a-way
>> 3. leave it "as is", and let the outbound traffic go out my upstreams and the inbound traffic come back through the peering fabric

4. all peering-fabric routes get next-hop-self on your peering router before going into ibgp... all the rest of your network sees your local loopback as nexthop and things just work.

>> Advantages and disadvantages, pros and cons? Recommendations? Experiences, good and bad?
>>
>> I have 5 POPs, 2 OSPF areas, and have not brought iBGP up between the POPs yet. That's another issue completely from a planning perspective.
>>
>> thanks
>> Eric
>
> http://tools.ietf.org/html/rfc5963
>
> I like no-export

Pardon the top post, but I really don't have anything to comment below other than to agree with Chris and say rfc5963 is broken.

NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device not directly attached to that LAN. Period.

Doing so endangers your peers & the IX itself. It is on the order of not implementing BCP38, except no one has the (lame, ridiculous, idiotic, and pure cost-shifting BS) excuse that they "can't" do this.

--
TTFN,
patrick

On Jan 14, 2014 7:13 PM, "Patrick W. Gilmore" <patrick@ianai.net> wrote:
> Pardon the top post, but I really don't have anything to comment below
> other than to agree with Chris and say rfc5963 is broken.
>
> NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An
> IXP LAN should not be reachable from any device not directly attached to that LAN. Period.
>
> Doing so endangers your peers & the IX itself. It is on the order of not
> implementing BCP38, except no one has the (lame, ridiculous, idiotic, and pure cost-shifting BS) excuse that they "can't" do this.

+1. RFC 5963 needs to update that guidance.

Set next hop self loopback0 and done

CB

Thank you - I will heed the warning. I want to be a good community member and make sure we're maintaining the agreed-upon practices (I'll re-read/review my agreement with the IXP)

So if that is the case, I have to rely on the peering fabric to just return traffic, since the rest of my network (save the directly connected router) will not know about those routes outbound? And what about my customers who are counting on me routing their office traffic through my network into the peering fabric to their properties? (I have one specifically who is eventually looking for that capability) Do I have to provide them some sort of VPN to make that happen across my network to the peering fabric router?

Never mind, I just carefully re-read the point. Right, I'll filter the prefix(es) of the IXP LAN(s) that I'm connected to and not let THAT get out, no reason to advertise it since no traffic ever goes to it.

That still has me asking how best to advertise the rest of the public prefixes coming from the other fabric members.
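
[A minimal sketch of the IXP LAN filter Eric describes, not anyone's production policy. The prefixes are documentation ranges standing in for a real IXP LAN, and the names `IXP_LANS`, `inside_ixp_lan` and `exportable` are hypothetical. It only shows the idea: drop the IXP LAN prefix and anything more specific before routes leave the peering router, and let member prefixes through.]

```python
import ipaddress

# Hypothetical IXP peering LAN prefixes (documentation ranges, not a real IXP).
IXP_LANS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8:1::/64"),
]

def inside_ixp_lan(prefix: str) -> bool:
    """True if prefix equals, or is more specific than, an IXP LAN prefix."""
    net = ipaddress.ip_network(prefix)
    # Compare only same-family networks; subnet_of() rejects mixed versions.
    return any(net.version == lan.version and net.subnet_of(lan) for lan in IXP_LANS)

def exportable(prefixes):
    """Keep only prefixes that are safe to carry beyond the peering router."""
    return [p for p in prefixes if not inside_ixp_lan(p)]

candidates = [
    "192.0.2.0/24",      # the IXP LAN itself: filtered
    "192.0.2.128/25",    # a more-specific of the IXP LAN: filtered
    "198.51.100.0/24",   # a member's public prefix: kept
    "2001:db8:1::/64",   # the IPv6 IXP LAN: filtered
    "2001:db8:2::/48",   # a member's IPv6 prefix: kept
]
print(exportable(candidates))
```

[The check uses "equal or more specific" rather than "overlaps" on purpose, so that a covering route such as a default is not caught by the filter.]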

On Wed, Jan 15, 2014 at 1:36 AM, Eric A Louie <elouie@yahoo.com> wrote:
> Never mind, I just carefully re-read the point. Right, I'll filter the prefix(es) of the IXP LAN(s) that I'm connected to and not let THAT get out, no reason to advertise it since no traffic ever goes to it. That still has me asking to how best to advertise the rest of the public prefixes coming from the other fabric members.

on your ibgp peers on 'your-router' you'd have something like:
  match community <community-added-for-all-ixp-participant-routes>
  set next-hop-self

<http://www.cisco.com/en/US/tech/tk365/technologies_q_and_a_item09186a00800949e8.shtml#eleven>

for one vendor's view of the situation... and there is a link to:
<http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a00800c95bb.shtml>

that's worth a read.
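
[A rough model of the "match community, set next-hop-self" policy sketched above, written in Python rather than router config so it stays vendor-neutral. The community value `64500:100`, the loopback `10.255.0.1` and the names `Route` and `ibgp_export` are all assumptions made up for illustration; real syntax depends on the platform.]

```python
from dataclasses import dataclass, replace

IXP_COMMUNITY = "64500:100"   # hypothetical tag added to routes learned at the IXP
MY_LOOPBACK = "10.255.0.1"    # hypothetical loopback of 'your-router'

@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    communities: frozenset = frozenset()

def ibgp_export(route: Route, loopback: str = MY_LOOPBACK) -> Route:
    """Export policy toward iBGP peers: rewrite the next hop on IXP-learned routes."""
    if IXP_COMMUNITY in route.communities:
        # next-hop-self: internal routers see the loopback, never an IXP LAN address
        return replace(route, next_hop=loopback)
    return route

# A member prefix learned from a peer whose address sits on the IXP LAN.
learned = Route("203.0.113.0/24", "192.0.2.10", frozenset({IXP_COMMUNITY}))
print(ibgp_export(learned))
```

[After the rewrite, the rest of the network only needs a route to the loopback, which the IGP already carries, so the IXP LAN prefix never has to leave the peering router.]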

On Wed, Jan 15, 2014 at 1:22 AM, Eric A Louie <elouie@yahoo.com> wrote:
> Thank you - I will heed the warning. I want to be a good community member and make sure we're maintaining the agreed-upon practices (I'll re-read/review my agreement with the IXP)
>
> So if that is the case, I have to rely on the peering fabric to just return traffic, since the rest of my network (save the directly connected router) will not know about those routes outbound? And what about my customers who are counting on me routing their office traffic through my network into the peering fabric to their properties? (I have one specifically who is eventually looking for that capability) Do I have to provide them some sort of VPN to make that happen across my network to the peering fabric router?

perhaps I'm confused, but you have sort of this situation:
  ixp-participants -> ixp -> your-router -> your-network -> your-customer

you get routes for ixp-participants from 'ixp'
you send to the 'ixp' (and on to 'ixp-participants') routes for 'your-customer' and 'your-network'

right?

then so long as you send 'your-customer' the routes you learn from 'ixp' (which you set 'next-hop-self' on in ibgp from 'your-router' to 'your-network', in the ibgp-mesh that you will set up) ... everything just works.

All routers behind 'your-router' in 'your-network' see 'ixp-participants' with a next-hop of 'your-router' who still knows 'send to ixp!' for the route(s) in question.
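
[A small sketch of why the next hop matters to the routers behind 'your-router'. It assumes a made-up IGP table on an internal router that contains the peering router's loopback but, as recommended in this thread, not the IXP LAN. It also ignores any default route, which in a real network could make the lookup behave differently. All addresses and interface names are hypothetical.]

```python
import ipaddress

# Hypothetical IGP table on an internal router: prefix -> outgoing interface.
IGP = {
    "10.255.0.1/32": "to-your-router",  # loopback of the peering router
    "10.0.0.0/24": "lan0",
}

def resolve(next_hop: str, igp=IGP):
    """Longest-prefix match of a BGP next hop against the IGP table."""
    addr = ipaddress.ip_address(next_hop)
    matches = [ipaddress.ip_network(p) for p in igp if addr in ipaddress.ip_network(p)]
    if not matches:
        return None  # next hop unreachable: the BGP route cannot be used
    best = max(matches, key=lambda n: n.prefixlen)
    return igp[str(best)]

print(resolve("192.0.2.10"))   # next hop left as the peer's IXP LAN address
print(resolve("10.255.0.1"))   # next hop rewritten by next-hop-self
```

[With the original next hop the internal router has nothing to resolve it against; with next-hop-self it simply forwards toward 'your-router', which still knows to send the traffic to the IXP.]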

Ok, so the right way to do it is in iBGP. That pretty much answers the question - don't redistribute those ixp-participant prefixes into my IGP.

I have a lot of iBGP homework to do, to make it work with the 5 POPs that are all taking full route feeds. I tried once and couldn't get the BGP tables working correctly with a full mesh of the 5 routers, so it looks like time to try it again, this time with a route reflector.

On 15/01/2014 07:59, Eric A Louie wrote:
> Ok, so the right way to do it is in iBGP. That pretty much answers the question - don't redistribute those ixp-participant prefixes into my IGP.

Yes, using next-hop self (rather than importing IXP routes) as pointed out earlier in this thread.

> I have a lot of iBGP homework to do, to make it work with the 5 POPs that are all taking full route feeds. I tried once and couldn't get the BGP tables working correctly with a full mesh of the 5 routers, so it looks like time to try it again, this time with a route reflector.

I don't think you need route-reflection in a 5 node iBGP. What do you mean by "couldn't get the BGP tables working correctly"?

Cheers,
mh
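
[For scale, a quick count of the iBGP sessions involved in the two designs being compared. This is only arithmetic on a hypothetical list of five POP routers, with a single route reflector assumed for simplicity; it says nothing about which design is right for a given network.]

```python
from itertools import combinations

def full_mesh_sessions(routers):
    """Every router peers with every other router: n * (n - 1) / 2 sessions."""
    return list(combinations(routers, 2))

def route_reflector_sessions(routers, reflector):
    """Every other router peers only with the single reflector: n - 1 sessions."""
    return [(reflector, r) for r in routers if r != reflector]

pops = ["pop1", "pop2", "pop3", "pop4", "pop5"]  # hypothetical router names
print(len(full_mesh_sessions(pops)))                 # 10
print(len(route_reflector_sessions(pops, "pop1")))   # 4
```

[Ten sessions is small enough to configure by hand, which is consistent with the point that a 5 node iBGP does not strictly need route reflection.]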

On Tue, Jan 14, 2014 at 10:11 PM, Patrick W. Gilmore <patrick@ianai.net> wrote:
> NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device not directly attached to that LAN. Period.
>
> Doing so endangers your peers & the IX itself. It is on the order of not implementing BCP38, except no one has the (lame, ridiculous, idiotic, and pure cost-shifting BS) excuse that they "can't" do this.

Hi Patrick,

I have to disagree with you. If it appears in a traceroute to somewhere else, I'd like to be able to ping and traceroute directly to it. When I can't, that impairs my ability to troubleshoot the all too common can't-get-there-from-here problems. The more you hide the infrastructure, the more intractable problems become for your customers.

The IXP LAN should be reachable from every device on the ASes which connect to it, not just the immediate router.

Regards,
Bill Herrin

--
William D. Herrin ................ herrin@dirtside.com bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004

On Jan 15, 2014, at 10:44 , William Herrin <bill@herrin.us> wrote:
> On Tue, Jan 14, 2014 at 10:11 PM, Patrick W. Gilmore <patrick@ianai.net> wrote:
>> NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device not directly attached to that LAN. Period.
>>
>> Doing so endangers your peers & the IX itself. It is on the order of not implementing BCP38, except no one has the (lame, ridiculous, idiotic, and pure cost-shifting BS) excuse that they "can't" do this.
>
> Hi Patrick,
>
> I have to disagree with you. If it appears in a traceroute to somewhere else, I'd like to be able to ping and traceroute directly to it. When I can't, that impairs my ability to troubleshoot the all too common can't-get-there-from-here problems. The more you hide the infrastructure, the more intractable problems become for your customers.
>
> The IXP LAN should be reachable from every device on the ASes which connect to it, not just the immediate router.

We disagree. Plus, you really can't type "ping" on the router connected to the IXP?

_If_ you can guarantee your network has zero bots, abusable [DNS|NTP|etc.] servers, all your downstreams are perfectly clean, etc., etc., then maybe I could see you carrying it in your IGP. As I know 100% of ISPs (to at least one decimal place) cannot make such a guarantee, then doing so puts the IXP and all other members - whether peers of yours or not - at risk.

Putting others at risk because you are lazy or because it makes your life easier is .. I believe I called it bad manners before.

But let's take the philosophical out of this. The prefix in question is owned by the IXP. I said in an earlier post that if you carry a prefix I own, did not announce to you, and make it very clear I specifically do not want you to carry, I will ask you to stop or face possible disconnection.

You may claim convergence (a bit of BS), troubleshooting (non-issue, IMO), or even "but I waaaaaaaaaaaant to!!1!1!" (whatever). Doesn't matter. That's not your prefix, you were not given it, and you were told not to carry it, so Do Not Carry It.

Ask your IXP if they mind whether you carry the prefix. See what they say.

--
TTFN,
patrick

UUnet once advertised the /24 for MAE-East to me (well, Net99), and because I also had it in my IGP, my network was using UUnet's backbone for west-to-east coast traffic for a couple of days until I noticed and fixed it (with next-hop-self).

I agree 100% with Patrick and others on this point. No good can come from propagating IXP address space any further than is absolutely necessary. Best not to propagate it at all.

Dave
On Wed, Jan 15, 2014 at 10:57 AM, Patrick W. Gilmore <patrick@ianai.net> wrote:
On Jan 15, 2014, at 10:44 , William Herrin <bill@herrin.us> wrote:
I have to disagree with you. If it appears in a traceroute to somewhere else, I'd like to be able to ping and traceroute directly to it. When I can't, that impairs my ability to troubleshoot the all too common can't-get-there-from-here problems. The more you hide the infrastructure, the more intractable problems become for your customers.
The IXP LAN should be reachable from every device on the ASes which connect to it, not just the immediate router.
We disagree.
Plus, you really can't type "ping" on the router connected to the IXP?
Not when I'm the downstream customer, no. It's jolly good that *you* can test, but before the rest of us can get through the layers of support which insulate you, we have to be able to convincingly test too.
As I know 100% of ISPs (to at least one decimal place) cannot make such a guarantee, then doing so puts the IXP and all other members - whether peers of yours or not - at risk. Putting others at risk because you are lazy or because it makes your life easier is .. I believe I called it bad manners before.
That makes no sense. The IXP is at no more or less risk from your customers than any other connection you have for Internet carriage. Risk which you are responsible for managing either way.
I said in an earlier post that if you carry a prefix I own, did not announce to you, and make it very clear I specifically do not want you to carry, I will ask you to stop or face possible disconnection. [...] That's not your prefix, you were not given it and told not to carry it, so Do Not Carry It.
Well yes, of course. If you participate in an IXP you follow the rules of the IXP. I respectfully question the wisdom of such a rule and the IXPs I deal with only ask that you not announce the IXP prefix externally. But it's not OK to unilaterally break the IXP's rules, however poorly conceived. Regards, Bill Herrin -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
* Patrick W. Gilmore:
NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device not directly attached to that LAN. Period.
Doing so endangers your peers & the IX itself. It is on the order of not implementing BCP38, except no one has the (lame, ridiculous, idiotic, and pure cost-shifting BS) excuse that they "can't" do this.
Any ideas why DE-CIX doesn't enforce this? One advantage is that IXP participants can perform emergency maintenance if they have isolated their IXP router from their own network.
On Jan 14, 2014, at 7:55 PM, Eric A Louie <elouie@yahoo.com> wrote:
I have a connection to a peering fabric and I'm not distributing the peering fabric routes into my network.
There's a two part problem lurking. Problem #1 is how you handle your internal routing. Most of the "big boys" will next-hop-self in iBGP all external routes. However depending on the size and configuration of your network there may be advantages to not using next-hop-self, or just putting it in your IGP. Basically, you should be doing the same thing you do for a /30 from a peer or transit provider in your network. There is one thing special about an exchange point though, for security reasons you probably want to add it to your "never accept" routing filter from peers/customers/transit providers. You don't need someone injecting a couple of more specifics to mess with your routing. Problem #2 is your customers. If you have customers that may operate default free, and they use one of the traceroute tools that not only finds the route, but then continues to probe it (like MTR, or Visual Traceroute) there can be an issue. The initial traceroute probe may return an IP on the exchange of your peer's router, but then when they subsequently source ICMP Ping to that IP there will be no route in their network, and it will simply never respond. Some call this a feature, some call this a problem. There is also an extremely rare problem where the far end of the peering exchange steps down MTU, and thus PMTU discovery is invoked, but your customers use Unicast RPF. Since the exchange LAN isn't in their table, Unicast RPF may drop the PMTU packet-too-big message, causing a timeout. If your customers have a default to you, all is well. However if they have a default to someone else, and take a table from you to selectively override the same problem can occur for any routes they select through you that also traverse the exchange. IMHO the best fix for #2 is that the exchange have an ASN, and announce the exchange LAN from that ASN, typically via the route server. You should then peer with the route server to pick up that network. 
That makes the announcement consistent, and makes it clear who operates that network, and your customers can then access it. Many exchanges do not do this, and then the next best solution might be to originate it from your ASN and announce it to your customers only, with no-export set on the way out. Various people will no doubt chime in and tell you the last two suggestions are either excellent and wonderful or the worst idea ever. Safe to say I know of networks doing both and the world has not ended. YMMV, some assembly required, batteries not included, actual conditions may affect product performance, do not taunt the happy fun ball, and consult a doctor if your network is up for more than four hours. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
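The "never accept" filter Leo describes can be sketched in a few lines. This is only an illustration of the matching logic, not any router's actual policy language: the exchange LAN prefix below is a documentation range standing in for a real IXP block, and `reject_on_import` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical exchange LAN (documentation range), standing in for a real IXP block.
IXP_LAN = ipaddress.ip_network("192.0.2.0/24")

def reject_on_import(prefix: str) -> bool:
    """Return True if a received prefix is the IXP LAN itself or a more-specific of it."""
    net = ipaddress.ip_network(prefix)
    # subnet_of() raises TypeError on mixed IPv4/IPv6, so compare versions first.
    return net.version == IXP_LAN.version and net.subnet_of(IXP_LAN)

# Prefixes as they might arrive from a peer, customer, or transit provider.
received = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24"]
accepted = [p for p in received if not reject_on_import(p)]
print(accepted)  # ['198.51.100.0/24']
```

The point of matching more-specifics as well as the exact prefix is the one made above: nobody should be able to inject a couple of longer prefixes inside the exchange LAN and pull your traffic toward them.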
On Jan 14, 2014, at 22:20 , Leo Bicknell <bicknell@ufp.org> wrote:
On Jan 14, 2014, at 7:55 PM, Eric A Louie <elouie@yahoo.com> wrote:
I have a connection to a peering fabric and I'm not distributing the peering fabric routes into my network.
There's a two part problem lurking.
Problem #1 is how you handle your internal routing. Most of the "big boys" will next-hop-self in iBGP all external routes. However depending on the size and configuration of your network there may be advantages to not using next-hop-self, or just putting it in your IGP. Basically, you should be doing the same thing you do for a /30 from a peer or transit provider in your network. There is one thing special about an exchange point though, for security reasons you probably want to add it to your "never accept" routing filter from peers/customers/transit providers. You don't need someone injecting a couple of more specifics to mess with your routing.
Problem #2 is your customers. If you have customers that may operate default free, and they use one of the traceroute tools that not only finds the route, but then continues to probe it (like MTR, or Visual Traceroute) there can be an issue. The initial traceroute probe may return an IP on the exchange of your peer's router, but then when they subsequently source ICMP Ping to that IP there will be no route in their network, and it will simply never respond. Some call this a feature, some call this a problem. There is also an extremely rare problem where the far end of the peering exchange steps down MTU, and thus PMTU discovery is invoked, but your customers use Unicast RPF. Since the exchange LAN isn't in their table, Unicast RPF may drop the PMTU packet-too-big message, causing a timeout.
If your customers have a default to you, all is well. However if they have a default to someone else, and take a table from you to selectively override the same problem can occur for any routes they select through you that also traverse the exchange.
IMHO the best fix for #2 is that the exchange have an ASN, and announce the exchange LAN from that ASN, typically via the route server. You should then peer with the route server to pick up that network. That makes the announcement consistent, and makes it clear who operates that network, and your customers can then access it. Many exchanges do not do this, and then the next best solution might be to originate it from your ASN and announce it to your customers only, with no-export set on the way out.
Various people will no doubt chime in and tell you the last two suggestions are either excellent and wonderful or the worst idea ever. Safe to say I know of networks doing both and the world has not ended. YMMV, some assembly required, batteries not included, actual conditions may affect product performance, do not taunt the happy fun ball, and consult a doctor if your network is up for more than four hours.
I've known Leo for .. well, let's just say a long time. And I have great respect for his networking abilities. But I fall into the second camp. As someone who owns & operates an IXP, and is on the board of a couple more, and helped start even more, I'm going to stick to my guns here. As for knowing networks that do both, blah, blah, blah. I know lots of networks that allow spam, don't configure BCP38, have abusable name or NTP servers, etc. and the world has not come to an end. Doesn't mean you should. Lame excuse, Leo, and beneath you to even go there. NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device not directly attached to that LAN. Period. If for no better reason, how about because it is not your prefix, and chances are the IXP does not want you to use the prefix. In fact, I challenge you to find a major IXP route server which is announcing the IXP block. But because this is a teaching list, let's go through the problems Leo mentions. Anyone who steps down MTU on an IXP is far too broken to worry about your customer having RPF and not getting PMTU. Again, I challenge you to find someone doing this today, their network would be close to unusable. As for traceroute .... Seriously? You want to increase breakage on the Internet because it might cause 3 stars in a traceroute? Puh-LEEEZE. Sorry, neither of those pass the sniff test, IMHO. So Just Don't Do It. Setting next-hop-self is not just for "big guys", the crappiest, tiniest router that can do peering at an IXP has the same ability. Use it. Stop putting me and every one of your peers in danger because you are lazy. -- TTFN, patrick
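For readers following along, here is a rough sketch of why next-hop-self lets you keep the exchange LAN out of the IGP. It models only the next-hop resolution step, under assumed addresses: the loopbacks and the peer address are hypothetical, and `resolves` is an illustrative helper, not a router feature.

```python
import ipaddress

def resolves(next_hop: str, igp_routes: list[str]) -> bool:
    """True if at least one IGP route covers the BGP next hop."""
    addr = ipaddress.ip_address(next_hop)
    return any(addr in ipaddress.ip_network(route) for route in igp_routes)

# Hypothetical IGP contents: router loopbacks only, the IXP LAN is deliberately absent.
IGP = ["10.0.0.1/32", "10.0.0.2/32"]

PEER_ON_IXP = "192.0.2.10"     # next hop as learned over eBGP at the exchange (assumed)
BORDER_LOOPBACK = "10.0.0.1"   # next hop after the border router sets next-hop-self (assumed)

print(resolves(PEER_ON_IXP, IGP))      # False: an interior router cannot use the route
print(resolves(BORDER_LOOPBACK, IGP))  # True: the route resolves via the border router
```

With the next hop rewritten to the border router's own address, interior routers never need a route to the exchange LAN, which is the behaviour being argued for above.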
On Jan 14, 2014, at 9:35 PM, Patrick W. Gilmore <patrick@ianai.net> wrote:
So Just Don't Do It. Setting next-hop-self is not just for "big guys", the crappiest, tiniest router that can do peering at an IXP has the same ability. Use it. Stop putting me and every one of your peers in danger because you are lazy.
I'm going to have to disagree here with Patrick, because this is security through obscurity, and that doesn't work well. For some history about why people like Patrick take the position he did, read: http://blog.cloudflare.com/the-ddos-that-almost-broke-the-internet Exchange points got attacked, so people yanked them from the routing table hoping to prevent attacks. If you're on this list it should take you all of about 3 seconds to realize the attackers could do a traceroute, and attack the IP one hop on the far side of the exchange for a few dozen providers and still cause all sorts of havoc, or do any of another half dozen things I won't mention to cause problems. The effect would be nearly, if not perfectly identical, since that traffic still has to cross the exchange. I'll point out the MTU step-down issue is real, and it's part of why we can't have 9K MTU exchanges be the default on the Internet, which would really make things better for a significant number of users. I think Patrick is a bit quick to dismiss some of the potential issues. Every link on every router is subject to attack. Exchange point LAN's really aren't special in that regard. If anything the only thing that makes them slightly special is that they may in fact be more oversubscribed than most links. Where a backbone might have a router with 20x10GE, so attackers could try and drive 190GE out a 10GE in theory; an exchange point may have 100 people with 20x10GE coming in. An alternate view is that mega-exchange points are massively oversubscribed potential single points of failure, and perhaps network operators should consider that. While a DDOS taking an exchange down for half a day is bad, imagine if there was a more sinister attack, taking out the physical infrastructure of an exchange. That can't be "fixed" with a routing advertisement. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
On Jan 14, 2014, at 23:03 , Leo Bicknell <bicknell@ufp.org> wrote:
On Jan 14, 2014, at 9:35 PM, Patrick W. Gilmore <patrick@ianai.net> wrote:
So Just Don't Do It. Setting next-hop-self is not just for "big guys", the crappiest, tiniest router that can do peering at an IXP has the same ability. Use it. Stop putting me and every one of your peers in danger because you are lazy.
I'm going to have to disagree here with Patrick, because this is security through obscurity, and that doesn't work well.
Leo, each of your points below is incorrect. I'm happy to discuss off-list if you'd like.
For some history about why people like Patrick take the position he did, read: http://blog.cloudflare.com/the-ddos-that-almost-broke-the-internet
Exchange points got attacked, so people yanked them from the routing table hoping to prevent attacks. If you're on this list it should take you all of about 3 seconds to realize the attackers could do a traceroute, and attack the IP one hop on the far side of the exchange for a few dozen providers and still cause all sorts of havoc, or do any of another half dozen things I won't mention to cause problems. The effect would be nearly, if not perfectly identical, since that traffic still has to cross the exchange.
Let's take just the incident mentioned in the blog post above (which is pretty broken itself, but hey, who said the CEO of a CDN had to know anything about networking... ? :). To where would the attacker traceroute -to-? Somewhere inside Cloudflare? Other LINX members? Remember, most of the attack was sourced from networks which were not attached to the LINX. If the source network or the source network's upstreams are not LINX members, there is probably _no_ path that goes through LINX. Even if they are members, lots of networks have alternative paths (other IXPs, private interconnections, etc.). For instance, sources in Germany may well flow over DE-CIX even if there is a peering session at LINX. Etc. There is no single or set of IP addresses that will guarantee even a majority of packets traverse a specific IXP except the IXP LAN. Also, the attack was reflected DNS, so the attacker couldn't actually perform the traceroutes you suggest from each source as he did not control the sources. He _might_ be able to find _some_ of the paths with a lot of sleuthing through route & traceroute servers, but that would make things massively more difficult, as well as massively cut the number of servers he can abuse to the same effect. Both of which are huge wins for the good guys. Pulling the IXP prefix has enormous benefits and essentially no downside. I know literally hundreds of ISPs large & small who do not carry the IXP prefix, and none have seen any significant issues (most have seen zero, a few get asked about 3 stars, but as I said before, puh-leeeeze). I'm a bit surprised you even tried to bring this up. I know you well enough to know you would have realized all of the above if you had thought about it for a while (or just asked).
I'll point out the MTU step-down issue is real, and it's part of why we can't have 9K MTU exchanges be the default on the Internet, which would really make things better for a significant number of users. I think Patrick is a bit quick to dismiss some of the potential issues.
MTU step-down is a real issue, and it's real enough whether IXP LANs are in the DFZ or not. Let's solve the overarching problem before doing something which has real, proven harm and leaves the root cause in place. Besides, the two VLAN method already exists in multiple places and it hasn't helped adoption of 9K packets. Unless you are talking about letting some people attach with 1500 MTU and others with 9000 MTU? 'Cause if that's what you meant, then I'm just going to call you loony and ask what you're smoking.
Every link on every router is subject to attack. Exchange point LAN's really aren't special in that regard. If anything the only thing that makes them slightly special is that they may in fact be more oversubscribed than most links. Where a backbone might have a router with 20x10GE, so attackers could try and drive 190GE out a 10GE in theory; an exchange point may have 100 people with 20x10GE coming in. An alternate view is that mega-exchange points are massively oversubscribed potential single points of failure, and perhaps network operators should consider that. While a DDOS taking an exchange down for half a day is bad, imagine if there was a more sinister attack, taking out the physical infrastructure of an exchange. That can't be "fixed" with a routing advertisement.
IXPs are more special because they are shared. Other links are between you & one other network not hundreds of other networks, some of whom have no relationship with you. If you don't like the rules of IXPs, don't join one. But hooking up to one and deciding "I'm going to carry this prefix" even when told not to is .. well, let's call it bad manners. As for the rest, nothing is a silver bullet. Claiming "this doesn't solve every possible problem so we shouldn't do it" is even more lame than your first excuse that the world hasn't ended. This solves lots of real, provable problems. It is trivial to implement. There is no network which peers at an IXP and cannot implement it. It _has_ been implemented 1000s of times without the harm you mention. In short, it should be done. I repeat: NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device except those directly attached to that LAN. Period. If you join one of my IXPs and I find you are carrying a prefix I own, did not advertise to you, and specifically told you not to carry, I'm going to ask you to stop immediately or face possible disconnection. The other members of my IXP should not be endangered because you don't like to follow the rules. What's more, I get a lot more people thanking me for doing that than complaining about it. -- TTFN, patrick
On Jan 15, 2014, at 11:41 AM, Patrick W. Gilmore <patrick@ianai.net> wrote:
I repeat: NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device except those directly attached to that LAN. Period.
+1 Again, folks, this isn't theoretical. When the particular attacks cited in this thread were taking place, I was astonished that the IXP infrastructure routes were even being advertised outside of the IXP network, because of these very issues. IXPs are not the problem when it comes to breaking PMTU-D. The problem is largely with enterprise networks, and with 'security' vendors who've propagated the myth that simply blocking all ICMP somehow increases 'security'. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> Luck is the residue of opportunity and design. -- John Milton
On Jan 15, 2014, at 12:02 AM, "Dobbins, Roland" <rdobbins@arbor.net> wrote:
Again, folks, this isn't theoretical. When the particular attacks cited in this thread were taking place, I was astonished that the IXP infrastructure routes were even being advertised outside of the IXP network, because of these very issues.
I know a lot of people push next-hop-self, and if you're a large ISP with thousands of BGP customers it is pretty much required to scale. However, a good engineer would know there are drawbacks to next-hop-self, in particular it slows convergence in a number of situations. There are networks where fast convergence is more important than route scaling, and thus the traditional design of BGP next-hops being edge interfaces, and edge interfaces in the IGP, performs better. By attempting to force IX participants to not put the route in IGP, those IX participants are collectively deciding on a slower converging network for everyone. I don't like a world where connecting to an exchange point forces a particular network design on participants.
IXPs are not the problem when it comes to breaking PMTU-D. The problem is largely with enterprise networks, and with 'security' vendors who've propagated the myth that simply blocking all ICMP somehow increases 'security'.
That's some circular reasoning. Networks won't 9K peer at exchange points for a number of reasons, including PMTU-D issues. Since there is virtually no 9K peering at exchange points, PMTU-D is a non-issue. Maybe if IXP design didn't break PMTU-D it would help attract more 9K peers, or there might even be a future where 9K peering was required? This whole problem smacks to me of exchange points that are "too big to fail". Since some of these exchanges are so big, everyone else must bend to their needs. I think the world would be a better place if some of these were broken up into smaller exchanges and they imposed fewer restrictions on their participants. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
On Jan 15, 2014, at 9:18 PM, Leo Bicknell <bicknell@ufp.org> wrote:
However, a good engineer would know there are drawbacks to next-hop-self, in particular it slows convergence in a number of situations. There are networks where fast convergence is more important than route scaling, and thus the traditional design of BGP next-hops being edge interfaces, and edge interfaces in the IGP performs better.
A good engineer also knows that there are huge drawbacks to having a peer's network infrastructure DDoSed, routes flapping, core bandwidth consumed by tens and hundreds of gb/sec of attack traffic, et al., too. ;>
By attempting to force IX participants to not put the route in IGP, those IX participants are collectively deciding on a slower converging network for everyone. I don't like a world where connecting to an exchange point forces a particular network design on participants.
Concur. But that's the world we live in, unfortunately. It's just another example of the huge, concentric nature of the collateral damage arising from DDoS attacks, both from the attacks themselves, and from the compromises folks have to make in order to increase resilience against such attacks.
That's some circular reasoning.
Not really. What I'm saying is that since PMTU-D is already broken on so many endpoint networks - i.e., where traffic originates and where it terminates - that any issues arising from PMTU-D irregularities in IXP networks are trivial by comparison. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> Luck is the residue of opportunity and design. -- John Milton
On Jan 15, 2014, at 8:49 AM, "Dobbins, Roland" <rdobbins@arbor.net> wrote:
Not really. What I'm saying is that since PMTU-D is already broken on so many endpoint networks - i.e., where traffic originates and where it terminates - that any issues arising from PMTU-D irregularities in IXP networks are trivial by comparison.
I think we're looking at two different aspects of the same issue. I believe you're coming at it from a 'for all users of the Internet, what's the chance they have connectivity that does not break PMTU-D.' That's an important group to study, particularly for those DSL users still left with < 1500 byte MTU's. And you're right, for those users IXP's are the least of their worries, mostly it's content-side poor networking, like load balancers and firewalls that don't work correctly. I am approaching it from a different perspective, 'where is PMTU-D broken for people who want to use 1500-9K frames end to end?' I'm the network guy who wants to buy transit in the US, and transit in Germany and run a tunnel of 1500 byte packets end to end, necessitating a ~1540 byte packet. Finding transit providers who will configure jumbo frames is trivial these days, and most backbones are jumbo frame clean (at least to 4470, but many to 9K). There's probably about a 25% chance private peerings are also jumbo clean. Pretty much the only thing broken for this use case is IXP's. Only a few have a second VLAN for 9K peerings, and most participants don't use it for a host of reasons, including PMTU-D problems. I'm an oddball. I think MPLS VPN's are a terrible idea for the consumer, locking them into a single provider in the vast majority of cases. Consumers would be better served by having a tunnel box (IPSec maybe?) at their edge and running their own tunnel over IP provider-independently, if they could get > 1500B MTU at the edge, and move those packets end to end. While I've always thought that, in the post-Snowden world I think I seem a little less crazy: rather than relying on your provider to keep your "VPN" traffic secret, customers should be encrypting it in a device they own. But hey, I get why ISP's don't want to offer 9K MTU clean paths end to end. Customers could then buy a VPN appliance and manage their own VPN's with no vendor lock-in. 
MPLS VPN revenues would tumble, and customers would move more fluidly between providers. That's terrible if you're an ISP. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
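The arithmetic behind the "~1540 byte packet" can be made concrete with a small sketch. The 40-byte overhead is an assumption chosen only so the total matches the figure in the message; real encapsulation overhead depends on the tunnel type, and `fits` is a hypothetical helper.

```python
INNER_PACKET = 1500    # bytes, the customer packet size from the message above
TUNNEL_OVERHEAD = 40   # bytes; assumed so the total matches the "~1540" figure

def fits(path_mtu: int, inner: int = INNER_PACKET, overhead: int = TUNNEL_OVERHEAD) -> bool:
    """True if the encapsulated packet crosses a link of this MTU without fragmentation."""
    return inner + overhead <= path_mtu

# MTU values mentioned in the thread: a standard exchange LAN, a 4470 backbone, a 9K path.
for mtu in (1500, 4470, 9000):
    print(mtu, fits(mtu))
# 1500 False
# 4470 True
# 9000 True
```

Under these assumptions the tunnel works across jumbo-clean transit and backbones, and the only hop that forces fragmentation or a packet-too-big message is a 1500-byte exchange LAN.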
On Jan 15, 2014, at 10:31 PM, Leo Bicknell <bicknell@ufp.org> wrote:
I am approaching it from a different perspective, 'where is PMTU-D broken for people who want to use 1500-9K frames end to end?'
I understand that perspective, absolutely. But what I'm saying is that whether or not they want to use jumbo frames for Internet traffic, it doesn't matter, because PMTU-D is likely to be broken either at the place where the traffic is initiated, the place where the traffic is received, or both - so any nonsense in the middle, especially on IXP networks in particular, isn't really a significant issue in and of itself. If we could get things optimized and remediated to the point where potential PMTU-D breakage in IXP networks were a significant issue in and of itself, the Internet would be much improved. But I don't see any likelihood of that happening anytime soon. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> Luck is the residue of opportunity and design. -- John Milton
On Jan 15, 2014, at 9:37 AM, "Dobbins, Roland" <rdobbins@arbor.net> wrote:
But what I'm saying is that that whether or not they want to use jumbo frames for Internet traffic, it doesn't matter, because PMTU-D is likely to be broken either at the place where the traffic is initiated, the place where the traffic is received, or both - so any nonsense in the middle, especially on IXP networks in particular, isn't really a significant issue in and of itself.
Your assertion does not match my deployment experience. When I have deployed endpoints that have working PMTU-D, I have 99.999% success with the ISP's in the middle having working PMTU-D. It even works fine for 9K providers connected to 1500B exchange points, because the packet-too-big typically originates from the input side of the router (the backbone link to the IXP router). Indeed, the only place I've seen it broken is where the ISP 9K peers at an exchange, and the "far end" ISP runs a < 9K backbone (like 4470), so the far end IXP-router does the packet-too-big, and originates it from the exchange LAN, which because it's no longer in the table fails to pass uRPF. (Business class) ISP's don't break PMTU-D, end users break it with the equipment they connect. So a smart user connecting equipment that is properly configured should be able to expect it to work properly. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
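The failure mode described above, a packet-too-big sourced from the exchange LAN and dropped by uRPF, can be sketched as a source-reachability check. This is a simplification: it models a loose-style check ("is there any route back to the source?"), the prefixes are documentation ranges, and `urpf_pass` is a hypothetical helper. Whether a default route satisfies the check varies by implementation and configuration.

```python
import ipaddress

def urpf_pass(src: str, routing_table: list[str]) -> bool:
    """Loose-style uRPF sketch: accept the packet only if some route covers its source."""
    addr = ipaddress.ip_address(src)
    return any(addr in ipaddress.ip_network(prefix) for prefix in routing_table)

# Hypothetical default-free customer table with the exchange LAN (192.0.2.0/24) absent.
default_free = ["198.51.100.0/24", "203.0.113.0/24"]
with_default = default_free + ["0.0.0.0/0"]

ICMP_SOURCE = "192.0.2.10"  # far-end router's exchange-LAN address (assumed)

print(urpf_pass(ICMP_SOURCE, default_free))  # False: packet-too-big is dropped
print(urpf_pass(ICMP_SOURCE, with_default))  # True: a default route covers the source
```

This matches the earlier observation in the thread that customers with a default route are unaffected, while default-free customers running uRPF can lose the ICMP message and see the connection stall.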
On Jan 15, 2014, at 10:52 PM, Leo Bicknell <bicknell@ufp.org> wrote:
(Business class) ISP's don't break PMTU-D, end users break it with the equipment they connect.
Concur 100%. That's my point.
So a smart user connecting equipment that is properly configured should be able to expect it to work properly.
In my deployment experience, many (most?) end-user organizations break PMTU-D to/through their LANs outside of their IDCs, much less to the Internet, for themselves, and for everyone who wishes to communicate with them across the Internet. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> Luck is the residue of opportunity and design. -- John Milton
But hey, I get why ISP's don't want to offer 9K MTU clean paths end to end. Customers could then buy a VPN appliance and manage their own VPN's with no vendor lock-in. MPLS VPN revenues would tumble, and customers would move more fluidly between providers. That's terrible if you're an ISP.
No it's exactly why some carriers do their best to provide 9K+ MTU to most of their POPs so that they can provide L2 services to ISPs and other carriers that require 9K MTU for their BB links to capitalize on these new emerging markets. Customers locked in to a single provider (MPLS VPN) can rely on certain class of service (predictable delay, jitter and packet loss) properties you can't get out of a pure internet connection from NY to Tokyo. adam -----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Wednesday, January 15, 2014 4:31 PM To: Dobbins, Roland Cc: NANOG list Subject: Re: best practice for advertising peering fabric routes On Jan 15, 2014, at 8:49 AM, "Dobbins, Roland" <rdobbins@arbor.net> wrote:
Not really. What I'm saying is that since PMTU-D is already broken on so many endpoint networks - i.e., where traffic originates and where it terminates - that any issues arising from PMTU-D irregularities in IXP networks are trivial by comparison.
I think we're looking at two different aspects of the same issue. I believe you're coming at it from a 'for all users of the Internet, what's the chance they have connectivity that does not break PMTU-D.' That's an important group to study, particularly for those DSL users still left with < 1500 byte MTU's. And you're right, for those users IXP's are the least of their worries, mostly it's content-side poor networking, like load balancers and firewalls that don't work correctly. I am approaching it from a different perspective, 'where is PMTU-D broken for people who want to use 1500-9K frames end to end?' I'm the network guy who wants to buy transit in the US, and transit in Germany and run a tunnel of 1500 byte packets end to end, necessitating a ~1540 byte packet. Finding transit providers who will configure jumbo frames is trivial these days, and most backbones are jumbo frame clean (at least to 4470, but many to 9K). There's probably about a 25% chance private peerings are also jumbo clean. Pretty much the only thing broken for this use case is IXP's. Only a few have a second VLAN for 9K peerings, and most participants don't use it for a host of reasons, including PMTU-D problems. I'm an oddball. I think MPLS VPN's are a terrible idea for the consumer, locking them into a single provider in the vast majority of cases. Consumers would be better served by having a tunnel box (IPSec maybe?) at their edge and running their own tunnel over IP provider-independently, if they could get > 1500B MTU at the edge, and move those packets end to end. While I've always thought that, in the post-Snowden world I think I seem a little less crazy: rather than relying on your provider to keep your "VPN" traffic secret, customers should be encrypting it in a device they own. But hey, I get why ISP's don't want to offer 9K MTU clean paths end to end. Customers could then buy a VPN appliance and manage their own VPN's with no vendor lock-in. MPLS VPN revenues would tumble, and customers would move more fluidly between providers. That's terrible if you're an ISP. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
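The MTU arithmetic behind the "~1540 byte packet" remark can be sketched as follows. This is only an illustration: the overhead table is an assumption (a bare GRE-over-IPv4 tunnel adds a 20-byte outer IPv4 header plus a 4-byte base GRE header, and the 40-byte figure is simply the estimate implied by the message), and real IPsec overhead depends on the cipher and mode in use.

```python
# Illustrative only: the overhead values below are assumptions, not measurements.
INNER_MTU = 1500  # size of the customer packet that must survive end to end

ASSUMED_OVERHEAD = {
    "gre_over_ipv4": 20 + 4,  # outer IPv4 header (no options) + base GRE header
    "thread_estimate": 40,    # the "~1540" figure quoted in the message above
}


def required_path_mtu(inner_mtu, overhead):
    """Smallest link MTU that carries the encapsulated packet unfragmented."""
    return inner_mtu + overhead


def fits(path_mtu, inner_mtu, overhead):
    """True if a path with this MTU carries the tunnel without fragmentation."""
    return required_path_mtu(inner_mtu, overhead) <= path_mtu


if __name__ == "__main__":
    for name, overhead in ASSUMED_OVERHEAD.items():
        need = required_path_mtu(INNER_MTU, overhead)
        results = {mtu: fits(mtu, INNER_MTU, overhead) for mtu in (1500, 4470, 9000)}
        print(name, need, results)
```

Under these assumptions a single 1500-byte hop anywhere in the path (for example a standard-MTU exchange LAN) is enough to force fragmentation or reliance on PMTU-D, while a 4470 or 9K clean path is not.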
On (2014-01-15 08:18 -0600), Leo Bicknell wrote:
I know a lot of people push next-hop-self, and if you're a large ISP with thousands of BGP customers is pretty much required to scale.
It's actually the polar opposite. If you are small, there are no compelling reasons to put the IXP in your IGP. If you are large, you may wish to have different IGP metrics to two egress points on the same peering router, in which case you should at the very least have an IP ACL on the IXP interface which only allows LAN2LAN. -- ++ytti
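One way to read the "only allows LAN2LAN" rule is sketched below. It is an interpretation, not anything stated in the message: packets addressed to the peering LAN are accepted only when they are also sourced from it, and transit traffic to other destinations is left to whatever other policy applies. The prefix is a documentation range standing in for a real peering LAN.

```python
import ipaddress

# Hypothetical peering LAN (a documentation prefix, not a real exchange).
IXP_LAN = ipaddress.ip_network("192.0.2.0/24")


def acl_permits(src, dst, lan=IXP_LAN):
    """Model of a LAN2LAN filter on the IXP-facing interface.

    Packets addressed to the peering LAN are permitted only if they are
    also sourced from the peering LAN. Packets to any other destination
    are outside the scope of this rule and are passed through here.
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if dst_ip not in lan:
        return True  # ordinary transit traffic, not covered by this rule
    return src_ip in lan


if __name__ == "__main__":
    print(acl_permits("192.0.2.10", "192.0.2.20"))     # peer to peer on the LAN
    print(acl_permits("203.0.113.5", "192.0.2.20"))    # remote host to a LAN address
    print(acl_permits("203.0.113.5", "198.51.100.7"))  # transit traffic
```

With a filter like this in place, carrying the IXP prefix in the IGP for metric purposes does not by itself make the LAN addresses reachable from arbitrary remote sources.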
Hello Leo, On Wed, 15 Jan 2014 08:18:13 -0600 Leo Bicknell <bicknell@ufp.org> wrote:
This whole problem smacks to me of exchange points that are "too big to fail". Since some of these exchanges are so big, everyone else must bend to their needs. I think the world would be a better place if some of these were broken up into smaller exchanges and they imposed less restrictions on their participants.
You forgot to add "and would break down on a weekly basis". The restrictions that IXPs impose on their customers have nothing to do with the size of their peering LAN, but everything to do with offering a reliable service to these same customers. Kind regards, Martin
However, a good engineer would know there are drawbacks to next-hop-self, in particular it slows convergence in a number of situations. There are networks where fast convergence is more important than route scaling, and thus the traditional design of BGP next-hops being edge interfaces, and edge interfaces in the IGP performs better.
Well, that's not true anymore: BGP PIC edge and core converge in under 50 ms. "Fast external failover" and "local repair" were available long before, but yes, that's applicable only for MPLS. adam -----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Wednesday, January 15, 2014 3:18 PM To: Dobbins, Roland Cc: NANOG list Subject: Re: best practice for advertising peering fabric routes On Jan 15, 2014, at 12:02 AM, "Dobbins, Roland" <rdobbins@arbor.net> wrote:
Again, folks, this isn't theoretical. When the particular attacks cited in this thread were taking place, I was astonished that the IXP infrastructure routes were even being advertised outside of the IXP network, because of these very issues.
I know a lot of people push next-hop-self, and if you're a large ISP with thousands of BGP customers it is pretty much required to scale. However, a good engineer would know there are drawbacks to next-hop-self, in particular it slows convergence in a number of situations. There are networks where fast convergence is more important than route scaling, and thus the traditional design of BGP next-hops being edge interfaces, and edge interfaces in the IGP performs better. By attempting to force IX participants to not put the route in IGP, those IX participants are collectively deciding on a slower converging network for everyone. I don't like a world where connecting to an exchange point forces a particular network design on participants.
IXPs are not the problem when it comes to breaking PMTU-D. The problem is largely with enterprise networks, and with 'security' vendors who've propagated the myth that simply blocking all ICMP somehow increases 'security'.
That's some circular reasoning. Networks won't 9K peer at exchange points for a number of reasons, including PMTU-D discovery issues. Since there are virtually no 9K peerings at exchange points, PMTU-D is a non-issue. Maybe if IXP design didn't break PMTU-D it would help attract more 9K peers, or there might even be a future where 9K peering was required? This whole problem smacks to me of exchange points that are "too big to fail". Since some of these exchanges are so big, everyone else must bend to their needs. I think the world would be a better place if some of these were broken up into smaller exchanges and they imposed less restrictions on their participants. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
On 1/14/14, 8:41 PM, Patrick W. Gilmore wrote:
I repeat: NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device except those directly attached to that LAN. Period.
So ... RFC1918 addresses for the IXP fabric, then? (Half kidding, but still ....) Jim Shankland
On 2014-01-15, at 12:04, Jim Shankland <nanog@shankland.org> wrote:
On 1/14/14, 8:41 PM, Patrick W. Gilmore wrote:
I repeat: NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device except those directly attached to that LAN. Period.
So ... RFC1918 addresses for the IXP fabric, then?
I've heard apparently non-drunk people suggest IPv6 link-local addresses as BGP endpoints across exchanges, too.
(Half kidding, but still ....)
RFC 6752. One observation on this thread: some networks have customers who react badly to unusual things seen in traceroute. Sometimes the margin on an individual customer is low enough that one support call displaces any profit you were going to make off them this month. It's understandable to me that such network operators would choose to carry IXP routes internally in order to avoid that potential support burden. I don't pretend to have any universal good/bad answer to the original question, though. I don't think the world is that simple. Joe
* nanog@shankland.org (Jim Shankland) [Wed 15 Jan 2014, 18:04 CET]:
So ... RFC1918 addresses for the IXP fabric, then?
(Half kidding, but still ....)
They need to be globally unique. -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
On Wed, Jan 15, 2014 at 12:54 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* nanog@shankland.org (Jim Shankland) [Wed 15 Jan 2014, 18:04 CET]:
So ... RFC1918 addresses for the IXP fabric, then?
(Half kidding, but still ....)
They need to be globally unique.
do they? :) also... there is/was an exchange in south america (Colombia maybe? it's been a while since I saw this in configs) that used 192.168.0.0/16 space for their exchange.
On Wed, Jan 15, 2014 at 12:54 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* nanog@shankland.org (Jim Shankland) [Wed 15 Jan 2014, 18:04 CET]:
So ... RFC1918 addresses for the IXP fabric, then?
(Half kidding, but still ....)
They need to be globally unique.
Hi Niels, Actually, they don't. To meet the basic definition of working, they just have to be able to originate ICMP destination unreachable packets with a reasonable expectation that the recipient will receive those packets. Global uniqueness is not required for that. However, RFC1918 addresses don't meet the requirement for a different reason: they're routinely dropped at AS borders, thus don't have an expectation of reaching the external destination. Of course working, monitorable and testable are three different things. If my NMS can't reach the IXP's addresses, my view of the IXP is impaired. And "the Internet is broken" is not a trouble report that leads to a successful outcome with customer support... it helps to be able to pin things down with some specificity. Regards, Bill Herrin -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Wed, Jan 15, 2014 at 1:26 PM, William Herrin <bill@herrin.us> wrote:
On Wed, Jan 15, 2014 at 12:54 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* nanog@shankland.org (Jim Shankland) [Wed 15 Jan 2014, 18:04 CET]:
So ... RFC1918 addresses for the IXP fabric, then?
(Half kidding, but still ....)
They need to be globally unique.
Hi Niels,
Actually, they don't. To meet the basic definition of working, they just have to be able to originate ICMP destination unreachable packets with a reasonable expectation that the recipient will receive those packets. Global uniqueness is not required for that. However, RFC1918 addresses don't meet the requirement for a different reason: they're routinely dropped at AS borders, thus don't have an expectation of reaching the external destination.
Of course working, monitorable and testable are three different things. If my NMS can't reach the IXP's addresses, my view of the IXP is impaired. And "the Internet is broken" is not a trouble report that leads to a successful outcome with customer support... it helps to be able to pin things down with some specificity.
Regards, Bill Herrin
Using RFC1918 would mean that one needs a unique router or routing instance for every exchange connected to, since exchanges are likely to have overlapping space at that point (RFC1918 IXP registry anyone?). I don't think it'd be a good idea to go down that path. Also mentioned in a past NANOG thread was the idea of potentially getting someone like Team Cymru to set up all exchange prefixes in a special bogon list, so you could null route all those prefixes on your edge. I inquired with Team Cymru about this back when it was originally discussed but never got anywhere with them.
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
-- [stillwaxin@gmail.com ~]$ cat .signature cat: .signature: No such file or directory [stillwaxin@gmail.com ~]$
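The overlap concern is easy to demonstrate with a few made-up exchanges. A minimal sketch, assuming three hypothetical IXPs that each picked their peering LAN out of RFC1918 space without coordinating; the names and prefixes are invented for illustration.

```python
import ipaddress
from itertools import combinations

# Hypothetical exchanges and prefixes, invented for illustration only.
EXCHANGE_LANS = {
    "IX-A": "192.168.0.0/16",
    "IX-B": "192.168.10.0/24",
    "IX-C": "10.20.0.0/22",
}


def overlapping_pairs(prefixes):
    """Return the pairs of exchanges whose peering LANs overlap."""
    nets = {name: ipaddress.ip_network(p) for name, p in prefixes.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


def all_private(prefixes):
    """True if every peering LAN falls inside private address space."""
    return all(ipaddress.ip_network(p).is_private for p in prefixes.values())


if __name__ == "__main__":
    print(all_private(EXCHANGE_LANS))
    print(overlapping_pairs(EXCHANGE_LANS))
```

A router attached to both IX-A and IX-B in this example could not hold both connected routes in one routing table, which is the per-exchange routing instance problem described above.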
On Jan 15, 2014, at 10:26 AM, William Herrin <bill@herrin.us> wrote:
Of course working, monitorable and testable are three different things. If my NMS can't reach the IXP's addresses, my view of the IXP is impaired. And "the Internet is broken" is not a trouble report that leads to a successful outcome with customer support... it helps to be able to pin things down with some specificity.
This approach concerns me for a number of reasons. First, having your NMS ping your upstream’s IXP peers probably doesn’t scale. If I’m a peer of a reasonably large provider, I’m pretty sure I don’t want all their customers hammering my management plane. Even if you’re the only one doing it, you also don’t know if I’m rate-limiting pings for that or any other reason. Second, what information do you get that you didn’t already have? If you saw the IP in a traceroute then you know it exists, is alive, is in the path, and a rough estimation of the latency. Pinging it may even give you negative information. Platforms vary and all, but in my experience pinging a router, especially a potentially busy one peering at an IXP, shows notably worse performance than “real” traffic experiences (admittedly somewhat true of TTL Expired responses, but less so in my experience). Now you’re potentially seeing high latency and packet loss which in reality might not even be there at all. Third, you don’t know that your ping to the peering IP is even taking the same path as the packets addressed to the real destination. MTR for example looks nice, but it would probably be more accurate if it simply ran the traceroute over and over instead of pinging each hop directly. You would also detect path changes for the real destination that pinging intermediate hops wouldn’t show you. While I appreciate the desire to be able to do as much of your own detective work as possible, I can also see where you’re now shifting workload onto someone else’s support organization when they’re not necessarily the problem either (“Hey, my NMS says your peering router is causing latency and packet loss, fix it!”). I’m also not saying there isn’t a troubleshooting gap caused by this. I’m just not sure being able to ping the IXP hop solves that problem either. Semi-related tangent: Working in an IXP setting I have seen weird corner cases cause issues in conjunction with the IXP subnet existing in BGP. 
Say someone’s got proxy ARP enabled on their router (sadly, more common than it should be, and not just from noobs at startups). Now say your IXP is growing and you expand the subnet. No matter how much you harp on the customers to make the change, they don’t all do it at once. Someone announces the new, larger subnet in BGP. Now when anyone ARPs for IPs in the new part of the range, proxy ARP guy (still on the smaller subnet) says “hey I have a route for that, send it here”. That was fun to troubleshoot. :) -c
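The subnet-expansion failure can be modelled in a few lines. This is a deliberately simplified sketch of proxy ARP (real implementations also look at which interface the route points to), and all prefixes and addresses are hypothetical: the peering LAN grows from a /25 to a /24, one router is still configured with the /25, and it has learned the /24 through BGP.

```python
import ipaddress


def proxy_arp_answers(target, iface_net, routes, proxy_arp_enabled):
    """Simplified model: would this router answer ARP on someone else's behalf?

    It answers only if proxy ARP is enabled, the target lies outside the
    subnet configured on the receiving interface, and some route covers it.
    """
    ip = ipaddress.ip_address(target)
    if ip in iface_net:
        return False  # on-link from this router's point of view; owner answers
    if not proxy_arp_enabled:
        return False
    return any(ip in route for route in routes)


# Hypothetical renumbering: the LAN was 192.0.2.0/25 and is now 192.0.2.0/24.
STALE_IFACE = ipaddress.ip_network("192.0.2.0/25")     # router not yet updated
BGP_ROUTES = [ipaddress.ip_network("192.0.2.0/24")]    # someone announced the /24

if __name__ == "__main__":
    # A peer in the new half of the range is ARPed for by a neighbour.
    print(proxy_arp_answers("192.0.2.200", STALE_IFACE, BGP_ROUTES, True))
    # Same request once proxy ARP is disabled on the stale router.
    print(proxy_arp_answers("192.0.2.200", STALE_IFACE, BGP_ROUTES, False))
```

In this model the stale router claims every address in the new half of the LAN, so neighbours may send traffic for those peers to it instead of to their real owners.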
* clay@bloomcounty.org (Clay Fiske) [Wed 15 Jan 2014, 20:34 CET]:
Semi-related tangent: Working in an IXP setting I have seen weird corner cases cause issues in conjunction with the IXP subnet existing in BGP. Say someone’s got proxy ARP enabled on their router (sadly, more common than it should be, and not just from noobs at startups). Now say your IXP is growing and you expand the subnet. No matter how much you harp on the customers to make the change, they don’t all do it at once. Someone announces the new, larger subnet in BGP. Now when anyone ARPs for IPs in the new part of the range, proxy ARP guy (still on the smaller subnet) says “hey I have a route for that, send it here”. That was fun to troubleshoot. :)
Properly run IXPs pay engineers to hunt down people with Proxy ARP enabled on their peering interfaces. -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
On Jan 15, 2014, at 12:46 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* clay@bloomcounty.org (Clay Fiske) [Wed 15 Jan 2014, 20:34 CET]:
Semi-related tangent: Working in an IXP setting I have seen weird corner cases cause issues in conjunction with the IXP subnet existing in BGP. Say someone’s got proxy ARP enabled on their router (sadly, more common than it should be, and not just from noobs at startups). Now say your IXP is growing and you expand the subnet. No matter how much you harp on the customers to make the change, they don’t all do it at once. Someone announces the new, larger subnet in BGP. Now when anyone ARPs for IPs in the new part of the range, proxy ARP guy (still on the smaller subnet) says “hey I have a route for that, send it here”. That was fun to troubleshoot. :)
Properly run IXPs pay engineers to hunt down people with Proxy ARP enabled on their peering interfaces.
Yes, yes, I expected a smug reply like this. I just didn’t expect it to take so long. But how can I detect proxy ARP when detecting proxy ARP was patented in 1996? http://www.google.com/patents/US5708654 Seriously though, it’s not so simple. You only get replies if the IP you ARP for is in the offender’s route table (or they have a default route). I’ve seen different routers respond depending on which non-local IP was ARPed for. And while using something like 8.8.8.8 might be an obvious choice, I don’t care to hose up everyone’s connectivity to it just to find local proxy ARP offenders on my network. -c
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 00:35 CET]: [...]
Seriously though, it’s not so simple. You only get replies if the IP you ARP for is in the offender’s route table (or they have a default route). I’ve seen different routers respond depending on which non-local IP was ARPed for. And while using something like 8.8.8.8 might be an obvious choice, I don’t care to hose up everyone’s connectivity to it just to find local proxy ARP offenders on my network.
You'll never be entirely sure but obviously you're not limited to sending only one ARP request - this isn't The Hunt For The Red October movie. We're talking a common misconfiguration here in this thread - or at least you were, two mails upthread. How will checking for Proxy ARP possibly hose up anybody's connectivity? You realise that ARP replies are unicast, right? And that IXPs generally have dedicated servers for monitoring from which they can source packets? -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
On Jan 15, 2014, at 3:47 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 00:35 CET]: [...]
Seriously though, it’s not so simple. You only get replies if the IP you ARP for is in the offender’s route table (or they have a default route). I’ve seen different routers respond depending on which non-local IP was ARPed for. And while using something like 8.8.8.8 might be an obvious choice, I don’t care to hose up everyone’s connectivity to it just to find local proxy ARP offenders on my network.
You'll never be entirely sure but obviously you're not limited to sending only one ARP request - this isn't The Hunt For The Red October movie. We're talking a common misconfiguration here in this thread - or at least you were, two mails upthread.
How will checking for Proxy ARP possibly hose up anybody's connectivity? You realise that ARP replies are unicast, right? And that IXPs generally have dedicated servers for monitoring from which they can source packets?
This is where theory diverges nicely from practice. In some cases the offender broadcast his reply, and guess what else? A lot of routers listen to unsolicited ARP replies. So no, even though I consider it someone else’s bad behavior to broadcast an ARP reply, I’m not willing to take the chance with an IP that doesn’t belong to me. -c
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 00:59 CET]:
This is where theory diverges nicely from practice. In some cases the offender broadcast his reply, and guess what else? A lot of routers listen to unsolicited ARP replies.
I've never seen this. Please name vendor and product, if only so other subscribers to this list can avoid doing business with them.
So no, even though I consider it someone else’s bad behavior to broadcast an ARP reply, I’m not willing to take the chance with an IP that doesn’t belong to me.
So do an ARP request for www.equinix.com, or (and!) for an unused address on your Peering LAN. Standard tools like arpwatch should alert you to fishy things going on, loudly. -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
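The "ARP for an unused address on your peering LAN" check boils down to a simple rule: nobody should answer for an address that nobody owns. A minimal sketch of the analysis step only, assuming the monitoring host has already sent the probes and collected the replies as (target IP, responder MAC) pairs; the addresses and MACs are made up, and sending the probes themselves is not shown.

```python
def proxy_arp_suspects(replies, unused_targets):
    """Group responders by MAC for replies to addresses known to be unused.

    replies: iterable of (target_ip, responder_mac) tuples.
    unused_targets: set of addresses that no participant has been assigned.
    Any reply for such an address suggests the responder is doing proxy ARP.
    """
    suspects = {}
    for target_ip, responder_mac in replies:
        if target_ip in unused_targets:
            suspects.setdefault(responder_mac, set()).add(target_ip)
    return suspects


# Hypothetical probe results from a monitoring server on the peering LAN.
UNUSED = {"192.0.2.250", "192.0.2.251"}
OBSERVED = [
    ("192.0.2.10", "00:00:5e:00:53:01"),   # legitimate owner answering
    ("192.0.2.250", "00:00:5e:00:53:99"),  # nobody owns this address
    ("192.0.2.251", "00:00:5e:00:53:99"),
]

if __name__ == "__main__":
    for mac, targets in proxy_arp_suspects(OBSERVED, UNUSED).items():
        print(mac, sorted(targets))
```

As noted earlier in the thread, a router only answers if it has a route covering the probed address, so an empty result does not prove the LAN is clean; probing several addresses narrows that gap.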
On Jan 15, 2014, at 4:03 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 00:59 CET]:
This is where theory diverges nicely from practice. In some cases the offender broadcast his reply, and guess what else? A lot of routers listen to unsolicited ARP replies.
I've never seen this. Please name vendor and product, if only so other subscribers to this list can avoid doing business with them.
This was some time ago, but the two I was able to dig up from that case were both Junipers. Perhaps it’s something that only happens when proxy ARP is enabled? -c
Cisco PIXes used to do this: if the firewall had a route and saw an ARP request in that IP range, it would proxy ARP. ----- Original Message -----
On Jan 15, 2014, at 4:03 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 00:59 CET]:
This is where theory diverges nicely from practice. In some cases the offender broadcast his reply, and guess what else? A lot of routers listen to unsolicited ARP replies.
I've never seen this. Please name vendor and product, if only so other subscribers to this list can avoid doing business with them.
This was some time ago, but the two I was able to dig up from that case were both Junipers. Perhaps it’s something that only happens when proxy ARP is enabled?
-c
-- Eric Rosen CCIE Security #17821 Information Security Analyst Red Hat, Inc erosen@redhat.com 919.890.8555 x48555 IRC erosen
Excellent. So all everyone has to do is not buy cisco _or_ juniper. Wait a minute.... -- TTFN, patrick On Jan 15, 2014, at 19:54 , Eric Rosen <erosen@redhat.com> wrote:
Cisco PIXes used to do this: if the firewall had a route and saw an ARP request in that IP range, it would proxy ARP.
----- Original Message -----
On Jan 15, 2014, at 4:03 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 00:59 CET]:
This is where theory diverges nicely from practice. In some cases the offender broadcast his reply, and guess what else? A lot of routers listen to unsolicited ARP replies.
I've never seen this. Please name vendor and product, if only so other subscribers to this list can avoid doing business with them.
This was some time ago, but the two I was able to dig up from that case were both Junipers. Perhaps it’s something that only happens when proxy ARP is enabled?
-c
-- Eric Rosen CCIE Security #17821 Information Security Analyst Red Hat, Inc erosen@redhat.com 919.890.8555 x48555 IRC erosen
On Wed, Jan 15, 2014 at 10:21 PM, Patrick W. Gilmore <patrick@ianai.net> wrote:
Excellent. So all everyone has to do is not buy cisco _or_ juniper.
Or make the LANs IPv6-only addressed, since ARP is not used. <G> And it is probably unlikely that someone will turn on an ND Proxy by "accident".
Wait a minute....
-- TTFN, patrick
-- -JH
Cisco ASA's still have proxy ARP enabled by default when certain NAT types are configured. http://www.cisco.com/en/US/docs/security/asa/asa84/configuration/guide/nat_o... "Default Settings (8.3(1), 8.3(2), and 8.4(1)) The default behavior for identity NAT has proxy ARP disabled. You cannot configure this setting. (8.4(2) and later) The default behavior for identity NAT has proxy ARP enabled, matching other static NAT rules. You can disable proxy ARP if desired. See the "Routing NAT Packets" section for more information." On 1/15/2014 7:54 PM, Eric Rosen wrote:
Cisco PIXes used to do this: if the firewall had a route and saw an ARP request in that IP range, it would proxy ARP.
----- Original Message -----
On Jan 15, 2014, at 4:03 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 00:59 CET]:
This is where theory diverges nicely from practice. In some cases the offender broadcast his reply, and guess what else? A lot of routers listen to unsolicited ARP replies. I've never seen this. Please name vendor and product, if only so other subscribers to this list can avoid doing business with them. This was some time ago, but the two I was able to dig up from that case were both Junipers. Perhaps it’s something that only happens when proxy ARP is enabled?
-c
-- Vlade Ristevski Network Manager IT Services Ramapo College (201)-684-6854
* vristevs@ramapo.edu (Vlade Ristevski) [Thu 16 Jan 2014, 17:46 CET]:
Cisco ASA's still have proxy ARP enabled by default when certain NAT types are configured.
That wasn't the question. The question was what equipment would send proxy ARP replies as broadcasts, possibly causing poisoning in other routers (which still sounds far-fetched to me). -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
I seem to recall some video encoders doing that, but I can't remember the vendor. Sent from my Mobile Device. -------- Original message -------- From: Niels Bakker <niels=nanog@bakker.net> Date: 01/16/2014 8:54 AM (GMT-08:00) To: nanog@nanog.org Subject: Re: Proxy ARP detection * vristevs@ramapo.edu (Vlade Ristevski) [Thu 16 Jan 2014, 17:46 CET]:
Cisco ASA's still have proxy ARP enabled by default when certain NAT types are configured.
That wasn't the question. The question was what equipment would send proxy ARP replies as broadcasts, possibly causing poisoning in other routers (which still sounds far-fetched to me). -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
On Thu, Jan 16, 2014 at 10:51 AM, Niels Bakker <niels=nanog@bakker.net> wrote:
That wasn't the question. The question was what equipment would send proxy ARP replies as broadcasts, possibly causing poisoning in other routers (which still sounds far-fetched to me).
Which current routers will actually _listen_ to a broadcast ARP response involving an IP address that is outside the subnet assigned to that IP interface, and override the routing table with that entry? -- -J
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 01:25 CET]:
On Jan 15, 2014, at 4:03 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* clay@bloomcounty.org (Clay Fiske) [Thu 16 Jan 2014, 00:59 CET]:
This is where theory diverges nicely from practice. In some cases the offender broadcast his reply, and guess what else? A lot of routers listen to unsolicited ARP replies.
I've never seen this. Please name vendor and product, if only so other subscribers to this list can avoid doing business with them.
This was some time ago, but the two I was able to dig up from that case were both Junipers. Perhaps it’s something that only happens when proxy ARP is enabled?
Maybe. I don't think I've ever dealt with a situation in which Proxy ARP was enabled on a Juniper router. I've certainly not seen them reply to a request with a broadcast, and frankly that sounds like such a weird implementation decision that I'm going to need to see pcaps before I believe it. Even if this were a regular occurrence - which it evidently is not - it's still better to trigger this when you know you're doing something rather than have to step in later when another misconfiguration triggers routing problems like described in an earlier mail, renumbering into a larger subnet. -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
On 1/15/2014 6:31 PM, Clay Fiske wrote:
Yes, yes, I expected a smug reply like this. I just didn’t expect it to take so long.
But how can I detect proxy ARP when detecting proxy ARP was patented in 1996?
http://www.google.com/patents/US5708654
Seriously though, it’s not so simple. You only get replies if the IP you ARP for is in the offender’s route table (or they have a default route). I’ve seen different routers respond depending on which non-local IP was ARPed for. And while using something like 8.8.8.8 might be an obvious choice, I don’t care to hose up everyone’s connectivity to it just to find local proxy ARP offenders on my network.
-c
Shouldn't ARP inspection be a common feature?
On Wed, Jan 15, 2014 at 10:49 PM, ML <ml@kenweb.org> wrote:
Shouldn't ARP inspection be a common feature?
Dynamic ARP inspection is mostly useful only when the trusted ports receive their MAC to IP address mapping from a trusted DHCP server, and the trusted mapping is established using DHCP snooping. Or else, you have manually entered entries in the secure ARP database of MAC to IP mappings, which most operators would be resistant to dealing with, because of all the extra work. It's not as if the switches know what the valid subnets are and suppress ARP requests for outside networks. Therefore, in most cases, ARP inspection won't be used, except for DHCP clients. ARP inspection goes hand-in-hand with increasing resistance against a Man in the Middle attack from a compromised workstation on a LAN, using ARP hijacking to capture traffic or distribute malware to a neighboring workstation. In most cases, DHCP-based configuration will not be used for routers (the very devices that might inadvertently have proxy-arp).... -- -JH
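The dependency on a binding table can be sketched as follows. This is a rough model of the validation step, not any vendor's implementation: an ARP packet is accepted only if its sender IP, sender MAC and ingress port match a trusted binding, and the table contents here are hypothetical.

```python
def arp_permitted(bindings, sender_ip, sender_mac, port):
    """Accept an ARP packet only if it matches a trusted (MAC, port) binding.

    bindings maps an IP address to the (MAC, port) learned through DHCP
    snooping or entered by hand. An address with no binding is rejected.
    """
    return bindings.get(sender_ip) == (sender_mac, port)


# Hypothetical binding table: one DHCP client, and no entry for the router,
# because routers are normally configured statically.
BINDINGS = {
    "192.0.2.50": ("00:00:5e:00:53:50", "port7"),
}

if __name__ == "__main__":
    # DHCP client whose binding was snooped: accepted.
    print(arp_permitted(BINDINGS, "192.0.2.50", "00:00:5e:00:53:50", "port7"))
    # Statically addressed router with no binding: rejected unless someone
    # adds the entry manually.
    print(arp_permitted(BINDINGS, "192.0.2.1", "00:00:5e:00:53:01", "port1"))
```

The second case is the operational cost described above: on a peering LAN every participant is a statically configured router, so every binding would have to be maintained by hand.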
* bill@herrin.us (William Herrin) [Wed 15 Jan 2014, 19:27 CET]:
On Wed, Jan 15, 2014 at 12:54 PM, Niels Bakker <niels=nanog@bakker.net> wrote:
* nanog@shankland.org (Jim Shankland) [Wed 15 Jan 2014, 18:04 CET]:
So ... RFC1918 addresses for the IXP fabric, then? (Half kidding, but still ....)
They need to be globally unique.
Actually, they don't. To meet the basic definition of working, they just have to be able to originate ICMP destination unreachable packets with a reasonable expectation that the recipient will receive those packets. Global uniqueness is not required for that. However, RFC1918 addresses don't meet the requirement for a different reason: they're routinely dropped at AS borders, thus don't have an expectation of reaching the external destination.
They need to be globally unique because otherwise a connected network might be using them already internally, thus keeping them from connecting - or as another followup mail stated, force everything into their own VRFs, and that may still collide. This was rehashed a few years ago on the RIPE AP-WG mailing list, IIRC. -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
* patrick@ianai.net (Patrick W. Gilmore) [Wed 15 Jan 2014, 04:36 CET]: [..]
NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device not directly attached to that LAN. Period.
This is correct, and protects both your (ISP) infrastructure and the IXP's. All major European IXPs revisited their policy after the giant DDoS attack on CloudFlare, and the above was pretty much the outcome. -- Niels. -- "It's amazing what people will do to get their name on the internet, which is odd, because all you really need is a Blogspot account." -- roy edroso, alicublog.blogspot.com
participants (24)
- Adam Vitkovsky
- Cb B
- Christopher Morrow
- Clay Fiske
- Dobbins, Roland
- Eric A Louie
- Eric Rosen
- Florian Weimer
- Jim Shankland
- Jimmy Hess
- Joe Abley
- Leo Bicknell
- Mark Tinka
- Martin Pels
- Michael Hallgren
- Michael Still
- ML
- Niels Bakker
- Patrick W. Gilmore
- Saku Ytti
- Siegel, David
- Vlade Ristevski
- Warren Bailey
- William Herrin