rfc 1812 third party address on traceroute

Randy Bush

31 May 2016

5:03 a.m.

rfc1812 says

4.3.2.4 ICMP Message Source Address

Except where this document specifies otherwise, the IP source address in an ICMP message originated by the router MUST be one of the IP addresses associated with the physical interface over which the ICMP message is transmitted. If the interface has no IP addresses associated with it, the router's router-id (see Section [5.2.5]) is used instead.

some folk have interpreted this to mean that, if a router R has three interfaces

               .-----------------.
               |                 |
               |               B |--------- D
    S ---------| A      R        |
               |               C |--------- (toward S)
               |                 |
               `-----------------'

if the source of a traceroute from S toward D with TTL to expire on R, and R's FIB wants to exit via C to get back to S (yes, virginia, the internet is highly asymmetric), the source address of the time exceeded message should be C.

of course, simpletons such as i would desire the source of the time exceeded message to be A. after all, this is the interface to which i sent the icmp with the TTL to expire.

ras's preso, https://www.nanog.org/meetings/nanog47/presentations/Sunday/RAS_Traceroute_N... page 10 illustrates this issue with rfc1812

cursory research and talking with C & J seem to indicate that they do what i want not what some folk have interpreted 1812 to mean. at least on some models.

is anyone seeing the dreaded rfc1812 behavior in a citable fashion? how common is it?

randy
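The two readings of 4.3.2.4 can be put side by side in a small model. This is only a sketch under stated assumptions: the Interface class, the function names, the toy FIB, and the documentation-range addresses are all invented for illustration and do not describe any real router's implementation.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network


@dataclass(frozen=True)
class Interface:
    """Hypothetical router interface with a single IPv4 address."""
    name: str
    address: IPv4Address


def longest_prefix_match(fib, destination):
    """Return the egress interface for destination by longest-prefix match."""
    candidates = [(net, ifc) for net, ifc in fib.items() if destination in net]
    if not candidates:
        raise LookupError(f"no route to {destination}")
    return max(candidates, key=lambda item: item[0].prefixlen)[1]


def icmp_error_source(fib, ingress, offender_source, use_inbound_ifaddr):
    """Pick the source address for an ICMP error sent back to offender_source.

    use_inbound_ifaddr=False models the literal RFC 1812 reading (address of
    the interface the error is transmitted on); True models the behaviour
    Randy asks for (address of the interface the offending packet arrived on).
    """
    egress = longest_prefix_match(fib, offender_source)
    chosen = ingress if use_inbound_ifaddr else egress
    return chosen.address


# Assumed addressing, taken from documentation ranges.
A = Interface("A", IPv4Address("192.0.2.1"))
B = Interface("B", IPv4Address("192.0.2.5"))
C = Interface("C", IPv4Address("192.0.2.9"))
S = IPv4Address("198.51.100.10")

# R's FIB: the route back toward S exits via C, the route toward D via B.
FIB = {
    IPv4Network("198.51.100.0/24"): C,
    IPv4Network("203.0.113.0/24"): B,
}

print("literal rfc1812 reading:", icmp_error_source(FIB, A, S, False))
print("ingress-address reading:", icmp_error_source(FIB, A, S, True))
```

In both cases the error still leaves via C, because the FIB decides the path; only the source address written into the packet differs.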

Mikael Abrahamsson

31 May

5:30 a.m.

On Mon, 30 May 2016, Randy Bush wrote:

...

of course, simpletons such as i would desire the source of the time exceeded message to be A. after all, this is the interface to which i sent the icmp with the TTL to expire.

I agree 100%, and I'd venture to guess that most of the people running networks expect it to work like you describe.

...

cursory research and talking with C & J seem to indicate that they do what i want not what some folk have interpreted 1812 to mean. at least on some models.

is anyone seeing the dreaded rfc1812 behavior in a citable fashion? how common is it?

I have been told that there were versions of IOS XR that stopped doing what people wanted, people screamed, and then it's now back to the behaviour that you describe.

In RFC1812 2.2.7 there is talk about router-id. When reading that I think it is generic enough to work for IPv6 as well?

Another thing I've seen: People number their links with ULAs. ICMP error messages (including PTBs) are then sent from the router using the ULA address. This is obviously a disaster since that PTB sourced from ULA address is going to be BCP38:ed (hopefully).

What's the interaction here with choosing a source address for the ICMP error message from something with the same RFC6724 label as the ICMP error message is being sent to?

--
Mikael Abrahamsson email: swmike@swm.pp.se

Larry Sheldon

6 a.m.

I am completely innocent of rfc1812, and have been out of the game for a long time, but I am pretty sure I was taught (and in turn taught) that a router would reply using the address of the interface that originated the reply unless that interface was unnumbered, in which case it would reply from the loop-back address.

Never too old to learn something.

--
"Everybody is a genius. But if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid." --Albert Einstein

From Larry's Cox account.

Job Snijders

8:27 a.m.

On Mon, May 30, 2016 at 10:03:33PM -0700, Randy Bush wrote:

...

               .-----------------.
               |                 |
               |               B |--------- D
    S ---------| A      R        |
               |               C |--------- (toward S)
               |                 |
               `-----------------'

if the source of a traceroute from S toward D with TTL to expire on R, and R's FIB wants to exit via C to get back to S (yes, virginia, the internet is highly asymmetric), the source address of the time exceeded message should be C.

of course, simpletons such as i would desire the source of the time exceeded message to be A. after all, this is the interface to which i sent the icmp with the TTL to expire.

is anyone seeing the dreaded rfc1812 behavior in a citable fashion? how common is it?

On most Linux the default behaviour is using source address "C", but this can be corrected by setting the following somewhere in your /etc/sysctl.d/ files:

    # make traceroute nice
    net.ipv4.icmp_errors_use_inbound_ifaddr=1

Kind regards,

Job
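To audit a set of Linux-based routers for this knob, something like the following could be used. It is a minimal sketch: the function names are hypothetical, the parser only handles plain key=value lines plus # and ; comments, and it assumes that an unset key means the kernel default of 0, which matches the default behaviour Job describes.

```python
def parse_sysctl_conf(text):
    """Parse sysctl.d-style 'key = value' lines, skipping comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line; ignore it in this sketch
        settings[key.strip()] = value.strip()
    return settings


def uses_inbound_ifaddr(settings):
    """True if ICMP errors are sourced from the ingress interface address."""
    # Assumption: a missing key means the kernel default, taken here to be 0.
    return settings.get("net.ipv4.icmp_errors_use_inbound_ifaddr", "0") == "1"


SAMPLE = """\
# make traceroute nice
net.ipv4.icmp_errors_use_inbound_ifaddr=1
"""

print(uses_inbound_ifaddr(parse_sysctl_conf(SAMPLE)))
```

On a running system the effective value should also be readable from /proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr, which avoids guessing which of several config files wins.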

Owen DeLong

2:35 p.m.

It seems to me that a plain text reading of RFC-1812 is, as Randy describes, undesirable. It also seems that the violation of this text is commonplace in actual implementations because of yet another time where operators have made it clear to developers that the IETF is silly.

I like the Linux solution... Comply with the RFC by default and provide a knob to do the "right thing" if desired.

Best of all would be to put forth an erratum against RFC1812 to change the text to specify the inbound interface of the packet triggering the ICMP message when applicable. The behavior currently described should be preserved for ICMP packets which are not triggered by inbound packets.

Owen


William Herrin

3:26 p.m.

On Tue, May 31, 2016 at 1:03 AM, Randy Bush <randy@psg.com> wrote:

...

               .-----------------.
               |                 |
               |               B |--------- D
    S ---------| A      R        |
               |               C |--------- (toward S)
               |                 |
               `-----------------'

i would desire the source of the time exceeded message to be A. after all, this is the interface to which i sent the icmp with the TTL to expire.

Hi Randy,

I've thought for a number of years that routers should have an "ip icmp-error-from" interface directive which allows the operator to specify the source address for ICMP error messages generated due to packets received on that interface.

The behavior you describe where the time-exceeded message comes from C instead of A is a nuisance. The RDNS gives you clues which point in the wrong direction. Darn. Guess you'll have to rely on the preceding router to tell you where the packet came from before it reached R.

The behavior Mikael notes is more deadly. Bogon filters drop packets from RFC1918 sources. They aren't subtle enough to allow ICMP errors through while dropping other IP packets. With bogon filters in place, ICMP errors originated from RFC1918 space don't reach S. PMTUD dies and your TCP connections die along with it. It's really important that an Internet router not originate ICMP from 192.168.1.1!

It would also have been nice if ICMP error messages had defined a text comment field where ops could place diagnostic information such as the received interface. Overloading the functionality of the layer-3 address for any purpose (such as hanging an RDNS entry with textual diagnostic information) is bad bad bad. Probably too late to shoehorn that in.

Regards,
Bill Herrin

--
William Herrin ................ herrin@dirtside.com bill@herrin.us
Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/>
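The failure mode Bill and Mikael describe can be illustrated with a toy bogon check. This is a sketch only: the function names are invented, and the filter is deliberately simplified to the RFC1918 blocks and the IPv6 ULA block (fc00::/7), whereas a real bogon list covers many more prefixes.

```python
from ipaddress import ip_address, ip_network

# Simplified filter contents; a production bogon list is much longer.
RFC1918 = [
    ip_network("10.0.0.0/8"),
    ip_network("172.16.0.0/12"),
    ip_network("192.168.0.0/16"),
]
ULA = ip_network("fc00::/7")


def is_rfc1918_or_ula(source):
    """True if source is in RFC1918 space (IPv4) or ULA space (IPv6)."""
    addr = ip_address(source)
    if addr.version == 4:
        return any(addr in net for net in RFC1918)
    return addr in ULA


def icmp_error_reaches_sender(router_source):
    """A bogon filter looks only at the source address, not at the payload."""
    return not is_rfc1918_or_ula(router_source)


for src in ("192.168.1.1", "fd00::1", "198.51.100.7"):
    print(src, "delivered" if icmp_error_reaches_sender(src) else "dropped")
```

The point is that the filter cannot tell a Packet Too Big or Time Exceeded message from any other packet with a filtered source, so the error is lost along with everything else.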

Octavio Alvarez

4:08 p.m.

On 05/30/2016 10:03 PM, Randy Bush wrote:

...

rfc1812 says

4.3.2.4 ICMP Message Source Address

Except where this document specifies otherwise, the IP source address in an ICMP message originated by the router MUST be one of the IP addresses associated with the physical interface over which the ICMP message is transmitted. If the interface has no IP addresses associated with it, the router's router-id (see Section [5.2.5]) is used instead.

some folk have interpreted this to mean that, if a router R has three interfaces

               .-----------------.
               |                 |
               |               B |--------- D
    S ---------| A      R        |
               |               C |--------- (toward S)
               |                 |
               `-----------------'

of course, simpletons such as i would desire the source of the time exceeded message to be A. after all, this is the interface to which i sent the icmp with the TTL to expire.

Do you mean the source address or the source interface?

I'm not sure if you mean that, if sent through C it should have the source address of A, or that it should actually be sent through A regardless of the routing table (which sounds better to me).

Octavio.

Hugo Slabbert

4:52 p.m.

On Tue 2016-May-31 09:08:42 -0700, Octavio Alvarez <octalnanog@alvarezp.org> wrote:

...

On 05/30/2016 10:03 PM, Randy Bush wrote:

...
rfc1812 says

4.3.2.4 ICMP Message Source Address

Except where this document specifies otherwise, the IP source address in an ICMP message originated by the router MUST be one of the IP addresses associated with the physical interface over which the ICMP message is transmitted. If the interface has no IP addresses associated with it, the router's router-id (see Section [5.2.5]) is used instead.

some folk have interpreted this to mean that, if a router R has three interfaces

               .-----------------.
               |                 |
               |               B |--------- D
    S ---------| A      R        |
               |               C |--------- (toward S)
               |                 |
               `-----------------'

of course, simpletons such as i would desire the source of the time exceeded message to be A. after all, this is the interface to which i sent the icmp with the TTL to expire.

Do you mean the source address or the source interface?

I'm not sure if you mean that, if sent through C it should have the source address of A, or that it should actually be sent through A regardless of the routing table (which sounds better to me).

How is the latter better? What guarantees are there that the adjacent L3 device on R's interface A has a route for S and if such a route exists that it doesn't simply point at R? As Randy so eloquently put it:

...

...
(yes, virginia, the internet is highly asymmetric)

...

Octavio.

--
Hugo Slabbert | email, xmpp/jabber: hugo@slabnet.com
pgp key: B178313E | also on Signal

Octavio Alvarez

1 Jun

9:14 p.m.

On 05/31/2016 09:52 AM, Hugo Slabbert wrote:

...

...
I'm not sure if you mean that, if sent through C it should have the source address of A, or that it should actually be sent through A regardless of the routing table (which sounds better to me).

How is the latter better? What guarantees are there that the adjacent L3 device on R's interface A has a route for S [?]

Consider this scenario:

        .-------.               ISP1ADDR/30     {
    D---|B  R  A|---------------[ ISP 1 ]----   {
        `---C---'                               {
            |(towards S)                        { S is someplace
            |                                   { over this side
       .----F---.                               {
    ---|G  R2  H|--------------*[ ISP 2 ]----   {
       `--------'               ISP2ADDR/30     {

At the asterisk there is BCP38 filtering which won't allow ISP1ADDR/30.

The packet expired on R incoming from ISP 1. Under Randy's scenario, the TTL-exceeded packet would get dropped by ISP2.

The only way for the packet to get through is to follow RFC 1812, or to send it back through A using A's address (this follows RFC 1812 4.3.2.4).

...

and if such a route exists that it doesn't simply point at R?

If the route points back to R, then R just forwards it using the routing table as with any packet.

Best regards.

William Herrin

31 May

6:22 p.m.

On Tue, May 31, 2016 at 12:08 PM, Octavio Alvarez <octalnanog@alvarezp.org> wrote:

...

...
               .-----------------.
               |                 |
               |               B |--------- D
    S ---------| A      R        |
               |               C |--------- (toward S)
               |                 |
               `-----------------'

...

I'm not sure if you mean that, if sent through C it should have the source address of A, or that it should actually be sent through A regardless of the routing table (which sounds better to me).

Howdy,

That doesn't make sense. There may be multiple next hops out A. If the next hop in the FIB is out C, how would the router pick the next hop to send to out A?

Anyway, Randy's comment was about source address selection, not routing. With the packet coming from S into interface A, he'd prefer that the ICMP error message be sourced from the IP address assigned to A, not the IP address assigned to C or R.

Regards,
Bill

--
William Herrin ................ herrin@dirtside.com bill@herrin.us
Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/>

Octavio Alvarez

1 Jun

9:03 p.m.

On 05/31/2016 11:22 AM, William Herrin wrote:

...

...
I'm not sure if you mean that, if sent through C it should have the source address of A, or that it should actually be sent through A regardless of the routing table (which sounds better to me).

That doesn't make sense. There may be multiple next hops out A. If the next hop in the FIB is out C, how would the router pick the next hop to send to out A?

Back to the physical address that sent the TTL-offending packet.

...

Anyway, Randy's comment was about source address selection, not routing. With the packet coming from S into interface A, he'd prefer that the ICMP error message be sourced from the IP address assigned to A, not the IP address assigned to C or R.

Thanks.

Hugo Slabbert

9:08 p.m.

On Wed 2016-Jun-01 14:03:41 -0700, Octavio Alvarez <octalnanog@alvarezp.org> wrote:

...

On 05/31/2016 11:22 AM, William Herrin wrote:

...
...
I'm not sure if you mean that, if sent through C it should have the source address of A, or that it should actually be sent through A regardless of the routing table (which sounds better to me).

That doesn't make sense. There may be multiple next hops out A. If the next hop in the FIB is out C, how would the router pick the next hop to send to out A?

Back to the physical address that sent the TTL-offending packet.

Which comes back to my question: What guarantees do you have that the device at that physical address (so, adjacent off of R's interface A) has a valid route for S, and that the route does not simply point back to R?

...

...
Anyway, Randy's comment was about source address selection, not routing. With the packet coming from S into interface A, he'd prefer that the ICMP error message be sourced from the IP address assigned to A, not the IP address assigned to C or R.

Thanks.

--
Hugo Slabbert | email, xmpp/jabber: hugo@slabnet.com
pgp key: B178313E | also on Signal

William Herrin

9:42 p.m.

On Wed, Jun 1, 2016 at 5:03 PM, Octavio Alvarez <octalnanog@alvarezp.org> wrote:

...

On 05/31/2016 11:22 AM, William Herrin wrote:

...
...
I'm not sure if you mean that, if sent through C it should have the source address of A, or that it should actually be sent through A regardless of the routing table (which sounds better to me).

That doesn't make sense. There may be multiple next hops out A. If the next hop in the FIB is out C, how would the router pick the next hop to send to out A?

Back to the physical address that sent the TTL-offending packet.

Howdy,

That would be an example of a layer violation. The only guarantee that layer 2 makes to layer 3 is that if you tell the layer 2 stack the layer 3 next hop address on that lan segment, it can figure out where to deliver your packet (via arp on ethernet, but this is not necessarily true of other layer 2s).

Long story short, layer violations break things. Indeed, many of BGP's thornier problems and the mess that is mobile routing can all be traced to a single layer violation that TCP commits on IP.

Regards,
Bill Herrin

--
William Herrin ................ herrin@dirtside.com bill@herrin.us
Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/>

Randy Bush

2 Jun

4:39 a.m.

...

...
               .-----------------.
               |                 |
               |               B |--------- D
    S ---------| A      R        |
               |               C |--------- (toward S)
               |                 |
               `-----------------'

of course, simpletons such as i would desire the source of the time exceeded message to be A. after all, this is the interface to which i sent the icmp with the TTL to expire.

Do you mean the source address or the source interface?

the source address

...

I'm not sure if you mean that, if sent through C it should have the source address of A, or that it should actually be sent through A regardless of the routing table (which sounds better to me).

not to me. i have kinda grown used to fibs

randy

Marc Storck

1 Jun

7:16 p.m.

With BCP38 in mind, could there be situations where Router R is not allowed to source packets with address A out of interface C?

I think that the possibility does exist. E.g. if interfaces A and C are upstream interfaces, router R may use an IP address from ISP A on interface A and an address from ISP C on interface C.

Obviously BCP38 is not widely deployed but yet...

Regards,

Marc


William Herrin

9:45 p.m.

On Wed, Jun 1, 2016 at 3:16 PM, Marc Storck <mstorck@voipgate.com> wrote:

...

...
               .-----------------.
               |                 |
               |               B |--------- D
    S ---------| A      R        |
               |               C |--------- (toward S)
               |                 |
               `-----------------'

With BCP38 in mind, could there be situations where Router R is not allowed to source packets with address A out of interface C?

Hi Marc,

I think you're right. Address A is in a /30 from ISP A. ISP C accepts source addresses from your /24 but not the A /30. So if the router does not follow the RFC (sends an ICMP packet out C with a source address from A), typical asymmetric routing can result in black-holing the ICMP error message.

You've hit on a good reason to follow the RFC by default instead of doing what Randy wants. ;)

-Bill

--
William Herrin ................ herrin@dirtside.com bill@herrin.us
Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/>
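Marc's and Bill's point can be checked with a few lines of prefix arithmetic. This is a sketch under assumptions: the prefixes are documentation ranges standing in for the ISP A /30, the ISP C link /30, and the customer /24, and the function name and the idea that ISP C permits exactly its own link /30 plus the customer /24 are illustrative, not taken from any real filter.

```python
from ipaddress import ip_address, ip_network

# Assumed addressing (documentation ranges, purely illustrative).
ISP_A_LINK = ip_network("192.0.2.0/30")       # /30 from ISP A, holds address A
ISP_C_LINK = ip_network("198.51.100.0/30")    # /30 from ISP C, holds address C
CUSTOMER_PREFIX = ip_network("203.0.113.0/24")  # the customer's own /24

ADDRESS_A = ip_address("192.0.2.2")
ADDRESS_C = ip_address("198.51.100.2")

# What ISP C's BCP38 ingress filter is assumed to permit from this customer.
ISP_C_ALLOWED_SOURCES = [ISP_C_LINK, CUSTOMER_PREFIX]


def isp_c_accepts(source):
    """True if ISP C's ingress filter would pass a packet with this source."""
    return any(source in net for net in ISP_C_ALLOWED_SOURCES)


# The ICMP error leaves via C in both cases; only the source address differs.
print("sourced from C (rfc1812 reading):", isp_c_accepts(ADDRESS_C))
print("sourced from A (ingress address):", isp_c_accepts(ADDRESS_A))
```

Under these assumptions the error sourced from A is dropped at ISP C's edge, while the one sourced from C gets through, which is the trade-off against the traceroute readability Randy is after.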

Marc Storck

9:50 p.m.

I'm not saying anyone is wrong here. I merely want to point out possible incompatibilities. So please don't get me wrong.

Regards,

Marc


Randy Bush

6 Jun

1:39 a.m.

...

is anyone seeing the dreaded rfc1812 behavior in a citable fashion? how common is it?

we verified that the juniper and cisco platforms we tested replied with the source address being the ingress interface. this is, imiho, good.

a kind soul actually sent citable tests

...

At least my MikroTik RB850Gx2, running 'latest stable' (RouterOS v6.32.2) replies with the outbound interface, not the inbound.

I'd assume this is because by default, icmp_errors_use_inbound_ifaddr in linux is disabled, and they haven't changed the default.

No idea if that can be tweaked in the weird maze of mikrotik config options.

and from the same kind engineer

...

And just to add even more inconsistency, I checked on my Ubiquiti EdgeMax (a VyOS fork) which does let me check the state of sysctls:

    router:/etc/sysctl.d$ cat 30-vyatta-router.conf
    <snip>
    # Send ICMP responses with primary address of exiting interface
    net.ipv4.icmp_errors_use_inbound_ifaddr=1
    </snip>

So someone in Vyatta decided to explicitly set this to be enabled.

so one win and one loss

randy

Josh Reynolds

2:55 a.m.

I'm assuming you'd like this behavior on EdgeOS changed? I know a guy...

Randy Bush

4:03 a.m.

...

I'm assuming you'd like this behavior on EdgeOS changed?

no, the opposite. j & c got it right. mikrotik did not. vyatta seems to have.

randy
