Issue with point to point VPNs behind NAT and asymmetric traffic
Hello everyone,

Trying to get my head around a certain unexpected behaviour.

I am running two site-to-site VPNs (WireGuard now, OpenVPN earlier) between my home and a remote server over two different WAN links. Both WAN links are just consumer connections: one with a public IP and one with a CGNATed IP. Redundancy is taken care of by OSPF running via FRR on both ends.

The unexpected behaviour is that if I set the OSPF cost to prefer, say, link1 for home -> server and link2 for server -> home, then connectivity between the routed pools breaks completely. The point-to-point IPs stay reachable (that traffic goes over the expected links, i.e. symmetric at both ends). As long as both ends prefer link1, or both prefer link2, it works fine. At first I thought it had something to do with NAT, but I still can't understand how. Since the VPN tunnels have a keep-alive timer (10 seconds), the tunnels are always up. Any idea why asymmetric packets are being dropped here? This exact behaviour occurred with the earlier OpenVPN + bird + iBGP setup and is still the same now that I have moved everything to WireGuard for VPN + FRR for routing + OSPF.

Thanks.

--
Anurag Bhatia
anuragbhatia.com
Could it be as simple as a stateful firewall?

Anurag Bhatia wrote on 6/12/2019 14:44:
> Any idea why asymmetric packets are being dropped here?
My guess is something is doing stateful filtering. If you send a SYN down one link and the SYN-ACK comes back on a different link, the receiving firewall will discard it as bogus. You should be able to test this by taking pcaps to confirm the traffic is arriving (though I'm not familiar with WireGuard, so maybe not), and you should be able to disable this by setting a rule or unchecking a box in your firewall.

On Wed, Jun 12, 2019, 5:47 PM Anurag Bhatia <me@anuragbhatia.com> wrote:
> Any idea why asymmetric packets are being dropped here?
Hi,

I did disable the firewall at both ends to test, and the result was the same. Please note that the firewall rules do allow the UDP ports needed to establish the VPN links, and inside the tunnels there aren't any firewall restrictions. However, as I said, I wonder whether the CGNAT device on my link 2 will allow inbound traffic on the established link.

On Thu, Jun 13, 2019 at 3:35 AM Ross Tajvar <ross@tajvar.io> wrote:
> My guess is something is doing stateful filtering. If you send a SYN down one link and the SYN-ACK comes back a different link, the receiving firewall will discard it as bogus.

--
Anurag Bhatia
anuragbhatia.com
Linux by default (regardless of firewall rules) will not accept a packet on an interface when the source of that packet "should" be on another interface according to the current route table (in other words, you're doing asymmetric routing).

Easy fix:

# Disable source route verification (reverse path filtering)
net.ipv4.conf.default.rp_filter = 0
# Accept source-routed packets (a separate setting from reverse path filtering)
net.ipv4.conf.default.accept_source_route = 1

-----Original message-----
From: Anurag Bhatia <me@anuragbhatia.com>
Sent: Wed 06-12-2019 04:45 pm
Subject: Issue with point to point VPNs behind NAT and asymmetric traffic
To: NANOG Mailing List <nanog@nanog.org>

> Any idea why asymmetric packets are being dropped here?
The issue is resolved by tweaking the route validation. I added the following to my Ansible playbook for both ends:

- name: Disable source route verification (reverse path filtering)
  sysctl:
    name: net.ipv4.conf.default.rp_filter
    value: '0'
    sysctl_set: yes

- name: Accept source-routed packets
  sysctl:
    name: net.ipv4.conf.default.accept_source_route
    value: '1'
    sysctl_set: yes

and it works fine now. Thanks, everyone, for the inputs.

On Thu, Jun 13, 2019 at 3:55 AM Jerry Cloe <jerry@jtcloe.net> wrote:
> Linux by default (regardless of firewall rules) will not accept a packet on an interface when the source of that packet "should" be on another interface according to the current route table (in other words, you're doing asymmetric routing).

--
Anurag Bhatia
anuragbhatia.com
On 6/12/19 3:44 PM, Anurag Bhatia wrote:
> Hello everyone,

Hi,

> I am running two site to site VPNs (wireguard now, OpenVPN earlier) between my home and a remote server over two different WAN links. Both WAN links are just consumer connections - one with public IP and one with CGNATed IP.

Okay. Is there any filtering of the traffic that flows through the VPNs? Or do things have full connectivity through them? What OS is on each of the VPN endpoints?

> The redundancy here is taken care of by the OSPF running via FRR on both ends.

Okay.

> The unexpected behaviour I get is that if I set OSPF cost to prefer say link1 between home -> server and prefer link 2 between server -> home then connectivity completely breaks between the routed pools.

O.o

> The point to point IPs stay reachable (which is over expected links i.e symmetric via both ends).

Please clarify if those IPs are inside the VPN or outside the VPN?

> As long as both ends prefer link1 or link2, it works fine.

Okay.

> At first, I thought it had to do something with NAT but still can't understand how. Since VPN tunnels have a keep-alive timer (for 10 seconds), the tunnel is always up.

Is NAT or SPI being applied to the traffic flowing through the VPN?

> Any idea why asymmetric packets are being dropped here?

Not enough data to speculate yet.

> This exact behaviour was in case of earlier OpenVPN + bird + iBGP and is still the same when I moved everything to Wireguard for VPN + FRR for routing + OSPF.

Can I ask why the change of the VPN technology, routing daemon, and protocol all at the same time? Or was that a diagnostic step?

--
Grant. . . .
unix || die
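The stateful-filtering (SPI) hypothesis raised in this thread can be illustrated with a toy model. This is only a sketch: the `StatefulFilter` class and the addresses are made up for illustration and do not correspond to any real firewall API. It shows why a filter that tracks flows per link would reject a reply arriving on a link that never saw the request.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulFilter:
    """Toy per-link connection tracker (hypothetical, not a real firewall)."""
    name: str
    seen: set = field(default_factory=set)

    def outbound(self, src, dst):
        # Remember that this link carried the opening packet of the flow.
        self.seen.add((src, dst))

    def inbound_allowed(self, src, dst):
        # A reply is accepted only if this link saw the matching request.
        return (dst, src) in self.seen


link1 = StatefulFilter("link1")
link2 = StatefulFilter("link2")

home, server = "10.0.1.10", "10.0.2.10"  # made-up addresses

link1.outbound(home, server)  # SYN leaves via link1

symmetric_ok = link1.inbound_allowed(server, home)   # SYN-ACK returns via link1
asymmetric_ok = link2.inbound_allowed(server, home)  # SYN-ACK returns via link2

print("symmetric reply accepted:", symmetric_ok)
print("asymmetric reply accepted:", asymmetric_ok)
```

In this thread the firewalls were disabled without any change in behaviour, so the model describes a hypothesis that was tested and ruled out, not the eventual cause.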
On Wed, Jun 12, 2019 at 2:45 PM Anurag Bhatia <me@anuragbhatia.com> wrote:
> I am running two site to site VPNs (wireguard now, OpenVPN earlier) between my home and a remote server over two different WAN links. Both WAN links are just consumer connections - one with public IP and one with CGNATed IP.
> The redundancy here is taken care of by the OSPF running via FRR on both ends.
> The unexpected behaviour I get is that if I set OSPF cost to prefer say link1 between home -> server and prefer link 2 between server -> home then connectivity completely breaks between the routed pools. The point to point IPs stay reachable (which is over expected links i.e symmetric via both ends). As long as both ends prefer link1 or link2, it works fine. At first, I thought it had to do something with NAT but still can't understand how. Since VPN tunnels have a keep-alive timer (for 10 seconds), the tunnel is always up. Any idea why asymmetric packets are being dropped here?

This is probably enabled on one or both ends:

http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.kernel.rpf.html

Disable it.

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
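Before disabling reverse path filtering, it can help to see which value is actually in effect on each interface. The kernel documentation states that the larger of conf/all/rp_filter and conf/<interface>/rp_filter is used for source validation on an interface, and values under conf/default are inherited by interfaces created afterwards. The sketch below reads those files; the function names are illustrative, and the `conf_dir` parameter exists only so the logic can be exercised against a test directory.

```python
from pathlib import Path


def effective_rp_filter(iface, conf_dir="/proc/sys/net/ipv4/conf"):
    """Return the rp_filter value applied to `iface`.

    The kernel uses the larger of conf/all/rp_filter and
    conf/<iface>/rp_filter when validating sources on that interface.
    """
    base = Path(conf_dir)
    all_value = int((base / "all" / "rp_filter").read_text().strip())
    iface_value = int((base / iface / "rp_filter").read_text().strip())
    return max(all_value, iface_value)


def report(conf_dir="/proc/sys/net/ipv4/conf"):
    """Map each real interface name to its effective rp_filter value."""
    base = Path(conf_dir)
    skip = {"all", "default"}  # pseudo-entries, not interfaces
    return {
        entry.name: effective_rp_filter(entry.name, conf_dir)
        for entry in sorted(base.iterdir())
        if entry.is_dir() and entry.name not in skip
    }
```

Calling `report()` on a Linux host returns a dictionary keyed by interface name. A tunnel interface showing 1 (strict) there would be consistent with the symptoms described in this thread.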
On 6/15/19 2:06 PM, William Herrin wrote:
> This is probably enabled on one or both ends:
>
> http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.kernel.rpf.html

Do some distros enable this now? I thought it was disabled by default.

> Disable it.

Or make sure it's using loose (2) filtering.

rp_filter - INTEGER
    0 - No source validation.
    1 - Strict mode as defined in RFC3704 Strict Reverse Path
        Each incoming packet is tested against the FIB and if the interface
        is not the best reverse path the packet check will fail.
        By default failed packets are discarded.
    2 - Loose mode as defined in RFC3704 Loose Reverse Path
        Each incoming packet's source address is also tested against the FIB
        and if the source address is not reachable via any interface
        the packet check will fail.

--
Grant. . . .
unix || die
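The three rp_filter modes quoted above can be modelled in a few lines. This is a simplified sketch, not the kernel's implementation: the FIB is a plain list, only a single best route per prefix is considered, and the interface names (wg1, wg2) and prefixes are assumptions chosen to resemble the setup described in this thread, where the home router prefers wg1 towards the server pool while the server sends its traffic back over wg2.

```python
import ipaddress


def best_route(fib, address):
    """Longest-prefix match: return the interface for `address`, or None."""
    addr = ipaddress.ip_address(address)
    matches = [(net, iface) for net, iface in fib if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rp_filter_accepts(mode, fib, source, in_iface):
    """Apply rp_filter mode 0, 1 or 2 to one incoming packet."""
    if mode == 0:
        return True  # no source validation
    route_iface = best_route(fib, source)
    if mode == 1:
        # Strict: the arrival interface must be the best reverse path.
        return route_iface == in_iface
    if mode == 2:
        # Loose: the source must be reachable via some interface.
        return route_iface is not None
    raise ValueError("rp_filter mode must be 0, 1 or 2")


# Hypothetical routing table on the home router.
fib = [
    (ipaddress.ip_network("10.255.1.0/30"), "wg1"),  # point-to-point link 1
    (ipaddress.ip_network("10.255.2.0/30"), "wg2"),  # point-to-point link 2
    (ipaddress.ip_network("10.20.0.0/24"), "wg1"),   # server pool, OSPF prefers wg1
]

for mode in (0, 1, 2):
    pool = rp_filter_accepts(mode, fib, "10.20.0.5", "wg2")
    p2p = rp_filter_accepts(mode, fib, "10.255.2.2", "wg2")
    print(f"mode {mode}: pool traffic on wg2 accepted={pool}, p2p on wg2 accepted={p2p}")
```

Under these assumptions, strict mode rejects the routed-pool traffic arriving on wg2 while still accepting the point-to-point address on that link, which matches the reported symptoms. Loose mode accepts both.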
participants (6)
- Anurag Bhatia
- blakangel@gmail.com
- Grant Taylor
- Jerry Cloe
- Ross Tajvar
- William Herrin