On Sat, Dec 10, 2011 at 11:49 AM, NetSecGuy <netsecguy@gmail.com> wrote:
I have a Linode VPS in Japan that I can't access from Verizon FIOS, but can access from other locations. I'm not sure who to blame.
The host, 106.187.34.33, is behind the gateway 106.187.34.1:
From FIOS to 106.187.34.1 (this works).
traceroute to 106.187.34.1 (106.187.34.1), 64 hops max, 52 byte packets
 4 so-6-1-0-0.phil-bb-rtr2.verizon-gni.net (130.81.199.4) 9.960 ms 9.957 ms 6.666 ms
 5 so-8-0-0-0.lcc1-res-bb-rtr1-re1.verizon-gni.net (130.81.17.3) 12.298 ms 13.463 ms 13.706 ms
 6 0.ae2.br1.iad8.alter.net (152.63.32.158) 14.571 ms 14.372 ms 14.003 ms
 7 204.255.169.218 (204.255.169.218) 14.692 ms 14.759 ms 13.670 ms
 8 sl-crs1-dc-0-1-0-0.sprintlink.net (144.232.19.229) 13.077 ms 12.577 ms 14.954 ms
 9 sl-crs1-nsh-0-5-5-0.sprintlink.net (144.232.18.200) 31.443 ms sl-crs1-dc-0-5-3-0.sprintlink.net (144.232.24.37) 33.005 ms sl-crs1-nsh-0-5-5-0.sprintlink.net (144.232.18.200) 31.507 ms
10 sl-crs1-kc-0-0-0-2.sprintlink.net (144.232.18.112) 57.610 ms 58.322 ms 59.098 ms
11 otejbb204.kddnet.ad.jp (203.181.100.45) 196.063 ms otejbb203.kddnet.ad.jp (203.181.100.13) 188.846 ms otejbb204.kddnet.ad.jp (203.181.100.21) 195.277 ms
12 cm-fcu203.kddnet.ad.jp (124.215.194.180) 214.760 ms cm-fcu203.kddnet.ad.jp (124.215.194.164) 198.925 ms cm-fcu203.kddnet.ad.jp (124.215.194.180) 200.583 ms
13 124.215.199.122 (124.215.199.122) 193.086 ms * 194.967 ms
This does not work from FIOS:
traceroute to 106.187.34.33 (106.187.34.33), 64 hops max, 52 byte packets
 4 so-6-1-0-0.phil-bb-rtr2.verizon-gni.net (130.81.199.4) 34.229 ms 8.743 ms 8.878 ms
 5 so-8-0-0-0.lcc1-res-bb-rtr1-re1.verizon-gni.net (130.81.17.3) 15.402 ms 13.008 ms 14.932 ms
 6 0.ae2.br1.iad8.alter.net (152.63.32.158) 13.325 ms 13.245 ms 13.802 ms
 7 204.255.169.218 (204.255.169.218) 14.820 ms 14.232 ms 13.491 ms
 8 lap-brdr-03.inet.qwest.net (67.14.22.78) 90.170 ms 92.273 ms 145.887 ms
 9 63.146.26.70 (63.146.26.70) 92.482 ms 92.287 ms 94.000 ms
10 sl-crs1-kc-0-0-0-2.sprintlink.net (144.232.18.112) 58.135 ms 58.520 ms 58.055 ms
11 otejbb203.kddnet.ad.jp (203.181.100.17) 205.844 ms otejbb204.kddnet.ad.jp (203.181.100.25) 189.929 ms otejbb203.kddnet.ad.jp (203.181.100.17) 204.846 ms
12 sl-crs1-oro-0-1-5-0.sprintlink.net (144.232.25.77) 87.229 ms sl-crs1-oro-0-3-3-0.sprintlink.net (144.232.25.207) 88.796 ms 88.717 ms
13 124.215.199.122 (124.215.199.122) 193.584 ms 202.208 ms 192.989 ms
14 * * *
Same IP from different network:
traceroute to 106.187.34.33 (106.187.34.33), 30 hops max, 60 byte packets
 6 ae-8-8.ebr2.Washington1.Level3.net (4.69.134.105) 2.230 ms 1.847 ms 1.938 ms
 7 ae-92-92.csw4.Washington1.Level3.net (4.69.134.158) 2.010 ms 1.985 ms ae-62-62.csw1.Washington1.Level3.net (4.69.134.146) 1.942 ms
 8 ae-94-94.ebr4.Washington1.Level3.net (4.69.134.189) 12.515 ms ae-74-74.ebr4.Washington1.Level3.net (4.69.134.181) 12.519 ms 12.507 ms
 9 ae-4-4.ebr3.LosAngeles1.Level3.net (4.69.132.81) 65.957 ms 65.958 ms 66.056 ms
10 ae-83-83.csw3.LosAngeles1.Level3.net (4.69.137.42) 66.063 ms ae-93-93.csw4.LosAngeles1.Level3.net (4.69.137.46) 65.985 ms ae-63-63.csw1.LosAngeles1.Level3.net (4.69.137.34) 66.026 ms
11 ae-3-80.edge2.LosAngeles9.Level3.net (4.69.144.143) 66.162 ms 66.160 ms 66.238 ms
12 KDDI-AMERIC.edge2.LosAngeles9.Level3.net (4.53.228.14) 193.317 ms 193.447 ms 193.305 ms
13 lajbb001.kddnet.ad.jp (59.128.2.101) 101.544 ms 101.543 ms lajbb002.kddnet.ad.jp (59.128.2.185) 66.563 ms
14 otejbb203.kddnet.ad.jp (203.181.100.13) 164.217 ms 164.221 ms 164.330 ms
15 cm-fcu203.kddnet.ad.jp (124.215.194.164) 180.350 ms cm-fcu203.kddnet.ad.jp (124.215.194.180) 172.779 ms cm-fcu203.kddnet.ad.jp (124.215.194.164) 185.824 ms
16 124.215.199.122 (124.215.199.122) 175.703 ms 175.700 ms 168.268 ms
17 li377-33.members.linode.com (106.187.34.33) 174.381 ms 174.383 ms 174.368 ms
In doing a little probing right now, from various source addresses, I'm unable to reproduce the problem.
I've seen failures similar to this one (where the source address matters; some work, some don't) when a single link in a multi-port LAG or ECMP group fails but is still detected as up and forwarded over. This can happen, for example, if you run a LAG with no channeling protocol (like LACP or PAgP) that hashes source and destination IPs to pick a link (to keep the path consistent per address pair and, when ports are included in the hash, per flow). If one of those links fails in the underlying media or physical path, but the link is still detected as up, packets to some IPs (but not others) will just be dropped on the floor.
Now, in this particular case, the traces to the two destinations don't even appear to take the same path, so the previous hypothesis is pure conjecture. Perhaps routes were actively shifting between KDDI and Sprint (which would explain weird AS paths like VZB->Qwest->Sprint->KDDI->Sprint)? The IP space belongs to KDDI and is originated by them; they're just allocating it for Linode's use.
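To make the hashed-LAG failure mode above concrete, here is a minimal sketch. It is only an illustration under stated assumptions: real switches use vendor-specific hash functions, and the CRC32 hash, the 4-member LAG, and the source address 192.0.2.10 (documentation range) are all hypothetical choices of mine, not anything observed in this thread.

```python
import ipaddress
import zlib


def pick_link(src: str, dst: str, n_links: int) -> int:
    """Choose a LAG member index from a hash of the (src, dst) address pair.

    CRC32 is a stand-in; actual hardware hashes differ by vendor.
    """
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    return zlib.crc32(key) % n_links


def delivered(src: str, dst: str, n_links: int, dead_links: set) -> bool:
    """True if the flow hashes onto a member that actually passes traffic."""
    return pick_link(src, dst, n_links) not in dead_links


if __name__ == "__main__":
    src = "192.0.2.10"  # hypothetical source address
    n_links = 4         # hypothetical LAG width
    # Assume the member carrying traffic to .33 has failed silently:
    # it still shows link-up, so the hash keeps sending packets to it.
    dead = {pick_link(src, "106.187.34.33", n_links)}
    for dst in ("106.187.34.1", "106.187.34.33"):
        state = "ok" if delivered(src, dst, n_links, dead) else "dropped"
        print(f"{src} -> {dst}: member {pick_link(src, dst, n_links)} {state}")
```

The same source/destination pair always lands on the same member, so a given pair either always works or always fails. Whether 106.187.34.1 shares a member with 106.187.34.33 depends entirely on the hash, which is why the symptom looks arbitrary from the outside.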
The last hop is KDDI, but things work via Level3 and not via Sprint. Linode blames Verizon, but I'm not seeing how it's them.
Honestly, I'd see if Linode can work with KDDI to make sure they're announcing that IP space (and that others are receiving the announcement and routing to it) as intended. There's not enough info here to point fingers, but I would think Linode is the organization most empowered to do anything about this.
Cheers,
jof