On Sun, Jun 15, 2008 at 11:12:25AM -0300, Rubens Kuhl Jr. wrote:
1) I've seen this behavior before; you are not alone in the universe.
Thank $DEITY for that. <grin>
2) Most likely there is a load-balanced channel on the path, at either layer 3 or layer 2, and one of the links in the bundle is dead but has not been detected as such.
A multiple-link bundle which is load balanced by source/destination pair with an undetected dud link? I hadn't thought of that, but it does make an *awful* lot of sense. (Although, not being a big-network transit kinda person, I don't know if such a thing actually exists <grin>) I'll mention it (or ask about it) as a possibility next time I talk to the relevant people, though.

Thanks,
- Matt
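To make the dead-link theory concrete, here is a minimal sketch of per-pair load balancing across a link bundle. Everything in it is an assumption for illustration: real routers and switches use vendor-specific hash functions and inputs, and the names `pick_member` and `reachable`, the bundle size, and the host addresses are all hypothetical (only the netblocks come from this thread). The point is just that a given source/destination pair always lands on the same member link, so one silently dead member breaks a fixed subset of pairs and nothing else.

```python
import hashlib
from ipaddress import ip_address


def pick_member(src: str, dst: str, n_links: int) -> int:
    """Map a source/destination pair to one member link of a bundle.

    Hypothetical hash: SHA-256 over the packed addresses, modulo the
    number of links. Real equipment uses its own (usually simpler) hash.
    """
    key = ip_address(src).packed + ip_address(dst).packed
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def reachable(src: str, dst: str, n_links: int, dead_links: set) -> bool:
    """A pair works only if its hashed member link is not a dead one."""
    return pick_member(src, dst, n_links) not in dead_links


if __name__ == "__main__":
    # Assumed setup: a 4-link bundle where member 2 is dead but undetected.
    n_links, dead = 4, {2}
    dst = "8.12.35.10"  # hypothetical host in the far-end /24
    for host in range(1, 9):
        src = f"74.201.254.{host}"  # hypothetical hosts in the near-end block
        state = "ok" if reachable(src, dst, n_links, dead) else "BLACKHOLED"
        print(src, "->", dst, "member", pick_member(src, dst, n_links), state)
```

Under this model, changing the address at either end changes the hash input, which would explain why a neighbouring address in the same netblock works even though all the same equipment is involved.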
On Sun, Jun 15, 2008 at 11:01 AM, Matt Palmer <mpalmer@hezmatt.org> wrote:
We're seeing some really weird issues with connections that go through / to Level3 IP space. Basically, certain "pairs" of IPs (particular L3 IPs coupled with particular IPs of ours) have dodgy/nonexistent connectivity, but if you change the IP at either end everything's hunky dory.
I've sniffed (from both ends) pings going from a host in L3 space to our end and seen the pings arrive at our end and head back in the direction of L3, but they never get to their destination. Traceroutes from L3 stop at the next-to-last hop, while traceroutes back get to the hop before L3 space and stop.
All of this behaviour is source/dest *pair* specific -- if I ping/traceroute from another address (in the same netblock as the problematic IP, so all the same equipment is involved) at either end, or to another address (again, same netblock) at either end, it all works again.
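The reasoning above (swap the address at either end and see whether the failure follows it) can be sketched as a small classifier over a grid of probe results. This is only an illustrative sketch: the function name `classify_failures` and the shape of the `results` dictionary are assumptions, and collecting the results (by ping or traceroute) is left out entirely.

```python
from collections import defaultdict


def classify_failures(results):
    """Separate pair-specific failures from addresses that fail everywhere.

    results: dict mapping (src, dst) -> bool, True if the probe succeeded.
    """
    by_src = defaultdict(list)
    by_dst = defaultdict(list)
    for (src, dst), ok in results.items():
        by_src[src].append(ok)
        by_dst[dst].append(ok)

    # An address is "dead" if no probe involving it ever succeeded.
    dead_srcs = {s for s, oks in by_src.items() if not any(oks)}
    dead_dsts = {d for d, oks in by_dst.items() if not any(oks)}

    # A failure is pair-specific if both of its endpoints work with
    # at least one other partner.
    pair_specific = sorted(
        pair
        for pair, ok in results.items()
        if not ok and pair[0] not in dead_srcs and pair[1] not in dead_dsts
    )
    return {
        "dead_sources": dead_srcs,
        "dead_destinations": dead_dsts,
        "pair_specific": pair_specific,
    }
```

If the `pair_specific` list is non-empty while both "dead" sets are empty, the pattern matches what is described here: neither address is blocked on its own, only particular combinations are.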
I've got two questions:
1) Has anyone else seen similar behaviour from L3 (or other providers, even), so I know I'm not going mad?
2) What sort of configuration problem or software bug would cause this sort of problem to occur? If it was an IP blacklist (or even a block routing issue) anywhere along the line, surely it wouldn't be sensitive to changing the other end's address to another one in the same /24?
Any insight/anecdotes/etc would be greatly appreciated, as it's starting to do my head in. Just knowing I'm not alone with this insanity would be nice at this point. <grin>
If it makes any difference, the blocks I'm working from at my end are Internap, in 74.201.254.0/23 (we don't have all of it, just most of it), while the far end is 8.12.35.0/24.