What would work better/faster?
my-noc -> b0rken-noc
or
my-noc -> my-upstream-noc -> b0rken-noc-upstream-noc -> b0rken-noc
?
OK, rant time (blame the Easter long weekend... a 4 day weekend down here... and associated excessive alcohol)...

General comment: the below isn't meant to reflect badly on any of our past or present providers or peers... and for the most part the problems mentioned relate to previous suppliers, so please don't try to guess who they could be about :-)

This becomes much more relevant when you're not in America. Often a company in, say, Singapore or New Zealand may manage an Australian company's connection to the US internet. And then said Australian company may have a problem connecting through the US internet to, say, China via Japan (which the company I work at doesn't do anymore - one of our providers now has connectivity via Singapore to China which is much better, but that still isn't the case for many in Australia). Want to think about how many NOCs and language barriers there can be in that path? And peering relationships, timezone changes (harder to get good engineers sometimes, and 24 hour NOCs aren't common in many countries), etc?

Or, we can directly contact a NOC in southern China and get resolution, as well as having a very satisfied customer, because all his other upstreams attempted the NOC to upstream NOC path, through a massive number of NOCs who couldn't resolve the issue, and failed.

The problem is that when you take this approach you have to be very sure of which AS is causing the real problem (and/or what the real problem is). Calling your upstream's upstream and telling them to tune their tx-ring-limits is another example: your direct upstream at the time may not have heard of such a thing, and so couldn't relay the fault in a way that would let the remote NOC work out what the problem was and how to fix it. Of course the better thing to tell the provider in question would have been "don't try and put that many OC3 cards in a 7206!".
Admittedly the escalation in the southern China case (which wasn't our standard problem with providers in China turning on routers which make classful assumptions, and us having some 61.* IP space) was:
customer's customers -> customer's NOC -> our NOC -> problem site's upstream's NOC
The upstream's NOC liaised with the problem site and fortunately spoke English (the problem site didn't, but if it had been an issue, our customer's NOC had offered to translate). Even so, that cut out a _lot_ of NOCs.

To me there's some maximum number of NOCs that can be involved in a problem and still coordinate well, and it is around 4 (end ISP NOC, their upstream NOC to confirm the problem, problem site's upstream NOC to enforce fixing of the problem, problem site's actual NOC), which then becomes 3 in the case where the problem network is someone like Sprint, AT&T or UUNET who we wouldn't consider to actually have an "upstream" (and for the record, in the cases where I've had to, I haven't had a problem dealing with those three directly even though we're not a customer; maybe I've just been lucky).

An Australian company which is being directly affected by a problem may keep good staff on until the right time to contact a US or other international NOC directly during their working hours and get decent staff, rather than waiting on all the various NOCs to miscommunicate the problem across various hops.

Another problem is "follow the clock" NOCs and trying to call at the right time to get a US operator, since operators in the UK or Singapore at a certain ISP had pretty much no access to their routers and could do nothing more than email the US staff and hope to get some resolution 12 hours later... the country that took the call owned the problem, but had to pass it off internally, then wait till that country was active again to call the customer back, and repeat that a few times to convince them of the actual fault.
Glad I don't deal with that particular company anymore :-)

I haven't had a problem getting large US providers to give me a trouble ticket even though we're far removed from being a customer. And we've found the "trouble" has been things as lame as a certain large US provider putting a /32 static blackhole in one of their routers. Following the "correct" escalation path NOC to NOC in one case (since it was minor and worked around) did nothing for a week; after a direct email to their US NOC (email in that case, calls are for more urgent issues :-)) the problem was fixed within an hour.

The only group in the US I've found hard to deal with in any way internet operationally related was NetSol/VeriSign, which was a bad experience and a waste of international calls. They had no intent to deal with a _customer_ in a timely manner over an urgent change (a domain change for a company which had just gone into liquidation and was about to lose the routability of its IP space in 48 hours, and NetSol's systems weren't accepting IP changes for the nameservers due to what turned out to be design problems in their database application - the permission to change info update in some cases needs 24+ hours to propagate internally before you can make changes under the new permissions... ugly).

David.