My InfoWorld Column About NANOG
Dear NANOG List,
Thanks for your critiques of my NANOG meeting critique column in InfoWorld.
Below is a copy of a draft (before editing) of the offending column, just in case some of you have been reading only one another's critiques instead of the column itself. Of course I stand by it.
Some of you guys/gals are very good at ad hominem attacks. Flaming is alive and well on the Internet. Tisk tisk. But then I asked for it. Anyway, the attention is flattering. Thank you.
A few of you missed one point at least. I am NOT suggesting that any of YOU start wearing suits, especially if you find them uncomfortable, or that they make a statement you are not willing to make -- none of that, no -- good engineers are too valuable to overdress. I am suggesting that more of the kind of people who ALREADY wear suits should start paying attention to the important work NANOG is attempting and start attending your meetings so they can pitch in on the non-engineering aspects of operating the Internet. Is that clearer now?
By the way, there are reports from two days ago that 400,000 people lost their Internet access for 13 hours. Sounds like an outage approaching "collapse." Was that just a Netcom thing that NANOG has no interest in? Netcom is not talking very much about what happened. Any clues/facts out there? Were any NAPs involved?
/Bob Metcalfe, InfoWorld
----------------------
InfoWorld / From the Ether / Bob Metcalfe
NANOG Meeting Column
DRAFT TWO
The North American Network Operations Group (NANOG) remains our best bet for managing through the Internet's coming collapses. Problem is, like the Internet, NANOG itself is struggling to scale up.
I've just been among the 350 mostly engineers attending NANOG's May meeting at George Washington University. It's clear now, even if they hate the idea, that if NANOG is to lead us toward an industrial-strength Internet, then it must now urgently attract the active participation of many more men and women who routinely wear suits.
Here, on April Fool's Day, I nominated NANOG as that organization best positioned to lift the Internet out of its current, dysfunctional operations anarchy. I then incorrectly identified NANOG as part of the Internet Society's Internet Engineering and Planning Group (IEPG), a seemingly defunct sister of the Internet Engineering Task Force (IETF). Turns out I was wrong about what I'd read on the Web at http://info.isoc.org:80/adopsec.
For the next two weeks postings on NANOG's message archive flamed me for not knowing that NANOG is moderated by the Merit Network at the University of Michigan (http://www.merit.edu). NANOG, I was told, has nothing to do with the Internet Society. And further, the Internet Society has nothing to do anymore with the IETF. Checking with a pal at the Society, I was told that IETF has been arguing about disassociating from the Internet Society, and, oh by the way, Merit is "irrelevant."
Yes, I found pettiness and bureaucratic infighting among the groups I had hoped would be pulling the Internet together. I stand corrected, but not reassured.
Back at NANOG, I was surrounded by people whose life is about "running code." I twiddled as these mostly engineers, unaccustomed as they are to public speaking, stood up one by one in front of 350 people without having ever tried their slides on GWU's projection system. We all waited while Windows booted. If you have running code, it seems, you don't have to respect your audience by checking your slides at least once in advance. Or by wearing a suit.
NANOG's opening presentations on "The State of the Internet" were given by the four Network Access Points (NAPs). Pacific Bell (http://www.PacBell.COM/Products/NAP), Sprint (http://www.sprintlink.net/SPLK/HB21.html), Ameritech (http://www.ameritech.com/products/data/nap), and MFS Datanet (http://www.mfsdatanet.com/) each showed how very connected they are to various of the big Internet Service Providers (ISPs). They are installing new equipment to meet ramping demand, are operating well below capacity, and are not losing even a single Internet packet ever, they said.
Then came the three large Network Service Providers (NSPs). Sprint, ANS (http://www.ans.net), and MCI (http://www.mci.com/resources) each showed, after some Macintosh booting, that they are installing new equipment to meet ramping demand, are operating well below capacity, and are not losing even a single Internet packet ever, they said.
Then the fit hit the shan. Various earnest young speakers from Merit stood up one by one to report "alarming" statistics from the Internet -- rapidly increasing packet loss rates and routing instabilities (http://nic.merit.edu/routing.arbiter/RA/statistics). They asked the NAPs and NSPs, "Where are so many packets being lost?" "Somewhere else," came the denial.
Then followed an afternoon and another morning of pleadings. For standards on traffic measurements. For regular outage reporting. For cooperation on gathering topological information to use in Internet operations management. For streamlining multilateral "peering agreements" among ISPs. For systematic use of an Internet Routing Registry. And, from an actual Internet user, pleadings for cooperation on end-to-end service measurements.
Sadly, there was nobody at NANOG with the organizational sophistication to grab hold of these pleadings and accelerate them toward action. So, hey, I've got an idea, let's ask the business executives to whom current attendees of NANOG report to buy some T-shirts and take over. The Internet needs more than running code.
Now, what would happen if some of NANOG's big university, NAP, and NSP regulars showed up among the many small commercial ISPs expected August 8-10 at ONE ISPCON in San Francisco? I'll be summarizing there. See www.boardwatch.com or call 800-933-6038.
END
______________________________________________
Dr. Robert M. ("Bob") Metcalfe
Executive Correspondent, InfoWorld and VP Technology, International Data Group
Internet Messages: bob_metcalfe@infoworld.com
Voice Messages: 617-534-1215
Conference Chairman for ACM97: The Next 50 Years of Computing
San Jose Convention Center
March 1-5, 1997
______________________________________________
On Fri, 21 Jun 1996, Bob Metcalfe wrote:
Dear NANOG List,
Thanks for your critiques of my NANOG meeting critique column in InfoWorld. [...] Some of you guys/gals are very good at ad hominem attacks. Flaming is alive and well on the Internet. Tisk tisk. But then I asked for it. Anyway, the attention is flattering. Thank you.
Not quite...logic doesn't often lend itself to applications in politics. Statements made about you in reference to the opinions you presented were made as attempts to explain WHY you held these opinions. Some of the explanations make an awful lot of sense, IMHO, but they are only conjecture. If you'd like to enlighten us with regard to your reasoning (without resorting to lofty goals of truth and a happier {Internet|planet}), I, for one, would gladly listen. I think it's far more probable, however, that you'll rank my questions with the rest that have been posed here...unworthy of a response.
// Matt Zimmerman Chief of System Management NetRail, Inc.
// mdz@netrail.net sales@netrail.net
// (703) 524-4800 [voice] (703) 524-4802 [data] (703) 534-5033 [fax]
to be honest, i never attacked you, bob metcalfe, the person. in fact i have nothing but respect for your accomplishments. i have also stated that i can't believe that a person of your abilities can "get it wrong" so many times without doing so somewhat intentionally.
if you'd really like to know what i take issue with, it is the following:
They are installing new equipment to meet ramping demand, are operating well below capacity, and are not losing even a single Internet packet ever, they said. Then came the three large Network Service Providers (NSPs). Sprint, ANS (http://www.ans.net), and MCI (http://www.mci.com/resources) each showed, after some Macintosh booting, that they are installing new equipment to meet ramping demand, are operating well below capacity, and are not losing even a single Internet packet ever, they said.
Then the fit hit the shan. Various earnest young speakers from Merit stood up one by one to report "alarming" statistics from the Internet -- rapidly increasing packet loss rates and routing instabilities (http://nic.merit.edu/routing.arbiter/RA/statistics). They asked the NAPs and NSPs, "Where are so many packets being lost?" "Somewhere else," came the denial.
now, if i have to stand up in front of my peers and say "we are losing packets on our tail circuits into the NAPs" which i did stand up and say... as did other providers. and that "we are trying to remedy this situation by adding more capacity into the NAPs and are engineering direct interconnections between ourselves and other large providers" you ought to at least get it right when you go off and tell the world:
"Somewhere else," came the denial.
and, yes you also said that we should respect our audience by wearing a suit - but i found that rather funny. in fact that very suggestion has been the outcome of many a practical joke played on unsuspecting newcomers from our office to the nanog and/or ietf meetings.
nuff said, i return you to discussions of "running code"
Jeff Young
young@mci.net
A few of you missed one point at least. I am NOT suggesting that any of YOU start wearing suits, especially if you find them uncomfortable, or that they make a statement you are not willing to make -- none of that, no -- good engineers are too valuable to overdress. I am suggesting that more of the kind of people who ALREADY wear suits should start paying attention to the important work NANOG is attempting and start attending your meetings so they can pitch in on the non-engineering aspects of operating the Internet. Is that clearer now?
I will make sure not to miss this point.
By the way, there are reports from two days ago that 400,000 people lost their Internet access for 13 hours. Sounds like an outage approaching "collapse." Was that just a Netcom thing that NANOG has no interest in? Netcom is not talking very much about what happened. Any clues/facts out there? Were any NAPs involved?
That wasn't a collapse. It was a syntax error by one provider disrupting that one provider's service to its customer base. Just because they have 400,000 customers doesn't mean that their screw-up represents a collapse of the internet.
/Bob Metcalfe, InfoWorld
----------------------
InfoWorld / From the Ether / Bob Metcalfe
NANOG Meeting Column
DRAFT TWO
The North American Network Operations Group (NANOG) remains our best bet for managing through the Internet's coming collapses. Problem is, like the Internet, NANOG itself is struggling to scale up.
Yes and no.
I've just been among the 350 mostly engineers attending NANOG's May meeting at George Washington University. It's clear now, even if they hate the idea, that if NANOG is to lead us toward an industrial-strength Internet, then it must now urgently attract the active participation of many more men and women who routinely wear suits.
This is a two-edged sword, and I suspect that you are not seeing both edges. NANOG has been a very effective and useful forum for a long time. It has allowed North American network operators a place to get down to the technical where the presenters can assume that the audience is knowledgeable and heavily involved in this stuff on a daily basis. If you start attracting a lot of suits, that suffers. You can't have a bunch of suits attending a meeting like this and still talk high-end BGP-4 techinese without alienating the suits. NANOG's primary mission is to provide a forum for addressing highly technical issues. I believe that the addition of a large number of suits would hinder that process. I do agree that the suits need to build a forum, but I don't think it should be done at the expense of NANOG's strong technical focus.
Here, on April Fool's Day, I nominated NANOG as that organization best positioned to lift the Internet out of its current, dysfunctional operations anarchy. I then incorrectly identified NANOG as part of the Internet Society's Internet Engineering and Planning Group (IEPG), a seemingly defunct sister of the Internet Engineering Task Force (IETF). Turns out I was wrong about what I'd read on the Web at http://info.isoc.org:80/adopsec.
IEPG is not defunct. For more information on IEPG, you would do well to contact Elise P. Gerich (@merit.edu). She is the IEPG leader. IEPG has a more global focus than NANOG. People from all over the world attend IEPG, whereas NANOG is primarily North Americo-centric. Also, IEPG tends to be less technical and more overview/big-picture oriented than NANOG. Indeed, IEPG might well be a good place to bring together the suits and the techies for the type of forum you describe.
Yes, I found pettiness and bureaucratic infighting among the groups I had hoped would be pulling the Internet together. I stand corrected, but not reassured.
I don't know who exactly you talked to, but this characterization is largely unfair. By and large, IETF, ISOC, and Merit all have their contributions. All are valuable organizations. True, there is an absence of well-defined roles right now, but this is an evolving immature technology which just had its primary infrastructure completely rearranged by the US NSF. I would say, in the face of such a revolution, the fact that the system continues to work at all is amazing, let alone continuing to sustain exponential growth and a less than exponential decay in service.
Back at NANOG, I was surrounded by people whose life is about "running code." I twiddled as these mostly engineers, unaccustomed as they are to public speaking, stood up one by one in front of 350 people without having ever tried their slides on GWU's projection system. We all waited while Windows booted. If you have running code, it seems, you don't have to respect your audience by checking your slides at least once in advance. Or by wearing a suit.
Look, we're not polished presenters. It's not our strong suit. However, we can answer the questions the other engineers come up with after we present. If we sent a polished presenter (suit) in instead, sure, he could go through the slides and would be a better speaker. But he wouldn't have the first clue what half the questions related to, and he'd probably have a hard time pronouncing half the words on the slides.
Then the fit hit the shan. Various earnest young speakers from Merit stood up one by one to report "alarming" statistics from the Internet -- rapidly increasing packet loss rates and routing instabilities (http://nic.merit.edu/routing.arbiter/RA/statistics). They asked the NAPs and NSPs, "Where are so many packets being lost?" "Somewhere else," came the denial.
You know, this is not atypical of any engineering session where one tries to resolve interdepartmental issues. You can bring our dirty laundry out for the public to see if you want, but it's not productive or helpful, any more than the very infighting you complained about earlier. The reality is that when problems occur, denial is the first defense of almost any human. I guarantee you that there were few people amongst those 350 who could be said to do less in any single day to keep the internet running than you have done in a year. You can say what you want about "rough consensus and running code" but everything else is theory. Everything else is more complicated and less effective.
Then followed an afternoon and another morning of pleadings. For standards on traffic measurements. For regular outage reporting. For cooperation on gathering topological information to use in Internet operations management. For streamlining multilateral "peering agreements" among ISPs. For systematic use of an Internet Routing Registry. And, from an actual Internet user, pleadings for cooperation on end-to-end service measurements.
You can call this pleadings if you want, but for the most part, these included proposals on how to go about it as well. Perhaps an engineering environment is sufficiently foreign to you that you don't realize that we do things a little differently from "suits". We generally put things on the table that we think will solve a problem. Then we watch as other engineers mercilessly pick it apart and tell us what's broken about it. Then we work together on ways to resolve those issues. Eventually, we come up with a design everyone thinks they can live with and we try to build something that looks like what we agreed upon. Some years ago, this process eventually generated a protocol now known as IP. Further rounds then resulted in IPv4. Later still, we are on the verge of seeing IPv6 come out of this same process.
Sadly, there was nobody at NANOG with the organizational sophistication to grab hold of these pleadings and accelerate them toward action. So, hey, I've got an idea, let's ask the business executives to whom current attendees of NANOG report to buy some T-shirts and take over. The Internet needs more than running code.
No, the Internet doesn't need more than running code. The Internet needs more running code. That is the goal you claim we lack the organizational sophistication to attain. Frankly, I think you miss a lot of what happened at the meeting. No one has to "grab hold of these pleadings." Each and every engineer in that room probably will spend some time thinking about the needs identified in those pleadings. Out of that, at some point, will come some design ideas. Probably at the next NANOG. From those design ideas will come design reviews, followed by prototype code, followed by running code, followed by debugging, followed by a workable system. That is how the internet engineering world works. It has worked that way for a long time. I'm sorry if it isn't happening fast enough to satisfy you or make you think that we're all a great bunch of people; however, it is happening. Adding suits will only serve to muddy the waters. However, if you would like to contribute the funding to hire a full-time staff of about 15-20 of those engineers in that room and one or two suits to manage them, then allow them to work on nothing but the problems you mention, you could probably achieve results within 6-12 months. I estimate many of the results will be obtained through the existing NANOG process in 12-24 months.
Now, what would happen if some of NANOG's big university, NAP, and NSP regulars showed up among the many small commercial ISPs expected August 8-10 at ONE ISPCON in San Francisco? I'll be summarizing there. See www.boardwatch.com or call 800-933-6038.
Probably several will be there. I likely will. Technically, you could call my ISP a small one, but it's being run like a big one, and we're peering and routing like a big one. The delta is just a matter of time and customers.
Owen
On Fri, 21 Jun 1996 11:31:32 EDT, Bob Metcalfe wrote:
By the way, there are reports from two days ago that 400,000 people lost their Internet access for 13 hours. Sounds like an outage approaching "collapse." Was that just a Netcom thing that NANOG has no interest in? Netcom is not talking very much about what happened. Any clues/facts out there? Were any NAPs involved?
The news reports about the outage were that somehow numerous external routes got into the internal Netcom backbone routing, and the extra load caused a chain reaction that caused everything to go down. Apparently it was mainly confined to Netcom's network. Whether by design or by dumb luck, I don't know.
We currently hang off of AGIS in San Jose, and for about four hours after Netcom came back, we were up and down. Couldn't tell from here if it was AGIS, MAE West, or what, or if Netcom coming back had anything to do with it.
I watched this outage from the periphery, and was completely blown away by the non-reaction to it. Official statements from Netcom (essentially confirming Bob's numbers above) were quoted on the Reuters newswire, and on the front page of the San Jose Mercury News Business section the next day (although the editor played down the impact of it a little, and mixed a one-hour AOL email outage into the same story and turned it into "outages affect online services").
On the other hand, Netcom has said essentially nothing to its subscriber base about the outage. I've seen only a little mention of it around the net. Am I looking in the wrong places -- or is there no good way to communicate about these sorts of things yet? (I've signed up to the outage discussion list, as Sean suggested.)
My impression is starting to be that most Netcom subscribers didn't really notice the difference between normal Internet operations and the 13-20 hour outage, and/or didn't have the diagnostic capabilities to be able to tell. There were technically-oriented folks that could see that something was going on, but even for them, it was hard to tell what.
I'm wearing two hats for the next set of questions -- the first as a technical manager for an ISP growing an international backbone, and the second as someone who's concerned about marketing the Internet (and my company) to the public.
Can other big parts of the backbone fall down and take 13 (or more) hours to get back up? Or is the rest of the net engineered more redundantly than Netcom? Should I build two backbones, each with separate technologies? Was this a foreshock of the coming Metcalfean Big One, or just lousy procedures at one of the bigger ISPs?
Inquiring minds want to know. Right now, it appears to be just a few (thankfully?). And now is the time to develop communications and publicity strategies for this sort of thing -- along with the engineering to hopefully prevent them.
-- Pete Kaminski kaminski@nanospace.com
On Fri, 21 Jun 1996, Peter Kaminski wrote:
Can other big parts of the backbone fall down and take 13 (or more) hours to get back up? Or is the rest of the net engineered more redundantly than Netcom? Should I build two backbones, each with separate technologies?
Ask NASA how they do it. Three redundant systems using two separate technologies. But then look at NASA's downside and compare it to yours. If Netcom's customers hardly noticed this maybe the dialup market doesn't care. However, the leased line market is a whole other story and they also have the technical expertise to understand your backbone engineering and perhaps pay a higher fee to have that redundancy. This question really tangles up marketing and engineering concerns together.
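A rough back-of-the-envelope sketch of the redundancy trade-off discussed here may help. It assumes the redundant systems fail independently, which is the reason to use separate technologies in the first place, and the 99% per-backbone availability is an invented figure for illustration, not a measurement of any provider's network.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(per_system: float, copies: int) -> float:
    """Availability of `copies` redundant systems, ASSUMING independent failures.

    The service is down only when every copy is down at the same time.
    """
    return 1.0 - (1.0 - per_system) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # 0.99 is a hypothetical per-backbone availability, chosen only for illustration.
    for copies in (1, 2, 3):
        a = combined_availability(0.99, copies)
        print(f"{copies} backbone(s): availability {a:.6f}, "
              f"about {expected_downtime_hours(a):.2f} hours down per year")
```

If the copies share a failure mode (the same software bug, the same trench), the independence assumption breaks and the real figure will be worse than this estimate.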
Was this a foreshock of the coming Metcalfean Big One, or just lousy procedures at one of the bigger ISPs?
The bigger they are, the harder they fall. Seems to me that as ISP's and NSP's get larger, failures will be more spectacular. However, the big one depends on the ability for failures to propagate from one ISP/NSP to another and I don't think this is very likely. Partly due to the different engineering styles and partly due to the diversity of technology deployed. You have frame relay backbones, ATM fabrics, DS3 meshes with Cisco nodes and DS3 meshes with Bay nodes.
Up until Netcom, the most spectacular failures I recall seeing over the past two years were either caused by NAP congestion or backhoes. NAP congestion is partially a management failure to deploy bigger pipes and routers and increase the number of NAP's in time to meet the growth in traffic flow. But it is also self-correcting as some customers migrate to NSP's with less congestion and management injects capital into their infrastructure. It seems to be a well understood problem.
But to me, backhoes are the most interesting failure mode. For one, I don't think that backhoe problems can be eliminated and I think that as the physical mesh of fibre becomes more finely divided over the geography, these incidents will increase. And I also don't know of anyone taking action to protect against these events by building geographic redundancy into their backbones. This may be partly because NSP's often don't have any idea where the fibres lie and partly because they want to use a specific infrastructure like SPRINT and its railway rights of way. The incident in the Northeast where a backhoe cut a Wiltel(?) fibre bundle that was carrying critical DS3's leased by all the NSP's in the region points out how catastrophic this can be.
Michael Dillon ISP & Internet Consulting
Memra Software Inc. Fax: +1-604-546-3049
http://www.memra.com E-mail: michael@memra.com
Michael, On Fri, 21 Jun 1996, Michael Dillon wrote: ...
But to me, backhoes are the most interesting failure mode. For one, I don't think that backhoe problems can be eliminated
I don't know about total elimination but we're working on it. Sprint is currently deploying 4-fiber bi-directional SONET rings that will cause fiber cuts to go virtually unnoticed. Circuits are switched to a protect channel in about 50 msec after a failure of the primary path.
and I think that as the physical mesh of fibre becomes more finely divided over the geography, these incidents will increase. And I also don't know of anyone taking action to protect against these events by building geographic redundancy into their backbones. This may be partly because NSP's often don't have any idea where the fibres lie and partly because they want to use a specific infrastructure like SPRINT and its railway rights of way. The incident in the Northeast where a backhoe cut a Wiltel(?) fibre bundle that was carrying critical DS3's leased by all the NSP's in the region points out how catastrophic this can be.
Again, this may not be totally eliminated. However, we are working to provide as much physical path diversity as possible.
Jim Steinhardt
SprintLink Engineering
Michael Dillon ISP & Internet Consulting Memra Software Inc. Fax: +1-604-546-3049 http://www.memra.com E-mail: michael@memra.com
Can other big parts of the backbone fall down and take 13 (or more) hours to get back up? Or is the rest of the net engineered more redundantly than Netcom? Should I build two backbones, each with separate technologies? Was this a foreshock of the coming Metcalfean Big One, or just lousy procedures at one of the bigger ISPs?
Maybe we should turn off a major exchange point for 6 hours as a test. Seriously, unless you have on-staff people who are quite swift and really understand both the architecture and the implementation bogies of your particular IP network, it might not be a bad idea to build two separate backbones with different technology and/or routing policy.
Inquiring minds want to know. Right now, it appears to be just a few (thankfully?). And now is the time to develop communications and publicity strategies for this sort of thing -- along with the engineering to hopefully prevent them.
-- Pete Kaminski kaminski@nanospace.com
Avi
On Fri, 21 Jun 1996, Peter Kaminski wrote:
I'm wearing two hats for the next set of questions -- the first as a technical manager for an ISP growing an international backbone, and the second as someone who's concerned about marketing the Internet (and my company) to the public.
Can other big parts of the backbone fall down and take 13 (or more) hours to get back up? Or is the rest of the net engineered more redundantly than Netcom? Should I build two backbones, each with separate technologies? Was this a foreshock of the coming Metcalfean Big One, or just lousy procedures at one of the bigger ISPs?
Having a fully meshed/redundant network should be the goal of any serious ISP. The only one that claims it with any substance IMO is UUNET. We are trying to build one and it's not easy. Having redundant links in place does not guarantee instant failover of traffic. Static routes, IGRP, iBGP, bridging, rip1 vs rip2, etc. are some of the issues we are running into. As well as when an interface is down, but actually looks up to the router, etc. It can be done, but there are so many possible points of failure and unforeseen scenarios, it is very difficult to construct and certainly takes time to develop.
/stb
---
Stephen Balbach "Driving the Internet To Work"
VP, ClarkNet info@clark.net
due to the high volume of mail I receive please quote the full original message in your reply.
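The "interface is down, but actually looks up to the router" case can be illustrated with a small sketch. Everything in it is hypothetical: the `Link` fields, the idea of a separate end-to-end probe result, and the preference numbers are made up for illustration and do not describe any particular router or routing protocol.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    name: str
    reports_up: bool   # what the interface status claims (hypothetical field)
    probe_ok: bool     # did an end-to-end probe across the link succeed? (hypothetical)
    preference: int    # lower number means more preferred


def pick_link(links: List[Link]) -> Optional[Link]:
    """Pick the most preferred link that both reports up AND passes its probe.

    Trusting `reports_up` alone would keep traffic on a link that is dead
    but still looks alive, which is the failure case described above.
    """
    usable = [link for link in links if link.reports_up and link.probe_ok]
    if not usable:
        return None
    return min(usable, key=lambda link: link.preference)


if __name__ == "__main__":
    links = [
        Link("primary", reports_up=True, probe_ok=False, preference=1),
        Link("backup", reports_up=True, probe_ok=True, preference=2),
    ]
    chosen = pick_link(links)
    print(chosen.name if chosen else "no usable link")
```

The design point is only that failover needs a liveness signal independent of the interface's own status; how that signal is gathered in practice is outside this sketch.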
On Jun 21, 11:31, Bob Metcalfe <bob_metcalfe@infoworld.com> wrote:
By the way, there are reports from two days ago that 400,000 people lost their Internet access for 13 hours. Sounds like an outage approaching "collapse."
Let's blow this into proportion, shall we? Current, somewhat overhyped, estimates say there are 60M Internet users worldwide. 400k is some 0.67% of that. Cut the hype in half if you want, and make that 1.3% instead. I say, some collapse.
--
Per G. Bilse, Mgr Network Operations Ctr
EUnet Communications Services B.V.
Singel 540, 1017 AZ Amsterdam, NL
tel: +31 20 6233803, fax: +31 20 6224657
24hr emergency number: +31 20 421 0865
Connecting Europe since 1982
http://www.EU.net e-mail: bilse@EU.net
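The percentages in this message can be checked with a few lines of arithmetic. The 400,000 and 60M figures are the ones quoted in the thread; the function name is just a label for this sketch.

```python
def affected_share(affected: int, total_users: int) -> float:
    """Return affected users as a percentage of all users."""
    return 100.0 * affected / total_users


if __name__ == "__main__":
    netcom_affected = 400_000      # outage figure quoted in the thread
    user_estimate = 60_000_000     # "somewhat overhyped" worldwide estimate from the thread

    # Share of users affected under the full estimate.
    print(f"{affected_share(netcom_affected, user_estimate):.2f}%")
    # Share if the user estimate is cut in half.
    print(f"{affected_share(netcom_affected, user_estimate // 2):.2f}%")
```

Halving the user estimate doubles the affected share, which is where the roughly 1.3% figure comes from.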
So, Mr. Metcalfe: Is it the case that anyone who disagrees with you is a flamer? Tisk tisk....
DT
On Fri, 21 Jun 1996, Bob Metcalfe wrote:
Dear NANOG List,
Thanks for your critiques of my NANOG meeting critique column in InfoWorld.
Below is a copy of a draft (before editing) of the offending column, just in case some of you have been reading only one another's critiques instead of the column itself. Of course I stand by it.
Some of you guys/gals are very good at ad hominem attacks. Flaming is alive and well on the Internet. Tisk tisk. But then I asked for it. Anyway, the attention is flattering. Thank you.
participants (11)
- Avi Freedman
- bob_metcalfe@infoworld.com
- Doug Tooley
- Jeff Young
- Jim J. Steinhard
- Matt Zimmerman
- Michael Dillon
- owen@DeLong.SJ.CA.US
- Per Gregers Bilse
- Peter Kaminski
- Stephen Balbach