my personal reality check on dual homing and backbone growth
just for kicks, here is my version of the history off the top of my head. For those of you who don't feel like reading all this, the main points are:

a/ there hasn't been a coherent "backbone" for a long time

b/ while Robert M's point about an OC3 being only 3 x DS3 (an OC3 is roughly 155 Mb/s, i.e. about three 45 Mb/s DS3s) is only somewhat true due to the proliferation of backbones, we're still in trouble :-)

I don't know much prior to about 1984, other than that the cutover from NCP to TCP/IP was pretty much complete by the end of January 1983. The ARPAnet at that time wasn't too overloaded (at least, it worked when I used it for email :-). The trunks were 56 kb digital and 50 kb analog circuits, with various speeds for the PSN-to-host connections. The Milnet split off around then, too; again, I don't really know the exact topological details, but I understand it was using essentially the same technology as the ARPAnet.

As 1985 and 1986 progressed, the ARPAnet got more loaded as more sites came on, but I don't have any specific statistics. Cross-country links like the one we had through Pittsburgh (CMU, PSN #14) were apparently the real hot spot. BITnet was starting around then, too, mostly 9.6/19.2/56 kb links between IBM mainframes.

In August/September of 1986, the fuzzballs for the original NSFnet backbone were delivered to the six sites, and in October or so the links came up between them at 56 kb. Initially there was no connection between the NSFnet and the ARPAnet, but there really wasn't much on "the NSFnet" for anything to connect to anyway, since the infrastructure at the supercomputer centers was just going into place and the regional networks weren't built either.

In early November (it was a blur - months without sleep cause me to lose track of time :-), Greg Christy and I got PSC-gateway working and routing packets between the CMU PSN and the NSFnet fuzzball. It was running gated code, developed mostly by Scott Brim and Mark Fedor, who were both at Cornell at the time. I think this was the first NSFnet-ARPAnet connection; at least it was the first "official" one (that's what I was told).

Both the ARPAnet and NSFnet were quickly filling up, and we were hitting all kinds of scaling limits throughout 1987 (not enough VCs on the X.25 cards, routing slots in just about everything from Proteon P4200s on down). We used default for stubs, but pretty much everything was trying to carry full routing, which at that time was a couple of hundred routes (maybe several hundred - I can dig up old stats if anyone actually cares).

Since the backbones were full and there wasn't a lot of cash around for shared infrastructure, but the supercomputer centers got $$$ to connect their affiliates, some "regional" networks went in - many with links at T1 (at that point it was difficult to convince equipment vendors, especially CSU/DSU manufacturers, that there was a need for clear-channel T1s - "no one could ever need to send that much data" :-). PSC, NCSA, SDSC, and JvNC all built "affiliate" networks. NCAR built the UCAR satellite network, and other regionals like MIDnet, BARRnet, NWnet, SURAnet, and NYSERnet were built throughout late 1986 and 1987 (and obviously continued to grow). I remember that JvNC had the Univ. of Colorado (one of their supercomputer affiliates) connected, and UCAR connected sites all over the place. I forgot about WESTnet, another regional which was somehow affiliated with the Univ. of Colorado/NCAR. At the PSC, we put in connections to the Univ. of Maryland, which had a UCAR connection and an ARPAnet connection, and to the Univ. of Michigan, which connected to the ARPAnet too, etc. etc.

Bottom line: there were lots of opportunities even then for dual homing; in many cases the higher-speed links were limited to traffic going between a pair of regionals/supercomputer centers, and the backbones were for general-purpose traffic.

Since the commercial equipment vendors (cisco and Proteon) were adding advanced features like split horizon for RIP to their products ( :-) :-) ), there were many late-night sessions trying to figure out routing backdoors and loops and such (in the presence of overflowing routing tables and other craziness), with new gated versions being developed by Mark F. several times a week and installed at various points in the ARPAnet-NSFnet interconnects, and Dave Mills writing new fuzzball code and also installing regularly. (Does this sound familiar?) Packet loss and congestion were certainly normal events; I was generally impressed when things worked at all. SNMP and traceroute didn't exist yet. And we had to walk uphill to work both ways in the snow every day. Not that we ever got to go home, because the network was always broken in one way or the other.
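For anyone who wasn't fighting RIP back then: split horizon just means a router never re-advertises a route out the same interface it learned it on, which kills the simplest form of two routers bouncing a dead route back and forth forever. A toy sketch in Python - the names and structure here are mine for illustration, not anything cisco or Proteon ever shipped:

    from dataclasses import dataclass

    @dataclass
    class Route:
        net: str          # classful network number, e.g. "128.2.0.0"
        metric: int       # RIP hop count; 16 means unreachable
        learned_on: str   # interface the update arrived on

    def build_rip_update(table, out_iface):
        """Routes to announce on out_iface, with split horizon applied."""
        update = []
        for route in table:
            # Split horizon: never advertise a route back out the
            # interface we learned it on.
            if route.learned_on == out_iface:
                continue
            update.append((route.net, route.metric))
        return update

    # A route learned on serial0 goes out ether0 but never back out serial0.
    table = [Route("128.2.0.0", 2, "serial0"), Route("10.0.0.0", 1, "ether0")]
    assert ("128.2.0.0", 2) in build_rip_update(table, "ether0")
    assert ("128.2.0.0", 2) not in build_rip_update(table, "serial0")

That's the whole "advanced feature". :-)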
In December of 1987 the contract for the new NSFnet was awarded to the triumvirate of Merit, IBM, and MCI. Their target was to get the T1-based network installed by July 1 of 1988. In the next six months things really sucked. Mark Oros was working at Cornell with the responsibility of making sure that people's gated configs were consistent with each other around the country, to prevent feedback between the NSFnet, the ARPAnet, and the various regional-network backdoors. A human route arbiter with no enforcement can only be so effective.

By some hard work and luck the new NSFnet came up on time and worked well, which is good, since everything had collapsed by that point - during the day the network was basically unusable. The backbone was made up of IDNX T1 muxes which interconnected IBM RTs (there were something like 9 of these at each of the backbone sites). The IDNXs and weak interface support on the RTs limited bandwidth on the trunks to something like a third of the T1, but it was still an improvement. Later in 1988 or early in 1989 more trunks were added, and cards were installed in the RTs that could actually approach T1 rates - again happening just in time, due to severe backbone congestion. Someone from the Merit group can probably provide actual dates for these upgrades.

The ARPAnet was still at 56 kb at this point, and there were several places connected to both networks, and Merit maintained *by hand* the database which tracked which routes were accepted at each backbone node. The database only changed twice a week, early on Tuesday and Friday mornings I believe.
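To make the shape of that database concrete, here's a purely illustrative sketch in Python - the node names and network numbers are made up, and the real thing was a hand-edited file with per-node metrics on the Merit side, not code:

    # Conceptually, the NSFnet policy database mapped each backbone node
    # to the set of (classful, pre-CIDR) networks it would accept.
    # Node names and networks below are invented for illustration.
    ACCEPTED_NETS = {
        "college-park": {"128.8.0.0", "192.5.214.0"},
        "ann-arbor":    {"35.0.0.0"},
    }

    def accept_route(node, net):
        """Would this backbone node install/announce this network?"""
        return net in ACCEPTED_NETS.get(node, set())

    # Anything not explicitly listed for a node was simply not accepted
    # there, which is why every change had to go through Merit (and only
    # took effect twice a week).
    assert accept_route("college-park", "128.8.0.0")
    assert not accept_route("college-park", "35.0.0.0")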
1989 and 1990 were probably as coherent, routing-wise, as things ever got, because soon after that is when the new backbones really started to sprout. Alternet and PSI both built nationwide commercial backbones, Sprint put in the international connections under the ICMnet program, NASA built NSI, DOE built ESnet, and FIX-East and FIX-West went in - that was in very late 1989/early 1990 (so I guess the period of there being a real "single backbone" was pretty short, like half of 1989 maybe, at least measured by when the clear bulk of the traffic was running over the NSFnet backbone). Also, CA*Net was going in and connecting to the US at 3 (?) spots, adding additional routing challenges.

In 1990 there was also clearly enough additional bandwidth, and enough sites connected to the IP networks, that both BITnet and USENET really cut over onto the Internet backbone in a big way, both adding heavily to the traffic load. "New" regionals like CICnet, CERFnet, NEARnet, Los Nettos, CONCERT, PREPnet, etc. were also going in during this time, typically with T1 backbones. The other regionals continued to grow, and the dual homing of sites that had an ARPAnet connection or an ESnet connection, etc. added to the confusion, since hardly anyone knew how to do dual homing, gated was complicated, there weren't enough of the right knobs on the ciscos and Proteons, and the routing table was growing fast (it hit 1000 entries sometime in 1990, I believe). At SURAnet we had a person working probably 2 days a week just trying to keep the routing-table stuff straight, since we were at least nominally responsible for FIX-East (people called us when it broke), there were lots of dual-homed sites, and there were the international connections coming in through UUnet, whose changes we had to submit to the Merit folks since we were the only ones designated to authorize changes to the NSFnet database for the College Park node.

In 1991 the DS3 NSFnet backbone started to go in, but it was more troublesome than the initial T1 backbone - plus more sites were added at MIT, Georgia Tech, and Argonne Labs, creating more interesting backbone meshing between ESnet and NSFnet (all three were ESnet sites too, or at least MIT and Argonne were), and more late-night phone calls and email to Cathy Wittbrodt and Mike Collins and Tony Hain and the Merit folks trying to keep things working.

The insanity has continued through the creation of MAE-East, and now the other current NAPs and other interconnects in 1992 and since then, along with the building of new backbones - initially SprintLink, more recently internetMCI and Net99/AGIS - and of course the emergence of ANS from the IBM/MCI/Merit group (which actually happened in 1990, I believe, but that's another story altogether, with its own biases and viewpoints and perspectives; refer to the vast com-priv archives of the day if you have the time and stomach for it :-).

The backbone has continued to grow in number of interconnections and in trunk bandwidth (raw pipe speed, interface speed, buffering), with denser meshing, more routes, more international links, and always more complexity. In general it has continued to work, in some rough sense of the word "work", primarily due to a lot of effort by never quite enough individuals who understood how things plugged together and were somehow able to keep things running while educating those bright and willing (and crazy) individuals who were ready to step up to the task of helping out (unfortunately there were never enough of them, and there still aren't, it seems).

I probably left out a lot of stuff, either due to my faulty memory or ignorance of the facts, or simply my limited perspective on the situation; feel free to provide your own facts as you desire. Anyway, as I kind of said in the beginning: there have pretty much always been multiple backbones, routing has pretty much always been screwed up in one way or another, and traffic loads have pretty much always been on the verge of melting the net (sometimes succeeding in doing so).

have a nice day.

dave