my personal reality check on dual homing and backbone growth
just for kicks, here is my version of the history off the top of my head. For those of you who don't feel like reading all this, the main points are:

a/ there hasn't been a coherent "backbone" for a long time

b/ while Robert M's point about an OC3 being only 3 x DS3 (an OC3 is roughly 155 Mb/s, i.e. about three 45 Mb/s DS3s) is only somewhat true due to the proliferation of backbones, we're still in trouble :-)

I don't know much prior to about 1984, other than that the cutover from NCP to TCP/IP was pretty much complete by the end of January 1983. The ARPAnet at that time wasn't too overloaded (at least, it worked when I used it for email :-). The trunks were 56 kb digital and 50 kb analog circuits, with various speeds for the PSN-to-host connections. The Milnet split off around then, too; again, I don't really know the exact topological details, but I understand it was using essentially the same technology as the ARPAnet.

As 1985 and 1986 progressed, the ARPAnet got more loaded as more sites came on, but I don't have any specific statistics. Cross-country links like the one we had through Pittsburgh (CMU, PSN #14) were apparently the real hot spot. BITnet was starting around then, too, mostly 9.6/19.2/56 kb links between IBM mainframes.

In August/September of 1986, the fuzzballs for the original NSFnet backbone were delivered to the six sites, and in October or so the links came up between them at 56 kb. Initially there was no connection between the NSFnet and the ARPAnet, but there really wasn't much on "the NSFnet" for anything to connect to anyway, since the infrastructure at the supercomputer centers was just going into place and the regional networks weren't built either.

In early November (it was a blur - months without sleep cause me to lose track of time :-), Greg Christy and I got PSC-gateway working and routing packets between the CMU PSN and the NSFnet fuzzball. It was running gated code, developed mostly by Scott Brim and Mark Fedor, who were both at Cornell at the time. I think this was the first NSFnet-ARPAnet connection; at least it was the first "official" one (that's what I was told).

Both the ARPAnet and NSFnet were quickly filling up, and we were hitting all kinds of scaling limits throughout 1987 (not enough VCs on the X.25 cards, routing slots in just about everything from Proteon P4200s on down). We used default for stubs, but pretty much everything was trying to carry full routing, which at that time was a couple of hundred routes (maybe several hundred - I can dig up old stats if anyone actually cares).

Since the backbones were full and there wasn't a lot of cash around for shared infrastructure, but the supercomputer centers got $$$ to connect their affiliates, some "regional" networks went in - many with links at T1 (at that point it was difficult to convince equipment vendors, especially CSU/DSU manufacturers, that there was a need for clear-channel T1s - "no one could ever need to send that much data" :-). PSC, NCSA, SDSC, and JvNC all built "affiliate" networks. NCAR built the UCAR satellite network, and other regionals like MIDnet, BARRnet, NWnet, SURAnet, and NYSERnet were built throughout late 1986 and 1987 (and obviously continued to grow). I remember that JvNC had the Univ. of Colorado (one of their supercomputer affiliates) connected, and UCAR connected sites all over the place. I forgot about WESTnet, another regional which was somehow affiliated with the Univ. of Colorado/NCAR. At the PSC, we put in connections to the Univ. of Maryland, which had a UCAR connection and an ARPAnet connection, and to the Univ. of Michigan, which connected to the ARPAnet too, etc. etc.

Bottom line: there were lots of opportunities even then for dual homing; in many cases the higher-speed links were limited to traffic going between a pair of regionals/supercomputer centers, and the backbones were for general-purpose traffic.

Since the commercial equipment vendors (cisco and Proteon) were adding advanced features like split horizon for RIP to their products ( :-) :-) ), there were many late-night sessions trying to figure out routing backdoors and loops and such (in the presence of overflowing routing tables and other craziness), with new gated versions being developed by Mark F. several times a week and installed at various points in the ARPAnet-NSFnet interconnects, and Dave Mills writing new fuzzball code and also installing regularly. (Does this sound familiar?) Packet loss and congestion were certainly normal events; I was generally impressed when things worked at all. SNMP and traceroute didn't exist yet. And we had to walk uphill to work both ways in the snow every day. Not that we ever got to go home, because the network was always broken in one way or the other.
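For anyone who wasn't fighting RIP back then: split horizon just means a router never re-advertises a route out the same interface it learned it on, which kills the simplest form of two routers bouncing a dead route back and forth forever. A toy sketch in Python - the names and structure here are mine for illustration, not anything cisco or Proteon ever shipped:

    from dataclasses import dataclass

    @dataclass
    class Route:
        net: str          # classful network number, e.g. "128.2.0.0"
        metric: int       # RIP hop count; 16 means unreachable
        learned_on: str   # interface the update arrived on

    def build_rip_update(table, out_iface):
        """Routes to announce on out_iface, with split horizon applied."""
        update = []
        for route in table:
            # Split horizon: never advertise a route back out the
            # interface we learned it on.
            if route.learned_on == out_iface:
                continue
            update.append((route.net, route.metric))
        return update

    # A route learned on serial0 goes out ether0 but never back out serial0.
    table = [Route("128.2.0.0", 2, "serial0"), Route("10.0.0.0", 1, "ether0")]
    assert ("128.2.0.0", 2) in build_rip_update(table, "ether0")
    assert ("128.2.0.0", 2) not in build_rip_update(table, "serial0")

That's the whole "advanced feature". :-)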
In December of 1987 the contract for the new NSFnet was awarded to the triumvirate of Merit, IBM, and MCI. Their target was to get the T1-based network installed by July 1 of 1988. In the next six months things really sucked. Mark Oros was working at Cornell with the responsibility of making sure that people's gated configs were consistent with each other around the country, to prevent feedback between the NSFnet, the ARPAnet, and the various regional-network backdoors. A human route arbiter with no enforcement can only be so effective.

By some hard work and luck the new NSFnet came up on time and worked well, which is good, since everything had collapsed by that point - during the day the network was basically unusable. The backbone was made up of IDNX T1 muxes which interconnected IBM RTs (there were something like 9 of these at each of the backbone sites). The IDNXs and weak interface support on the RTs limited bandwidth on the trunks to something like a third of the T1, but it was still an improvement. Later in 1988 or early in 1989 more trunks were added, and cards were installed in the RTs that could actually approach T1 rates - again happening just in time, due to severe backbone congestion. Someone from the Merit group can probably provide actual dates for these upgrades.

The ARPAnet was still at 56 kb at this point, and there were several places connected to both networks, and Merit maintained *by hand* the database which tracked which routes were accepted at each backbone node. The database only changed twice a week, early on Tuesday and Friday mornings I believe.
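To make the shape of that database concrete, here's a purely illustrative sketch in Python - the node names and network numbers are made up, and the real thing was a hand-edited file with per-node metrics on the Merit side, not code:

    # Conceptually, the NSFnet policy database mapped each backbone node
    # to the set of (classful, pre-CIDR) networks it would accept.
    # Node names and networks below are invented for illustration.
    ACCEPTED_NETS = {
        "college-park": {"128.8.0.0", "192.5.214.0"},
        "ann-arbor":    {"35.0.0.0"},
    }

    def accept_route(node, net):
        """Would this backbone node install/announce this network?"""
        return net in ACCEPTED_NETS.get(node, set())

    # Anything not explicitly listed for a node was simply not accepted
    # there, which is why every change had to go through Merit (and only
    # took effect twice a week).
    assert accept_route("college-park", "128.8.0.0")
    assert not accept_route("college-park", "35.0.0.0")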
1989 and 1990 were probably as coherent, routing-wise, as things ever got, because soon after that is when the new backbones really started to sprout. Alternet and PSI both built nationwide commercial backbones, Sprint put in the international connections under the ICMnet program, NASA built NSI, DOE built ESnet, and FIX-East and FIX-West went in - that was in very late 1989/early 1990 (so I guess the period of there being a real "single backbone" was pretty short, like half of 1989 maybe, at least measured by when the clear bulk of the traffic was running over the NSFnet backbone). Also, CA*Net was going in and connecting to the US at 3 (?) spots, adding additional routing challenges.

In 1990 there was also clearly enough additional bandwidth, and enough sites connected to the IP networks, that both BITnet and USENET really cut over onto the Internet backbone in a big way, both adding heavily to the traffic load. "New" regionals like CICnet, CERFnet, NEARnet, Los Nettos, CONCERT, PREPnet, etc. were also going in during this time, typically with T1 backbones. The other regionals continued to grow, and the dual homing of sites that had an ARPAnet connection or an ESnet connection, etc. added to the confusion, since hardly anyone knew how to do dual homing, gated was complicated, there weren't enough of the right knobs on the ciscos and Proteons, and the routing table was growing fast (it hit 1000 entries sometime in 1990, I believe). At SURAnet we had a person working probably 2 days a week just trying to keep the routing-table stuff straight, since we were at least nominally responsible for FIX-East (people called us when it broke), there were lots of dual-homed sites, and there were the international connections coming in through UUnet, whose changes we had to submit to the Merit folks since we were the only ones designated to authorize changes to the NSFnet database for the College Park node.

In 1991 the DS3 NSFnet backbone started to go in, but it was more troublesome than the initial T1 backbone - plus more sites were added at MIT, Georgia Tech, and Argonne Labs, creating more interesting backbone meshing between ESnet and NSFnet (all three were ESnet sites too, or at least MIT and Argonne were), and more late-night phone calls and email to Cathy Wittbrodt and Mike Collins and Tony Hain and the Merit folks trying to keep things working.

The insanity has continued through the creation of MAE-East, and now the other current NAPs and other interconnects in 1992 and since then, along with the building of new backbones - initially SprintLink, more recently internetMCI and Net99/AGIS - and of course the emergence of ANS from the IBM/MCI/Merit group (which actually happened in 1990, I believe, but that's another story altogether, with its own biases and viewpoints and perspectives; refer to the vast com-priv archives of the day if you have the time and stomach for it :-).

The backbone has continued to grow in number of interconnections and in trunk bandwidth (raw pipe speed, interface speed, buffering), with denser meshing, more routes, more international links, and always more complexity. In general it has continued to work, in some rough sense of the word "work", primarily due to a lot of effort by never quite enough individuals who understood how things plugged together and were somehow able to keep things running while educating those bright and willing (and crazy) individuals who were ready to step up to the task of helping out (unfortunately there were never enough of them, and there still aren't, it seems).

I probably left out a lot of stuff, either due to my faulty memory or ignorance of the facts, or simply my limited perspective on the situation; feel free to provide your own facts as you desire. Anyway, as I kind of said in the beginning: there have pretty much always been multiple backbones, routing has pretty much always been screwed up in one way or another, and traffic loads have pretty much always been on the verge of melting the net (sometimes succeeding in doing so).

have a nice day.

dave