Kent,

As a former network design guy who's done traffic engineering and design (and redesign) on many networks (Internet and otherwise), I disagree that traffic engineering doesn't work for the Internet.

I've seen many people go with "throw bandwidth at the problem" as a cure-all. While it tends to work, it tends to be the most expensive method of solving the problem. Doing traffic engineering right is hard. The telcos have it down pat for their voice networks, and telco-based ISPs often have applied this design expertise to their ISP network. Having a person do traffic engineering can save the ISP big bucks.

The traffic engineering techniques I'm talking about can't handle wildly dynamic situations. For example, a news event like Princess Di's death greatly increases traffic to/from England, which plays temporary havoc with forecasted traffic projections. However, outside of these anomalies, traffic projections work pretty well.

I've outlined a basic technique below which works for many types of networks, and has some ISP-specific steps. The key to this analysis is that it takes into account the underlying traffic flows and then determines the appropriate physical backbone topology, or the changes to be made to an existing topology. This is directly in contrast to the "throw bandwidth at the problem" case that patches a backbone topology which might be sub-optimal in the first place.

Here's the overall outline:

1. Divide your network into a small number of geographic areas (between five and ten). Each geographic area you choose probably has a large city that serves as a major traffic source for that area. These cities are usually the natural cities for backbone connectivity.

Create an NxN matrix, where N is the total number of areas in your network. Each cell in the matrix will represent the total traffic demand between one source/destination pair of geographic areas. There are several factors which affect this matrix, each of which will be discussed below.

   1. The locality of traffic.
   2. The typical utilization of customers.
   3. The entry/exit points of traffic from your network.

2. Identify which % of traffic, if any, has regional locality. For pure Internet traffic, the probability that the source and destination of traffic are within the same metropolitan area tends to be low (10% or lower for metros within the US). However, there are exceptions. Telecommuting applications tend to have very high locality. People close to work dial into work through an ISP, so both the source and destination of traffic tend to stay local (70% or higher).

Places like the Bay Area in California also tend to have higher traffic locality. This is because the Bay Area has lots of Internet users (which tend to be traffic sinks), and lots of web sites (which tend to be traffic sources). ISPs outside of the US tend to have a higher percentage of traffic staying within the country, especially in non-English-speaking countries.

3. Measure/estimate the typical utilization of customers. Utilization needs to be measured/estimated in both send and receive directions. Dial-up users typically receive almost seven times as much as they send. Corporate customers not doing telecommuting applications tend to receive about four times as much as they send (less because corporations have web sites that others access). Web server farms have the opposite characteristics of dial-up users.

Percentage utilization tends to increase with bandwidth. In the U.S., a T1 customer connection typically has a peak receive utilization of 20% or less. However, a DS3 customer can easily have a receive utilization of 50% or more. The simple explanation is that someone paying big bucks for a DS3 wants to make sure it is justified.

So, take the total number of users in each area, the connection speeds and customer types, multiply by the appropriate factors, and you get the total demand you are trying to serve out of each area.
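The per-area calculation just described can be sketched roughly as below. This is only an illustration, not the method from the original message: the function name area_demand and the customer counts are hypothetical, the 20% and 50% utilization figures and the four-to-one receive/send ratio are the rough numbers quoted above, and the link speeds are the standard T1 and DS3 line rates.

```python
# Line rates in Mbit/s (standard T1 and DS3 rates).
LINK_MBPS = {"T1": 1.544, "DS3": 44.736}

# Peak receive utilization per link type, using the rough figures from the text.
PEAK_RX_UTIL = {"T1": 0.20, "DS3": 0.50}

# Corporate customers receive about four times what they send (per the text).
CORP_RX_TO_TX = 4.0


def area_demand(customers, rx_to_tx=CORP_RX_TO_TX):
    """Return (receive, send) peak demand in Mbit/s for one geographic area.

    customers maps a link type ("T1" or "DS3") to a customer count.
    """
    # Receive demand: count x line rate x peak receive utilization, summed.
    rx = sum(count * LINK_MBPS[link] * PEAK_RX_UTIL[link]
             for link, count in customers.items())
    # Send demand is derived from the receive/send ratio.
    return rx, rx / rx_to_tx


# Hypothetical area with 100 T1 customers and 4 DS3 customers.
rx, tx = area_demand({"T1": 100, "DS3": 4})
print(f"receive {rx:.1f} Mbit/s, send {tx:.1f} Mbit/s")
```

A real estimate would add dial-up users and server farms with their own ratios, and would use measured utilization where it is available.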
Take this traffic demand, and multiply it by the fraction of traffic that is non-local. This represents the total traffic that you need to get either in/out of the network, or in/out of this particular part of your network.

4. Determine the entry/exit points for traffic with your network, and their effect upon the traffic matrix. How do you set up your routing policies? Many ISPs use nearest exit. If the nearest exit is in the same geographic area, the traffic sent by your customers does not affect any other part of the overall traffic matrix. If the nearest exit is not within the same geographic area, determine the area where this traffic will be sent. Enter this value in the appropriate source/destination box of the traffic matrix.

It gets harder when peering with many other ISPs, some of whom you connect to in the same area, and others in remote areas. In this case, determine which percentage of the traffic goes into each particular region, and

The main traffic sources into your network (excluding your customers) are your peering points (both public and private). The amount of traffic from each peering point is measurable. You can generally estimate that this traffic is to be distributed proportional to the overall traffic demand in each geographic area.

This is a significant amount of matrix math, but the overall concept is simple. Determine the overall flow from one part of your network to another.

5. With me so far? Good, now it's time to design your backbone to handle your demands. You can use dedicated lines or layer two services such as Frame Relay or ATM. The simplicity of using Frame Relay or ATM is that the circuits you need between each pair of geographic areas have been defined by your traffic matrix. This is part of the appeal of using public L2 services for a backbone.

Designing your own backbone is a bit harder. The actual topology tends to be straightforward--you need to connect up the major cities in each of the geographic areas. For five areas, a simple ring suffices.
For up to 10 areas, this tends to be rings bisected once or twice.

The real work in designing your own backbone is in satisfying the traffic demands going across your network. Remember that geographic areas in the center of your network have to carry the traffic demands going across your network. This imposes a heavier burden in the center of the network than the traffic matrix would indicate. You also have to worry about resiliency, having sufficient bandwidth when the backhoes go fiber hunting, etc.

6. Design the network within each geographic area. The steps for designing the network within each geographic area tend to be similar to those for designing the overall network. Breaking the overall design process into a regional network and backbone network makes the problem more tractable.

7. Measure data from a real network. This is really important. You've made lots of assumptions. Regularly check the overall traffic to see if it matches the assumptions. Refine the traffic matrix to see if it still represents reality. Create trendlines which show the overall traffic changes to/from each area, and project these trendlines into the future. You will tend to have pretty good certainty about four months into the future, with the value of the information decaying after that.

Use this data to determine where to add additional peering points. Estimate what impact this would have on the traffic matrix.

8. Factor the measured and projected data into the next network backbone design. This next backbone design gives you the optimum backbone given the underlying flows in your network. See what changes you need to make to your backbone to get to this new optimum backbone, and order the circuits.

Phew! Like I said earlier, it is hard to do right, and I've left out quite a few details in the above outline. But having been there, done that (quite a few times), I can say it really works. And it saves ISPs money!

Question for NANOG members.
How important is traffic engineering given that it is fairly hard to do properly and you folks have enough other things to think about?

Prabhu Kavi
IP Business Marketing Manager
Ascend Communications
prabhu.kavi@ascend.com

______________________________ Reply Separator _________________________________
Subject: Chanukah [was Re: Hezbollah]
Author: "Kent W. England" <kwe@geo.net> at smtplink
Date: 9/16/97 2:09 PM

At 05:03 PM 14-09-97 -0400, Dorian R. Kim wrote:
... One of the things that needs to be engineered into building and maintaining national/international backbones is traffic accounting to an arbitrary granularity that paves the way for better traffic engineering and bandwidth projections. There are already ample tools to do a per-prefix matrix of traffic right now. Tying this in with good sales projections will alleviate much of the last-minute fire fighting.
This will most likely never be 100% accurate and precise, but there is no reason why we can't get a better handle on bandwidth forecasts (say, to the 95th percentile).
Dorian; I don't want to throw cold water on the value of planning and foresight, but in terms of predicting traffic patterns it has never worked on the Internet. It sounds good and that was the argument that all the mainframe networkers made to us early Internet networkers -- Why can't you tell me upfront what your bandwidth requirements are going to be? Don't you know exactly how many terminals you have and where they are and what application keystrokes are going to be pressed at any given time? How else can you guarantee response time in your network? This Internet stuff is stupid. It'll never work. Somehow with the way that HTTP/HTML caught fire and Internet-CB (aka VocalTec and CUSeeMe) took off, I would be loath to think I could project my backbone needs with any reliability based on *historical* projections.
Furthermore, with the deployment of WDM and Internet core devices moving closer to the transmission gear, if you have access to fiber, getting more bandwidth may become as straightforward as using an additional wavelength on the ADM that your router's plugged into.
-dorian
This I like a lot better as a design technique. Throwing more bandwidth at the problem almost always works (unless the transport protocol is broken). Like Peter Kline said, Turn up the speed dial upon onset of congestion. Simple. Effective. Then again, creating a data architecture for the web (a problem that has been recognized, but not addressed in the last five years) would eliminate much of the backbone bandwidth demand. What would happen if -- presto -- a data architecture for the web showed up one day? A lot of backbone bandwidth would become surplus and a lot more edge bandwidth would be needed ASAP. What does that do to historical projections? --Kent