All,

I am chairing an effort in the IEEE 802.3 Ethernet Working Group to understand bandwidth demand and how it will impact future Ethernet needs. This is exactly the type of discussion I would like to see shared with this activity. I would appreciate follow-on conversations with anyone wishing to share their observations.

Regards,
John D'Ambrosia
Chair, IEEE 802.3 New Ethernet Applications Ad hoc

-----Original Message-----
From: NANOG <nanog-bounces@nanog.org> On Behalf Of James Bensley
Sent: Thursday, April 4, 2019 4:41 AM
To: Tom Ammon <thomasammon@gmail.com>; NANOG <nanog@nanog.org>
Subject: Re: modeling residential subscriber bandwidth demand

On Tue, 2 Apr 2019 at 17:57, Tom Ammon <thomasammon@gmail.com> wrote:
How do people model and try to project residential subscriber bandwidth demands into the future? Do you base it primarily on historical data? Are there more sophisticated approaches that you use to figure out how much backbone bandwidth you need to build to keep your eyeballs happy?
Netflow for historical data is great, but I guess what I am really asking is - how do you anticipate the load that your eyeballs are going to bring to your network, especially in the face of transport tweaks such as QUIC and TCP BBR?
Tom
Hi Tom,

Historical data is definitely the way to predict a trend. You can’t call something a trend if it only started today IMO; something (e.g. bandwidth profiling) needs to have been recorded for a while before you can say that you are trying to predict the trend. Without historical data you're just making predictions without any direction, which I don't think you want :)

Assuming you have a good mixture of subs (i.e. adults, children, male, female, different regions etc.) and 100% of your subs aren't a single demographic, like university campuses for example, then I don't think you need to worry about specifics like the adoption of QUIC or BBR. You will never see a permanent AND massive increase in your total aggregate network utilisation from one day to the next. If, for example, a large CDN makes a change that increases per-user bandwidth requirements, it's unlikely they are going to deploy it globally in one single big-bang change. That CDN would also be just one of your major bandwidth sources/destinations, of which you'll likely have several big-hitters that make up the bulk of your traffic.

If you have planned well so far, and have plenty of spare capacity (as others have mentioned, in the 50-70% range) and your backhaul/peering/transit links are of a reasonable size ratio to your subs (e.g. subs get 10-20Mbps services and your links are 1Gbps), there should be no persisting risk to your network capacity as long as you keep following the same upgrade trajectory. Major social events like the Super Bowl where you are (or here in England, sunshine) will cause exceptional traffic increases, but only for brief periods.

You haven't mentioned exactly what you're doing for modelling capacity demand (assuming that you wanted feedback on it)?
Assuming all of the above is true for you, which gives us a reasonable foundation to build on: in my experience the standard method is to record your ingress traffic rate at all your PEs or P&T nodes, and essentially divide this by the number of subs you have (egress is important too, it's just usually negligible in comparison). For example, if your ASN has a total average ingress traffic rate of 1Gbps during peak hours and you have 10,000 subs, you can model on, say, 0.1Mbps per sub. That’s actually a crazily low figure these days, but it’s just a fictional example to demonstrate the calculation.

The ideal scenario is that you have this info going back as far as you can. Also, the more subs you have, the better it all averages out. For business ISPs, bringing on 1 new customer can make a major difference; if it’s a 100Gbps end-site and your backbone is a single 100Gbps link, you could be in trouble. For residential services, subs almost always have slower links than your backbone/P&T/PE nodes.

If you have different types of subs it’s also worth breaking down the stats by sub type. For example, we have ADSL subs and VDSL subs. We record the egress traffic rate on the BNGs towards each type of sub separately and then aggregate across all BNGs. For example, today peak inbound for our ASN was X; of that X, Y went to ADSL subs and Z went to VDSL subs. Y / $number_of_adsl_subs == peak average for an ADSL line and Z / $number_of_vdsl_subs == peak average for a VDSL line.

It’s good to know this difference because a sub migrating from ADSL to VDSL is not the same as getting a new sub in terms of additional traffic growth. We have a lot of users upgrading to VDSL, which makes a difference at scale, e.g. 10k upgrades is less additional traffic than 10k new subs. Rinse and repeat for your other customer types (FTTP/H, wireless etc.).
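The per-sub arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the ADSL/VDSL traffic and subscriber figures are made up for the example, in the same spirit as the fictional 1Gbps / 10,000 subs case.

```python
def peak_avg_mbps_per_sub(peak_mbps: float, subs: int) -> float:
    """Peak traffic rate divided by the number of subs it was delivered to."""
    if subs <= 0:
        raise ValueError("subs must be a positive integer")
    return peak_mbps / subs


def added_peak_mbps(per_sub_mbps: dict, new_vdsl_subs: int = 0,
                    adsl_to_vdsl_upgrades: int = 0) -> float:
    """Extra peak traffic expected from new VDSL subs and ADSL->VDSL upgrades.

    A new sub adds a full VDSL per-sub average; an upgrade only adds the
    difference between the VDSL and ADSL per-sub averages.
    """
    vdsl = per_sub_mbps["vdsl"]
    adsl = per_sub_mbps["adsl"]
    return new_vdsl_subs * vdsl + adsl_to_vdsl_upgrades * (vdsl - adsl)


# The fictional example from the text: 1Gbps at peak across 10,000 subs.
whole_asn = peak_avg_mbps_per_sub(1000, 10_000)

# Hypothetical split of today's peak (X = Y + Z) across the two sub types.
per_sub = {
    "adsl": peak_avg_mbps_per_sub(300, 6_000),  # Y / $number_of_adsl_subs
    "vdsl": peak_avg_mbps_per_sub(700, 4_000),  # Z / $number_of_vdsl_subs
}

upgrades = added_peak_mbps(per_sub, adsl_to_vdsl_upgrades=10_000)
new_subs = added_peak_mbps(per_sub, new_vdsl_subs=10_000)
print(f"per sub (whole ASN): {whole_asn:.3f} Mbps")
print(f"10k upgrades add {upgrades:.0f} Mbps, 10k new subs add {new_subs:.0f} Mbps")
```

With these made-up inputs the upgrades add less peak traffic than the same number of new subs, which is the distinction being drawn above.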
On Tue, Apr 2, 2019 at 2:20 PM Josh Luthman <josh@imaginenetworksllc.com> wrote:
We have GB/mo figures for our customers for every month for the last ~10 years. Is there some simple figure you're looking for? I can tell you off hand that I remember we had accounts doing ~15 GB/mo and now we've got 1500 GB/mo at similar rates per month.
I'm mostly just wondering what others do for this kind of planning - trying to look outside of my own experience, so I don't miss something obvious. That growth in total transfer that you mention is interesting.
You need to be careful with volumetric based usage figures. As links continuously increase in speed over the years, users can transfer the same amount of data in less bit-time. The problem with polling at any interval (be it 1 second or 15 minutes) is that you miss bursts in between the polls. Volumetric based accounting misses the link utilisation, which is how congestion is identified. You must measure utilisation and divide that by $number_of_subs. Your links can be congested, and if you only measure by data volume transferred, you’ll see that month by month subs transferred the same amount of data overall, but day by day, hour by hour, it took longer because a link somewhere is congested, and everyone is pissed off. So with faster end-user speeds, you may see shorter but higher peaks in core link utilisation.
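To make the volume-versus-utilisation point concrete, here is a rough Python sketch. The byte counts, the 5-minute polling interval and the 1Gbps link are all invented for the example, and each figure is still only an average over its polling interval, so shorter bursts inside an interval stay invisible.

```python
def utilisation(bytes_in_interval: float, interval_s: float, link_bps: float) -> float:
    """Average link utilisation (0.0 to 1.0) over one polling interval."""
    return (bytes_in_interval * 8) / (interval_s * link_bps)


INTERVAL_S = 300   # hypothetical 5-minute polling interval
LINK_BPS = 1e9     # hypothetical 1Gbps link

# Two invented traces of bytes transferred per interval, same total volume.
flat_trace = [15e9, 15e9, 15e9, 15e9]
bursty_trace = [5e9, 5e9, 35e9, 15e9]

flat_peak = max(utilisation(b, INTERVAL_S, LINK_BPS) for b in flat_trace)
bursty_peak = max(utilisation(b, INTERVAL_S, LINK_BPS) for b in bursty_trace)

print(f"total volume: {sum(flat_trace):.0f} vs {sum(bursty_trace):.0f} bytes")
print(f"peak utilisation: {flat_peak:.0%} vs {bursty_peak:.0%}")
```

Both traces report the same volume, but only the per-interval utilisation shows that the second one pushed the link close to full.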
I always wonder what the value of trying to predict utilization is anyway, especially since bandwidth is so cheap. But I figure it can't hurt to ask a group of people where I am highly likely to find somebody smarter than I am :-)
The main requirement in my opinion is upgrades. You need to know how long a link upgrade takes for your operational teams, or a device upgrade etc. If it takes 2 months to deliver a new backhaul link to a regional PoP, call it 3 months to allow for wayleaves, sick engineers, DC access failures, etc. Then make sure you trigger a backhaul upgrade when your growth model says you’re 3-4 months away from 70% utilisation (or whatever figure suits your operations and customers).

Cheers,
James.

P.S. Sorry for the epistle.
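The upgrade trigger described above could be sketched roughly as follows in Python. The compound monthly growth model, the 5% growth rate and the function names are assumptions made for this illustration, not something stated in the thread; substitute whatever your own growth model produces.

```python
import math


def months_until_threshold(current_util: float, threshold: float,
                           monthly_growth: float) -> float:
    """Months until peak utilisation reaches the threshold.

    Assumes compound growth of `monthly_growth` per month (e.g. 0.05 for 5%),
    which is a modelling assumption for this sketch.
    """
    if current_util >= threshold:
        return 0.0
    if current_util <= 0 or monthly_growth <= 0:
        return math.inf  # no traffic or no growth: threshold is never reached
    return math.log(threshold / current_util) / math.log(1 + monthly_growth)


def should_trigger_upgrade(current_util: float, threshold: float,
                           monthly_growth: float, lead_time_months: float) -> bool:
    """True once the threshold is no further away than the padded lead time."""
    return months_until_threshold(current_util, threshold, monthly_growth) <= lead_time_months


# Hypothetical link: 70% threshold, 5% monthly growth, 4 months padded lead time.
for util in (0.55, 0.60):
    months = months_until_threshold(util, 0.70, 0.05)
    trigger = should_trigger_upgrade(util, 0.70, 0.05, lead_time_months=4)
    print(f"peak util {util:.0%}: ~{months:.1f} months to 70%, trigger={trigger}")
```

Under these assumed numbers a link peaking at 55% is still just under 5 months away from the threshold, while one peaking at 60% is already inside the lead time and should have its upgrade ordered.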