Alternative to NetFlow for Measuring Traffic Flows
Hi all -

Here is the problem: Everyone wants to know how much traffic would ultimately be passed in peering relationships at an IX before signing up for or building into that IX.

I recently heard an interesting solution for estimating the traffic volume destined to an AS in the absence of NetFlow or the like. Measuring traffic via sampling has been difficult for a variety of reasons (lack of staff resources, capabilities of the interface cards, expensive software, etc.), and I found from talking to Peering Coordinators that fewer than 1 in 20 ISPs actually do the traffic measurements. In the absence of data, ISPs are often left to intuition, guessing that a particular AS would be a good peering candidate.

Here is the solution. Assuming that:
1) You are multi-homed
2) You have some idea of who you would want to peer with
3) <more assumptions here I'm sure>

then:
1) You adjust routing to prefer one transit provider or the other for the AS
2) Shift traffic to the particular AS from one transit provider to the other, noting the change in the loads on the transit providers.

If you do this at peak time you can get a rough estimate of the peak traffic to this AS, and therefore a rough order-of-magnitude estimate of the amount of traffic that would go to this AS in a peering relationship. (Rough estimate means determining whether the traffic volume is likely to be 2 Mbps vs. 20 Mbps vs. 200 Mbps.)

Interesting idea. Comments?

The other approach some ISPs use is to set up a "trial" peering session, usually using a private cross connect, to measure the traffic volume and relative traffic ratios. Then both sides can get an idea of the traffic before engaging in a contractual Settlement-Free Peering relationship.

Bill
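The arithmetic behind the shift method can be sketched in a few lines of Python. This is only an illustration: the function names and the load figures in the usage note are hypothetical, and it assumes nothing else changes on either transit link while the traffic is being moved.

```python
import math


def estimate_shifted_mbps(a_before, a_after, b_before, b_after):
    """Estimate traffic to the target AS from the load change on two transits.

    Traffic to the AS is moved from transit A to transit B, so A should drop
    and B should rise by roughly the same amount. The two deltas are averaged
    to smooth out unrelated drift (an assumption of this sketch).
    """
    drop_on_a = a_before - a_after
    rise_on_b = b_after - b_before
    return (drop_on_a + rise_on_b) / 2.0


def order_of_magnitude(mbps):
    """Round an estimate to the nearest power of ten (1, 10, 100 Mbps...)."""
    if mbps <= 0:
        return 0
    return 10 ** round(math.log10(mbps))
```

For example, if transit A falls from 450 to 432 Mbps and transit B rises from 300 to 317 Mbps at peak, the estimate is 17.5 Mbps, which is in the "tens of Mbps" bucket rather than the 2 Mbps or 200 Mbps one.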
Hi Bill, Impressive numbers, but of course, slackers aside, if it were your connection and resources, wouldn't you want more accurate information than just a guess? This may be effective for an IX decision if you created some sort of map based on ALL the ASNs of the people on the peering switch, but in most cases anyone pushing any real traffic will probably not have fine-grained enough samples to determine a peering relationship based on a single AS with this method. Maybe I'm wrong, but hey, if you are taking 200 megs from any one ASN I would hope you knew about it.
Interesting idea. Comments?
Again, it seems too iffy. What if you get a short DoS when you shift an ASN? How much of a chump will you look like when you need that peer to be 1 Gbps and you hook up and it's only pulling 2 Mbps?
The other approach some ISPs use is to set up a "trial" peering session, usually using a private cross connect to measure the traffic volume and relative traffic ratios. Then both side can get an idea of the traffic before engaging in a contractual Settlement-Free Peering relationship.
I like this one the best if I didn't have NetFlow stats. However, I doubt everyone will allow this because of time, money, resources, security, etc. I tend to look at peering as something you need to know when to do because the data tells you so. In this industry as it stands now, why would you NOT run NetFlow stats to give you this information? All you are doing is wasting more money paying for transit that could be offloaded to peering. And the flipside is also true: why even worry about peering if you can't get more than a meg or two max to each AS?
On Mon, Dec 16, 2002 at 09:16:55PM -0500, K. Scott Bethke wrote:
based on ALL the ASN's of the people on the peering switch.. but in most cases anyone pushing any real traffic will probably not have fine grained samples enough to determine a peering relationship based on a single AS with this method. Maybe Im wrong but hey if you are taking 200megs from any one ASN I would hope you knew about it.
Also, that method has the same "knowing the routes" problem as NetFlow. Wherever you are getting your list of routes matching "ASN.*", there is pretty much no way it is accurate (for an ASN of ANY size). You would have to statically route (or otherwise inject routes with a specific nexthop) a list of their customer prefixes, and that list would have to be manually transmitted.
--
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
On Monday, Dec 16, 2002, at 22:28 Canada/Eastern, Richard A Steenbergen wrote:
On Mon, Dec 16, 2002 at 09:16:55PM -0500, K. Scott Bethke wrote:
based on ALL the ASN's of the people on the peering switch.. but in most cases anyone pushing any real traffic will probably not have fine grained samples enough to determine a peering relationship based on a single AS with this method. Maybe Im wrong but hey if you are taking 200megs from any one ASN I would hope you knew about it.
Also, that method has the same "knowing the routes" problem as netflow. Whereever you are getting your list of ASN's route ASN.*"'s routes, there is pretty much no way they are accurate (for an ASN of ANY size).
You would have to statically route (or otherwise inject routes with a specific nexthop) a list of their customer prefixes that would have to be manually transmitted.
If you are interested in traffic *to* a particular destination, surely you could just tweak localpref on routes based on an as-path filter? If you are interested in traffic *from* a particular destination (you have a network full of eyes, not content) then this approach is not useful anyway. Joe
On Monday 16 December 2002 07:37 pm, Joe Abley wrote:
If you are interested in traffic *to* a particular destination, surely you could just tweak localpref on routes based on an as-path filter?
And then quantify it how? I.e., useful NetFlow-like "x Mbps to AS x, y Mbps to AS y" statistics?
--
Grant A. Kirkwood - grant(at)tnarg.org
Fingerprint = D337 48C4 4D00 232D 3444 1D5D 27F6 055A BF0C 4AED
On Monday, Dec 16, 2002, at 22:47 Canada/Eastern, Grant A. Kirkwood wrote:
On Monday 16 December 2002 07:37 pm, Joe Abley wrote:
If you are interested in traffic *to* a particular destination, surely you could just tweak localpref on routes based on an as-path filter?
And then quantify it how? Ie; useful Netflow-like "x Mbps to AS x, y Mbps to AS y" statistics?
I think the idea was to say "well, from the mrtg graph, the difference between this circuit with all my _9327_ traffic and this circuit without any _9327_ traffic, at what I might reasonably estimate their peak time to be, looks to be about 2 megs or so". It's a pretty crude measure, but it does have the advantage of requiring no more than mrtg and a route-map to set up. Joe
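To see which routes an as-path filter such as _9327_ would actually catch before touching a route-map, the match can be approximated offline. The sketch below is an assumption-laden illustration in Python: it treats "_" the way Cisco-style as-path regular expressions do (start of string, end of string, space, comma, braces or parentheses), and the route list and helper names are hypothetical.

```python
import re

# In Cisco-style as-path regexes "_" matches a delimiter: start or end of
# the path, a space, a comma, or a brace/parenthesis.
_DELIM = r"(?:^|$|[ ,{}()])"


def aspath_regex_to_python(expr):
    """Translate an as-path expression like '_9327_' to a compiled Python regex."""
    return re.compile(expr.replace("_", _DELIM))


def routes_via_as(routes, expr):
    """Return the prefixes whose AS path matches the as-path expression.

    routes: iterable of (prefix, as_path_string) pairs, e.g. parsed from a
    BGP table dump (the parsing itself is not shown here).
    """
    pattern = aspath_regex_to_python(expr)
    return [prefix for prefix, as_path in routes if pattern.search(as_path)]
```

With a table containing the paths "701 9327 4000", "701 19327" and "9327", only the first and last match _9327_; AS 19327 does not, which is the point of the underscores.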
On Mon, 16 Dec 2002, Joe Abley wrote:
I think the idea was to say "well, from the mrtg graph, the difference between this circuit with all my _9327_ traffic and this circuit without any _9327_ traffic, at what I might reasonably estimate their peak time to be, looks to be about 2 megs or so".
It's a pretty crude measure, but it does have the advantage of requiring no more than mrtg and a route-map to set up.
It is also useful as a supplement to NetFlow statistics, as a sort of verification of your flow data. Sometimes, due to design, operating conditions, etc., NetFlow data is not the most reliable and/or meaningful.

As an example:

You run two main types of border router platforms. On one platform you must sample NetFlow at 1% due to performance limitations. On the other platform there is no sampling functionality built into the software. This creates an immediate skew in the data, unless software is created to sample the flows coming off the second platform.

Now take into account that your traffic is mainly outbound from your network, which means that you need to ignore vendor best practice and enable flow caching on your core (internal) facing interfaces to measure the traffic flowing out of your network.

So, in order to get any kind of traffic statistics for a peer, you've got to spend many hours distilling data manually and doing AS aggregations, and you create a possibly unstable networking environment.

No big deal, right?

It may be crude, but sometimes it can be the most reliable _available_ method to tell how much traffic is going to the ISP and the ISP's customers.

Joe
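The skew between a sampled and an unsampled platform can be removed by scaling each record by its platform's sampling interval before aggregating per AS. A minimal sketch in Python, assuming flow records have already been reduced to (platform, destination AS, bytes) tuples; the platform labels, record layout and function name are hypothetical and not tied to any particular collector.

```python
from collections import defaultdict


def bytes_per_as(records, sampling_interval):
    """Aggregate flow bytes per destination AS, correcting for sampling.

    records: iterable of (platform, dst_as, sampled_bytes)
    sampling_interval: dict mapping platform -> N for 1-in-N sampling
                       (100 for 1% sampling, 1 for an unsampled platform)
    """
    totals = defaultdict(int)
    for platform, dst_as, sampled_bytes in records:
        # Each sampled byte stands for N bytes on the wire, on average.
        totals[dst_as] += sampled_bytes * sampling_interval[platform]
    return dict(totals)
```

The scaled figures are still statistical estimates for the sampled platform, so small destinations will be noisier there than on the unsampled one.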
Quantifiable Proof and "Peering Profiles"...see below. At 08:53 PM 12/16/2002 -0800, Joe Wood wrote:
On Mon, 16 Dec 2002, Joe Abley wrote:
I think the idea was to say "well, from the mrtg graph, the difference between this circuit with all my _9327_ traffic and this circuit without any _9327_ traffic, at what I might reasonably estimate their peak time to be, looks to be about 2 megs or so".
It's a pretty crude measure, but it does have the advantage of requiring no more than mrtg and a route-map to set up.
Right, it is crude, but in an economy where business decisions require "Quantifiable *Proof*", this is quantifiable and easy to do. Some Peering Coordinators are now putting together business plans for peering at the IX that include the number of Mbps of peering traffic and e-mail confirmation from the peers at the IX that they will indeed peer with them there. Smart customers; if they exceed the breakeven point then peering makes sense. A lot more work up front than it used to be.
It is also useful as a supplement to netflow statistics, as sort of a verification to your flow data. Sometimes due to design, operating conditions, etc netflow data is not always the most reliable and/or meaningful.
As an example:
You run two main types of border router platforms. On one platform you must sample netflow @ 1% due to performance limitations. On the other platform there is no sampling functionality built into the software. This creates an immediate skew of data, unless software is created to sample the flows coming off the second platform.
Now take into account that your traffic is mainly outbound from your network, which means that you need to ignore vendor best practice and enable flow caching on your core (internal) facing interfaces to measure the traffic flowing out of your network.
So, in order for you to get any kind of traffic statistics for a peer, you've got to spend many hours distilling data manually, doing AS aggregations, and create a possibly unstable networking environment.
No big deal, right?
It may be crude, but sometimes it can be the most reliable _available_ method to tell how much traffic is going to the ISP and ISP customers.
Joe is absolutely right here, and this still represents a common scenario and problem for the peering community.

Another approach I have been thinking about is to generate "Peering Profiles" for the community. Here is how it works. Let's say I work with a few Internet Gaming companies and find that the NetFlow stats show a certain pattern, or profile, of traffic destinations. Maybe I find that:

2% to Cox
3% to Shaw
2% to Comcast
5% to Roadrunner
2% to Adelphia

and the next top 20 ASes represent the next 10% of traffic.

Anonymized, this "Peering Profile" for Internet Gaming companies can probably be applied to other Internet Gaming companies and can provide a rough idea of good targets for peering and how much traffic can be expected at a peering point, as a percentage of their total traffic. Empirically, these top traffic destinations and volumes have been large enough, tens of Mbps each, generally more than enough to justify peering at an IX where the breakeven point is 10-30 Mbps. The design of the tool/template is pretty obvious from there.

Side note: See all the trouble we go through because traffic flow measurement is still non-trivial? If the NetFlow data is available at ingress/egress points, I was pointed to http://ehnt.sourceforge.net/ as a good freeware tool for evaluating and translating the raw NetFlow data.

Bill
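A rough sketch of how such a profile might be applied, in Python. The percentages are the illustrative ones from the message above; the 1000 Mbps total, the choice of 30 Mbps as the breakeven (the top of the 10-30 Mbps range) and the function names are assumptions made only for the example.

```python
def expected_mbps(profile_percent, total_mbps):
    """Apply a peering profile (destination -> percent of traffic) to a total."""
    return {dest: total_mbps * pct / 100 for dest, pct in profile_percent.items()}


def worthwhile_peers(profile_percent, total_mbps, breakeven_mbps):
    """List destinations whose expected traffic meets the breakeven point."""
    estimates = expected_mbps(profile_percent, total_mbps)
    return sorted(dest for dest, mbps in estimates.items() if mbps >= breakeven_mbps)


# Illustrative gaming-company profile from the message above.
GAMING_PROFILE = {"Cox": 2, "Shaw": 3, "Comcast": 2, "Roadrunner": 5, "Adelphia": 2}
```

For a network pushing 1000 Mbps in total, this profile suggests about 50 Mbps to Roadrunner and 30 Mbps to Shaw, so those two clear a 30 Mbps breakeven while the 20 Mbps destinations would only clear a lower one.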
On Mon, Dec 16, 2002 at 10:53:26PM -0800, William B. Norton wrote:
Right, it is crude, but in an economy where business decisions require "Quantifiable *Proof*", this is quantifiable and easy to do. Some Peering Coordinators are putting together business plans now for peering at the IX that includes the #'s of Mbps of peering traffic, and e-mail confirmation from the peers at the IX that they will indeed peer with them at the IX. Smart customers; if they exceed the breakeven point then peering makes sense. A lot more work up front than it used to be.
Business decisions surrounding bandwidth and connectivity are still in the early years of development. A few highly technical people understand them. Unfortunately, most of those people do not also know how financial ROI works, or how to define a project or strategy that meets business hurdle rates.

I personally feel that this is a gap in how Internet marketing started identifying measurements such as `click-throughs' and application-oriented numbers to fulfill their statistical needs for building partners and identifying their relationship to their market. Clearly, just by looking at the names they chose for things (`click-throughs'?), they had no idea what they were doing, and likely were also not willing to listen to their technical counterparts. Engineers don't know anything about business partnerships and relationships to their market, right?

As a business unit manager responsible for Internet connectivity, one is obligated to look at the company's WACC and discount/hurdle rates and determine the value of future returns. Find out the WACC and marginal tax rates of your company. Find a financial controller, or someone who manages finance, and locate information on how capital expenditures are evaluated and depreciated (how long?). Find out what metrics they want and need for technology expenditures that involve both capex and opex in the same budget/project. What are the business expectations?
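One common way to express "the value of future returns" against a hurdle rate is a net present value calculation. The Python sketch below is a generic NPV formula, not anything specific to this thread; the cash-flow figures in the usage note are invented for illustration, and a real evaluation would use whatever conventions your finance group applies.

```python
def npv(rate_per_period, cash_flows):
    """Net present value of cash flows, one per period, starting at period 0.

    rate_per_period: discount (hurdle) rate for one period, e.g. 0.10 for 10%
    cash_flows: list where index 0 is the up-front amount (usually negative
                capex) and later entries are net savings per period
    """
    return sum(cf / (1 + rate_per_period) ** t for t, cf in enumerate(cash_flows))


def clears_hurdle(rate_per_period, cash_flows):
    """A project clears the hurdle rate when its NPV is not negative."""
    return npv(rate_per_period, cash_flows) >= 0
```

For instance, 20000 of up-front equipment followed by three yearly transit savings of 9000 has a positive NPV at a 10% hurdle rate, but a negative one at 20%.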
Another approach I have been thinking about is to generate "Peering Profiles" for the community...here is how it works. Let's say I work with a few Internet Gaming companies and find that the netflow stats show a certain pattern, or profile of traffic destinations. Maybe I find that 2% to Cox 3% to Shaw 2% to Comcast 5% to Roadrunner 2% to Adelphia and the next top 20 ASes represent the next 10% of traffic.
Anonymized, this "Peering Profile" for Internet Gaming companies can probably be applied to other Internet Gaming companies and can provide a rough idea of good targets for peering and how much traffic can be expected at a peering point, as a percentage of their total traffic. Empirically, these top traffic destinations and volumes have been large enough, 10's of Mbps each, generally more than enough to justify peering a an IX where the breakeven point is 10-30Mbps. The design of the tool/template is pretty obvious from there.
Bill, I fully agree with your methods and think they are wonderful. If there is any quantifiable proof, it needs to be identified and acted on. Get some numbers, any numbers. Take some tcpdump samples, or even load a permit ACL on your routers to determine estimates.

What is the problem you are trying to solve? My answer: an internal business unit agreement that a minimum percentage of costs can be reduced through peering. Do this in any way that you can.

Why is this so difficult? Probably because the disagreeing parties aren't talking the same language. I also feel that in many businesses, technology (such as bandwidth) is not taken seriously. Prothat include capex+opex to a variety of vendors (creating vendor dependence) with new "extra" routers (equipment), and seemingly costly exchange point "extra" connectivity, with "extra" racks and power requirements with monthly re-occurring charges - well that's just not intuitively cost-effective, now is it?

Ask the right questions. If managers do not want to do peering because it doesn't meet some marketing or partnership requirement, then take some different angles. If managers don't believe that peering can actually save money, come up with the numbers and the financial language they are used to.

If you are told not to come up with technology estimates, or simply can't because you don't have the time, then consider using someone else's work and time (like Bill Norton's). Augment or replace estimates (and even really accurate NetFlow data) with externally researched estimates and national or global averages.

BTW: Yes, I believe NetFlow is an excellent tool, and I use it myself for determining who would make a good peer (and ras is correct that using an external routing table along with the NetFlow data improves this even further). NetFlow is easy. Set it up and use it, or get the needed help from your vendor. If you determine that NetFlow is not easy for your environment, then use something else.

dre
On Tue, Dec 17, 2002 at 11:25:58AM -0800, dre wrote:
Prothat include capex+opex to a variety of vendors (creating vendor dependence) with new "extra" routers (equipment), and seemingly costly exchange point "extra" connectivity, with "extra" racks and power requirements with monthly re-occurring charges - well that's just not intuitively cost-effective, now is it?
s/Prothat/Projects that/ stupid vi. ;> dre
On Monday 16 December 2002 07:37 pm, Joe Abley wrote:
If you are interested in traffic *to* a particular destination, surely you could just tweak localpref on routes based on an as-path filter?
And then quantify it how? Ie; useful Netflow-like "x Mbps to AS x, y Mbps to AS y" statistics?
My total traffic is Z. My traffic to AS X is Px%. My traffic to AS Y is Py%. Py is 70x Px. I therefore should attempt to interconnect with Y. Alex
based on ALL the ASN's of the people on the peering switch.. but in most cases anyone pushing any real traffic will probably not have fine grained samples enough to determine a peering relationship based on a single AS with this method. Maybe Im wrong but hey if you are taking 200megs from any one ASN I would hope you knew about it.
Also, that method has the same "knowing the routes" problem as netflow. Whereever you are getting your list of ASN's route ASN.*"'s routes, there is pretty much no way they are accurate (for an ASN of ANY size).
The vast majority of the routes will be an intersection of routes announced by the AS to other AS (including looking glasses). Alex
based on ALL the ASN's of the people on the peering switch.. but in most cases anyone pushing any real traffic will probably not have fine grained samples enough to determine a peering relationship based on a single AS with this method. Maybe Im wrong but hey if you are taking 200megs from any one ASN I would hope you knew about it.
Also, that method has the same "knowing the routes" problem as netflow. Whereever you are getting your list of ASN's route ASN.*"'s routes, there is pretty much no way they are accurate (for an ASN of ANY size).
The vast majority of the routes will be an intersection of routes announced by the AS to other AS (including looking glasses).
Oops, this should be read as "by the AS to other ASes (including the data you can pull from looking glasses)." Alex
On Mon, Dec 16, 2002 at 11:41:12PM -0500, alex@yuriev.com wrote:
Also, that method has the same "knowing the routes" problem as netflow. Whereever you are getting your list of ASN's route ASN.*"'s routes, there is pretty much no way they are accurate (for an ASN of ANY size).
The vast majority of the routes will be an intersection of routes announced by the AS to other AS (including looking glasses).
Assume you are provider A, and you are considering peering with provider B. Assume Provider B has customer Z, who buys transit from Provider B and Provider C. Assume you already peer with provider C. You have no way to know if customer Z will be part of your routes to Provider B, or if you will prefer them over provider C, without having the route list. This is a very common situation if you have any decent amount of peering, and/or if you are considering peering with a provider who has any reasonable number of multihomed customers. As we've already proved in previous nanog emails, the top 20 route-announcing providers added together have enough routes to cover the internet around 8 times over. Even looking glasses may not contain all the paths available. Projecting actual IP traffic onto actual IP routes is the only way to do it. -- Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
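The projection described here amounts to a longest-prefix match of each flow's destination against the routing table you would have with the candidate peer in place. A minimal Python sketch using the standard ipaddress module; the prefixes, peer labels and the "unrouted" bucket are hypothetical, and a linear scan is used for clarity where a real tool would use a trie over a full table.

```python
import ipaddress
from collections import defaultdict


def project_flows(flows, routes):
    """Attribute flow bytes to the peer owning the longest matching prefix.

    flows:  iterable of (destination_ip, bytes)
    routes: iterable of (prefix, peer) for the table being evaluated
    """
    table = [(ipaddress.ip_network(prefix), peer) for prefix, peer in routes]
    totals = defaultdict(int)
    for dst, nbytes in flows:
        addr = ipaddress.ip_address(dst)
        matches = [(net.prefixlen, peer) for net, peer in table if addr in net]
        if matches:
            # The most specific route wins, as it would in forwarding.
            peer = max(matches, key=lambda m: m[0])[1]
            totals[peer] += nbytes
        else:
            totals["unrouted"] += nbytes
    return dict(totals)
```

In the customer Z scenario, if provider C announces a more specific /25 out of a /24 that provider B also announces, the traffic to that /25 lands on C and never shows up on the new session with B, which is why an AS-level estimate can overstate what B would carry.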
Also, that method has the same "knowing the routes" problem as netflow. Whereever you are getting your list of ASN's route ASN.*"'s routes, there is pretty much no way they are accurate (for an ASN of ANY size).
The vast majority of the routes will be an intersection of routes announced by the AS to other AS (including looking glasses).
Assume you are provider A, and you are considering peering with provider B. Assume Provider B has customer Z, who buys transit from Provider B and Provider C. Assume you already peer with provider C.
You have no way to know if customer Z will be part of your routes to Provider B, or if you will prefer them over provider C, without having the route list.
This is a standard problem solved with set theory. Pick your set. Measure. Pick your set again, measure. Repeat N times. Decide which set of results you accept as more likely. Use them. Alex
This is a very common situation if you have any decent amount of peering, and/or if you are considering peering with a provider who has any reasonable number of multihomed customers. As we've already proved in previous nanog emails, the top 20 route-announcing providers added together have enough routes to cover the internet around 8 times over. Even looking glasses may not contain all the paths available.
Projecting actual IP traffic onto actual IP routes is the only way to do it.
At 09:16 PM 12/16/2002 -0500, K. Scott Bethke wrote:
Impressive numbers but of course, slackers aside, if it was your connection and resources wouldnt you want more accurate information than just a guess?
Yes, but I am also sympathetic to the challenges ISPs face in this economy, and to the challenges of large networks where there are so many ingress/egress points that getting sampling in place is problematic. I hear from some Tier 1 ISPs that in some cases sampling is not available because the NIC is too new or too old. In some cases there are simply too many points to measure, requiring too much disk, time, processing, etc. I have heard stories of those that process the data monthly and do so at great expense, with the occasional crash of the weekend jobs. Sometimes the quick and dirty approach is easier. While doing the research, it was surprising to find how many of the largest ISPs in the world don't or can't do the detailed traffic analysis. <snip>
Interesting idea. Comments?
Again it seems to iffy. What if you get a short DOS when you shift an ASN.. how much of a chump will you look like when you need that peer to be 1gbps and you hook up and its only pulling 2mpbs ?
Good point - another assumption (3): that the traffic follows a normal, predictable sinusoidal pattern, such that the peak for the target AS matches the peak of the rest of the traffic.
The other approach some ISPs use is to set up a "trial" peering session, usually using a private cross connect to measure the traffic volume and relative traffic ratios. Then both side can get an idea of the traffic before engaging in a contractual Settlement-Free Peering relationship.
I like this one the best if I didnt have Netflow stat's... however I doubt everyone will allow this because of time, money, resources, security, etc.
Yes, the Empirical Approach is the most accurate but, besides the cost of implementing the trial peering, there are examples of Tier 2 ISPs trying to game the trial with a Tier 1 ISP in order to obtain the peering relationship. I have heard stories of some pretty wacky routing and traffic engineering used to demonstrate during the trial that ratios and traffic volumes fell within a certain range. (The "Art of Peering" documents a few of these tactics.) I can understand why the Tier 1s are hesitant to do the trial peering even when they don't have the data to refute the "peering worthiness".
I tend to look at peering as something you need to know when to do because the data tells you so. In this industry as it stands now why would you NOT run netflow stats to give you this information? all you are doing is wasting more money paying for transit that could be offloaded to peering.
Me too, but differentiate between Tier 1 and Tier 2 solely for the motives; Tier 2's want to peer broadly to reduce transit fees, while Tier 1's by definition don't pay transit fees to anyone.
And the flipside is also true.. why even worry about peering if you cant get more than a meg or two max to each AS?
I have found peering to have additive value; a lot of 1-2 Mbps peering sessions can save as much money for you as a single large traffic peer. The more traffic, the stronger the case for peering. Bill
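The additive argument can be checked with simple arithmetic: sum the traffic offloaded across all sessions, price it at your transit rate, and subtract the fixed cost of being at the IX. The Python sketch below uses made-up prices and a hypothetical function name; it ignores burst billing and commit levels.

```python
def monthly_peering_savings(session_mbps, transit_price_per_mbps, ix_monthly_cost):
    """Net monthly saving from moving traffic off transit onto peering.

    session_mbps: list of expected Mbps per peering session
    transit_price_per_mbps: what a Mbps of transit costs per month
    ix_monthly_cost: fixed monthly cost of the IX presence (port, rack, etc.)
    """
    offloaded = sum(session_mbps)
    return offloaded * transit_price_per_mbps - ix_monthly_cost
```

With these assumptions, twenty 2 Mbps sessions and a single 40 Mbps session produce the same saving, since only the total offloaded volume enters the calculation.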
On Mon, Dec 16, 2002 at 05:46:10PM -0800, William B. Norton wrote:
1) You adjust routing to prefer one transit provider or the other for the AS 2) Shift traffic to the particular AS from one transit provider to the other, noting the change in the loads on the transit providers.
Ouch. Among other problems, you really have no idea what is peak time for that peer, which may be very different from your overall peak time. Is there some reason you don't have NetFlow available, or are you just trying to work around its known problems, such as not being able to get a true reading due to best-path issues?

I am quite happy with my method, using NetFlow export projected against an external routing table. It gives pretty accurate results, *IF* you have the correct external routing table. For two potential peers this is easy to get, but for other uses it is sometimes fairly difficult to get an accurate and current table of someone's customer routes. Unfortunately it seems many people view this as "NDAable information", and won't make it accessible via a route-views type service.
--
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
Hi all -
Here is the problem: Everyone wants to know how much traffic would ultimately be passed in peering relationships at an IX before signing up/building into an IX.
I heard an interesting solution recently to estimating the traffic volume destined to an AS in the absence of NetFlow or the like.
I have an amazingly simple proposition: as opposed to guesstimating the data and coming up with excuses for not using NetFlow, get NetFlow data for your own network. Alex
Participants (8): alex@yuriev.com, dre, Grant A. Kirkwood, Joe Abley, Joe Wood, K. Scott Bethke, Richard A Steenbergen, William B. Norton