sFlow vs netFlow/IPFIX

Todd Crane

28 Feb 2016

8:06 p.m.

This may be outside the scope of this list, but I was wondering if anybody had advice or lessons learned on the whole sFlow vs NetFlow debate. We are looking at using it for billing and for influencing our SDN flows. It seems like everything I have found is biased (articles by companies that have commercial offerings for the "better" protocol).

Todd Crane

Nick Hilliard

28 Feb

10:40 p.m.

Todd Crane wrote:

...

This maybe outside the scope of this list but I was wondering if anybody had advice or lessons learned on the whole sFlow vs netFlow debate. We are looking at using it for billing and influencing our sdn flows. It seems like everything I have found is biased (articles by companies who have commercial offerings for the "better" protocol)

There is a lot of religion floating around about this subject.

Netflow was designed to measure flows, and it turned out that the design was robust enough for it to be more-or-less good enough for billing purposes. It's "more or less" because on larger routers, you can't do 1:1 data export and you end up needing to do traffic sampling, at which point you're billing based on realistic estimates rather than exact data. That's fine if your contract with your customer says it's ok.

Netflow works by tracking individual flows in the data plane. This is pretty complicated in practice and requires dedicated hardware to handle it at line rate. You generally end up with two packet forwarding engines on a hardware-forwarded router: one to handle the forwarding, and the other to categorise and handle the flow data. This means that netflow is expensive to design, build and run.

Sflow is a simple packet header sampling mechanism. The only thing it does is to pick out 1 in every N packets, and to try to figure out where the headers stop and the data begins. The header is then forwarded to the sflow collector, which is where all the smart stuff is done.

If your netflow / sflow packet sampling mechanism is accurate and your router is configured appropriately for the quantity of flow data being exported (i.e. it isn't dropping data samples due to overload), then for the most part, there will be no substantial difference between using sflow and sampled netflow (depending on the data flow type), assuming that each protocol provides the data you're looking for. Obviously, if your sampling mechanism is broken or your exporter is overloaded, then both sflow and netflow will produce trash. If you're using unsampled netflow, then netflow will be more accurate, assuming you don't end up overflowing the netflow data export mechanism.

Anything which uses sampling - regardless of whether it's for netflow or sflow - needs to be profiled before being pushed into production, because you need to understand the limits of the sampling mechanism. Hardware sampling often doesn't work properly or plateaus off at a certain stage, dropping packets in the process. This can cause unwelcome surprises.

Without knowing anything more about your requirements or your choice of equipment, I'd suggest that sflow would probably be a better choice for SDN tuning and probably netflow would be better for billing, but YMMV.

Nick
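
To make the point about billing on sampled data concrete, here is a minimal Python sketch of how a collector might scale 1-in-N samples back up and attach a rough error bar. The function names are hypothetical. The error bound is the commonly cited packet-sampling approximation (roughly 196 * sqrt(1/c) percent at 95% confidence for c samples in a traffic class); strictly it bounds the estimated frame count, so applying it to byte totals is a simplification.

```python
import math


def estimate_bytes(sampled_frame_lengths, sampling_rate):
    """Scale the byte total of sampled frames by the 1-in-N sampling rate."""
    return sum(sampled_frame_lengths) * sampling_rate


def relative_error_95(num_samples):
    """Approximate 95% confidence bound on the relative error of a
    sampled estimate, as a fraction (0.0196 means about 2%)."""
    if num_samples <= 0:
        raise ValueError("need at least one sample")
    return 1.96 * math.sqrt(1.0 / num_samples)


# Hypothetical billing interval: three sampled frames at 1-in-1000 sampling.
lengths = [1500, 64, 1500]
print(estimate_bytes(lengths, sampling_rate=1000))

# The bound shrinks only with the number of samples, so a small customer
# who contributes few samples gets a much wider error bar.
for samples in (100, 10_000, 1_000_000):
    print(samples, round(relative_error_95(samples) * 100, 2), "%")
```

The bound depends on how many samples a given customer contributes, not on the sampling rate itself, which is why low-volume customers are the hard case for sampled billing.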

Baldur Norddahl

11:26 p.m.

On 28 February 2016 at 23:40, Nick Hilliard <nick@foobar.org> wrote:

...

Netflow was designed to measure flows, and it turned out that the design was robust enough for it to be more-or-less good enough for billing purposes. It's "more or less" because on larger routers, you can't do 1:1 data export and you end up needing to do traffic sampling, at which point you're billing based on realistic estimates rather than exact data. That's fine if your contract with your customer says it's ok.

Around here they are currently voting on a law that will require unsampled 1:1 netflow on all data in an ISP network with more than 100 users. Then store that data for 1 year, so the police and other parties can request a copy (with a warrant, but you are never allowed to tell anyone that they came for the data and the judges will never say no).

My routers can apparently actually do 1:1 netflow and the documentation does not state any limits on that. So maybe I am lucky?

To the original question: in this country sFlow only is apparently about to become illegal.

Regards,
Baldur

Roland Dobbins

29 Feb

2:24 a.m.

On 29 Feb 2016, at 6:26, Baldur Norddahl wrote:

...

Around here they are currently voting on a law that will require unsampled 1:1 netflow on all data in an ISP network with more than 100 users.

That's interesting, given that most larger routers don't support 1:1.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Valdis.Kletnieks@vt.edu

2:41 a.m.

On Mon, 29 Feb 2016 09:24:42 +0700, "Roland Dobbins" said:

...

On 29 Feb 2016, at 6:26, Baldur Norddahl wrote:

...
Around here they are currently voting on a law that will require unsampled 1:1 netflow on all data in an ISP network with more than 100 users.

That's interesting, given that most larger routers don't support 1:1.

In the war between reality and governmental paranoia, reality usually loses.

Pavel Odintsov

7:26 a.m.

Hello, folks! I have a lot of experience with the sflow vs netflow battle, because my free DDoS detection toolkit, fastnetmon, supports both capture methods. You could look at this comparison table: https://github.com/pavel-odintsov/fastnetmon/blob/master/docs/CAPTURE_BACKEN...

...

From my own experience sflow should be selected if you are interested in internal packet payload (for dpi / ddos detection) or you need fast reaction time on some actions (ddos is best example).

...

From a hardware point of view, almost all brand-new switches support sflow free of charge (no additional licenses or modules). But be aware that Cisco does not support this protocol at all (that's pretty weird, really). Also keep in mind that sflow is implemented in the hardware ASIC with a little help from the CPU, so it's pretty fast and suitable for really any traffic bandwidth. I have experience with sflow analytics for a 1.5 Tb+ network and it works really well!

If you just need to count traffic, and you can accept a fairly long reaction time and less accurate bandwidth data, you could select netflow. For netflow you sometimes need additional modules / software licenses, and sometimes devices do not support it at all. And if you have software devices (for example small SRX routers from Juniper), netflow generation will be pretty expensive from a CPU point of view, because netflow needs a fairly large amount of CPU resources for aggregation.

-- Sincerely yours, Pavel Odintsov

Roland Dobbins

7:32 a.m.

On 29 Feb 2016, at 14:26, Pavel Odintsov wrote:

...

From my own experience sflow should be selected if you are interested in internal packet payload (for dpi / ddos detection) or you need fast reaction time on some actions (ddos is best example).

This does not match my experience. In particular, the implied canard about flow telemetry being inadequate for timely DDoS detection/classification/traceback grows tiresome, as it's used for that purpose every day, and works quite well.

If one is also using an IDMS-type device to mitigate DDoS traffic, the device sees the whole packet, anyways.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Pavel Odintsov

7:41 a.m.

Sorry, but I could not understand what issues you've found in sflow. Could you describe them in detail?

I recently gave a talk at RIPE 71 and showed the pattern of a real attack which reached 6 Gbps in the first 30 seconds (just check slide 6 here http://www.slideshare.net/pavel_odintsov/ripe71-fastnetmon-open-source-dos-d...). An sflow device could offer a 3-4 second detection time in this case. But netflow __could__ delay telemetry by up to 30 seconds (in the case of a huge syn/syn-ack flood, for example) and your network will experience downtime. With sflow you could detect and mitigate this attack in seconds.

Does that make sense?

-- Sincerely yours, Pavel Odintsov

Roland Dobbins

7:53 a.m.

On 29 Feb 2016, at 14:41, Pavel Odintsov wrote:

...

Could you describe they in details?

Inconsistent stats, lack of ifindex information.

...

But netflow __could__ delay telemetry up to 30 seconds (in case of huge syn/syn-ack flood for example) and you network will experience downtime.

This is incorrect, and reflects an inaccurate understanding of how NetFlow/IPFIX actually works, in practice. It's often repeated by those with little or no operational experience with NetFlow/IPFIX.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Pavel Odintsov

8:12 a.m.

What do you mean by lack of ifindex in sflow? I could offer an example description of the sflow v5 sample structure (it's from my C++ based sflow parser, but it's pretty simple to understand):

    uint32_t sample_sequence_number; // sample sequence number
    uint32_t source_id_type;         // source id type
    uint32_t source_id_index;        // source id index
    uint32_t sampling_rate;          // sampling ratio
    uint32_t sample_pool;            // total packets that could have been sampled
    uint32_t drops_count;            // number of drops due to hardware overload
    uint32_t input_port_type;        // input port type
    uint32_t input_port_index;       // input port index
    uint32_t output_port_type;       // output port type
    uint32_t output_port_index;      // output port index
    uint32_t number_of_flow_records;
    ssize_t original_payload_length;

As you can see, we have the source id, the sampling rate and definitely we have full information about source and destination ifindexes. In addition to the sample structure (which consists of the first X bytes of each packet) we have counter structures, which work like good old "snmp counters" and offer detailed information about the load on each port.

Looks like you don't have much field experience with sflow. I could help and offer some real field experience below.

---

A few words about netflow. When you are speaking about "netflow" you should explicitly mention vendors, because netflow is very, very vendor specific. I have my own netflow collector implementation for netflow v5, netflow v9 and IPFIX (just check my repository https://github.com/pavel-odintsov/fastnetmon/blob/master/src/netflow_plugin/...). I spent so many nights debugging these protocols.

So do you know about the Mikrotik implementation of netflow (they have a minimum possible active and inactive timeout of 60 seconds)? Or what about old Cisco routers which support only 180 seconds as the active timeout? Could they offer an acceptable time for telemetry delivery?

The only way to have accurate bandwidth data in netflow is to use some sort of average or moving average over a certain time (30 seconds, for example).

But if you have a really huge network you should use netflow sampling. And here I should ask multiple nice questions! Cisco and Juniper use really incompatible ways to encode the sampling rate. That's really funny, but that's how it is. I do not know about other vendors, because network sampling is a very specific feature.

But with sampling (if your collector could decode yet another incompatible netflow implementation) things get really weird :) You could get accurate bandwidth data only if you have a really HUGE network with only 10-100GE customers, because traffic speed for small (100 - 1000 Mbps) customers will be really weird :)

What are your ideas about all this? Please mention vendor names when you vote for netflow next time, because not all netflow implementations are OK. And definitely some netflow implementations are broken.

-- Sincerely yours, Pavel Odintsov

Roland Dobbins

8:38 a.m.

On 29 Feb 2016, at 15:12, Pavel Odintsov wrote:

...

Looks like you haven't so much field experience with sflow. I could help and offer some real field experience below.

I've already recounted my real-world operational experience with NetFlow.

...

I have my own netflow collector implementation for netflow v5, netflow v9 and IPFIX (just check my repository

Coding something and using something operationally are two different things. I'm not a coder, but I've used NetFlow operationally since 1998, primarily on Cisco platforms (some Junipers, but I don't know a lot about Juniper boxes).

...

So you know about Mirkotik implementation of netflow (they have minimum possible active and inactive timeout - 60 seconds) ?

Yes. That does not equate to a 60s delay in detection/classifying/tracing back a SYN-flood, or anything else.

...

Or what about old Cisco routers which support only 180 seconds as active timeouts?

I think you're referring to the *default* value for the active flow timer, which can of course be altered.

...

Could they offer affordable time for telemetry delivery?

Yes, because there has never been any such router, and also because cache size and other tunable parameters, as well as FIFOing out of flows when the cache is full, guarantees that very few flows of the type seen in DDoS traffic hang around in the cache for any appreciable length of time.

...

Because not all netflow implementations are OK. And definitely some netflow implementations are broken.

You can search the archives on this list and see my previous detailed explanation of NetFlow caveats on Cisco 6500/7600 with EARL6 and EARL7 ASICs.

Your statements about it taking an inordinately long time to detect/classify/traceback SYN-floods and other types of DDoS attacks utilizing NetFlow implementations (with the exceptions of crippled implementations like the aforementioned EARL6/EARL7 and pre-Sup7 Cisco 4500) are simply untrue.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>
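
The interplay of active timer, inactive timer, cache size and FIFO eviction described above can be sketched with a toy model. This is an illustrative simulation only, not a description of any vendor's implementation. The class name, the timer values and the cache size are all assumptions, and it models a flood where every packet carries a new spoofed source, so every packet creates a new cache entry.

```python
from collections import OrderedDict


class FlowCache:
    """Toy flow cache: exports on inactive timeout, active timeout,
    or when a new flow arrives and the cache is full (oldest entry first)."""

    def __init__(self, max_entries, active_timeout, inactive_timeout):
        self.max_entries = max_entries
        self.active_timeout = active_timeout
        self.inactive_timeout = inactive_timeout
        self.flows = OrderedDict()  # key -> [first_seen, last_seen, packets]
        self.exported = []          # (export_time, key, first_seen, packets, reason)

    def _export(self, key, now, reason):
        first_seen, _last_seen, packets = self.flows.pop(key)
        self.exported.append((now, key, first_seen, packets, reason))

    def expire(self, now):
        # Check every cached flow against both timers.
        for key in list(self.flows):
            first_seen, last_seen, _packets = self.flows[key]
            if now - last_seen >= self.inactive_timeout:
                self._export(key, now, "inactive")
            elif now - first_seen >= self.active_timeout:
                self._export(key, now, "active")

    def packet(self, key, now):
        self.expire(now)
        if key in self.flows:
            entry = self.flows[key]
            entry[1] = now
            entry[2] += 1
            return
        if len(self.flows) >= self.max_entries:
            # Cache full: push out the oldest entry to make room.
            oldest = next(iter(self.flows))
            self._export(oldest, now, "cache-full")
        self.flows[key] = [now, now, 1]


def simulate_syn_flood(packets_per_second=1000, duration_seconds=2, cache_entries=100):
    cache = FlowCache(cache_entries, active_timeout=60.0, inactive_timeout=15.0)
    for i in range(packets_per_second * duration_seconds):
        # Every packet has a unique (spoofed) source, so it is a new flow.
        cache.packet(("spoofed-src", i), i / packets_per_second)
    return cache


cache = simulate_syn_flood()
worst_delay = max(t - first for t, _k, first, _p, _r in cache.exported)
print(len(cache.exported), "flows exported, worst delay", round(worst_delay, 3), "s")
```

In this model the flood entries leave the cache long before either timer fires, because new flows keep pushing old ones out. How closely a real platform behaves like this depends on its cache size relative to the attack rate and on how it handles a full cache, so it is worth profiling on the actual hardware.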

Pavel Odintsov

8:53 a.m.

Thanks for the detailed answer!

But actually it's a mistake to think I don't have real field experience just because I'm a developer. In the world of big companies nobody can do both ops and development. But I'm trying to stay close to both worlds, and I can conclude it's definitely possible, thanks to my flexible company :)

But actually:

"I think you're referring to the *default* value for the active flow timer, which can of course be altered."

It's not about the default. It's about the minimum possible. Mikrotik routers have the same issue: the minimum possible timeout is 60 / 60, and it's impossible to decrease it.

Also, many routers cannot do accurate enough netflow without additional (and very expensive) line cards just for netflow generation.

OK, we could handle some sort of SYN flood. But what about a 20 Gbps http flood with valid requests, when every customer is real (and not spoofed) and they are sending huge POST requests and hanging on the connection? How will netflow handle a connection which was handshaked correctly, with an established http session, but hasn't been closed for a while? Actually it could wait for the active/inactive timeout and you will get bad news from the ops guys about network downtime. But sflow will handle it with flying colors without delay.

What about destination http host detection with netflow? Could it extract the "host" header from netflow? And drop only the part of the traffic going to our own host? Definitely not. Netflow doesn't have any information about http headers, but sflow does.

What about the same issue for a dns flood, when somebody floods out some certain host? You could detect this attack with netflow. But you could not extract information about the certain type of DDoS attack and the attacked domain.

When we are speaking about "very rough" DDoS attack mitigation and filtering we could use netflow. But when we really care about network stability, customer service SLA and the ability to filter malicious traffic with perfect precision we should use sflow.

I'd really like to hear feedback about my vision.

-- Sincerely yours, Pavel Odintsov
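
As a rough illustration of the "host header" point above, this Python sketch pulls the HTTP Host header out of the raw bytes of a sampled frame. It is a simplified sketch under several assumptions: untagged Ethernet, IPv4, TCP, and a sampled header long enough to include the Host line, which is not guaranteed since the sampled header is truncated to a fixed size by the exporting device. The function name is hypothetical.

```python
import struct
from typing import Optional


def extract_http_host(frame: bytes) -> Optional[str]:
    """Return the HTTP Host header from a sampled Ethernet frame, if visible."""
    if len(frame) < 14:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:            # only untagged IPv4 is handled here
        return None
    ip = frame[14:]
    if len(ip) < 20 or ip[0] >> 4 != 4:
        return None
    ihl = (ip[0] & 0x0F) * 4           # IPv4 header length in bytes
    if ihl < 20 or ip[9] != 6 or len(ip) < ihl + 20:
        return None                    # not TCP, or TCP header truncated
    tcp = ip[ihl:]
    data_offset = (tcp[12] >> 4) * 4   # TCP header length in bytes
    if data_offset < 20:
        return None
    payload = tcp[data_offset:]
    # Skip the request line, then scan header lines until the blank line.
    for line in payload.split(b"\r\n")[1:]:
        if not line:
            break
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            # A truncated sample may cut this value short.
            return value.strip().decode("ascii", "replace")
    return None


# Hypothetical sampled frame: zeroed addresses, IPv4 + TCP, HTTP POST.
eth = bytes(12) + b"\x08\x00"
ipv4 = bytes([0x45, 0]) + bytes(7) + bytes([6]) + bytes(10)
tcp_header = bytes(12) + bytes([0x50]) + bytes(7)
http = b"POST /upload HTTP/1.1\r\nHost: www.example.com\r\nContent-Length: 1000000\r\n\r\n"
print(extract_http_host(eth + ipv4 + tcp_header + http))
```

Whether this is practical at scale is exactly what is disputed in this thread; the sketch only shows that the information is present in a sampled header when the sample is long enough.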

Roland Dobbins

9:42 a.m.

On 29 Feb 2016, at 15:53, Pavel Odintsov wrote:

...

It's not about default. It's about minimal possible.

To my knowledge, there has never been a Cisco router which only allowed an active flow timer value of 180s, which wasn't user-configurable. I would appreciate the details of any such router.

...

For Mikrotik routers same issue. Minimal possible timeout is 60 / 60. And impossible to decrease it.

As we've seen already from another poster in this thread, that isn't the case.

...

Also so much routers could not do enough accurate netflow without additional (and very expensive) line cards just for netflow generation.

I believe you're referring to PICs on Juniper routers, yes? Or perhaps the requirement for E3 or E5 linecards on Cisco 12Ks? Or maybe DFCs on Cisco 6500s/7600s? Or possibly M-series linecards on Cisco N7Ks (which are switches, of course)?

TANSTAAFL.

...

OK, we could handle some sort of SYN flood.

As noted previously, this is indeed the case.

...

But what about 20 Gbps http flood with valid requests when each customer are real (and not spoofied) and they are sending huge post requests and hang on connection?

Attacks of this nature generally leave a 'wake' or 'contrail' which is pretty easily spotted if one's statistical anomaly detection routines are optimal.

...

Actually it could wait for active/inactive timeout and you will get bad news from ops guys about network downtime.

As a network ops guy, I can assure you that you are incorrect, largely because you don't seem to understand the interplay of active flow timer, inactive flow timer, NetFlow cache size, NetFlow cache FIFOing, and normal flow cache baselines.

...

But sflow will handle it with flying colors without delay.

NetFlow handles it with flying colors without delay.

...

What about destination http host detection with netflow? Could it extract "host" header from netflow? And drop only part of traffic to our own host?

Of course not, for classical flow telemetry templates - but that's when one drops from the macroanalytical to the microanalytical. And flow telemetry doesn't 'drop' anything.

For some reason, you don't mention Flexible NetFlow at all. It's true that it's taken a while to become practical to use (back when the then-Cisco NetFlow PM asked me to create the CLI grammar and syntax for FNF, I noted that it wouldn't take off until there was a decent control-plane interface for creating, configuring, and tearing down dynamic flow caches, as well as some degree of ASIC support on larger platforms), but now that the various 'SDN'-type provisioning mechanisms are being implemented, and now that at least partial FNF is supported to varying degrees on various ASICs, this will hopefully change.

...

Netflow haven't any information about http headers but sflow has.

See above. This isn't necessary, and it isn't possible at scale with sFlow, either.

...

What about same issue for dns flood when somebody flood out some certain host? You could detect this attack with netflow. But you could not extract information about certain type of DDoS attack and attacked domain.

There's no need to do this with flow telemetry. Once the attack has been detected/classified/traced back, one drops to the microanalytical for situationally-appropriate mitigation.

...

When we speaking about "very rough" DDoS attack mitigation and filtering we could use netflow.

Not just 'very rough', see above.

...

But when we are really care about network stability, customer service SLA and ability to filter malicious traffic with perfect precision we should use sflow.

This is demonstrably incorrect. Many of the largest networks in the world successfully utilize NetFlow telemetry for all these purposes; they have for many years, and will continue to do so.

[And, btw, nothing has 'perfect precision'.]

That doesn't mean that NetFlow (or IPFIX) is perfect, and it doesn't mean that all implementations are perfect, and it doesn't mean that the ability to get more information about traffic via FNF or IPFIX EE mechanisms isn't desirable. But you are simply wrong about the utility of NetFlow and/or IPFIX with classical flow templates.

...

I really like to hear feedback about my vision.

See above.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Pavel Odintsov

9:59 a.m.

Thanks for the detailed reply!

I have only one question. Why are you against sFlow protocol telemetry with such huge passion? :)

It's not a proprietary technology and not a product from yet another big company. I'm not trying to sell anything because... there is nothing to sell. Really, is there?

It's just yet another open standard for analyzing data, approved and implemented as an RFC. Somebody developed this standard and implemented it in ASICs (which have a very high price for development and production, you know). That means "somebody" really wants it and will definitely use it.

Actually, sflow is not as popular as netflow. But to be honest, it's a pretty young standard in comparison with netflow and it implements a slightly different approach, which will be useful in some cases.

For example, at huge Internet Exchanges you actually don't have any netflow-enabled devices (just check the design architectures of AMS-IX, DE-CIX, LINX or even MSK-IX). Almost all IXes were developed with L2 in mind, and they actually don't have any devices which could produce netflow. So an IX could not use netflow even if they wanted to.

But you vote for "sflow is weird protocol and you should avoid it". How could an IX monitor traffic if they don't have netflow? So if they follow your recommendations they should drop the idea of traffic monitoring altogether :)

I do not like holy wars about something vs something. But actually in the modern network world every technology has an applicable usage, and it's not a good idea to avoid it just because your religion (I'm speaking about netflow religion) prohibits it for you.

Actually you are writing this email from a company email address, so I could conclude it's Arbor's vision and not your own. Could you clarify it? Could I use your vision as Arbor's vision in public speeches / presentations?

Thanks!

-- Sincerely yours, Pavel Odintsov

Roland Dobbins

10:14 a.m.

On 29 Feb 2016, at 16:59, Pavel Odintsov wrote:

...

I have only one question. Why you against sFLOW protocol telemetry with so huge passion ? :)

Because I've had very poor experiences with it. And it doesn't seem to scale very well.

...

Actually, sflow is not so popular as netflow. But to be honest it's pretty young standard in compare with netflow and it implements slightly different approach.

sFlow has been around for a while, though. It isn't new.

...

So IX could not use netflow even if they want.

This depends upon the devices utilized - there are actually some devices which can export layer-2 NetFlow. There are other issues with NetFlow as it's currently generally implemented which are also concerns with IX scenarios, FYI. I will leave it as an exercise for you to find out what they are.

...

But you vote for "sflow is weird protocol and you should avoid it".

My view is that it's generally better to use NetFlow or IPFIX, where and when possible.

...

How IX could monitor traffic if they haven't netflow? So if they follow your recommendations they should drop idea about traffic monitoring at all :)

Straw man. I never said that nor implied it. If sFlow is all that's available, then of course operators can and should use it.

...

But actually in modern network world every technology has applicable usage and it's not good idea to avoid it just because your religion (I'm speaking about netflow religion) prohibit it for you.

It isn't 'religion'. It's based upon the fact that a) my experiences with sFlow have been suboptimal and b) sFlow isn't generally available on large routers used at network edges.

...

Actually you are writing this email from company email and I could conclude it's Arbor vision and is not your own.

No, that is incorrect. I speak only for myself. And as I previously noted, Arbor products support sFlow, and have for many years; I'm just not a big fan of it.

...

Could you clarify it?

I just did.

...

Could I use your vision as Arbor's vision in public speeches / presentations?

No, you may not, per the above. Arbor is telemetry-neutral; we aim to support all relevant telemetry formats in line with the expressed needs of our customers. And that includes sFlow.

These trollish, passive-aggressive rhetorical tactics grow wearisome. I will not reply any further to this thread, so as to avoid further spamming the list.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Edward Dore

12:16 p.m.

...

On 29 Feb 2016, at 09:59, Pavel Odintsov <pavel.odintsov@gmail.com> wrote:

For example, at huge Internet Exchanges you actually don't have any netflow-enabled devices (just check the design architectures of AMS-IX, DE-CIX, LINX or even MSK-IX).

LINX use IPFIX (which is derived from NetFlow) for the Juniper LAN. The Extreme LAN uses sFlow. Edward Dore Freethought Internet

Pavel Odintsov

12:37 p.m.

Hello! Nice information. Very interesting architecture. Are they using L3 on the IX? How big is the Juniper LAN in comparison with the Extreme LAN?

-- Sincerely yours, Pavel Odintsov

Edward Dore

12:55 p.m.

...

On 29 Feb 2016, at 12:37, Pavel Odintsov <pavel.odintsov@gmail.com> wrote:

Hello!

Nice information. Very interesting architecture. They are using L3 on IX? How big Juniper Lan in comparison with Extreme lan?

Hi Pavel, The Juniper LAN is VPLS and the Extreme LAN is standard layer 2 with EAPS instead of *STP. The Juniper LAN peaks at ~2.7Tbps (https://stats.linx.net/lans#lon1), whilst the Extreme LAN peaks at ~0.6Tbps (https://stats.linx.net/lans#lon2). Edward Dore Freethought Internet

Pavel Odintsov

12:56 p.m.

Thanks! Very interesting. Will dig into details :)

-- Sincerely yours, Pavel Odintsov

Nick Hilliard

1:11 p.m.

Roland Dobbins wrote:

...

Inconsistent stats, lack of ifindex information.

I've not yet come across an sflow implementation which didn't fill out the ifindex field. No doubt they exist. Not sure what you mean by "inconsistent stats". Nick

Nikolay Shopik

10:15 a.m.

Cisco Nexus switches support sflow, since they are broadcom based. On 29/02/16 10:26, Pavel Odintsov wrote:

...

Cisco do not support this protocol at all (that's pretty weird, really).

Pavel Odintsov

1 Mar 1 Mar

7:44 a.m.

Yep, Broadcom is doing things the right way! :) But unfortunately they (Cisco Nexus) are pretty expensive and fairly new for the DC and ISP market. It's pretty rare to find a big company with a switching backbone on Nexus switches. But I like this direction of switch silicon unification :) The focus has moved from "network brands" with Not Invented Here syndrome to sufficiently smart, agnostic hardware vendors.

-- Sincerely yours, Pavel Odintsov

Mark Tinka

2:24 p.m.

On 1/Mar/16 09:44, Pavel Odintsov wrote:

...

But unfortunately they (Cisco Nexus) are pretty expensive and fairly new for DC and ISP market. It's pretty rare to find big company with switching backbone on Nexus switches.

As opposed to? We are looking at the Nexus 7700 for 100Gbps core switching. Mark.

Pavel Odintsov

2:33 p.m.

As opposed to older Cisco switches. Btw, 100GE is pretty new and actually I have experience only with Extreme Black Diamond 8.

-- Sincerely yours, Pavel Odintsov

Mark Tinka

2:38 p.m.

On 1/Mar/16 16:33, Pavel Odintsov wrote:

...

As opposed to older Cisco switches.

Well, every vendor has older switches.

...

Btw, 100GE is pretty new and actually I have experience only with Extreme Black Diamond 8.

Does not mean the Nexus is a bad choice for high capacity core switching. Just means you know Extreme. Mark.

Pavel Odintsov

2:42 p.m.

Yep, it doesn't mean that. I've never used Nexus and haven't any experience with it :) I mentioned this in the original message. I'm pretty sure it's an awesome switch. But as I haven't any experience with it, I do not know its pros and cons.

-- Sincerely yours, Pavel Odintsov

David Bass

2:37 p.m.

I don't agree with that statement (about rare to find big companies using Nexus). If you want 10 gig/40 gig (or 100 gig soon) your options are Cisco Nexus/Arista/Juniper QFX...some periphery devices as well, but the majority use one of those 3. The merchant silicon based switches are pretty reasonably priced too.


Josh Reynolds

2:43 p.m.

Brocade as well.

Nikolay Shopik

3:07 p.m.

On 01/03/16 10:44, Pavel Odintsov wrote:

...

But unfortunately they (Cisco Nexus) are pretty expensive and fairly new for DC and ISP market. It's pretty rare to find big company with switching backbone on Nexus switches.

You could go with whitebox switches, which are based on the same Broadcom ASIC, but this means you have to deal with a new commercial NOS and learn its quirks. Or you could hack around with OpenSwitch and ask Broadcom to include your favorite vendor/model in OpenNSL, so you could actually use the ASIC w/o signing an NDA.

Mark Tinka

2:13 p.m.

On 29/Feb/16 12:15, Nikolay Shopik wrote:

...

Cisco Nexus switches support sflow, since they are broadcom based.

Not all of them, just the Nexus 9000, IIRC. Mark.

Nikolay Shopik

2:55 p.m.

On 01/03/16 17:13, Mark Tinka wrote:

...

On 29/Feb/16 12:15, Nikolay Shopik wrote:

...
Cisco Nexus switches support sflow, since they are broadcom based.

Not all of them, just the Nexus 9000, IIRC.

Nexus 3000 also broadcom, but maybe not all models.

Peter Phaal

3:18 p.m.

On Tue, Mar 1, 2016 at 6:13 AM, Mark Tinka <mark.tinka@seacom.mu> wrote:

...

On 29/Feb/16 12:15, Nikolay Shopik wrote:

...
Cisco Nexus switches support sflow, since they are broadcom based.

Not all of them, just the Nexus 9000, IIRC.

The situation in the Cisco Nexus line is confusing. In addition to the Nexus 9000 series, the Nexus 3000 series and 3100 series are also Broadcom based and also support sFlow. The Nexus 3500 series and 6000 series use Cisco ASICs and don't have sFlow or NetFlow support. It also appears that Cisco's merchant silicon based switches have a greater variety of orchestration capabilities: Python, NX-API, Ansible, etc.

Mark Tinka

2 Mar 2 Mar

6:04 a.m.

On 1/Mar/16 17:18, Peter Phaal wrote:

...

It also appears that Cisco's merchant silicon based switches have a greater variety of orchestration capabilities, Python, NX-API, Ansible, etc.

We were initially looking at the Nexus 9000, but then moved to the 7700 because the Broadcom chip on the 7700 cannot do single flows larger than 40Gbps on the 100Gbps ports.

As a general note, I'm having to avoid merchant silicon left-right-and-centre. Every time I try to give them a chance, they don't cut the mustard. When the next chip solves the last issue, I discover it can't support another feature. The cycle repeats.

Mark.

Mark Tinka

6:12 a.m.

On 2/Mar/16 08:04, Mark Tinka wrote:

...

We were initially looking at at the Nexus 9000, but then moved to the 7700 because the Broadcom chip on the 7700 cannot do single flows larger than 40Gbps on the 100Gbps ports.

The Broadcom chip on the 9000, I meant... Mark.

Peter Phaal

5:25 p.m.

...

On Mar 1, 2016, at 10:12 PM, Mark Tinka <mark.tinka@seacom.mu> wrote:

...
On 2/Mar/16 08:04, Mark Tinka wrote:

We were initially looking at at the Nexus 9000, but then moved to the 7700 because the Broadcom chip on the 7700 cannot do single flows larger than 40Gbps on the 100Gbps ports.

The Broadcom chip on the 9000, I meant...

Mark.

The Nexus 3200 should work well with 100G flows - I believe it's based on the latest Broadcom Tomahawk ASIC. The older Trident II ASICs in the Nexus 9k are 40g parts

Nick Hilliard

5:30 p.m.

Peter Phaal wrote:

...

The Nexus 3200 should work well with 100G flows - I believe it's based on the latest Broadcom Tomahawk ASIC. The older Trident II ASICs in the Nexus 9k are 40g parts

does nx-os still force ingress-and-egress sflow? sflow is pretty useless unless you can define an accounting perimeter, which means that you need either ingress across the board, or egress. ingress-and-egress is basically useless because you end up double counting everything.

Nick

Peter Phaal

10:37 p.m.

On Wed, Mar 2, 2016 at 9:30 AM, Nick Hilliard <nick@foobar.org> wrote:

...

Peter Phaal wrote:

...
The Nexus 3200 should work well with 100G flows - I believe it's based on the latest Broadcom Tomahawk ASIC. The older Trident II ASICs in the Nexus 9k are 40g parts

does nx-os still force ingress-and-egress sflow? sflow is pretty useless unless you can define an accounting perimeter, which means that you need either ingress across the board, or egress. ingress-and-egress is basically useless because you end up double counting everything.

Monitoring ingress and egress in the switch is wasteful of resources. In most switch use cases (a leaf / spine fabric for example) the next hop switch will also be reporting ingress sFlow, and so when you combine sFlow streams from both switches you get bi-directional visibility into every link. Enabling ingress-only sFlow on all switch ports catches all packet paths and halves the overhead of bi-directional sampling.

The sFlow architecture shifts intelligence from the devices to external software. The goal is to have a general purpose telemetry stream that can be used for a variety of purposes. Rather than having the complexity of configuring sFlow selectively at the sender, the receiver is responsible for de-duplicating the sFlow stream for accounting (the packet stream selection and elimination you are doing in the switch configuration can equally be applied on receipt). Shifting the decision to the collector means you can also use the stream to diagnose performance problems (for example identifying top flows on a busy link), traffic engineering of large flows, etc. If the sender is configured to suit one application, you limit the value of the measurements for other applications.

An often overlooked feature of sFlow is that the agent also periodically sends interface counters (reducing or eliminating the need for SNMP polling in many use cases). The counters and packet samples are tied together in the sFlow data model (for example, you can use the interface speed information from the counter samples to compute utilizations based on the packet sample stream). Broadcom also defined sFlow metrics to provide additional visibility into the ASIC forwarding pipeline (layer 2 / layer 3 / ACL table utilization, buffer utilization, microburst detection), and the inclusion of these metrics with the sampled packet data in the sFlow telemetry stream provides a way to identify the traffic that is consuming the hardware resources.
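As a rough illustration of tying packet samples to counter data: a collector can scale each sampled frame by its sampling rate and divide by the interface speed reported in the counter samples. This is only a minimal Python sketch; the record layout (`frame_bytes`, `sampling_rate`), the function name and the numbers are assumptions made up for the example, not sFlow field names or any analyzer's API.

```python
def estimate_utilization(samples, if_speed_bps, interval_s):
    """Estimate link utilization (0..1) over an interval from sampled packets.

    Each sample stands in for `sampling_rate` packets of the same size.
    """
    est_bytes = sum(s["frame_bytes"] * s["sampling_rate"] for s in samples)
    est_bps = est_bytes * 8 / interval_s
    return est_bps / if_speed_bps


# Hypothetical data: 500 samples of 1500-byte frames at 1-in-10,000
# seen during one minute on a 10G port.
samples = [{"frame_bytes": 1500, "sampling_rate": 10_000}] * 500
util = estimate_utilization(samples, if_speed_bps=10_000_000_000, interval_s=60)
print(f"{util:.1%}")
```

In a real collector the interface speed would come from the counter samples for the same data source, so no separate SNMP poll is needed for this calculation.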

Nick Hilliard

10:45 p.m.

Peter Phaal wrote:

...

Monitoring ingress and egress in the switch is wasteful of resources.

It's more than a waste of resources: it's pathologically broken and Cisco decline to fix it, despite the fact that enabling ingress-only or egress-only is fully supported via API in the Broadcom SDKs, and consequently the amount of configuration glue required to fix it in NX-OS is nearly zero. Broadcom chipsets don't support netflow, so sflow is the only game in town if you need data telemetry on broadcom-based ToR boxes. As I said in a previous email on this thread, refusing to support this properly is a harmful and short sighted approach to customers' requirements. Nick

Peter Phaal

11:05 p.m.

On Wed, Mar 2, 2016 at 2:45 PM, Nick Hilliard <nick@foobar.org> wrote:

...

Peter Phaal wrote:

...
Monitoring ingress and egress in the switch is wasteful of resources.

It's more than a waste of resources: it's pathologically broken and Cisco decline to fix it, despite the fact that enabling ingress-only or egress-only is fully supported via API in the Broadcom SDKs, and consequently the amount of configuration glue required to fix it in NX-OS is nearly zero.

Broadcom chipsets don't support netflow, so sflow is the only game in town if you need data telemetry on broadcom-based ToR boxes.

As I said in a previous email on this thread, refusing to support this properly is a harmful and short sighted approach to customers' requirements.

I think "pathologically broken" somewhat overstates the case. Bidirectional sampling is allowed by the sFlow spec and other vendors have made that choice. Another vendor used to implement egress-only sampling, which is also allowed but unusual. I agree that ingress is the most common and easiest to deal with, but a decent sFlow analyzer should be able to handle all three cases without over / under counting.

More annoying are the differences in how vendors report telemetry from LAG / MLAG topologies. The "sFlow LAG Counters Structure" extension was published in 2012 and defines how counters and samples should be generated on LAGs. Anyone using LAG / MLAG topologies might want to ask their vendor if they support / plan to support the extension.

Nick Hilliard

3 Mar 3 Mar

11:53 a.m.

Peter Phaal wrote:

...

I think "pathologically broken" somewhat overstates the case. Bidirectional sampling is allowed by the sFlow spec and other vendors have made that choice. Another vendor used to implement egress only sampling (also allowed) but unusual. I agree that ingress is the most common and easiest to deal with, but a decent sFlow analyzer should be able to handle all three cases without over / under counting.

Bidirectional sampling doesn't allow you to define a sampling perimeter on your switch topology. This means that if you have anything other than a trivial topology, you will end up double-counting your traffic. The only way to work around this is to get the collector to discard 50% of the samples or otherwise write down the amount of traffic by 50%, assuming a standard accounting perimeter configuration. This is broken.

The thing is, this is ridiculously easy to fix in code. The hooks are already there.

Nick

Peter Phaal

3:26 p.m.

While it would be nice if the Nexus switches supported ingress sampling, you can get exactly the same result at the receiving end by dropping the egress samples. The following sflowtool output shows some of the metadata contained in the packet sample:

    startSample ----------------------
    sampleType_tag 0:1
    sampleType FLOWSAMPLE
    sampleSequenceNo 1022129
    sourceId 0:7
    meanSkipCount 128
    samplePool 130832512
    dropEvents 0
    inputPort 7
    outputPort 10

The two fields of interest are sourceId and inputPort. The sourceId (0:7) indicates that this measurement came from a data source of type ifIndex (0) and that the ifIndex of the data source is 7. The inputPort is the ifIndex of the port that received the packet. In this case, because the data source ifIndex and the inputPort ifIndex are the same, this is an ingress sampled packet. A simple filter along the lines:

    if ( sourceId.split(':')[1] != inputPort) return;

would allow your sFlow analyzer to eliminate the unwanted samples.

You could also enable / disable ports on your switches to ensure that each path is sampled once, but that does limit the types of analysis you can do with the data. A better approach is to simply add additional input filters to specify which edge data sources you want to include / exclude in your traffic accounting application, since this would allow the full sFlow feed to be used for other purposes as well (identifying traffic on busy links, etc.)

The overhead of enabling sFlow on all ports and all devices is generally quite small, since packets are sampled in hardware and production sampling rates tend to be in the range 1-in-1,000 to 1-in-50,000, so very little measurement traffic is actually generated.

A more important consideration is operational complexity. If you have thousands of switches, designing customized configurations for each one doesn't make a lot of sense. It's much better if the intelligence is applied at the collecting end.
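The ingress filter described above could be sketched in Python roughly as follows. The sample text is the sflowtool output quoted in this message; the helper names (`parse_sample`, `is_ingress_sample`) and the simple key/value parsing are assumptions for illustration, not part of sflowtool or of any particular analyzer.

```python
SAMPLE = """\
startSample ----------------------
sampleType_tag 0:1
sampleType FLOWSAMPLE
sampleSequenceNo 1022129
sourceId 0:7
meanSkipCount 128
samplePool 130832512
dropEvents 0
inputPort 7
outputPort 10
"""


def parse_sample(text):
    """Turn 'key value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def is_ingress_sample(fields):
    """True when the data source is an ifIndex equal to the input port."""
    ds_type, ds_index = fields["sourceId"].split(":")
    return ds_type == "0" and ds_index == fields["inputPort"]


ingress = parse_sample(SAMPLE)
# Hypothetical egress sample: same packet, reported by data source ifIndex 10.
egress = dict(ingress, sourceId="0:10")
print(is_ingress_sample(ingress), is_ingress_sample(egress))
```

An accounting application would keep only the samples for which `is_ingress_sample` is true, while other consumers of the same feed could still see everything.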
Taking this approach and including sensible defaults in the agents can get the sFlow agent configuration down to something as simple as:

    sflow {
      DNSSD = off
      collector { ip = 10.0.0.162 }
    }

And you could go even simpler if you use DNS SRV records to identify the sFlow collector(s):

    sflow { DNSSD=on }

These configurations are from Cumulus Linux.

One of the trends in merchant silicon based platforms is the inclusion of the ONIE boot loader. If you don't like the network operating system, you can install a different operating system to better suit your requirements without ripping and replacing hardware. There are many virtually identical switches built around the Broadcom ASICs, giving a lot of choice in hardware and network operating system vendor.

Nick Hilliard

5:16 p.m.

Peter Phaal wrote:

...

sampled packet. A simple filter along the lines:

if ( sourceId.split(':')[1] != inputPort) return;

It's not a one-liner in practice. It involves maintaining an array of source ip + egressPorts with sflow enabled and (depending on how you implement it) e.g. ensuring that all flow samples with a destination port of one of the entries on the list are excluded from the flow processing. Alternatively, you can set up a full accounting perimeter and lop off 50% of the packet and byte counts.

The beauty of sflow is that you can do anything in the collector, but most people aren't going to do this because it means maintaining two sets of data about your flow configuration: one set on the switch and one set in your collector code which you've now diverged from the mainstream distribution, thereby creating a requirement for future maintenance, with associated costs.

Cisco could just fix the problem rather than lalala-ing about how it's now a feature because it's been documented. Broadcom's SDK makes it simple to implement this and there is no technical reason for Cisco to continue to decline to fix the problem.

Nick

Peter Phaal

6:23 p.m.

On Thu, Mar 3, 2016 at 9:16 AM, Nick Hilliard <nick@foobar.org> wrote:

...

The beauty of sflow is that you can do anything in the collector, but most people aren't going to do this because it means maintaining two sets of data about your flow configuration: one set on the switch and one set in your collector code which you've now diverged from the mainstream distribution, thereby creating a requirement for future maintenance, with associated costs.

I completely agree that you don't want to maintain two sets of configurations (switch and collector) that need to be updated. However, it's much better to focus on minimizing switch configuration complexity than it is to focus on reducing analyzer software configuration complexity. If you push the problem of de-duplication to the switch configurations then you end up with VxT sets of switch configuration in a multi-vendor network (where T is the number of topologically different wiring configurations used for the switches and V is the number of vendors - actually it can be even worse, since each vendor product line might have different configuration options, CatOS vs NX-OS for example). Adopting a standard sFlow agent configuration in which monitoring is enabled on all switch ports with policy based default sampling rates (a good default sampling rate = port speed in gigabits per second x 1000. e.g. 1-in-10,000 on a 10G port) greatly simplifies large scale sFlow deployments. Now instead of maintaining and updating VxT configurations in thousands of switches, there are only V switch configurations that are installed when the switches are initially provisioned and remain static over the lifetime of the network. Updating the single, central configuration of the sFlow analyzer software is much simpler and easily automated. It also makes it much easier to roll out new analytics capabilities since they are simply configuration and software updates to the central collector and don't require building, testing and deploying configurations to all the devices.
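The rule of thumb and the configuration-count argument above can be written down in a few lines. This is a minimal Python sketch; the function names and the example values for V and T are assumptions for illustration only.

```python
def default_sampling_rate(port_speed_gbps):
    """1-in-N sampling rate: port speed in gigabits per second x 1000."""
    return round(port_speed_gbps * 1000)


def config_variants(vendors, topologies, dedupe_on_switch):
    """Distinct switch configurations to maintain.

    V x T when de-duplication is pushed into the switch configuration,
    V when every port is sampled and the collector de-duplicates.
    """
    return vendors * topologies if dedupe_on_switch else vendors


for speed in (1, 10, 40, 100):
    print(f"{speed}G -> 1-in-{default_sampling_rate(speed):,}")

# Hypothetical network: 3 vendors, 20 topologically different wirings.
print(config_variants(3, 20, True), config_variants(3, 20, False))
```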

Mark Tinka

6:47 a.m.

On 2/Mar/16 19:25, Peter Phaal wrote:

...

The Nexus 3200 should work well with 100G flows - I believe it's based on the latest Broadcom Tomahawk ASIC. The older Trident II ASICs in the Nexus 9k are 40g parts.

Yes, the new Tomahawk chips support native 100Gbps lanes. Mark.

Nick Hilliard

29 Feb 29 Feb

1:19 p.m.

Pavel Odintsov wrote:

...

From hardware point of view almost all brand new switches support sflow free of charge (no additional licenses or modules). But be aware, Cisco do not support this protocol at all (that's pretty weird, really).

sflow is supported on the Nexus 3k range, but it's available as ingress + egress only, non-configurable (despite the fact that the option to configure is available at the chipset level). This means that you can't define an accounting perimeter on the switch, which makes it mostly useless from a production point of view. Pretty much every other switch on the market will allow you to specify either ingress or egress.

The only consistent explanation I can suggest for Cisco's vehement antipathy towards sflow is "not invented here". It's supported in hardware by several of the other chipsets that Cisco uses, but left non-enabled in software. TBH it's a short sighted and disappointing approach to telemetry.

Nick

Saku Ytti

12:03 p.m.

On 29 February 2016 at 04:24, Roland Dobbins <rdobbins@arbor.net> wrote:

...

...
Around here they are currently voting on a law that will require unsampled 1:1 netflow on all data in an ISP network with more than 100 users.

That's interesting, given that most larger routers don't support 1:1.

I find that strange, because if you're doing it in HW, doing a hash lookup for the flow and adding packets and bytes to the counter is cheap. It's expensive having a lot of those flows, but incrementing their packet and byte counters isn't. I know that all JNPR Trio kit (MX, T, EX9k...) do 1:1. I guess if you're doing it in LC CPU things are very different.

-- ++ytti

sthaug＠nethelp.no

12:17 p.m.

...

...
That's interesting, given that most larger routers don't support 1:1.

I find that strange, because if you're doing it in HW, doing a hash lookup for the flow and adding packets and bytes to the counter is cheap. It's expensive having a lot of those flows, but incrementing their packet and byte counters isn't.

I know that all JNPR Trio kit (MX, T, EX9k...) do 1:1. I guess if you're doing it in LC CPU things are very different.

A relevant question might be if the Trio hardware can do 1:1 while handling multiple ports of line rate DDoS traffic consisting of small packets with different port numbers (i.e. high pps traffic resulting in basically 1 flow per packet). No, I don't know the answer (but I suspect it might be negative). Here we're using Trio hardware with 1:100 sampling, and are reasonably happy with the results. Steinar Haug, AS2116

Saku Ytti

12:41 p.m.

On 29 February 2016 at 14:17, <sthaug@nethelp.no> wrote:

...

A relevant question might be if the Trio hardware can do 1:1 while handling multiple ports of line rate DDoS traffic consisting of small packets with different port numbers (i.e. high pps traffic resulting in basically 1 flow per packet). No, I don't know the answer (but I suspect it might be negative).

I cannot see why not, it's cheap. You're doing 1-2 LPM on the packet, QoS lookup, ACL lookup, incrementing various counters, etc., adding one hash lookup and two counters is not going to be relevant cost to the lookup time. Having many entries in the hash table is an issue, incrementing their counters is not.

...

Here we're using Trio hardware with 1:100 sampling, and are reasonably happy with the results.

-- ++ytti

Nick Hilliard

1:05 p.m.

Saku Ytti wrote:

...

I cannot see why not, it's cheap. You're doing 1-2 LPM on the packet, QoS lookup, ACL lookup, incrementing various counters, etc., adding one hash lookup and two counters is not going to be relevant cost to the lookup time.

depends on what you define by "cheap". Netflow requires separate packet forwarding lookup and ACL handling silicon.

...

Having many entries in the hash table is an issue, incrementing their counters is not.

it is certainly an issue if you get splatted with lots of discrete junk flow, yes. Neither of these are a problem for sflow. It just plucks packets out of the data plane at a pre-defined rate and forwards their headers to the collector. So long as your sampler is accurate, it's great. Nick
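The difference in state between the two approaches can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification, not how any router implements either protocol: the flow cache is a plain dictionary keyed on a 5-tuple, and the sampler takes every n-th packet deterministically, whereas real sFlow agents sample randomly with a mean skip count.

```python
from collections import defaultdict


def flow_cache(packets):
    """NetFlow-style: one hash-table entry per flow, two counters each."""
    cache = defaultdict(lambda: [0, 0])  # flow key -> [packets, bytes]
    for key, size in packets:
        entry = cache[key]
        entry[0] += 1
        entry[1] += size
    return cache


def sample_headers(packets, n):
    """sFlow-style: pick out every n-th packet, keep no per-flow state."""
    return [pkt for i, pkt in enumerate(packets, start=1) if i % n == 0]


# Junk traffic: every packet has a different source port, so each
# packet is its own flow (addresses are documentation prefixes).
junk = [
    (("198.51.100.1", sport, "203.0.113.5", 80, "udp"), 64)
    for sport in range(1024, 11024)
]
print(len(flow_cache(junk)), len(sample_headers(junk, 1000)))
```

With this input the cache ends up with one entry per packet, while the sampler exports a fixed fraction of the packets regardless of how many distinct flows they belong to.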

Saku Ytti

1:31 p.m.

On 29 February 2016 at 15:05, Nick Hilliard <nick@foobar.org> wrote:

...

depends on what you define by "cheap". Netflow requires separate packet forwarding lookup and ACL handling silicon.

That's not inherently so, it depends how specialised your hardware is. If it's very specialised like implementing just LPM, sure. If it's NPU, then no, that's not given. The cost is many entries in the hash table, not updating them. But if you'd emulate sflow behaviour in IPFIX then you don't need the hash tables or the counters.

...

Neither of these are a problem for sflow. It just plucks packets out of the data plane at a pre-defined rate and forwards their headers to the collector. So long as your sampler is accurate, it's great.

ACK, and as explained in an earlier post, there is nothing stopping IPFIX from working like this. sflow is a subset of what's possible in IPFIX.

-- ++ytti

Phil Bedard

3:40 p.m.

-----Original Message----- From: NANOG <nanog-bounces@nanog.org> on behalf of Saku Ytti <saku@ytti.fi> Date: Monday, February 29, 2016 at 08:31 To: Nick Hilliard <nick@foobar.org> Cc: nanog list <nanog@nanog.org> Subject: Re: sFlow vs netFlow/IPFIX

...

On 29 February 2016 at 15:05, Nick Hilliard <nick@foobar.org> wrote:

...
depends on what you define by "cheap". Netflow requires separate packet forwarding lookup and ACL handling silicon.

That's not inherently so, it depends how specialised your hardware is. If it's very specialised like implementing just LPM, sure. If it's NPU, then no, that's not given.

I don’t think anyone uses dedicated Netflow HW these days. The ASICs have functionality for other things like mirroring, etc. which are augmented for Netflow use. Usually it’s a mix of dedicated functions in the ASICs and then the LC CPU and general CPU on some platforms. Really in the end the router is doing something like SFlow internally.

...

The cost is many entries in the hash table, not updating them. But if you'd emulate sflow behaviour in IPFIX then you don't need the hash tables or the counters.

It would be interesting to get some data from vendors on what the actual limitation is. I know with some new platforms like the NCS 55XX from Cisco (BRCM Jericho) it has limited space for counters, but I don’t know if that contributes to its minimum 1:8000 Netflow sampling rate. The new PTX FPC supporting Netflow has a minimum of 1:1000. Phil

Saku Ytti

7:06 p.m.

On 29 February 2016 at 17:40, Phil Bedard <bedard.phil@gmail.com> wrote:

...

It would be interesting to get some data from vendors on what the actual limitation is. I know with some new platforms like the NCS 55XX from Cisco (BRCM Jericho) it has limited space for counters, but I don’t know if that contributes to its minimum 1:8000 Netflow sampling rate. The new PTX FPC supporting Netflow has a minimum of 1:1000.

Are they doing netflow in HW at all? To me it sounds like they might be doing it in LC CPU and HW is only doing sampled punting. Which would explain the performance hit. I would be very surprised if Jericho could do netflow in HW. And I'm 100% sure PE chip can't do netflow in HW.

-- ++ytti

freedman＠freedman.net

7:27 a.m.

Re: limits - For Cisco/Juniper it's in the low hundreds of thousands of flows/sec per chipset/linecard for 1:1 NetFlow/IPFIX, I think. Then of course, as has been mentioned, you'll need to be able to send it and receive it to something - and store+query. Avi Freedman CEO, Kentik

...

On 28 February 2016 at 23:40, Nick Hilliard <nick@foobar.org> wrote:

<snip>

...

Around here they are currently voting on a law that will require unsampled 1:1 netflow on all data in an ISP network with more than 100 users. Then store that data for 1 year, so the police and other parties can request a copy (with a warrant but you are never allowed to tell anyone that they came for the data and the judges will never say no).

My routers can apparently actually do 1:1 netflow and the documentation does not state any limits on that. So maybe I am lucky?

To the original question: in this country sFlow only is apparently about to become illegal.

Regards,

Baldur

Phil Bedard

28 Feb

11:15 p.m.

What HW are you looking at, or are you rolling your own probes? Router/switch HW almost never does both. Netflow/IPFIX puts the flow intelligence in the router, but with that come more limitations. Sflow typically uses more BW because you are sending headers for each sampled packet. The sflow collector also needs more intelligence since it's doing flow correlation, AS matching, etc. instead of the router doing it. However, it is more flexible, since adding a new header like VXLAN or NSH is much easier to implement in analysis SW than in router SW. Phil

From: Todd Crane Sent: Sunday, February 28, 2016 3:09 PM To: nanog@nanog.org Subject: sFlow vs netFlow/IPFIX

This maybe outside the scope of this list but I was wondering if anybody had advice or lessons learned on the whole sFlow vs netFlow debate. We are looking at using it for billing and influencing our sdn flows. It seems like everything I have found is biased (articles by companies who have commercial offerings for the "better" protocol) Todd Crane
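As a rough illustration of the collector-side flow correlation described above, the sketch below groups sampled headers by 5-tuple and scales each sample by its sampling rate. The `Sample` type and `correlate` function are hypothetical names for this example only; a real sFlow collector also has to decode the datagram format and parse the raw header bytes, which is skipped here.

```python
from collections import defaultdict
from typing import NamedTuple


class Sample(NamedTuple):
    """One already-parsed packet sample (hypothetical, simplified layout)."""
    src: str
    dst: str
    proto: int
    sport: int
    dport: int
    frame_len: int       # original frame length in bytes
    sampling_rate: int   # N in "1 in N"


def correlate(samples):
    """Aggregate samples into estimated per-flow packet and byte totals."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for s in samples:
        key = (s.src, s.dst, s.proto, s.sport, s.dport)
        # Each sample stands in for roughly sampling_rate packets.
        flows[key]["packets"] += s.sampling_rate
        flows[key]["bytes"] += s.frame_len * s.sampling_rate
    return dict(flows)


samples = [
    Sample("10.0.0.1", "10.0.0.2", 6, 1234, 80, 1500, 1000),
    Sample("10.0.0.1", "10.0.0.2", 6, 1234, 80, 500, 1000),
    Sample("10.0.0.3", "10.0.0.2", 17, 5353, 53, 100, 1000),
]
flows = correlate(samples)
for key, totals in flows.items():
    print(key, totals)
```

Because the grouping key lives in collector software, supporting a new encapsulation would mean changing only the parsing step that produces `Sample`, which is the flexibility argument made above.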

freedman＠freedman.net

29 Feb

7:38 a.m.

...

This maybe outside the scope of this list but I was wondering if anybody had advice or lessons learned on the whole sFlow vs netFlow debate. We are looking at using it for billing and influencing our sdn flows. It seems like everything I have found is biased (articles by companies who have commercial offerings for the "better" protocol)

Todd Crane

Most vendors that take "flow" take both, so there tends not to be THAT much religion unless you talk to someone who hates being flooded with 1:1 flow, or debugging broken (usually NetFlow) implementations.

In our experience, they both basically work for ops use cases nowadays, for major vendors of routers, and most switches. sFlow gives faster feedback and is more accurate (adding things up, multiplying by sample rates, landing closer to SNMP counter data) than most NetFlow/IPFIX implementations. How much varies from slightly to extreme (if you're using Catalysts for NetFlow/IPFIX).

My thesis overall re: why sFlow 'just works' a bit better is that it's just so much easier to implement sFlow because there's no need to track flows (hash table or whatever data structure you need). Just grab samples of headers, parse, fill structs, and send.

That said, things are hugely less sucky than 10 or even 5 years ago in the NetFlow world, and for the right vendor and box and software it's possible to get NetFlow/IPFIX essentially as accurate. And as has been noted, at least in theory some boxes that do tens to hundreds of gigabits (or low terabits) of traffic support 1:1, which you could in theory do with sFlow as a transport, but I haven't seen a switch or router that does that.

Re: 1:1 flow - the boxes supporting that are generally not the biggest purchase-able from Cisco or Juniper, but are used as the big-boy backbone and border routers by a good number of multi-terabit networks, and even some multi-tens-of-terabit networks.

Good luck in your flow journeys.

Avi Freedman CEO, Kentik
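For the "add things up and multiply by the sample rate" step, a minimal sketch follows. The error bound is the commonly quoted packet-sampling rule of thumb (about 196 * sqrt(1/c) percent at 95% confidence, where c is the number of samples in the class being measured); it is not something stated in this thread, and it assumes random sampling with no dropped samples. The function names are made up for this example.

```python
import math


def estimate_total(sampled_count, sampling_rate):
    """Scale a sampled packet count up to an estimated total."""
    return sampled_count * sampling_rate


def relative_error_pct(num_samples):
    """Approximate 95%-confidence error bound, in percent, for an estimate
    built from num_samples samples (rule of thumb: 196 * sqrt(1/c))."""
    if num_samples <= 0:
        raise ValueError("need at least one sample")
    return 196.0 * math.sqrt(1.0 / num_samples)


# Illustrative numbers only: 2,500 samples for one customer at 1-in-1000.
sampled = 2500
total = estimate_total(sampled, 1000)
error = relative_error_pct(sampled)
print(f"estimated packets: {total}, error bound: +/-{error:.2f}%")
```

The bound depends on how many samples land in the class, not on the sampling rate itself, so a busy customer port tends to be estimated far more tightly than a quiet one over the same interval.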

Saku Ytti

12:12 p.m.

On 28 February 2016 at 22:06, Todd Crane <todd.crane@n5tech.com> wrote:

...

This maybe outside the scope of this list but I was wondering if anybody had advice or lessons learned on the whole sFlow vs netFlow debate. We are looking at using it for billing and influencing our sdn flows. It seems like everything I have found is biased (articles by companies who have commercial offerings for the "better" protocol)

I view sflow as a subset of Netflow v9 and v10/IPFIX. You could produce sflow behaviour in IPFIX by adding a record of the packet sample and exporting immediately instead of keeping flow state. However, you couldn't produce IPFIX behaviour in sflow, inherently so. sflow is older than IPFIX (v10) or v9 netflow, and I'm guessing no one would invent sflow today; they'd instead specify some restricted IPFIX with the same behaviour. I completely understand why sflow was needed in the netflow v5 era, but I don't really see much point in it now. It just means that collectors need to be more complicated than necessary, by having two parsers. -- ++ytti
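The difference between keeping flow state and exporting each sample immediately can be sketched as two toy exporters. Both classes are hypothetical and heavily simplified: the record layout is invented, no real IPFIX or sFlow encoding is produced, and the 1-in-N choice is deterministic here for readability, whereas real samplers pick packets randomly.

```python
from typing import Dict, List, Tuple

FlowKey = Tuple[str, str, int, int, int]  # src, dst, proto, sport, dport


class FlowCacheExporter:
    """NetFlow-style: one hash-table entry per active flow, exported on expiry."""

    def __init__(self) -> None:
        self.cache: Dict[FlowKey, List[int]] = {}

    def observe(self, key: FlowKey, length: int) -> list:
        entry = self.cache.setdefault(key, [0, 0])
        entry[0] += 1        # packet counter
        entry[1] += length   # byte counter
        return []            # nothing leaves the box until the flow expires

    def expire_all(self) -> list:
        records = [(k, p, b) for k, (p, b) in self.cache.items()]
        self.cache.clear()
        return records


class ImmediateExporter:
    """sFlow-like behaviour: export every Nth packet at once, keep no flow state."""

    def __init__(self, sampling_rate: int) -> None:
        self.sampling_rate = sampling_rate
        self.seen = 0  # the only state is a single packet counter

    def observe(self, key: FlowKey, length: int) -> list:
        self.seen += 1
        if self.seen % self.sampling_rate == 0:
            return [(key, 1, length)]
        return []


keys = [("10.0.0.%d" % i, "10.0.1.1", 6, 1000 + i, 443) for i in range(3)]
stateful, stateless = FlowCacheExporter(), ImmediateExporter(sampling_rate=5)
exported = []
for n in range(10):
    key = keys[n % 3]
    stateful.observe(key, 100)
    exported += stateless.observe(key, 100)

print("cache entries held:", len(stateful.cache))
print("records exported immediately:", len(exported))
```

Memory in the first exporter grows with the number of concurrent flows, while the second holds a single counter regardless of traffic mix, which is the cost trade-off being argued in the thread.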

55 comments

16 participants

Baldur Norddahl
David Bass
Edward Dore
freedman＠freedman.net
Josh Reynolds
Mark Tinka
Nick Hilliard
Nikolay Shopik
Pavel Odintsov
Peter Phaal
Phil Bedard
Roland Dobbins
Saku Ytti
sthaug＠nethelp.no
Todd Crane
Valdis.Kletnieks＠vt.edu