NetFlow - path from Routers to Collector

Serge Vautour

1 Sep 2015

3:33 p.m.

Hello,

For those that run Internet-connected routers, how do you get your NetFlow data from the routers to your collectors? Do you let the flow export traffic use the same links as your customer traffic to route back to central collectors? Or do you send this traffic over a private, network-management-type path? If you send this traffic over the "Internet" (within your AS), are you worried about security?

Thanks,
Serge

Roland Dobbins

1 Sep

3:38 p.m.

On 1 Sep 2015, at 22:33, Serge Vautour wrote:

...

Or do you send this traffic over private network management type path?

This is how to do it, so that in the event of a network partition you don't end up losing visibility into your network traffic when you need it the most. You should already have an OOB/DCN network for managing your routers/switches/hosts, no? Use that, after ensuring it's scaled adequately.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Mark Tinka

3:39 p.m.

On 1/Sep/15 17:33, Serge Vautour wrote:

...

Hello,

For those that run Internet connected routers, how do you get your NetFlow data from the routers to your collectors? Do you let the flow export traffic use the same links as your customer traffic to route back to central collectors? Or do you send this traffic over private network management type path? If you send this traffic over the "Internet" (within your AS), are you worried about security?

We forward it in-band. Have been doing so for years, no major drama. Mark.

Roland Dobbins

3:43 p.m.

On 1 Sep 2015, at 22:39, Mark Tinka wrote:

...

Have been doing so for years, no major drama.

Until there is. ;> ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Rod Beck

3:59 p.m.

Roland is correct, with the caveat that your Internet customer traffic may flow over the same fibers as your separate management circuits. You should aim for end-to-end physical diversity. This is all common sense, but laziness sometimes takes precedence.

Roderick Beck Sales - Europe and the Americas Hibernia Networks http://www.hibernianetworks.com Budapest and New York 36-30-859-5144 rod.beck@hibernianetworks.com

________________________________________ From: NANOG <nanog-bounces@nanog.org> on behalf of Roland Dobbins <rdobbins@arbor.net> Sent: Tuesday, September 1, 2015 5:43 PM To: nanog@nanog.org Subject: Re: NetFlow - path from Routers to Collector

On 1 Sep 2015, at 22:39, Mark Tinka wrote:

...

Have been doing so for years, no major drama.

Until there is. ;> ----------------------------------- Roland Dobbins <rdobbins@arbor.net> This e-mail and any attachments thereto is intended only for use by the addressee(s) named herein and may be proprietary and/or legally privileged. If you are not the intended recipient of this e-mail, you are hereby notified that any dissemination, distribution or copying of this email, and any attachments thereto, without the prior written permission of the sender is strictly prohibited. If you receive this e-mail in error, please immediately telephone or e-mail the sender and permanently delete the original copy and any copy of this e-mail, and any printout thereof. All documents, contracts or agreements referred or attached to this e-mail are SUBJECT TO CONTRACT. The contents of an attachment to this e-mail may contain software viruses that could damage your own computer system. While Hibernia Networks has taken every reasonable precaution to minimize this risk, we cannot accept liability for any damage that you sustain as a result of software viruses. You should carry out your own virus checks before opening any attachment.

Shane Ronan

4:02 p.m.

It's usually not laziness; it's most often related to cost.

On Sep 1, 2015 12:00 PM, "Rod Beck" <Rod.Beck@hibernianetworks.com> wrote:

...

Roland is correct, with the caveat that your Internet customer traffic may flow over the same fibers as your separate management circuits. You should aim for end-to-end physical diversity. This is all common sense, but laziness sometimes takes precedence.

Roderick Beck Sales - Europe and the Americas Hibernia Networks http://www.hibernianetworks.com Budapest and New York 36-30-859-5144 rod.beck@hibernianetworks.com

________________________________________ From: NANOG <nanog-bounces@nanog.org> on behalf of Roland Dobbins <rdobbins@arbor.net> Sent: Tuesday, September 1, 2015 5:43 PM To: nanog@nanog.org Subject: Re: NetFlow - path from Routers to Collector

On 1 Sep 2015, at 22:39, Mark Tinka wrote:

...
Have been doing so for years, no major drama.

Until there is.

;>

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Mark Tinka

2 Sep

6:02 a.m.

On 1/Sep/15 17:59, Rod Beck wrote:

...

Roland is correct, with the caveat that your Internet customer traffic may flow over the same fibers as your separate management circuits. You should aim for end-to-end physical diversity. This is all common sense, but laziness sometimes takes precedence.

Not very straightforward when you have a network spanning several continents.

Mark.

Roland Dobbins

9:38 a.m.

On 2 Sep 2015, at 13:02, Mark Tinka wrote:

...

Not very straightforward when you have a network spanning several continents.

Again, VLANs/VRFs are generally Good Enough, and more than a few globe-spanning networks do just that. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Mark Tinka

9:48 a.m.

On 2/Sep/15 11:38, Roland Dobbins wrote:

...

Again, VLANs/VRFs are generally Good Enough, and more than a few globe-spanning networks do just that.

Those VLANs and VRFs are following the same path as the global table, just in a different routing table. That is easy, and we do that already. Your assertion, before, was that the OoB network is physically separate from the routers it is supporting. This is less feasible at scale.

Mark.

Roland Dobbins

10:11 a.m.

On 2 Sep 2015, at 16:48, Mark Tinka wrote:

...

Those VLANs and VRFs are following the same path as the global table, just in a different routing table. That is easy, and we do that already.

Sure. But it's better than mixing it in with customer traffic.

...

Your assertion, before, was that the OoB network is physically separate from the routers it is supporting. This is less feasible at scale.

Ideally, it should be - that's what I was trying to get across. I understand that this isn't free, either from a capex or opex perspective. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Niels Bakker

1:25 p.m.

* rdobbins@arbor.net (Roland Dobbins) [Wed 02 Sep 2015, 12:12 CEST]:

...

On 2 Sep 2015, at 16:48, Mark Tinka wrote:

...
Those VLANs and VRFs are following the same path as the global table, just in a different routing table. That is easy, and we do that already.

Sure. But it's better than mixing it in with customer traffic.

Why? Do your customer packets have cooties?

...

...
Your assertion, before, was that the OoB network is physically separate from the routers it is supporting. This is less feasible at scale.

Ideally, it should be - that's what I was trying to get across. I understand that this isn't free, either from a capex or opex perspective.

Which is exactly the argument that people with experience have been making on this mailing list. OOB is the 3G dialout on a terminal server that it uses once its regular outside connection fails. You don't want flow exports there, to give just one counterexample to your earlier assertions. -- Niels.

Roland Dobbins

2:02 p.m.

On 2 Sep 2015, at 20:25, Niels Bakker wrote:

...

Why? Do your customer packets have cooties?

Because you don't want things which disrupt customer traffic to disrupt your ability to see what's happening, just as you don't want them to disrupt your ability to configure/manage your infrastructure.

...

Which is exactly the argument that people with experience have been making on this mailing list.

I think the problem here is that I failed to distinguish between logical and physical OOB. Physical is best, logical is generally Good Enough. There are some operators who send flow telemetry across physically distinct OOB infrastructure. More do it logically. Most still do it in-band mixed with production network traffic, but that is slowly changing.

...

OOB is the 3G dialout on a terminal server that it uses once its regular outside connection fails.

That's one example, yes.

...

You don't want flow exports there, to give just one counterexample to your earlier assertions.

On that particular category of OOB, of course not. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Jared Mauch

2:08 p.m.

...

On Sep 2, 2015, at 10:02 AM, Roland Dobbins <rdobbins@arbor.net> wrote:

On 2 Sep 2015, at 20:25, Niels Bakker wrote:

...
Why? Do your customer packets have cooties?

Because you don't want things which disrupt customer traffic to disrupt your ability to see what's happening, just as you don't want them to disrupt your ability to configure/manage your infrastructure.

It’s really because some people who drink the MPLS/VPN/VRF/VLAN Kool-Aid think it’s some magic that undoes fate sharing and proper engineering and planning, and that a few bytes for a label or VLAN tag make your data more secure. It’s possible to build a network that works without all these vendor-pushed tricks. I see where Roland is trying to go, and he’s in the “magic byte” realm where the extra label makes it “OOB”, whereas the rest of us just see 1’s and 0’s on the wire and know a bit is a bit regardless of tag-switching (the original name for MPLS) or IEEE 802.1q label. I’m sure there are people still doing ISL, but I’d rather not.

- Jared

Roland Dobbins

2:13 p.m.

On 2 Sep 2015, at 21:08, Jared Mauch wrote:

...

I see where Roland is trying to go, and he’s in the “magic byte” realm where the extra label makes it “OOB”, whereas the rest of us just see 1’s and 0’s on the wire and know a bit is a bit regardless of tag-switching (the original name for MPLS) or IEEE 802.1q label.

I know a bit is a bit, and it's on the same boxes and the same linecards and the same interfaces and the same RPs (if it comes to that). But keeping stuff separate at the IP logical network level is better than mixing it together, even on the same hardware. Doing it physically separately is best. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Mark Tinka

3:26 p.m.

On 2/Sep/15 16:13, Roland Dobbins wrote:

...

But keeping stuff separate at the IP logical network level is better than mixing it together, even on the same hardware.

But how, Roland? When the line card congests, it doesn't care that one bit was part of a VRF and the other wasn't. It all goes kaboom (even with the best of QoS intentions).

Mark.

Roland Dobbins

4:11 p.m.

On 2 Sep 2015, at 22:26, Mark Tinka wrote:

...

When the line card congests, it doesn't care that one bit was part of a VRF and the other wasn't. It all goes kaboom (even with the best of QoS intentions).

You don't necessarily have to put everything on the same fiber, interface, the same ASIC cluster, the same LC-CPU/-NPU, the same linecard, etc.

Fat-fingers in the global table or the Internet VRF or whatever won't cause problems in the management VRF, unless via route-leaking policies which allow them to do so or the kind of routing-table explosion which takes down a linecard or the whole box. A hardware casualty or software fault which takes down a linecard may not take down the whole box. And so forth.

iACLs are simpler, don't have to be updated so frequently to account for moves/adds/changes of the management infrastructure. It's easier to apply QoS policies to reserve bandwidth for telemetry and other management-plane traffic, etc. And so forth.

All this is highly variable and situationally-specific, but logical separation of management-plane traffic from production data-plane traffic is in general desirable, even as it's running on (at least some of) the same hardware. It isn't as good as true physical separation, but there's no sense in making the perfect the enemy of the merely good.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

jim deleskie

4:20 p.m.

Adding VRFs/VLANs/anything else to separate the traffic to reduce fate sharing is only adding complexity that will likely result in operator errors. While many of us have clue, even when we don't agree on the solutions, there are many more out there typing at routers at 2am, when even the simplest of configs will mix someone up and cause an outage. The stats prove out these types of errors are more likely to cause an outage than DDoS or anything else. Now if we could only build and sell devices to stop operator error.

On Wed, Sep 2, 2015 at 1:11 PM, Roland Dobbins <rdobbins@arbor.net> wrote:

...

On 2 Sep 2015, at 22:26, Mark Tinka wrote:

...
When the line card congests, it doesn't care that one bit was part of a VRF and the other wasn't. It all goes kaboom (even with the best of QoS intentions).

You don't necessarily have to put everything on the same fiber, interface, the same ASIC cluster, the same LC-CPU/-NPU, the same linecard, etc.

Fat-fingers in the global table or the Internet VRF or whatever won't cause problems in the management VRF, unless via route-leaking policies which allow them to do so or the kind of routing-table explosion which takes down a linecard or the whole box. A hardware casualty or software fault which takes down a linecard may not take down the whole box. And so forth.

iACLs are simpler, don't have to be updated so frequently to account for moves/adds/changes of the management infrastructure. It's easier to apply QoS policies to reserve bandwidth for telemetry and other management-plane traffic, etc. And so forth.

All this is highly variable and situationally-specific, but logical separation of management-plane traffic from production data-plane traffic is in general desirable, even as it's running on (at least some of) the same hardware. It isn't as good as true physical separation, but there's no sense in making the perfect the enemy of the merely good.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Roland Dobbins

4:30 p.m.

On 2 Sep 2015, at 23:20, jim deleskie wrote:

...

The stats prove out these types of errors are more likely to cause an outage than DDoS or anything else.

Completely concur that there are always complexity tradeoffs. And of course, the goal is not to have to type on routers at all, or at least minimally. Progress is being made in this arena, but as you indicate, it's unevenly distributed. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Mark Tinka

3:22 p.m.

On 2/Sep/15 16:08, Jared Mauch wrote:

...

It’s really because some people who drink the MPLS/VPN/VRF/VLAN Kool-Aid think it’s some magic that undoes fate sharing and proper engineering and planning, and that a few bytes for a label or VLAN tag make your data more secure.

It’s possible to build a network that works without all these vendor-pushed tricks. I see where Roland is trying to go, and he’s in the “magic byte” realm where the extra label makes it “OOB”, whereas the rest of us just see 1’s and 0’s on the wire and know a bit is a bit regardless of tag-switching (the original name for MPLS) or IEEE 802.1q label. I’m sure there are people still doing ISL, but I’d rather not.

There was a time when the early MPLS/VPN adopters built physically separate routers for MPLS traffic. When it became clear that this was not a good way to scale, they moved to building dedicated line cards in shared routers for MPLS/VPNs. As we see today, those that build - heaven forbid - "converged" networks tend to derive better ROI from their network infrastructure. I'd be hard-pressed to hear from even the largest of operators physically separating MPLS and IP traffic at the hardware and/or link level. As you, Jared, say, and as I said in a previous post, both MPLS and IP traffic follow the same data plane. The routing table separation construct does not survive chassis-wide failures.

Mark.

Roland Dobbins

3:26 p.m.

On 2 Sep 2015, at 22:22, Mark Tinka wrote:

...

As you, Jared, say, and as I said in a previous post, both MPLS and IP traffic follow the same data plane. The routing table separation construct does not survive chassis-wide failures.

Everyone here understands that. We also understand that pps budgets, ASIC resources, LC-CPU resources, etc. are held in common in such scenarios. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Mark Tinka

3:18 p.m.

On 2/Sep/15 12:11, Roland Dobbins wrote:

...

Sure. But it's better than mixing it in with customer traffic.

Does not make much of a difference - it's the same data plane infrastructure.

...

Ideally, it should be - that's what I was trying to get across. I understand that this isn't free, either from a capex or opex perspective.

We are in agreement. Mark.

Mark Tinka

6:01 a.m.

On 1/Sep/15 17:43, Roland Dobbins wrote:

...

Until there is.

As with everything in life... Mark.

Job Snijders

1 Sep

4:10 p.m.

On Tue, Sep 01, 2015 at 08:33:42AM -0700, Serge Vautour wrote:

...

For those that run Internet connected routers, how do you get your NetFlow data from the routers to your collectors? Do you let the flow export traffic use the same links as your customer traffic to route back to central collectors? Or do you send this traffic over private network management type path? If you send this traffic over the "Internet" (within your AS), are you worried about security?

To answer your first question: I see no issue in transporting flow export traffic over the same backbone used to serve customer traffic.

Not entirely security related, but a neat trick is to use a tool like 'samplicator' to distribute the UDP packets to all applications of interest. You'll find that on many router platforms you can only configure a limited number of netflow/sflow collectors, often fewer than the number of applications that need the data for dissemination. Especially if you have multiple independent instances of the application for redundancy purposes!

And, keep in mind, you can anycast the instances of 'samplicator' in your network :-)

https://github.com/sleinen/samplicator

Kind regards,
Job

Roland Dobbins

4:14 p.m.

On 1 Sep 2015, at 23:10, Job Snijders wrote:

...

To answer your first question: I see no issue in transporting flow export traffic over the same backbone used to serve customer traffic.

This is not good advice, for the reasons I stated previously in this thread. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Steve Meuse

5:08 p.m.

On Tue, Sep 1, 2015 at 12:14 PM, Roland Dobbins <rdobbins@arbor.net> wrote:

...

On 1 Sep 2015, at 23:10, Job Snijders wrote:

...
To answer your first question: I see no issue in transporting flow export traffic over the same backbone used to serve customer traffic.

This is not good advice, for the reasons I stated previously in this thread.

Your advice is not "one size fits all". I've done netflow over production links for two very large backbone networks. Over the combined 17(?) years, never saw a problem. If your network is likely to be partitioned by a small number of failures, that might be a different case, but you can't make blanket statements. -Steve

Roland Dobbins

5:12 p.m.

On 2 Sep 2015, at 0:08, Steve Meuse wrote:

...

Your advice is not "one size fits all".

Actually, it is. Large backbone networks have DCNs/OOBs, and that's where they export their NDE.

...

I've done netflow over production links for two very large backbone networks.

Did you manage your routers and switches and hosts and so forth in-band, too?

...

Over the combined 17(?) years, never saw a problem.

Until you do. Running flow telemetry in-band is penny-wise and pound-foolish, for networks of any size, in any circumstances. All management-plane traffic (and that's what flow telemetry is) should be segregated from the production network data plane. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Shane Ronan

5:18 p.m.

Roland,

While your way may be best practice, sometimes real life gets in the way of best practice.

Shane

On 9/1/15 1:12 PM, Roland Dobbins wrote:

...

On 2 Sep 2015, at 0:08, Steve Meuse wrote:

...
Your advice is not "one size fits all".

Actually, it is.

Large backbone networks have DCNs/OOBs, and that's where they export their NDE.

...
I've done netflow over production links for two very large backbone networks.

Did you manage your routers and switches and hosts and so forth in-band, too?

...
Over the combined 17(?) years, never saw a problem.

Until you do.

Running flow telemetry in-band is penny-wise and pound-foolish, for networks of any size, in any circumstances. All management-plane traffic (and that's what flow telemetry is) should be segregated from the production network data plane.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Niels Bakker

5:18 p.m.

* rdobbins@arbor.net (Roland Dobbins) [Tue 01 Sep 2015, 19:13 CEST]:

...

Running flow telemetry in-band is penny-wise and pound-foolish, for networks of any size, in any circumstances.

You're just wrong here. -- Niels.

Roland Dobbins

5:29 p.m.

On 2 Sep 2015, at 0:18, Niels Bakker wrote:

...

You're just wrong here.

Sorry, I'm not. I've seen what happens when flow telemetry is 'squeezed out' by pipe-filling DDoS attacks, interrupted by fat-fingers, etc. It'll happen to you, one day. And then you'll understand. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Shane Ronan

5:32 p.m.

So in your world, the money always exists for a separate flow telemetry network?

On 9/1/15 1:29 PM, Roland Dobbins wrote:

...

On 2 Sep 2015, at 0:18, Niels Bakker wrote:

...
You're just wrong here.

Sorry, I'm not. I've seen what happens when flow telemetry is 'squeezed out' by pipe-filling DDoS attacks, interrupted by fat-fingers, etc.

It'll happen to you, one day. And then you'll understand.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Roland Dobbins

5:36 p.m.

On 2 Sep 2015, at 0:32, Shane Ronan wrote:

...

So in your world, the money always exists for a separate flow telemetry network?

It should've already been spent for an OOB/DCN network, which should've been provisioned with flow telemetry in mind. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

George, Wes

7:38 p.m.

On 9/1/15, 1:36 PM, "NANOG on behalf of Roland Dobbins" <nanog-bounces@nanog.org on behalf of rdobbins@arbor.net> wrote:

...

It should've already been spent for an OOB/DCN network, which should've been provisioned with flow telemetry in mind.

I'm going to interpret that "should" in the same way as the MUST in RFC6919. :-)

Yes, it's a good practice, but like most other proactive security measures, it is extremely hard to justify spending money on it just to avoid the risk that in-band telemetry breaks fantastically when it is needed most. Though you could provide a little insurance against the problem you're highlighting here via a QoS policy that prioritizes flow data over customer traffic.

Several of the OOB networks/designs I'm familiar with significantly predate the entire concept of flow telemetry, as well as my own networking career, and are still rocking the same set of Cisco 2500 routers with async cards (many with uptimes measured in years) and 64k leased lines or dialup on demand they've been using for literally almost 2 decades. When one of those ancient devices dies of old age, you scrounge for the cheapest equivalent you can find to replace it to maintain your oob access to the 9600/8/1/none console ports for when things have gone truly pear-shaped.

Often there is a separate management network that can deal with ethernet speeds, but it's separate for security reasons and not always as rigidly independent from the in-band network for connectivity, i.e., it might be a VPN riding over the regular network and thus not completely protected from the problem you're concerned about.

Thanks,
Wes

Anything below this line has been added by my company’s mail server, I have no control over it. -----------

...

This E-mail and any of its attachments may contain Time Warner Cable proprietary information, which is privileged, confidential, or subject to copyright belonging to Time Warner Cable. This E-mail is intended solely for the use of the individual or entity to which it is addressed. If you are not the intended recipient of this E-mail, you are hereby notified that any dissemination, distribution, copying, or action taken in relation to the contents of and attachments to this E-mail is strictly prohibited and may be unlawful. If you have received this E-mail in error, please notify the sender immediately and permanently delete the original and any copy of this E-mail and any printout.

Roland Dobbins

11:03 p.m.

On 2 Sep 2015, at 2:38, George, Wes wrote:

...

Often there is a separate management network that can deal with ethernet speeds, but it's separate for security reasons and not always as rigidly independent from the in-band network for connectivity, i.e., it might be a VPN riding over the regular network and thus not completely protected from the problem you're concerned about.

Sure, or a VRF, or whatever. While that's not ideal, it's far better than doing management-plane stuff inband in the production network, though. And those 2500 console concentrator connections are a great resource to have when everything goes haywire and you need something that lets you get to and actually type on the console. I'm not knocking them, and I understand that old, grandfathered equipment is used for these applications, and understand that in many cases they're underprovisioned for flow telemetry. Which is why using VLANs, VRFs, whatever on the production network gear is completely understandable, and a lot of folks do just as you say. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>
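As a rough illustration of why a 64k leased line is underprovisioned for flow telemetry, the following Python sketch estimates NetFlow v5 export bandwidth. The v5 sizes used (24-byte header, 48-byte flow records, at most 30 records per datagram) come from the v5 export format; everything else is an assumption for illustration: datagrams are assumed full, only IPv4 and UDP headers are counted (no layer-2 overhead), and the 1,000 records/s figure is a made-up load, not a number from this thread.

```python
V5_HEADER_BYTES = 24      # NetFlow v5 export header
V5_RECORD_BYTES = 48      # one v5 flow record
V5_MAX_RECORDS = 30       # maximum records per v5 datagram
IP_UDP_OVERHEAD = 20 + 8  # IPv4 header + UDP header, no options

# Size on the wire (at the IP layer) of one full export datagram.
FULL_DATAGRAM_BYTES = (
    V5_HEADER_BYTES + V5_MAX_RECORDS * V5_RECORD_BYTES + IP_UDP_OVERHEAD
)


def export_bits_per_second(records_per_second: float) -> float:
    """Estimate export bandwidth, assuming every datagram is full."""
    datagrams_per_second = records_per_second / V5_MAX_RECORDS
    return datagrams_per_second * FULL_DATAGRAM_BYTES * 8


def max_records_per_second(link_bits_per_second: float) -> float:
    """Largest record rate a link could carry if it carried nothing else."""
    bits_per_record = FULL_DATAGRAM_BYTES * 8 / V5_MAX_RECORDS
    return link_bits_per_second / bits_per_record


if __name__ == "__main__":
    hypothetical_load = 1_000  # records/s, illustrative only
    print(f"{export_bits_per_second(hypothetical_load) / 1000:.0f} kbit/s needed")
    print(f"{max_records_per_second(64_000):.0f} records/s fit in 64 kbit/s")
```

Under these assumptions the hypothetical 1,000 records/s needs roughly 398 kbit/s, while a 64 kbit/s line tops out at about 160 records/s before any console or management traffic is accounted for.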

Niels Bakker

2 Sep

2:11 p.m.

* rdobbins@arbor.net (Roland Dobbins) [Wed 02 Sep 2015, 01:06 CEST]:

...

Sure, or a VRF, or whatever.

You just moved the goalposts. :> -- Niels.

Tarko Tikan

1 Sep

7:47 p.m.

hey,

...

It should've already been spent for an OOB/DCN network, which should've been provisioned with flow telemetry in mind.

Bad advice. No amount of money will fix major platforms that are not happy to export flow telemetry via router management ports. Sometimes it can be done via nasty vrf leaking hacks, sometimes it cannot be done at all. Management ports are typically directly connected to routing engines while netflow data is generated in hardware in PFE. In-band netflow works on all platforms without such issues. -- tarko

Valdis.Kletnieks@vt.edu

8:13 p.m.

On Tue, 01 Sep 2015 22:47:09 +0300, Tarko Tikan said:

...

hey,

...
It should've already been spent for an OOB/DCN network, which should've been provisioned with flow telemetry in mind.

Bad advice. No amount of money will fix major platforms that are not happy to export flow telemetry via router management ports.

And that box ended up in your rack, why exactly? <insert excuses that boil down to "We were willing to accept gear that didn't have this functionality because we were OK with sending flow telemetry over the inband network">

Niels Bakker

8:24 p.m.

...

On Tue, 01 Sep 2015 22:47:09 +0300, Tarko Tikan said:

...
Bad advice. No amount of money will fix major platforms that are not happy to export flow telemetry via router management ports.

Correct. And for a proper network you may not wish to have those connections from in-band ports to your OOB/management network everywhere.

* Valdis.Kletnieks@vt.edu (Valdis.Kletnieks@vt.edu) [Tue 01 Sep 2015, 22:13 CEST]:

...

And that box ended up in your rack, why exactly?

Because variety of flow telemetry delivery options isn't the #1 ranked purchasing decider. Otherwise no Cisco would ever have been sold.

...

<insert excuses that boil down to "We were willing to accept gear that didn't have this functionality because we were OK with sending flow telemetry over the inband network">

Everything is a tradeoff. Welcome to the real world, where we have to make things work rather than pose on mailing lists about what we think other people should have. -- Niels.

Roland Dobbins

10:36 p.m.

On 2 Sep 2015, at 3:24, Niels Bakker wrote:

...

Because variety of flow telemetry delivery options isn't the #1 ranked purchasing decider.

Actually, it is more often than you think. No use routing packets if you can't see what they do.

...

Otherwise no Cisco would ever have been sold.

Which is utter nonsense, of course, since Cisco a) invented flow telemetry and b) has been the consistent leader in innovating flow telemetry (FNF, IPFIX, anyone?). The EARL6/EARL7 problems are the only stumbles Cisco has made in this regard. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Jared Mauch

10:38 p.m.

...

On Sep 1, 2015, at 6:36 PM, Roland Dobbins <rdobbins@arbor.net> wrote:

On 2 Sep 2015, at 3:24, Niels Bakker wrote:

...
Because variety of flow telemetry delivery options isn't the #1 ranked purchasing decider.

Actually, it is more often than you think. No use routing packets if you can't see what they do.

...
Otherwise no Cisco would ever have been sold.

Which is utter nonsense, of course, since Cisco a) invented flow telemetry and b) has been the consistent leader in innovating flow telemetry (FNF, IPFIX, anyone?). The EARL6/EARL7 problems are the only stumbles Cisco has made in this regard.

Roland,

Please stop digging,

Sounds like you haven’t used Cisco recently. I’m happy to elaborate privately.

- Jared

Roland Dobbins

10:51 p.m.

On 2 Sep 2015, at 5:38, Jared Mauch wrote:

...

Please stop digging,

Since I'm not digging, I've no reason to stop. I see and deal with the various quirks of more different platforms exporting flow telemetry than most folks, all day, every day, so I know just a little bit about this topic.

...

Sounds like you haven’t used Cisco recently.

I use Cisco all the time, thanks. They aren't perfect - no vendor is. They have various issues with their NetFlow implementations on various platforms - for example, bursts of wildly inaccurate flow statistics from CRS boxes when a linecard is rebooted, a problem which has persisted for years and is just now being addressed. Odd stuff with EARL8 on Sup2T/DFC4 in certain configurations, and so forth. But Niels is grossly exaggerating. I get very usable flow telemetry from them in many, many networks. I deal with flow telemetry from many, many vendors/platforms, and I can confidently assert that Cisco are nowhere near the bottom of the heap when it comes to the verisimilitude and functionality of their flow telemetry export. Quite the opposite.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Jared Mauch

11:06 p.m.

...

On Sep 1, 2015, at 6:51 PM, Roland Dobbins <rdobbins@arbor.net> wrote:

On 2 Sep 2015, at 5:38, Jared Mauch wrote:

...
Please stop digging,

Since I'm not digging, I've no reason to stop. I see and deal with the various quirks of more different platforms exporting flow telemetry than most folks, all day, every day, so I know just a little bit about this topic.

You are, Avi has said that the number of people with a network is outnumbered about 50:1 using his most favorable numbers. This means for your one example there are 50 people not doing this and the world hasn’t ended for them. If you aren’t listening to Avi, please trust me, you don’t need your own OOB network for flow, nor is putting your flow there going to provide you some magical value. If you can’t provision enough bandwidth for your telemetry data, you will obviously need to prune it back. 1:10k sampling works and you don’t need much more than that unless you’re at extremely low bitrates. Most attacks last under 1 hour and even the small ones shout out in netflow data doing a simple hash sort algorithm with the proper keys. You can even use QoS to mitigate if your goal is attack traffic as they’re mostly UDP based attacks, see: https://tools.ietf.org/html/draft-byrne-opsec-udp-advisory-00 for some advice/input. I’ve shared my own input at recent NANOG meetings, including policers to keep the attacks under control.
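A minimal sketch of the "simple hash sort with the proper keys" idea mentioned above, assuming sampled flow records have already been decoded into dictionaries. The field names, the choice of key (destination IP, protocol, destination port) and the sample records are illustrative assumptions, not taken from any particular collector.

```python
from collections import defaultdict

SAMPLING_RATE = 10_000  # 1:10k packet sampling, as discussed above


def top_destinations(records, sampling_rate=SAMPLING_RATE, n=3):
    """Aggregate sampled flow records by (dst_ip, proto, dst_port) and
    return the n keys with the largest estimated byte counts."""
    totals = defaultdict(int)
    for rec in records:
        key = (rec["dst_ip"], rec["proto"], rec["dst_port"])
        # Scale sampled bytes back up to an estimate of the real volume.
        totals[key] += rec["bytes"] * sampling_rate
    # Sort the hash table by estimated bytes, largest first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical sampled records (documentation address ranges).
records = [
    {"dst_ip": "192.0.2.10", "proto": 17, "dst_port": 123, "bytes": 468},
    {"dst_ip": "192.0.2.10", "proto": 17, "dst_port": 123, "bytes": 468},
    {"dst_ip": "198.51.100.7", "proto": 6, "dst_port": 443, "bytes": 1200},
    {"dst_ip": "192.0.2.10", "proto": 17, "dst_port": 123, "bytes": 468},
]

for key, est_bytes in top_destinations(records):
    print(key, est_bytes)
```

With these made-up records the UDP/123 destination sorts to the top, which is the sense in which even a small attack can stand out in heavily sampled data.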

...

...
Sounds like you haven’t used Cisco recently.

I use Cisco all the time, thanks. They aren't perfect - no vendor is. They have various issues with their NetFlow implementations on various platforms - for example, bursts of wildly inaccurate flow statistics from CRS boxes when a linecard is rebooted, a problem which has persisted for years and is just now being addressed. Odd stuff with EARL8 on Sup2T/DFC4 in certain configurations, and so forth.

I’m not talking about datacenter class equipment that you seem so focused on like the Earl7 with the TICO etc that did software sampling out of the hardware tcam and would be overrun.

...

But Niels is grossly exaggerating. I get very usable flow telemetry from them in many, many networks. I deal with flow telemetry from many, many vendors/platforms, and I can confidently assert that Cisco are nowhere near the bottom of the heap when it comes to the verisimilitude and functionality of their flow telemetry export. Quite the opposite

What people often don’t see is true “scale”[1] of netflow. When you have enough attributes or want to actually look at your IPv6 there have been significant shortcomings. We had to remind the patent holder for netflow how to implement it for their own hardware.

- Jared

aside: will you be in Yokohama? We should get lunch/dinner.

[1] - I hate this word, vendors use it as an excuse to hardcode limits and to not properly respond to valid use cases

Roland Dobbins

2 Sep 2 Sep

12:01 a.m.

On 2 Sep 2015, at 6:06, Jared Mauch wrote:

...

You are, Avi has said that the number of people with a network is outnumbered about 50:1 using his most favorable numbers.

Again, to clarify - I count VLANs/VRFs as being sufficiently out-of-band to handle flow telemetry on a reasonable basis without mixing it in with customer traffic. That changes the ratio.

...

This means for your one example there are 50 people not doing this and the world hasn’t ended for them. If you aren’t listening to Avi, please trust me, you don’t need your own OOB network for flow, nor is putting your flow there going to provide you some magical value.

I agree with you, Avi, and others that a dedicated OOB network *just for flow telemetry* doesn't make economic sense in most (any?) scenarios. What I'm saying is that it oughtn't to be mixed in with customer data-plane traffic. Ideally, all management-plane traffic would traverse a separate physical infrastructure. Since we don't live in an ideal world, virtual separation is generally Good Enough.

...

1:10k sampling works and you don’t need much more than that unless you’re at extremely low bitrates. Most attacks last under 1 hour and even the small ones shout out in netflow data doing a simple hash sort algorithm with the proper keys

Concur 100%. I spend a lot of time explaining to customers that no, they don't need/want 1:1 even if they could get it, and that the 'wake' left by attack traffic stands out very well even at relatively high sampling ratios. Most of the network-oriented folks seem to grasp this pretty quickly. It's generally the 'security' types who often seem conceptually/attitudinally incapable of understanding these principles.
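One way to see why attack traffic stands out even at high sampling ratios is a rough error estimate. The sketch below assumes random 1:N packet sampling modelled as a binomial process, which is a simplification (real platforms may sample deterministically or per flow); the traffic rates are hypothetical.

```python
import math


def sampled_estimate_error(packets_per_second, duration_s, sampling_rate):
    """Return (expected sampled packets, approximate relative standard error)
    for 1:sampling_rate random packet sampling, using a binomial model."""
    total = packets_per_second * duration_s
    p = 1.0 / sampling_rate
    expected = total * p
    # Standard deviation of a binomial count, divided by its mean.
    rel_err = math.sqrt(total * p * (1 - p)) / expected
    return expected, rel_err


# Hypothetical attack: 1 Mpps for 60 seconds at 1:10k sampling.
attack = sampled_estimate_error(1_000_000, 60, 10_000)
# Hypothetical small flow: 100 pps for 60 seconds at the same ratio.
small = sampled_estimate_error(100, 60, 10_000)

print("attack:", attack)
print("small flow:", small)
```

Under these assumptions the attack yields about 6000 samples with a relative error of roughly 1.3%, while the small flow yields less than one expected sample and an error above 100%. High-volume traffic is measured well; only low-bitrate flows suffer.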

...

. You can even use QoS to mitigate if your goal is attack traffic as they’re mostly UDP based attacks, see: https://tools.ietf.org/html/draft-byrne-opsec-udp-advisory-00 for some advice/input.

I know you do this, and I understand why. Not everyone agrees with this and does it, and I also understand why (not). ntp is easy, because there's the timesync packet-size classification hook. It gets a little dicier with other things.

...

I’ve shared my own input at recent NANOG meetings, including policers to keep the attacks under control.

And it's valuable experience to share, nobody disputes that.

...

I’m not talking about datacenter class equipment that you seem so focused on like the Earl7 with the TICO etc that did software sampling out of the hardware tcam and would be overrun.

I'm pretty sure the CRSes I referred to with the linecard-reboot issue in my example aren't datacenter-class equipment. ;>

...

What people often don’t see is true “scale”[1] of netflow. When you have enough attributes or want to actually look at your IPv6 there have been significant shortcomings. We had to remind the patent holder for netflow how to implement it for their own hardware.

This is very true. IPv6 flow telemetry is another area in which IPv4/IPv6 feature parity lags. Because of your focus on large-scale IPv6 deployment over the course of many years, you see and experience a lot more IPv6-related deficiencies than most folks.

...

aside: will you be in Yokohama? We should get lunch/dinner.

Yes, and yes. ;>

...

[1] - I hate this word, vendors use it as an excuse to hardcode limits and to not properly respond to valid use cases

Concur 100%. Another annoying vendor trait is use-case obsession. In many contexts, the right answer is to understand that there is a baseline plateau of vitally necessary scaling (that word, again) capacity and required functionality which is universally applicable, irrespective of variations in particular use cases. I recently had a discussion with someone who was asking me how many attack sources one typically sees in a given DDoS attack. My response was that there is no 'typical'; and that for IPv4, the theoretical potential is 2^32 sources, while in IPv6, the theoretical potential is for 2^128 sources. It was a light-bulb moment. ;> ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Nick Hilliard

1 Sep 1 Sep

8:34 p.m.

On 01/09/2015 21:13, Valdis.Kletnieks@vt.edu wrote:

...

On Tue, 01 Sep 2015 22:47:09 +0300, Tarko Tikan said:

...
Bad advice. No amount of money will fix major platforms that are not happy to export flow telemetry via router management ports.

And that box ended up in your rack, why exactly?

<insert excuses that boil down to "We were willing to accept gear that didn't have this functionality because we were OK with sending flow telemetry over the inband network">

this approach is fine for bitty boxes handling small quantities of traffic. If you want to handle netflow data export for large amounts of traffic, it would be pretty dumb to push it through the management plane of the router. Nick

Roland Dobbins

10:37 p.m.

On 2 Sep 2015, at 3:34, Nick Hilliard wrote:

...

If you want to handle netflow data export for large amounts of traffic, it would be pretty dumb to push it through the management plane of the router.

Concur 100%. You must use a port capable of doing so. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Jared Mauch

10:49 p.m.

...

On Sep 1, 2015, at 6:37 PM, Roland Dobbins <rdobbins@arbor.net> wrote:

On 2 Sep 2015, at 3:34, Nick Hilliard wrote:

...
If you want to handle netflow data export for large amounts of traffic, it would be pretty dumb to push it through the management plane of the router.

Concur 100%. You must use a port capable of doing so.

My experience in running large networks is these ports often can’t handle the traffic involved. The packet path in a juniper (for example to go from the PFE -> RE -> Ethernet) is very sensitive to the jitter introduced by increased traffic loads and may result in the box becoming unstable. Other platforms (e.g.: IOS-XR based) have issues with the MgmtEther interfaces which make them inoperable for many use-cases. There are many technical details that are easily overlooked by those not using the routers to their abilities, so a small network (as Wes mentioned before with 2500s/T1s) still as OOB is unlikely to see data rates comparable to what is seen from a large router exporting data from hundreds of gigs of flows. Often net flow vendors tell customers things that create more flow records which equals slightly higher data resolution but no actual net difference in results except for the lowest of bitrates. Making sure your flow implementation is optimized (ingress only, relevant links only) is one part of having it scale. I’ve seen many a solution that scales poorly or requires dozens of boxes for datasets that don’t require it. It’s easy to say over specify for an attack because of the “Think of the Children^WDDoS” mentality that exists, but when you are on the receiving end of a large attack there are better tools to use. - Jared

Roland Dobbins

2 Sep 2 Sep

12:05 a.m.

On 2 Sep 2015, at 5:49, Jared Mauch wrote:

...

Other platforms (e.g.: IOS-XR based) have issues with the MgmtEther interfaces which make them inoperable for many use-cases.

I'm agreeing with you. Dedicated management ports on many boxes don't actually support important management-plane functions, like flow telemetry - which is nuts, but that's what happens.

...

There are many technical details that are easily overlooked by those not using the routers to their abilities, so a small network (as Wes mentioned before with 2500s/T1s) still as OOB is unlikely to see data rates comparable to what is seen from a large router exporting data from hundreds of gigs of flows.

That's true. I understand that even on large networks, the OOB/DCN is built from old, grandfathered equipment. I spend a lot of time helping network operators calculate optimal flow sampling rates, flow cache sizes, etc., and an important consideration in making optimal configuration choices is what the OOB/DCN network can handle.

...

Often net flow vendors tell customers things that create more flow records which equals slightly higher data resolution but no actual net difference in results except for the lowest of bitrates.

Concur 100%. I spend a non-trivial amount of time talking folks down from the assumption that unnecessarily-low flow sampling ratios are required (these are mainly 'security' folks, not network engineers). ----------------------------------- Roland Dobbins <rdobbins@arbor.net>
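A back-of-the-envelope sketch of the kind of sizing described above, checking whether flow export fits on an OOB/DCN link. It assumes NetFlow v5 framing (24-byte header, 48-byte records, up to 30 records per datagram) over UDP/IPv4 and ignores layer-2 overhead. The flow-record rate, link speeds and the 50% share are hypothetical inputs; v9/IPFIX record sizes depend on the template and would differ.

```python
import math

V5_HEADER_BYTES = 24   # NetFlow v5 export packet header
V5_RECORD_BYTES = 48   # one NetFlow v5 flow record
V5_MAX_RECORDS = 30    # maximum records per export datagram
UDP_IPV4_BYTES = 28    # 8-byte UDP header + 20-byte IPv4 header


def export_bps(flow_records_per_second):
    """Estimate export bandwidth in bits/s for a given flow-record rate,
    assuming full NetFlow v5 datagrams and no layer-2 framing."""
    datagrams = math.ceil(flow_records_per_second / V5_MAX_RECORDS)
    payload = flow_records_per_second * V5_RECORD_BYTES
    overhead = datagrams * (V5_HEADER_BYTES + UDP_IPV4_BYTES)
    return (payload + overhead) * 8


def fits_budget(flow_records_per_second, link_bps, max_share=0.5):
    """True if the estimated export stays within max_share of the link."""
    return export_bps(flow_records_per_second) <= link_bps * max_share


rate = 3000  # hypothetical flow records per second from one router
print(export_bps(rate))               # estimated bits per second
print(fits_budget(rate, 2_000_000))   # hypothetical 2 Mb/s OOB link
print(fits_budget(rate, 10_000_000))  # hypothetical 10 Mb/s OOB link
```

If the estimate does not fit, the usual levers are the ones discussed in this thread: a higher sampling ratio, ingress-only export on relevant links, or a bigger management path.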

Chuck Church

1 Sep 1 Sep

8:45 p.m.

Agree. Most OOB is lacking redundancy too, so a single failure can really take the shine off an OOB deployment. Especially when you've put your management traffic on it, including radius traffic, and you're using 802.1X. Found that out the hard way a few years ago. Chuck -----Original Message----- From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Tarko Tikan Sent: Tuesday, September 01, 2015 3:47 PM To: nanog@nanog.org Subject: Re: NetFlow - path from Routers to Collector hey,

...

It should've already been spent for an OOB/DCN network, which should've been provisioned with flow telemetry in mind.

Roland Dobbins

10:39 p.m.

On 2 Sep 2015, at 3:45, Chuck Church wrote:

...

Most OOB is lacking redundancy too, so a single failure can really take the shine off an OOB deployment.

Even if you're using old, grandfathered equipment for it, there's no reason why your OOB/DCN can't have a reasonable degree of redundancy. Since, it's like, *what you use to control your entire network*. Underinvesting in management capabilities and capacities has always been a problem, of course. Some organizations just won't learn until they've gone through a disaster or three. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Jared Mauch

10:57 p.m.

...

On Sep 1, 2015, at 6:39 PM, Roland Dobbins <rdobbins@arbor.net> wrote:

On 2 Sep 2015, at 3:45, Chuck Church wrote:

...
Most OOB is lacking redundancy too, so a single failure can really take the shine off an OOB deployment.

Even if you're using old, grandfathered equipment for it, there's no reason why your OOB/DCN can't have a reasonable degree of redundancy. Since, it's like, *what you use to control your entire network*.

Most networks use inband to manage them.

...

Underinvesting in management capabilities and capacities has always been a problem, of course. Some organizations just won't learn until they've gone through a disaster or three.

Yes. let me know when the vendors catch up in this area. I often see people say to create a new network as job security vs making the inband network survive attacks or be provisioned properly. Most people I’ve seen have little data or insight into their networks, or don’t have the level that they would desire as tools are expensive or impossible to justify due to capital costs. Tossing in a recurring opex cost of DC XC fee + transport + XC fee + redundant aggregation often doesn’t have the ROI you infer here. I’ve put together some models in this area. It seems to me the DC/real estate companies involved could make a lot (more) money by offering an OOB service that is 10Mb/s flat-rate for the same as an XC fee and compete with their customers. Things continue to be a challenge as less equipment works with a serial console and the expectation of developers of these embedded solutions don’t take into account low bitrate connections that are often used in last-resort situations. We have a well oiled set of processes and checklists to monitor and test our management network. Patrick Gilmore has personally mocked me because of its method and technique, but the reality is it works. - Jared

Roland Dobbins

10:33 p.m.

On 2 Sep 2015, at 2:47, Tarko Tikan wrote:

...

In-band netflow works on all platforms without such issues.

There's no law that says that you must only plug designated management ports into OOB/DCN management networks. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Jared Mauch

5:44 p.m.

I think the key here is that Roland isn't often constrained by these financial considerations. I would respectfully disagree with Roland here and agree with Job, Niels, etc. A few networks have robust out of band networks, but most I've seen have an interesting mixture of things and inband is usually the best method. Those that do have "separate" networks may actually be CoC services from another department in the same company riding the same P/PE devices (sometimes routers). I've seen oob networks on DSL, datacenter wifi or cable swaps through the fence to an adjacent rack. An oob network need not be high bandwidth enough to do netflow sampling, this is well regarded as a waste of money by many as the costs for these can often be orders of magnitude more compared to a pure-IP or internet service. I'll say this ranks up there with people who think MPLS VPN == Encryption. It's not unless you think a few byte label is going to confuse people.

- Jared

On Tue, Sep 01, 2015 at 01:32:04PM -0400, Shane Ronan wrote:

...

So in your world, the money always exists for a separate flow telemetry network?

On 9/1/15 1:29 PM, Roland Dobbins wrote:

...
On 2 Sep 2015, at 0:18, Niels Bakker wrote:

...
You're just wrong here.

Sorry, I'm not. I've seen what happens when flow telemetry is 'squeezed out' by pipe-filling DDoS attacks, interrupted by fat-fingers, etc.

It'll happen to you, one day. And then you'll understand.

----------------------------------- Roland Dobbins <rdobbins@arbor.net>

-- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.

Roland Dobbins

11:22 p.m.

On 2 Sep 2015, at 0:44, Jared Mauch wrote:

...

I think the key here is that Roland isn't often constrained by these financial considerations.

That's entirely true. I deal every day with customers who are, though.

...

I would respectfully disagree with Roland here and agree with Job, Niels, etc.

I understand where you and they are coming from, in this regard. I just disagree, as well.

...

A few networks have robust out of band networks, but most I've seen have an interesting mixture of things

Concur 100%.

...

and inband is usually the best method.

Let me be clear - OOB for flow telemetry can be actually provisioned on the same boxes which are handling the production network traffic. It isn't ideal, but it's better than running it truly inband in the production network, mixed in with customer traffic. VLANs, VRFs, whatever are a reasonable compromise, and a lot of folks do this. Inband is a huge risk, especially in a world of multi-hundred gb/sec reflection/amplification attacks (not to mention the other catastrophic failure scenarios). I know you sink a lot of UDP at the edges of your network to ameliorate this problem, but not all operators do that or agree with it either in principle or as a matter of optimal utility. I understand that this sort of thing is a decision that all network operators must make for themselves based upon their knowledge of their own networks and customer needs.

...

Those that do have "separate" networks may actually be CoC services from another department in the same company riding the same P/PE devices (sometimes routers).

Yes, that's what I'm getting at above. It isn't ideal, but there's no reason to make the perfect the enemy of the merely good, agreed.

...

I've seen oob networks on DSL, datacenter wifi or cable swaps through the fence to an adjacent rack.

Absolutely. All kinds of creative lashups to get console access in difficult situations (and, as you noted previously, an increasing number of devices don't support serial console at all, which is highly annoying). ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Frank Bulk

10 Sep 10 Sep

2:37 p.m.

Does anyone else have a serial to IP dongle for devices that are IP only? That dongle would need to have telnet and SSH support. Or an IP-to-IP dongle, that would support a routing table? There's Brocade kit that has a mgmt. port, but it doesn't have its own routing table (they now have a mgmt. vrf in some software releases), making it local-only or something you have to use some kind of pseudo-NAT (all public IPs are translated to mgmt-network IPs). Frank -----Original Message----- From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Roland Dobbins Sent: Tuesday, September 01, 2015 6:23 PM To: nanog@nanog.org Subject: Re: NetFlow - path from Routers to Collector <snip> Absolutely. All kinds of creative lashups to get console access in difficult situations (and, as you noted previously, an increasing number of devices don't support serial console at all, which is highly annoying). ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

White, Andrew

3:33 p.m.

$50 on ebay http://www.ebay.com/sch/i.html?_from=R40&_trksid=p2050601.m570.l1313.TR12.TRC2.A0.H0.Xconsole+server.TRS0&_nkw=console+server&_sacat=0 Andrew White Desk: 314.394-9594 | Cell: 314.308-7730 NetOps Consultant, DAS DNS group Charter Communications 12405 Powerscourt Drive St. Louis, MO 63131

Todd K Grand

8:29 p.m.

Mikrotik router-boards can be used as a serial to IP converter. Complete with rip, ospf, bgp, etc.

Niels Bakker

1 Sep 1 Sep

5:46 p.m.

* rdobbins@arbor.net (Roland Dobbins) [Tue 01 Sep 2015, 19:30 CEST]:

...

On 2 Sep 2015, at 0:18, Niels Bakker wrote:

...
You're just wrong here.

Sorry, I'm not. I've seen what happens when flow telemetry is 'squeezed out' by pipe-filling DDoS attacks, interrupted by fat-fingers, etc.

This is the dumbest thing I've read on this mailing list in a while. (On the other hand, I don't read most threads.)

...

It'll happen to you, one day. And then you'll understand.

Who is saying I haven't done all of the above already? -- Niels.

Roland Dobbins

11:10 p.m.

On 2 Sep 2015, at 0:46, Niels Bakker wrote:

...

This is the dumbest thing I've read on this mailing list in a while.

It happens. You can deny it all you like, but I've seen it happen, and the resultant confusion and additional time to resolve problems it causes. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Leo Bicknell

8:22 p.m.

In a message written on Wed, Sep 02, 2015 at 12:29:25AM +0700, Roland Dobbins wrote:

...

On 2 Sep 2015, at 0:18, Niels Bakker wrote:

...
You're just wrong here.

Sorry, I'm not. I've seen what happens when flow telemetry is 'squeezed out' by pipe-filling DDoS attacks, interrupted by fat-fingers, etc.

It'll happen to you, one day. And then you'll understand.

Ah, I see your mistake, you're thinking everyone cares about that problem. They don't. Good, fast, cheap, pick two. You've selected good. Some people pick fast and cheap. They are not wrong, you are not right. Just a different lifestyle choice. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/

Mark Tinka

2 Sep 2 Sep

6:08 a.m.

On 1/Sep/15 19:12, Roland Dobbins wrote:

...

Running flow telemetry in-band is penny-wise and pound-foolish, for networks of any size, in any circumstances. All management-plane traffic (and that's what flow telemetry is) should be segregated from the production network data plane.

Looking at the span of my network, when I am building an OoB platform to maintain management access to any device that may lose in-band connectivity, I'll be honest, moving flow data across that is the least of my worries. Simply because the costs and effort associated with getting OoB management plane access that is cleanly separated from in-band topologies does not always encourage the addition of traffic that could kill the case for the same. When my device is down, my primary objective is to get it back up, fast. Flow data being lost is less of a problem. Mark.

Pierfrancesco Caci

6:25 a.m.

...

"Roland" == Roland Dobbins <rdobbins@arbor.net> writes:

Roland> On 2 Sep 2015, at 0:08, Steve Meuse wrote:
>> Your advice is not "one size fits all".

Roland> Actually, it is.

Roland> Large backbone networks have DCNs/OOBs, and that's where they export
Roland> their NDE.

2 out of 2 large backbone networks I've experience with use inband for flow export.

-- Pierfrancesco Caci, ik5pvx

Roland Dobbins

9:43 a.m.

On 2 Sep 2015, at 13:25, Pierfrancesco Caci wrote:

...

2 out of 2 large backbone networks I've experience with use inband for flow export.

There was supposed to be a 'should' in there, apologies. I've already clarified that VLAN/VRF is Good Enough for flow telemetry, and that most who separate it out from customer traffic do so logically. There are a few very large networks who do this completely physically out-of-band; they made the decision that they were going to invest in the ability to ship telemetry around and analyze it, and spent a lot of capex and opex to be able to do just that. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Serge Vautour

4:29 p.m.

Hello again,

Well, this generated a bit more discussion than I was expecting. I've retained the following from all your comments:

-Doing flow export over an OOB network can help make sure you still "see" your network during a DDoS
-If we do this over an OOB network, it may not work over the OOB port on the RE/RSP.

I do have some specific questions for the folks who are OK with doing this inband:

-Are you concerned with someone intercepting the Flow streams? I assume if someone has the ability to do so, you've got bigger problems.
-If we make the assumption that someone can intercept the Flow stream, do you think the data in the stream can be used for anything? It's just L3 & L4 headers. In other words, do you feel an OOB network is required to secure the flow data?

Thanks again, your comments are very helpful.

Serge

Roland Dobbins

4:32 p.m.

On 2 Sep 2015, at 23:29, Serge Vautour wrote:

...

I assume if someone has the ability to do so, you've got bigger problems.

This is the key, IMHO. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>

Baldur Norddahl

5:03 p.m.

We use the VRF approach not because we think this will give us more stability, i.e. no fate sharing, but because it is best practice from a security perspective. We keep our internal network separated from customer traffic for the same reason our customers run firewalls. Minimize the attack surface. Customers or people from the internet should not be able to even attempt hacking the infrastructure. They should not be able to send packets that will get routed to the collector. ACLs are a poor man's solution compared to running in a VRF or equivalent (VLAN).

Regards

Baldur

On 02/09/2015 at 18.31, "Serge Vautour" <sergevautour@yahoo.ca> wrote:

...

Hello again,

Well, this generated a bit more discussion than I was expecting. I've retained the following from all your comments:

-Doing flow export over an OOB network can help make sure you still "see" your network during a DDoS
-If we do this over an OOB network, it may not work over the OOB port on the RE/RSP.

I do have some specific questions for the folks who are OK with doing this inband:

-Are you concerned with someone intercepting the Flow streams? I assume if someone has the ability to do so, you've got bigger problems.
-If we make the assumption that someone can intercept the Flow stream, do you think the data in the stream can be used for anything? It's just L3 & L4 headers. In other words, do you feel an OOB network is required to secure the flow data?

Thanks again, your comments are very helpful.

Serge

-------------------------------------------- On Tue, 9/1/15, Serge Vautour <sergevautour@yahoo.ca> wrote:

Subject: NetFlow - path from Routers to Collector To: nanog@nanog.org Received: Tuesday, September 1, 2015, 12:33 PM

Hello,

For those than run Internet connected routers, how do you get your NetFlow data from the routers to your collectors? Do you let the flow export traffic use the same links as your customer traffic to route back to central collectors? Or do you send this traffic over private network management type path? If you send this traffic over the "Internet" (within your AS), are you worried about security?

Thanks, Serge

James Bensley

11 Sep 11 Sep

8:35 a.m.

On 1 September 2015 at 16:33, Serge Vautour <sergevautour@yahoo.ca> wrote:

...

Hello,

For those than run Internet connected routers, how do you get your NetFlow data from the routers to your collectors? Do you let the flow export traffic use the same links as your customer traffic to route back to central collectors? Or do you send this traffic over private network management type path? If you send this traffic over the "Internet" (within your AS), are you worried about security?

Thanks, Serge

Erik Sundberg

1:18 p.m.

Mainly management type traffic over an Out of band Management Network. This way during an outage we don't miss any Netflow and SNMP Queries and more importantly we can still access the router. In the past I have also set up a Management VRF, but tend to stay away from this. During an outage you end up losing data or visibility while routes reconverge.

-----Original Message----- From: NANOG [mailto:nanog-bounces+esundberg=nitelusa.com@nanog.org] On Behalf Of James Bensley Sent: Friday, September 11, 2015 3:35 AM To: serge@nbnet.nb.ca; nanog@nanog.org Subject: Re: NetFlow - path from Routers to Collector

On 1 September 2015 at 16:33, Serge Vautour <sergevautour@yahoo.ca> wrote:

...

Hello,

For those than run Internet connected routers, how do you get your NetFlow data from the routers to your collectors? Do you let the flow export traffic use the same links as your customer traffic to route back to central collectors? Or do you send this traffic over private network management type path? If you send this traffic over the "Internet" (within your AS), are you worried about security?

Thanks, Serge

Hi Serge,

I have not encountered any worries regarding security; typically NetFlow/IPFIX/sFlow/etc. is inside a management MPLS VPN, so it is segregated from customer VPNs through the network.

For the physical transport of the data, collecting it via your OOB network is probably preferred; however, "it depends". Do you use NetFlow internally only, or offer it as a chargeable service? Do you also graph traffic stats via SNMP? And so on and so forth...

In past experience, NetFlow data was exported over the production links (the links also carrying the customer data being measured using NetFlow) without issue. I recall two occasions when a DDoS disrupted the NetFlow collection because the DDoS traversed the links that were being monitored and were carrying their own NetFlow traffic. However, SNMP graphing was via the OOB network, so we didn't really lose any vital visibility: we could still see, from the roughly 1000% increase in traffic, which links along the network were being affected. A distress call from the customer being DDoSed also helps :)

Another part of the "it depends" puzzle is how much data you are collecting via NetFlow. Again, in a past experience we were testing collecting everything (as much as we could), every single packet header (no payload data though), rather than sampling, say, 1 in 10 packets. We only got as far as testing this in the lab, but one issue it threw up was that we could generate several Mbps of NetFlow traffic. Some PoPs have ADSL for OOB and wouldn't have been able to support that, so sites with ADSL or 3G OOB links would need the OOB link upgrading; that required additional capex, cue management budget wrestle, blah blah...

Cheers, James.
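
As a rough illustration of the bandwidth concern above, here is a minimal Python sketch that estimates export load and checks it against an OOB uplink. It assumes NetFlow v5 framing (a 24-byte header and up to 30 records of 48 bytes per datagram) plus IPv4/UDP overhead, and it assumes every datagram is full. The function names, the flow rates, the 1 Mbps uplink figure and the 50% headroom factor are all hypothetical, not numbers from the thread.

```python
# Rough NetFlow v5 export bandwidth estimate (a sketch, not a sizing tool).
V5_HEADER_BYTES = 24        # NetFlow v5 export packet header
V5_RECORD_BYTES = 48        # one NetFlow v5 flow record
V5_MAX_RECORDS = 30         # maximum records per v5 export datagram
UDP_IP_OVERHEAD_BYTES = 28  # IPv4 header (20) + UDP header (8), no options


def export_bps(flows_per_second: float) -> float:
    """Approximate export rate in bits per second, assuming full datagrams."""
    packets_per_second = flows_per_second / V5_MAX_RECORDS
    bytes_per_second = (
        flows_per_second * V5_RECORD_BYTES
        + packets_per_second * (V5_HEADER_BYTES + UDP_IP_OVERHEAD_BYTES)
    )
    return bytes_per_second * 8


def fits_oob(flows_per_second: float, oob_uplink_bps: float,
             headroom: float = 0.5) -> bool:
    """True if the export fits in the given share of the OOB uplink.

    headroom is a hypothetical planning factor: the fraction of the uplink
    that flow export may use, leaving the rest for SSH, SNMP and so on.
    """
    return export_bps(flows_per_second) <= oob_uplink_bps * headroom


if __name__ == "__main__":
    # Hypothetical exported flow rates for a PoP.
    for rate in (1_000, 10_000):
        mbps = export_bps(rate) / 1e6
        ok = fits_oob(rate, oob_uplink_bps=1_000_000)
        print(f"{rate} flows/s ~ {mbps:.2f} Mbps, fits 1 Mbps OOB uplink: {ok}")
```

The input is the exported flow rate rather than the packet sampling ratio, because sampling 1 in N packets does not reduce the number of exported flows by a factor of N; the flow rate is best measured on the router itself.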

65 comments

participants (24)

Baldur Norddahl
Chuck Church
Erik Sundberg
Frank Bulk
George, Wes
James Bensley
Jared Mauch
Jared Mauch
jim deleskie
Job Snijders
Leo Bicknell
Mark Tinka
Nick Hilliard
Niels Bakker
Pierfrancesco Caci
Rod Beck
Roland Dobbins
Serge Vautour
Shane Ronan
Steve Meuse
Tarko Tikan
Todd K Grand
Valdis.Kletnieks@vt.edu
White, Andrew