Re: EyeBall View

27 Oct 2015

...
All,
I had an idea to create a product where we would have a host on every EyeBall network. Customers could then connect to these hosts and check connectivity back to their network. For instance, you may want to see what the speed or latency is like from CableVision in central NJ to your network in South Florida. Before I go large scale, I wanted to know how much demand there is for such a service.
Regards,
Dovid

Another approach to take is to enable passive monitoring of your own
infrastructure, and then layer active tests to web servers and other
endpoints on top.

Passive instrumentation gives you the even bigger advantage of giving
you insight into issues actually affecting your users' traffic.

Just did a talk about this at NANOG 65:

https://www.nanog.org/sites/default/files/monday_general_freedman_flow.pdf

If you set up a tap or SPAN and grab a box with Intel NICs (or many
other kinds), you can use PF_RING and nprobe to monitor at 100gig+ speeds.

For nprobe in particular as an "agent", some of the extended/augmented
data you can get via NetFlow includes:

http://www.ntop.org/wp-content/uploads/2013/03/nProbe_UserGuide.pdf

[NFv9 57595][IPFIX 35632.123] %CLIENT_NW_DELAY_MS             Network latency client <-> nprobe (msec) 
[NFv9 57596][IPFIX 35632.124] %SERVER_NW_DELAY_MS             Network latency nprobe <-> server (residual msec) 
[NFv9 57597][IPFIX 35632.125] %APPL_LATENCY_MS                Application latency (msec) 
[NFv9 57581][IPFIX 35632.109] %RETRANSMITTED_IN_PKTS          Number of retransmitted TCP flow packets (src->dst) 
[NFv9 57582][IPFIX 35632.110] %RETRANSMITTED_OUT_PKTS         Number of retransmitted TCP flow packets (dst->src) 
[NFv9 57583][IPFIX 35632.111] %OOORDER_IN_PKTS                Number of out of order TCP flow packets (dst->src) 
[NFv9 57584][IPFIX 35632.112] %OOORDER_OUT_PKTS               Number of out of order TCP flow packets (dst->src) 
[NFv9 57585][IPFIX 35632.113] %UNTUNNELED_PROTOCOL            Untunneled IP protocol byte 

The NANOG PPT shows an example of some of the slicing and dicing
you can then do (focused around retransmitted TCP packets, which
is what most of our customers are interested in focusing on as a
simple proxy metric for 'network performance').  Not soliciting 
flames on what the magic metrics should be - store them all and
use the ones that best correlate for you :)
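
As a rough illustration of that kind of slicing, here is a minimal
Python sketch that rolls flow records up into a retransmit percentage
per key. It assumes the records have already been exported and parsed
into dicts keyed by the nprobe template names (minus the leading %),
and that IN_PKTS, OUT_PKTS and SRC_AS are in the template alongside
the retransmit counters. The function name and the sample records are
made up for the example; this is not how nprobe or any collector does
it internally.

```python
from collections import defaultdict


def retransmit_pct_by(flows, key):
    """Return {key value: percent of packets retransmitted}.

    `flows` is an iterable of dicts, one per flow record. The dict
    layout is an assumption for this sketch, not an nprobe format.
    """
    totals = defaultdict(lambda: [0, 0])  # [retransmitted pkts, total pkts]
    for f in flows:
        t = totals[f[key]]
        t[0] += f["RETRANSMITTED_IN_PKTS"] + f["RETRANSMITTED_OUT_PKTS"]
        t[1] += f["IN_PKTS"] + f["OUT_PKTS"]
    # Skip keys with no packets to avoid dividing by zero.
    return {k: 100.0 * r / n for k, (r, n) in totals.items() if n}


# Made-up records (private-use ASNs), purely to show the call shape.
sample_flows = [
    {"SRC_AS": 64512, "IN_PKTS": 900, "OUT_PKTS": 100,
     "RETRANSMITTED_IN_PKTS": 9, "RETRANSMITTED_OUT_PKTS": 1},
    {"SRC_AS": 64512, "IN_PKTS": 500, "OUT_PKTS": 500,
     "RETRANSMITTED_IN_PKTS": 0, "RETRANSMITTED_OUT_PKTS": 0},
    {"SRC_AS": 64513, "IN_PKTS": 150, "OUT_PKTS": 50,
     "RETRANSMITTED_IN_PKTS": 10, "RETRANSMITTED_OUT_PKTS": 0},
]

if __name__ == "__main__":
    for asn, pct in sorted(retransmit_pct_by(sample_flows, "SRC_AS").items()):
        print(f"AS{asn}: {pct:.2f}% retransmitted")
```

Swapping "SRC_AS" for any other field in the record (prefix, interface,
geo, etc.) gives a different slice of the same data.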

Luca/ntop are actively working on nprobe, so I'm sure you could
get him to add throughput and other metrics as well.

Also -

The same approach should work with Cisco AVC on ASRs, though it's
something we're just starting to test and may only work with 
specific sets of filters (vs blanket apply to 40gig of traffic
through an ASR).

Definitely curious if anyone in the NANOG community has tried AVC?
Or any other switch/router-layer performance instrumentation?

We've been interested in putting an agent on some of the Linux white
box switches, but the Broadcom chips in the current gens don't 
allow 'flow sampling' - getting all headers or none for a flow,
for a % of flows matching a profile.  And that's needed to do 
retransmit/OOO/latency tracking (vs just seeing samples of packets
across flows).
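
To make the distinction concrete, here is a small Python sketch of
flow sampling done in software: hash the flow's 5-tuple and keep every
packet of a flow, or none of them, based on the hash. It is only an
illustration of the idea under my own assumptions (the function names,
the SHA-256 hash and the 0.01% bucket granularity are all arbitrary
choices), not a description of what any switch chip or nprobe does.

```python
import hashlib


def flow_key(src, sport, dst, dport, proto):
    """Direction-independent key, so both halves of a TCP
    conversation land in the same sampling bucket."""
    lo, hi = sorted([(src, sport), (dst, dport)])
    return f"{proto}|{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}".encode()


def keep_flow(src, sport, dst, dport, proto, percent):
    """Return True if this packet's flow falls in the sampled set.

    The decision depends only on the 5-tuple, so every packet of a
    flow gets the same answer: all headers or none.
    """
    digest = hashlib.sha256(flow_key(src, sport, dst, dport, proto)).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10000  # 0..9999
    return bucket < percent * 100


if __name__ == "__main__":
    # Documentation-range addresses, made up for the example.
    fwd = keep_flow("192.0.2.1", 51000, "198.51.100.7", 443, 6, 10)
    rev = keep_flow("198.51.100.7", 443, "192.0.2.1", 51000, 6, 10)
    print("forward:", fwd, "reverse:", rev)  # always the same decision
```

With per-packet sampling you would instead see scattered packets from
many flows, which is not enough to reconstruct sequence numbers and
timing for any one of them.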

Again, pointers to switches that have that capability and can run
*nix apps would be appreciated :)

Avi Freedman
CEO, Kentik
avi at kentik dot com

freedman＠freedman.net
