Netflow IP accounting and IP protocol numbers

25 Nov 1997

When analyzing Netflow accounting data, we also found much traffic
with UDP and TCP port numbers that couldn't be attributed to specific
applications.

One important contributor to this is FTP transfers that don't use the
FTP-data port (TCP 20).  I assume that this happens when a client uses
PASV (passive-mode FTP).  This accounts for the majority of the "unknown
TCP ports" traffic between SWITCH and the rest of the Internet.

Since we have our own software to process Netflow accounting packets,
I added the following heuristics to the program:

* When we see a TCP flow with unknown TCP port numbers, count it as
  "unknown TCP" traffic for now, but make a note containing the IP
  source and destination addresses, start/end times, and the byte
  count (we only count bytes, not packets).

* When we see a TCP flow where either port is 21 (FTP control), we
  look for notes about that particular pair of source/destination IP
  addresses whose times correspond to the lifetime of the FTP-control
  flow.  If we find any, we assume that those "unknown TCP" flows
  were actually FTP data transfers, and reclassify them as such.
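
As a rough illustration, the two heuristics could be sketched in Python
as follows.  This is not the actual program; the names (Flow,
Accounting, KNOWN_TCP_PORTS) and the small port table are made up for
the example.  It also assumes that "corresponds to the lifetime" means
that the time ranges overlap, and that the host pair is matched
regardless of direction, since Netflow reports each direction as a
separate flow.

```python
from collections import defaultdict
from dataclasses import dataclass

FTP_CONTROL = 21
# Illustrative table only; a real one would be much longer.
KNOWN_TCP_PORTS = {20: "ftp-data", 21: "ftp", 25: "smtp", 80: "http"}


@dataclass
class Flow:
    src: str
    dst: str
    src_port: int
    dst_port: int
    start: float   # seconds
    end: float     # seconds
    nbytes: int


class Accounting:
    def __init__(self):
        self.bytes_by_class = defaultdict(int)
        # Notes about "unknown TCP" flows, keyed by unordered host pair.
        self.notes = defaultdict(list)

    @staticmethod
    def pair(flow):
        return frozenset((flow.src, flow.dst))

    def add(self, flow):
        if FTP_CONTROL in (flow.src_port, flow.dst_port):
            self.bytes_by_class["ftp"] += flow.nbytes
            self.reclassify(flow)
            return
        for port in (flow.src_port, flow.dst_port):
            if port in KNOWN_TCP_PORTS:
                self.bytes_by_class[KNOWN_TCP_PORTS[port]] += flow.nbytes
                return
        # Unknown ports: count as "unknown TCP" for now, but keep a note.
        self.bytes_by_class["unknown-tcp"] += flow.nbytes
        self.notes[self.pair(flow)].append(flow)

    def reclassify(self, control):
        """Move noted flows that overlap the FTP-control flow to ftp-data."""
        key = self.pair(control)
        remaining = []
        for note in self.notes.get(key, []):
            # Assumption: overlapping time ranges count as a match.
            if note.start <= control.end and note.end >= control.start:
                self.bytes_by_class["unknown-tcp"] -= note.nbytes
                self.bytes_by_class["ftp-data"] += note.nbytes
            else:
                remaining.append(note)
        self.notes[key] = remaining
```

Feeding it an "unknown TCP" flow between two hosts, followed by an
FTP-control flow between the same hosts that covers the same period,
moves the bytes of the first flow from "unknown-tcp" to "ftp-data".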

The cost of this consists of storing some data about the "unknown TCP"
flows for 30 minutes (somewhat more depending on the time slicing of
your traffic counting) and doing a table lookup whenever FTP control
flows are seen.
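
To keep that storage bounded, old notes have to be thrown away at some
point.  A minimal sketch of such an expiry step, assuming the notes are
kept in a dictionary that maps a host pair to a list of
(start, end, nbytes) tuples; the function name and the data layout are
hypothetical, and only the 30-minute retention comes from the text
above:

```python
RETENTION_SECONDS = 30 * 60


def expire_notes(notes, now, retention=RETENTION_SECONDS):
    """Drop notes whose flow ended more than `retention` seconds ago.

    `notes` maps a host pair to a list of (start, end, nbytes) tuples
    and is modified in place.
    """
    cutoff = now - retention
    for key in list(notes):          # copy the keys, since we delete entries
        kept = [n for n in notes[key] if n[1] >= cutoff]
        if kept:
            notes[key] = kept
        else:
            del notes[key]
    return notes
```

Running this once per accounting interval would be enough; notes that
expire simply stay counted as "unknown TCP".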

We found that the number of such flows is sufficiently low, and the
amount of traffic they represent sufficiently high, for this to work
and be worth the effort.

An important win is that the remaining "unknown TCP" traffic can be
investigated much more efficiently once you get rid of this FTP
traffic.
-- 
Simon.

Simon Leinen