When analyzing Netflow accounting data, we also found a lot of traffic with UDP and TCP port numbers that couldn't be attributed to specific applications. One important contributor is FTP transfers that don't use the FTP-data port (TCP 20). I assume that this happens when a client uses PASV (passive-mode FTP). Such transfers account for the majority of "unknown TCP ports" traffic between SWITCH and the rest of the Internet.

Since we have our own software to process Netflow accounting packets, I added the following heuristics to the program:

* When we see a TCP flow with unknown TCP port numbers, we count it as "unknown TCP" traffic for now, but make a note containing the IP source and destination addresses, start/end times, and byte count (we only count bytes, not packets).

* When we see a TCP flow where either port is 21 (FTP control), we check whether we have notes for that particular pair of source/destination IP addresses whose times correspond to the lifetime of the FTP-control flow. If so, we assume that those "unknown TCP" flows were actually FTP data transfers, and reclassify them as such.

The cost of this consists of storing some data about the "unknown TCP" flows for 30 minutes (somewhat more, depending on the time slicing of your traffic counting) and doing a table lookup whenever an FTP control flow is seen. We found that the number of such flows is sufficiently low, and the amount of traffic they represent sufficiently high, for this to work and be worth the effort.

An important win is that the remaining "unknown TCP" traffic can be investigated much more efficiently once you get rid of this FTP traffic.

-- Simon.
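
For readers who want to try the heuristic described above, here is a minimal sketch in Python. It is not the actual SWITCH program: the names `Flow`, `FtpReclassifier`, `KNOWN_PORTS` and the category labels are hypothetical, the list of "known" ports is purely illustrative, and flow records are assumed to be already decoded from the Netflow export packets. It also assumes that the address pair should be matched regardless of direction, that "corresponds to the lifetime" means the unknown flow lies entirely within the control flow's start/end times, and that the FTP-control record arrives after the records of the data flows it covers.

```python
from collections import defaultdict
from dataclasses import dataclass

FTP_CONTROL_PORT = 21
KNOWN_PORTS = {20, 21, 23, 25, 80}   # illustrative only, not a real port table
RETENTION_SECONDS = 30 * 60          # keep notes for 30 minutes, as in the post


@dataclass
class Flow:
    """One decoded TCP flow record (hypothetical layout)."""
    src: str
    dst: str
    src_port: int
    dst_port: int
    start: float   # seconds
    end: float     # seconds
    nbytes: int


def pair_key(flow):
    # Assumption: match the address pair in either direction.
    return frozenset((flow.src, flow.dst))


class FtpReclassifier:
    def __init__(self):
        self.totals = defaultdict(int)   # category -> byte count
        self.notes = defaultdict(list)   # address pair -> unknown flows

    def observe(self, flow):
        if FTP_CONTROL_PORT in (flow.src_port, flow.dst_port):
            self.totals["ftp-control"] += flow.nbytes
            self._reclassify(flow)
        elif flow.src_port in KNOWN_PORTS or flow.dst_port in KNOWN_PORTS:
            self.totals["known"] += flow.nbytes
        else:
            # Count as unknown for now, but remember the flow.
            self.totals["unknown-tcp"] += flow.nbytes
            self.notes[pair_key(flow)].append(flow)

    def _reclassify(self, control):
        key = pair_key(control)
        kept = []
        for note in self.notes.get(key, []):
            if control.start <= note.start and note.end <= control.end:
                # Unknown flow lies within the control flow's lifetime:
                # treat it as an FTP data transfer.
                self.totals["unknown-tcp"] -= note.nbytes
                self.totals["ftp-data"] += note.nbytes
            else:
                kept.append(note)
        self._store(key, kept)

    def expire(self, now):
        # Drop notes older than the retention period.
        for key in list(self.notes):
            kept = [n for n in self.notes[key]
                    if now - n.end <= RETENTION_SECONDS]
            self._store(key, kept)

    def _store(self, key, kept):
        if kept:
            self.notes[key] = kept
        else:
            self.notes.pop(key, None)
```

Calling `observe()` for each flow record and `expire()` once per counting interval keeps the table of notes small. Whether strict containment within the control flow's lifetime is the right matching rule depends on how your exporter times out long-lived flows, so treat that condition as a starting point.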