Traffic measurement techniques such as NetFlow work by associating characteristics of inbound packets on an interface with a flow, e.g. a tuple like (source addr, source port, dest addr, dest port, protocol). A counter is incremented for each flow, and the counts are exported periodically or when a flow goes inactive. (Conceptually, something like the first sketch below.) A few vendors now provide traffic export from high-speed interfaces by sampling packets on those interfaces at a particular rate and populating the per-flow counters from only the sampled packets, rather than inspecting every packet.

Does anybody here know of recent research with real Internet traffic that compares different sample rates with respect to the representativeness of the resulting flow data? For example, if I am trying to rank the top traffic sinks for my network beyond an attached peer (i.e. an ordinal rather than a cardinal measurement), will I get different answers with a sampling rate of 1:1000 than with 1:50, given a statistically "long enough" measurement period? (The second sketch below is a toy version of exactly this experiment.) Intuitively, it seems to me that the answers should be the same. However, it also seems to me that statistics are frequently non-intuitive.
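To be concrete about the flow-accounting model I have in mind, here is a minimal sketch of a flow cache. It is purely illustrative; the field layout, the 30-second timeout, and the "export by printing" are my own assumptions, not any vendor's implementation:

import time
from collections import defaultdict

FLOW_TIMEOUT = 30.0  # seconds of inactivity before export (assumed value)

# flow key -> [packet count, byte count, last-seen timestamp]
flow_cache = defaultdict(lambda: [0, 0, 0.0])

def account(src_addr, src_port, dst_addr, dst_port, proto, length):
    """Attribute one observed packet to its flow and bump the counters."""
    key = (src_addr, src_port, dst_addr, dst_port, proto)
    entry = flow_cache[key]
    entry[0] += 1           # packets
    entry[1] += length      # bytes
    entry[2] = time.time()  # last seen

def export_inactive(now=None):
    """Export (here: just print) and evict flows idle past FLOW_TIMEOUT."""
    now = now or time.time()
    idle = [k for k, e in flow_cache.items() if now - e[2] > FLOW_TIMEOUT]
    for key in idle:
        pkts, octets, _ = flow_cache.pop(key)
        print(f"flow {key}: {pkts} pkts, {octets} bytes")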
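And here is a rough Monte Carlo sketch of the ranking question itself. Everything in it is an assumption on my part: flow sizes come from a Pareto distribution as a stand-in for real traffic, and the sink and flow counts are arbitrary. It is meant only to show how one might test whether 1:50 and 1:1000 give the same ordinal answer, not to settle the question:

import random
from collections import Counter

random.seed(42)

NUM_SINKS = 200      # hypothetical destinations beyond the peer
FLOWS_PER_SINK = 50  # arbitrary
PARETO_ALPHA = 1.2   # heavy-tailed flow sizes (assumed, not measured)

# "Ground truth": packets actually sent toward each sink.
true_pkts = Counter()
for sink in range(NUM_SINKS):
    for _ in range(FLOWS_PER_SINK):
        # Cap draws so a single monster flow keeps the toy fast.
        true_pkts[sink] += int(min(random.paretovariate(PARETO_ALPHA), 1e4) * 10)

def estimate(rate):
    """Estimate per-sink packet counts under 1:rate random packet sampling."""
    est = Counter()
    for sink, pkts in true_pkts.items():
        kept = sum(random.random() < 1.0 / rate for _ in range(pkts))
        est[sink] = kept * rate  # scale the sampled count back up
    return est

def top(counts, n=10):
    return [sink for sink, _ in counts.most_common(n)]

true_top = top(true_pkts)
print("true top-10:", true_top)
for rate in (50, 1000):
    est_top = top(estimate(rate))
    overlap = len(set(true_top) & set(est_top))
    print(f"1:{rate:>4}: top-10 overlap {overlap}/10, estimated top-10 {est_top}")

I would expect the biggest sinks to survive even 1:1000 in this toy, but that is exactly the intuition I am asking about, and I would much rather see it tested against real traffic than against my made-up distribution.

Joe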