On Fri, 29 Mar 2024 at 02:15, Nick Hilliard <nick@foobar.org> wrote:
> Overall, sflow has one major advantage over netflow/ipfix, namely that it's a stateless sampling mechanism. Once you have hardware that can
> Obviously, not all netflow/ipfix implementations implement flow state, but most do; some implement stateless sampling a la sflow. Also many
> Tools should be chosen to fit the job. There are plenty of situations where sflow is ideal. There are others where netflow is preferable.
This seems like a long-winded way of saying that sFlow is a perfect subset of IPFIX.

We will increasingly see IPFIX implementations omit state, because state no longer does anything in high-volume networks: you create a flow entry in the cache, then delay exporting the record for some seconds, but the flow is never hit twice. You pay a massive cost for caching without getting anything out of it. Anyone who actually needs caching will have to buy specialised devices, because it will no longer be economical for peering routers to offer the memory bandwidth and cache sizes at which caches actually do something.

In one particular network we tried 1:5000 and 1:500, and in both cases the flow records were one packet long. At that point we hit the record-export policer limit, so we couldn't determine at which sampling rate the cache would start to be useful.

I've wondered for a long time what a graph would look like where you plot sampling ratio against the percentage of flows observed. It will be linear up to very high sampling ratios, but eventually it will start to taper off; I just don't have any intuitive idea when.

And I don't think anyone really knows what share of flows they are observing in sFlow/IPFIX. If you keep the sampling ratio static over a period of time, say a decade, you continuously reduce your resolution, seeing a smaller and smaller percentage of flows. This worries me a lot, because a statistician would say that you need this share of volume, or this share of flows, if you want to use the data in a given way with a given confidence. So if we think about the problem formally, we should be constantly adjusting our sampling ratios to fit our statistical model, to keep the same promises about data quality.

--
  ++ytti
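P.S. A back-of-the-envelope sketch of the graph described above. This is not measured data: it assumes independent 1:N packet sampling (so a flow of n packets is seen with probability 1 - (1 - 1/N)^n) and an entirely made-up flow-size mix; the function name and the size distribution are mine, chosen only to show the near-linear region and the eventual taper.

```python
def p_flow_observed(pkts: int, ratio: int) -> float:
    """Probability that a flow of `pkts` packets is sampled at least
    once under independent 1:`ratio` packet sampling."""
    return 1.0 - (1.0 - 1.0 / ratio) ** pkts

# Hypothetical flow-size mix for illustration only:
# mostly 1-packet flows, a few elephants.
flow_sizes = [1] * 70 + [10] * 20 + [1000] * 9 + [100000] * 1

for ratio in (10, 100, 500, 5000, 50000):
    seen = sum(p_flow_observed(n, ratio) for n in flow_sizes) / len(flow_sizes)
    print(f"1:{ratio:<6} expected fraction of flows observed: {seen:.3f}")
```

Running something like this against a real flow-size distribution (instead of my made-up one) would show where the taper actually starts for a given network.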