On Thu, Jan 21, 2016 at 11:44:34PM +0900, Randy Bush wrote:
You can tell pmacct which properties of the received flow data it should aggregate its output on. For example, one could configure pmacct to store data using the following primitives:
($timeperiod, $entrypoint_router_id, $bgp_nexthop, $packet_count)
Here $timeperiod is something like a 5-minute bucket, and the post-processing software calculates the distance between the entrypoint router and the point where the flow would leave the network ($bgp_nexthop).
See 'aggregate' on http://wiki.pmacct.net/OfficialConfigKeys
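The post-processing step described above could look roughly like the sketch below. It is only an illustration: the router names, the sample records, and the DISTANCE table are made up, and it assumes the pmacct output has already been parsed into (timeperiod, entrypoint_router_id, bgp_nexthop, packet_count) tuples and that some router-to-router distance (hop count here) is available from elsewhere, for example the IGP.

```python
from collections import defaultdict

# Hypothetical distance table (hop count) between entrypoint router and
# BGP next hop. In practice this would come from the IGP topology or
# similar; these names and values are invented for illustration.
DISTANCE = {
    ("rtr-ams1", "rtr-ams1"): 0,
    ("rtr-ams1", "rtr-lon1"): 1,
    ("rtr-ams1", "rtr-nyc1"): 2,
}


def packets_by_distance(records, distance):
    """Sum packet counts per (timeperiod, distance).

    Pairs missing from the distance table are grouped under distance None
    so they stay visible instead of being silently dropped.
    """
    totals = defaultdict(int)
    for timeperiod, entry_router, bgp_nexthop, packet_count in records:
        d = distance.get((entry_router, bgp_nexthop))
        totals[(timeperiod, d)] += packet_count
    return dict(totals)


# Made-up sample records in the shape of the tuple above.
records = [
    ("2016-01-21T14:40", "rtr-ams1", "rtr-ams1", 1200),
    ("2016-01-21T14:40", "rtr-ams1", "rtr-lon1", 300),
    ("2016-01-21T14:40", "rtr-ams1", "rtr-nyc1", 500),
    ("2016-01-21T14:40", "rtr-ams1", "rtr-lon1", 100),
    ("2016-01-21T14:45", "rtr-ams1", "rtr-lon1", 700),
]

result = packets_by_distance(records, DISTANCE)
for (timeperiod, d), packets in sorted(result.items()):
    print(timeperiod, d, packets)
```

Per time bucket this gives the number of packets that traveled a given distance inside the network, which is the distribution one would then inspect.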
In short: you configure pmacct to throw away everything you don't need (maybe after some light pre-processing), and hope that what remains is small enough to fit in your cluster and at the same time offers enough insight to answer the question you set out to resolve.
but could you explain in detail how this tests the hypothesis?
even if all your traffic entered on a bgp hop and exited on a bgp hop, and all bgp entries set next_hop (which i think you do), you would be ignoring the 'distance' the packet traveled from source to get to your entry and traveled from your exit to get to the final destination.
Yes, correct. This is why I mentioned before: "However, this would be just one network's (biased) view on things." With this I meant that I can measure something, but only within a subset of the entire path a packet might traverse (just that one routing domain), so not end-to-end. And what might be true for us might not be true for others.