I also looked at pmacct, but I have to say I am far more pleased with the current setup. pmacct relies on a BGP session etc., while you actually get that info from your (s)flow. No need for a lookup.

And I really didn't like the idea of an SQL-like DB. I know I could push it to InfluxDB, but I thought this could be done much more simply, and I think I did exactly that. I store the "raw" info for 7 days, as detailed as possible, then downsample for a week again, a month and a year.

And the picture rendering API function of Grafana makes it possible for me to do, for example, /dst AS# or /peer #AS or /as-path AS# in a chatbot, and Grafana actually pushes the graph to Telegram in our case, but it could be any bot. So at peering conferences I can see with one message how much traffic we do to a peer or a destination, or whether traffic passes your AS.

I don't see any (pre-made) system beating this :P

On 01-09-18 01:51, Paweł Małachowski wrote:
On Fri, Aug 31, 2018 at 11:09:19AM +0200, H I Baysal wrote:
My personal view is, as long as you can store your flow info in a timeseries database (like influxdb and NOT SQL LIKE!!!!!!!) you can do whatever you want with the (raw) data. And create custom triggers for different calculations.

For one of our customers I've deployed good old pmacct + MySQL (using the memory engine) as a backend for DDoS detection purposes. It has some drawbacks (e.g. one has to frequently delete old records to keep tables small and fast), but it allows running complex SQL queries against this short-term data (e.g. different detection logic per subnet) or precomputing with triggers.
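To make the short-term SQL approach above a bit more concrete, here is a rough Python sketch. It is not the actual pmacct + MySQL deployment: an in-memory SQLite database stands in for the MySQL memory engine, and the table name, columns, retention window and per-subnet limits are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL memory engine (assumption).
# Table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flows (ts INTEGER, src_ip TEXT, dst_net TEXT, bytes INTEGER)"
)

def insert_flows(rows):
    """Store a batch of (ts, src_ip, dst_net, bytes) records."""
    conn.executemany("INSERT INTO flows VALUES (?, ?, ?, ?)", rows)

def prune(now, keep_seconds=300):
    """Delete records older than the retention window to keep the table small."""
    conn.execute("DELETE FROM flows WHERE ts < ?", (now - keep_seconds,))

def over_threshold(now, window, limits, default_limit):
    """Return {dst_net: bytes} for subnets whose bytes in the window exceed their limit."""
    cur = conn.execute(
        "SELECT dst_net, SUM(bytes) FROM flows WHERE ts >= ? GROUP BY dst_net",
        (now - window,),
    )
    return {net: total for net, total in cur if total > limits.get(net, default_limit)}

insert_flows([
    (1000, "198.51.100.7", "192.0.2.0/24", 9_000_000),
    (1010, "198.51.100.8", "192.0.2.0/24", 4_000_000),
    (1010, "198.51.100.9", "203.0.113.0/24", 4_000_000),
    (100, "198.51.100.9", "203.0.113.0/24", 50_000_000),  # stale, removed by prune()
])
prune(now=1020)
alerts = over_threshold(
    now=1020,
    window=60,
    limits={"192.0.2.0/24": 10_000_000},  # per-subnet detection threshold
    default_limit=5_000_000,
)
```

The per-subnet limits dictionary is where "different detection logic per subnet" would plug in; a real setup would more likely keep those limits in a table and join against it.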
Flows are on the fly and are coming in constantly; you could have a calculation like group by srcip and whatever protocol you want, or just srcip,

Beware of high-cardinality issues when facing random src IP floods.
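A small Python sketch of why the cardinality warning matters, and one possible way to bound it. The flow records, field names and the cap of 100 keys are assumptions for illustration only; folding overflow keys into an "other" bucket is just one option, not something described in this thread.

```python
from collections import Counter

def group_bytes(flows, key, max_keys=1000):
    """Sum bytes per key; once max_keys distinct keys exist, fold new keys into 'other'."""
    totals = Counter()
    for flow in flows:
        k = flow[key]
        if k not in totals and len(totals) >= max_keys:
            k = "other"  # cap the number of distinct series
        totals[k] += flow["bytes"]
    return totals

# Simulated flood: 5000 distinct (spoofed) sources hitting a single destination.
flood = [
    {"src_ip": f"10.{i // 256}.{i % 256}.1", "dst_ip": "192.0.2.10", "bytes": 64}
    for i in range(5000)
]

by_src = group_bytes(flood, "src_ip", max_keys=100)  # 100 real keys + 'other'
by_dst = group_bytes(flood, "dst_ip")                # a single key
```

Grouping by destination keeps the series count flat during such a flood, while grouping by source grows with every new spoofed address unless it is capped.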
BTW, once again, pmacct (with some glue) is nice for feeding flow data into a time series database. It can pre-aggregate and pre-filter low-volume flows to reduce storage requirements.
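As an illustration of what such "glue" might do, here is a hedged Python sketch of pre-aggregating and pre-filtering before writing to a time series database. This is not pmacct's own mechanism or configuration; the record fields, the 60-second bucket, the 1 MB cutoff and the measurement name "flows" are assumptions. The output uses InfluxDB line protocol (integer fields carry an `i` suffix, timestamps are in nanoseconds).

```python
from collections import defaultdict

def preaggregate(flows, bucket_seconds=60, min_bytes=1_000_000):
    """Sum bytes per (time bucket, src AS, dst AS) and drop low-volume aggregates."""
    totals = defaultdict(int)
    for flow in flows:
        bucket = flow["ts"] - flow["ts"] % bucket_seconds
        totals[(bucket, flow["src_as"], flow["dst_as"])] += flow["bytes"]
    return {key: total for key, total in totals.items() if total >= min_bytes}

def to_line_protocol(aggregates, measurement="flows"):
    """Render aggregates as InfluxDB line protocol with nanosecond timestamps."""
    lines = []
    for (bucket, src_as, dst_as), total in sorted(aggregates.items()):
        lines.append(
            f"{measurement},dst_as={dst_as},src_as={src_as} "
            f"bytes={total}i {bucket * 1_000_000_000}"
        )
    return lines

sample = [
    {"ts": 120, "src_as": 64500, "dst_as": 64501, "bytes": 700_000},
    {"ts": 130, "src_as": 64500, "dst_as": 64501, "bytes": 600_000},
    {"ts": 150, "src_as": 64500, "dst_as": 64502, "bytes": 10_000},  # filtered out
    {"ts": 190, "src_as": 64500, "dst_as": 64501, "bytes": 2_000_000},
]
lines = to_line_protocol(preaggregate(sample))
```

Four raw records become two points here; on real traffic the reduction depends entirely on the chosen keys and cutoff.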