Some long, long, long time ago I wrote a small tool called snmpstatd. Back then Sprint management was gracious enough to let me release it as public-domain code.

It collects usage statistics (30-sec "peaks" and 5-min averages) plus memory and CPU utilization from routers by performing _asynchronous_ SNMP polling. I believe it can scale to about 5000-10000 routers. It also performs accurate time-base interpolation for the 30-sec sampling (i.e. it always requests the router's local time and uses it to compute accurate 30-sec peak usage).

The data is stored in text files which are extremely easy to parse. The configuration is text-based. The package also includes a compact status alarm output (i.e. which routers/links are down), a PostScript chart generator, and a troff/nroff-based text report generator with summary downtime and usage figures plus significant events.

The tool was used routinely to produce reports on ICM-NET performance for NSF.

This thing may need some hacking to accommodate latter-day IOS bogosities, though.

If anyone wants it, I have it at www.kotovnik.com/~avg/snmpstatd.tar.gz

--vadim

On Mon, 22 Jul 2002, Gary E. Miller wrote:
Yo Alexander!
On Tue, 23 Jul 2002, Alexander Koch wrote:
Imagine some four routers dying or not answering queries: you will see the poll script give you timeout after timeout after timeout, and with some 50 to 100 routers and their respective interfaces you see mrtg choke badly, losing data.
Yep. Anything gets behind and it all gets behind.
That is why we run multiple copies of MRTG. That way polling for one set of hosts does not have to wait for another set. If one set is timing out the other just keeps on as usual.
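The idea of keeping host sets independent can be sketched roughly as follows. This is not how MRTG itself is structured (the setup described above uses separate MRTG processes); it is a thread-based illustration, and `poll_group`, `poll_all`, and the `poll_host` callback are hypothetical names. Each group polls its own hosts one after another, so timeouts in one group delay only that group.

```python
from concurrent.futures import ThreadPoolExecutor


def poll_group(hosts, poll_host):
    """Poll one set of hosts sequentially, like a single poller copy."""
    results = {}
    for host in hosts:
        try:
            results[host] = poll_host(host)
        except TimeoutError:
            # A dead host costs this group one timeout; other groups
            # are running in their own workers and are not held up.
            results[host] = None
    return results


def poll_all(groups, poll_host):
    """Run every group in its own worker thread and collect the results.

    groups: dict mapping a group name to a list of host names.
    poll_host: hypothetical callable that returns a reading for a host
               or raises TimeoutError.
    """
    workers = max(1, len(groups))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            name: pool.submit(poll_group, hosts, poll_host)
            for name, hosts in groups.items()
        }
        return {name: future.result() for name, future in futures.items()}
```

The wall-clock time of a polling cycle is then bounded by the slowest group rather than by the sum of all timeouts across all hosts.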
RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701
gem@rellim.com Tel:+1(541)382-8588 Fax: +1(541)382-8676
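For readers wondering what the time-base correction mentioned at the top of this message buys you, here is a minimal sketch. It is not taken from snmpstatd. It assumes the "router's local time" is something like sysUpTime (SNMP TimeTicks, hundredths of a second) fetched together with a 32-bit octet counter, and the function names are made up for illustration. The point is to divide by the interval the router reports, not by the nominal 30 seconds, so a late or early poll does not distort the rate.

```python
COUNTER32_MOD = 2 ** 32      # 32-bit SNMP counters and TimeTicks wrap here
TICKS_PER_SECOND = 100       # TimeTicks are hundredths of a second


def rate_bps(prev_octets, prev_ticks, cur_octets, cur_ticks):
    """Bits per second between two samples, timed by the router's clock.

    Returns None if the router reports no elapsed time. Note that a
    reboot (uptime reset) is indistinguishable from a wrap in this
    sketch; a real poller would need to detect it separately.
    """
    elapsed_ticks = (cur_ticks - prev_ticks) % COUNTER32_MOD
    if elapsed_ticks == 0:
        return None
    delta_octets = (cur_octets - prev_octets) % COUNTER32_MOD
    return delta_octets * 8 * TICKS_PER_SECOND / elapsed_ticks


def peak_rate(samples):
    """Highest per-interval rate in a list of (ticks, octets) samples."""
    rates = [
        rate_bps(prev_octets, prev_ticks, cur_octets, cur_ticks)
        for (prev_ticks, prev_octets), (cur_ticks, cur_octets)
        in zip(samples, samples[1:])
    ]
    return max((r for r in rates if r is not None), default=None)
```

With this approach a sample that arrives 31.5 seconds after the previous one is divided by 31.5, not 30, so the same steady traffic yields the same rate regardless of polling jitter.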