FW: [cacti-announce] Cacti 0.8.6j Released (fwd)
Many of us run cacti. FYI.

---------- Forwarded message ----------
Date: Thu, 18 Jan 2007 08:26:37 -0500
From: Warner Moore <wmoore@2co.com>
To: bugtraq@securityfocus.com
Subject: FW: [cacti-announce] Cacti 0.8.6j Released

That's right, it's not vendor specific guys. Yay!

---------------------------------------------------------------

Cacti version 0.8.6j has been released to address multiple vulnerabilities discovered in Cacti's PHP-based poller. It is recommended that all users upgrade immediately. A patch containing only the security fixes has been provided for both Cacti versions 0.8.6h and 0.8.6i. Please see the official patches page for application instructions and further information.

http://www.cacti.net/download_patches.php

See the release notes for additional information about this release.

http://www.cacti.net/release_notes_0_8_6j.php

All files related to this release can be found under the downloads section on the Cacti website.

http://www.cacti.net/download_cacti.php

Ian

_______________________________________________
cacti-announce mailing list
cacti-announce@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cacti-announce

--
Warner Moore
Enterprise Services Group
2CheckOut.com, Inc.
On Thu, Jan 18, 2007 at 11:40:06AM -0600, Gadi Evron wrote:
Many of us run cacti. FYI.
Thanks for posting this, even though it's slightly OT.

Not to start an opinion war, but those who do run Cacti should really consider removing this software from their boxes permanently.

http://secunia.com/advisories/23528/

For those who don't have the time/care enough to go look at the Secunia report, I'll summarise it:

1) cmd.php and copy_cacti_user.php both blindly pass arguments passed in the URL to system(). This, IMHO, is reason enough to not run this software.

2) cmd.php and copy_cacti_user.php both blindly pass arguments passed in the URL to whatever SQL back-end is used (MySQL most commonly); no escaping or sanitising is done. Otherwise known as an "SQL injection" flaw.

There are other flaws mentioned, but they're simply subsets of the above two. Also, register_argc_argv is enabled (rightfully so) by default in PHP, so don't let that decrease the severity of this atrocity. (I can forgive SQL injections, but I cannot forgive blindly calling system().)

I'd been considering (off and on for about a year) using Cacti for statistics gathering, and now I'm glad I didn't. This kind of flaw reflects directly on the programming ethics of the authors behind this software.

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
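The two flaw classes summarised above (untrusted input reaching a shell, and untrusted input pasted into SQL) can be illustrated with a minimal sketch. This is not Cacti's code, which is PHP; the function names, the `users` table, and the child command are hypothetical, and Python with SQLite is used only to show the general defensive pattern.

```python
import sqlite3
import subprocess
import sys


def run_poller(host_id: str) -> str:
    """Run a child process for one host without involving a shell.

    Hypothetical stand-in for a poller invocation.
    """
    # Validate first: only plain ASCII digits are accepted as a host id.
    if not (host_id.isascii() and host_id.isdigit()):
        raise ValueError(f"host id must be numeric, got {host_id!r}")
    # Passing an argument list (and not shell=True) means the value is
    # never parsed by a shell, so "7; id" cannot become a second command.
    child = "import sys; print('polling host', sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child, host_id],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def find_user(conn: sqlite3.Connection, name: str) -> list:
    """Look up a user with a bound parameter instead of string pasting."""
    # The "?" placeholder keeps the value out of the SQL text entirely.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

With a bound parameter, a classic payload such as `x' OR '1'='1` is just an unusual user name that matches nothing, rather than a change to the query.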
On Thu, 18 Jan 2007, Jeremy Chadwick wrote:
For those who don't have the time/care enough to go look at the Secunia report, I'll summarise it:
1) cmd.php and copy_cacti_user.php both blindly pass arguments passed in the URL to system(). This, IMHO, is reason enough to not run this software.
I've said this several times recently in less public places, but IMO, cacti is a bit of a security train wreck. The glaring problem isn't that the above mentioned php scripts have poor security / user supplied input sanitization. It's that those scripts were never intended to be run via the web server. So WTF are they doing in a directory served by the web server in a default cacti install? It seems to me, it would make much more sense for cacti to either split itself into 2 totally separate directories, one for things the web server needs to serve, one for everything else, or at least put all the 'web content' portions under one subdirectory of the cacti install directory, so that subdirectory can be either the DocumentRoot of a server or symlinked from elsewhere in a DocumentRoot. There's no reason for things like poller.php or any of the others that are only meant to be run by the admin from the commandline to be in directories served by the web server. I've heard from several people, and spent some time trying to help one of them, who had servers compromised (entry via cacti, then a local root compromise) over the past weekend due to this. ---------------------------------------------------------------------- Jon Lewis | I route Senior Network Engineer | therefore you are Atlantic Net | _________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
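Keeping command-line scripts out of the served directory is the real fix described above. As a second line of defence, a CLI-only script can also refuse to run when it detects a web-server invocation. The sketch below is an assumption-laden illustration, not anything Cacti ships: it relies on the standard CGI meta-variables (GATEWAY_INTERFACE, REQUEST_METHOD, SERVER_SOFTWARE) being present in the environment, and the function names are hypothetical.

```python
import os

# Environment variables that the CGI specification says a web server sets
# for the scripts it runs; a normal shell session does not define them.
CGI_MARKERS = ("GATEWAY_INTERFACE", "REQUEST_METHOD", "SERVER_SOFTWARE")


def invoked_by_web_server(environ=None) -> bool:
    """Return True if the environment looks like a CGI request."""
    env = os.environ if environ is None else environ
    return any(name in env for name in CGI_MARKERS)


def require_cli(environ=None) -> None:
    """Abort immediately unless the script was started from a shell."""
    if invoked_by_web_server(environ):
        raise SystemExit("this script may only be run from the command line")
```

A poller-style script would call `require_cli()` as its first statement, before touching any arguments. This only helps for CGI-style execution; it does not replace moving the script out of the DocumentRoot.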
NMS Software should not be placed in the public domain/internet. By the time anyone who would like to attack Cacti itself can access the server and malform an HTTP request to run this attack, they can also go see your entire topology and access your SNMP keys (assuming v1).

There is this Network Management theory called Out of Band Management. If you are concerned about security, you should only be polling anything you expect to be secure on a private management link/network. If you want to run an MRTG stats collector that is publicly visible and expect it to be secure, write it yourself or purchase it from a vendor that can support and guarantee the security of the product.

Cacti is a free open source tool, and in my opinion these should never be expected to be 100% free of bugs, errors, and exploits. If it is, that is great. I would say you get what you pay for, but if you use good practices around it, cacti can be a very useful and powerful tool.

That's my 2 cents,

-Scott

------------------------------------------------------------------------
Scott Berkman CCNP          404-975-0097
Network Engineer            scott.berkman@reignmaker.net
On Thu, 18 Jan 2007, Berkman, Scott wrote:
NMS Software should not be placed in the public domain/internet. By the time anyone who would like to attack Cacti itself can access the server and malform an HTTP request to run this attack, they can also go see your entire topology and access your SNMP keys (assuming v1). There is this Network Management theory called Out of Band Management. If you are concerned about security, you should only be polling anything you expect to be secure on a private management link/network. If you want to run an MRTG stats collector that is publicly visible and expect it to be secure, write it yourself or purchase it from a vendor that can support and guarantee the security of the product.
Sound theory. However, as someone who has set up network management and monitoring (using both open-source and proprietary software) a dozen times for multiple companies (and written software myself when necessary), I can tell you that it cannot work in every situation. In particular, while it's a correct idea to set up a separate management network for accessing devices through SNMP, the actual management or monitoring workstation/server usually needs to be placed somewhere where it's accessible from the regular network, and that is exactly how cacti is used.

The correct setup would be to require an SSL connection (if it's a web interface) and password authentication to access your management/monitoring server and, if it is necessary to make data available to the outside, to do it through a separate controlled interface. For example, you could set up a separate page for read-only access to certain graphs using RRD files created by cacti (and make sure the CGI is not run under apache but under its own user, and that this user is different from the one cacti is using, so that community strings in cacti are not available if the outside interface is hacked; note that I'm really speaking more generally - I don't use cacti and do not know if it allows you to do this properly).

All that of course requires a certain amount of security knowledge and admin skills, and sometimes even programming skills, which a lot of network administrators who choose to use cacti do not have (in fact cacti seems so popular exactly because it's easy to set up by junior admins).
BTW - personally I use nagios for both monitoring and providing graphing results for the data (that obviously reduces the number of SNMP queries, as I do not need to do it twice), using nagiosgrapher with very heavy customization (I rewrote their web interface and parts of the library and collection). The result looks like this: http://www.elan.net/~william/nagios/printscreen_ngrapher5_nagioshost.pdf and some bits of the software, as far as I have had time to release it, are at http://www.elan.net/~william/nagios/
Cacti is a free open source tool, and in my opinion these should never be expected to be 100% free of bugs, errors, and exploits.
You know, the above applies to commercial software just as much as to non-commercial/open-source. In fact the theory is that commercial software has more bugs & security flaws because its code is not available and thus cannot be examined by outsiders; for the same reasons, bugs are found less often, and when one is found, the details may not be made available to the public beyond some simple "software update". Just think of how many bugs and security updates are released for software coming from Redmond and compare that to Linux, OpenBSD, FreeBSD, etc.
If it is that is great. I would say you get what you pay for
So free software like apache is no good, right? How many security bugs have been found in apache, again, compared to IIS? The reality is that nowadays "what you pay for" no longer works when comparing open-source and commercial software. In fact commercial is very often just repackaged open-source supported by some vendor, i.e. enterprise companies just get a name to put the blame on if there is an issue (plus of course support, since many companies have a bunch of junior admins and only one or two senior engineers who are always kept very busy).

--
William Leibzon
Elan Networks
william@elan.net
On Thu, 2007-01-18 at 14:33 -0700, Berkman, Scott wrote:
There is this Network Management theory called Out of Band Management.
Which is rarely properly applied. I've lost count of the data centers that block mgmt traffic from external customers, but leave internal systems (which are often "sublet" to all sorts of external customers) wide open to mgmt servers/devices. Unfortunately mgmt systems need access to whatever they are monitoring, so if you're monitoring customer systems then you are more than likely exposed and should make tightening your NMS systems a high priority. I know, I work for an NMS vendor and I wouldn't sign my name certifying that our stuff is secure. It's funny how pen testing seems to avoid NMS stuff.

-Jim P.
* Berkman, Scott <Scott.Berkman@Reignmaker.net> [2007-01-18 22:34]:
Cacti is a free open source tool, and in my opinion these should never be expected to be 100% free of bugs, errors, and exploits.
very much opposed to commercial software, where you can be 100% sure that they are full of bugs, errors, and exploits -- Henning Brauer, hb@bsws.de, henning@openbsd.org BS Web Services, http://bsws.de Full-Service ISP - Secure Hosting, Mail and DNS Services Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam
On Thu, Jan 18, 2007 at 02:33:10PM -0700, Berkman, Scott wrote:
NMS Software should not be placed in the public domain/internet. By the time anyone who would like to attack Cacti itself can access the server and malform an HTTP request to run this attack, they can also go see your entire topology and access your SNMP keys (assuming v1).
I think there are a few factors at work here:

1) PHP is very easy to learn, but deals primarily with web input (i.e. potentially hostile). Since most novice programmers are happy just to get the software working, they rarely consider the harder problem of making sure it never fails to work; in other words, that it always behaves correctly. That problem and assurance is much, much more difficult than just getting the software to work. You can't test it into the software. You can't rely on a good history to indicate there are no latent problems.

2) Furthermore, this is a service that is designed primarily for public consumption, unlike say NFS; it cannot be easily firewalled at the network layer if there is a problem or abuse.

3) The end devices rarely support direct VPN connections, and redundant infrastructure just for monitoring is expensive.

4) The functionality controlled by the user is too complicated. If all you are doing is serving images of graphs, generate them for the common scenarios and save them to a directory where a much simpler program can serve them.

That is, most of the dynamically-generated content doesn't need to be generated on demand. If you're pulling data from a database, pull it all and generate static HTML files. Then you don't even need CGI functionality on the end-user interface. It thus scales much better than the dynamic stuff, or SSL-encrypted sessions, because it isn't doing any computation.

As they say, there are two ways to design a secure system:

1) Make it so simple that there are obviously no vulnerabilities.
2) Make it so complex that there are no obvious vulnerabilities.

I prefer the former, however unsexy and non-interactive it may be.
write it yourself or purchase it from a vendor that can support and guarantee the security of the product.
Unless you're a skilled programmer with a good understanding of secure coding techniques, the first suggestion could be dangerous. It seems that too many developers try to do things themselves without any research into similar programs and the kinds of security risks they faced, and end up making common mistakes in the form of security vulnerabilities. And no vendor of popular software I know of can guarantee that it is secure. I have seen a few companies that employ formal methods in their design practices and good software engineering techniques in the coding process, but they are almost unheard of. -- ``Unthinking respect for authority is the greatest enemy of truth.'' -- Albert Einstein -><- <URL:http://www.subspacefield.org/~travis/>
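The static-generation idea in point 4 above can be sketched as a batch job that writes a plain HTML index over graph images that were already rendered elsewhere. Everything here is hypothetical: the function name, the title-to-filename mapping, and the output layout are assumptions chosen for illustration, and the graph rendering itself is out of scope.

```python
import html
from pathlib import Path


def render_static_site(graphs: dict[str, str], out_dir: Path) -> Path:
    """Write index.html linking to pre-rendered graph images.

    `graphs` maps a human-readable title to an image file name that a
    separate batch job is assumed to have written into `out_dir`.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    # Escape both the link target and the label so that odd titles
    # cannot inject markup into the generated page.
    items = "\n".join(
        f'<li><a href="{html.escape(fname, quote=True)}">'
        f"{html.escape(title)}</a></li>"
        for title, fname in sorted(graphs.items())
    )
    index = out_dir / "index.html"
    index.write_text(
        f"<html><body><ul>\n{items}\n</ul></body></html>\n",
        encoding="utf-8",
    )
    return index
```

Run from cron after each polling cycle, this leaves the public-facing server with nothing to execute: it only serves files, which is the property the argument above depends on. The trade-off raised later in the thread is that every graph is rendered whether or not anyone looks at it.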
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jan 21, 2007, at 11:35 PM, Travis H. wrote:
That is, most of the dynamically-generated content doesn't need to be generated on demand. If you're pulling data from a database, pull it all and generate static HTML files. Then you don't even need CGI functionality on the end-user interface. It thus scales much better than the dynamic stuff, or SSL-encrypted sessions, because it isn't doing any computation.
While I certainly agree that cacti is a bit of a security nightmare, what you suggest may not scale all that well for a site doing much graphing. I'm sure the average cacti installation is recording thousands of things every 5 minutes but virtually none of those are ever actually graphed. Those that are viewed certainly aren't viewed every 5 minutes. Even if polling and graphing took the same amount of resources that would double the load on the machine. My guess though is that graphing actually takes many times the resources of polling. Just makes sense to only graph stuff when necessary. Chris -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iD8DBQFFtE/NElUlCLUT2d0RAtbeAJ91qMtm8VtWSLHJ/gLsg3DnqitlwQCeK1pn bqmZZoK821K76KMj/0bxDNk= =Rx6P -----END PGP SIGNATURE-----
Anyone who's seen MRTG (simple, static) on a large network realizes that decoupling the graphing from the polling is necessary. The disk i/o is brutal. Cacti has a slick interface, but also doesn't scale all that well for large networks. I prefer RTG, though I haven't seen a nice interface for it, yet.
On Mon, 22 Jan 2007, Jason LeBlanc wrote:
Anyone who's seen MRTG (simple, static) on a large network realizes that decoupling the graphing from the polling is necessary. The disk i/o is brutal. Cacti has a slick interface, but also doesn't scale all that well for large networks. I prefer RTG, though I haven't seen a nice interface for it, yet.
How large did you have to get for cacti to "not scale"? Did you try the cactid poller [which is much faster than the standard poller]? ---------------------------------------------------------------------- Jon Lewis | I route Senior Network Engineer | therefore you are Atlantic Net | _________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
I would say somewhere around 4000 network interfaces (6-8 stats per int) and around 1000 servers (8-10 stats per server) we started seeing problems, both with navigation in the UI and with stats not reliably updating. I did not try that poller; perhaps it's worth trying cacti again with it. I will also say this was about 2 years ago. I think the box it was running on was a dual P3-1000 with a raid 10 using 6 drives (10k rpm I think).

After looking for 'the ideal' tool for many years, it still amazes me that no one has built it. Bulk gets, scalable schema and good portal/UI. RTG is better than MRTG, but the config/db/portal are still lacking.
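The figures quoted above (about 4000 interfaces at 6-8 stats each, about 1000 servers at 8-10 stats each) translate into a rough write load. The sketch below is back-of-the-envelope arithmetic only: the 5-minute polling cycle is an assumption taken from the interval mentioned earlier in the thread, and it treats each stat as one separate write spread evenly over the cycle.

```python
POLL_INTERVAL_S = 5 * 60  # assumed polling cycle, in seconds


def updates_per_cycle(devices: int, stats_each: int) -> int:
    """Number of values collected per polling cycle for one device class."""
    return devices * stats_each


# Low and high estimates: interfaces plus servers.
low = updates_per_cycle(4000, 6) + updates_per_cycle(1000, 8)
high = updates_per_cycle(4000, 8) + updates_per_cycle(1000, 10)

# Average write rate if the work is spread evenly over the cycle.
low_rate = low / POLL_INTERVAL_S
high_rate = high / POLL_INTERVAL_S

print(f"{low:,} to {high:,} values per cycle, "
      f"about {low_rate:.0f} to {high_rate:.0f} writes per second")
```

That is 32,000 to 42,000 values per cycle, or roughly 107 to 140 small writes per second on average, which gives some sense of why disk I/O on a six-drive array of that era became the limit.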
jml@packetpimp.org (Jason LeBlanc) writes:
After looking for 'the ideal' tool for many years, it still amazes me that no one has built it. Bulk gets, scalable schema and good portal/UI. RTG is better than MRTG, but the config/db/portal are still lacking.
if funding were available, i know some developers we could hire to build the ultimate scalable pluggable network F/L/OSS management/monitoring system. if funding's not available then we're depending on some combination of hobbyists (who've usually got rent to pay, limiting their availability for this work) and in-house toolmakers at network owners (who've usually got other work to do, or who would be under pressure to monetize/license/patent the results if That Much Money was spent in ways that could otherwise directly benefit their competitors.)

"been there, done that, got the t-shirt." is there funding available yet? like, $5M over three years? spread out over 50 network owners that's ~$3K a month. i don't see that happening in a consolidation cycle like this one, but hope springs eternal. "give randy and hank the money, they'll take care of this for us once and for all."

--
Paul Vixie
Paul Vixie wrote:
jml@packetpimp.org (Jason LeBlanc) writes:
After looking for 'the ideal' tool for many years, it still amazes me that no one has built it. Bulk gets, scalable schema and good portal/UI. RTG is better than MRTG, but the config/db/portal are still lacking. [..] "been there, done that, got the t-shirt." is there funding available yet? like, $5M over three years? spread out over 50 network owners that's ~$3K a month. i don't see that happening in a consolidation cycle like this one, but hope springs eternal. "give randy and hank the money, they'll take care of this for us once and for all."
Heh, for that kind of money you can even convince me to do it ;) Greets, Jeroen (dreams about a long holiday after finishing it ;)
jeroen@unfix.org (Jeroen Massar) writes:
..., $5M over three years? spread out over 50 network owners that's $3K a month. i don't see that happening in a consolidation cycle like this one, but hope springs eternal. "give randy and hank the money, they'll take care of this for us once and for all."
Heh, for that kind of money you can even convince me to do it ;)
glibly said, sir. but i disastrously underestimated the amount of time and money it would take to build BIND9. since i'm talking about a scalable pluggable portable F/L/OSS framework that would serve disparate interests and talk to devices that will never go to an snmp connectathon, i'm trying to set a realistic goal. anyone who wants to convince me that it can be done for less than what i'm saying will have to first show me their credentials, second convince david conrad and jerry scharf. (after that, i'm all ears.)

--
Paul Vixie
On 1/24/2007 3:05 PM, Paul Vixie wrote:
glibly said, sir. but i disastrously underestimated the amount of time and money it would take to build BIND9. since i'm talking about a scalable pluggable portable F/L/OSS framework that would serve disparate interests and talk to devices that will never go to an snmp connectathon, i'm trying to set a realistic goal. anyone who wants to convince me that it can be done for less than what i'm saying will have to first show me their credentials, second convince david conrad and jerry scharf. (after that, i'm all ears.)
Trying to do a comprehensive monolith will certainly make it a 5-year process. It seems that such an effort is doomed from the start though (as you say, who would fund it?) so I'm not really sure why it would be offered up as the only available outcome.

Take a different approach, it wouldn't be that hard to develop the framework alone. The killer for all these things is in the widgets that hang off them, but if the framework was usable and the widgets were easy to write (say, documented better than BIND9's API for example), the users would take care of providing the widgets. Look at all the noobs writing plugins for cacti and spamassassin and... users will write the plugins if the framework is accessible.

Don't give me a package that tries to provide everything, give me a daemon with inter-process messaging, event triggers, an extensible OO inheritance model and I'll do my own damn widgets... It wouldn't take five years to write that. It's a summer project.

Some of the things I want in an NMS that I can't find in end-all-be-all monolithic packages:

  self-config stuff
    default polling cycle
    authentication
    data-storage interfaces
    etc.

  host/device information
    static info (hostname, etc)
    dynamic info (hardware inventory, software inventory, etc)

  browser interface
    MIB browser
    CIM browser
    others

  polling events
    ICMP
    SNMP GET
    WBEM
    script interface
    TCP connection interface
    etc.

  alarm events
    SNMP traps
    WBEM notifications
    syslog
    eventlog
    etc.

  action events
    alerts (mail, pager, whatever)
    run local script
    run remote script
    manipulate

  escalation interface
    event unanswered, chain to other event
    event cleared, chain to other event

  reporting
    browser meters (eg, watch this mib with realtime tachometer)
    long-term graphing
    trend analysis/reporting
    etc.

Really it comes down to having a framework in place that can be extended by end-user admins. IOW it's the section heads, not the list items.

--
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/
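The "event triggers plus user-written widgets" core argued for above can be sketched in a few lines. This is an illustration of the general publish/subscribe pattern under stated assumptions, not a design anyone in the thread proposed in detail: the class name `EventBus`, the event names `poll.icmp` and `alarm`, and the payload fields are all hypothetical, and inter-process messaging is left out entirely.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Minimal in-process event dispatcher that plugins register with."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def on(self, event_type: str) -> Callable[[Handler], Handler]:
        """Decorator: subscribe a plugin function to one event type."""
        def register(handler: Handler) -> Handler:
            self._handlers[event_type].append(handler)
            return handler
        return register

    def emit(self, event_type: str, **payload) -> int:
        """Deliver an event to its handlers; return how many were called."""
        handlers = list(self._handlers.get(event_type, ()))
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = EventBus()
alerts: list[str] = []


@bus.on("poll.icmp")
def check_reachability(event: dict) -> None:
    # A polling event chains to an alarm event when the host is down.
    if not event["reachable"]:
        bus.emit("alarm", host=event["host"], reason="no ICMP reply")


@bus.on("alarm")
def record_alert(event: dict) -> None:
    # Stand-in for an action event such as mail or pager delivery.
    alerts.append(f"ALERT {event['host']}: {event['reason']}")


bus.emit("poll.icmp", host="rtr1", reachable=False)
```

The point of the sketch is the division of labour: the framework owns registration and dispatch, while polling, alarm, and action behaviour live in small functions that end-user admins can write themselves.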
On Wed, Jan 24, 2007 at 08:05:24PM +0000, Paul Vixie wrote:
glibly said, sir. but i disastrously underestimated the amount of time and money it would take to build BIND9.
While I can't question your credentials at creating serious network infrastructure, I wonder about the comparison between BIND9 and a network monitoring framework that _I_ envision.

I think I know a couple of requirement handicaps that BIND9 had which a new tool wouldn't. Specifically, you have to ensure compatibility with the RFCs, which locks you into a fairly complicated parser for the least-writable data format (the zone file) that I have ever had the displeasure of editing. While it gets easier over time, it seems remarkably difficult to get right the first time. Mostly, people forget to update the serial number, but other problems are common too. I imagine you also wanted to maintain the overall structure of the config file, but I don't see this as particularly problematic; it seems straightforward enough to me.

Furthermore, there is the monolithic issue; while I find it very convenient to have two name servers instead of four on my home network, it seems that it is serving too many masters (pun not intended). If recursive queries and queries to authoritative name servers had different ports, then there would be little reason to have both in the same package. I can solve it easily right now with IP aliases, which I consider a kluge, but the package I would use for it doesn't support some things that would be nice, like dynamic updates, but I suppose those too could be split off fairly easily.

Everybody I know who would have a use for a scalable monitoring system is capable of scripting, and most are capable of programming to extend the framework. I suspect an attempt to anticipate every possible need and solve them all at once with one tool would end up growing to unmanageable complexity far too quickly.

A framework is the easy part. At the URL in my signature you can find the dynamic firewall daemon, a framework for dynamically adjusting firewall rules. It has an asynch I/O core, so one thread, one program, one firewall, many clients. There is a python version for netfilter/Linux (which is very alpha and needs a new maintainer) and for BSD (pf of course). It supports fixed-size rule queues, rules which time out at a particular time (can be relative to current), rule sets that can be enabled or disabled by commands, variable substitution (where variable means "modifiable by external programs"), and so on, without requiring chains, tables or lists in the firewall syntax.

Although I spent a lot of time on design in my head, writing the code is the easy part. It took about a thousand lines of code. I could probably do it in less than 40 hours, but I couldn't do it all at once. The real problem appears to be thinking something over and over and letting your subconscious work on it until you're pretty sure the answer you had converged on consciously is the right one.

The hard part, I have found, is in getting people to contribute to it (or generating awareness, which may be a precondition). I'm thinking about writing up a paper on it for submitting to Usenix ;login: magazine; you might keep an eye open for it if you are interested. If you are interested in python and netfilter/iptables, and have some free time, then definitely send me an email; if you know anyone who would like to be an author of a cutting-edge network security system, let them know about it.
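Two of the features listed above, fixed-size rule queues and rules that time out, can be sketched as follows. This is not code from the dynamic firewall daemon; the class `RuleQueue`, its methods, and the rule strings are hypothetical, and the sketch only tracks which rules should be active rather than talking to pf or netfilter.

```python
import time
from collections import deque
from typing import Callable, Optional


class RuleQueue:
    """Bounded queue of firewall rules with optional expiry times."""

    def __init__(self, max_rules: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # deque(maxlen=...) silently drops the oldest rule when full.
        self._rules: deque = deque(maxlen=max_rules)
        self._clock = clock

    def add(self, rule: str, ttl: Optional[float] = None) -> None:
        """Queue a rule; `ttl` is seconds from now, None means no expiry."""
        expires = None if ttl is None else self._clock() + ttl
        self._rules.append((rule, expires))

    def active(self) -> list[str]:
        """Drop expired rules and return those still in force, oldest first."""
        now = self._clock()
        self._rules = deque(
            ((rule, exp) for rule, exp in self._rules
             if exp is None or exp > now),
            maxlen=self._rules.maxlen,
        )
        return [rule for rule, _ in self._rules]
```

A caller would periodically ask for `active()` and push the result to the real firewall. The injectable `clock` parameter is a design choice for testability, so that expiry can be exercised without sleeping.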
and talk to devices that will never go to an snmp connectathon,
Here is a scoping problem. If I started with this goal, I'd be stuck in analysis paralysis forever. I'd rather start with SNMP, and get a usable product that could be extended. The complexity of the task goes up with the square of the things to consider, so I think it's absolutely essential to start with limited objectives and generalize where appropriate on subsequent generations.

It seems to me the scalability problem (where most of the data is never read, and one box has to do everything) is more a problem of not being able to have the clients provide some resources without also having a complicated remote interface. Computers are very fast and only getting faster (of course disk I/O bandwidth is not accelerating at all, compared to CPU or network bandwidth). I'm not convinced it would take more than what python or another very expressive language could provide if properly distributed, and that alone would reduce the time spent writing code by a factor of 10-100, excluding the time spent coming up with a simple (secure) way of distributing the load.

I would be most interested in hearing what NANOG people would like to see in a monitoring tool. I think this is an excellent forum for hashing out what it should really do, and how.

--
``Unthinking respect for authority is the greatest enemy of truth.''
-- Albert Einstein -><- <URL:http://www.subspacefield.org/~travis/>
I see a reference in the response to RTG. RTG's claim to fame looks like speed. I've done some work with Cricket and have figured out a way to get at its schema. I've been looking at mating Cricket's getter and schema with the Drraw and genDevConfig tools and putting a Mason-based HTML wrapper around the whole thing so people can pick and choose the components of charts they want to see (per chart, per page). And by filling in simple web forms, it would be easy to generate command lines for genDevConfig to go out and create the customized SNMP queries that are needed for Dial-Peers, Cisco's Quality of Service, etc. Would anyone be interested in such a contraption?
-----Original Message----- From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Paul Vixie Sent: Wednesday, January 24, 2007 13:43 To: nanog@merit.edu Subject: Re: [cacti-announce] Cacti 0.8.6j Released (fwd)
jml@packetpimp.org (Jason LeBlanc) writes:
After looking for 'the ideal' tool for many years, it still amazes me that no one has built it. Bulk gets, scalable schema and good portal/UI. RTG is better than MRTG, but the config/db/portal are still lacking.
if funding were available, i know some developers we could hire to build the ultimate scalable pluggable network F/L/OSS management/monitoring system. if funding's not available then we're depending on some combination of hobbyists (who've usually got rent to pay, limiting their availability for this work) and in-house toolmakers at network owners (who've usually got other work to do, or who would be under pressure to monetize/license/patent the results if That Much Money was spent in ways that could otherwise directly benefit their competitors.)
"been there, done that, got the t-shirt." is there funding available yet? like, $5M over three years? spread out over 50 network owners that's ~$3K a month. i don't see that happening in a consolidation cycle like this one, but hope springs eternal. "give randy and hank the money, they'll take care of this for us once and for all." -- Paul Vixie
-- Scanned for viruses and dangerous content at http://www.oneunified.net and is believed to be clean.
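The back-of-the-envelope figure in the quoted message ($5M over three years, split across 50 network owners) can be checked directly. This is only a sketch of the arithmetic; the function name is made up for illustration, and the inputs are the numbers from the message.

```python
# Figures quoted in the message above.
TOTAL_FUNDING_USD = 5_000_000
YEARS = 3
NETWORK_OWNERS = 50


def monthly_share(total_usd: float, years: int, owners: int) -> float:
    """Return each owner's monthly contribution in dollars."""
    months = years * 12
    return total_usd / months / owners


if __name__ == "__main__":
    share = monthly_share(TOTAL_FUNDING_USD, YEARS, NETWORK_OWNERS)
    print(f"${share:,.0f} per owner per month")
```

That works out to a little under $2,800 per owner per month, consistent with the "~$3K a month" in the message.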
I see a reference in the response to RTG. RTG's claim to fame looks like speed.
In comparison to RRDTOOL-based applications, RTG stores raw values rather than cooked averages, allowing for a great deal more flexibility in analysis. And you aren't limited to a temporally fixed window of data.
On Wed, 24 Jan 2007, Mark Boolootian wrote:
I see a reference in the response to RTG. RTG's claim to fame looks like speed.
In comparison to RRDTOOL-based applications, RTG stores raw values rather than cooked averages, allowing for a great deal more flexibility in analysis. And you aren't limited to a temporally fixed window of data.
And meaning that the speed of analysis would be a function of x, where x is the length of the analysed period. RRD takes a good intermediate approach, storing all the data for the latest few samples and averages for longer time periods. Some people also use a dual approach, where data is stored in RRD for quick access to graphing (the day/month/year views network engineers here have liked to see since the days when MRTG came out) as well as in an SQL database for more detailed analysis on request (of course your database then also continues to grow indefinitely, unlike fixed-size RRD files). -- William Leibzon Elan Networks william@elan.net
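To make the raw-versus-averaged trade-off above concrete, here is a minimal sketch in Python. It is not how RTG or RRDtool are implemented; the sample values are invented, and `consolidate` is a hypothetical helper that only mimics an AVERAGE-style consolidation over fixed-size buckets.

```python
from statistics import mean


def consolidate(samples, bucket_size):
    """Average consecutive samples in fixed-size buckets (AVERAGE-style)."""
    return [
        mean(samples[i:i + bucket_size])
        for i in range(0, len(samples), bucket_size)
    ]


# Hypothetical 5-minute utilisation samples in Mbit/s, one hour's worth.
raw = [10, 12, 11, 95, 13, 12, 11, 10, 12, 11, 13, 10]

# One consolidated value for the whole hour.
hourly = consolidate(raw, bucket_size=12)

if __name__ == "__main__":
    print("peak in raw data:", max(raw))
    print("hourly average:  ", round(hourly[0], 1))
```

With raw storage the 95 Mbit/s spike is still there to be found later; once only the hourly average is kept, it is gone. RRDtool can also be configured to keep a MAX archive alongside AVERAGE, which recovers the peak but not the individual samples.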
Maybe this is overly naïve, but what about the ability to auto-magically import and search various vendor SNMP/WMI MIBs? I can think of 3 open source NMS that do a good job if you set up all 3 to monitor the network, but they all overlap and none of them really does a good job on its own. I also am using a closed-source NMS at work that does little more than minimal on-system agent monitoring of Windows/Linux based servers (disk space, CPU, memory utilization). What I'd like to see: good graphing, good alerts, good SNMP integration, granularity, and escalation, as well as pretty executive reports to keep PHBs happy (and that display the system as 5 9's uptime no matter how many times the mail server crashed!). The reason why the open-source tools don't work is a lack of comprehensive coverage of Cisco, third party network kit, Linux and Windows. It just doesn't quite "do it all". The reason why the closed-source tool didn't work (in my mind) is that it just doesn't have the flexibility to deal with anything other than what it's expecting. I've submitted a few dozen support tickets with them (and they will remain nameless) simply because of a lack of SNMP knowledge on their part. Please forgive me for all above M$ specific references, I work in a MS and *IX environment. Andrew D Kirch - All Things IT Office: 317-755-0202 "si hoc legere scis nimium eruditiones habes."
-----Original Message----- From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Ray Burkholder Sent: Wednesday, January 24, 2007 1:12 PM To: nanog@merit.edu Subject: RE: [cacti-announce] Cacti 0.8.6j Released (fwd)
I see a reference in the response to RTG. RTG's claim to fame looks like speed.
I've done some work with Cricket and have figured out a way to get at its schema. I've been looking at mating Cricket's getter and schema with the Drraw and genDevConfig tools and putting a Mason-based HTML wrapper around the whole thing so people can pick and choose the components of charts they want to see (per chart), (per page). And by filling in simple web forms, it would be easy to generate command lines for genDevConfig to go out and create the customized SNMP queries that are needed for Dial-Peers, Cisco's Quality of Service, etc.
Would anyone be interested in such a contraption?
-----Original Message----- From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Paul Vixie Sent: Wednesday, January 24, 2007 13:43 To: nanog@merit.edu Subject: Re: [cacti-announce] Cacti 0.8.6j Released (fwd)
jml@packetpimp.org (Jason LeBlanc) writes:
After looking for 'the ideal' tool for many years, it still amazes me that no one has built it. Bulk gets, scalable schema and good portal/UI. RTG is better than MRTG, but the config/db/portal are still lacking.
if funding were available, i know some developers we could hire to build the ultimate scalable pluggable network F/L/OSS management/monitoring system. if funding's not available then we're depending on some combination of hobbyists (who've usually got rent to pay, limiting their availability for this work) and in-house toolmakers at network owners (who've usually got other work to do, or who would be under pressure to monetize/license/patent the results if That Much Money was spent in ways that could otherwise directly benefit their competitors.)
"been there, done that, got the t-shirt." is there funding available yet? like, $5M over three years? spread out over 50 network owners that's ~$3K a month. i don't see that happening in a consolidation cycle like this one, but hope springs eternal. "give randy and hank the money, they'll take care of this for us once and for all." -- Paul Vixie
Maybe this is overly naïve, but what about the ability to auto-magically import and search various vendor SNMP/WMI MIBs? I can think of 3 open source NMS that do a good job if you set up all 3 to monitor the network, but they all overlap and none of them really do a good job.
Importing and searching MIBs is an interesting idea. However, for some MIBs, like Cisco's QoS and Dial-Peer MIBs, wrapper code sometimes has to be used to ferret out the appropriate groupings to use as logical entities for display. WMI requires Windows Authentication, and if one is running Linux tools, there are issues. I haven't come across an easy way to get to WMI from Linux yet. Anyone have any suggestions?
On 1/24/2007 2:46 PM, Ray Burkholder wrote:
WMI requires Windows Authentication, and if one is running Linux tools, there are issues. I haven't come across an easy way to get to WMI from Linux yet. Anyone have any suggestions?
I've been working on this for a while actually. WMI is WBEM, except that WMI uses DCOM as a transfer protocol instead of using HTTP like WBEM. The big problem for Linux is that there aren't any implementations. However there are some interesting tools that provide gateway services that get around the problem. Part of the openpegasus tarball is a program called wmimapper that provides a WBEM to WMI gateway. Basically you send it WBEM queries with HTTP authentication etc, and it converts those into WMI requests. It runs on Windows (to generate the DCOM), and it's source-only so you'll need to compile it yourself (although IBM and HP also include older ports in their server monitoring software). I've been using it to pull Everest sensor data off Windows boxes into Cacti on Linux for a while. There are some problems with the whole thing, but it pretty much works. SNMP Informant has a WMI-SNMP gateway agent that makes some/most Windows data available through SNMP, which is handy. nsclient also provides access to some perfmon and static data through a custom agent/proxy protocol too. http://forums.cacti.net/viewtopic.php?t=11752 http://www.openpegasus.org/ http://www.snmp-informant.com/ http://nsclient.ready2run.nl/ -- Eric A. Hall http://www.ehsco.com/ Internet Core Protocols http://www.oreilly.com/catalog/coreprot/
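As a rough illustration of the "WBEM queries over HTTP" half of the gateway described above, the sketch below builds a CIM-XML EnumerateInstances request using only the Python standard library. It does not send anything. The class name `Win32_OperatingSystem`, the `root/cimv2` namespace, and the header set are assumptions based on the general WBEM CIM-XML conventions; whether a particular wmimapper build accepts exactly this request is not something the message confirms.

```python
import xml.etree.ElementTree as ET


def build_enumerate_instances(class_name, namespace="root/cimv2", message_id="1001"):
    """Return a CIM-XML EnumerateInstances request body as UTF-8 bytes."""
    cim = ET.Element("CIM", CIMVERSION="2.0", DTDVERSION="2.0")
    message = ET.SubElement(cim, "MESSAGE", ID=message_id, PROTOCOLVERSION="1.0")
    simple_req = ET.SubElement(message, "SIMPLEREQ")
    call = ET.SubElement(simple_req, "IMETHODCALL", NAME="EnumerateInstances")

    # The namespace is sent as one NAMESPACE element per path component.
    ns_path = ET.SubElement(call, "LOCALNAMESPACEPATH")
    for part in namespace.split("/"):
        ET.SubElement(ns_path, "NAMESPACE", NAME=part)

    # The class to enumerate is passed as the ClassName parameter.
    param = ET.SubElement(call, "IPARAMVALUE", NAME="ClassName")
    ET.SubElement(param, "CLASSNAME", NAME=class_name)

    return ET.tostring(cim, encoding="utf-8", xml_declaration=True)


def build_headers(namespace="root/cimv2"):
    """HTTP headers that conventionally accompany a CIM-XML method call."""
    return {
        "Content-Type": 'application/xml; charset="utf-8"',
        "CIMOperation": "MethodCall",
        "CIMMethod": "EnumerateInstances",
        "CIMObject": namespace,
    }


if __name__ == "__main__":
    print(build_enumerate_instances("Win32_OperatingSystem").decode("utf-8"))
```

In practice the body and headers would be POSTed, with HTTP authentication, to whatever URL the gateway listens on; a WBEM client library would normally handle all of this, so treat the hand-built request purely as a way to see what travels over the wire.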
On Wed, Jan 24, 2007 at 02:12:01PM -0400, Ray Burkholder wrote:
I've done some work with Cricket and have figured out a way to get at its schema. I've been looking at mating Cricket's getter and schema with the Drraw and genDevConfig tools and putting a Mason-based HTML wrapper around the whole thing so people can pick and choose the components of charts they want to see (per chart), (per page). And by filling in simple web forms, it would be easy to generate command lines for genDevConfig to go out and create the customized SNMP queries that are needed for Dial-Peers, Cisco's Quality of Service, etc.
This seems to be a better solution than CGIs and web forms, both from a "graphic designer" point of view as well as KISS point of view. Unfortunately it's not currently compatible with HTTP/S and HTML, but any major improvement that was also simpler would have to be. -- ``Unthinking respect for authority is the greatest enemy of truth.'' -- Albert Einstein -><- <URL:http://www.subspacefield.org/~travis/> For a good time on my UBE blacklist, email john@subspacefield.org.
On Wed, Jan 24, 2007 at 08:34:19AM -0500, Jason LeBlanc wrote:
I would say somewhere around 4000 network interfaces (6-8 stats per int) and around 1000 servers (8-10 stats per server) we started seeing problems, both with navigation in the UI and with stats not reliably updating. I did not try that poller, perhaps it's worth trying again with it. I will also say this was about 2 years ago, I think the box it was running on was a dual P3-1000 with a raid 10 using 6 drives (10k rpm I think).
After looking for 'the ideal' tool for many years, it still amazes me that no one has built it. Bulk gets, scalable schema and good portal/UI. RTG is better than MRTG, but the config/db/portal are still lacking.
So, i've been the caretaker of a few different snmp pollers over a few years, as well as done some database foo (250m+ rows/day of data) and these things interrelate in a number of ways.
First start with the polling, you need to do bulkget/bulkwalk of the various mibs to collect the data in a reasonable way, timestamp it all (either internally before you "cook" the data), poll frequently enough to detect spikes (including inaccurate spikes and backwards/missing counter bugs), etc..
Take a simple set of data you might want to collect:
router
  interfaces (mib)
    up/down
    in/out octets, in/out packets, in errors/out drops
    speed (ifMIB too?)
  ifMIB (64-bit counters, but only sometimes)
    description
    speed (interface mib too?)
  mpls ? ldp? te? paths?
  mac accounting ?
then you get into do you store the raw data you collect with markers for snmp timeouts, or just a 5 min calculation/sample? (this relates to the above 250m rows/day) how do you define your schema? how long does it take to insert/index/whatnot the data? how to handle ifindex moves (not just one vendor too, don't forget that)? how do you match that link to a customer for billing? who gets what reports? engineering reports too? provisioning link-in? tie to ip address db (interface ip<->customer mapping)?
the list goes on and on, this is just part of it, let alone any possible tracking of assets/hardware, let alone proactive network monitoring (tie those traps/walks) to the internal ping(er) to passive network monitoring, etc..
this is a huge burden to figure it all out, implement and then monitor/operate 24x7. miss enough samples or data and you end up billing too little. this is why most folks have either cooked their own, or use some expensive suite of tools, leaving just a little bit of other stuff out there. in a lot of ways, just buying a ge/10ge and paying some alternate price for it may be cheaper than a burstable rate as it could reduce a lot of this extra cost.
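The "cooking" step and the counter bugs mentioned above can be sketched as follows. This is an illustrative Python helper, not code from any of the pollers discussed: `counter_rate` and its `max_rate` sanity limit are assumptions, and the single-wrap heuristic is only one of several ways to treat a counter that appears to go backwards.

```python
from typing import Optional

COUNTER32_MODULUS = 2**32
COUNTER64_MODULUS = 2**64


def counter_rate(prev_value: int, prev_ts: float,
                 curr_value: int, curr_ts: float,
                 modulus: int = COUNTER32_MODULUS,
                 max_rate: Optional[float] = None) -> Optional[float]:
    """Per-second rate between two counter samples, or None if unusable."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        # Duplicate or out-of-order timestamps: nothing sensible to compute.
        return None

    delta = curr_value - prev_value
    if delta < 0:
        # Counter went backwards: assume exactly one wrap of the counter.
        delta += modulus

    rate = delta / elapsed
    if max_rate is not None and rate > max_rate:
        # Faster than the link can go: more likely a device reset or an
        # agent bug than real traffic, so mark the sample as missing.
        return None
    return rate


if __name__ == "__main__":
    # Normal case: 3000 octets in 300 seconds.
    print(counter_rate(1000, 0, 4000, 300))
    # 32-bit wrap between polls.
    print(counter_rate(2**32 - 100, 0, 200, 300))
    # Counter reset on a 10 Mbit/s link (1,250,000 octets/s): rejected.
    print(counter_rate(1_000_000_000, 0, 5, 300, max_rate=1_250_000))
```

Keeping the raw value and timestamp alongside the computed rate leaves room to redo this calculation later if the heuristic turns out to be wrong for some device.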
i remember hearing that it cost telcos more to count/track the calls to give you a detailed bill than for the call itself. this is why flat-rate is nearly king these days (in the us at least). - jared -- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
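On the "ifindex moves" question raised in the message above, one common approach is to key stored data on a more stable name and treat ifIndex as something to re-learn on every walk. A minimal sketch, assuming interface names (for example ifName from the IF-MIB) are unique per device, which is not guaranteed on every vendor; `track_ifindex_moves` is a hypothetical helper, not part of any tool named in the thread.

```python
def track_ifindex_moves(known, walk):
    """Compare a stored name->ifIndex map with the latest index->name walk.

    Returns (current, moved): the refreshed name->ifIndex map, and a sorted
    list of (name, old_index, new_index) for interfaces whose index changed.
    """
    current = {name: index for index, name in walk.items()}
    moved = sorted(
        (name, known[name], index)
        for name, index in current.items()
        if name in known and known[name] != index
    )
    return current, moved


if __name__ == "__main__":
    # Hypothetical state from the previous poll and from the latest walk.
    known = {"Gi0/1": 1, "Gi0/2": 2}
    walk = {1: "Gi0/1", 5: "Gi0/2", 6: "Gi0/3"}

    current, moved = track_ifindex_moves(known, walk)
    print(current)
    print(moved)
```

Billing and reporting tables would then reference the interface name (or an internal ID mapped from it) rather than the raw ifIndex, so a renumbering after a reload or line card change does not silently attach traffic to the wrong customer.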
participants (17)
- Andrew Kirch
- Berkman, Scott
- Chris Owen
- Eric A. Hall
- Gadi Evron
- Henning Brauer
- Jared Mauch
- Jason LeBlanc
- Jeremy Chadwick
- Jeroen Massar
- Jim Popovitch
- Jon Lewis
- Mark Boolootian
- Paul Vixie
- Ray Burkholder
- Travis H.
- william(at)elan.net