It has been a while since I have had to seriously think about network/system/application monitoring and now I have got to look at it. Can anyone point me towards:

1. Serious documents on monitoring (i.e. not vendor whitepapers)
2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).

Thanks,

---> Phil
Nesser, Phil (nesser) writes:
It has been a while since I have had to seriously think about network/system/application monitoring and now I have got to look at it. Can anyone point me towards:
1. Serious documents on monitoring (i.e. not vendor whitepapers)
Hi Phil,

There are lots of different papers out there -- define serious. Is an online column comparing monitoring systems serious enough? What focus? Best practices? Agent vs SNMP based, etc... Topics are varied.
2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).
That can be a long thread as well... Nagios, OpenNMS, Zabbix, Hyperic, Zenoss for application/service/server/network monitoring, and Cacti, Smokeping, NfSen for capacity/availability monitoring. We used Nagios and co. until a few years ago, when we figured it wouldn't scale for large networks. Then we wrote our own :)

Cheers,
Phil
Stephen Stuart and Joe Abley did a tutorial at NANOG26 called "Managing IP Networks with Free Software". It covers more than just monitoring, and is great if you aren't going to just roll your own... The PDF is here: http://www.nanog.org/mtg-0210/ppt/stephen.pdf

I cannot remember if they mentioned it, but in case they didn't, you should also include NAV (Network Administration Visualized) -- http://metanav.uninett.no/

W

On Oct 30, 2007, at 4:59 PM, Nesser, Phil wrote:
It has been a while since I have had to seriously think about network/system/application monitoring and now I have got to look at it. Can anyone point me towards:
1. Serious documents on monitoring (i.e. not vendor whitepapers)
2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).
Thanks,
---> Phil
On Tue, 30 Oct 2007, Nesser, Phil wrote:
It has been a while since I have had to seriously think about network/system/application monitoring and now I have got to look at it. Can anyone point me towards:
1. Serious documents on monitoring (i.e. not vendor whitepapers)
I think there have been several sets of slides presented at previous NANOG meetings that may be of interest, but I'll have to locate specific URLs.
2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).
As much as I hate to give a wishy-washy answer like "it depends", in this case, that's a reasonable start. What tools you use would depend on many factors, such as:

* hardware and OS platforms that are realistic for your organization

Put another way, if your IT or net mgmt organization standardizes on some flavor of Windows as part of a regular server build, it might not make sense to use tools that require Linux, *BSD, etc, unless you have the people and processes to handle that. Since you mentioned tools like nagios and MRTG, I'm assuming you're working in the unix/Linux/*BSD world, but you know what they say about assumptions :)

* goals and metrics

What information do you want to get out of your monitoring setup?
Do you need to produce regular reports from your NM tools?
Do they need to integrate with tools you already use?
Do you want the tools to automatically trigger certain actions?
    if X consecutive pings to $router_ip fail, send out a page, email the NOC, etc...
What data do you want to collect from your network devices? SNMP traps? Netflow records? Syslog messages? RMON?
Do you need to visualize the data, i.e. generate usage graphs, top-talker scoreboards, etc?
Do you need to store the output in a central SQL database so other apps can work with it, do reports, etc?

This is by no means an all-inclusive list, but I think it covers some of the important points.

jms
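The "X consecutive pings fail, then page" rule mentioned above could be sketched roughly as follows. This is only an illustration, not how Nagios or any other tool implements it: the class name, `ping_once`, and `notify` are hypothetical, and the `ping` flags assume the Linux iputils version (`-c` count, `-W` timeout in seconds).

```python
import subprocess
from dataclasses import dataclass


@dataclass
class ConsecutiveFailureAlarm:
    """Tracks probe results and fires after `threshold` failures in a row."""
    threshold: int      # the "X" in "X consecutive pings"
    failures: int = 0   # current run of consecutive failures

    def record(self, ok: bool) -> bool:
        """Record one probe result; return True when the alarm should fire."""
        if ok:
            self.failures = 0   # any success resets the run
            return False
        self.failures += 1
        # Fire exactly once, on the probe that reaches the threshold,
        # so a long outage does not page on every subsequent probe.
        return self.failures == self.threshold


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send a single ICMP echo. Assumes Linux iputils `ping` flags."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def notify(message: str) -> None:
    """Hypothetical placeholder: a real setup would page or email the NOC."""
    print(message)
```

A polling loop would call `alarm.record(ping_once(router_ip))` on each interval and call `notify(...)` whenever it returns True. Keeping the counting logic separate from the probe makes it easy to reuse with SNMP or TCP checks.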
On Oct 31, 2007, at 3:59 AM, Nesser, Phil wrote:
1. Serious documents on monitoring (i.e. not vendor whitepapers)
Several NANOG presentations that discuss this subject are available as VoD and presentation files; check the archives at nanog.org.

Besides the usual SNMP instrumentation, I would recommend taking a look at NetFlow and starting with an open-source tool like nfsen/nfdump.

-----------------------------------------------------------------------
Roland Dobbins <rdobbins@cisco.com> // 408.527.6376 voice

I don't sound like nobody.
  -- Elvis Presley
hi phil. long time.

so, before opening my big mouth, the From: line makes me first ask

o what is the scale of what you are trying to measure/monitor?
o what kinds of parameters are you trying to measure/monitor?
o what kinds of reporting/alerts are you seeking?
o is this snmp kind of stuff, or far more?

randy
Randy Bush (randy) writes:
hi phil. long time.
so, before opening my big mouth, the From: line makes me first ask
And, are you limited to monitoring, or are you actually thinking about network management as well? (Things like Rancid, RT, incident/event management, configuration management/change management come to mind.)
owner-nanog@merit.edu wrote on 10/30/2007 04:59:05 PM:
2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).
As mentioned, you can get a lot of network information from netflow. There are several open-source options. One option for netflow collection/analysis is 'flow-tools' with 'FlowViewer'.

http://www.splintered.net/sw/flow-tools (original development)
http://code.google.com/p/flow-tools (active fork)
http://ensight.eos.nasa.gov/FlowViewer

Joe
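To give a feel for the kind of analysis flow data makes possible, here is a minimal top-talkers sketch. It does not use flow-tools, FlowViewer, or nfdump; it assumes the flow records have already been exported into hypothetical `(src_ip, dst_ip, byte_count)` tuples by whatever collector is in use.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical record shape: (source IP, destination IP, bytes in the flow).
Flow = Tuple[str, str, int]


def top_talkers(flows: Iterable[Flow], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n source addresses that sent the most bytes."""
    bytes_by_src: Counter = Counter()
    for src, _dst, nbytes in flows:
        bytes_by_src[src] += nbytes   # aggregate per source address
    return bytes_by_src.most_common(n)


if __name__ == "__main__":
    sample = [
        ("10.0.0.1", "10.0.0.9", 500),
        ("10.0.0.2", "10.0.0.9", 1500),
        ("10.0.0.1", "10.0.0.8", 700),
        ("10.0.0.3", "10.0.0.9", 100),
    ]
    for addr, total in top_talkers(sample, n=2):
        print(addr, total)
```

The same aggregation keyed on destination address, port, or AS number gives the other usual scoreboards.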
On 10/30/07, Nesser, Phil <nesser@amazon.com> wrote:
2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).
I don't see netdisco mentioned in this space very much, but I recommend it for the "what is plugged into what" question - both in an enterprise environment ("where is this misbehaving MAC address?") and a data center ("which port was that server plugged into on the switch?"). Bill
Bill Fenner (fenner) writes:
On 10/30/07, Nesser, Phil <nesser@amazon.com> wrote:
2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).
I don't see netdisco mentioned in this space very much, but I recommend it for the "what is plugged into what" question - both in an enterprise environment ("where is this misbehaving MAC address?") and a data center ("which port was that server plugged into on the switch?").
Some of the Metanav features actually do this, but yes, NetDisco is quite useful (its Perl modules in particular are invaluable when doing configuration management across the bogos^H^Hvariety of Cisco equipment out there).
On Wed, 31 Oct 2007, Bill Fenner wrote:
On 10/30/07, Nesser, Phil <nesser@amazon.com> wrote:
2. Open Source Tools that you use or would recommend (I know the obvious smokeping, mrtg, nagios).
I don't see netdisco mentioned in this space very much, but I recommend it for the "what is plugged into what" question - both in an enterprise environment ("where is this misbehaving MAC address?") and a data center ("which port was that server plugged into on the switch?").
Anecdotal evidence of the usefulness of such tools:

The environment was a pair of cat6509s running multiple gigabit etherchannel crossconnects, with lots of gigabit and 100mbit servers on either side, talking back and forth to each other, or up the stick to the egress routers. I was building an inventory tool to help me track down mislabelled or unlabelled ports, to clean up and audit the device inventory.

I noticed one lonely 100 meg port bridging a large number of MAC addresses that were homed on the *other* 6509. I mentioned it as odd in passing to the network engineer, and was advised that my tool was probably broken. I took it under advisement and went on about my business.

A few hours later, I discovered that I could make the sysadmins and network engineers run around asking each other what was broken by scp'ing a huge file between two databases on opposite switches. When I stopped my transfer, they stopped running. Start it again, panic at the disco. Very refreshing.

I brought up the 100 meg port bridging all those addresses, and lo and behold, a misconfigured load balancer had somehow suborned the multi-gig etherchannel crossconnects and was bridging everything in the one big vlan that all the servers sat in. (That's a different story.)

- billn
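The check that caught this (an access port learning far more MAC addresses than it should) can be sketched in a few lines. This is not billn's actual tool: the input format, the function name, and the `max_macs` threshold are all assumptions, and the forwarding-table entries would have to be gathered separately, e.g. via SNMP or netdisco.

```python
from collections import defaultdict
from typing import Iterable, List, Set, Tuple

# Hypothetical shapes: a port is (switch name, port name);
# a forwarding-table entry is (switch name, port name, MAC address).
Port = Tuple[str, str]
FdbEntry = Tuple[str, str, str]


def suspicious_ports(
    fdb_entries: Iterable[FdbEntry],
    trunk_ports: Set[Port],
    max_macs: int = 4,
) -> List[Tuple[Port, int]]:
    """List non-trunk ports that have learned more than max_macs addresses."""
    macs_by_port = defaultdict(set)
    for switch, port, mac in fdb_entries:
        # Normalise case so the same MAC is not counted twice.
        macs_by_port[(switch, port)].add(mac.lower())
    return sorted(
        (key, len(macs))
        for key, macs in macs_by_port.items()
        # Known inter-switch links legitimately carry many MACs; skip them.
        if key not in trunk_ports and len(macs) > max_macs
    )
```

Ports returned here are only candidates for a closer look; a port feeding a hypervisor or a downstream switch will also exceed a small threshold.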
participants (9)
- Bill Fenner
- Bill Nash
- Joe Loiacono
- Justin M. Streiner
- Nesser, Phil
- Phil Regnauld
- Randy Bush
- Roland Dobbins
- Warren Kumari