NOC Automation / Best Practices
NOGGERS,

The recent thread on ISP port blocking practice mentioned a way to identify infected machines in a highly automated manner. This got me thinking about other ways to automate aspects of network/system operations when it comes to tier-1 end user support (is it plugged in/is your wireless working, etc.) and tier-2/3 NOC support (abuse desk/incident response/routing issues, etc.). I'm putting a very high degree of monitoring/healing in place to reduce the number of end user support calls that come in, and to only bother a human when it's a real issue.

I'm in the process of launching a small regional wireless ISP / ad delivery network in Los Angeles, CA. I have a small staff (I'm the only full-time engineer, I have a couple of NOC techs and 1 help desk tech who will provide escalation for any serious issues).

My initial thoughts/questions on the matter:

1) Are people integrating their PBX with their OSS/CRM systems, so that when a call comes in the tech has all the relevant information (perhaps even things like traceroute/port scan/AV/security health status, based on the caller's phone number or customer number)? This way, if I take a user offline because they are spewing spam/viruses, the tech can refer them to our IT support partner organization to clean up their PC. :)

2) What sort of automated alerting/reporting/circuit turn down/RADIUS lock out is done in regards to alerting customers, or even taking them offline, when they have a security issue?

3) What are folks doing in terms of frontline offloading? Do you have your PBX set to play a different recording when you have an outage, so the NOC techs' phones don't go crazy and they are left free to deal with the issue?

4) Your comments here. :)

The way I see it, an ounce of prevention is worth a pound of cure. Along those lines, I'm putting in some mitigation techniques, as follows (hopefully this will reduce the number of incidents and therefore calls to the abuse desk). I would appreciate any feedback folks can give me.

A) Force any outbound mail through my SMTP server with AV/spam filtering.

B) Force HTTP traffic through a SQUID proxy with SNORT/ClamAV running. (Several other WISPs are doing this with fairly substantial bandwidth savings. However, I realize that many sites aren't cache friendly. Anyone know of a good way to check for that? Look at HTTP headers?) Do the bandwidth savings/security checking outweigh the increased support calls due to "broken" web sites?

C) Force DNS to go through my server. I hope to reduce DNS hijacking attacks this way.

Thanks!
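
On the question in B) about checking cache friendliness from HTTP headers, here is a minimal Python sketch of one way it might be done. It is only a heuristic under stated assumptions, not how Squid itself decides: the function names `parse_cache_control` and `is_cache_friendly` are made up for illustration, and the rules are a simplified reading of the standard `Cache-Control`, `Pragma`, `Expires`, `ETag` and `Last-Modified` response headers as seen by a shared proxy.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control header into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def is_cache_friendly(headers, now=None):
    """Rough guess (hypothetical helper): could a shared proxy store this response?"""
    now = now or datetime.now(timezone.utc)
    h = {k.lower(): v for k, v in headers.items()}
    cc = parse_cache_control(h.get("cache-control", ""))

    # Explicit opt-outs: treat these as "not cacheable by a shared proxy".
    if "no-store" in cc or "private" in cc or "no-cache" in cc:
        return False
    if "no-cache" in h.get("pragma", "").lower():
        return False

    # An explicit freshness lifetime; s-maxage takes priority for shared caches.
    for key in ("s-maxage", "max-age"):
        if key in cc:
            try:
                return int(cc[key]) > 0
            except (TypeError, ValueError):
                return False

    # Otherwise fall back to an Expires date that is still in the future.
    if "expires" in h:
        try:
            expires = parsedate_to_datetime(h["expires"])
        except (TypeError, ValueError):
            return False  # unparseable dates are treated as already expired
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        return expires > now

    # No lifetime given: a validator at least allows heuristic caching.
    return "last-modified" in h or "etag" in h
```

For a live check, the headers could be taken from something like `dict(urllib.request.urlopen(url).headers.items())` and passed to `is_cache_friendly`. Sampling the most requested sites from the proxy logs this way should give a rough idea of how much traffic is cacheable at all, before weighing that against the support calls.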
On Sep 8, 2010, at 10:54 PM, Charles N Wyble wrote:
> I would appreciate any feedback folks can give me.
<https://files.me.com/roland.dobbins/y4ykq0>
<https://files.me.com/roland.dobbins/k54qkv>
<https://files.me.com/roland.dobbins/prguob>
<https://files.me.com/roland.dobbins/k4zw3x>
<https://files.me.com/roland.dobbins/dweagy>
<https://files.me.com/roland.dobbins/m4g34u>
<https://files.me.com/roland.dobbins/9i8xwl>

-----------------------------------------------------------------------
Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>

Sell your computer and buy a guitar.
participants (2)
- Charles N Wyble
- Dobbins, Roland