On Sun, Apr 26, 2009 at 11:18 PM, JC Dill <jcdill.lists@gmail.com> wrote:
How else do you propose getting outage information to your customers?
I should have clarified. Third-party physical control isn't necessarily the issue, but third-party administration and delivery (in the context of Twitter) is. Dedicated servers are cheap, and you can maintain control of the content. It's not quite the same as using Twitter or a similar third-party SaaS, which can, invariably, control the content at its whim and, depending on the service, makes it a nightmare to manage who is authorized to view such outage info. Or even run a mailer that is outside the scope of your service ops and permit only customers to subscribe.

Again, it's more about distribution in these environments. If I'm Company A, I really don't want to readily provide my competitor, Company B, with information on outages and a full history of them to use in some marketing device (which can't be compared, because Company B does not publish its own info but instead provides some nice glossy-paper stats).

Physical control certainly can't be the question, or we'd have the same argument in circles: "If we have physical control, how can we ensure the outage doesn't affect this net too? Better question: why can't we fail over to the net that's working to send these notifications/updates for our down services, if that net isn't affected?" That point is moot.

My biggest complaint has been with networks that set up a channel like this but then get "too busy" during an outage to make use of it. If you are going to set up a channel like this, make sure you use it. Also, if you post a partial update, make sure you follow up with more information when you have it. Some of us read the archives to see if this information was posted and followed up on in a timely fashion, to evaluate the outage reporting service record before signing up.
Notifications should be distributed proactively and followed up with a ticket or some other relevant mechanism. Implementing a process like this is trivial in many environments. Incident response should, in most cases, include a mechanism like this that is already deployed today. Modifications (technically speaking) should not be a big issue.

--WJM IV