
You know what's happening even more? Amazon not learning their lesson. They just had a quite similar outage, and according to the RFO/post-mortem they "performed a full audit" of electrical systems worldwide. Looks like they need to perform a "full, and we mean it" audit and, like I've been doing/participating in at dot-coms for a decade plus, Actually Do Regular Load Tests.

Related and equally to blame: companies that rely heavily on one AWS zone, or arguably on "one cloud" (period), are asking for it.

Please stop these crappy practices, people. Do real-world DR testing. Play "What If This City Dropped Off The Map" games, because tonight parts of VA in fact did.

Down: Instagram, Pinterest, Netflix, Heroku, Woot, Pocket (Read It Later), and on and on. A bunch of OpenID sites. A bunch of DNS sites (think ZoneEdit et al.). In fact, probably nearly a /12 of address space, if not more.

Blame lies both with AWS (again) and with these service providers. They all should know better.

-j

On Jun 29, 2012 11:22 PM, "Justin M. Streiner" <streiner@cluebyfour.org> wrote:
On Fri, 29 Jun 2012, Mike Lyon wrote:
Whatever happened to UPSs and generators?
They can and do fail. See list archives for numerous reports and examples :)
Generators are capable of not starting. ATSs can get into a situation where they don't transfer loads properly, or they can't start the generator(s). UPSs can fail, drain out, or be left in bypass. Breakers can trip and need a manual reset, etc.
jms
On Fri, Jun 29, 2012 at 8:45 PM, Jason Baugher <jason@thebaughers.com> wrote:
Nature is such a PITA.
On 6/29/2012 10:42 PM, James Laszko wrote:
To further expand:
8:21 PM PDT We are investigating connectivity issues for a number of instances in the US-EAST-1 Region.
8:31 PM PDT We are investigating elevated errors rates for APIs in the US-EAST-1 (Northern Virginia) region, as well as connectivity issues to instances in a single availability zone.
8:40 PM PDT We can confirm that a large number of instances in a single Availability Zone have lost power due to electrical storms in the area. We are actively working to restore power.
-----Original Message-----
From: Grant Ridder [mailto:shortdudey123@gmail.com]
Sent: Friday, June 29, 2012 8:42 PM
To: Jason Baugher
Cc: nanog@nanog.org
Subject: Re: FYI Netflix is down
From Amazon
Amazon Elastic Compute Cloud (N. Virginia) (http://status.aws.amazon.com/)
8:21 PM PDT We are investigating connectivity issues for a number of instances in the US-EAST-1 Region.
8:31 PM PDT We are investigating elevated errors rates for APIs in the US-EAST-1 (Northern Virginia) region, as well as connectivity issues to instances in a single availability zone.
-Grant
On Fri, Jun 29, 2012 at 10:40 PM, Jason Baugher <jason@thebaughers.com> wrote:
Seeing some reports of Pinterest and Instagram down as well. Amazon
cloud services being implicated.
On 6/29/2012 10:22 PM, Joe Blanchard wrote:
Seems that they are unreachable at the moment. Called, and there's a
recorded message stating they are aware of an issue, no details.
-Joe
-- Mike Lyon 408-621-4826 mike.lyon@gmail.com
http://www.linkedin.com/in/mlyon