On 5/1/2011 2:07 PM, Mike wrote:
I am still waiting for proof that single points of failure can realistically be completely eliminated from any moderately complicated network environment or application. So far, I think Murphy is still winning on this one.
Sure they can, but as a thought exercise, full 2N redundancy is difficult on a small scale for anything web-facing. I've seen a very simple implementation for a website requiring five nines that consumed over $50k in equipment, and it wasn't even geographically diverse. I have to believe that scaling up the concept of "doing it right" results in exponential cost increases. To illustrate the problem, here is the first step in the thought exercise: find two datacenters with diverse carriers that aren't on the same regional power grid. As we learned in the (IIRC) 2003 power outage, New York and DC won't work, nor will Ohio, so you need redundant teams to cover a very remote site.
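
For anyone who wants to put rough numbers on "five nines" and on what a leftover single point of failure does to a redundant pair, here is a minimal back-of-the-envelope sketch. The function names and the sample availability figures are made up for illustration, and the math assumes failures are independent, which is precisely what a shared power grid or carrier breaks.

```python
# Back-of-the-envelope availability math. All names and sample figures
# here are illustrative assumptions, not measurements from a real site.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability):
    """Allowed downtime per year for a given availability (0.0 to 1.0)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def parallel(availability, copies):
    """Availability of N redundant copies, ASSUMING independent failures."""
    return 1.0 - (1.0 - availability) ** copies


def series(*availabilities):
    """Availability of components that must all be up at the same time."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


if __name__ == "__main__":
    # Five nines leaves a little over five minutes of downtime per year.
    print(f"{downtime_minutes_per_year(0.99999):.2f} minutes/year")

    # Two independent 99% sites in parallel look like 99.99% on paper...
    pair = parallel(0.99, 2)
    # ...but one shared 99.9% dependency in series caps the whole thing.
    print(f"{series(pair, 0.999):.7f}")
```

The second figure comes out below the shared dependency's own 99.9%, which is the point: the redundant pair does nothing for whatever both sites still have in common.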