James Downs wrote:
For Netflix (and all other similar services) downtime is money and money is downtime. There is a quantifiable cost for customer acquisition and a quantifiable churn during each minute of downtime. Mature organizations actually calculate and track this. The trick is to ensure that you have balanced the cost of greater redundancy vs. the cost of churn/customer acquisition. If you are spending too much on redundancy, it's as big a mistake as spending too little.
Actually, for Netflix, so long as downtime is infrequent or short enough that users don't cancel, it saves them money. They're not paying royalties for movies being streamed during downtime, but they're still collecting their $8/month. To my knowledge, there is no meaningful SLA for the end user. I imagine the threshold for *any* user churn based on downtime is very high for Netflix. So long as they are "about as good as cable/satellite TV" in terms of uptime, Netflix will do fine. You would have to get down to 98% uptime or lower before people would really start getting irritated enough to cancel. Of course, multiple short outages would be more painful than a few longer ones from a customer's perspective. I imagine Netflix is mature enough to track this data as you suggest, and that's why they use AWS - downtime isn't a big deal for their business unless it gets really, really bad.
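
To put rough numbers on that, here is a minimal Python sketch. It converts an uptime percentage into hours of downtime per month and nets the royalties not paid during an outage against revenue lost to cancellations. The function names, the 30-day month, and the 12-month horizon for a lost subscriber are my own assumptions for illustration. Only the $8/month fee comes from the discussion above, and none of this reflects actual Netflix figures.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month


def downtime_hours(uptime_pct, period_hours=HOURS_PER_MONTH):
    """Hours of downtime implied by an uptime percentage over one period."""
    return (100 - uptime_pct) / 100 * period_hours


def outage_net(royalties_saved, subscribers_lost, monthly_fee=8.0, months_lost=12):
    """Net dollar effect of an outage; positive means it saved money.

    royalties_saved: royalties not paid because nothing streamed (hypothetical input)
    subscribers_lost: cancellations attributed to the outage (hypothetical input)
    months_lost: months of revenue assumed lost per cancellation (assumption)
    """
    churn_cost = subscribers_lost * monthly_fee * months_lost
    return royalties_saved - churn_cost
```

With these assumptions, `downtime_hours(98)` works out to roughly 14.4 hours a month, while `downtime_hours(99.9)` is about 43 minutes. `outage_net` stays positive only as long as cancellations are near zero, which is the whole argument: the outage "saves money" right up to the point where people start leaving.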