Alex Rubenstein wrote:
Yup. Related: "100% availability" is a marketing person's dream; it sounds good in theory but is unattainable in practice, and is a reliable sign of non-100%-reliability.
You are confusing two different things.
Availability != Reliability.
Pardon the interruption... The statement above separates the two terms sharply, without sufficient explanation. Note that in being available, a system meets one criterion for reliability. If one has the desire to delve into some of the nuanced operational perspective, see: http://ow.ly/zmQg (pdf) or http://ow.ly/zmTB (web friendly). The article is also available through the IEEE Portal at http://ow.ly/zn3a (in case either of the other links is unavailable at any time).
For instance, an airplane is designed to be 100% reliable, but much less available. To keep a 747 from crashing (100% reliability), it needs significant downtime (not 100% available).
This explanation, aside from being unsatisfactory, is misleading. Operating times and maintenance times are very much separate quantities.
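To make the distinction concrete, here is a minimal sketch of how the two quantities are commonly computed. It assumes the textbook definitions: availability as uptime divided by total time, and reliability as the probability of surviving a mission of a given length under a constant failure rate. The function names and every number below are invented for illustration and do not come from this thread or from any real aircraft or data center.

```python
import math


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of total time the system is able to operate."""
    return uptime_hours / (uptime_hours + downtime_hours)


def reliability(mission_hours: float, mtbf_hours: float) -> float:
    """Probability of finishing a mission without failure.

    Assumes a constant failure rate (exponential model), which is a
    simplification rather than a claim about any real system.
    """
    return math.exp(-mission_hours / mtbf_hours)


# Hypothetical aircraft: 16 h in service and 8 h in maintenance per day,
# with an assumed mean time between failures of 1,000,000 h.
aircraft_availability = availability(uptime_hours=16, downtime_hours=8)
aircraft_reliability = reliability(mission_hours=10, mtbf_hours=1_000_000)

print(f"availability: {aircraft_availability:.3f}")
print(f"reliability over a 10 h flight: {aircraft_reliability:.6f}")
```

Under these assumed figures the availability is modest while the per-flight reliability is very close to (but never exactly) 1. The two numbers are computed from different inputs, which is the sense in which operating time and maintenance time are separate quantities.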
And even for those who follow best practices... You can inspect and maintain things until you're blue in the face. One day a contractor will drop a wrench into a PDU or UPS or whatever and spectacular things will happen.
That's where policies, procedures and methods come in (read: SAS70).
For the operationally minded -- on one hand, there is an assumption here that 'accidents' are not preventable; on the other hand, there is at least an assumption being made here that SAS 70 is the cure for 'accidents.' To be brief, accounting for human behavior as an underlying contributor to accidents can be a backbreaking and immensely messy endeavor. In this respect, SAS 70 can only assist. All the best, Robert Mathews.