I'm looking for good reference materials for doing "reliability engineering" calculations and projections. The goal is to justify increased redundancy, and I want to include quantifiable numbers based on MTBF data and other reliability factors: a scientific justification instead of the typical emotional appeal using analyst/vendor FUD.

I'd appreciate references on how to do this in a network environment (what data to collect, how to collect it, how to analyze it, etc.). I'd also welcome any data (or rules of thumb) on typical MTBFs for network events that I won't find on vendor product slicks (for example, the MTBF of IOS, or of human-caused service outages of various types).

If someone has put together something remotely like this that they'd care to share, that would be incredibly helpful.

Thanks.
Pete