Reliability and Maintainability Analysis

Strategies for designing highly reliable systems differ depending on how the systems are used.

Mission critical systems are designed to be highly reliable. An example of such a system would be a missile. Design efforts are centered on using the most robust components. Fault tolerance and redundancy tend to be secondary strategies.

Highly Available systems are designed to have minimal or no down time. An air traffic control system, for example, is designed using redundancy and data replication so that failed system nodes are backed up by standby systems. The presence of backup systems also means that system modifications can be performed without down time.
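
The effect of standby nodes on down time can be illustrated with a short calculation. The following is a minimal sketch, not a complete availability model: it assumes the nodes fail independently, that any single working node can carry the full load, and that failover is instantaneous and perfect. The function names and the 0.99 figure are hypothetical and chosen only for illustration.

```python
HOURS_PER_YEAR = 8760  # non-leap year, used only to express down time in hours


def parallel_availability(node_availability, nodes):
    """Availability of a group of redundant nodes where one working node suffices.

    Assumes independent failures and perfect failover (illustrative assumptions).
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The group is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


def expected_downtime_hours(availability):
    """Expected down time per year implied by a steady-state availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for n in (1, 2, 3):
        a = parallel_availability(0.99, n)
        print(f"{n} node(s): availability={a:.6f}, "
              f"down time={expected_downtime_hours(a):.3f} h/year")
```

Under these assumptions each added standby node multiplies the unavailability by the single-node unavailability, which is why redundancy is the main strategy for this class of system. Real systems usually fall short of this figure because failures are seldom fully independent.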

Fault Tolerant systems are systems that continue to operate, albeit in a degraded manner, when failures occur. The Electric Power Grid is an example. The system is designed and operated taking into account possible failures (i.e., contingency analysis).
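
The idea behind contingency analysis can be sketched with a toy example. The code below checks only whether the remaining generating capacity still covers demand after any single unit is lost (often called an N-1 check). This is a deliberate simplification: practical contingency analysis for a power grid also involves network power-flow calculations, line limits and voltage constraints, none of which are modeled here. The unit names, capacities and demand values are hypothetical.

```python
def n_minus_1_violations(capacities, demand):
    """Return the units whose single outage leaves capacity below demand.

    capacities: mapping of unit name -> capacity (any consistent unit, e.g. MW)
    demand:     load that must still be served after the outage
    """
    total = sum(capacities.values())
    # Remove one unit at a time and test the remaining capacity against demand.
    return [name for name, cap in capacities.items() if total - cap < demand]


if __name__ == "__main__":
    units = {"G1": 400.0, "G2": 300.0, "G3": 300.0}  # hypothetical units
    print(n_minus_1_violations(units, demand=650.0))  # outage of G1 is not survivable
    print(n_minus_1_violations(units, demand=500.0))  # every single outage is survivable
```

An empty result means the system, in this simplified sense, can lose any one unit and keep serving the full load; a non-empty result lists the outages that would force degraded operation.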

In all cases, probabilistic calculations that take into account component failure rates, conditions that modify those rates (stresses), and time to repair failures are used to predict performance and to analyze different strategies.
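
A minimal sketch of such a calculation follows. It rests on several assumptions that the text above does not state: a constant failure rate (exponential time to failure), stresses modeled as simple multiplicative factors on a base failure rate, and a fixed mean time to repair. All function names and numeric values are hypothetical.

```python
import math


def effective_failure_rate(base_rate, stress_factors):
    """Scale a base failure rate (failures per hour) by multiplicative stress factors."""
    rate = base_rate
    for factor in stress_factors:
        rate *= factor
    return rate


def reliability(rate, hours):
    """Probability of surviving `hours` without failure, assuming a constant rate."""
    return math.exp(-rate * hours)


def steady_state_availability(rate, mttr_hours):
    """Long-run fraction of time in service: MTBF / (MTBF + MTTR)."""
    mtbf = 1.0 / rate  # mean time between failures under a constant rate
    return mtbf / (mtbf + mttr_hours)


if __name__ == "__main__":
    # Hypothetical component: base rate 1e-4 per hour, two stress factors.
    rate = effective_failure_rate(1e-4, [2.0, 1.5])
    print(f"effective rate: {rate:.2e} failures/hour")
    print(f"reliability over 1000 h: {reliability(rate, 1000):.4f}")
    print(f"availability with 8 h repair: {steady_state_availability(rate, 8.0):.6f}")
```

Comparing strategies then amounts to changing the inputs: a more robust component lowers the base rate, a harsher environment raises the stress factors, and faster repair lowers the mean time to repair.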