Common mode failure in engineering Common cause and special cause (statistics)







Common mode failure has a more specific meaning in engineering. It refers to events that are not statistically independent. That is, failures in multiple parts of a system are caused by a single fault, particularly random failures due to environmental conditions or aging. An example is when all of the pumps for a fire sprinkler system are located in one room. If the room becomes too hot for the pumps to operate, they will all fail at essentially the same time, from one cause (the heat in the room). Another example is an electronic system in which a fault in a power supply injects noise onto a supply line, causing failures in multiple subsystems.


This is particularly important in safety-critical systems that use multiple redundant channels. If the probability of failure in one subsystem is p, an N-channel system would be expected to have a probability of failure of p^N. In practice, however, the probability of failure is much higher because the channels are not statistically independent; for example, ionizing radiation or electromagnetic interference (EMI) may affect all the channels.
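The gap between the expected and the actual failure probability can be sketched with a very simple model. The model below is an illustrative assumption, not something stated in the text above: a single common-cause event with a hypothetical probability c disables every channel at once, and otherwise each channel fails independently with probability p.

```python
def independent_failure(p: float, n: int) -> float:
    """Failure probability of an n-channel system with independent channels."""
    return p ** n


def failure_with_common_cause(p: float, n: int, c: float) -> float:
    """Failure probability when a common-cause event (probability c,
    a hypothetical parameter) takes out all channels at once.

    The system fails if the common cause occurs, or if it does not
    occur and all n channels fail independently.
    """
    return c + (1.0 - c) * p ** n


if __name__ == "__main__":
    p, n, c = 1e-3, 3, 1e-4  # illustrative values only
    print("independent channels:", independent_failure(p, n))
    print("with common cause:   ", failure_with_common_cause(p, n, c))
```

Under these assumed numbers the common-cause term dominates: adding more channels shrinks p^N further but leaves c untouched.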


The principle of redundancy states that, when the failure events of components are statistically independent, the probabilities of their joint occurrence multiply. Thus, for instance, if the probability of failure of a component of a system is one in one thousand per year, the probability of the joint failure of two of them is one in one million per year, provided that the two events are statistically independent. This principle favors the strategy of redundancy of components. One place this strategy is implemented is RAID 1, where two hard disks store a computer's data redundantly.
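The arithmetic in this example can be checked exactly with rational numbers. This is only a worked instance of the multiplication rule for independent events; the function name is chosen here for illustration.

```python
from fractions import Fraction


def joint_independent(*probabilities: Fraction) -> Fraction:
    """Multiply the failure probabilities of statistically independent components."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result


if __name__ == "__main__":
    one_disk = Fraction(1, 1000)  # one in one thousand per year
    print(joint_independent(one_disk, one_disk))  # prints 1/1000000
```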


Even so, there can be many common modes. Consider a RAID 1 array in which two disks are purchased online and installed in a computer:



The disks are likely to be from the same manufacturer and of the same model, and therefore share the same design flaws.
The disks are likely to have similar serial numbers, and so may share any manufacturing flaws affecting production of the same batch.
The disks are likely to have been shipped at the same time, and so may have suffered the same transportation damage.
As installed, both disks are attached to the same power supply, making them vulnerable to the same power supply issues.
As installed, both disks are in the same case, making them vulnerable to the same overheating events.
They are both attached to the same card or motherboard and driven by the same software, which may have the same bugs.
Because of the nature of RAID 1, both disks are subjected to the same workload and closely similar access patterns, stressing them in the same way.

Also, if the failure events of two components are maximally statistically dependent, the probability of the joint failure of both is identical to the probability of failure of either one individually. In such a case, the advantages of redundancy are negated. Strategies for avoiding common mode failures include keeping redundant components physically isolated.
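The two extremes, full independence and maximal dependence, can be connected by a small sketch. It assumes two components with the same failure probability p and describes their dependence by a correlation coefficient rho between the two failure indicators; rho is an illustrative parameter introduced here, not one used in the text. For two such events the joint probability is p^2 + rho * p * (1 - p).

```python
def joint_failure_probability(p: float, rho: float) -> float:
    """Probability that two components, each failing with probability p,
    both fail, given correlation rho between their failure indicators.

    rho = 0 gives the independent case (p squared);
    rho = 1 gives the maximally dependent case (p).
    Only non-negative correlation is handled in this sketch.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1] in this sketch")
    return p * p + rho * p * (1.0 - p)


if __name__ == "__main__":
    p = 1e-3  # one in one thousand per year
    for rho in (0.0, 0.1, 0.5, 1.0):
        print(f"rho={rho}: joint failure probability = {joint_failure_probability(p, rho):.3e}")
```

Even a modest assumed correlation raises the joint failure probability far above the one-in-a-million figure of the independent case.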


A prime example of redundancy with isolation is a nuclear power plant. The new ABWR has three divisions of emergency core cooling systems, each with its own generators and pumps, and each isolated from the others. The new European Pressurized Reactor has two containment buildings, one inside the other. However, even here a common mode failure can occur: at the Fukushima Daiichi nuclear power plant, for example, mains power was severed by the Tōhoku earthquake, and then the thirteen backup diesel generators were all simultaneously disabled by the subsequent tsunami, which flooded the basements of the turbine halls.







