Reliability and availability (Part 2: MTBF)

Let’s start with the mystery around MTBF (Mean Time Between Failures). You may disagree with me, but please give me a chance and read to the end before you make up your mind.

Many people believe that the higher the MTBF figure, the more reliable the system they get. Sounds logical, right? However, it is not as simple as it sounds, and you should be very careful and take such figures with a grain of salt.

First of all, there is a problem with the definition. Many specifications ask for an MTBF value to be stated in the quotation, but few specify how the figure is to be obtained, nor do they refer to a standard that would provide a guideline for the calculation. What does this mean in practice? You get figures from manufacturers that are completely incomparable! One vendor calculates the figure from the reliability of each individual component inside the product or system, while another uses so-called ‘field experience’.

At first, field experience sounds very attractive, but on closer inspection several questions arise. Are all failures reported back to the manufacturer? During the warranty period this might be true, but once the warranty expires? I seriously doubt it. The customer might hold spare parts and repair the defect themselves. Also, if the spare parts are not unique to the product, the customer can buy them elsewhere without ever notifying the equipment manufacturer. Every unreported failure makes the figure look better, so field experience values tend to be too optimistic!
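To make the difference between the two approaches concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not any vendor's actual method: the component-based figure assumes a simple series system with constant failure rates (so the rates add up), the field figure is simply operating hours divided by reported failures, and all function names and numbers are hypothetical.

```python
def predicted_mtbf_hours(failure_rates_per_hour):
    """Component-based estimate, assuming a series system with constant
    failure rates: MTBF = 1 / (sum of component failure rates)."""
    total_rate = sum(failure_rates_per_hour)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    return 1.0 / total_rate


def field_mtbf_hours(total_operating_hours, failure_count):
    """Field-experience estimate: accumulated operating hours divided by
    the number of failures that were counted."""
    if failure_count <= 0:
        raise ValueError("at least one failure is needed for an estimate")
    return total_operating_hours / failure_count


# Hypothetical component failure rates (failures per hour), illustration only.
component_rates = [2e-6, 5e-6, 3e-6]
predicted = predicted_mtbf_hours(component_rates)

# Hypothetical fleet: 1,000,000 operating hours, 10 real failures,
# of which only 4 were ever reported back to the manufacturer.
fleet_hours = 1_000_000
field_true = field_mtbf_hours(fleet_hours, 10)
field_reported = field_mtbf_hours(fleet_hours, 4)

print(f"Component-based prediction: {predicted:,.0f} h")
print(f"Field MTBF, all failures:   {field_true:,.0f} h")
print(f"Field MTBF, reported only:  {field_reported:,.0f} h")
```

With these made-up numbers the same equipment looks 2.5 times better on paper simply because six failures never reached the manufacturer.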

Further, you should get a definition from the vendor of which failures count towards the MTBF figure and which do NOT. Did you know that design-related failures are usually excluded? This is an essential fact you should verify. A badly designed product with many field issues might look great in terms of MTBF, while a highly reliable and robust product would look worse because of a few minor issues. Is that fair?

Customers with a strong focus on reliability (such as the military) have their own standards, which are extremely strict and define exactly how to calculate the figures. They would immediately question unrealistic figures. However, unless explicitly requested, hardly anyone follows these standards, as they inherently give low (conservative) MTBF values and make your product look somewhat less cool.

You should be extremely careful when comparing reliability figures! As the German saying goes, ‘paper is patient’. The vendor can polish the figures to look very attractive on paper. Are these values typical or project specific? Are they binding or non-binding? What does a non-binding value tell you? Is it just an indication? Can you do anything with it?

If you like the MTBF figure and it is not just one more field in your datasheet to be filled in, consider specifying a proper definition and a guideline for how the figure is to be obtained. And if reliability and availability really matter to you, we recommend requesting contractually guaranteed availability of your system or its main components (at least during the warranty period, optionally also throughout the lifecycle), including applicable penalties or liquidated damages. This should force the supplier to use figures they are confident about. Otherwise the whole exercise does not add much value.
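A guaranteed availability is also much easier to verify than an MTBF figure, because it translates directly into allowed downtime. The sketch below shows the arithmetic, assuming the common steady-state relation availability = MTBF / (MTBF + MTTR). MTTR (mean time to repair), the function names and all numbers are assumptions for illustration; a real contract would define its own measurement period and its own rules for what counts as downtime.

```python
HOURS_PER_YEAR = 8760  # non-leap year, used as the default measurement period


def steady_state_availability(mtbf_hours, mttr_hours):
    """Availability = MTBF / (MTBF + MTTR), assuming steady-state operation."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def allowed_downtime_hours(guaranteed_availability, period_hours=HOURS_PER_YEAR):
    """Downtime budget implied by a guaranteed availability over a period."""
    if not 0.0 <= guaranteed_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - guaranteed_availability) * period_hours


def meets_guarantee(measured_downtime_hours, guaranteed_availability,
                    period_hours=HOURS_PER_YEAR):
    """True if the measured downtime stays within the guaranteed budget."""
    budget = allowed_downtime_hours(guaranteed_availability, period_hours)
    return measured_downtime_hours <= budget


# Hypothetical figures: claimed MTBF of 100,000 h and an 8 h repair time.
availability = steady_state_availability(100_000, 8)
print(f"Implied availability: {availability:.5%}")

# Hypothetical contract: 99.9 % guaranteed availability per year.
print(f"Downtime budget: {allowed_downtime_hours(0.999):.2f} h per year")
print("5 h of downtime within guarantee: ", meets_guarantee(5, 0.999))
print("10 h of downtime within guarantee:", meets_guarantee(10, 0.999))
```

The point of the exercise is that measured downtime can be checked against the budget at the end of each period, which a non-binding MTBF value never allows.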

The worst thing you can do is to disqualify a high-quality manufacturer who is being honest and conservative by comparing their product with that of another vendor who provides science fiction figures without a solid basis. Unfortunately, this practice still happens, and changing it is hard work. You can contribute to that change!