The reason I do these things is that I want to improve in some way. I don’t think that we can ever escape measuring, tracking, or following things. As go our personal lives, so goes our work. “You can’t manage what you don’t measure,” said Peter Drucker.

Whatever you call them, measurements or key performance indicators (KPIs), maintenance and engineering managers must have performance measurements in place, either to validate that the work their staffs perform is achieving the departments’ goals and objectives or to identify opportunities for continuous improvement.

Among the most commonly used measurements that engineers and managers can put into practice to determine performance are:

- Mean Time to Repair (MTTR)
- Mean Time Between Failures (MTBF)
- Availability

These measurements enable managers to track equipment, personnel and reliability performance. At the end of the day, each of these measurements has a financial impact on the organization.

**Measurements Matter**

For managers, measuring and monitoring their departments’ activities is essential in determining the way that these activities affect the facility’s overall condition and performance. Below are examples of tracking and measuring that can produce tangible results for both departments and facilities.

**MTTR**

Sometimes referred to as maintainability, MTTR is the measure of the department’s ability to perform maintenance to retain or restore assets to a specified condition. It measures the average time required to restore an asset to its full operational condition after a failure. It is typically expressed in hours, and the equation is straightforward: the total repair time divided by the number of repair or replacement events.

$$\text{MTTR} = \frac{\text{total repair time}}{\text{number of repairs}}$$

For example, a facility is responsible for maintaining a standard Chiller Unit that has operated for 3,600 hours over the past two years. The Chiller Unit has failed 12 times over this period, resulting in 720 minutes of repair time. Dividing the total repair time (720 minutes) by the number of repairs (12) produces an average repair time of 60 minutes. So, the MTTR is one hour.
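The same arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mean_time_to_repair` and its parameters are hypothetical, and the input is assumed to be recorded in minutes, as in the example.

```python
def mean_time_to_repair(total_repair_minutes: float, repair_count: int) -> float:
    """Return MTTR in hours: total repair time divided by the number of repairs."""
    if repair_count <= 0:
        raise ValueError("repair_count must be positive")
    # Convert minutes to hours so the result shares a unit with operating hours.
    total_repair_hours = total_repair_minutes / 60
    return total_repair_hours / repair_count


# Chiller example: 720 minutes of repair time across 12 repairs.
chiller_mttr_hours = mean_time_to_repair(720, 12)
print(chiller_mttr_hours)  # 1.0
```

Returning the value in hours keeps it in the same unit as the operating hours, which matters when MTTR is combined with other hour-based figures.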

**MTBF**

MTBF is a basic measure of an asset’s reliability. It is calculated by dividing the total operating time of the asset by the number of failures over a given period of time.

$$\text{MTBF} = \frac{\text{total operating time}}{\text{number of failures}}$$

Taking the example of the Chiller Unit above, the calculation to determine MTBF is: 3,600 hours divided by 12 failures. The result is 300 operating hours.
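A minimal Python sketch of this calculation follows. The function name `mean_time_between_failures` and its parameters are assumptions chosen for this example, not part of any standard library.

```python
def mean_time_between_failures(operating_hours: float, failure_count: int) -> float:
    """Return MTBF in hours: total operating time divided by the number of failures."""
    if failure_count <= 0:
        raise ValueError("failure_count must be positive")
    return operating_hours / failure_count


# Chiller example: 3,600 operating hours and 12 failures.
chiller_mtbf_hours = mean_time_between_failures(3600, 12)
print(chiller_mtbf_hours)  # 300.0
```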

**Availability**

This measurement expresses the probability that an asset can perform its intended function satisfactorily when needed in a stated environment. The availability of an asset will diminish over time as the equipment is being used. The availability will not improve unless changes are made to upgrade the asset.

Technicians can extend the equipment’s availability by increasing its reliability. There is a generally accepted availability standard of 95 percent for equipment, but mission-critical equipment in facilities requires a much higher level of availability.

To calculate availability, use the formula of MTBF divided by (MTBF + MTTR).

$$\text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$$

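Continuing with the above example of the Chiller Unit, both values must be in the same unit before they are combined. With an MTBF of 300 hours and an MTTR of 1 hour (60 minutes), its availability is 300 divided by 301. The result is about 99.7 percent availability.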
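The formula MTBF divided by (MTBF + MTTR) only makes sense when both inputs share a unit, so the sketch below takes both in hours. The function name `availability` and its parameter names are assumptions made for illustration; the chiller's 60-minute MTTR is entered as 1 hour.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction: MTBF / (MTBF + MTTR), both in hours."""
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must be non-negative")
    if mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR cannot both be zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Chiller example: MTBF of 300 hours, MTTR of 1 hour (60 minutes).
chiller_availability = availability(mtbf_hours=300, mttr_hours=1)
print(f"{chiller_availability:.1%}")  # 99.7%
```

Naming the parameters with their unit is a small guard against mixing minutes and hours in the same calculation.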

**Probability of Failure**

This calculation gets a little more complicated mathematically. At times, managers need to calculate the probability that a piece of equipment will fail. Continuing with the example of the Chiller, suppose a manager needs to ensure its availability for the next 72 hours. What is the probability of failure?

The reliability function for the exponential distribution is:

$$R(t) = e^{-\lambda t}$$

Given a failure rate, λ (lambda), we can calculate the probability of success, meaning no failure, over a period of time, t.

In probability theory and statistics, the exponential distribution, which is also known as the negative exponential distribution, is the probability distribution that describes the time between events that occur independently at a constant average rate.

The failure rate λ is 1 divided by MTBF. In the Chiller example, the MTBF is 300 hours, so λ is 1 divided by 300, or about 0.00333 failures per hour.

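So the calculation is:

$$R(72) = e^{-(0.00333)(72)} \approx 0.7868$$

The result is a 78.68 percent probability that the Chiller runs for the next 72 hours without failing. The probability of failure is the complement: 1 − 0.7868 = 0.2132, or about 21.32 percent.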
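As a rough check, the exponential reliability function can be evaluated with Python's standard `math.exp`. R(t) is the probability that the asset survives to time t, so the probability of failure over the same period is 1 − R(t). The function name `reliability` and its parameters are assumptions made for this sketch, and it assumes a constant failure rate, which is what the exponential model implies.

```python
import math


def reliability(mtbf_hours: float, mission_hours: float) -> float:
    """Return R(t) = exp(-lambda * t), the probability of no failure during t."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if mission_hours < 0:
        raise ValueError("mission_hours must be non-negative")
    failure_rate = 1 / mtbf_hours  # lambda, in failures per hour
    return math.exp(-failure_rate * mission_hours)


# Chiller example: MTBF of 300 hours, 72-hour window.
survival = reliability(mtbf_hours=300, mission_hours=72)
failure = 1 - survival
print(f"{survival:.4f}")  # 0.7866
print(f"{failure:.4f}")   # 0.2134
```

Because the sketch uses the unrounded rate 1/300 rather than 0.00333, its output differs from the hand calculation in the fourth decimal place.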