Using Reliability Centred Maintenance (RCM) to determine which automated monitoring systems to install to new and existing equipment

Chris James, Reliability Management Ltd. Cairo, November 2018

The rate of technological advance is accelerating
(Source: Michael Lee, SA Museum; World Economic Forum)

And the number of monitoring options is also growing rapidly
- Moubray's original RCM book, published in 1991: 50 separate condition-monitoring techniques listed (Appendix 4)
- Moubray's RCM book (V2), published in 1997: 102 separate condition-monitoring techniques listed (Appendix 4)
- Context: new rail projects. Over 200 separate technologies and/or products identified for tracks, trains and supporting infrastructure. The list was by no means exhaustive.

A few selected examples of Automated Monitoring Solutions (from a list of over 200)

Some Case Studies covering new equipment
- Context: New Rail Project (Track and Infrastructure). The Challenge: Which automated condition monitoring technologies should be employed?
- Context: New Oil and Gas Platform. The Challenge: Which automated condition monitoring technologies could be employed? Can the platform be unmanned? How to determine which automated condition monitoring technologies to deploy?
- Context: New High Speed Rail Project (Rolling Stock). The Challenge: Which automated condition monitoring technologies should be employed?

Why do we have Automated Condition Monitoring?
- Because we want our assets to be safe, reliable and cost effective
- We need to manage failures that matter
- Automated monitoring needs to be better than the alternatives

A framework is needed to determine systematically and objectively whether any monitoring solution is technically feasible and worth doing
1. What do we want the asset to do in its operating context?
2. How can it fail to do what we want it to do?
3. What causes of failures do we need to manage?
4. What happens when each failure occurs?
5. How much does each failure matter?
6. Can we predict or prevent the failure and should we predict or prevent the failure?
7. How can we manage the failure if prediction or prevention is not possible?
Notes on the questions:
- The operating context will influence whether monitoring is technically feasible and worth doing
- In order to list the causes of failure we need to define what we mean by failed
- Managing reliability is about managing the causes of failure
- In order to understand the consequences we need to know what happens when the failure occurs
- We only need to manage failures because they matter (i.e. they have consequences)
- Something needs to change as a result of the automated condition monitoring
- An automated monitoring solution can also fail, so may need to be maintained
- An automated monitoring solution must be technically feasible and it must also be worth doing (i.e. cost effective), compared to the alternatives

A framework built on the seven questions above is needed to determine systematically and objectively whether any monitoring solution is technically feasible and worth doing. Such a framework already exists.

RCM: What is it?
RCM is a methodology which determines what must be done to ensure that any asset we own or operate continues to do what the users want it to do, in its present operating context.
The RCM process consists of 7 master questions:
1. Functions: What do we want the asset to do?
2. Functional Failure: How can it fail?
3. Failure Modes: What causes the functional failures?
4. Failure Effects: What happens when a failure occurs?
5. Consequences: How much does each failure matter?
6. Proactive Tasks: Can we predict or prevent failure and should we be doing so?
7. Default Tasks: How should we manage the failure if prediction or prevention is not an option?

Contents moving forward
- Overview of the RCM process
- Discussion of predictive maintenance
- Worked example to illustrate how RCM helps determine which automated monitoring solution to use
- Case studies showing real-life examples of the application of RCM to make such decisions

Example RCM analysis: the first 4 columns are an FMEA
Function 1: To receive distillate and its impurities into either of two segregated compartments at a rate of up to 3 tonnes per hour, up to 23 tonnes per compartment
Functional Failure A: Unable to receive at all
  Failure Mode 1A1: Inlet blocked by foreign object
    Failure Effect: The filling side level remains static. Eventually the operator will become aware that the drum has not filled. Upstream flow indication is zero. Upstream process impacts can be minimised by re-routing. Downstream production or supply to boilers lost. Time to clear the blocked inlet up to 4 days.
    Task: No scheduled maintenance
Functional Failure B: Unable to hold 23 tonnes per compartment
  Failure Mode 1B1: Drum boot holed due to internal corrosion
    Failure Effect: Leakage will occur and the low level alarm will operate sooner than expected; effluent will go off specification and there will be a noticeable odour. Possible fire hazard: if a fire results, then secondary damage may be caused to surrounding assets. Upstream process impacts can be minimised by re-routing. Downstream production or supply to boilers lost. Time to repair up to 2 weeks.
    Task: Visually check drum boot for corrosion. Interval: 1 year. By: NDT
  Failure Mode 1B2: Drum boot fitted incorrectly
    Failure Effect: Effects follow 1B1
    Task: Reinforce procedure for correct fitting of drum boot
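The reference codes used in the FMEA (for example "1B1") combine the function number, the functional failure letter and the failure mode number. The following is a minimal sketch of how such rows could be held in code; the class name `FailureMode` and its field names are assumptions made for illustration and are not part of the RCM method itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One FMEA row (hypothetical structure, for illustration only)."""
    function_no: int         # e.g. 1
    functional_failure: str  # e.g. "B"
    mode_no: int             # e.g. 2
    description: str
    task: str

    @property
    def ref(self) -> str:
        """Reference code such as '1B2': function, functional failure, mode."""
        return f"{self.function_no}{self.functional_failure}{self.mode_no}"


# Rows taken from the example analysis on this slide.
fmea = [
    FailureMode(1, "A", 1, "Inlet blocked by foreign object",
                "No scheduled maintenance"),
    FailureMode(1, "B", 1, "Drum boot holed due to internal corrosion",
                "Visually check drum boot for corrosion"),
    FailureMode(1, "B", 2, "Drum boot fitted incorrectly",
                "Reinforce procedure for correct fitting of drum boot"),
]

# Look up the selected task by reference code.
tasks_by_ref = {fm.ref: fm.task for fm in fmea}
print(tasks_by_ref["1B1"])
```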

Outline RCM Decision Diagram
Is the loss of function evident?
Evident, Safety/Environment: ES1 On-Condition; ES2 Scheduled Overhaul; ES3 Scheduled Replacement; ES4 Redesign; ES5 Risk Review
Evident, Financial: EF1 On-Condition; EF2 Scheduled Overhaul; EF3 Scheduled Replacement; EF4 Reactive Maintenance; EF5 Redesign
Hidden, Safety/Environment: HS1 On-Condition; HS2 Scheduled Overhaul; HS3 Scheduled Replacement; HS4 Failure Finding; HS5 Redesign; HS6 Risk Review
Hidden, Financial: HF1 On-Condition; HF2 Scheduled Overhaul; HF3 Scheduled Replacement; HF4 Failure Finding; HF5 Reactive Maintenance; HF6 Redesign

Outline RCM Decision Diagram (same diagram as the previous slide)
For each maintenance task there are two sets of questions that need to be satisfied for the task to be selected. These relate to whether the task is:
1. Technically feasible
2. Worth doing

Monitoring systems carry out condition-based maintenance
(Decision diagram, first row: On-Condition tasks ES1, EF1, HS1 and HF1.)
- We often get warning signs that assets are deteriorating toward functional failure
- An on-condition task may be carried out to predict that a failure is occurring or that it is about to occur
- Action can then be taken to avoid or minimize the consequences of the failure
- The usefulness of these warning signs depends to a large extent on how close they are to failure

Assessing Technical Feasibility
To assess whether an On-condition task is technically feasible we ask:
- Is there a clear potential failure condition?
- What is the lead-time to failure?
- Is the lead-time usable (i.e. can we inspect within the lead-time)?
(Decision diagram box ES1, On-Condition?: Is there a clear potential failure? What is it? What is the lead-time to failure? Is this usable?)

Condition-based Maintenance
(Graph: performance or condition (resistance to stress) against time, showing P, the detectable warning sign, and F, the functional failure.)
- A Potential Failure (P) is a warning that a Functional Failure (F) is occurring or is about to occur
- Condition-based maintenance involves checking for potential failures at intervals less than the P-F interval
- If a potential failure is detected, action can be taken to avoid or eliminate the consequences of failure
Condition-based maintenance includes:
- Human senses
- Performance monitoring: measuring performance to determine deterioration
- Product quality monitoring: measuring the quality of the product in order to determine the condition of the equipment making the product
- Condition monitoring: the use of specialised equipment to measure the condition of other equipment (dynamic, particle, physical, chemical, thermal, electrical)
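The rule that checks must be made at intervals shorter than the P-F interval can be sketched as a small function. This is only an illustration: the function name, the worst-case reasoning (the warning appears just after a check, so it is found one full check interval later) and the example values are assumptions, not figures from the presentation.

```python
def on_condition_task_is_feasible(pf_interval_days, check_interval_days,
                                  response_time_days):
    """Return True if checks at this interval leave enough time to act.

    Worst case assumed here: the warning sign appears just after a check,
    so it is detected one full check interval later. The time left to act
    is then pf_interval_days - check_interval_days.
    """
    if check_interval_days >= pf_interval_days:
        # The asset could go from first warning to failure between checks.
        return False
    remaining_days = pf_interval_days - check_interval_days
    return remaining_days >= response_time_days


# Illustrative values only: a 14-day warning period and 2 days needed to act.
weekly_ok = on_condition_task_is_feasible(14, 7, 2)
monthly_ok = on_condition_task_is_feasible(14, 30, 2)
print(weekly_ok, monthly_ok)
```

A continuous automated monitor is the limiting case of a very short check interval, which is why it can make use of warning signs that appear too late for manual inspection.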

Benefits of Condition-based Maintenance compared to traditional preventive maintenance
- Less invasive
- Can be applied to more failure modes, regardless of failure pattern
- Often can be done while the equipment is running (reducing planned unavailability)
- Often cheaper
- Often quicker
- Maximises asset life

Assessing Worth Doing
To assess whether an On-condition task is worth doing we ask:
- Will it reduce the likelihood of failure and the impact on costs/safety/environment to a tolerable level?
- Is it cost effective?

The RCM process includes two types of question to enable us to select automated monitoring that benefits the business
CAN WE USE IT? (TECHNICAL)
- What failure modes are monitored?
- Is there a clear warning?
- Does this warning give sufficient time to avoid the consequences of the failure?
- How reliable is the automated monitoring system?
- What are the maintenance requirements of the automated monitoring system?
SHOULD WE USE IT? (BUSINESS)
- Is it cost effective?
- How likely is the failure to occur?
- What consequences are avoided?
- What alternative failure management techniques are available (including traditional maintenance)?
- What is the ongoing cost for the monitoring (including maintenance)?
- What is the cost of installation?
- What is the payback?
The monitoring solution must be technically feasible and it must be worth doing
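The two-gate idea (technical first, then business) can be expressed as a simple screen over candidate monitoring solutions. This is a rough sketch only: the class `MonitoringCandidate`, its fields and the sample candidates are hypothetical, and a real RCM review would weigh every question listed on the slide rather than the three used here.

```python
from dataclasses import dataclass


@dataclass
class MonitoringCandidate:
    """A candidate monitoring solution (hypothetical fields, illustration only)."""
    name: str
    clear_warning: bool        # is there a clear warning of the failure?
    warning_time_usable: bool  # does it leave time to avoid the consequences?
    payback_years: float       # installation cost / annual saving


def can_use(candidate):
    """Technical gate: CAN we use it?"""
    return candidate.clear_warning and candidate.warning_time_usable


def should_use(candidate, max_payback_years):
    """Business gate: SHOULD we use it? Reduced here to a payback test."""
    return candidate.payback_years <= max_payback_years


def select(candidates, max_payback_years):
    """Keep only candidates that pass both gates."""
    return [c.name for c in candidates
            if can_use(c) and should_use(c, max_payback_years)]


# Made-up candidates for illustration.
candidates = [
    MonitoringCandidate("sensor A", True, True, 2.5),
    MonitoringCandidate("sensor B", True, False, 1.0),  # warning comes too late
    MonitoringCandidate("sensor C", True, True, 8.0),   # payback too long
]
selected = select(candidates, max_payback_years=3)
print(selected)
```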

A simple pumping system
(Diagram: pump P1A, Tank A with vent and bund, and the tank level switches.)
- Nominal capacity of Pump A: 120 litres/minute
- Output: 100 litres/minute
- Value of the downstream process: $15000/hour
Tank level switches:
- 30,000 L: if the level reaches the UHLS (normally open), it shuts off the pump until manually reset
- 28,000 L: HLS shuts off the pump
- 14,000 L: Low Level Switch switches on the pump
- 13,000 L: if the level drops to this point, a Low Low Level Switch sounds an alarm in the control room
- 1,000 L: if the level drops to this point, an Ultimate Low Low Level Switch shuts off the downstream process
An automated monitoring system is available that monitors vibration continuously. This costs $1000 to fit. Should we fit it?

Technical feasibility of bearing maintenance
- The bearing fails at random, so Preventive Maintenance is not technically feasible. (Graph: conditional probability of failure against age.)
- However, before the bearing fails there are warnings (or potential failures), which mean that a predictive, or condition-based, task can be carried out.
Predictive / Condition-based / On-condition Maintenance (graph: performance or condition against time, ending at F, the functional failure)
- Warning signs shown on the curve: vibration, noise, heat, smoke
- Lead times marked on the curve: 6 weeks, 2 weeks and 2 days before functional failure
- The monitoring task needs to be carried out at intervals less than the warning period (or P-F interval)
- We need to have enough time to change the consequences

The RCM FMEA
Function 1: To pump solvent from Tank A to the process at a rate of at least 100 litres per minute
Functional Failure A: Fails to pump any solvent at all
Failure Mode 1: Bearing seized
Failure Effect: Motor trips and pumping stops. Solvent level in the tank falls and the Low Level Alarm sounds in the control room at a level of 13000 L. The level continues to fall and the downstream process shuts down when the level reaches 1000 L (2 hours after the alarm sounds). (Unplanned) time to replace the bearing and restart production is 5 hours. MTBF of this failure mode is 6 years.
3 Options
1. No Scheduled Maintenance
2. Operator checks bearing for audible noise every week. This requires a visit to the remote pump house that takes 15 minutes
3. Fit automatic vibration sensor that triggers an alarm if the bearing suffers from high vibration
If a noisy bearing is detected, the tank would be filled, giving 270 minutes (4.5 hours) of buffer. The planned maintenance task would take 4 hours.
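The buffer times quoted in the FMEA follow from the tank levels and the 100 litres/minute output. The sketch below reproduces that arithmetic. It assumes that "filled" means filled to the 28,000 L HLS level (which matches the 270 minutes quoted) and that the tank drains at the full output rate while the pump is out of service; the function and constant names are illustrative.

```python
OUTFLOW_L_PER_MIN = 100   # output to the downstream process
SHUTDOWN_LEVEL_L = 1_000  # Ultimate Low Low Level Switch
ALARM_LEVEL_L = 13_000    # Low Low Level alarm
FULL_LEVEL_L = 28_000     # HLS level (assumed meaning of "filled")


def buffer_hours(start_level_l):
    """Hours of supply from start_level_l down to the shutdown level."""
    minutes = (start_level_l - SHUTDOWN_LEVEL_L) / OUTFLOW_L_PER_MIN
    return minutes / 60


def lost_production_hours(repair_hours, available_buffer_hours):
    """Production is lost only once the buffer has run out."""
    return max(0.0, repair_hours - available_buffer_hours)


# Unplanned failure: the clock starts when the alarm sounds at 13,000 L.
unplanned_buffer = buffer_hours(ALARM_LEVEL_L)               # 2.0 hours
unplanned_loss = lost_production_hours(5, unplanned_buffer)  # 3.0 hours

# Planned replacement: the tank is filled first, then the 4-hour task is done.
planned_buffer = buffer_hours(FULL_LEVEL_L)                  # 4.5 hours
planned_loss = lost_production_hours(4, planned_buffer)      # 0.0 hours

print(unplanned_loss, planned_loss)
```

This is why an unplanned bearing failure costs 3 hours of production, while a planned replacement triggered by a warning costs none.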

Evaluation of Two Traditional Options
Option 1: No Scheduled Maintenance
- Cost per year: $7500 + bearing + labour to replace the bearing
- Cost calculation: ($15000/hour x 3 hours lost production) / 6 years = $7500 per year
Option 2: Operator checks bearing for audible noise every week
- Cost per year: $500 + bearing + labour to replace the bearing
- Cost calculation: check takes 20 minutes (1/3 hour); cost of operator's time is $30 per hour; 50 weeks of operation per year; cost of operator checks = 50 x 1/3 x $30 = $500 per year; lost production is zero
- Conclusion: task is worth doing compared to No Scheduled Maintenance
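The two annual costs can be checked with a few lines of arithmetic. The sketch below uses the figures from this slide (including the 20-minute check time) and leaves out the bearing and labour costs, since the slide adds them to both options alike; the function names are illustrative.

```python
def annual_cost_no_scheduled_maintenance(value_per_hour, lost_hours, mtbf_years):
    """Lost production per failure, spread over the mean time between failures."""
    return value_per_hour * lost_hours / mtbf_years


def annual_cost_operator_checks(checks_per_year, minutes_per_check, rate_per_hour):
    """Operator time spent on the weekly check over a year."""
    return checks_per_year * minutes_per_check * rate_per_hour / 60


option_1 = annual_cost_no_scheduled_maintenance(15_000, 3, 6)  # 7500.0
option_2 = annual_cost_operator_checks(50, 20, 30)             # 500.0

print(option_1, option_2, option_2 < option_1)
```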

An automated monitoring option
- It is possible to fit an automatic vibration sensor that triggers an alarm if the bearing suffers from high vibration
- It monitors continuously
- The system costs $1000 to fit
- Running costs (maintenance and power) are estimated at $100 per year
- Company payback criterion is 3 years

Evaluation of Automated Monitoring Options
Option 1: No Scheduled Maintenance
- Cost per year: $7500 + bearing + labour to replace the bearing
- Cost calculation: ($15000/hour x 3 hours lost production) / 6 years = $7500 per year
Option 2: Operator checks bearing for audible noise every week
- Cost per year: $500 + bearing + labour to replace the bearing
- Cost calculation: check takes 20 minutes (1/3 hour); cost of operator's time is $30 per hour; 50 weeks of operation per year; cost of operator checks = 50 x 1/3 x $30 = $500 per year; lost production is zero
- Conclusion: task is worth doing compared to No Scheduled Maintenance
Option 3: Fit automatic vibration sensor that triggers an alarm if the bearing suffers from high vibration
- Cost per year: $100 + bearing + labour to replace the bearing
- Cost calculation: cost of operation is $100 per year; lost production is zero; saving per year compared to Option 2 is $500 - $100 = $400; payback period = $1000 / $400 per year = 2.5 years
- Conclusion: investment does meet the payback criterion (in this context)
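The payback test for the vibration sensor can be written out as below. This sketch uses the figures from the worked example and assumes that "meets the criterion" means a payback period no longer than the 3-year limit; the function name is illustrative.

```python
def payback_years(install_cost, annual_saving):
    """Years needed for the annual saving to repay the installation cost."""
    if annual_saving <= 0:
        return float("inf")  # the investment never pays back
    return install_cost / annual_saving


INSTALL_COST = 1_000        # cost to fit the sensor
OPERATOR_CHECK_COST = 500   # per year, the alternative being replaced
SENSOR_RUNNING_COST = 100   # per year, maintenance and power
PAYBACK_LIMIT_YEARS = 3     # company criterion

annual_saving = OPERATOR_CHECK_COST - SENSOR_RUNNING_COST  # 400
years = payback_years(INSTALL_COST, annual_saving)         # 2.5
meets_criterion = years <= PAYBACK_LIMIT_YEARS

print(years, meets_criterion)
```

The saving is measured against the weekly operator check, not against No Scheduled Maintenance, because the check is the best available alternative in this context.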

What about other options/contexts?
- What if the existing design includes a standby pump?
- Total Care package from a pump supplier covering all pumps on site?
- Ultrasound analysis giving an early warning of bearing failure?
- Use of a laser thermometer to measure bearing temperature?
- Fitting a standby pump if there is no standby?
- Different payback criterion

I4 and Condition-based Maintenance
Maintenance resulting from Industry 4 developments will be overwhelmingly condition-based. I4 will provide:
- New warnings (including combinations of symptoms)
- Earlier warnings
- More accurate warnings
- Automated warnings to replace manual warnings
- Real-time information
I4 technologies need to give business benefit. RCM is the perfect tool for selecting those that do.
(Graph: performance or condition (resistance to stress) against time, showing P, the detectable warning sign, and F, the functional failure.)

RCM Case Studies
Use of RCM to:
- determine which automated monitoring equipment to use on trains
- justify relying on instruments (and not carrying out traditional maintenance)
Context:
- Manufacturer of new train has a long-term franchise to operate and maintain
- New train at advanced stage of design
The Challenges:
- Pressure for high reliability and low maintenance costs
- Where to use automated monitoring systems, and how to ensure such systems add value
- How to justify different approaches to maintenance to a conservative regulator
What was done: RCM used on new equipment, with a team of users and designers, to:
- determine where to fit monitoring systems
- confirm that they are technically feasible and worth doing
- ensure that they have appropriate maintenance
- ensure that unnecessary maintenance is eliminated
- justify the strategy to the regulator

RCM Case Study
Maintenance cost savings through a move to condition-based maintenance on a large fleet of mobile trucks
Situation:
- A fleet of large trucks is used in the oil and gas industry
- These are powered by a large diesel engine and have many hydraulic systems
- There is considerable redundancy in the context
- The maintenance strategy is based on manufacturers' recommendations. These are entirely time-based
What we did:
- Carried out RCM analysis on a typical truck
- An important part of the study was to gain a thorough understanding of the operating context and where the opportunities for improvement lay
Results:
- Biggest costs are oil, filters, fuel and tyres
Outcomes:
- RCM recommended a move to condition-based maintenance for oil and filters
- Data analysis showed a massive opportunity for cost reduction resulting from this move

RCM Case Studies
New monitoring equipment adds little or no value if people's behaviour does not change accordingly
Context:
- GT-driven centrifugal compressor used to export gas. Any downtime causes massive loss of income
- Biggest contribution to unavailability is planned shutdown
- System shut down every 2000 hours
- Major driver for the 2000-hour shutdowns is visual inspection of finger screens for debris build-up
- Debris in finger screens is monitored by an expensive automatic system
Lesson:
- Automated equipment adds no value if it does not change the behaviour of the people who operate and maintain equipment
Context:
- Reciprocating compressors used for condensate recovery
- If a compressor is down, vapour is flared
- Planned shutdown for 5 days every 6 months according to manufacturer's recommendations
Outcomes:
- Condition-based maintenance identified to replace tasks carried out during planned shutdown
- Chairman's award for environmental improvement: massive reduction in flaring
- Rod-drop system fitted but not used
Lessons:
- RCM can be used to improve existing maintenance through use of monitoring
- Monitoring system fitted but not being used

RCM Case Studies
- Use of RCM to justify relying on instruments (and not carrying out traditional maintenance)
- Use of RCM to determine cost-effective maintenance requirements for modern instrumented equipment
Context:
- Seal oil system on hydrocracker recycle gas compressor at a new refinery
- Critical from a safety and economic perspective
- Lots of instrumentation and redundancy
Outcomes:
- Some testing of protection
- Some condition-based maintenance, including calibrations
- Mainly No Scheduled Maintenance
Lessons:
- It is valid to have a policy of No Scheduled Maintenance under the right circumstances
- This needs careful consideration and justification
Context:
- Fleet of GTs used offshore to drive rotating equipment
- New business model means manufacturer pays for maintenance through LTSA
- PM is the biggest contribution to unavailability and a big cost
- Annual instrument maintenance/calibration is the biggest contributor
Outcomes:
- RCM used to determine maintenance requirements
- 56% reduction in maintenance labour costs on GT
- Availability improvements of at least 1.5 days per year per GT from planned maintenance alone
Lessons:
- Traditional maintenance had not kept up with the change to monitoring equipment
- RCM used to derive cost-effective maintenance for modern instruments and monitoring equipment