LECTURE 3 MAINTENANCE DECISION MAKING STRATEGIES (RELIABILITY CENTERED MAINTENANCE)

Politecnico di Milano, Italy
piero.baraldi@polimi.it

Types of maintenance approaches
Maintenance intervention:
- Unplanned
  - Corrective: replacement or repair of failed units
- Planned
  - Scheduled: replacement or repair following a predefined schedule
  - Condition-based: monitor the health of the system and then decide on repair actions based on the degradation level assessed
  - Predictive: predict the Remaining Useful Life (RUL) of the system and then decide on repair actions based on the predicted RUL

Maintenance decision making strategies
- Risk-Based Maintenance
- Reliability Centered Maintenance

RELIABILITY-CENTRED MAINTENANCE

Reliability-Centred Maintenance (RCM)
What is it? A systematic approach for establishing maintenance programs.
Maintenance intervention approaches:
- Corrective maintenance
- Planned maintenance (scheduled, condition-based)
Primary objective: determine the combination of maintenance tasks which will significantly reduce the major contributors to unreliability and maintenance cost in light of the consequences of failures.

The RCM Method
- Focus on system functionality
- Find the most important functions of the system
- Avoid and remove maintenance actions which are not strictly necessary
- When a maintenance plan already exists, the result of RCM is usually the elimination of inefficient preventive maintenance tasks

RCM Experience
A wide range of companies have reported success by using RCM, that is, cost reductions while maintaining or improving operations regularity:
- Aircraft industry. RCM is standard procedure for the development of new commercial aircraft
- Military forces (especially in the US)
- Nuclear power stations (especially in the US and in France)
- Oil companies. Most of the oil companies in the North Sea are using RCM
- Commercial shipping

Main Steps of an RCM Analysis
1. Study preparation
2. System selection and definition
3. Functional failure analysis (FFA)
4. Critical item selection
5. FMECA
6. Selection of maintenance actions
7. Determination of maintenance intervals
8. Preventive maintenance comparison analysis
9. In-service data collection and updating

1. Study Preparation
- Form the RCM project group (multi-disciplinary)
- Define and clarify objectives and scope of work
- Identify requirements, policies, and acceptance criteria with respect to safety and environmental protection
- Provide drawings and process diagrams (P&ID, ...)
- Check discrepancies between as-built documentation and the real plant
- Define limitations for the analysis

2. System Selection and Definition
- A standby valve is a maintainable item
- The valve actuator is not a maintainable item

RCM Step 3: Functional Failure Analysis
[Flow diagram] Functional Failure Analysis (identify system functions -> identify functional failures -> judge functional failure criticality) -> perform FMECA on MSI -> list of the dominant failure modes

3. Functional Failure Analysis
Objectives:
- Identify and describe the system's required functions and performance criteria
- Describe input interfaces required for the system to operate
- Identify the ways in which the system might fail to function
Example: pumping system. Functions: to pump a fluid; fluid containment.

3. Functional Failure Analysis
The criticality of functional failures must be judged at plant level and should be ranked with respect to:
- S = Safety of Personnel
- E = Environmental Impact
- A = Production Availability
- C = Material Loss
The consequences may be ranked as:
- H = High
- M = Medium
- L = Low
- N = Negligible
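The ranking above can be sketched in a few lines of Python. This is only an illustration: the three functional failures and their S/E/A/C classes are hypothetical, and the ordering rule (worst class first, ties broken by the sum of ordinal scores) is an assumption, not a rule stated in the lecture.

```python
# Ordinal scale for the consequence classes named on the slide.
LEVELS = {"N": 0, "L": 1, "M": 2, "H": 3}
# S = safety, E = environment, A = availability, C = material loss.
DIMENSIONS = ("S", "E", "A", "C")


def worst_class(ranking):
    """Return the most severe class (H/M/L/N) across the four dimensions."""
    return max((ranking[d] for d in DIMENSIONS), key=LEVELS.__getitem__)


def rank_failures(failures):
    """Sort functional failures from most to least critical.

    Assumption: order by worst class, ties broken by the sum of scores.
    """
    def key(item):
        _, ranking = item
        scores = [LEVELS[ranking[d]] for d in DIMENSIONS]
        return (max(scores), sum(scores))

    return [name for name, _ in sorted(failures.items(), key=key, reverse=True)]


# Hypothetical functional failures of a pumping system.
failures = {
    "no flow delivered": {"S": "L", "E": "N", "A": "H", "C": "M"},
    "external leakage": {"S": "H", "E": "H", "A": "M", "C": "L"},
    "reduced flow": {"S": "N", "E": "N", "A": "M", "C": "L"},
}
ranked = rank_failures(failures)
print(ranked)
```

In a real study the ranking would be agreed by the RCM project group rather than computed mechanically; the code only makes the bookkeeping explicit.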

RCM Step 4: Critical Item Selection
[Flow diagram] Functional Failure Analysis (identify system functions -> identify functional failures -> judge functional failure criticality) -> Critical Item Selection: Functional Significant Items (FSI) + Cost Significant Items (CSI) = Maintenance Significant Items (MSI) -> list of the dominant failure modes
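Read as sets, the "+" in the diagram is a union: an item is maintenance significant if it is significant for function or for cost. A minimal sketch, with hypothetical item tags:

```python
# Hypothetical item tags, for illustration only.
functional_significant = {"pump P01", "shutdown valve V01", "level sensor LS02"}
cost_significant = {"pump P01", "gearbox G03"}

# Maintenance significant items: significant for function OR for cost.
maintenance_significant = functional_significant | cost_significant
print(sorted(maintenance_significant))
```

An item that appears in both lists, such as the hypothetical pump here, is counted once.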

4. Critical Item Selection

RCM Step 5: FMECA
[Flow diagram] Functional Failure Analysis (identify system functions -> identify functional failures -> judge functional failure criticality) -> Critical Item Selection: Functional Significant Items (FSI) + Cost Significant Items (CSI) = Maintenance Significant Items (MSI) -> perform FMECA on MSI -> list of the dominant failure modes

5. Failure Modes, Effects and Criticality Analysis
Objective: identify the dominant failure modes of the MSIs identified in step 4.
This step is performed by filling in an FMECA sheet.

FAILURE MODES, EFFECTS AND CRITICALITY ANALYSIS (FMECA)

FMECA
- Qualitative
- Inductive
AIM: identification of those component failure modes which could fail the item

FMECA: Procedure steps
1. For each item, identify its operation modes (start-up, regime, shut-down, maintenance, etc.) and configurations (valves open or closed, pumps on or off, etc.)
2. For each item in each of its operation modes, compile an FMECA table

FMECA TABLE
Header fields: FUNCTION; OPERATION MODE
Columns:
- Component: description
- Failure mode: failure modes relevant for the operational mode indicated
- Effect on functionality: effects on the functionality of the item
- Effects on other items: effects of the failure mode on adjacent items and the surrounding environment
- Effects on plant: effects on the functionality and availability of the entire plant
- Probability*: probability of failure occurrence (sometimes qualitative)
- Severity+: worst potential consequences (qualitative)
- Criticality: criticality rank of the failure mode on the basis of its effects and probability (qualitative estimation of risk)
- Detection methods: methods of detection of the occurrence of the failure event
- Protections and mitigation: protections and measures to avoid the failure occurrence

FMECA TABLE: example
Component: process shutdown valve
Function: shut down the process (designed with a closing time of 10 s)
Failure modes:
- Closes too slowly (> 14 s)
- Closes too fast (< 6 s)

FMECA TABLE: probability classes
Columns filled so far: Component, Failure mode, Effects on other items, Effects on subsystem, Effects on plant, Probability*.
Probability of failure occurrence (sometimes qualitative) is ranked as:
- Very unlikely: once per 1000 years or more seldom
- Remote: once per 100 years
- Occasional: once per 10 years
- Probable: once per year
- Frequent: once per month or more often

FMECA TABLE: severity classes
Columns filled so far: Component, Failure mode, Effects on other components, Effects on subsystem, Effects on plant, Probability*, Severity+, Criticality.
Severity (worst potential consequences, qualitative) is ranked as:
- Safe: no relevant effects
- Marginal: partially degraded system but no damage to humans
- Critical: system damage and also damage to humans. If no protective actions are undertaken, the accident could lead to loss of the system and serious consequences for humans
- Catastrophic: loss of the system and serious consequences for humans
Criticality is the rank of the failure mode on the basis of its effects and probability (qualitative estimation of risk).
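One way to turn the probability and severity classes into a criticality rank is a small risk matrix. The sketch below is an assumption for illustration: the lecture names the classes but does not give the combination rule, so the scoring thresholds and the "low/medium/high" labels are hypothetical.

```python
# Class names taken from the slides, ordered from least to most severe.
PROBABILITY = ["very unlikely", "remote", "occasional", "probable", "frequent"]
SEVERITY = ["safe", "marginal", "critical", "catastrophic"]


def criticality(probability, severity):
    """Combine the two qualitative classes into a criticality rank.

    Assumed rule: add the class indices and cut the sum into three bands.
    A 'safe' failure mode is always ranked low, since it has no relevant effects.
    """
    p = PROBABILITY.index(probability)  # 0..4
    s = SEVERITY.index(severity)        # 0..3
    if s == 0:
        return "low"
    score = p + s                       # 1..7
    if score >= 5:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


print(criticality("probable", "critical"))      # index sum 3 + 2 = 5
print(criticality("remote", "catastrophic"))    # index sum 1 + 3 = 4
```

Any real matrix should be calibrated against the acceptance criteria agreed in the study preparation step.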

FMECA Table
Header fields: SUBSYSTEM; OPERATION MODE
Columns:
- Component: description
- Failure mode: failure modes relevant for the operational mode indicated
- Effects on other components: effects of the failure mode on adjacent components and the surrounding environment
- Effects on subsystem: effects on the functionality of the subsystem
- Effects on plant: effects on the functionality and availability of the entire plant
- Probability*: probability of failure occurrence (sometimes qualitative)
- Criticality+: criticality rank of the failure mode on the basis of its effects and probability (qualitative estimation of risk)
- Detection methods: methods of detection of the occurrence of the failure event
- Protections and mitigation: protections and measures to avoid the failure occurrence
- Remarks: remarks and suggestions on the need to consider the failure mode as an accident initiator
Detection:
- Evident failure (detected instantaneously), e.g. spurious stop of a running pump
- Hidden failure (can be detected only during testing of the item), e.g. failure to start of a standby pump

Exercise: Domestic Hot Water

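Example Boiler System: FMECA (1)
Columns: Component; Failure mode; Detection methods; Effect on whole system; Compensating provision and remarks; Criticality class; Failure frequency

Pressure relief valve (V04), jammed open
- Detection methods: observe at pressure relief valve
- Effect on whole system: operation of TS controller; gas flow due to hot water loss
- Compensating provision and remarks: shut off water supply, reseal or replace relief valve
- Criticality class: Safe
- Failure frequency: Likely

Pressure relief valve (V04), jammed closed
- Detection methods: manual testing
- Effect on whole system: no consequences. If combined with other component failure: rupture of container or pipes
- Compensating provision and remarks: periodic inspection; replacement
- Criticality class: Critical
- Failure frequency: Rare

Gas valve (V03), jammed open
- Detection methods: water at faucet too hot; pressure relief valve open (observation)
- Effect on whole system: burner continues to operate, pressure relief valve opens
- Compensating provision and remarks: open hot water faucet to relieve pressure. Shut off gas supply. Pressure relief valve compensates. IE1
- Criticality class: Critical
- Failure frequency: Likely

Gas valve (V03), jammed closed
- Detection methods: observe at output (water temperature too low)
- Effect on whole system: burner ceases to operate
- Compensating provision and remarks: replacement
- Criticality class: Safe
- Failure frequency: Negligible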

Example Boiler System: FMECA (2)
Columns: Component; Failure mode; Detection methods; Effect on whole system; Compensating provision and remarks; Criticality class; Failure frequency

Temperature measuring and comparing device (Tsc01), fails to react to temperature rise above preset level
- Detection methods: observe at output (water at faucet too hot); pressure relief valve opens
- Effect on whole system: controller, gas valve, burner continue to function "on". Pressure relief valve opens
- Compensating provision and remarks: pressure relief valve compensates. Open hot water faucet to relieve pressure. Shut off gas supply. IE2
- Criticality class: Critical
- Failure frequency: Negligible

Temperature measuring and comparing device (Tsc01), fails to react to temperature drop below preset level
- Detection methods: observe at output (water at faucet too cold)
- Effect on whole system: controller, gas valve, burner continue to function "off"
- Compensating provision and remarks: replacement
- Criticality class: Safe
- Failure frequency: Negligible
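The rows of the two boiler FMECA tables can be held as simple records and filtered to list the critical failure modes. The component names, criticality classes and failure frequencies below are copied from the tables; the record type `FmecaRow` and the ordering of the frequency labels are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FmecaRow:
    """One line of the boiler FMECA (only the columns used here)."""
    component: str
    failure_mode: str
    criticality_class: str
    failure_frequency: str


rows = [
    FmecaRow("Pressure relief valve (V04)", "jammed open", "Safe", "Likely"),
    FmecaRow("Pressure relief valve (V04)", "jammed closed", "Critical", "Rare"),
    FmecaRow("Gas valve (V03)", "jammed open", "Critical", "Likely"),
    FmecaRow("Gas valve (V03)", "jammed closed", "Safe", "Negligible"),
    FmecaRow("Temperature device (Tsc01)", "no reaction to temperature rise",
             "Critical", "Negligible"),
    FmecaRow("Temperature device (Tsc01)", "no reaction to temperature drop",
             "Safe", "Negligible"),
]

# Assumed ordering of the qualitative frequency labels used in the tables.
FREQUENCY_ORDER = {"Negligible": 0, "Rare": 1, "Likely": 2}

# Critical failure modes, most frequent first.
critical = sorted(
    (r for r in rows if r.criticality_class == "Critical"),
    key=lambda r: FREQUENCY_ORDER[r.failure_frequency],
    reverse=True,
)
for r in critical:
    print(f"{r.component}: {r.failure_mode} ({r.failure_frequency})")
```

With this ordering the gas valve jammed open comes first, which matches its role as an initiating event (IE1) in the table.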

RCM Steps 3-5
[Flow diagram] Functional Failure Analysis (identify system functions -> identify functional failures -> judge functional failure criticality) -> Critical Item Selection: Functional Significant Items (FSI) + Cost Significant Items (CSI) = Maintenance Significant Items (MSI) -> perform FMECA on MSI -> list of the dominant failure modes

6. RCM Decision Logic
Input to the RCM decision logic: the dominant failure modes identified in the previous step (FMECA).
[Decision logic diagram: each dominant failure mode is assigned to a condition-based, scheduled, or corrective maintenance task]

6. Scheduled On-Condition Task
There are three criteria that must be met for an on-condition task to be applicable:
1. It must be possible to detect reduced failure resistance for a specific failure mode (e.g., through a degradation index d).
2. It must be possible to define a potential failure condition that can be detected by an explicit task (e.g., a detection threshold d_detection).
3. There must be a reasonably consistent age interval between the time at which the potential failure is detected (t_detect) and the time of functional failure (t_failure).
[Figure: degradation index d versus time t; d reaches d_detection at t_detect and d_failure at t_failure]
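The interval in criterion 3 can be estimated from a monitored degradation history. The sketch below is illustrative only: the sampled degradation values and both thresholds are hypothetical, and the crossing times are found by linear interpolation between samples, which is an assumption about how the index evolves between inspections.

```python
def first_crossing(times, values, threshold):
    """Return the first time at which values reach the threshold.

    Linear interpolation between samples; returns None if never reached.
    """
    if values[0] >= threshold:
        return times[0]
    samples = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 < threshold <= v1:
            return t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
    return None


# Hypothetical degradation index d sampled over time (arbitrary units).
times = [0, 10, 20, 30, 40, 50]
degradation = [0.0, 0.1, 0.25, 0.45, 0.7, 1.0]

D_DETECTION = 0.35  # assumed potential-failure (detection) threshold
D_FAILURE = 0.85    # assumed functional-failure threshold

t_detect = first_crossing(times, degradation, D_DETECTION)
t_failure = first_crossing(times, degradation, D_FAILURE)
interval = t_failure - t_detect  # time available to act after detection
print(t_detect, t_failure, interval)
```

For the task to be useful, the inspection period would need to be shorter than this interval, so that at least one inspection falls between detection and failure.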

6. Scheduled Overhaul
An overhaul task is considered applicable to an item only if the following criteria are met:
1. There must be an identifiable age at which there is a rapid increase in the item's failure rate function.
2. A large proportion of the items must survive to that age.
3. It must be possible to restore the original failure resistance of the item by reworking it.
[Figure: failure rate function λ(t) versus age t]
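Criteria 1 and 2 can be checked numerically once a failure rate model is available. The sketch below assumes a Weibull model, which the lecture does not prescribe, with hypothetical shape and scale parameters; criterion 3 is an engineering judgement and is not covered by the code.

```python
import math

# Hypothetical Weibull parameters: shape BETA > 1 means an increasing failure rate.
BETA = 3.0
ETA = 1000.0  # scale parameter, in operating hours


def failure_rate(t):
    """Weibull failure rate: lambda(t) = (BETA / ETA) * (t / ETA) ** (BETA - 1)."""
    return (BETA / ETA) * (t / ETA) ** (BETA - 1)


def survival(t):
    """Weibull reliability: R(t) = exp(-(t / ETA) ** BETA)."""
    return math.exp(-((t / ETA) ** BETA))


overhaul_age = 500.0  # candidate overhaul age, in hours (assumed)

# Criterion 1: the failure rate is rising at the candidate age.
rate_is_increasing = failure_rate(1.1 * overhaul_age) > failure_rate(overhaul_age)
# Criterion 2: fraction of items still working at the candidate age.
fraction_surviving = survival(overhaul_age)

print(rate_is_increasing, round(fraction_surviving, 3))
```

What counts as "a large proportion" is a policy choice; the code only reports the surviving fraction for the chosen age.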

6. Scheduled Replacement
A scheduled replacement task is applicable only under the following circumstances:
1. The item must be subject to a critical failure.
2. The item must be subject to a failure that has major potential consequences.
3. There must be an identifiable age at which the item shows a rapid increase in the failure rate function.
4. A large proportion of the items must survive to that age.

6. Scheduled Function Test
A scheduled function test task is applicable to an item under the following conditions:
1. The item must be subject to a functional failure that is not evident to the operating crew during the performance of normal duties.
2. The item must be one for which no other type of task is applicable and effective.

6. Run to Failure
Run to failure is a deliberate decision, taken because the other tasks are not possible or their economics are less favorable. Run-to-failure maintenance is generally considered to be the most expensive option and should only be used on low-cost, easy-to-replace components that are not critical to operations.
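The five task types can be read as a cascade in which run to failure is the fallback. The sketch below assumes the tasks are tried in the order the slides present them; the function name and its boolean inputs are hypothetical, and each flag stands for "the applicability criteria of that task are met and the task is effective".

```python
def select_task(on_condition_ok, overhaul_ok, replacement_ok, hidden_failure):
    """Pick a maintenance task for one dominant failure mode.

    Assumed order: on-condition, overhaul, replacement, function test,
    and run to failure only when nothing else applies.
    """
    if on_condition_ok:
        return "scheduled on-condition task"
    if overhaul_ok:
        return "scheduled overhaul"
    if replacement_ok:
        return "scheduled replacement"
    if hidden_failure:
        # Applicable only when no other task type is applicable and effective.
        return "scheduled function test"
    return "run to failure"


# Example: a hidden failure for which no other task applies.
print(select_task(False, False, False, True))
# Example: no task applies and the failure is evident.
print(select_task(False, False, False, False))
```

Placing the function test after the other tasks mirrors its second applicability condition, which requires that no other type of task is applicable and effective.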

7. Determination of Maintenance Intervals
Scheduled tasks are to be performed at regular intervals. Determining the optimal interval is a very difficult task that has to be based on information about:
- the failure rate function
- the likely consequences and costs of the failure the PM task is supposed to prevent
- the cost and risk of the PM task
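To show how these three inputs interact, the sketch below uses the classical age-replacement cost-rate model, which is an assumption brought in for illustration and not a method given in the lecture. The item is replaced preventively at age T or on failure, whichever comes first, and the long-run cost per unit time is (c_pm * R(T) + c_fail * (1 - R(T))) divided by the integral of R(t) from 0 to T. The Weibull parameters and both costs are hypothetical.

```python
import math

# Hypothetical inputs.
BETA, ETA = 2.5, 1000.0   # Weibull shape and scale (hours)
C_PM, C_FAIL = 1.0, 10.0  # cost of one PM task and of one failure


def reliability(t):
    """Weibull reliability R(t) = exp(-(t / ETA) ** BETA)."""
    return math.exp(-((t / ETA) ** BETA))


def cost_rate(interval, steps=1000):
    """Long-run cost per hour when replacing at age `interval` or at failure."""
    h = interval / steps
    # Expected cycle length: integral of R(t) over [0, interval], trapezoidal rule.
    area = 0.5 * (reliability(0.0) + reliability(interval))
    area += sum(reliability(i * h) for i in range(1, steps))
    expected_length = area * h
    r = reliability(interval)
    expected_cost = C_PM * r + C_FAIL * (1.0 - r)
    return expected_cost / expected_length


# Coarse grid search over candidate intervals (hours).
candidates = range(50, 2001, 10)
best_interval = min(candidates, key=cost_rate)
print(best_interval, cost_rate(best_interval))
```

The result is only as good as the failure rate model and cost figures fed into it, which is exactly the difficulty the slide points out.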

7. Determination of Maintenance Intervals
An opinion, from The RCM Handbook (Naval Sea Systems Command, S9081-AB-GIB 010/MAINT, US Dept. of Defense, Washington DC 20301, 1983):
"The best thing you can do if you lack good information about the effect of age on reliability is to pick a periodicity that seems right. Later, you can personally explore the characteristic of the hardware at hand by periodically increasing the periodicity and finding out what happens."

Model Granularity
Prater's principle of "optimal sloppiness"
[Figure: predictive power versus level of detail]
The granularity of the model is determined by the problem and by the availability and accuracy of the data.

7. Determination of Maintenance Intervals (continued)
In practice, the various maintenance tasks have to be grouped into maintenance packages that are carried out at the same time or in a specific sequence. The maintenance intervals therefore cannot be optimized for each single item: the whole maintenance package has to be treated, at least to some degree, as an entity.

8. Planned Maintenance (PM) Comparison Analysis
Each maintenance task selected must meet two requirements:
1. It must be applicable: it can prevent a failure, reduce the probability of occurrence of a failure to an acceptable level, or reduce the impact of a failure.
2. It must be cost-effective (i.e., the task must not cost more than the failures it is going to prevent).
[Figure: cost of failure weighed against cost of PM]
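The cost-effectiveness requirement reduces to a simple comparison once both sides are expressed over the same period. In the sketch below the function name, the yearly basis and all figures are hypothetical; in practice each side would aggregate the cost items that apply to the PM task and to the failure.

```python
def is_cost_effective(pm_cost_per_year, failures_prevented_per_year, cost_per_failure):
    """True if the PM task does not cost more than the failures it prevents."""
    avoided_cost_per_year = failures_prevented_per_year * cost_per_failure
    return pm_cost_per_year <= avoided_cost_per_year


# Hypothetical figures: the task prevents one failure every two years.
print(is_cost_effective(2000.0, 0.5, 10000.0))  # PM cost 2000 vs avoided 5000
print(is_cost_effective(8000.0, 0.5, 10000.0))  # PM cost 8000 vs avoided 5000
```

The comparison is deliberately crude; risk-related items such as personnel safety are hard to express as a single cost and may need to be judged separately.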

8. PM Comparison Analysis: Cost of a PM Task
- The risk/cost related to maintenance-induced failures
- The risk the maintenance personnel is exposed to during the task
- The risk of increasing the likelihood of failure of another item while the one is out of service
- The use and cost of physical resources
- The unavailability of physical resources elsewhere while in use on this task
- Production unavailability during maintenance
- Unavailability of protective functions during maintenance

8. PM Comparison Analysis: Cost of a Failure
- The consequences of the failure in terms of:
  - loss of production
  - possible violation of laws or regulations
  - reduction in plant or personnel safety
  - damage to other equipment
- The consequences of not performing the PM task even if a failure does not occur (e.g., loss of warranty)
- Increased premiums for emergency repairs (such as overtime, expediting costs, or high replacement power cost)

Updating Process
- Short-term interval adjustments
- Medium-term task evaluation
- Long-term revision of the initial strategy
[Diagram labels: goals, reference, plan, activities, system, results]

RCM Comments
General issues:
- Maintenance people often rely on the manufacturer's recommendations and end up with maintenance that is too frequent
- Setting maintenance tasks and intervals is a difficult job that has to be dynamically based on the information available at the time, e.g. the knowledge of the failure rate value, the probable consequences and costs of the failure that PM is supposed to prevent, and the costs and risks of PM
- Most of the models require information that is not available. This calls for expert opinion elicitation, properly supported by sensitivity and uncertainty analysis