How to Define Your Systems and Assets to Support Reliability. How to Define Your Failure Reporting Codes to Support Reliability


BACKFED RELIABILITY

- How to Define Your Systems and Assets to Support Reliability
- How to Define Your Failure Reporting Codes to Support Reliability
- How to Generate Risk Prioritization Numbers (RPN) from Historic Failure Data
- How to Utilize RPNs to Identify and Prioritize the Most Impactful Work

Associate Director Maintenance Systems, Greater Toronto Airports Authority
Scott has 38 years' experience in the aviation/airports industry, where he has touched on almost all facets of the business, from professional pilot to regulator. He is currently the Associate Director Maintenance Systems in the Facilities Department of the Greater Toronto Airports Authority.

Scott's team is finalizing the transition from a home-grown CMMS to Maximo 7.6. From the beginning, the team has looked to Maximo to pave the way for leveraging its data and information to make better decisions on its assets, toward the goal of a maintenance program with its foundations anchored in Reliability Centered Maintenance principles.

Director of EAM Services, EDI
Tim has worked on EAM implementations in multiple industries for over 20 years. Prior to joining EDI, Tim served as Sr. Associate Director of Maintenance Reliability for a multinational manufacturer with more than 47,000 employees. Since 1993, Tim has been a leading advocate and innovator in the world of asset management and reliability.

EDI:
- Experts in Maximo implementations in asset-intensive process environments
- Assessment, consulting, and solution delivery from Concept to Go-Live
- Services delivered worldwide across multiple industries
- Developers of several Maximo add-ons and industry-centric solutions

To keep assets performing their function? To minimize risk to the business?

Previous:
- Risk Averse
- Failure Prevention
- Response to Failure
- PM Costs driven by maintenance

Future:
- Managed Risk
- Failure Management
- Understand Failure Cause
- Cost controlled maintenance

Perception: Requires Significant Effort to:
1. Collect Data
2. Analyze Data
3. Develop Mitigation Strategies
4. Implement Mitigation Strategies
No Obvious Place to Start!

Translation to Management:
- Requires Significant Money
- Difficult to Determine Specific ROI and Payback at Start

1. Establish Functional Systems
2. Define Functional Failure Codes
3. Calculate RPNs for Failures
4. Utilize RPNs to Identify Work
5. Execute Projects

This is the Backfed Part!
*No Effort-Intensive, Wide-Reaching Initiatives or Studies to Implement!

- To achieve RCM, it is critical to understand the requirements of an asset and how it fails.
- Failure is simply the inability of an asset to perform a required function, regardless of cause.
- Severity of Failure is based on this inability, not on the component failure that caused it.
- Understanding Asset Capability is the First Requirement for Achieving Reliability.

Required Function: Stated as "To be capable of ..."; avoid subjective and nebulous descriptions (e.g., effective, proper, quality).

Functional Failure: A particular way in which failures occur, independent of the reason or cause for the failure; often referred to as the problem. A Functional Failure description starts with the words "Incapable of (required function)".

Failure Mode: Specific events or conditions related to a piece of equipment or a component that lead to a functional failure (e.g., loss of power, broken coupling, isolation valve closed).

Failure Code: A set of codes used to document a problem (functional failure), the cause of the failure (failure mode), and the remedy applied to correct the failure. These codes are commonly referred to as PCRs.


For Any Element of a System, You Should Be Able To Easily:
- Identify the Function(s) the Element Supports
- Identify Where the Element Is Installed and What It Is Connected To
- Identify If the Element Is Repaired, Replaced, Calibrated, etc.

Minimum Requirements of Location Hierarchy:
- Establish Function-based Location Hierarchy
- Establish Clear and Consistent Definition of an Asset (vs. System or Component)
- Maintain 1:1 Relationship Between Asset and Location
- Capture Appropriate Attribution on Location, Asset and Spare Part Records

- Collect/Cleanse Data Using SCBDs as Your Guide
- Apply Definitions Above Along With Physical Structure (P&IDs, for example) to Build System Classification and Boundary Diagrams (SCBDs) for Each System
- Identify All Applicable Functional Systems
- Establish Definitions for Systems (Locations), Assets and Components (Items)

[Diagram: System Classification and Boundary Diagram for an Air Handler System (AHS). Inputs: Chilled Water, Hot Water, Outside Air, Return Air. Outputs: Supply Air, exhaust. Elements shown: Primary Filter, Filter Bank, Economizer, Dampers, Reclaim Coil, Reclaim Pump, Cooling Coil, Heating Coil, Supply Fan, Return Fan, Exhaust Fans, Motors, Air Boxes with Reheat Coils, and flow (FE), temperature (TE) and pressure (PE) sensing elements.]

[Figure: example LOCATION / ASSET / ITEM hierarchy]

Locations (Name/Identifier: Description):
- CORPHQ: Corporate Headquarters
- B01: Building 1
- B01301: HVAC System
- B0130101: Air Handler System 1 (AHS)
- B0130101AHU01: Air Handler Unit 1 (AHU)
- B0130101EF01: Exhaust Fan (Exhaust Fan Serving Room 2301)
- B0130101FE01: Exhaust Flow (Exhaust Fan Exhausted Air Flow)
- B0130101PE01: Exhaust Pressure
- B0130101AB01: Air Box (VAV Box Serving Room 1439)

Assets shown: #113443 (PACE Air Handler), #113444 (Exhaust Fan), #113445 (Room 2301 Air Flow Sensor), #113450 (VAV Box).

Items shown: Supply Fan, Return Fan, Cooling Coil, Heating Coil, Control Valves, Reheat Coil, Damper, etc., with item numbers P287465, P647879, P348765, P498746, P784726, P649837.
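The hierarchy above can be sketched in code to show how a function-based location tree and the 1:1 asset-to-location rule fit together. This is a minimal illustration, not how Maximo stores locations: the `Location` class and `install_asset` helper are hypothetical names, and placing asset #113443 on location B0130101AHU01 is an assumption read from the slide layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Location:
    """One node in a function-based location hierarchy (hypothetical model)."""
    identifier: str
    description: str
    parent: Optional["Location"] = None
    asset_number: Optional[str] = None  # at most one asset per location

    def path(self) -> List[str]:
        """Return identifiers from the top of the hierarchy down to this node."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.identifier)
            node = node.parent
        return list(reversed(chain))


def install_asset(location: Location, asset_number: str) -> None:
    """Enforce the 1:1 relationship between asset and location."""
    if location.asset_number is not None:
        raise ValueError(
            f"{location.identifier} already holds asset {location.asset_number}"
        )
    location.asset_number = asset_number


corp = Location("CORPHQ", "Corporate Headquarters")
bldg = Location("B01", "Building 1", parent=corp)
hvac = Location("B01301", "HVAC System", parent=bldg)
ahs = Location("B0130101", "Air Handler System 1", parent=hvac)
ahu = Location("B0130101AHU01", "Air Handler Unit 1", parent=ahs)

# Assumed pairing of asset and location, for illustration only.
install_asset(ahu, "113443")
print(" > ".join(ahu.path()))
```

Keeping the asset number on the location record, rather than the other way round, makes the 1:1 rule easy to check at install time.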


- Define Failure: It must have a direct relationship to business risk
- Define Risk of Failure: Objective impact of failure to business
- Measuring Failure: Relationship of Problem and Severity; Detectability Relationship to the Cause

Common Issues with Failure Codes:
- Problems are Causes and Causes are Remedies
- Problems don't correlate with severity of failure
- Remedies are a single word instead of an action description

Objectives for Failure Codes:
- Problem and Failure should be synonymous
- Failures should be at a system level
- Cause should identify component
- Remedy should be action performed on component

| | Use This | Avoid This |
|---|---|---|
| Problem | Inability to supply air at pressure set point | Filter plugged |
| Cause | Primary filter | Primary filter loaded |
| Remedy | Replaced primary filters | Replaced |

This strategy puts Severity on the Problem and Detectability on the Cause.
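As a rough sketch of how the "use this / avoid this" guidance might be checked on a work order, the snippet below models a PCR record and flags the two issues named in the slides. The `PCR` class, the `pcr_warnings` function and the two heuristics (problem starts with "Incapable of" or "Inability to"; remedy has more than one word) are assumptions for illustration, not rules taken from any EAM product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PCR:
    """Problem, Cause, Remedy codes captured on a work order (hypothetical)."""
    problem: str
    cause: str
    remedy: str


def pcr_warnings(pcr: PCR) -> List[str]:
    """Return simple, heuristic warnings about a PCR entry."""
    warnings = []
    # A problem should read as a functional failure, not as a cause.
    if not pcr.problem.lower().startswith(("incapable of", "inability to")):
        warnings.append("problem should describe a functional failure")
    # A remedy should describe the action performed, not be a single word.
    if len(pcr.remedy.split()) < 2:
        warnings.append("remedy should describe the action performed")
    return warnings


good = PCR(
    problem="Inability to supply air at pressure set point",
    cause="Primary filter",
    remedy="Replaced primary filters",
)
bad = PCR(problem="Filter plugged", cause="Primary filter loaded", remedy="Replaced")

print(pcr_warnings(good))
print(pcr_warnings(bad))
```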

Most EAMs Have the Ability To:
- Create Multiple Problem, Cause, Remedy Hierarchies
- Associate These Hierarchies to Locations and Assets
- Capture Problem, Cause and Remedy on Work Order During Work Identification, Planning, Execution or Review


The purpose of the RPN is to calculate potential risk in FMEAs: first on functional failures, to determine whether variation exists, and then on causes, so that the appropriate maintenance strategy is deployed.
* Note: Cause RPNs are calculated only if there is variation on the functional failure.

The common application for determining risk is the RPN (Risk Priority Number). To understand the application of the RPN, it is necessary to understand the differences between problem, cause, and remedy.

When the RPN is calculated on the functional failure, there can be a history of many causes (failure modes) resulting in the same functional failure. When performing the FMEA, it is common to apply an RPN for each cause relating to a functional failure.

In the Backfed process, the RPN is calculated on past failures (failures that have already occurred) rather than on projected failures as in the FMEA. The RPN is calculated by first assigning values to these three components of risk:
- Severity: What is the impact of the failure to the business?
- Occurrence: How frequently has this failure occurred?
- Detectability: Is the failure detectable prior to failure?

Severity: The worst-case consequences of the failure mode

| Rank | Description | Personnel Injury or Illness | Environmental Impact | Equip Loss & Penalty Loss | Function Downtime |
|---|---|---|---|---|---|
| 10 | Severe | Fatality | Long term (5 years or more) environmental damage; clean up requiring >$1 million to correct or in penalties | Greater than $25K in damages | Greater than one day |
| 7 | Major | One or more injuries or illnesses requiring hospitalization, or more than one OSHA recordable injury | Long term (1-5 years) environmental damage; clean up requiring >$250K to $1 million to correct or in penalties | $10K to $25K in damages | One day to one shift |
| 5 | Moderate | One OSHA recordable injury or illness | Short term (up to 1 year) environmental damage; clean up requiring >$100K to $250K to correct or in penalties | $5K to $10K in damages | One shift to two hours |
| 3 | Minor | One minor (first aid classification) injury or illness | Minor environmental damage readily restored or requiring $1K to $100K to correct or in penalties | $1K to $5K in damages | One hour to two hours |
| 1 | Insignificant | No injury or illness | Minor environmental damage readily restored or requiring <$1K to correct or in penalties | Less than $1K in damages | Less than one hour |
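The severity scale lends itself to a simple lookup. The sketch below ranks only the two numeric columns (equipment loss and function downtime) and takes the worst of the two. Treating "worst case" as the maximum across categories, an 8-hour shift, and the handling of values that fall exactly on a boundary are all assumptions, and the function names are hypothetical.

```python
def equipment_loss_rank(dollars: float) -> int:
    """Rank equipment and penalty loss using the dollar bands in the table."""
    if dollars > 25_000:
        return 10
    if dollars >= 10_000:
        return 7
    if dollars >= 5_000:
        return 5
    if dollars >= 1_000:
        return 3
    return 1


def downtime_rank(hours: float, shift_hours: float = 8.0) -> int:
    """Rank function downtime; shift_hours is an assumed shift length."""
    if hours > 24:
        return 10
    if hours >= shift_hours:
        return 7
    if hours >= 2:
        return 5
    if hours >= 1:
        return 3
    return 1


def severity_rank(equipment_loss: float, downtime_hours: float) -> int:
    """Assumed rule: severity is the worst (highest) rank of any category."""
    return max(equipment_loss_rank(equipment_loss), downtime_rank(downtime_hours))


# A $3K repair that took the function down for 10 hours.
print(severity_rank(equipment_loss=3_000, downtime_hours=10))
```

The injury and environmental columns are left out because they need a judgment call rather than a numeric band.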

Detection: The ability of current controls to identify the failure mode prior to failure

| Rank | Description | Chance the current controls will discover the event prior to the consequences |
|---|---|---|
| 10 | 100% Uncertain | Existing controls cannot detect the failure. No controls are in place. |
| 7 | Remote | Remote chance the controls will detect the failure. A control may be in place, but is untested or unreliable. |
| 5 | Moderate | A moderate chance that the control will detect the failure. |
| 3 | High | Very likely that the control will detect the failure. |
| 1 | Very High | The control will detect the failure in almost every instance. |

Occurrence: The occurrence of the failure mode (with current controls in place)

| Rank | Description | Likelihood | Frequency of failure mode over the expected life of the system is closest to* |
|---|---|---|---|
| 10 | Frequent | Failure is almost inevitable. Consistent failure observed. | 4 or more |
| 7 | Likely | Failure is likely and will occur in most circumstances. Repeated failures observed. | 3 |
| 5 | Occasional | Failure is probable at some time and has been observed. | 2 |
| 3 | Unlikely | Failure could occur at some time. Only isolated incidents observed. | 1 |
| 1 | Remote | Failure is extremely unlikely, no history of failure. | 0 |

* = (# of failures since the system was installed / years the system has been installed) x expected life of the system
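The footnote formula and the "closest to" column can be worked through in a few lines. This is a sketch only: the function names are hypothetical, and rounding at the half-way points (0.5, 1.5, 2.5, 3.5) is an assumed reading of "closest to".

```python
def expected_failures(failures: int, years_installed: float, expected_life: float) -> float:
    """(# of failures since install / years installed) x expected life."""
    if years_installed <= 0:
        raise ValueError("years_installed must be positive")
    return failures / years_installed * expected_life


def occurrence_rank(failures: int, years_installed: float, expected_life: float) -> int:
    """Map the projected lifetime failure count to the nearest table row."""
    projected = expected_failures(failures, years_installed, expected_life)
    # Assumed cut points: halfway between the table values 0, 1, 2, 3, 4+.
    for lower_bound, rank in ((3.5, 10), (2.5, 7), (1.5, 5), (0.5, 3)):
        if projected >= lower_bound:
            return rank
    return 1


# 2 failures in 5 years on a system with a 10-year expected life:
# 2 / 5 * 10 = 4 projected failures, which falls in the "4 or more" row.
print(occurrence_rank(failures=2, years_installed=5, expected_life=10))
```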

Once each of the components of risk has been assigned a value, the values are multiplied together to create the RPN:

Severity x Occurrence x Detectability = RPN

The RCM standard for the RPN threshold is 200. However, any time Severity is a level 10, there should be an investigation.

Unlike in the FMEA, Detectability is determined by whether the failure can be predicted prior to failure rather than after.
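A short worked example of the multiplication and the two triggers described above. The function names are hypothetical, and treating an RPN exactly equal to the threshold as "over" is an assumption, since the slides do not say whether the comparison is strict.

```python
VALID_RANKS = {1, 3, 5, 7, 10}  # the rank values used in the three scales


def rpn(severity: int, occurrence: int, detectability: int) -> int:
    """Risk Priority Number = Severity x Occurrence x Detectability."""
    for name, value in (
        ("severity", severity),
        ("occurrence", occurrence),
        ("detectability", detectability),
    ):
        if value not in VALID_RANKS:
            raise ValueError(f"{name} must be one of {sorted(VALID_RANKS)}")
    return severity * occurrence * detectability


def needs_investigation(severity: int, occurrence: int, detectability: int,
                        threshold: int = 200) -> bool:
    """Flag when the RPN reaches the threshold or Severity is level 10."""
    return severity == 10 or rpn(severity, occurrence, detectability) >= threshold


print(rpn(7, 5, 7))                  # 7 x 5 x 7 = 245
print(needs_investigation(7, 5, 7))  # over the 200 threshold
print(needs_investigation(10, 1, 1)) # RPN is only 10, but Severity is 10
```

The severity-10 rule matters because a fatal but rare and detectable failure would otherwise score far below the threshold.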

Add Table to Assets and Locations That Allows You To Store:
- Severity Rating of Each Possible Problem
- Detectability Rating of Each Possible Cause
- Occurrence Rating of Each Possible Problem*
- Risk Priority Number of Each Possible Problem*
- Aggregated Risk Priority Number for Asset or Location*
(*calculated; see next slide)

Add Automation Scripts to Automatically Calculate and Update the Following Attributes from Historical Data:
- Occurrence
- Risk Priority Numbers for Specific Problems
- Aggregated Risk Priority Numbers for Assets and Locations
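The calculation such a script would perform can be sketched in plain Python. This is not Maximo automation script code and uses no Maximo API; the work order history, the code names and the rating tables are invented for illustration. Two rules are assumptions the slides do not specify: a problem's detectability is taken as the worst rating among its recorded causes, and the aggregated RPN for a location is the sum of its problem RPNs.

```python
from collections import defaultdict

# Hypothetical history: (location, problem code, cause code) per work order.
HISTORY = [
    ("B0130101", "NO_SUPPLY_PRESSURE", "PRIMARY_FILTER"),
    ("B0130101", "NO_SUPPLY_PRESSURE", "PRIMARY_FILTER"),
    ("B0130101", "NO_SUPPLY_PRESSURE", "SUPPLY_FAN_MOTOR"),
    ("B0130101", "NO_EXHAUST_FLOW", "EXHAUST_FAN"),
]
SEVERITY = {"NO_SUPPLY_PRESSURE": 5, "NO_EXHAUST_FLOW": 3}   # per problem
DETECTABILITY = {"PRIMARY_FILTER": 3, "SUPPLY_FAN_MOTOR": 7, "EXHAUST_FAN": 5}  # per cause


def occurrence_rank(failures, years_installed, expected_life):
    """Project failures over the expected life and map to the nearest rank."""
    projected = failures / years_installed * expected_life
    for lower_bound, rank in ((3.5, 10), (2.5, 7), (1.5, 5), (0.5, 3)):
        if projected >= lower_bound:
            return rank
    return 1


def problem_rpns(history, years_installed, expected_life):
    """RPN per (location, problem) from historical work orders."""
    counts = defaultdict(int)
    worst_detect = defaultdict(int)
    for location, problem, cause in history:
        key = (location, problem)
        counts[key] += 1
        # Assumption: use the hardest-to-detect cause seen for this problem.
        worst_detect[key] = max(worst_detect[key], DETECTABILITY[cause])
    return {
        key: SEVERITY[key[1]]
        * occurrence_rank(n, years_installed, expected_life)
        * worst_detect[key]
        for key, n in counts.items()
    }


def aggregate_by_location(rpns):
    """Assumption: a location's aggregated RPN is the sum of its problem RPNs."""
    totals = defaultdict(int)
    for (location, _problem), value in rpns.items():
        totals[location] += value
    return dict(totals)


rpns = problem_rpns(HISTORY, years_installed=10, expected_life=20)
totals = aggregate_by_location(rpns)
print(rpns)
print(totals)
```

With these inputs the supply pressure problem scores 5 x 10 x 7 = 350 and the exhaust flow problem 3 x 5 x 5 = 75, so the location total is 425.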


[Dashboard: Reliability Tactical Implementation Plan (TIP)]
- Pareto Chart on Schedule Impacting Issues: Maximo identifies the highest-risk issues based on the use of the PCRs and occurrence of failure. Maximo will monitor the effectiveness of the CAPA, and the issue will move down the Pareto chart accordingly.
- Total Issues to >300 RPN Issues by Month: All issues are tracked by month and compared against the number of issues with RPN > 300.
- >300 RPN Issues to Root Cause Identified by Month: RCA is performed on > 300 issues and a CAPA is developed. A KPI is developed to trend known risk against projected post-CAPA risk.
- CAPAs approved will be planned into CE work orders, and a KPI will track % complete of project.

Create Reports and KPIs to Enable the Continuous RCM Evaluation Process:
- Pareto for Highest Risk Issues Across Organization
- KPI for Percentage of Issues That Are Greater Than a Threshold RPN
- KPI for Tracking Progress of Approved CAPA Work Orders
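The first two reports reduce to a sort and a ratio. The sketch below uses invented issue names and RPN values; the function names are hypothetical, and the threshold is left as a parameter because the slides mention both 200 and 300.

```python
def pareto(issue_rpns):
    """Return (issue, rpn, cumulative % of total RPN), highest risk first."""
    ordered = sorted(issue_rpns.items(), key=lambda item: item[1], reverse=True)
    total = sum(issue_rpns.values())
    rows, running = [], 0
    for issue, value in ordered:
        running += value
        rows.append((issue, value, running / total * 100))
    return rows


def percent_above(issue_rpns, threshold):
    """KPI: percentage of issues whose RPN is greater than the threshold."""
    if not issue_rpns:
        return 0.0
    above = sum(1 for value in issue_rpns.values() if value > threshold)
    return above / len(issue_rpns) * 100


# Hypothetical issues and RPNs.
issues = {"A": 350, "B": 75, "C": 245, "D": 45}

for issue, value, cumulative in pareto(issues):
    print(f"{issue}: RPN {value}, cumulative {cumulative:.1f}%")
print(percent_above(issues, threshold=300))
```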

- Removing emotion from risk
- The importance of separating subjective from objective
- Using the RPN on past failures
- The role of the issues log: is it in the EAM?
- Using the PCRs to assign the RPN

Common Pitfalls:
- Trying to get adequate staffing approved for a full RCM program
- Too many processes, analyses, failure classes, etc.
- Digging too deep, not staying with reasonable and likely
- Not achieving proper client buy-in

Ingredients for Success:
- Create a system that all personnel will use to capture issues (Functional Failures)
- Utilize current staffing and concentrate on one phase at a time
- 3 to 5 failure classes at a maximum to start
- Concentrate only on the KBAs (known bad actors)
- Perform road shows: let the client know what you're doing

(727)290-1171 tconrad@edatai.com www.edatai.com