Reliability of Safety-Critical Systems Chapter 4. Testing and Maintenance


Reliability of Safety-Critical Systems Chapter 4. Testing and Maintenance Mary Ann Lundteigen and Marvin Rausand mary.a.lundteigen@ntnu.no RAMS Group Department of Production and Quality Engineering NTNU (Version 1.1 per July 2015) Mary Ann Lundteigen (RAMS Group) Reliability of Safety-Critical Systems (Version 1.1) 1 / 26

Slides related to the book
Reliability of Safety-Critical Systems: Theory and Applications, Marvin Rausand, Wiley, 2014.
Homepage of the book: http://www.ntnu.edu/ross/books/sis

Learning objectives
The main purpose of this presentation is to:
- Introduce different types of testing of safety instrumented systems (SIS)
- Relate testing to the reliability of a SIS
- Discuss some underlying assumptions associated with repair of failures
Testing and repair are part of the maintenance of a SIS. The main focus here is on the operational phase.

Definition of testing and maintenance
Suitable definitions for use in relation to a SIS are:
Testing: Execution of a function of a system, subsystem, or channel to confirm that the function can be performed according to the requirements stated in the safety requirement specification (SRS).
Maintenance: The combination of all technical and administrative actions, including supervision actions, intended to retain an item in, or restore it to, a state in which it can perform a required function (IEC 191-07-01).

Why are testing and maintenance important?
Testing and maintenance are important in the context of SIS reliability because:
- Some SISs are dormant (passive) during normal operation, and their ability to function must be checked from time to time.
- The state of each redundant channel is not necessarily confirmed when the SIS responds to a demand.
- We often strive for constant failure rates, meaning that the SIS components should always be in their useful life period (and be replaced before entering the wear-out phase).

Testing and maintenance procedures
Testing and maintenance procedures:
- Describe how and when to perform the tests and carry out the maintenance.
- Give guidance on how to report different types of failures, so that they are linked to the correct lists for further analysis and treatment.
Procedures often refer to the equipment safety (or operation) manual, which gives more specific details about methods, constraints, and tools.

Several types of tests
In relation to a SIS, we may distinguish between the following types of tests in the operational phase:
- Proof tests
- Function tests
- Diagnostic tests
- Partial proof tests
- Staggered tests
The tests are discussed in more detail in the following.

Proof test
Proof test: A carefully planned periodic test, which is designed to reveal all DU faults of each channel in a safety loop.
A proof test is carried out according to a specified proof test interval, determined on the basis of regulatory requirements, the manufacturer's recommendations, and reliability analyses.

Proof test
The main purpose of a proof test is to reveal all DU failures, but this is not always possible. We may therefore distinguish between:
Full proof test: A proof test that is planned to reveal all DU failures. This is an ideal, but often unrealistic, situation. Nevertheless, we often assume that a proof test is full (perfect) in reliability assessments.
Imperfect/partial proof test: A proof test where not all DU failures are revealed, either due to planned restrictions in the test scope (partial test), or due to constraints imposed by the test conditions and/or inadequacies in how the test is carried out (imperfect test).

Proof test process
[Figure: flow diagram of the proof test process. Labels: Proof test; No DU fault revealed; State = Unknown; DU fault revealed; State = DU fault; Repair; State = As-good-as-new.]

Practical implementation
A proof test is often split into sub-tests for practical reasons. For example, at regular intervals there may be campaigns to test all pressure transmitters at the facility, or to function test all emergency shutdown valves to check closure and response times.
A proof test procedure gives the specific details about the test to be carried out, including:
- Preparation and isolation necessary to carry out the test
- Functionality to be tested (for example alarm functions, individual signals, voting, and response to specified fault conditions, such as power failure)
- Tools and equipment needed for the test
- Restoration of the item, to ensure that it is properly put back into operation

Proof test and its effects on operation
Many safety functions will, when executed, stop the production, either directly by closing valves, or indirectly by isolating power to equipment.
- Testing of input elements (sensors), including their tie-in to the logic solver, may be carried out without disturbing the operation, as long as redundant input elements are available or compensating measures are implemented.
- Testing of final elements (valves, breakers) is more challenging, as exercising the complete functionality may interfere directly with the operation.
- Temporary stops of production may cause problems, both operationally and safety-wise.
This often motivates extending the intervals between proof tests to avoid these operational challenges, as long as safety is not compromised.

Imperfect proof test
There are many reasons why a proof test is not full (or perfect):
- The test procedure may be deficient, or the test tool inadequate, leaving failures unrevealed.
- It is not safe or practical to test under the same conditions as real demands. Who would like to test fire detectors on board an offshore facility by initiating a fire?
- Faults may be introduced in a proof test, due to the stresses involved, but also due to errors made while executing the test, the repair, or the restoration.
The imperfectness of a proof test may be modeled by the proof test coverage.

Proof test coverage
Proof test coverage (PTC): The conditional probability that a DU fault will be revealed in a proof test, given that the fault is present when the proof test is initiated.
Some attributes of the PTC:
- The PTC lies between 0% and 100%.
- A full (perfect) proof test has a PTC of 100%.
- An imperfect proof test has a PTC < 100%.
- Manufacturers may suggest a PTC for their equipment, but this value may need to be verified against the actual test conditions.

Mean test time and mean repair time
Mean test time (MTT): The mean time spent performing the proof test. It covers the time from the point where the item becomes unable to carry out its function due to the test until the point where it has been put into operation again.
Mean repair time (MRT): The mean time required to repair a failure found during a proof test. It covers the time from the point where the fault was revealed until the item has been repaired and restored to a functioning state.
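As a rough illustration of how MTT and MRT can enter a reliability assessment, the sketch below estimates the fraction of time a single channel is unavailable because of testing and repair. It rests on assumptions that are not stated on the slides: the channel cannot perform its function during the entire test and the entire repair, DU faults occur at a constant rate `lambda_du`, and `lambda_du * tau` is small. The function name, the parameter names, and the example numbers are hypothetical.

```python
def downtime_unavailability(lambda_du, tau, mtt, mrt):
    """Approximate unavailability contributions from testing and repair.

    lambda_du: assumed constant DU failure rate (per hour)
    tau:       proof test interval (hours)
    mtt:       mean test time (hours)
    mrt:       mean repair time (hours)
    Returns (test_part, repair_part) as fractions of time.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    # One test of mean duration mtt in every interval of length tau.
    test_part = mtt / tau
    # About lambda_du * tau faults are found per interval, each costing mrt
    # hours of repair; dividing by tau gives the fraction of time in repair.
    repair_part = lambda_du * mrt
    return test_part, repair_part


# Hypothetical example: yearly proof test, 2 h test time, 8 h repair time.
test_part, repair_part = downtime_unavailability(2e-6, 8760.0, 2.0, 8.0)
```

With these illustrative numbers the test downtime contributes more than the repair downtime, which is one reason the test duration is tracked separately from the repair time.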

Function test
The terms function test and functional test are often used with the same meaning as a proof test. Yet, some make the point that a function test can in some cases be a test of the overall functionality of the SIF, in which the state of each redundant channel is not confirmed.
Example: In the oil and gas industry, the same shutdown valve may get a signal from both the emergency shutdown system (ESD) and the process shutdown system (PSD) via separate solenoid valves. It is not always straightforward to determine whether both solenoid valves responded, or just one of them. Sometimes, a delay is implemented in the PSD logic to separate the signals, but this is not always a welcome solution.

Partial proof test
Partial proof test: A carefully planned periodic test, which is designed to reveal some specific DU faults without any significant disturbance of the EUC.
In relation to this type of test, we may introduce:
- Partial proof test coverage, which is the percentage (%) of DU faults that can be revealed by a partial proof test.
- Partial proof test interval, which is the planned time between partial proof tests.
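To see how the partial proof test coverage and interval might be used, the sketch below applies a commonly used approximation for the average probability of failure on demand (PFD) of a single channel: the DU faults covered by the partial test are treated as revealed at the partial proof test interval, and the rest at the full proof test interval. This formula is not given on the slides. It assumes a constant DU failure rate, a perfect full proof test, and small values of rate times interval. All names and numbers are hypothetical.

```python
def pfd_avg_with_partial_test(lambda_du, coverage, tau_partial, tau_full):
    """Approximate average PFD of one channel with partial proof testing.

    lambda_du:   assumed constant DU failure rate (per hour)
    coverage:    partial proof test coverage as a fraction in [0, 1]
    tau_partial: partial proof test interval (hours)
    tau_full:    full proof test interval (hours)
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    # DU faults that the partial test can reveal.
    revealed = coverage * lambda_du * tau_partial / 2.0
    # DU faults that are only revealed by the full proof test.
    remaining = (1.0 - coverage) * lambda_du * tau_full / 2.0
    return revealed + remaining


# Hypothetical example: quarterly partial tests, yearly full proof test.
pfd_partial = pfd_avg_with_partial_test(2e-6, 0.6, 2190.0, 8760.0)
pfd_no_partial = pfd_avg_with_partial_test(2e-6, 0.0, 2190.0, 8760.0)
```

Setting `coverage` to zero recovers the familiar single-channel approximation `lambda_du * tau_full / 2`, so the two example values show the effect of adding the partial test under these assumptions.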

Imperfect versus partial proof test
In the textbook, a distinction is made between an imperfect proof test and a partial proof test:
- An imperfect proof test relates to inadequacies in how the test is carried out, or to differences between the real demand conditions and the test conditions. These inadequacies and differences may or may not be planned for.
- A partial proof test relates to a planned proof test with a constrained scope for what types of DU faults to reveal, to ensure that the EUC is not subject to significant disturbances.
Consequently, we may talk about imperfect tests in relation to full proof tests as well as partial proof tests. The tests are discussed in more detail in the following.

Imperfect versus partial proof test
[Figure: classification tree]
Proof test
  Full: perfect or imperfect
  Partial: perfect or imperfect

Diagnostic tests
A certain fraction of dangerous and safe failures may be detected by diagnostic tests.
Diagnostic test: An automatic partial test that uses built-in self-test features to detect faults.
In response to a detected safe or dangerous (DD) fault, the item or system may do either or both of the following:
- Raise an alarm in the control room, to notify operators about the fault and the need for corrective actions
- Initiate an immediate action to bring the EUC to a safe state

Diagnostic test coverage
The fraction of dangerous and safe failures that is detected by diagnostic tests is often referred to as the diagnostic coverage (DC). Here, we focus on the DC of dangerous failures, DC_D.
DC_D: The conditional probability that a dangerous fault is detected by the diagnostic test, given that a dangerous (D) fault is present when the diagnostic test is initiated.
The DC_D may be calculated as:
$$\mathrm{DC}_D = \frac{\lambda_{DD}}{\lambda_D} = \frac{\lambda_{DD}}{\lambda_{DU} + \lambda_{DD}}$$
where \lambda_{DD} and \lambda_{DU} are the rates of dangerous detected and dangerous undetected failures, respectively.
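The ratio of the dangerous detected failure rate to the total dangerous failure rate can be computed directly. The following is a minimal sketch; the function name and the example failure rates are hypothetical and not taken from the slides.

```python
def diagnostic_coverage(lambda_dd, lambda_du):
    """Return DC_D = lambda_DD / (lambda_DU + lambda_DD).

    lambda_dd: rate of dangerous detected (DD) failures
    lambda_du: rate of dangerous undetected (DU) failures
    Both rates must use the same unit, for example per hour.
    """
    if lambda_dd < 0 or lambda_du < 0:
        raise ValueError("failure rates must be non-negative")
    lambda_d = lambda_dd + lambda_du  # total dangerous failure rate
    if lambda_d == 0:
        raise ValueError("total dangerous failure rate must be positive")
    return lambda_dd / lambda_d


# Hypothetical example: 9e-7 DD and 1e-7 DU failures per hour.
dc_d = diagnostic_coverage(9e-7, 1e-7)  # approximately 0.9
```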

DC categories
IEC 61508-2 classifies DC into four categories:

Category   DC interval
1          0%-60%
2          60%-90%
3          90%-99%
4          ≥ 99%
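A small lookup can map a DC value to one of the four categories listed for IEC 61508-2. In this sketch the lower bound of each interval is treated as inclusive (for example, exactly 60% falls in category 2); that boundary handling is an assumption, since the listed intervals share their endpoints. The function name is hypothetical.

```python
def dc_category(dc):
    """Map a diagnostic coverage in [0, 1] to category 1-4.

    Assumption: lower interval bounds are inclusive, upper bounds exclusive.
    """
    if not 0.0 <= dc <= 1.0:
        raise ValueError("dc must be between 0 and 1")
    if dc < 0.60:
        return 1
    if dc < 0.90:
        return 2
    if dc < 0.99:
        return 3
    return 4


# Hypothetical example values.
categories = [dc_category(x) for x in (0.50, 0.60, 0.95, 0.99)]
```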

Related terms
In relation to diagnostic tests, it is useful to introduce the following terms:
- Diagnostic test frequency, which describes how regularly the diagnostic test is executed.
  - Diagnostic tests may run more or less frequently, and IEC 61508 specifies that the time for detection plus restoration must be less than the process safety time for the test to be credited in, e.g., the calculation of the safe failure fraction (SFF).
  - An indication of what is regarded as a suitable diagnostic test interval is given in the methodology for determining the beta factor in IEC 61508-6. There, the credit for diagnostic tests is reduced if the diagnostic test interval is greater than 2 days, and increased if it is less than 2 hours.
- Mean time to restoration (MTTR), which is the time to detect the fault plus the time needed to repair and fully restore the item to a functioning state.

Staggered tests
- It is often assumed that redundant channels are tested simultaneously. In practice, the tests are often sequential, but the time between each test is negligible.
- An alternative to simultaneous testing of redundant channels is staggered testing.
Staggered test: A test where two redundant channels are tested with the same proof test interval, but at different points in time.
- The main purpose of staggered testing is to reduce dependency due to the test itself.
- Some find staggered testing to be a strategy of theoretical interest, rather than a practical testing strategy.

Pros and cons of different testing strategies

Simultaneous test
  Pros: Time efficient, as resources have already been allocated
  Cons: Unprotected while the test is ongoing
Sequential test
  Pros: Partial protection
  Cons: More time consuming
Staggered test
  Pros: Partial protection. Improved reliability (in theory, due to reduced dependency from the test itself)
  Cons: More time consuming, and difficult to manage with respect to planning and scheduling
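The scheduling difference between simultaneous and staggered testing can be made concrete with a short sketch that lists proof test times for two redundant channels. Offsetting the second channel by half the proof test interval is only one possible choice and is an assumption here, as are the function name and the example numbers.

```python
def proof_test_times(tau, horizon, first_test):
    """List proof test times up to and including horizon.

    tau:        proof test interval (hours)
    horizon:    end of the planning period (hours)
    first_test: time of the first proof test (hours)
    """
    if tau <= 0 or first_test < 0:
        raise ValueError("tau must be positive and first_test non-negative")
    times = []
    t = first_test
    while t <= horizon:
        times.append(t)
        t += tau
    return times


# Hypothetical example: yearly proof tests over a two-year period.
tau = 8760
channel_a = proof_test_times(tau, 2 * tau, first_test=tau)
# Simultaneous testing: channel B is tested at the same times as channel A.
channel_b_simultaneous = proof_test_times(tau, 2 * tau, first_test=tau)
# Staggered testing: same interval, shifted by half an interval (assumed).
channel_b_staggered = proof_test_times(tau, 2 * tau, first_test=tau // 2)
```

In the staggered schedule some channel is tested every half interval, which also shows why planning and scheduling become harder to manage.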

Classification of tests
Manual versus automatic tests:
- Proof tests (full, partial, and staggered) are often manual or manually initiated
- Diagnostic tests are usually automatic
Online versus offline tests:
- Diagnostic tests are often online
- Proof tests may be online or offline tests