Real-Time & Embedded Systems

Agenda
- Safety Critical Systems
- Project 6, continued

Safety Critical Systems
- "Safe enough looks different at 35,000 feet." (Bruce Powell Douglass)
- "The Air Force has a perfect operating record: everything we put in the air has come back down." (Unknown)

Ubiquity of Control Systems
- Electro-mechanical devices are migrating to software-driven systems:
  - Automobiles
  - Planes
  - Home appliances
  - Medical equipment
  - Nuclear power plants

Software Failures
- Therac-25
  - Radiation therapy device
  - Software-driven
  - Bugs allowed massive radiation overdoses
  - Killed 3 people, contributed to the death of a fourth

Software Failures
- Patriot missiles
  - Clock drift reduced their effectiveness from 95% to 13%
  - Allowed a Scud missile through the defense perimeter
  - Killed 29, injured 97
- Aegis tracking system
  - Failure contributed to the shooting down of an Iranian airline flight
  - 290 lives lost

Software Failures
- 8080-based factory control software
  - Mistakenly stacked large boulders 80 feet high
  - Crushed cars and damaged a building
- Robotics
  - Stray EM interference blamed for 19 deaths
- Cardiac pacemakers
  - Reprogrammed by low-energy radiation
  - Caused several deaths

Software Failures
- Medical database software
  - Incorrectly informed a woman that she had incurable syphilis and had passed it on to her children
  - She strangled one child, then attempted to kill another and herself
- Sunlight filtering software
  - Failed to remove false missile detections caused by sunlight reflecting off clouds
  - A Soviet commander averted nuclear war based on "a funny feeling in my gut"

Terms
- Reliability: the measure of up-time, or availability, of a system
  - The probability that a task will complete before the system fails
  - Measured in Mean Time Between Failures (MTBF)
- Security: permitting access only to authorized and authenticated persons or systems
- Safety: the system does not incur too much risk to persons or property
- Risk: the chance that something bad will happen
- Common-mode failure: a single failure results in the failure of multiple control paths
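
The reliability definition above can be made concrete with a small calculation. This is only a sketch: it assumes a constant failure rate (an exponential failure model, which the slides do not state), and the function names and the mean-time-to-repair parameter are introduced here for illustration.

```python
import math

def task_success_probability(task_hours: float, mtbf_hours: float) -> float:
    """Probability that a task finishes before the first failure.

    Assumes a constant failure rate of 1 / MTBF (exponential model).
    """
    return math.exp(-task_hours / mtbf_hours)

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time the system is up.

    mttr_hours is the mean time to repair, an extra assumption
    not mentioned on the slide.
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical numbers: a 10-hour task on a system with a 1000-hour MTBF.
print(round(task_success_probability(10, 1000), 4))
print(round(availability(1000, 10), 4))
```

Under this model a longer task or a shorter MTBF both lower the chance of finishing without a failure, which matches the slide's "task will complete before the system fails" wording.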

Fundamental Hazards
- Release of energy
- Release of toxins
- Interference with life-support functions
- Supplying misleading information to safety personnel or control systems
- Failure to alarm when hazardous conditions arise
- Failure to limit or act when unwanted events occur, inputs are flawed, or outputs are outside correct levels

System Issues
- Safety is a system issue
- Multiple solutions may address a concern:
  - Interlocks
  - Redundant hardware
  - Redundant software
- The interaction of the components determines the safety of the system

Software Failures
- Software does not fail
- Failures represent a change in the capability of the system:
  - Broken switch
  - Failed component
  - Bad sensor
- If software does something wrong, it does it every time!
- Software may respond poorly to failures

Single-point Failures
- A device is considered safe if a single failure in the system does not result in an unsafe condition
- Single-point assessment tree [figure not reproduced in the transcription]

Fail-Safe State
- A condition that a safety-critical system must attain when it encounters an unrecoverable fault:
  - Emergency stop
  - Partial shutdown
  - Hold
  - Manual control
  - Restart
- Driven by the needs of the problem domain

Fail-Safe States
- An airliner jet engine fails?
- Unmanned space vehicle launch?
- Attended medical devices?
- Hazardous area robotics?
- Unmanned aircraft control failure?
- Cruise ship rudder failure?

Achieving Safety
- Separation of safety channels from non-safety channels
  - Firewall pattern
  - Any component failure in the channel fails the entire channel
  - Isolation of safety systems from non-safety systems is common and justifiable
- Redundancy
  - Small or large scale
  - Homogeneous or diverse

Achieving Safety
- Homogeneous
  - Channels are replicated verbatim
  - Detects only faults, not errors
  - Inexpensive
- Diverse
  - A different channel is implemented
  - Detects faults and errors
  - More expensive
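
A small sketch of why verbatim replication cannot catch a design mistake. It assumes the common reading of the slide's terms, in which a fault is a run-time failure and an error is a design or coding mistake; the function names and the seeded bug are hypothetical.

```python
def mean_a(samples):
    """Channel A: contains a deliberately seeded design error (integer division)."""
    return sum(samples) // len(samples)

def mean_b(samples):
    """Channel B: an independently written implementation of the same task."""
    total = 0.0
    for sample in samples:
        total += sample
    return total / len(samples)

def channels_agree(results, tolerance=1e-9):
    """True if every channel produced (nearly) the same answer."""
    return max(results) - min(results) <= tolerance

samples = [1, 2, 4]

# Homogeneous: three copies of channel A all repeat the same mistake.
homogeneous = [mean_a(samples) for _ in range(3)]

# Diverse: channel B disagrees with channel A, which exposes the error.
diverse = [mean_a(samples), mean_b(samples)]

print(channels_agree(homogeneous), channels_agree(diverse))
```

The homogeneous copies agree with each other while all being wrong, so comparison alone reveals nothing; the diverse pair disagrees and flags the problem.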

Achieving Safety
- Diverse redundancy is stronger
  - Protects against systemic faults and errors
- Data corruption detection
  - Parity bit
  - Hamming codes (parity bits)
  - Checksums
  - CRCs
  - Redundant storage
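
Two of the corruption-detection techniques listed above, sketched with Python's standard library. The payload and helper names are hypothetical; `zlib.crc32` is the standard-library CRC-32 routine.

```python
import zlib

def even_parity_bit(data: bytes) -> int:
    """Return the extra bit that makes the total count of 1 bits even."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2

def is_intact(data: bytes, stored_crc: int) -> bool:
    """Recompute the CRC and compare it with the value stored alongside the data."""
    return zlib.crc32(data) == stored_crc

payload = b"dose=120"               # hypothetical safety-relevant record
stored_crc = zlib.crc32(payload)    # saved when the record is written
stored_parity = even_parity_bit(payload)

# Simulate corruption by flipping a single bit in the first byte.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]

print(is_intact(payload, stored_crc), is_intact(corrupted, stored_crc))
print(even_parity_bit(corrupted) != stored_parity)
```

A single parity bit catches any odd number of flipped bits but misses an even number, which is one reason CRCs are usually preferred for larger records.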

Achieving Safety
- Reasonableness checks
  - A second algorithm validates the results of the first
  - Usually much simpler
- Feedback error detection
  - Identify potential fault conditions
  - May cause a fail-safe transition
- Feedback error correction
  - Identify and correct potential fault conditions
  - Attempts to keep the system operating, and may reduce capability
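
A minimal sketch of a reasonableness check, using a square root as a stand-in for the primary algorithm. The example is illustrative only: the function names are hypothetical, and it assumes non-negative inputs.

```python
def primary_sqrt(x: float) -> float:
    """Primary algorithm: square root by Newton's method (x must be >= 0)."""
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0
    for _ in range(60):
        guess = 0.5 * (guess + x / guess)
    return guess

def is_reasonable(x: float, result: float, tolerance: float = 1e-6) -> bool:
    """Second, much simpler algorithm: squaring the answer should give x back."""
    return result >= 0 and abs(result * result - x) <= tolerance * max(1.0, x)

def checked_sqrt(x: float) -> float:
    """Run the primary algorithm, then refuse to use an implausible result."""
    result = primary_sqrt(x)
    if not is_reasonable(x, result):
        raise RuntimeError("reasonableness check failed; enter fail-safe state")
    return result

print(checked_sqrt(2.0))
```

The check does not need to know how the primary algorithm works, only what a valid answer looks like, which is why it can be so much simpler.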

Safety Architectures
- Single-Channel Protected Design
  - A single flow of control
  - A break in the channel induces a failure
  - Safeguards are added to ensure correct fail-safe behavior
  - A single point of failure
- Multi-Channel Voting Pattern
  - An odd number of redundant channels
  - Each channel votes on the task
  - Majority rules
  - Homogeneous or diverse
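
The voting step of the multi-channel pattern might look like the following sketch. The function name and the choice to return `None` when no strict majority exists are assumptions made for illustration.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value chosen by a strict majority of channels, else None."""
    if len(outputs) % 2 == 0:
        raise ValueError("use an odd number of channels")
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None

# Three hypothetical channels; one has failed and reports a different value.
print(majority_vote(["open", "open", "closed"]))

# No two channels agree, so the voter cannot pick a winner.
print(majority_vote([1, 2, 3]))
```

A caller would typically treat `None` as a reason to move to the fail-safe state rather than guess.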

Safety Architectures
- Homogeneous Redundancy Pattern
  - Identical channels run in parallel
  - If there is an odd number of channels, the majority channels detect and correct the minority channels
  - Must be fully redundant
  - Inexpensive to implement
  - Detects only faults, not errors
  - May be expensive due to redundant hardware

Safety Architectures
- Diverse Redundancy Pattern
  - Redundant, but uniquely implemented channels
  - Different but equal
  - Lightweight redundancy
  - Separation of monitoring and actuation

Safety Architectures
- Watchdog Pattern
  - A secondary process monitors the primary process
  - Primary process periodically feeds the secondary process
  - Secondary process can alarm or restart should the primary process fail
  - May include a periodic test suite
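
A minimal watchdog sketch. Time is passed in explicitly so the example stays deterministic; a real system would use a hardware timer or a monotonic clock. The class, method names, and timeout value are hypothetical.

```python
class Watchdog:
    """Secondary process: trips if the primary stops feeding it in time."""

    def __init__(self, timeout_s, on_timeout):
        self.timeout_s = timeout_s
        self.on_timeout = on_timeout   # e.g. raise an alarm or restart the primary
        self.last_feed = 0.0
        self.tripped = False

    def feed(self, now):
        """Called periodically by the primary process to show it is alive."""
        self.last_feed = now

    def check(self, now):
        """Called by the secondary process; fires the handler once on timeout."""
        if not self.tripped and now - self.last_feed > self.timeout_s:
            self.tripped = True
            self.on_timeout()
        return self.tripped

events = []
watchdog = Watchdog(timeout_s=0.5, on_timeout=lambda: events.append("restart"))

watchdog.feed(now=0.0)
print(watchdog.check(now=0.4))   # primary fed recently enough
print(watchdog.check(now=1.0))   # primary went silent past the timeout
print(events)
```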

Safety Architectures
- Safety Executive Pattern
  - A centralized coordinator for monitoring safety
  - A really smart watchdog
  - Watchdog timeouts
  - Software error assertions
  - Continuous or periodic built-in tests
  - Faults identified by monitors

Safety Architectures
- Monitor-Actuator Pattern
  - Separation of algorithms
  - Actuation performs the actions
  - Monitoring tracks the actions
  - Additional cost and complexity
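
A sketch of the monitor-actuator split, assuming the monitor has its own independent measurement of the actuator's effect. All class names, the tolerance, and the shutdown behavior are hypothetical.

```python
class Actuator:
    """Actuation channel: performs the commanded action."""

    def __init__(self):
        self.position = 0.0
        self.enabled = True

    def move_to(self, target):
        if self.enabled:
            self.position = target

    def shutdown(self):
        """Fail-safe action requested by the monitor."""
        self.enabled = False


class Monitor:
    """Monitoring channel: compares the command with an independent measurement."""

    def __init__(self, actuator, tolerance):
        self.actuator = actuator
        self.tolerance = tolerance

    def check(self, commanded, measured):
        if abs(commanded - measured) > self.tolerance:
            self.actuator.shutdown()
            return False
        return True


actuator = Actuator()
monitor = Monitor(actuator, tolerance=0.5)

actuator.move_to(10.0)
print(monitor.check(commanded=10.0, measured=10.2))  # within tolerance
print(monitor.check(commanded=10.0, measured=12.0))  # mismatch: shut down
print(actuator.enabled)
```

Keeping the two classes free of shared logic is the point of the pattern: a defect in the actuation code should not also blind the monitor.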

Eight Steps to Safety
1. Identify the hazards
2. Determine the risks
3. Define the safety measures
4. Create safe requirements
5. Create safe designs
6. Implement safety
7. Assure the safety process
8. Test, test, test (Peer Reviews!)

Identify the Hazards
- Identify the hazard
- Determine the level of risk
- Determine the tolerance time
- Determine the source of the hazard:
  - The fault leading to the hazard
  - The likelihood of the fault
  - The fault detection time
- Determine how the hazard is handled:
  - The means of control
  - The fault reaction (exposure time)
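
The three times listed above are usually compared against each other. One common reading, assumed here rather than stated on the slide, is that a hazard is handled in time only if detection time plus reaction time is shorter than the tolerance time. The function name and the numbers are hypothetical.

```python
def handled_in_time(tolerance_ms, detection_ms, reaction_ms):
    """True if the fault is detected and handled before the hazard causes harm."""
    return detection_ms + reaction_ms < tolerance_ms

# Hypothetical numbers for illustration only.
print(handled_in_time(tolerance_ms=5000, detection_ms=100, reaction_ms=400))
print(handled_in_time(tolerance_ms=300, detection_ms=100, reaction_ms=400))
```

When the check fails, the options are a faster detection mechanism, a faster reaction, or a design change that lengthens the tolerance time.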

Identify the Hazards
- Patient ventilator example [table not reproduced in the transcription]

Fault Analysis
- Fault-tree analysis (FTA)
  - Identify the hazards
  - Work backward from the hazard to identify the causal conditions
  - Diagram with a Boolean flow chart
  - UML activity diagram
- Failure mode and effects analysis (FMEA)
  - Identify potential faults
  - Work forward to the consequences
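
A fault tree is a Boolean expression over basic fault events, so it can be sketched directly in code. The tree below is hypothetical and is not taken from the lecture; the helper names `basic`, `AND`, and `OR` are introduced for illustration.

```python
def basic(name):
    """Leaf node: true when the named basic fault is currently active."""
    return lambda active: name in active

def AND(*children):
    """Gate that fires only when every child condition holds."""
    return lambda active: all(child(active) for child in children)

def OR(*children):
    """Gate that fires when at least one child condition holds."""
    return lambda active: any(child(active) for child in children)

# Hypothetical tree, read backward from the hazard at the top:
# the hazard occurs if the controller over-commands the output
# AND the protection fails (sensor stuck OR alarm disabled).
hazard = AND(
    basic("controller_overcommand"),
    OR(basic("sensor_stuck"), basic("alarm_disabled")),
)

print(hazard({"controller_overcommand"}))
print(hazard({"controller_overcommand", "sensor_stuck"}))
```

Evaluating the tree for each single fault on its own is a quick way to look for single-point failures: any lone fault that makes the hazard true deserves attention.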

Determine the Risks
- FDA levels of concern
  - Minor: not expected to result in injury or death
  - Moderate: results in minor to moderate injury
  - Major: results in major injury or death
- German TÜV characterization
  - (S) Severity of the risk
  - (E) Duration of the period of exposure
  - (G) Prevention of the danger
  - (W) Probability of occurrence

Determine the Risks
- German TÜV characterization [figure not reproduced in the transcription]

Determine the Risks
- German TÜV example [figure not reproduced in the transcription]

Define the Safety Measures
- Obviation: make the hazard physically impossible
- Education: user training
- Alarming: announce the hazard so action can be taken
- Interlocks: the hazard is removed by a secondary device or logic that intercedes
- Internal checking: the system detects and handles the malfunction prior to an incident
- Safety equipment: goggles, gloves, etc.
- Restriction of access: access to potential hazards is restricted to trained personnel
- Labeling: "High Voltage, do not touch"

Create Safe Requirements
- Consider the requirements from a safety perspective
- Specify the negations
  - "The system shall not move hardware before user input"
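
A negative requirement like the one quoted above can be enforced in code and checked by a test that tries to violate it. This is a sketch only; the controller class and its methods are hypothetical.

```python
class MotionController:
    """Hypothetical controller that refuses to move before user input."""

    def __init__(self):
        self.user_input_received = False
        self.position = 0

    def accept_user_input(self):
        self.user_input_received = True

    def move(self, steps):
        # Enforce the negation: no movement before user input.
        if not self.user_input_received:
            raise PermissionError("movement blocked: no user input received")
        self.position += steps


def test_shall_not_move_before_user_input():
    """Try to break the requirement and confirm the hardware stays put."""
    controller = MotionController()
    try:
        controller.move(5)
    except PermissionError:
        pass
    else:
        raise AssertionError("controller moved before user input")
    assert controller.position == 0


test_shall_not_move_before_user_input()
print("negative requirement holds")
```

Writing the requirement as a negation makes the test obvious: attempt the forbidden action and assert that nothing happened.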

Create Safe Designs
- Work from safe requirements
- Adopt a safe architecture
- Revisit and revise the hazard analysis during development
- Select measures that provide appropriate levels of detection and correction
- Ensure independent channels lack common-mode failures
- Adopt consistent strategies for handling faults
- Include POST and periodic run-time tests

Implementing Safety
- Language choice
  - Strong compile-time checking
  - Strong run-time checking
  - Support for encapsulation and abstraction (but not "just because")
  - Exception handling
- Safe language constructs
  - void*?

Assure the Safety Process
- Continuously track against the hazard analysis
- Utilize peer reviews to assure quality
- Verify design adherence
- Verify coding standards
- Identify how each hazard is handled

Test, test, test
- Black box testing
- White box testing
- Monkey testing
- Fault seeding
- Load testing
- Simulations
- System testing
- Unit testing
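
Of the techniques listed, fault seeding lends itself to a short worked calculation. The sketch below assumes the classic seeding estimate, in which the share of deliberately planted faults that the tests catch is taken as the share of real faults caught; the slide does not give a formula, and the function name and numbers are hypothetical.

```python
def seeded_fault_report(seeded, seeded_found, real_found):
    """Return (detection_ratio, estimated_total_real_faults).

    Assumes tests find real faults at the same rate as seeded ones.
    """
    if seeded_found == 0:
        raise ValueError("no seeded faults found; no estimate is possible")
    ratio = seeded_found / seeded
    return ratio, real_found / ratio

# Hypothetical campaign: 20 faults planted, 15 caught, 30 real faults found.
ratio, estimated_total = seeded_fault_report(seeded=20, seeded_found=15, real_found=30)
print(ratio, estimated_total)
```

The gap between the estimated total and the real faults already found gives a rough sense of how many defects the test suite may still be missing.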