Systems Theoretic Process Analysis (STPA)

Similar documents
Systems Theoretic Process Analysis (STPA)

Basic STPA Tutorial. John Thomas

STPA Systems Theoretic Process Analysis John Thomas and Nancy Leveson. All rights reserved.

Basic STPA Exercises. Dr. John Thomas

STAMP/STPA Beginner Introduction. Dr. John Thomas System Engineering Research Laboratory Massachusetts Institute of Technology

Performing Hazard Analysis on Complex, Software- and Human-Intensive Systems

Introducing STAMP in Road Tunnel Safety

Safety-Critical Systems

Failure Management and Fault Tolerance for Avionics and Automotive Systems

Missing no Interaction Using STPA for Identifying Hazardous Interactions of Automated Driving Systems

STPA: A New Hazard Analysis Technique. (System-Theoretic Process Analysis)

Using STPA in the Design of a new Manned Spacecraft

Three Approaches to Safety Engineering. Civil Aviation Nuclear Power Defense

Modeling and Hazard Analysis Using Stpa

A systematic hazard analysis and management process for the concept design phase of an autonomous vessel.

An STPA Primer. Version 1, August 2013

Every things under control High-Integrity Pressure Protection System (HIPPS)

The Relationship Between Automation Complexity and Operator Error

Go around manoeuvre How to make it safer? Capt. Bertrand de Courville

Safety-Critical Systems. Rikard Land

FLIGHT CREW TRAINING NOTICE

HAZARD ANALYSIS PROCESS FOR AUTONOMOUS VESSELS. AUTHORS: Osiris A. Valdez Banda Aalto University, Department of Applied Mechanics (Marine Technology)

FLIGHT TEST RISK ASSESSMENT THREE FLAGS METHOD

An STPA Tool. Dajiang Suo, John Thomas

Using what we have. Sherman Eagles SoftwareCPR.

Lecture 1 Temporal constraints: source and characterization

Real-Time & Embedded Systems

OBJECTIVE 6: FIELD RADIOLOGICAL MONITORING - AMBIENT RADIATION MONITORING

etcadvancedpilottraining.com Upset Prevention and Recovery Training (UPRT) Altitude Awareness Training Situational Awareness (SA)

Accident Investigation and Hazard Analysis

So it s Reliable but is it Safe? - a More Balanced Approach To ATM Safety Assessment

Flight Dynamics II (Stability) Prof. Nandan Kumar Sinha Department of Aerospace Engineering Indian Institute of Technology, Madras

Safety Risk Assessment Worksheet Title of Risk Assessment Risk Assessment Performed By: Date: Department:

Purpose. Scope. Process flow OPERATING PROCEDURE 07: HAZARD LOG MANAGEMENT

Advanced LOPA Topics

Hazard Identification

User Manual. Heads-Up Display (HUD) DiveCAN. Mechanical Button Version

UNIVERSITY OF WATERLOO

Hazard Identification

Hazard analysis. István Majzik Budapest University of Technology and Economics Dept. of Measurement and Information Systems

FRDS GEN II SIMULATOR WORKBOOK

4. Hazard Analysis. Limitations of Formal Methods. Need for Hazard Analysis. Limitations of Formal Methods

(C) Anton Setzer 2003 (except for pictures) A2. Hazard Analysis

Safety models & accident models

USING HAZOP TO IDENTIFY AND MINIMISE HUMAN ERRORS IN OPERATING PROCESS PLANT

Federal Aviation Administration Safety & Human Factors Analysis of a Wake Vortex Mitigation Display System

Risk Assessment Guidance for QM in Brachytherapy. Conflicts. Learning Objectives. Example Procedure

NAVIGATION ACCIDENTS AND THEIR CAUSES IS SHIPBOARD TECHNOLOGY A HELP OR HINDERANCE? CAPT.CLEANTHIS ORPHANOS MSc HEAD MAIC SERVICE

D-Case Modeling Guide for Target System

SIDRA INTERSECTION 6.1 UPDATE HISTORY

5 Function Indicator. Outside Air Temperature (C) Outside Air Temperature (F) Pressure Altitude Density Altitude Aircraft Voltage STD TEMP SL 15000

Calspan Loss-of-Control Studies Using In-flight Simulation. Lou Knotts, President November 20, 2012

Essential Performance for MED rd ed. PSES San Diego chapter meeting December 11, 2012

Workshop Information IAEA Workshop

Safety-critical systems: Basic definitions

New Airfield Risk Assessment / Categorisation

AN AUTONOMOUS DRIVER MODEL FOR THE OVERTAKING MANEUVER FOR USE IN MICROSCOPIC TRAFFIC SIMULATION

How to Design Medical Accelerator Systems for Reliability: IBA PT System

Series 3730 and Series 3731 EXPERTplus Valve Diagnostics with Partial Stroke Test (PST)

Risk Management Series Article 8: Risk Control

Critical Systems Validation

Identification and Screening of Scenarios for LOPA. Ken First Dow Chemical Company Midland, MI

Electronic Oncology Systems the Management and the Safety Measures. Hansen Chen, Director, TDSI

if all agents follow RSS s interpretation then there will be zero accidents.

Proposed Abstract for the 2011 Texas A&M Instrumentation Symposium for the Process Industries

Surrogate UAV Approach and Landing Testing Improving Flight Test Efficiency and Safety

A Novel Approach to Evaluate Pedestrian Safety at Unsignalized Crossings using Trajectory Data

1.0 PURPOSE 2.0 REFERENCES

Flight Systems Verification & Validation Mars 2020 Entry, Descent, and Landing

Robust Task Execution: Procedural and Model-based. Outline. Desiderata: Robust Task-level Execution

Operator Exposed to Chlorine Gas

A study evaluating if targeted training for startle effect can improve pilot reactions in handling unexpected situations. DR. MICHAEL GILLEN, PH.D.

CHAPTER 7: THE FEEDBACK LOOP

Safety Analysis: Event Classification

PROPERTY INSURANCE ASSOCIATION OF LOUISIANA

Review and Assessment of Engineering Factors

Cycling and risk. Cycle facilities and risk management

The Nitrogen Threat. The simple answer to a serious problem. 1. Why nitrogen is a risky threat to our reactors? 2. Current strategies to deal with it.

Prof. Brian C. Williams February 23 rd, /6.834 Cognitive Robotics. based on [Kim,Williams & Abramson, IJCAI01] Outline

Choosing and Installing Microbiological Safety Cabinets

Developing Startle and Surprise Training Interventions for Airline Training Programs

Development of a Detailed Multi-Body Computational Model of an ATV to Address and Prevent Child Injuries. Chandra K. Thorbole, Ph.D.

Codex Seven HACCP Principles. (Hazard Identification, Risk Assessment & Management)

Control Performance: An Imperative for Safety

Basic Mountain Flying

Failure Modes Events Analysis. Dr Tai Hwei Yee DCQO, National Healthcare Group ACMB ( Clinical Quality & Audit), TTSH

Appendix A: Crosswalk Policy

An atc-induced runway incursion

Surge suppressor To perform its intended functions, an AEI site must have the components listed above and shown in Fig. 4.1.

HYPOXIA IN OPERATION ORIENTED SIMULATION. SAFE EUROPE, ZEIST, APRIL 2017 Wietse Ledegang, MSc.

Risk Assessment. Residual Level of Risk (Considering existing and proposed controls) E, H, M, L or N See Tables 1 3 L

ARKANSAS TECH UNIVERSITY FACILITIES MANAGEMENT HEALTH AND SAFETY MANUAL (LOCKOUT/TAGOUT) 30.0

Policy for Evaluation of Certification Maintenance Requirements

VI.B. Traffic Patterns

Gas Accumulation Potential & Leak Detection when Converting to Gas

PREDICTING HEALTH OF FINAL CONTROL ELEMENT OF SAFETY INSTRUMENTED SYSTEM BY DIGITAL VALVE CONTROLLER

BHATNAGAR. Reducing Delay in V2V-AEB System by Optimizing Messages in the System

Verification Of Calibration for Direct-Reading Portable Gas Monitors

Confidence comes with knowing you are Code-Ready.

FixedWingLib CGF. Realistic CGF Aircraft Entities ware-in-the-loop Simulations

Transcription:

Systems Theoretic Process Analysis (STPA)

Systems approach to safety engineering (STAMP) STAMP Model (Leveson, 2012) Accidents are more than a chain of events, they involve complex dynamic processes. Treat accidents as a control problem, not just a failure problem Prevent accidents by enforcing constraints on component behavior and interactions Captures more causes of accidents: Component failure accidents Unsafe interactions among components Complex human, software behavior Design errors Flawed requirements esp. software-related accidents 2 Copyright John Thomas 2015

STAMP: basic control loop Control Actions Controller Control Algorithm Process Model Feedback Controlled Process Controllers use a process model to determine control actions Accidents often occur when the process model is incorrect A good model of both software and human behavior in accidents Four types of unsafe control actions: 1) Control commands required for safety are not given 2) Unsafe ones are given 3) Potentially safe commands but given too early, too late 4) Control action stops too soon or applied too long (Leveson, 2012) Copyright John Thomas 2015 3

STAMP and STPA STPA Hazard Analysis STAMP Model How do we find inadequate control in a design? Accidents are caused by inadequate control 4 Copyright John Thomas 2015

STPA (System-Theoretic Process Analysis) Identify accidents and hazards STPA Hazard Analysis STAMP Model (Leveson, 2012) Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback Can capture requirements flaws, software errors, human errors Copyright John Thomas 2015 5

Definitions Accident (Loss) An undesired or unplanned event that results in a loss, including loss of human life or human injury, property damage, environmental pollution, mission loss, etc. Hazard A system state or set of conditions that, together with a particular set of worst-case environment conditions, will lead to an accident (loss). Definitions from Engineering a Safer World

Definitions System Accident (Loss) An undesired or unplanned event that results in a loss, including loss of human life or human injury, property damage, environmental pollution, mission loss, etc. May involve environmental factors outside our control System Hazard A system state or set of conditions that, together with a particular set of worst-case environment conditions, will lead to an accident (loss). Something we can control in the design Something we want to prevent System Accident People die from exposure to toxic chemicals System Hazard Toxic chemicals from the plant are in the atmosphere Copyright John Thomas 2015

Definitions System Accident (Loss) An undesired or unplanned event that results in a loss, including loss of human life or human injury, property damage, environmental pollution, mission loss, etc. May involve environmental factors outside our control System Hazard A system state or set of conditions that, together with a particular set of worst-case environment conditions, will lead to an accident (loss). Something we can control in the design Something we want to prevent System Accident People die from exposure to toxic chemicals People die from radiation sickness Vehicle collides with another vehicle People die from food poisoning System Hazard Toxic chemicals from the plant are in the atmosphere Nuclear power plant radioactive materials are not contained Vehicles do not maintain safe distance from each other Food products for sale contain pathogens Copyright John Thomas 2015

System Accident (Loss) Definitions An undesired or unplanned event that results in a loss, including loss of human life or human injury, property damage, environmental pollution, mission loss, etc. May involve environmental factors outside our control System Hazard A system state or set of conditions that, together with a particular set of worst-case environment conditions, will lead to an accident (loss). Something we can control in the design System Accident People die from exposure to toxic chemicals People die from radiation sickness Vehicle collides with another vehicle People die from food poisoning Broad view of safety Accident is anything that is unacceptable, that must be prevented. System Hazard Not limited to loss of life or human injury! Toxic chemicals from the plant are in the atmosphere Nuclear power plant radioactive materials are not contained Vehicles do not maintain safe distance from each other Food products for sale contain pathogens

System Safety Constraints System Hazard Toxic chemicals from the plant are in the atmosphere Nuclear power plant radioactive materials are not contained Vehicles do not maintain safe distance from each other Food products for sale contain pathogens System Safety Constraint Toxic plant chemicals must not be released into the atmosphere Radioactive materials must note be released Vehicles must always maintain safe distances from each other Food products with pathogens must not be sold Additional hazards / constraints can be found in ESW p355 Copyright John Thomas 2015

Proton Radiation Therapy System Paul Scherrer Institute, Switzerland Accidents? Hazards? Copyright John Thomas 2015

Proton Therapy Machine (Antoine) Accidents ACC1. Patient injury or death ACC2. Ineffective treatment ACC3. Loss to non-patient quality of life (esp. personnel) ACC4. Facility or equipment damage Hazards? receive more dose than clinically desirable H-R2. Patient tumor receives less dose than clinically desirable H-R4. Non-patient (esp. personnel) is unnecessarily exposed to radiation H-R5. Equipment is subject to unnecessary stress Antoine PhD Thesis, 2012

Proton Therapy Machine (Antoine) Accidents ACC1. Patient injury or death ACC2. Ineffective treatment ACC3. Loss to non-patient quality of life (esp. personnel) ACC4. Facility or equipment damage Hazards H-R1. Patient tissues receive more dose than clinically desirable H-R2. Patient tumor receives less dose than clinically desirable H-R3. Non-patient (esp. personnel) is unnecessarily exposed to radiation H-R4. Equipment is subject to unnecessary stress Antoine PhD Thesis, 2012

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 14 Copyright John Thomas 2015

Control Structure Examples

Copyright John Thomas 2015 Proton Therapy Machine High-level Control Structure Gantry Cyclotron Beam path and control elements

Proton Therapy Machine High-level Control Structure Antoine PhD Thesis, 2012 Copyright John Thomas 2015

Proton Therapy Machine Control Structure

Proton Therapy Machine Detailed Control Structure

Adaptive Cruise Control Image from: http://www.audi.com/etc/medialib/ngw/efficiency/video_assets/fallback_videos.par.0002.image.jpg

Qi Hommes

Image from: http://www.cbgnetwork.org/2608.html Chemical Plant

ESW p354 Copyright John Thomas 2015 Chemical Plant Image from: http://www.cbgnetwork.org/2608.html

Copyright John Thomas 2015 Congress U.S. pharmaceutical safety control structure FDA Pharmaceutical Companies Doctors Image from: http://www.kleantreatmentcenter.com/wpcontent/uploads/2012/07/vioxx.jpeg Patients

Ballistic Missile Defense System Image from: http://www.mda.mil/global/images/system/aegis/ftm- 21_Missile%201_Bulkhead%20Center14_BN4H0939.jpg Safeware Corporation

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 27 Copyright John Thomas 2015

STPA Step 1: Unsafe Control Actions (UCA) Control Actions Controller Controlled process Feedback 4 ways unsafe control may occur: A control action required for safety is not provided or is not followed An unsafe control action is provided that leads to a hazard A potentially safe control action provided too late, too early, or out of sequence A safe control action is stopped too soon or applied too long (for a continuous or non-discrete control action) Not providing causes hazard Providing causes hazard Incorrect Timing/ Order Stopped Too Soon / Applied too long Control Action (A) (Leveson, 2012)

Copyright John Thomas 2015 Step 1: Identify Unsafe Control Actions (a more rigorous approach, will discuss later) Control Action Process Model Variable 1 Process Model Variable 2 Process Model Variable 3 Hazardous?

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 30 Copyright John Thomas 2015

STPA Step 2: Causal Factors and Scenarios Select an Unsafe Control Action A. Identify what could cause the unsafe control action Develop causal accident scenarios B. Identify how control actions may not be followed or executed properly Develop causal accident scenarios Copyright John Thomas 2015

Step 2A: Potential causes of UCAs Unsafe Control Action Controller Inadequate Procedures (Flaws in creation, process changes, incorrect modification or adaptation) Control input or external information wrong or missing Process Model (inconsistent, incomplete, or incorrect) Missing or wrong communication with another Controller controller Inadequate or missing feedback Feedback Delays Delays, inaccuracies, missing/incorrect behavior Controller Actuator Inadequate operation Conflicting control actions Controlled Process Component failures Sensor Inadequate operation Incorrect or no information provided Measurement inaccuracies Feedback delays Process input missing or wrong Changes over time Unidentified or out-of-range disturbance Process output contributes to system hazard Copyright John Thomas 2015

STPA Step 2: Causal Factors and Scenarios Select an Unsafe Control Action A. Identify what could cause the unsafe control action Develop causal accident scenarios B. Identify how control actions may not be followed or executed properly Develop causal accident scenarios Copyright John Thomas 2015

Step 2B: Potential control actions not followed Control Action Controller Inadequate Procedures (Flaws in creation, process changes, incorrect modification or adaptation) Control input or external information wrong or missing Process Model (inconsistent, incomplete, or incorrect) Missing or wrong communication with another Controller controller Inadequate or missing feedback Feedback Delays Actuator Inadequate operation Sensor Inadequate operation Delays, inaccuracies, missing/incorrect behavior Controller Conflicting control actions Process input missing or wrong Control action provided but not followed Controlled Process Component failures Changes over time Unidentified or out-of-range disturbance Incorrect or no information provided Measurement inaccuracies Feedback delays Process output contributes to system hazard Copyright John Thomas 2015

STPA Step 2: Causal Factors and Scenarios Select an Unsafe Control Action A. Identify what could cause the unsafe control action Develop causal accident scenarios B. Identify how control actions may not be followed or executed properly Develop causal accident scenarios Identify controls and mitigations for the accident scenarios Copyright John Thomas 2015

Example Controls for Causal Scenarios Scenario 1 Operator provides Start Treatment command when there is no patient on the table or patient is not ready. Operator was not in the room when the command was issued, as required by other safety constraints. Operator was expecting patient to have been positioned, but table positioning was delayed compared to plan (e.g. because of delays in patient preparation or patient transfer to treatment area; because of unexpected delays in beam availability or technical issues being processed by other personnel without proper communication with the operator). Controls: Provide operator with direct visual feedback to the gantry coupling point, and require check that patient has been positioned before starting treatment (M1). Provide a physical interlock that prevents beam-on unless table positioned according to plan Antoine PhD Thesis, 2012

Example Controls for Causal Scenarios Scenario 2 Operator provides start treatment command when there is no patient. The operator was asked to turn the beam on outside of a treatment sequence (e.g. because the design team wants to troubleshoot a problem, or for experimental purposes) but inadvertently starts treatment and does not realize that the facility proceeds with reading the treatment plan and records the dose as being administered. Controls: Reduce the likelihood that non-treatment activities have access to treatment-related input by creating a non-treatment mode to be used for QA and experiments, during which facility does not read treatment plans that may have been previously been loaded (M2); Make procedures (including button design if pushing a button is what starts treatment) to start treatment sufficiently different from non-treatment beam on procedures that the confusion is unlikely. Antoine PhD Thesis, 2012

Example Controls for Causal Scenarios Command not followed Scenario 3 The operator provides the Start Treatment command, but it does not execute properly because the proper steering file failed to load (either because operator did not load it, or previous plan was not erased from system memory and overwriting is not possible) or the system uses a previously loaded one by default. Controls: When fraction delivery is completed, the used steering file could for example be automatically dumped out of the system s memory (M4). Antoine PhD Thesis, 2012 Do not allow a Start Treatment command if the steering file does not load properly Provide additional checks to ensure the steering file matches the current patient (e.g. barcode wrist bands, physiological attributes, etc.)

Chemical Reactor Example

Copyright John thomas 2015 Chemical Reactor Design Toxic catalyst flows into reactor Chemical reaction creates heat, pressure Water and condenser provide cooling What are the accidents, system hazards, system safety constraints?

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 41 Copyright John Thomas 2015

Copyright John thomas 2015 Chemical Reactor Design Toxic catalyst flows into reactor Chemical reaction creates heat, pressure Water and condenser provide cooling Create Control Structure

Copyright John thomas 2015 STPA Analysis High-level (simple) Control Structure What are the main parts????

Copyright John thomas 2015 STPA Analysis High-level (simple) Control Structure What commands are sent?? Operator Computer? Valves

Copyright John thomas 2015 STPA Analysis High-level (simple) Control Structure What feedback is received? Start Process Stop Process Operator? Computer Open/close water valve Open/close catalyst valve? Valves

Copyright John thomas 2015 Chemical Reactor Design Control Structure:

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 47 Copyright John Thomas 2015

Control Structure: Copyright John thomas 2015 Chemical Reactor: Unsafe Control Actions Close Water Valve????

Control Structure: Chemical Reactor: Unsafe Control Actions Close Water Valve Not providing causes hazard? Providing causes hazard Computer provides Close Water cmd while catalyst Incorrect Timing/ Order Stopped Too Soon / Applied too long?? Copyright John thomas 2015

Structure of an Unsafe Control Action Example: Computer provides close water valve command when catalyst open Source Controller Type Control Action Context Four parts of an unsafe control action Source Controller: the controller that can provide the control action Type: whether the control action was provided or not provided Control Action: the controller s command that was provided / missing Context: conditions for the hazard to occur (system or environmental state in which command is provided) 50 Copyright John Thomas 2015

Chemical Reactor: Unsafe Control Actions (UCA) Close Water Valve Open Water Valve Open Catalyst Valve Close Catalyst Valve Not providing causes hazard Computer does not open water valve when catalyst open Computer does not close catalyst when water closed Providing causes hazard Computer closes water valve while catalyst open Computer opens catalyst valve when water valve not open Incorrect Timing/ Order Computer closes water valve before catalyst closes Computer opens water valve more than X seconds after open catalyst Computer opens catalyst more than X seconds before open water Computer closes catalyst more than X seconds after close water Stopped Too Soon / Applied too long Computer stops opening water valve before it is fully opened Computer stops closing catalyst before it is fully closed Copyright John Thomas 2015

Safety Constraints Unsafe Control Action Computer does not open water valve when catalyst valve open Computer opens water valve more than X seconds after catalyst valve open Computer closes water valve while catalyst valve open Computer closes water valve before catalyst valve closes Computer opens catalyst valve when water valve not open Etc. Safety Constraint Computer must open water valve whenever catalyst valve is open Computer must open water valve within X seconds of catalyst valve open Computer must not close water valve while catalyst valve open Computer must not close water valve before catalyst valve closes Computer must not open catalyst valve when water valve not open Etc.

Traceability Always provide traceability information between UCAs and the hazards they cause Same for Safety Constraints Two ways: Create one UCA table (or safety constraint list) per hazard, label each table with the hazard Create one UCA table for all hazards, include traceability info at the end of each UCA E.g. Computer closes water valve while catalyst open [H-1]

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 54 Copyright John Thomas 2015

Step 2: Potential causes of UCAs UCA: Computer opens catalyst valve when water valve not open Controller Inadequate Control Algorithm (Flaws in creation, process changes, incorrect modification or adaptation) Control input or external information wrong or missing Process Model (inconsistent, incomplete, or incorrect) Missing or wrong communication with another Controller controller Inadequate or missing feedback Feedback Delays Delays, inaccuracies, missing/incorrect behavior Controller Actuator Inadequate operation Conflicting control actions Controlled Process Component failures Sensor Inadequate operation Incorrect or no information provided Measurement inaccuracies Feedback delays Process input missing or wrong Changes over time Unidentified or out-of-range disturbance Process output contributes to system hazard

Step 2: Potential control actions not followed Computer opens water valve Controller Inadequate Control Algorithm (Flaws in creation, process changes, incorrect modification or adaptation) Control input or external information wrong or missing Process Model (inconsistent, incomplete, or incorrect) Missing or wrong communication with another Controller controller Inadequate or missing feedback Feedback Delays Delays, inaccuracies, missing/incorrect behavior Controller Actuator Inadequate operation Conflicting control actions Controlled Process Component failures Sensor Inadequate operation Incorrect or no information provided Measurement inaccuracies Feedback delays Process input missing or wrong Changes over time Unidentified or out-of-range disturbance Process output contributes to system hazard

Chemical Reactor: Real accident

ITP Exercise a new in-trail procedure for trans-oceanic flights 58

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 59 Copyright John Thomas 2015

System-level Accident (Loss): Aircraft crashes System-level Hazard: Two aircraft violate minimum separation Copyright John Thomas 2015

Aviation Examples System-level Accident (loss) A-1: Two aircraft collide A-2: Aircraft crashes into terrain / ocean System-level Hazards H-1: Two aircraft violate minimum separation H-2: Aircraft enters unsafe atmospheric region H-3: Aircraft enters uncontrolled state H-4: Aircraft enters unsafe attitude H-5: Aircraft enters prohibited area

System Safety Constraints System Hazard H-1: Two aircraft violate minimum separation H-2: Aircraft enters unsafe atmospheric region H-3: Aircraft enters uncontrolled state H-4: Aircraft enters unsafe attitude H-5: Aircraft enters prohibited area System Safety Constraint SC-1:? SC-2:? SC-3:? SC-4:? SC-5:?

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 63 Copyright John Thomas 2015

North Atlantic Tracks

STPA application: NextGen In-Trail Procedure (ITP) Current State Pilots will have separation information Pilots decide when to request a passing maneuver Air Traffic Control approves/denies request Proposed Change

Copyright John Thomas 2015 Draw the Functional Control Structure High-level (simple) Control Structure Main components and controllers????

Copyright John Thomas 2015 Draw the Functional Control Structure High-level (simple) Control Structure Who controls who? Flight Crew? Aircraft? Air Traffic Controller?

Copyright John Thomas 2015 Draw the Functional Control Structure High-level (simple) Control Structure What commands are sent?? Air Traffic Control? Flight Crew?? Aircraft

Copyright John Thomas 2015 Draw the Functional Control Structure High-level (simple) Control Structure Air Traffic Control Issue clearance to pass Feedback? Flight Crew Execute maneuver Feedback? Aircraft

Draw the Functional Control Structure High-level Control Structure Air Traffic Control Issue clearance to pass Request to pass Flight status Flight Crew Execute maneuver A/C status, ITP criteria, etc. Aircraft

More complex control structure Copyright John Thomas 2015

Adding Levels Congress Directives, funding Reports FAA Regulations, procedures Reports ATC Issue clearance Request to pass, Flight status Pilots Execute maneuvers A/C status ITP criteria, etc. Aircraft

Air Traffic Control (ATC) ATC Front Line Manager (FLM) Instructions Status Updates Instructions Status Updates Instructions Status Updates Instructions Company Dispatch Status Updates Instructions ATC Ground Controller Query Status Updates and acknowledgements ATC Radio Other Ground Controllers Execute maneuvers Pilots Pilots Pilots Pilots Aircraft Execute maneuvers Aircraft Execute maneuvers Aircraft Execute maneuvers Aircraft ACARS Text Messages Copyright John Thomas 2015

Pilot Responsibilities and Process Model Responsibilities: Assess whether ITP maneuver is appropriate Check if ITP criteria are met Request ITP Receive ITP approval Recheck criteria Execute flight level change Confirm new flight level to ATC Process Model Own ship climb/descend capability ADS-B data for nearby aircraft (velocity, position, orientation) ITP criteria (speed, distance, relative attitude, similar track, data quality) State of ITP request/approval etc.

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 75 Copyright John Thomas 2015

Identify Unsafe Control Actions Example: Let s start with the pilot Instructions ATC Request to pass Flight status Pilots Execute passing maneuver Aircraft status, position, etc Aircraft Control Action Not providing causes hazard Providing Causes hazard Incorrect Timing/ Order Stopped Too Soon Execute Passing Maneuver

Identify Unsafe Control Actions Instructions Execute passing maneuver ATC Pilots Request Status Aircraft status, position, etc Example UCA: Pilots provide passing maneuver when maneuver is not approved Source Controller Type Control Action Context Aircraft Control Action Not providing causes hazard Providing Causes hazard Incorrect Timing/ Order Stopped Too Soon Execute Passing Maneuver? Pilots perform passing maneuver when it is not approved [H-1]??

Copyright John Thomas 2015 Controller Safety Constraints Unsafe Control Action Pilots execute maneuver when ITP criteria are not satisfied Pilots execute maneuver with incorrect climb rate, final altitude, etc Pilots execute maneuver too soon before approval Pilots execute maneuver too late after reassessment Pilots stop maneuver before reaching designated altitude Pilots continue to climb/descend beyond designated altitude Safety Constraint Pilots must not execute maneuver when ITP criteria are not satisfied Pilots must not execute maneuver with incorrect climb rate, final altitude, etc. Pilots must not begin to execute maneuver before approval Pilots must execute maneuver within X minutes of reassessment Pilots must not stop maneuver before reaching designated altitude (except in emergency temination) Pilots must not climb/descent beyond designated altitude

STPA (System-Theoretic Process Analysis) (Leveson, 2012) Identify accidents and hazards Draw the control structure Step 1: Identify unsafe control actions Step 2: Identify causal factors and create scenarios Control Actions Controller Controlled process Feedback 79 Copyright John Thomas 2015

Step 2: Potential causes of UCAs UCA: Pilots execute maneuver when ITP criteria not met Controller Inadequate Procedures (Flaws in creation, process changes, incorrect modification or adaptation) Control input or external information wrong or missing Process Model (inconsistent, incomplete, or incorrect) Missing or wrong communication with another Controller controller Inadequate or missing feedback Feedback Delays Delays, inaccuracies, missing/incorrect behavior Controller Actuator Inadequate operation Conflicting control actions Controlled Process Component failures Sensor Inadequate operation Incorrect or no information provided Measurement inaccuracies Feedback delays Process input missing or wrong Changes over time Unidentified or out-of-range disturbance Process output contributes to system hazard

STPA Step 2: Causal Factors and Scenarios Select an Unsafe Control Action A. Identify what could cause the unsafe control action Develop causal accident scenarios B. Identify how control actions may not be followed or executed properly Develop causal accident scenarios Copyright John Thomas 2015

Step 2: Potential control actions not followed Pilots execute maneuver Controller Inadequate Procedures (Flaws in creation, process changes, incorrect modification or adaptation) Control input or external information wrong or missing Process Model (inconsistent, incomplete, or incorrect) Missing or wrong communication with another Controller controller Inadequate or missing feedback Feedback Delays Delays, inaccuracies, missing/incorrect behavior Controller Actuator Inadequate operation Conflicting control actions Controlled Process Component failures Sensor Inadequate operation Incorrect or no information provided Measurement inaccuracies Feedback delays Process input missing or wrong Changes over time Unidentified or out-of-range disturbance Process output contributes to system hazard

Additional steps Use causal analysis to identify detailed safety design requirements and design options Iterate top-down Refine into more detailed control structures Refine safety constraints (requirements) into more detailed requirements for each component See examples of these in my presentation tomorrow

Operations and Performance Monitoring Consider how designed controls could degrade over time Use STPA results to build in protection: a) Planned performance audits where assumptions underlying the hazard analysis are the preconditions for the operational audits and controls b) Management of change procedures c) Incident/accident analysis

For more information Google: STPA Primer Written for industry to provide guidance in learning STPA Website: mit.edu/psas Previous MIT STAMP workshop presentations Book Engineering a Safer World by Nancy Leveson Sunnyday.mit.edu Academic STAMP papers, examples