How Many Iterations Are Enough?

Tecolote Research, Inc. Bridging Engineering and Economics Since 1973
How Many Iterations Are Enough?
Alfred Smith
2008 Joint SCEA/ISPA Annual Conference and Training Workshop, Pacific Palms Conference Resort, Industry Hills, CA, June 2008

Outline
- Setting the Stage
  - Describe a cost uncertainty simulation model
- How to Test for Convergence
  - Analytic test for convergence
  - Test for convergence using simulation data
  - Propose a simple, repeatable, tool independent approach
- Applying the Approach to Several Models
  - Look for patterns
  - Identify model characteristics that influence the iterations required
- Concluding Comments

A Sample Inputs Simulation Model: AFCAA CRUH Example File
- 21 WBS elements (lowest level), 38 input variables
- Most of the common estimating methods are represented
  - Linear, loglinear, triad, factor, build-up, third party tools, throughputs
  - Date driven methods (uncertainty on duration)
- Normal, lognormal, triangular, uniform uncertainty distributions
- Functional and applied correlation
- Includes 10 discrete (Bernoulli) distributions
- Modeled using @Risk, CB, ACEIT

S-Curves Based On Different Iterations
[Figure: AFCAA CRUH Example Total Cost, calculated with 100 and 50k iterations. Y-axis: Confidence Level (CDF); X-axis: BY2007 $K.]
[Figure: AFCAA CRUH Example Total Cost, calculated with 1k, 10k and 50k iterations. Y-axis: Confidence Level (CDF); X-axis: BY2007 $K.]
- 100 iterations clearly not enough
- 1k iterations almost matches the 50k run
- No visual evidence that the 10k result is any different from the 50k iteration result

Convergence

Analytical Solution
$$m = p\,(1 - p)\left(\frac{c}{\Delta p}\right)^{2}$$
Where:
- m = number of iterations
- p = the percentile of interest
- c = inverse of the standard normal cumulative distribution. For 95% confidence, in Excel use NORMSINV(0.975)
- Δp = the percentile range of interest (for instance, use 0.05 if interested in +/- 5 percentile)
Independent of distribution shape
Source: M. Granger Morgan and Max Henrion, Uncertainty: A Guide to Dealing with Uncertainty in Quantitative Risk and Policy Analysis, p. 202
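The analytic formula on this slide can be evaluated directly. The sketch below is illustrative only: the function name required_iterations and its parameter names are not from the presentation, and the result is rounded up to a whole iteration. It uses Python's standard library in place of Excel's NORMSINV.

```python
import math
from statistics import NormalDist


def required_iterations(p, delta_p, confidence=0.95):
    """Iterations m = p(1-p)(c/delta_p)^2, rounded up.

    p        : percentile of interest as a fraction (0.5 for the 50 percentile)
    delta_p  : percentile range of interest as a fraction (0.05 for +/- 5 percentile)
    """
    # c is the two-sided standard normal deviate, i.e. NORMSINV(0.975) for 95%
    c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(p * (1 - p) * (c / delta_p) ** 2)


for p in (0.5, 0.7, 0.9):
    print(p, required_iterations(p, 0.05))
```

Because p(1 - p) peaks at p = 0.5, the 50 percentile needs the most iterations for a given range under this formula.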

Analytical Solution
[Figure: Iterations Required To Be 95% Confident That True Percentile Is Within Specified Range. Y-axis: 95% CL Bound as % of Target Percentile; X-axis: iterations, 0 to 50,000; series: 50 Percentile, 70 Percentile, 90 Percentile.]
Observations:
- More iterations required to converge on the 50 percentile than the 90 percentile
- Morgan & Henrion (p. 202) describe the 50 percentile as the least precise estimated percentile
- Need 5k to 35k iterations to have error less than 1%
- Will Latin Hypercube sampling improve on this result?
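The curves on this slide can be approximated by solving the Morgan and Henrion formula for the percentile range instead of the iteration count. This is a sketch under one assumption: that "bound as % of target percentile" means the range Δp divided by the target percentile p. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def bound_pct_of_target(p, m, confidence=0.95):
    """95% bound on the estimated percentile, as a % of the target percentile.

    Rearranges m = p(1-p)(c/delta_p)^2 into delta_p = c*sqrt(p(1-p)/m).
    Dividing by p is an assumption about how the chart was scaled.
    """
    c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    delta_p = c * math.sqrt(p * (1 - p) / m)
    return 100.0 * delta_p / p


for m in (1_000, 5_000, 10_000, 35_000, 50_000):
    row = [round(bound_pct_of_target(p, m), 2) for p in (0.5, 0.7, 0.9)]
    print(m, row)
```

Under this reading, a 1% bound needs roughly 4k iterations at the 90 percentile and roughly 38k at the 50 percentile, which is in line with the "5k to 35k" observation.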

Test for Sufficient Iterations From Simulation Data
Goal: create a simple way to determine a sufficient number of iterations to obtain accurate results using the simulation data
Several potential metrics of interest:
- Mean, standard deviation, coefficient of variation
- Correlation coefficient
- Target percentile
- Other? All?
Selected: target percentile for the WBS element(s) of interest
- Selected because this is the result that tends to be the basis for budget recommendations
- The 50, 70 and 90 percentiles are used in this study, but the one your decision maker needs might be a better choice

Several Issues to Resolve:
How do we know the right answer?
- Comparing a complex cost model simulation result to an analytic solution is not feasible
- Literature identifies 10k iterations as the benchmark for sufficient iterations
  - Morgan MG, Henrion M (1990) Uncertainty: A Guide to Dealing with Uncertainty in Quantitative Risk and Policy Analysis
  - Garvey P (2000) Probability Methods for Cost Uncertainty Analysis: A Systems Engineering Perspective
How to gather the data?
- Use Latin Hypercube sampling rather than Monte Carlo
- Is it necessary to change the random seeds on each run?
- Is it necessary to perform separate runs, or is the data from a single 10k run sufficient?
How to present results? Options include:
- Plot multiple statistics for a specific result
- Plot single statistic for multiple results
- Plot x iteration result as a % difference from the correct result
Selected: Plot x iteration result as the absolute % difference from the correct result
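For readers unfamiliar with the Latin Hypercube sampling mentioned here, the following is a minimal one-variable sketch. It is not the implementation used by @Risk, CB or ACEIT: it divides the unit interval into n equal strata, draws one point per stratum, shuffles the order, and maps the points through an inverse CDF. The function names and the normal distribution example are hypothetical.

```python
import random
from statistics import NormalDist


def latin_hypercube_uniform(n, rng):
    """One uniform draw from each of n equal-width strata of [0, 1), shuffled."""
    points = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(points)
    return points


def lhs_normal(n, mean, sd, seed=1):
    """Latin Hypercube sample of a normal input (illustrative distribution)."""
    rng = random.Random(seed)
    dist = NormalDist(mean, sd)
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    return [dist.inv_cdf(min(max(u, eps), 1 - eps))
            for u in latin_hypercube_uniform(n, rng)]


sample = lhs_normal(1000, mean=100.0, sd=15.0)
print(min(sample), sum(sample) / len(sample), max(sample))
```

Because every stratum is sampled exactly once, the full range of the input is covered by design, which plain Monte Carlo sampling does not guarantee for a small number of iterations.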

Is it necessary to change random seed when checking for convergence?
[Figures: Missile System: 50 Percentile, 70 Percentile and 90 Percentile Variation Due to Change in Random Seeds. Y-axis: % Different from 10k Mean; X-axis: iterations (500, 2500, 5000, 10000); bands: High, 88%, 12%, Low.]
- 25 identical CRUH files, but with a different set of random seeds
- All 25 files run at 500, 2500, 5000 and 10000 iterations
- 50, 70 and 90 percentile results at the Total level each compared to the average of the 10k result across all 25 files
Observation: random seed selection generally has less than +/- impact on most results
Conclusion: No need to change random seed to check for convergence
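The seed experiment described on this slide can be mimicked with a toy model. Everything below is an assumption for illustration: the lognormal "total cost" stands in for the CRUH model, plain Monte Carlo stands in for Latin Hypercube sampling, and the function names are hypothetical. Only the structure (25 seeds, four iteration counts, comparison to the average 10k result) follows the slide.

```python
import math
import random

ITERATION_COUNTS = (500, 2500, 5000, 10000)


def percentile(values, p):
    """Value at the p-th fraction of the sorted data (nearest-rank rule)."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(p * len(ordered))) - 1]


def run_model(n, seed):
    # Hypothetical stand-in for a cost model: lognormal total cost
    rng = random.Random(seed)
    return [1000.0 * rng.lognormvariate(0.0, 0.4) for _ in range(n)]


def seed_spread(p, seeds=range(25)):
    """Largest abs % difference from the mean 10k result, per iteration count."""
    results = {n: [percentile(run_model(n, s), p) for s in seeds]
               for n in ITERATION_COUNTS}
    reference = sum(results[10000]) / len(results[10000])
    return {n: max(abs(v - reference) / reference * 100 for v in vals)
            for n, vals in results.items()}


spread = seed_spread(0.5)
for n in ITERATION_COUNTS:
    print(n, round(spread[n], 2))
```

The printed spread should shrink as the iteration count grows, which is the pattern the slide's charts are intended to show.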

Separate Runs vs A Single Run
Option 1: Generate separate runs
- Perform a separate run for each x iterations that will be compared to the 10k run
- Becomes extremely time consuming if any fidelity is desired
Option 2: Use data from a single 10k run
- Obtain 10k iteration data and calculate statistics based on all 10k
- Recalculate the statistics based upon the first 200 iterations, first 300, first 400 and so on
- An alternative is to randomly sample with replacement from the 10k data
- Does not guarantee distributions are sampled across their entire range
- Far quicker and easier to manage than Option 1
Goal: Demonstrate that Option 2 is adequate
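Option 2 can be sketched as follows. The data here is a hypothetical lognormal run, not output from any of the tools named in the presentation, and the helper names are illustrative. The sketch recomputes a percentile from the first 200, 300, 400, ... iterations of a single run, and also shows the resample-with-replacement alternative.

```python
import math
import random


def percentile(values, p):
    """Nearest-rank percentile of the given values."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(p * len(ordered))) - 1]


def prefix_percentiles(run, p, start=200, step=100):
    """Percentile computed from the first n iterations, for n = start, start+step, ..."""
    return [(n, percentile(run[:n], p)) for n in range(start, len(run) + 1, step)]


def resampled_percentile(run, p, n, rng):
    """Alternative: percentile of n values drawn with replacement from the run."""
    return percentile(rng.choices(run, k=n), p)


rng = random.Random(42)
run = [1000.0 * rng.lognormvariate(0.0, 0.4) for _ in range(10_000)]  # hypothetical data
curve = prefix_percentiles(run, 0.7)
print(curve[0], curve[-1])
print(resampled_percentile(run, 0.7, 200, rng))
```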

Is it necessary to perform separate runs when checking for convergence?
[Figure: Compare Results from Single Run to Separate Runs (comparing Total $ percentiles). Y-axis: % Single Run Different from Separate Runs; X-axis: iterations, 0 to 10,000.]
- Results from the first 200 iterations of a 10k run are compared to an independent run of 200 iterations, and so on
Conclusion: analysis of a single 10k run is sufficiently accurate to test for stability

Recap and Way Ahead
Recap:
- 10k iterations selected as the benchmark
  - Two sources noted
- Ignore impact of random seed changes
  - Random seed change has a +/- impact
- Use the data from a single 10k simulation run
  - Separate runs more completely sample the distribution, but statistics are generally less than 1% different from statistics calculated from a single 10k run
Way Ahead:
- Create a tool to calculate the statistics for each sample of interest and compare them to the 10k statistics
- Design the tool so that the user may drop in the iteration data from any source

Calculating the Statistics In Excel
Mean = AVERAGE(INDIRECT("B$51:B$" & 50+$D51))
Percentile = LARGE(INDIRECT("B$51:B$" & 50+$D51),ROUND($D51-H$49*$D51,0))
Excel Functions:
- INDIRECT: allows column D to automatically calculate the statistic from the correct range
- LARGE: finds the value from the correct range for the percentile of interest
Copy/paste iteration data from any simulation tool into Column B
Column D can be edited to obtain any granularity of interest
Create additional columns to calculate the % difference from the selected max iterations (in our case, 10k was selected)
Using this approach, the process becomes tool independent
This tool was used to create the charts that follow
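For readers working outside Excel, the two formulas can be mirrored in a few lines of Python. This is a sketch, not the author's tool. It assumes that column D holds the iteration count n and that H$49 holds the confidence level as a fraction (for example 0.9); the function names are hypothetical.

```python
import math


def excel_mean(data, n):
    """AVERAGE over the first n iterations."""
    return sum(data[:n]) / n


def excel_large_percentile(data, n, p):
    """LARGE(first n values, ROUND(n - p*n, 0)): the k-th largest value."""
    # Excel's ROUND rounds halves away from zero; Python's round() does not,
    # so floor(x + 0.5) is used for these non-negative ranks.
    k = math.floor(n - p * n + 0.5)
    if k < 1:
        raise ValueError("rank is below 1; LARGE would return #NUM!")
    return sorted(data[:n], reverse=True)[k - 1]


def abs_pct_diff(value, reference):
    """Absolute % difference from the reference (e.g. the 10k result)."""
    return abs(value - reference) / abs(reference) * 100


data = list(range(1, 101))  # toy data: 1..100
print(excel_mean(data, 100), excel_large_percentile(data, 100, 0.9))
```

Note that for p close to 1 and small n the rank rounds to zero, in which case Excel's LARGE returns an error; the sketch raises instead.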

Revisiting The 10k Decision
[Figure: Convergence Results for: AFCAA CRUH Ex Relative to 50k. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 50,000.]
[Figure: Convergence Results for: AFCAA CRUH Ex Relative to 50k. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 10,000.]
- Absolute % difference from 50k result is plotted for different confidence levels
- Any statistic of interest could be used
- If different is considered noise then anything after 3k qualifies as accurate
Conclusion: 10k iterations as the reference for accurate stands for this model

Comparing Convergence Across Tools
[Figures: AFCAA CRUH Missile Total 50 Percentile, 70 Percentile and 90 Percentile. Y-axis: % Different From 10k Average Result; X-axis: iterations, 0 to 10,000; series: CB, @Risk, ACEIT.]
- Results at each iteration are compared to the average of the three tool results at 10k
- Patterns would differ if random seeds changed, but within +/-
Conclusion: All three tools demonstrate similar convergence behavior

Convergence For Several Examples

A Typical Cost Model
Over 70 WBS elements estimated using:
- Non-Linear CERs
- Linear CERs
- Factor Relationships
- Build-up estimates
- Data from 3rd party tools
- Throughputs
Over 150 input variables such as:
- Labor rates
- Configuration Inputs (mass, power, etc.)
- Programmatic Inputs (design life, schedule, etc.)
- Factors (overhead wraps, etc.)

Identify Iterations For Convergence
[Figure: Convergence Results for: SMEX Total. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 10,000; PDF overlay.]
- 3500 iterations appears sufficient
- May be different if anything is changed in the model
- Note that distribution shape is not normal

ACEIT Example Files
[Figure: Convergence Results for: Large ACEIT Example. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 10,000; PDF overlay.]
[Figure: Convergence Results for: Small ACEIT Example. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 10,000; PDF overlay.]

Electronics
[Figure: Convergence Results for: Large Elec LCC. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 10,000; PDF overlay. Annotation: CV is very unusual.]
[Figure: Convergence Results for: Small Elec LCC. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 10,000; PDF overlay.]

What's Happening in the Large Electronic Simulation?
[Figure: Compare Mean As Iterations Increase. Series: Mean With All 10k, Mean With Lower 9.9k, % Diff. Axes: Cost, % Difference; X-axis: iterations, 0 to 10,000.]
[Figure: Compare CV as Iterations Increase. Series: CV with All 10k, CV with Lower 9.9k. X-axis: iterations, 0 to 10,000.]
[Figure: Compare Percentile Results For 10k and Top 9.9k. Series: 50, 70 and 90 Percentile with All 10k and with Lower 9.9k. Y-axis: Cost; X-axis: iterations, 0 to 10,000.]
- After 1,000 iterations, the mean climbed (red line) as iterations increased
- CV jumps up dramatically periodically (red line)
- The top 100 results were stripped from the simulation and stats recalculated
- Mean and CV settled out very quickly (green lines)
- Examination of the model revealed a rare divide by zero due to denominator distributions, explaining the occasional huge result that swamped all others
- The percentile results were not affected. With all 10k iterations or with the lowest 9.9k, the 50, 70 and 90 percentile results all converge after several thousand iterations
Presented at the 2008 SCEA-ISPA Joint Annual Conference and Training Workshop - www.iceaaonline.com
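The diagnostic described on this slide (strip the top 100 results, then recompute the statistics) is easy to reproduce on any iteration data. The sketch below uses made-up data: 9,900 ordinary lognormal results plus 100 very large ones standing in for the near divide-by-zero cases. The magnitudes, seed and function names are assumptions, not values from the presentation.

```python
import math
import random
import statistics


def percentile(values, p):
    """Nearest-rank percentile of the given values."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(p * len(ordered))) - 1]


def summarize(values):
    """Mean, coefficient of variation and the three percentiles of interest."""
    mean = statistics.mean(values)
    return {
        "mean": mean,
        "cv": statistics.pstdev(values) / mean,
        "p50": percentile(values, 0.5),
        "p70": percentile(values, 0.7),
        "p90": percentile(values, 0.9),
    }


rng = random.Random(7)
# Hypothetical data: ordinary results plus rare huge outliers
results = [50_000 * rng.lognormvariate(0.0, 0.3) for _ in range(9_900)]
results += [50_000 * rng.uniform(50, 500) for _ in range(100)]
rng.shuffle(results)

all_10k = summarize(results)
lower_9900 = summarize(sorted(results)[:9_900])  # top 100 results stripped
for key in all_10k:
    print(key, round(all_10k[key], 3), round(lower_9900[key], 3))
```

In this toy data the mean and CV change drastically when the top 100 results are removed, while the 50 percentile barely moves, which mirrors the behavior reported for the Large Electronics model.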

Aircraft LCC
[Figure: Convergence Results for: Large Aircraft LCC. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 10,000; PDF overlay.]
[Figure: Convergence Results for: Small Aircraft LCC. Y-axis: ABS % Different from 10k result; X-axis: iterations, 0 to 10,000; PDF overlay.]

Space Systems
[Figure: Convergence Results for: Large Space. Y-axis: ABS % Different from 10k result (scale to 15%); X-axis: iterations, 0 to 10,000; PDF overlay.]
[Figure: Convergence Results for: Small Space. Y-axis: ABS % Different from 10k result (scale to 15%); X-axis: iterations, 0 to 10,000.]

Why Does the Small Space Model Require So Many Iterations?
[Figure: 50k Convergence Results for: Small Space. Y-axis: ABS % Different from 10k result (scale to 15%); X-axis: iterations, 0 to 50,000.]
- Model based upon the following equation: 0.6636 * V1^0.6567 * V2^0.1555 * V3^0.03226 * V4^0.4409 * V5^0.9142 * V6^-0.2879
- Uncertainty on each variable
- CER result used to estimate other cost elements using uncertain factor relationships
- One of the smallest models, takes the most iterations

Concluding Comments
- Convergence was defined as the number of iterations required such that the statistic of interest stays within of the 10k result
- 50, 70, 90 percentile selected in this study as basis for testing for convergence
- Simple Excel tool provides a consistent, tool independent way to test for convergence
Observations:
- None of the models generated a Normal distribution at the total level
- Can ignore impact of random seed changes
- Convergence can be estimated from a single 10k simulation run
- Models tested converged faster than the analytic formula suggests, possibly due to using Latin Hypercube over Monte Carlo
- Contrary to the analytic approach, more iterations are required as percentile increases
- CV more important than # of elements in model when assessing iteration requirement
- 10k iterations may be insufficient if model CV is high, i.e. > 0.6
How many iterations are required?
- Unfortunately, the answer is: it depends
- Use a simple, consistent method to find out
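The convergence definition used in this study can be turned into a small check. The tolerance value did not survive in this transcript, so it is left as a parameter here; the function name and the sample numbers are purely illustrative. The sketch returns the smallest iteration count after which the absolute % difference from the reference result never exceeds the tolerance again.

```python
def iterations_to_converge(checkpoints, tolerance_pct):
    """Smallest n after which abs % difference stays within tolerance_pct.

    checkpoints: list of (iterations, abs_pct_diff_from_reference), sorted by iterations.
    Returns None if the final checkpoint is still outside the tolerance.
    """
    converged_at = None
    for n, diff in checkpoints:
        if diff <= tolerance_pct:
            if converged_at is None:
                converged_at = n  # start of a run of in-tolerance checkpoints
        else:
            converged_at = None  # left the band, so start over
    return converged_at


# Illustrative numbers only, not results from the presentation
checkpoints = [(1000, 2.0), (2000, 0.4), (3000, 0.8),
               (4000, 0.3), (5000, 0.1), (10000, 0.0)]
print(iterations_to_converge(checkpoints, tolerance_pct=0.5))
```

In the illustrative data the statistic dips inside the band at 2,000 iterations but leaves it again at 3,000, so the reported convergence point is 4,000.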

Backup

AFCAA CRUH 10k Iteration Example Results
[Figure: Compare ACE, CB, @Risk Missile Risk Results. Y-axis: CDF; X-axis: 2007 BY $K, $500,000 to $1,300,000; series: ACE, CB, @Risk.]
- Results are tool independent
- The handbook does not endorse or recommend any specific tool

Different Ways to Present Results
[Figure: four panels, each titled Convergence Results for: Missile System, X-axis: iterations, 0 to 10,000. Panel 1: ABS % Different from 10k result with PDF overlay. Panel 2: ABS % Different from 10k result with 10-period moving average. Panel 3: percentile values in BY 2007 K$, $850,000 to $1,300,000. Panel 4: signed % Different from 10k result, -3.0 to 3.0.]
- Derived from evaluating the iteration data from a 10k run
- Appears that for this model (AFCAA CRUH Ex), 2-3k iterations are sufficient
Conclusion: Upper left selected as standard way to present analysis