Policy Management: How Data and Information Impact the Ability to Make Policy Decisions
Louis Cripps, Regional Transportation District, Asset Management, Denver, Colorado
Quick exercise... What do these look like?
Apophenia or Patternicity: the experience of perceiving patterns or connections in random or meaningless data.
Message: We are built to see patterns. Being aware of this helps us guard against seeing patterns in data that may not exist.
Goals of this presentation:
  A discussion based in fun
  Help us ask better questions of our data
  Learn to watch for common errors
  Improve decisions based on data
How to ask better questions
Understanding measurements
Transit Example We can see seasonality in data where none exists.
[Chart: Average Monthly Boardings, 2000 to 2014. X-axis: January through December. Y-axis: boardings, 4,000,000 to 6,000,000.]
[Chart: Average Monthly Boardings with Control Limits. Series: Average Monthly Boardings, Lower Bound, Upper Bound. Y-axis: boardings, 4,000,000 to 6,000,000.]
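The slide does not say how its control limits were calculated. As a rough sketch only, the following assumes a simple rule of mean plus or minus three sample standard deviations, applied to the twelve monthly averages reported on the Tukey's B slide of this deck. The function name `control_limits` and the multiplier `k` are illustrative choices, not the presenter's method.

```python
from statistics import mean, stdev

# Average monthly boardings, copied from the Tukey's B slide in this deck.
monthly_avg = {
    "January": 5_269_090, "February": 5_216_807, "March": 5_543_448,
    "April": 5_596_196, "May": 5_375_980, "June": 5_014_757,
    "July": 4_882_658, "August": 5_495_968, "September": 5_612_340,
    "October": 5_979_804, "November": 5_317_047, "December": 4_916_457,
}


def control_limits(values, k=3.0):
    """Return (lower, center, upper) as mean -/+ k sample standard deviations.

    Assumption: a plain k-sigma rule; the deck may have used another method.
    """
    center = mean(values)
    spread = stdev(values)
    return center - k * spread, center, center + k * spread


lower, center, upper = control_limits(list(monthly_avg.values()))

# Months whose average falls outside the assumed limits.
outside = [m for m, v in monthly_avg.items() if not lower <= v <= upper]

print(f"center={center:,.0f}  lower={lower:,.0f}  upper={upper:,.0f}")
print("months outside limits:", outside)
```

Under this particular assumption no month falls outside the limits. Limits built differently, for example from year-to-year variation within each month, could flag months differently.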
With Minimum and Maximum Observed Ridership
Tukey's B Post Hoc Test
Group 1 = October has unusually high ridership
Group 2 = June, July and December have unusually low ridership

Month        Average monthly boardings, by subset
July         4,882,658
December     4,916,457
June         5,014,757
February     5,216,807     5,216,807
January      5,269,090     5,269,090
November     5,317,047     5,317,047
May          5,375,980     5,375,980
August       5,495,968     5,495,968
March        5,543,448     5,543,448
April        5,596,196     5,596,196
September    5,612,340     5,612,340
October                    5,979,804
Instruments and Tools
Data Analysis and Visualization
It is difficult to see the message in rows and columns of data. The message, if properly displayed, is easy to see and tells an accurate story.
Difficult to quickly see signals
Easy to quickly see signals
Seeing a pattern that is not there
Context cues
Failure to see a pattern
Coincidence
How do you know?
Tools
Statistics, control charts and similar tools were created to keep us from relying on instinct alone
Statistics can also hide important information
A common statistic can hide:
o Variability
The Average
Perhaps the most commonly used statistic is the average.
City one: average temperature 65
City two: average temperature 70
Question: Should I pack a coat?
Averages Can Hide Variability

Run        Inconsistent   Consistent
Run 1      20             10
Run 2      30             10
Run 3      30             10
Run 4      10             10
Run 5      1              10
Run 6      0              10
Run 7      0              10
Run 8      0              10
Run 9      0              10
Run 10     0              10
Run 11     0              10
Run 12     0              10
Run 13     0              10
Run 14     0              10
Run 15     0              10
Run 16     1              10
Run 17     10             10
Run 18     30             10
Run 19     30             10
Run 20     20             10
Average    9.1            10
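To make the point concrete, here is a minimal sketch that recomputes the averages from the table and adds a measure of spread. The sample standard deviation is an addition for illustration; the slide itself only reports the average.

```python
from statistics import mean, stdev

# Values copied from the "Inconsistent" and "Consistent" columns of the table.
inconsistent = [20, 30, 30, 10, 1] + [0] * 10 + [1, 10, 30, 30, 20]
consistent = [10] * 20

for name, runs in (("Inconsistent", inconsistent), ("Consistent", consistent)):
    # Similar means, very different spread.
    print(
        f"{name:12s} mean={mean(runs):5.1f} "
        f"stdev={stdev(runs):5.2f} min={min(runs)} max={max(runs)}"
    )
```

The two averages are close, but the spread shows that only one of the two series is predictable from run to run.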
The Average
Works well:
  Little variability
  Small changes don't matter, but magnitude does
Works poorly:
  Variability impacts the decision
Definition of Measurement: a quantitatively expressed reduction of uncertainty based on one or more observations.
"A problem well stated is a problem half solved." Charles Kettering
Clarification
If it matters at all, it is detectable / observable.
If it is detectable, it can be detected as an amount or a range of possible amounts.
If it can be detected as a range of possible amounts, it can be measured.
It's a numbers game
Scales
Different audiences
Different purposes
Linear: temperature
Ordinal: star rating
Exponential: microorganisms
Logarithmic: pH of solutions
Ordinal scale
Interval or Linear Scale
Numerical scale in which intervals have the same interpretation throughout.
Example: the Fahrenheit scale of temperature. The difference between 30 and 40 degrees represents the same temperature change as the difference between 80 and 90 degrees.
Can't mix Celsius and Fahrenheit.
Ratio Scales
Examples: money, voltage
Math can be applied to these measures
The zero position indicates the absence of the quantity being measured
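The difference between interval and ratio scales can be checked with a few lines of arithmetic. This sketch uses the standard Fahrenheit-to-Celsius conversion; the helper name `f_to_c` and the 80-versus-40-degree comparison are illustrative choices, not from the slides.

```python
def f_to_c(deg_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Interval property: equal gaps in Fahrenheit are still equal gaps in Celsius.
gap_low = f_to_c(40) - f_to_c(30)
gap_high = f_to_c(90) - f_to_c(80)
print(f"30 to 40 F = {gap_low:.2f} C of change; 80 to 90 F = {gap_high:.2f} C of change")

# Ratios are not meaningful: zero on these scales is not "no temperature".
ratio_f = 80 / 40
ratio_c = f_to_c(80) / f_to_c(40)
print(f"80 F / 40 F = {ratio_f:.1f}, but the same temperatures in Celsius give {ratio_c:.1f}")
```

Because the ratio changes with the unit, "twice as hot" has no fixed meaning on an interval scale. On a ratio scale such as money, the ratio holds regardless of the unit.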
Dashboards and Data Displays
Gee Whiz Curve
Accuracy versus Precision
Context
Gee Whiz
[Chart: Goodness, "Before" versus "After"]
Same Data
[Chart: Goodness, "Before" versus "After". Y-axis: 0% to 100%.]
What Excel thinks you want
[Chart: Goodness, "Before" versus "After". Y-axis: 97% to 101%.]
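The effect of a truncated axis can be put into numbers. The sketch below compares how tall two bars appear relative to each other when the axis starts at zero and when it starts just below the data. The values 98 and 99 are hypothetical, since the slides do not give the underlying figures, and `apparent_ratio` is an illustrative helper.

```python
def apparent_ratio(before: float, after: float, axis_min: float) -> float:
    """Ratio of drawn bar heights when the value axis starts at axis_min."""
    if before <= axis_min:
        raise ValueError("axis_min must be below the 'before' value")
    return (after - axis_min) / (before - axis_min)


# Hypothetical "Goodness" values in percent; not taken from the slides.
before, after = 98, 99

zero_based = apparent_ratio(before, after, axis_min=0)
truncated = apparent_ratio(before, after, axis_min=97)

print(f"axis from 0%:  'After' bar looks {zero_based:.2f}x as tall as 'Before'")
print(f"axis from 97%: 'After' bar looks {truncated:.2f}x as tall as 'Before'")
```

With these assumed values, a one-point change looks like a doubling once the axis starts at 97%.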
Graphic displays can illustrate variability
[Chart: Number of Stops with Average Weekday Boarding Level. Y-axis: 0 to 8,000.]

Average weekday boardings       Number of stops
0                               497
More than 0, less than 20       6,938
20 or more, less than 40        912
40 or more, less than 60        344
60 or more, less than 100       316
100 or more, less than 500      420
500 or more                     131
Refining graphic information displays
[Chart: Number of Stops with Average Weekday Boarding Level. Y-axis: 0 to 3,000.]

Average weekday boardings       Number of stops
0 to 0.99                       2,729
1 to 5                          2,168
5.01 to 10                      1,383
10.01 to 20                     1,155
20.01 to 30                     554
30.01 to 50                     567
50.01 to 100                    451
100.01 to 500                   420
500.01 to 1000                  70
Over 1000                       61
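The two charts show the same stops grouped into different bins. As a sketch, the code below regroups both sets of counts onto shared bin edges to confirm that they agree. The counts are the data labels from the two charts; which label belongs to which bar is an assumption inferred from the order of the extracted text. The shared edges (20, 100, 500) and the helper `regroup` are illustrative.

```python
import math

# (upper edge of bin, number of stops). Bar-to-bin assignment is inferred
# from the order of the chart labels and should be treated as an assumption.
coarse_bins = [(0, 497), (20, 6938), (40, 912), (60, 344),
               (100, 316), (500, 420), (math.inf, 131)]
fine_bins = [(0.99, 2729), (5, 2168), (10, 1383), (20, 1155), (30, 554),
             (50, 567), (100, 451), (500, 420), (1000, 70), (math.inf, 61)]

# Edges that both binning schemes share (boundary handling is approximate).
COMMON_EDGES = [20, 100, 500, math.inf]


def regroup(bins, edges):
    """Sum bin counts into the first common edge at or above each bin's upper edge."""
    totals = [0] * len(edges)
    for upper, count in bins:
        idx = next(i for i, edge in enumerate(edges) if upper <= edge)
        totals[idx] += count
    return totals


coarse_totals = regroup(coarse_bins, COMMON_EDGES)
fine_totals = regroup(fine_bins, COMMON_EDGES)

print("coarse chart regrouped:", coarse_totals)
print("fine chart regrouped:  ", fine_totals)
```

If the totals match, the refined chart only changes how the large low-ridership group is divided. The underlying data is the same.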
Proportion is Contextual
[Chart: Ridership Change in Percentage, January through April. Y-axis: -60% to 60%. Data labels: 50%, 33%, 0%, -50%.]
Context
[Chart: Ridership Change in Percentage (left axis, -60% to 60%) plotted with Actual Ridership (right axis, 0 to 1,200,000), January through April.]
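A percentage change means little without the base it is measured from. The sketch below uses hypothetical monthly ridership figures, not the values behind the slide's chart, to show that changes of the same size in percent can be very different in riders.

```python
# Hypothetical ridership figures; not the data behind the slide's chart.
ridership = {
    "January": 500_000,
    "February": 750_000,
    "March": 1_000_000,
    "April": 500_000,
}

months = list(ridership)
values = list(ridership.values())

# Month-over-month change, both in riders and as a percentage of the prior month.
absolute_changes = [cur - prev for prev, cur in zip(values, values[1:])]
percent_changes = [(cur - prev) / prev * 100 for prev, cur in zip(values, values[1:])]

for month, riders, pct in zip(months[1:], absolute_changes, percent_changes):
    print(f"{month:9s} {riders:+10,d} riders  {pct:+6.1f}%")
```

In this made-up series, the +50% and the -50% months do not cancel out: the gain is 250,000 riders and the loss is 500,000, because each percentage is taken from a different base.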
Side Effects of Exact
In today's world we expect precision. But can precision lead to a poor decision?
Presenting Ideas How can we help our audience understand our data?
Understanding Magnitude: Million, Billion, Trillion

              Seconds              Minutes           Hours          Days          Years
1 Million     1,000,000            16,667            278            12            0.03
1 Billion     1,000,000,000        16,666,667        277,778        11,574        31.71
1 Trillion    1,000,000,000,000    16,666,666,667    277,777,778    11,574,074    31,709.79
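The table's conversions can be reproduced with simple division. This sketch assumes a 365-day year, which is what the slide's year figures appear to use. The helper `breakdown` is an illustrative name.

```python
def breakdown(seconds: float) -> dict:
    """Express a number of seconds in minutes, hours, days and years."""
    minutes = seconds / 60
    hours = minutes / 60
    days = hours / 24
    years = days / 365  # assumption: 365-day year, consistent with the table
    return {"minutes": minutes, "hours": hours, "days": days, "years": years}


for label, seconds in (("1 Million", 10**6), ("1 Billion", 10**9), ("1 Trillion", 10**12)):
    b = breakdown(seconds)
    print(
        f"{label:10s} {seconds:>17,d} s = {b['minutes']:,.0f} min = "
        f"{b['hours']:,.0f} h = {b['days']:,.0f} d = {b['years']:,.2f} y"
    )
```

A million seconds is about twelve days, a billion is almost 32 years, and a trillion is nearly 32,000 years.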
Finished
Thank you
Come on, we covered this
It's just a cup of coffee