Combining Experimental and Non-Experimental Design in Causal Inference

Kari Lock Morgan
Department of Statistics, Penn State University
Rao Prize Conference, May 12th, 2017

A Tribute to Don
- Design trumps analysis
- Motivated by a real study
- Experimental design & rerandomization
- Observational study & propensity scores
- Rubin causal model & potential outcomes
- Educational testing (AP scores)
- (Missing data)
- (Noncompliance)

Design trumps Analysis
"For Objective Causal Inference, Design Trumps Analysis" (Rubin 2008)
$X$ = covariates, $W$ = treatment, $Y$ = outcome(s)
- Design: $W \mid X$ (balance covariates)
- Analysis: $Y \mid W, X$
As much as possible should be done without observed outcomes!

Knowledge in Action
Goal: estimate the causal effect of Knowledge in Action (KIA), a form of project-based learning, in AP classes on AP scores and other outcomes.
- Part 1 ("Efficacy Study"): randomize schools to KIA or control; compare outcomes after 1 year
- Part 2 ("Maturation Study"): continue to follow schools another year (experimental & observational)

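[Diagram: study structure. Districts (blocks), District 1 through District 5, contain schools (clusters), which contain teachers and students. RANDOMIZATION is marked at the school level and OUTCOMES at the student level.]
*In this talk I'll just focus on one district.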

Covariates
Covariates available at randomization:
- School covariates (e.g. Title 1 status, type, etc.)
- Teacher covariates (e.g. years of experience)
- Previous student (class) covariates:
  - Race/ethnicity
  - Poverty status
  - Parental education
  - PSAT scores
  - 8th grade standardized test scores
  - Total number of students
  - Number of students who took the AP exam
If covariates are available, we should use them when we randomize!
2 covariates ($x_1$, $x_2$) used for randomization

Rerandomization
1. Collect covariate data
2. Specify criteria for acceptable balance
3. (Re)randomize units to treatment groups
4. Check balance: $|\bar{x}_{1,T} - \bar{x}_{1,C}| < 0.05$ and $|\bar{x}_{2,T} - \bar{x}_{2,C}| < 0.05$. If unacceptable, return to step 3; if acceptable, continue.
5. Conduct experiment
6. Analyze results
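The loop in the flowchart can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names, the simulated covariates, the seed, and the equal split into groups are all assumptions; only the 0.05 threshold on each difference in means comes from the slide.

```python
import random
from statistics import mean

def mean_diff(x, w):
    """Treated-minus-control difference in covariate means."""
    treated = [xi for xi, wi in zip(x, w) if wi == 1]
    control = [xi for xi, wi in zip(x, w) if wi == 0]
    return mean(treated) - mean(control)

def rerandomize(covariates, n_treated, threshold=0.05, max_tries=100_000, seed=0):
    """Redraw the assignment until every covariate meets the balance criterion.

    covariates: list of equal-length lists, one per covariate (hypothetical data).
    Returns the accepted 0/1 assignment and the number of draws it took.
    """
    rng = random.Random(seed)
    n = len(covariates[0])
    for tries in range(1, max_tries + 1):
        treated = set(rng.sample(range(n), n_treated))
        w = [1 if j in treated else 0 for j in range(n)]
        # Accept only if ALL covariates are balanced (the "and" in the criterion).
        if all(abs(mean_diff(x, w)) < threshold for x in covariates):
            return w, tries
    raise RuntimeError("no acceptable randomization found; loosen the threshold")

# Hypothetical standardized covariates for 20 schools.
data_rng = random.Random(1)
x1 = [data_rng.gauss(0, 1) for _ in range(20)]
x2 = [data_rng.gauss(0, 1) for _ in range(20)]
w, tries = rerandomize([x1, x2], n_treated=10)
```

The balance check uses covariates only, so the whole procedure can be fixed before any outcome is observed.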

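Covariate Balance: Empirical
Percent reduction in variance:
$$\mathrm{PRIV} = \frac{\operatorname{var}(\bar{x}_{j,T} - \bar{x}_{j,C}) - \operatorname{var}(\bar{x}_{j,T} - \bar{x}_{j,C} \mid \text{rerand.})}{\operatorname{var}(\bar{x}_{j,T} - \bar{x}_{j,C})}$$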

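Covariate Balance: Theoretical
Suppose $\bar{x}_{j,T} - \bar{x}_{j,C} \sim$ Normal for $j \in 1, \dots, k$, with $x_1 \perp x_2 \perp \dots \perp x_k$.
Rerandomize if $|\bar{x}_{j,T} - \bar{x}_{j,C}| > a_j$ for any $j \in 1, \dots, k$.
Then the PRIV for $x_j$ is
$$p_{x_j} = 1 - \frac{2\,\gamma\!\left(\frac{3}{2},\ \frac{a_j^2}{2\operatorname{var}(x_j)\left(\frac{1}{n_T} + \frac{1}{n_C}\right)}\right)}{\gamma\!\left(\frac{1}{2},\ \frac{a_j^2}{2\operatorname{var}(x_j)\left(\frac{1}{n_T} + \frac{1}{n_C}\right)}\right)},$$
where $\gamma(b, c) \equiv \int_0^c y^{b-1} e^{-y}\, dy$.
Here: $p_{x_1} = 0.984$, $p_{x_2} = 0.973$.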
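The PRIV formula can be evaluated with the standard library alone, because the incomplete gamma function has closed forms at the two half-integer arguments it needs: $\gamma(1/2, c) = \sqrt{\pi}\,\operatorname{erf}(\sqrt{c})$ and, by integration by parts, $\gamma(3/2, c) = \gamma(1/2, c)/2 - \sqrt{c}\,e^{-c}$. The sketch below is an illustration under those identities; the function names and the example inputs (unit-variance covariate, 38 units per group) are assumptions, not values taken from the study.

```python
import math

def lower_gamma_half(c):
    """gamma(1/2, c): closed form via the error function."""
    return math.sqrt(math.pi) * math.erf(math.sqrt(c))

def lower_gamma_three_halves(c):
    """gamma(3/2, c): integration by parts on the defining integral."""
    return 0.5 * lower_gamma_half(c) - math.sqrt(c) * math.exp(-c)

def priv(a, var_x, n_t, n_c):
    """Theoretical percent reduction in variance for one covariate.

    a: acceptance threshold on |difference in means|
    var_x: variance of the covariate
    n_t, n_c: treatment and control group sizes
    """
    var_diff = var_x * (1.0 / n_t + 1.0 / n_c)  # variance of the difference in means
    c = a ** 2 / (2.0 * var_diff)
    return 1.0 - 2.0 * lower_gamma_three_halves(c) / lower_gamma_half(c)

# Hypothetical inputs: standardized covariate, threshold 0.05, 38 units per group.
example_priv = priv(a=0.05, var_x=1.0, n_t=38, n_c=38)
```

A tighter threshold pushes the value toward 1, and a very loose one pushes it toward 0, since almost every randomization is then accepted.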

Outcome PRIV
If rerandomization is equal percent variance reducing (EPVR), then the PRIV for the outcome difference in means is
$$\mathrm{PRIV}_Y = R^2 \, \mathrm{PRIV}_X$$
Here, $R^2 \approx 0.75$ and $\mathrm{PRIV}_X \approx 98\%$, so $\mathrm{PRIV}_Y \approx 0.75 \times 0.98 \approx 74\%$.
Precision increases by a factor of $\frac{1}{1 - 0.74} = 3.85$.
Equivalent to almost quadrupling $n$!!! (Effective sample size goes from 76 to 293!)
NOTE: This is TRUE variance! Need randomization-based inference to reflect this.
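The arithmetic on this slide is easy to reproduce. The snippet below is only a check of the quoted numbers; the function names are illustrative. Note that the slide rounds at each step, so the unrounded chain gives slightly smaller figures.

```python
def outcome_priv(r_squared, priv_x):
    """PRIV for the outcome under an EPVR rerandomization scheme."""
    return r_squared * priv_x

def precision_factor(priv_y):
    """Factor by which precision (1 / variance) increases."""
    return 1.0 / (1.0 - priv_y)

priv_y = outcome_priv(0.75, 0.98)   # 0.735, which the slide rounds to 74%
factor = precision_factor(0.74)     # about 3.85, using the rounded 74%
effective_n = 76 * factor           # a little over 292; the slide reports 293
```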

Correlational Structure

Affine Invariance
Affine invariance: rerandomization stays the same for any affine transformation $a + bx$.
If the rerandomization criterion is affinely invariant and $x$ is ellipsoidally symmetric:
1. $E(\bar{X}_T - \bar{X}_C \mid \text{rerand.}) = E(\bar{X}_T - \bar{X}_C) = \mathbf{0}$, so rerandomization leads to unbiased estimates for any linear function of $x$.
2. $\operatorname{cov}(\bar{X}_T - \bar{X}_C \mid \text{rerand.}) \propto \operatorname{cov}(\bar{X}_T - \bar{X}_C)$. This preserves the correlations of $\bar{X}_T - \bar{X}_C$, and the balance improvement is equal for each $x_j$ (equal percent variance reducing).
(Morgan and Rubin, Annals of Statistics, 2012)

Mahalanobis
Mahalanobis: $(\bar{X}_T - \bar{X}_C)^\top \operatorname{cov}(x)^{-1} (\bar{X}_T - \bar{X}_C)$
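For two covariates the quadratic form can be written out by hand, which also makes the affine invariance easy to check numerically. The sketch below is an illustration with made-up data and hypothetical function names; it uses the sample covariance of the covariates and a hand-coded 2x2 inverse so that it needs nothing beyond the standard library.

```python
from statistics import mean

def mean_diff(x, w):
    """Treated-minus-control difference in covariate means."""
    treated = [xi for xi, wi in zip(x, w) if wi == 1]
    control = [xi for xi, wi in zip(x, w) if wi == 0]
    return mean(treated) - mean(control)

def sample_cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def mahalanobis_balance(x1, x2, w):
    """d' S^{-1} d for the 2-vector d of mean differences and 2x2 covariance S."""
    d1, d2 = mean_diff(x1, w), mean_diff(x2, w)
    s11, s22, s12 = sample_cov(x1, x1), sample_cov(x2, x2), sample_cov(x1, x2)
    det = s11 * s22 - s12 ** 2
    # Inverse of [[s11, s12], [s12, s22]] is [[s22, -s12], [-s12, s11]] / det.
    return (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det

# Hypothetical data: six units, three treated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 8.0]
w = [1, 0, 1, 0, 0, 1]
m_original = mahalanobis_balance(x1, x2, w)

# An invertible affine transformation of the covariates leaves the value unchanged.
z1 = [3.0 + 2.0 * a for a in x1]
z2 = [a + 5.0 * b for a, b in zip(x1, x2)]
m_transformed = mahalanobis_balance(z1, z2, w)
```

A rerandomization criterion of the form "accept if this value is below a threshold" is therefore affinely invariant.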

Knowledge in Action
- Part 1 ("Efficacy Study"): randomize schools to KIA or control; compare outcomes after 1 year
- Part 2 ("Maturation Study"): continue to follow schools another year (experimental & observational)

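[Diagram: study timeline, 2015-2016 to 2017-2018. Covariate data for schools in the RCT are used to RANDOMIZE schools into Wave 1 (KIA in 2016-2017; KIA: 2nd year in 2017-2018) and Wave 2 (KIA: 1st year in 2017-2018). Covariate data for schools not in the RCT are used for MATCHING, giving a matched sample with no KIA. Comparisons marked: 1 year of KIA vs. no KIA; 2 years of KIA vs. 1 year of KIA; 2 years of KIA vs. no KIA.]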

[Diagram: the same timeline, asking how to estimate "2 years of KIA vs. no KIA?" Non-experimental direct approach: compare Wave 1 (KIA: 2nd year) with the matched sample (no KIA) in 2017-2018. Experimental indirect approach: combine the randomized comparisons of 1 year of KIA vs. no KIA (2016-2017) and 2 years of KIA vs. 1 year of KIA (2017-2018).]
WHICH IS BETTER???

Potential Outcomes & Estimands
$Y_j(W_j, t)$ = potential outcome for school $j$ under treatment $W_j$ in year $t$
Causal effect: compare potential outcomes under different treatments
$$\tau_{1,t} \equiv \bar{Y}(1,t) - \bar{Y}(0,t) = \frac{\sum_{j=1}^n Y_j(1,t)}{n} - \frac{\sum_{j=1}^n Y_j(0,t)}{n}$$
$$\tau_{2-1,t} \equiv \bar{Y}(2,t) - \bar{Y}(1,t) = \frac{\sum_{j=1}^n Y_j(2,t)}{n} - \frac{\sum_{j=1}^n Y_j(1,t)}{n}$$
$$\tau_{2,t} \equiv \bar{Y}(2,t) - \bar{Y}(0,t) = \frac{\sum_{j=1}^n Y_j(2,t)}{n} - \frac{\sum_{j=1}^n Y_j(0,t)}{n}$$

Estimators
$$\hat{\tau}_{1,2017} \equiv \frac{\sum_{j=1}^n W_j\, Y_j(1,2017)}{\sum_{j=1}^n W_j} - \frac{\sum_{j=1}^n (1 - W_j)\, Y_j(0,2017)}{\sum_{j=1}^n (1 - W_j)}$$
$$\hat{\tau}_{2-1,2018} \equiv \frac{\sum_{j=1}^n I(W_j = 2)\, Y_j(2,2018)}{\sum_{j=1}^n I(W_j = 2)} - \frac{\sum_{j=1}^n I(W_j = 1)\, Y_j(1,2018)}{\sum_{j=1}^n I(W_j = 1)}$$
$$\hat{\tau}_{2,2018} \equiv \frac{\sum_{j=1}^n I(W_j = 2)\, Y_j(2,2018)}{\sum_{j=1}^n I(W_j = 2)} - \frac{\sum_{j=1}^n I(W_j = 0)\, Y_j(0,2018)}{\sum_{j=1}^n I(W_j = 0)}$$
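Each estimator above is a difference between the mean observed outcomes of two treatment groups. A minimal sketch, assuming each school has one observed outcome per year and a treatment level coded as years of KIA (0, 1, or 2); the function name and the data are hypothetical.

```python
def diff_in_means(y, w, high, low):
    """Mean outcome among units with w == high minus mean among units with w == low.

    y: observed outcomes, one per school (hypothetical data)
    w: treatment level per school, coded as years of KIA
    """
    y_high = [yj for yj, wj in zip(y, w) if wj == high]
    y_low = [yj for yj, wj in zip(y, w) if wj == low]
    if not y_high or not y_low:
        raise ValueError("both treatment levels must be present")
    return sum(y_high) / len(y_high) - sum(y_low) / len(y_low)

# Hypothetical 2018 outcomes: two schools with 2 years of KIA, two with 1 year.
y_2018 = [3.0, 4.0, 2.0, 1.0]
w_2018 = [2, 2, 1, 1]
tau_2_minus_1_hat = diff_in_means(y_2018, w_2018, high=2, low=1)
```

The same function gives the 1-year estimate with `high=1, low=0` and the direct 2-year estimate with `high=2, low=0`.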

Propensity Score Matching
$$W_j = \begin{cases} 1 & \text{if in Wave 1 of experiment} \\ 0 & \text{if not in experiment} \end{cases}$$
Propensity score: $e_j = P(W_j = 1 \mid x_j)$
Match each Wave 1 teacher with a control with a similar propensity score.
Criteria for success:
- Quality of observed covariate data: can only balance observed data
- Good matches available: adequate overlap between groups; large enough pool of potential controls
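The talk does not say which matching algorithm is used, so the following is only one common choice: greedy 1:1 nearest-neighbor matching on the propensity score, without replacement and with a caliper. The scores are assumed to be already estimated; the function name, the caliper value, and the data are hypothetical.

```python
def greedy_match(treated_scores, control_scores, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated_scores, control_scores: dicts mapping unit id -> estimated score.
    Returns a dict mapping each matched treated id to its control id.
    Treated units with no control within the caliper stay unmatched.
    """
    available = dict(control_scores)
    matches = {}
    # Start with the highest-scoring treated units, which are usually hardest to match.
    for t_id, t_score in sorted(treated_scores.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            matches[t_id] = c_id
            del available[c_id]  # matching without replacement
    return matches

# Hypothetical estimated propensity scores.
treated = {"T1": 0.80, "T2": 0.40, "T3": 0.95}
controls = {"C1": 0.75, "C2": 0.45, "C3": 0.10, "C4": 0.42}
matches = greedy_match(treated, controls)
```

Here `T3` goes unmatched because no control lies within the caliper, which is the "adequate overlap" criterion in action.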

Propensity Score Matching
- If we have good matches, we can balance observed covariates
- Key point: unless we have data on all relevant covariates (which we won't), there will still be bias (baseline differences)
- Usually hard to quantify this bias
- BUT we have a very rare feature!!

[Diagram: the same timeline. The comparison of 1 year of KIA vs. no KIA appears twice: once within the RCT (Wave 1 vs. Wave 2, 2016-2017) and once against the matched sample (Wave 2 in its 1st year of KIA vs. no KIA, 2017-2018).]
We can validate the nonexperimental approach by comparing 1 year impact estimates!

Experimental Indirect Approach
$$\tau_{2-1,2018} + \tau_{1,2017} = \left[\bar{Y}(2,2018) - \bar{Y}(1,2018)\right] + \left[\bar{Y}(1,2017) - \bar{Y}(0,2017)\right]$$
Critical assumption: potential outcomes may depend on year, but treatment effects do not.
That is, $\bar{Y}(1,2017) \neq \bar{Y}(1,2018)$, but $\tau_{1,2017} = \tau_{1,2018} \equiv \tau_1$.
This implies $\tau_1 + \tau_{2-1} = \tau_2$.

Unbiased
Define $\hat{\tau}_2 \equiv \hat{\tau}_1 + \hat{\tau}_{2-1}$.
Theorem: Assuming treatment effects do not vary by year, $E(\hat{\tau}_2) = \tau_2$.
Proof: $E(\hat{\tau}_2) = E(\hat{\tau}_1 + \hat{\tau}_{2-1}) = \tau_1 + \tau_{2-1} = \tau_2$.
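As a concrete reading of the definition, the indirect estimate is the sum of two randomized differences in means: Wave 1 vs. Wave 2 in 2017 (1 year of KIA vs. none) and Wave 1 vs. Wave 2 in 2018 (2 years vs. 1 year). The sketch below assumes school-level observed outcomes; the function name and the numbers are hypothetical.

```python
from statistics import mean

def indirect_two_year_estimate(y2017_wave1, y2017_wave2, y2018_wave1, y2018_wave2):
    """Sum of the two experimental estimates, tau_1-hat + tau_(2-1)-hat."""
    tau_1_hat = mean(y2017_wave1) - mean(y2017_wave2)          # 1 year of KIA vs. no KIA
    tau_2_minus_1_hat = mean(y2018_wave1) - mean(y2018_wave2)  # 2 years vs. 1 year
    return tau_1_hat + tau_2_minus_1_hat

# Hypothetical school-level mean outcomes.
tau_2_hat = indirect_two_year_estimate(
    y2017_wave1=[3.0, 4.0], y2017_wave2=[2.0, 3.0],
    y2018_wave1=[5.0, 5.0], y2018_wave2=[4.0, 3.0],
)
```

The result estimates the 2-year effect only under the stated assumption that treatment effects do not vary by year.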

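Variance
$$\operatorname{var}(\hat{\tau}_2) = \operatorname{var}(\hat{\tau}_1 + \hat{\tau}_{2-1}) = \operatorname{var}(\hat{\tau}_1) + \operatorname{var}(\hat{\tau}_{2-1}) + 2\operatorname{cov}(\hat{\tau}_1, \hat{\tau}_{2-1})$$
- Both estimates are comparisons of the same teachers; likely to be highly positively correlated
- More than double the variance of each individual estimate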

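Constant Treatment Effect?
Suppose a constant treatment effect, so $Y_j(1,t) = Y_j(0,t) + \tau_1$ and $Y_j(2,t) = Y_j(1,t) + \tau_{2-1}$ for all $j$. Then:
- $\hat{\tau}_1 = \tau_1 + \bar{Y}_{\text{Wave 1}}(0, 2017) - \bar{Y}_{\text{Wave 2}}(0, 2017)$
- $\hat{\tau}_{2-1} = \tau_{2-1} + \bar{Y}_{\text{Wave 1}}(0, 2018) - \bar{Y}_{\text{Wave 2}}(0, 2018)$
Under additivity, and if we again assume differences in time cancel with comparisons within the same year, then $\hat{\tau}_1$ and $\hat{\tau}_{2-1}$ are perfectly correlated!
$$\operatorname{var}(\hat{\tau}_2) = \operatorname{var}(\hat{\tau}_1) + \operatorname{var}(\hat{\tau}_{2-1}) + 2\sqrt{\operatorname{var}(\hat{\tau}_1)\operatorname{var}(\hat{\tau}_{2-1})}$$
If $\operatorname{var}(\hat{\tau}_1) \approx \operatorname{var}(\hat{\tau}_{2-1})$, then $\operatorname{var}(\hat{\tau}_2) \approx 4\operatorname{var}(\hat{\tau}_1)$.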
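The variance of the sum depends on the correlation between the two estimates, which a small helper makes explicit. The function name is illustrative; the formula is the standard variance of a sum written in terms of the correlation.

```python
import math

def var_of_sum(var_a, var_b, corr):
    """var(A + B) = var(A) + var(B) + 2 * corr * sqrt(var(A) * var(B))."""
    return var_a + var_b + 2.0 * corr * math.sqrt(var_a * var_b)

# Equal unit variances: independence doubles the variance,
# perfect positive correlation quadruples it.
independent_case = var_of_sum(1.0, 1.0, corr=0.0)
perfectly_correlated_case = var_of_sum(1.0, 1.0, corr=1.0)
```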

2 years of KIA vs. no KIA: WHICH IS BETTER, the non-experimental direct approach or the experimental indirect approach?
BIAS-VARIANCE TRADEOFF! The matched (direct) comparison may be biased by unobserved covariates, while the experimental (indirect) estimate is unbiased under the constant-effect-over-years assumption but has inflated variance. The two are complementary!

Other Interesting Tidbits
- Student-level versus school-level analysis
- Combined analyses?
- Student/parental consent => missing data
- Joiners
- Non-compliance
- Teachers switching schools/courses
- Anticipation bias
- and more!

Conclusion
- Rerandomization can improve experimental design
- Propensity score matching can improve observational studies
- Bias-variance tradeoff for 2 year impact
- Lots of fun statistics in rich applied problems!

klm47@psu.edu
Funded by George Lucas Educational Foundation
Joint work with Anna Saavedra, Amie Rappaport, Ying Liu, and Juan Saavedra

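Weighting
Option 1: Weight schools equally
$$\hat{\tau}_1 = \frac{\sum_{j=1}^n W_j\, \bar{Y}_j(1)}{\sum_{j=1}^n W_j} - \frac{\sum_{j=1}^n (1 - W_j)\, \bar{Y}_j(0)}{\sum_{j=1}^n (1 - W_j)}$$
Option 2: Weight students equally
$$\hat{\tau}_1 = \frac{\sum_{j=1}^n W_j\, \bar{Y}_j(1)\, n_j}{\sum_{j=1}^n W_j\, n_j} - \frac{\sum_{j=1}^n (1 - W_j)\, \bar{Y}_j(0)\, n_j}{\sum_{j=1}^n (1 - W_j)\, n_j} = \frac{\sum_{j=1}^n W_j \sum_{i=1}^{n_j} Y_i(1)}{\sum_{j=1}^n W_j\, n_j} - \frac{\sum_{j=1}^n (1 - W_j) \sum_{i=1}^{n_j} Y_i(0)}{\sum_{j=1}^n (1 - W_j)\, n_j}$$
Here $\bar{Y}_j$ is the mean outcome in school $j$ and $n_j$ is its number of students.
- Differing number of students (3-127)
- $\tau$ may vary with class size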
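The two options can give quite different answers when school sizes differ. A minimal sketch with hypothetical school means and sizes; the function names are illustrative.

```python
def school_weighted_estimate(school_means, w):
    """Option 1: every school counts equally."""
    treated = [m for m, wj in zip(school_means, w) if wj == 1]
    control = [m for m, wj in zip(school_means, w) if wj == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

def student_weighted_estimate(school_means, sizes, w):
    """Option 2: every student counts equally (schools weighted by size n_j)."""
    t_total = sum(m * n for m, n, wj in zip(school_means, sizes, w) if wj == 1)
    t_n = sum(n for n, wj in zip(sizes, w) if wj == 1)
    c_total = sum(m * n for m, n, wj in zip(school_means, sizes, w) if wj == 0)
    c_n = sum(n for n, wj in zip(sizes, w) if wj == 0)
    return t_total / t_n - c_total / c_n

# Hypothetical data: four schools, the first two treated.
school_means = [10.0, 20.0, 12.0, 16.0]
sizes = [3, 9, 6, 6]
w = [1, 1, 0, 0]
by_school = school_weighted_estimate(school_means, w)
by_student = student_weighted_estimate(school_means, sizes, w)
```

In this made-up example the larger treated school has the higher mean, so the student-weighted estimate is larger than the school-weighted one.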

Multilevel Model
- Student-level: $Y_i(W_{j[i]}) \sim N\!\left(\mu_j(W_j) + \beta_1 x_1,\ \sigma_Y^2\right)$
- School-level: $\mu_j(W_j) \sim N\!\left(\alpha_k + \tau W_j + \beta_2 x_2,\ \sigma_\mu^2\right)$
- District-level: $\alpha_k \sim N\!\left(\alpha + \beta_3 x_3,\ \sigma_\alpha^2\right)$
Smaller schools shrink more; the result is in between the two weighting extremes.
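Why smaller schools shrink more can be seen in a stripped-down version of this model. The sketch below assumes a two-level normal model with no covariates and known variances, which is a simplification of the slide's model and not a fit of it; in that setting the partially pooled school mean is a weighted average of the school's own mean and the overall mean, with weight $\sigma_\mu^2 / (\sigma_\mu^2 + \sigma_Y^2 / n_j)$ on the school's own mean. All names and numbers are hypothetical.

```python
def shrinkage_weight(n_students, var_within, var_between):
    """Weight placed on the school's own mean (between 0 and 1)."""
    return var_between / (var_between + var_within / n_students)

def partially_pooled_mean(school_mean, overall_mean, n_students, var_within, var_between):
    """Precision-weighted compromise between the school mean and the overall mean."""
    lam = shrinkage_weight(n_students, var_within, var_between)
    return lam * school_mean + (1.0 - lam) * overall_mean

# Hypothetical variances; school sizes at the two ends of the 3-127 range.
weight_small = shrinkage_weight(3, var_within=4.0, var_between=1.0)
weight_large = shrinkage_weight(127, var_within=4.0, var_between=1.0)
```

The small school keeps less than half of its own mean, while the large school keeps almost all of it.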