Exploring Factors Affecting Metrorail Ridership in Washington D.C.
Chao Liu, Ph.D., Hiro Iseki, Ph.D.
National Center for Smart Growth, University of Maryland, College Park
September 2015, GIS in Transit Conference

- Motivation for this research
- Overview of Direct Ridership Model
- Visualization of the key variables
- Methodology
  - Ordinary Least Squares (OLS)
  - Geographically Weighted Regression (GWR)
- Results

Our Rail Stations Show Dramatically Different Ridership Patterns
[Figure: ridership per half hour (y-axis, 0 to 1,000) by time of day (x-axis, starting at 5:00) for Bethesda and Suitland. Daily ridership labels on the chart: 11,500 and 6,300.]

Direct Ridership Model (DRM)
- The DRM quantifies the relationship between station-level transit ridership and land use, transit service, and socio-demographic factors.
- The DRM estimates the ridership increase that follows from a given land use change (e.g., a change in the number of households or jobs).
[Diagram: land use within a 10-minute walk of a station drives ridership.]

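As a rough illustration of the DRM idea, the sketch below fits station ridership to households and jobs by ordinary least squares and then reads off the increase implied by a land use change. The data are synthetic and the coefficients (0.15 and 0.05) are made up for the example; the function names `fit_drm` and `predicted_increase` are hypothetical and this is not the specification used in the study.

```python
import numpy as np

def fit_drm(households, jobs, ridership):
    """Fit ridership = b0 + b1*households + b2*jobs by ordinary least squares."""
    design = np.column_stack([np.ones(len(households)), households, jobs])
    coef, *_ = np.linalg.lstsq(design, np.asarray(ridership, dtype=float), rcond=None)
    return coef  # [intercept, per-household effect, per-job effect]

def predicted_increase(coef, added_households=0.0, added_jobs=0.0):
    """Ridership change implied by the fitted linear model for a land use change."""
    return coef[1] * added_households + coef[2] * added_jobs

# Synthetic stations (assumption: invented values, noise-free for clarity)
rng = np.random.default_rng(0)
households = rng.uniform(200, 6000, size=40)
jobs = rng.uniform(100, 30000, size=40)
ridership = 100 + 0.15 * households + 0.05 * jobs

coef = fit_drm(households, jobs, ridership)
print(f"{predicted_increase(coef, added_households=500):.1f} added entries")
```

Because the model is linear, the predicted increase depends only on the slope coefficients, not on the intercept or on the station's current ridership.
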
Direct Ridership Model
Inputs, by station walk area (10-minute walk): households; jobs, by type; demographics; transit service & accessibility; built environment
Method: multivariate regressions
Output: walk/bike ridership, by time period and by station
- Demographics: household income, education level, etc.
- Transit characteristics: level of service, level of accessibility, network connectivity, parking
- Built environment: density (households & jobs), diversity (land use & industry), design (intersection density), job accessibility, Walk Score, etc.

Half-Mile Walk Sheds: Examples (Ballston, Cheverly)
Walk/bike AM peak entries vs. PM peak entries
- Walk/bike AM entries: include many commuters from suburban stations
- Walk/bike PM entries: include many commuters to downtown DC

Households vs. jobs
- Households: distributed across the region, with high concentrations at several stations, such as Dupont Circle and Columbia Heights.
- Jobs: concentrated in downtown DC, as well as at a few suburban stations, such as Bethesda and Ballston.

Jobs Accessible in 30 Minutes
Job accessibility: the number of jobs that can be accessed from each station within 30 minutes of transit time and 10 minutes of walking

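A cumulative-opportunity measure like the one described above can be sketched as a simple sum over zones reachable within the time threshold. The zone names, travel times, and job counts below are hypothetical, and the function `jobs_accessible` is an illustration only; the study's travel times come from a regional transportation model, which is not reproduced here.

```python
def jobs_accessible(origin, travel_minutes, jobs_by_zone, threshold=30.0):
    """Sum the jobs in every zone reachable from `origin` within `threshold` minutes.

    travel_minutes maps (origin, destination) pairs to travel time in minutes;
    pairs missing from the mapping are treated as unreachable.
    """
    return sum(
        jobs
        for zone, jobs in jobs_by_zone.items()
        if travel_minutes.get((origin, zone), float("inf")) <= threshold
    )

# Hypothetical inputs for one station "A" and three destination zones
travel_minutes = {("A", "A"): 5, ("A", "B"): 22, ("A", "C"): 41}
jobs_by_zone = {"A": 1200, "B": 8000, "C": 15000}

print(jobs_accessible("A", travel_minutes, jobs_by_zone))  # 9200
```

Keeping the threshold as a parameter makes it easy to compare different cutoffs for the same station.
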
AM peak entries model
Objective: estimate the ridership increase from adding X households (HHs)
Regression:
- Related a change in households to a change in ridership
- Added a term indicating the CBD to the model to improve accuracy

AM peak entries model: results
Independent Variable          Coefficient   P-value
Households000050miles                0.14     0.010
HHsXJobsAccessRailvHighway           0.25     0.000
MedianHHIncome                      0.005     0.002
HHsXGoodService                      0.16     0.000
IntersectionH                       -3.73     0.180
DowntownCore*                     -279.73     0.066
Constant                           136.86     0.252
R-squared: 0.84
* Note: After testing different samples, DowntownCore includes: Dupont Circle S, Farragut North S, Metro Center, Farragut West, Foggy Bottom, McPherson Square, Federal Triangle, Smithsonian S

Predicted AM Entries per HH
Important variables:
- Households
- (Households) x (Job accessibility)
- (Households) x (Good service)
- Median household income
- Downtown core
Note: interaction effects

PM peak entries model
Objective: estimate the ridership increase from adding X jobs
Regression:
- A statistical issue persisted due to the large variance in both ridership and jobs.
- Station grouping helped to solve this issue.
Station groups:
- Group 1 (orange), Downtown Stations: in the downtown core
- Group 2 (green), Low Job Stations: jobs < 2,500
- Group 3 (red), Mixed Job Suburban Stations: other stations

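The grouping rule above can be written as a small classifier. This is a sketch only: the function name `station_group` is hypothetical, and checking the downtown-core flag before the job threshold is an assumption about precedence that the slide does not spell out.

```python
def station_group(jobs_half_mile, in_downtown_core):
    """Assign a station to one of the three PM-model groups.

    Group 1: downtown core; Group 2: fewer than 2,500 jobs in the walk shed;
    Group 3: all other stations.
    """
    if in_downtown_core:          # assumed to take precedence over the job count
        return 1
    if jobs_half_mile < 2500:
        return 2
    return 3

# Hypothetical stations: (jobs within a half-mile, in downtown core?)
print(station_group(60000, True))    # 1
print(station_group(1800, False))    # 2
print(station_group(12000, False))   # 3
```
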
Grouping of Stations
[Map: stations colored by group]
- Group 1 (orange), Downtown Stations: in the downtown core
- Group 2 (green), Low Job Stations: jobs < 2500
- Group 3 (red), Mixed Job Suburban Stations: others

PM peak entries model (3 groups)
Independent Variable             Group 1     Group 2     Group 3
                                 Downtown    Low Job     Mixed Job
                                 Stations    Stations    Suburban Stations
Jobs in half-mile walk sheds     0.22        0.34        0.14
Trains per hour                  NS          NS          52.38
(Jobs) x (Miles from downtown)   NS          -0.02       -0.008
N                                21          35          24
R²                               0.77        0.53        0.89
Notes: the significance level is 0.05; NS means not statistically significant; for Group 2 and Group 3, the total effect of adding jobs is determined by both jobs and miles from downtown.

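The note on the total effect of adding jobs can be made concrete: with a jobs term and a (jobs) x (miles) interaction, the change in PM peak entries per added job is the jobs coefficient plus the interaction coefficient times miles from downtown. The sketch below reads the interaction coefficients as -0.02 for Group 2 and -0.008 for Group 3, and treats NS terms as zero, which is an assumption. The mileages in the example are hypothetical, as are the names `PM_COEF`, `pm_entries_per_new_job`, and `added_pm_entries`.

```python
# Coefficients from the PM peak entries table; NS terms set to 0.0 (assumption)
PM_COEF = {
    1: {"jobs": 0.22, "jobs_x_miles": 0.0},
    2: {"jobs": 0.34, "jobs_x_miles": -0.02},
    3: {"jobs": 0.14, "jobs_x_miles": -0.008},
}

def pm_entries_per_new_job(group, miles_from_downtown):
    """Marginal PM peak entries per added job: b_jobs + b_interaction * miles."""
    c = PM_COEF[group]
    return c["jobs"] + c["jobs_x_miles"] * miles_from_downtown

def added_pm_entries(group, new_jobs, miles_from_downtown):
    """PM peak entries implied by adding `new_jobs` jobs to a station's walk shed."""
    return new_jobs * pm_entries_per_new_job(group, miles_from_downtown)

# Hypothetical Group 3 station 5 miles from downtown
print(f"{pm_entries_per_new_job(3, 5.0):.2f}")    # 0.10
print(f"{added_pm_entries(3, 1000, 5.0):.0f}")    # 100
```

Under this reading, the per-job effect shrinks with distance from downtown for Groups 2 and 3 and is constant for Group 1.
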
Predicted PM Peak Entries per New Job in the Walk Shed
[Map: PM entries per job]

Example Results and Trade-offs
Imagine 274 new HHs near Metro:
- At Rhode Island Avenue: 144 new entries/day
- At New Carrollton: 52 new entries/day
Imagine 10,000 new 9-5 jobs near Metro:
- At NoMa: 3,550 new entries/day
- At Friendship Hts: 1,341 new entries/day
- At Greenbelt: 881 new entries/day

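To compare these scenarios on a common footing, the figures can be converted to new entries per added household or job. The sketch below only divides the numbers quoted in the example; the helper name `entries_per_unit` is hypothetical.

```python
def entries_per_unit(added_units, new_entries):
    """New daily entries generated per added household or job."""
    return new_entries / added_units

# (station, unit added, number added, new entries/day), taken from the example
scenarios = [
    ("Rhode Island Avenue", "household", 274, 144),
    ("New Carrollton", "household", 274, 52),
    ("NoMa", "job", 10_000, 3_550),
    ("Friendship Hts", "job", 10_000, 1_341),
    ("Greenbelt", "job", 10_000, 881),
]

for station, unit, added, entries in scenarios:
    rate = entries_per_unit(added, entries)
    print(f"{station}: {rate:.3f} new entries/day per added {unit}")
```

The same 274 households yield roughly 0.53 entries per household at Rhode Island Avenue and roughly 0.19 at New Carrollton, which is the trade-off the example illustrates.
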
Motivations
- In a typical linear regression model, the coefficients of the independent variables are treated as constant across observations (global estimates).
- GWR allows the estimated coefficients to vary among observations and can provide local estimates of transit ridership.
- Not all TODs are alike: certain predictors play more important roles than others at different Metro stations.
- Specific results can be used as evidence to support local policies and decision-making.

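The contrast between global and local estimates can be sketched with a minimal GWR: for each station, fit a weighted least squares regression in which nearby stations get more weight. The version below assumes a fixed Gaussian kernel and a hand-picked bandwidth; dedicated GWR software typically selects the bandwidth by a criterion such as cross-validation or AICc. The data are synthetic, and `gwr_coefficients` is a hypothetical helper, not the authors' implementation.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Return one row of local coefficients (intercept first) per observation."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])   # add intercept column
    betas = np.empty((len(y), design.shape[1]))
    for i, point in enumerate(coords):
        dist = np.linalg.norm(coords - point, axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)   # Gaussian kernel
        root_w = np.sqrt(weights)
        # Weighted least squares via row scaling by the square root of the weights
        local_beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
        betas[i] = local_beta
    return betas

# Synthetic stations whose per-household effect grows from west to east (assumption)
rng = np.random.default_rng(42)
coords = rng.uniform(0, 10, size=(60, 2))
households = rng.uniform(500, 5000, size=60)
entries = 50 + (0.10 + 0.02 * coords[:, 0]) * households

local = gwr_coefficients(coords, households, entries, bandwidth=2.0)
west, east = np.argmin(coords[:, 0]), np.argmax(coords[:, 0])
print(local[west, 1], local[east, 1])   # local household coefficients differ
```

With a very large bandwidth every station receives nearly equal weight, so the local coefficients collapse to the single global OLS estimate.
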
Coefficients of the number of households
Coefficients of median income
Coefficients of the number of jobs
Predicted PM entries
AM peak entries model: OLS vs. GWR
                        OLS        GWR
R²                      0.837      0.907
Adjusted R²             0.824      0.845
AIC                     1241.16    1262.951
Number of parameters    7          42
Sigma (σ)               340.1      344.924

Category, variables, description, and data source
Dependent variable
- Walk_Bike_Entries, Walk_Bike_Exits: average weekday ridership for September 2013, by AM peak period entries (WMATA, 2013)
Transit service
- Parking 2013: number of daily parking transactions at WMATA-owned parking facilities at the station, from our fare system (WMATA 2013)
- AM peak headway: number of trains in both directions in the AM peak (WMATA 2013)
Built environment
- Transit connectivity index*: composite index based on graph theory, including transit routes, coverage, speed, capacity, urban form, etc. (NCSG 2010)
- Households in half-mile walk sheds: number of households within a half-mile catchment area of stations (WMATA 2012)
- Jobs in half-mile walk sheds: number of jobs within a half-mile catchment area of stations (WMATA 2012)
- Intersections in half-mile walk sheds: number of intersections within a half-mile catchment area around stations (WMATA 2012)
- Miles from the downtown core: distance of each station to downtown (WMATA 2012)
- Job accessibility by auto*: number of jobs that can be accessed within 30 minutes by auto, using the Maryland State Transportation Model (MSTM)
- Job accessibility by transit*: number of jobs that can be accessed within 45 minutes by transit, using the MSTM
- Industry diversity*: entropy index of industry diversity within a half-mile catchment area of stations
- Location dummies*: downtown, municipalities, Red Line
  Data sources listed for these four variables: NCSG 2012, WMATA 2012
- Walk Score: https://www.walkscore.com/ (Walk Score 2013)
Sociodemographics
- Median household income: median household income within a half-mile catchment area of stations (WMATA 2012)

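The variable list describes industry diversity as an entropy index. One common form is Shannon entropy over industry job shares, divided by the log of the number of categories so the index runs from 0 (one industry) to 1 (even mix). Whether the study normalizes this way is not stated, so the normalization, the function name `entropy_index`, and the job counts below are assumptions.

```python
import math

def entropy_index(jobs_by_industry):
    """Normalized Shannon entropy of job counts across industry categories.

    Returns 0.0 when all jobs fall in one category and 1.0 for an even split.
    """
    k = len(jobs_by_industry)
    total = sum(jobs_by_industry)
    if k < 2 or total == 0:
        return 0.0
    shares = [c / total for c in jobs_by_industry if c > 0]
    entropy = sum(p * math.log(1.0 / p) for p in shares)
    return entropy / math.log(k)   # normalization by ln(k) is an assumption

# Hypothetical walk sheds with four industry categories
print(entropy_index([400, 0, 0, 0]))                   # 0.0
print(round(entropy_index([100, 100, 100, 100]), 6))   # 1.0
print(round(entropy_index([700, 100, 100, 100]), 3))
```
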
Appendix
Where did you come from just before entering the Metro station? Walk only
Land Use and Ridership: Other Lessons We Learned
- Peak is all about jobs and households, and accessibility
- Off-peak is all about certain kinds of jobs
- Accessibility is significant in nearly all models
  - If you can reach more stuff via transit from a station, ridership generation is stronger.
- Median income usually helps as a control variable
  - Positive on peak, negative at midday, no effect in evening
- Walk Score and most built environment variables phase out in the face of land use data
- Parking phases out; it overlaps with job access
- No value in differentiating private vs. federal jobs
- Nearby jobs and neighborhoods are a small but robust source of off-peak ridership

Using Jobs by Type as Proxy for Activity Schedule
Job types, with included NAICS codes and excluded NAICS codes
9-to-5ers
- Govt: 92, excluding 922120, 922140, 922160, 922190
- Office: 51-56, excluding 5616
- Nonprofits: 813
- Manufacturing: 33 (near Metrorail, NAICS2 33 is mostly HQs for Boeing, Lockheed Martin, Xerox, etc., not manufacturing plants)
Shift Workers and Midday Appointments
- All Medical: 62
- Jails: 922140
- Hotels: 7211, 7213
- Police/Fire: 922120, 922160, 922190
- Security Services: 5616
Nights and Weekends
- Restaurants, Bars: 72, excluding 7211, 7213 (hotels)
- Retail: 44, 45
- Concert Halls, Arenas: 71, excluding 7121, 7131
Schools
- Schools: 61
Midday + Weekend Attractors
- Museums, Zoos, Parks: 7121, 7131

Did You Look At?
- Rule of regression: no collinearity.
- If two explanatory variables are similar across stations, you can only use one (e.g., housing density and 4-way intersections).
- So, choose the set of variables that best predicts ridership, as proxies for other effects.

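One common way to screen for the collinearity described above is the variance inflation factor (VIF): regress each explanatory variable on the others and compute 1 / (1 - R²). The deck does not say which diagnostic was used, so this is an illustrative choice. The data are synthetic, built so that housing density and intersections move together, and `variance_inflation_factors` is a hypothetical helper.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X: 1 / (1 - R^2) from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        vifs.append(1.0 / max(1.0 - r2, 1e-12))   # guard against division by zero
    return vifs

# Synthetic stations (assumption): intersections track housing density closely
rng = np.random.default_rng(1)
density = rng.normal(size=80)
intersections = 2.0 * density + 0.2 * rng.normal(size=80)
income = rng.normal(size=80)

vifs = variance_inflation_factors(np.column_stack([density, intersections, income]))
for name, vif in zip(["density", "intersections", "income"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

A large VIF for two variables that track each other is the signal to keep only one of them in the model.
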
Work Relationship
NCSG provided:
- An additional set of variables for analysis
- Technical reports and consultation
- STATA program files and GIS maps
- Base Excel files to predict ridership by station
NCSG closely communicated with Justin regarding:
- Specific requirements: regression specification
- Discussion of problems in regression analysis
- Discussion of intermediate results and further steps to explore
- Translation of regression analysis results into ridership prediction

AM peak exits model
[Map: station groups 1, 2, and 3]
- Group 1 (orange), Downtown Core: in the downtown core
- Group 2 (green), Low Job Stations: jobshalf < 2500
- Group 3 (red), Mixed Job Suburbs: others

Group 2: Jobs < 2500