Inferring land use from mobile phone activity

Similar documents
Chapter 5 FUNCTIONAL CLASSIFICATION

Improving the Accuracy and Reliability of ACS Estimates for Non-Standard Geographies Used in Local Decision Making

Acknowledgements. Ms. Linda Banister Ms. Tracy With Mr. Hassan Shaheen Mr. Scott Johnston

Rationale. Previous research. Social Network Theory. Main gap 4/13/2011. Main approaches (in offline studies)

DRAFT 5/31/2016. The Value of Self-Reported Frequently Visited Addresses in GPS Assisted Travel Surveys Timothy Michalowski Dara Seidl

Understanding the Pattern of Work Travel in India using the Census Data

SOUNDCAST CALIBRATION AND SENSITIVITY TEST RESULTS (DRAFT) TABLE OF CONTENTS. Puget Sound Regional Council. Suzanne Childress.

Comparison of urban energy use and carbon emission in Tokyo, Beijing, Seoul and Shanghai

PREDICTING the outcomes of sporting events

VGI for mapping change in bike ridership

Section I: Multiple Choice Select the best answer for each problem.

A Framework For Integrating Pedestrians into Travel Demand Models

RIDERSHIP PREDICTION

Designing a Bicycle and Pedestrian Count Program in Blacksburg, VA

SoundCast Design Intro

Chapter 12 Practice Test

STRAVA - DATA ACCESS AND USES. March 9, 2016 Shaun Davis + Dewayne Carver Florida Dept. of Transportation

Real-Time Electricity Pricing

Webinar: Development of a Pedestrian Demand Estimation Tool

Spatial Patterns / relationships. Model / Predict

Environmental Science: An Indian Journal

MODELING THE ACTIVITY BASED TRAVEL PATTERN OF WORKERS OF AN INDIAN METROPOLITAN CITY: CASE STUDY OF KOLKATA

CSC242: Intro to AI. Lecture 21

Kevin Proft NRS 509 Final Project: Written Overview & Annotated Bibliography GIS Applications in Active Transportation Planning

JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts

Evaluating and Classifying NBA Free Agents

Active Travel and Exposure to Air Pollution: Implications for Transportation and Land Use Planning

Understanding Transit Demand. E. Beimborn, University of Wisconsin-Milwaukee

NACTO Designing Cities Conference Project Evaluation: Tools for Measuring Success and Building Support. October 29, 2015

Road Congestion Measures Using Instantaneous Information From the Canadian Vehicle Use Study (CVUS)

Planning Daily Work Trip under Congested Abuja Keffi Road Corridor

Forecasting High Speed Rail Ridership Using Aggregate Data:

Bike Counter Correlation

BASKETBALL PREDICTION ANALYSIS OF MARCH MADNESS GAMES CHRIS TSENG YIBO WANG

Preliminary Exploration of Pedestrian Destinations using Traces from WiFi Infrastructures

GIS Based Data Collection / Network Planning On a City Scale. Healthy Communities Active Transportation Workshop, Cleveland, Ohio May 10, 2011

Briefing Paper #1. An Overview of Regional Demand and Mode Share


Transport Poverty in Scotland. August 2016

Deconstructing Data Science

Exploring the relationship between Strava cyclists and all cyclists*.

Analyzing Gainesville s Bicycle Infrastructure

The relationship between sea level and bottom pressure in an eddy permitting ocean model

2017 Northwest Arkansas Trail Usage Monitoring Report

Congestion and Safety: A Spatial Analysis of London

A computer program that improves its performance at some task through experience.

Pedestrian Demand Modeling: Evaluating Pedestrian Risk Exposures

2011 Origin-Destination Survey Bicycle Profile

Summary of NWA Trail Usage Report November 2, 2015

INTRODUCTION TO PATTERN RECOGNITION

A MULTI-MODAL PUBLIC TRANSPORT SOLUTION FOR MALE, MALDIVES

GIS Based Non-Motorized Transportation Planning APA Ohio Statewide Planning Conference. GIS Assisted Non-Motorized Transportation Planning

National Bicycle and Pedestrian Documentation Project INSTRUCTIONS

Predicting NBA Shots

1999 On-Board Sacramento Regional Transit District Survey

Discriminative Feature Selection for Uncertain Graph Classification

Thresholds and Impacts of Walkable Distance for Active School Transportation in Different Contexts

Pocatello Regional Transit Master Transit Plan Draft Recommendations

Fixed Guideway Transit Outcomes on Rents, Jobs, and People and Housing

Brazil Baseline and Mitigation Scenarios

Journal of Emerging Trends in Computing and Information Sciences

Appendix C. Corridor Spacing Research

Dynamic Cluster-Based Over-Demand Prediction in Bike Sharing Systems

Modeling Driving Behavior at Roundabouts: Impact of Roundabout Layout and Surrounding Traffic on Driving Behavior

A N E X P L O R AT I O N W I T H N E W Y O R K C I T Y TA X I D ATA S E T

Webinar: The Association Between Light Rail Transit, Streetcars and Bus Rapid Transit on Jobs, People and Rents

Validation of 12.5 km Resolution Coastal Winds. Barry Vanhoff, COAS/OSU Funding by NASA/NOAA

PREDICTING THE NCAA BASKETBALL TOURNAMENT WITH MACHINE LEARNING. The Ringer/Getty Images

Mobility Detection Using Everyday GSM Traces

Resource Allocation for Malaria Prevention

Lecture 5. Optimisation. Regularisation

Deconstructing Data Science

Rolling Out Measures of Non-Motorized Accessibility: What Can We Now Say? Kevin J. Krizek University of Colorado

Analyzing Spatial Patterns, Statistics Based on FAST Act Safety Performance Measures

Using Spatio-Temporal Data To Create A Shot Probability Model

Bicycle Demand Forecasting for Bloomingdale Trail in Chicago

Travel Behavior of Baby Boomers in Suburban Age Restricted Communities

Feasibility Analysis of China s Traffic Congestion Charge Legislation

Temporal and Spatial Variation in Non-motorized Traffic in Minneapolis: Some Preliminary Analyses

Name May 3, 2007 Math Probability and Statistics

Frequently asked questions about how the Transport Walkability Index was calculated are answered below.

Projecting Three-Point Percentages for the NBA Draft

Urban planners have invested a lot of energy in the idea of transit-oriented

R&D Initiatives for Sustainable Mobility

Analysis On Night-time Public Transportation Access In Seoul: How Do People Travel At Night In Seoul Using Taxies?

ABSTRACT PEDESTRIAN - VEHICULAR CRASHES: THE INFLUENCE OF PERSONAL AND ENVIRONMENTAL FACTORS. Carolina V. Burnier, M.Sc., 2005

Exploring Factors Affecting Metrorail Ridership in Washington D.C.

Transport Poverty in Scotland. August 2016

Travel Characteristics on Weekends:

Station heterogeneity and the dynamics of retail gasoline prices

March 6, 2013 Tony Giarrusso, Rama Sivakumar Center for GIS, Georgia Institute of Technology

DEVELOPMENT OF A SET OF TRIP GENERATION MODELS FOR TRAVEL DEMAND ESTIMATION IN THE COLOMBO METROPOLITAN REGION

2014 QUICK FACTS ILLINOIS CRASH INFORMATION. Illinois Emergency Medical Services for Children February 2016 Edition

2012 QUICK FACTS ILLINOIS CRASH INFORMATION. Illinois Emergency Medical Services for Children September 2014 Edition

Market Factors and Demand Analysis. World Bank

CITI BIKE S NETWORK, ACCESSIBILITY AND PROSPECTS FOR EXPANSION. Is New York City s Premier Bike Sharing Program Accesssible to All?

IVR Caller Experience Metrics. World IA Day 2013 March 2

February Funded by NIEHS Grant #P50ES RAND Center for Population Health and Health Disparities

An Analysis of Central Business District Pedestrian Circulation Patterns

Presentation Summary Why Use GIS for Ped Planning? What Tools are Most Useful? How Can They be Applied? Pedestrian GIS Tools What are they good for?

Transcription:

Inferring land use from mobile phone activity Jameson L. Toole (MIT) Michael Ulm (AIT) Dietmar Bauer (AIT) Marta C. Gonzalez (MIT) UrbComp 2012 Beijing, China 1

The Big Questions Can land use be predicted from mobile phone activity? Can mobile phone data help measure temporal patterns of land use? Are these patterns robust across cities? 2

Sensing Urban Systems Data Scarcity Methods Outputs Abundance Traditional Data: Zoning Maps Novel Data: Mobile phone activity Statistical Physics Machine learning Geographic Boundaries (Ratti et al 2010) Intra-city Mobility (Cho et al 2011) Inter-city Mobility (Gonzalez et al 2008) (Wang et al 2009) (Simini 2012) Unsupervised Classification (Reades 2009) (Calabrese 2010) (Soto 2011)(Zheng 2012) 3

Land use and Travel Behavior travel demand Land-use supply Temporal Component opening hours of shops and services available time Individual Component income, gender, education level Travel supply Demand available time for activities vehicle ownership, etc. Demand where a person wants to go when a person can travel a persons ability to travel a persons desire to travel Static GIS data + = New micro level data from mobile phones Dynamic Land Use adapted from: K.T. Geurs, B. van Wee / Journal of Transport Geography 12 (2004) 127 140 4

Data Zoning Data GIS shapefiles at the parcel level Provided by MASSDOT Aggregated to 5 zoning classification: Residential, Commercial, Industrial, Parks, Other Mobile Phone ~600,000 Users in the Boston Metropolitan Area (Pop: ~3 million, Mkt Share: 30%-50%) 1 month of Call Detail Records (CDR) for voice calls and text messages. Each event geo-tagged with (Lat, Lon) triangulated from nearby towers Grid Partition the region into 200m X 200m cells Each cell is labeled with the most prevalent zoning type within Grid provides privacy All mobile phone events are aggregated hourly into an average week 5

6

ZONING DATA 40km Parks Other CBD 7

Zoned use & Dynamic Population Results: Absolute Activity Downtown Boston is mostly zoned as OTHER thus more activity is related to more people. Abs. Act. 120 100 80 60 40 20 Parks Other 8

Zoned use & Dynamic Population Results: Normalized Activity a norm ij (t) = aabs ij (t) µ a abs ij σ a abs ij 1 0 Parks Other Similarities are related to how individuals use mobile phones, not how they move. 9

Zoned use & Dynamic Population Results: Residual Activity a res ij (t) =a norm (t) ā ij norm (t) Inverse relationship between RESIDENTIAL and COMMERCIAL Hour Residential Industrial Parks Other Weekends are quantitatively different than weekdays. 10

Zoned use & Dynamic Population static population Abs. Act. 120 100 80 60 40 20 phone use 1 0 movement 0.2 0 Hour Parks Other 11

Absolute Activity Results: Low Activity High Activity 12

Inferring Land Use Methods - Supervised Learning: Random Decision Tree Θ 1 1 <y 1 h(x, Θ k ) = c Θ 1 2 <y 2 Θ 1 3 <y 3 c 1 c 2 c 3 c 4 13

Inferring Land Use Methods - Supervised Learning: Random Forest Random Forest: (Breiman 2001) 1.1. A random forest i {h(x, k ), k = 1,...} ectors and each tree casts x is a T 1vector Θ k is a m 1vector,(m<T) Θ 1 1 <y 1...... Θ k 1 <z 1 Θ 1 2 <y 2 Θ 1 3 <y 3 Θ k 2 <z 2 Θ k 3 <z 3 c 1 c 2 c 3 c 4 c 1 c 2 c 3 c 4 14

Inferring Land Use Methods - Supervised Learning: Random Forest Random Forest: (Breiman 2001) 1.1. A random forest i {h(x, k ), k = 1,...} ectors and each tree casts x is a T 1vector Θ k is a m 1vector,(m<T) Θ 1 1 <y 1...... Θ k 1 <z 1 Θ 1 2 <y 2 Θ 1 3 <y 3 Θ k 2 <z 2 Θ k 3 <z 3 c 1 c 2 c 3 c 4 c 1 c 3 c 4 c 2 {h(x, k ) } =...... } {c2c2 c3 c2 c1 c2 c1 c4 ectors and c ^ = mode( {h(x, k ) }) perf = N ^ i I(c i = j) N ectors and 15

Results: Inferring Land Use Confusion Matrix Thresh Total Accuracy Weights Res Com Ind Prk Oth 500 0.538 ( 0.60 0.10 0.10 0.10 0.10) 0.62 0.21 0.15 0.01 0.01 0.30 0.48 0.19 0.00 0.02 0.33 0.27 0.38 0.00 0.02 Res Com Ind Prk Oth 0.52 0.26 0.18 0.02 0.02 Share: 0.74 0.09 0.08 0.04 0.05 0.37 0.28 0.25 0.00 0.10 Res Com Ind Prk Oth 16

Inferring Land Use Results: Confusion Matrix Thresh Total Accuracy Weights Res Com Ind Prk Oth 500 0.538 ( 0.60 0.10 0.10 0.10 0.10) 0.62 0.21 0.15 0.01 0.01 0.30 0.48 0.19 0.00 0.02 0.33 0.27 0.38 0.00 0.02 Res Com Ind Prk Oth 0.52 0.26 0.18 0.02 0.02 Share: 0.74 0.09 0.08 0.04 0.05 0.37 0.28 0.25 0.00 0.10 17 + + + + = 1 Res Com Ind Prk Oth

Results: Inferring Land Use Confusion Matrix Thresh Total Accuracy Weights Res Prk Ind Con Oth 500 0.396 ( N/A 0.30 0.30 0.20 0.20) N/A N/A N/A N/A N/A N/A 0.50 0.19 0.11 0.19 N/A 0.27 0.37 0.12 0.24 Res Com Ind Prk Oth N/A 0.31 0.18 0.29 0.21 Share: 0.00 0.33 0.31 0.16 0.20 N/A 0.26 0.24 0.15 0.34 18

Inferring Land Use Classification Group I Group II Group III Error Analysis Cells correctly predicted to be a Cells of a given use incorrectly Cells of a different use incorrectly predicted to be a given Residential I II III Commercial I II III 19

Robustness Across Cities Cross Tabulation Thresh Avg. Error Cutoffs Res Com Ind Par Oth 100 0.468 ( 0.50 0.12 0.12 0.12 0.12) 0.66 0.09 0.08 0.05 0.12 0.53 0.15 0.10 0.05 0.16 0.52 0.10 0.19 0.05 0.14 Res Com Ind Prk Oth 0.59 0.10 0.11 0.06 0.13 Share: 0.63 0.09 0.09 0.09 0.10 0.55 0.12 0.10 0.06 0.17 20

Robustness Across Cities 80 City wide Time Series Chicago LBS Data Abs. Act. 60 40 20 Norm. Act. 0.5 0 0.5 Res. Act. 0.1 0.05 0 0.05 0.1 Hour 21

Inferring Land Use Summary: Introduced a way to reconcile and normalize traditional and new data sources Used temporal activity patterns to identify land uses Performed error analysis to determine why classifications failed Similar results obtained for different cities Implications: Mobile phone activity can be radically different depending on zoned land use. Traditional zoning maps should be augmented to account for real-time use. Dynamic land use patterns seem relatively consistent despite differences in history and regulatory classification 22

Future Contributions Use more sophisticated feature spaces for zoning classification. Introduce new data sources. Measure urban system dynamics in multiple cities around the globe to identify similarities and differences in behavior. Apply normalization metrics to other data sources. Combine our understanding of temporal land use with measurements of human mobility to develop better models of intra-city travel behavior. 23

Thank You! Questions? 24

Errors 25

Correlation with Static Population 26

Group I Group II Group III Classification Error Analysis Cells correctly predicted to be a Cells of a given use incorrectly Cells of a different use incorrectly predicted to be a given Residential I II III Commercial Industrial I II III I II III Parks Other I II III I II III 27