Deconstructing Data Science

1 Deconstructing Data Science David Bamman, UC Berkeley Info 290 Lecture 4: Regression overview Jan 26, 2017

2 Regression A mapping from input data x (drawn from instance space $\mathcal{X}$) to a point y in $\mathbb{R}$ ($\mathbb{R}$ = the set of real numbers). Example: x = the Empire State Building, y = [value not preserved]

3 task | x | y
predicting box office revenue | movie | total box office
predicting stock movements | $TWTR | price at time t+1
predicting vote share | Clinton | 47%

4 Regression Supervised learning: given training data in the form of <x, y> pairs, learn ĥ(x)

5 Regression Can you create (or find) labeled data that marks that value for a bunch of examples? Can you make that choice? Can you create features that might help in distinguishing those classes?

6 Experiment design
| training | development | testing
size | 80% | 10% | 10%
purpose | training models | model selection | evaluation; never look at it until the very end
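A minimal sketch of such a split, assuming the data is a list of (x, y) pairs that can be shuffled freely. The function name `split_dataset`, its default fractions, and the toy data are illustrative assumptions, not part of the lecture.

```python
import random

def split_dataset(examples, train_frac=0.8, dev_frac=0.1, seed=0):
    """Shuffle once, then cut into training / development / test portions."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]                 # used to fit models
    dev = shuffled[n_train:n_train + n_dev]    # used for model selection
    test = shuffled[n_train + n_dev:]          # held out until the very end
    return train, dev, test

# Hypothetical data: 100 (x, y) pairs.
data = [(i, 2.0 * i) for i in range(100)]
train, dev, test = split_dataset(data)
```

Fixing the seed keeps the three portions stable across runs, so a model is never accidentally tuned on examples that later turn up in the test set.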

7 Metrics Measure the difference between the prediction ŷ and the true y. Mean squared error (MSE): $\frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)^2$. Mean absolute error (MAE): $\frac{1}{N}\sum_{i=1}^{N}|\hat{y}_i - y_i|$
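Both metrics can be computed directly from their definitions. The sketch below uses plain Python lists; the function names and the four example values are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between predictions and true values."""
    n = len(y_true)
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n

def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences between predictions and true values."""
    n = len(y_true)
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n

# Hypothetical true values and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 13.0]
mse = mean_squared_error(y_true, y_pred)   # (1 + 0 + 1 + 16) / 4
mae = mean_absolute_error(y_true, y_pred)  # (1 + 0 + 1 + 4) / 4
```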

8 [Table: columns ŷ, MAE, MSE; annotation: …% of total MAE, 98.6% of total MSE] MSE penalizes outliers more than MAE
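A small worked example shows why squaring concentrates the total error on an outlier. The five absolute errors below are invented for illustration and are not the values from the slide.

```latex
\begin{aligned}
|e| &= (1,\ 1,\ 1,\ 1,\ 10) \\
\text{outlier share of total absolute error} &= \frac{10}{1+1+1+1+10} = \frac{10}{14} \approx 71.4\% \\
\text{outlier share of total squared error} &= \frac{10^2}{1^2+1^2+1^2+1^2+10^2} = \frac{100}{104} \approx 96.2\%
\end{aligned}
```

The same single large error accounts for a much larger fraction of the squared total than of the absolute total.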

9 Linear regression $\hat{y} = \sum_{i=1}^{F} x_i \beta_i$, $\beta \in \mathbb{R}^F$ (F-dimensional vector of real numbers)
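A sketch of the linear model as a weighted sum of features, using NumPy. The slide does not say how β is learned; ordinary least squares is assumed here as one common choice, and the data and the names `predict` and `fit_least_squares` are hypothetical.

```python
import numpy as np

def predict(X, beta):
    """y_hat = sum_i x_i * beta_i, computed for every row of X."""
    return X @ beta

def fit_least_squares(X, y):
    """Choose beta to minimize squared error on the training data (an assumption)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical training data with F = 2 features, generated from known weights.
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0], [0.0, 4.0]])
true_beta = np.array([1.5, -0.5])
y = X @ true_beta

beta = fit_least_squares(X, y)
y_hat = predict(X, beta)
```

Because the toy targets are noise-free, the fit should recover the generating weights up to floating-point error.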

10 Polynomial regression $\hat{y} = \sum_{i=1}^{F} x_i \beta_{a,i} + \sum_{i=1}^{F} x_i^2 \beta_{b,i}$ [Plot: $x^2$] $\beta_a, \beta_b \in \mathbb{R}^F$ (F-dimensional vectors of real numbers)

11 Polynomial regression $\hat{y} = \sum_{i=1}^{F} x_i \beta_{a,i} + \sum_{i=1}^{F} x_i^2 \beta_{b,i} + \sum_{i=1}^{F} x_i^3 \beta_{c,i}$ [Plot: $x^3$] $\beta_a, \beta_b, \beta_c \in \mathbb{R}^F$ (F-dimensional vectors of real numbers)
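The polynomial model is still linear in its parameters: each power of the features gets its own weight vector. The sketch below, with made-up inputs and hypothetical helper names, shows the prediction computed two equivalent ways.

```python
import numpy as np

def polynomial_features(X, degree):
    """Stack X, X**2, ..., X**degree column-wise: F features become degree * F."""
    return np.hstack([X ** d for d in range(1, degree + 1)])

def predict_polynomial(X, betas):
    """betas[d - 1] is the weight vector applied to X**d (beta_a, beta_b, ...)."""
    return sum((X ** d) @ b for d, b in enumerate(betas, start=1))

X = np.array([[1.0, 2.0], [3.0, 4.0]])           # N = 2 examples, F = 2 features
beta_a = np.array([1.0, 0.0])                    # weights on x_i
beta_b = np.array([0.0, 0.5])                    # weights on x_i squared
y_hat = predict_polynomial(X, [beta_a, beta_b])  # x_1 + 0.5 * x_2**2
expanded = polynomial_features(X, degree=2)
```

Expanding the features first and then applying a single concatenated weight vector gives the same predictions, which is why a linear regression fitter can be reused on the expanded matrix.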

12 Nonlinear regression: deep learning; decision trees; probabilistic graphical models; random forests; support vector machines (regression); networks; neural networks

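13 Number of parameters. Order 1 (linear reg.): $\hat{y} = \sum_{i=1}^{F} x_i \beta_{a,i}$. Order 2: $\hat{y} = \sum_{i=1}^{F} x_i \beta_{a,i} + \sum_{i=1}^{F} x_i^2 \beta_{b,i}$. Order 3: $\hat{y} = \sum_{i=1}^{F} x_i \beta_{a,i} + \sum_{i=1}^{F} x_i^2 \beta_{b,i} + \sum_{i=1}^{F} x_i^3 \beta_{c,i}$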
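Counting the weights in these formulas makes the growth explicit: every additional order adds one more F-dimensional weight vector. The value F = 1000 is a hypothetical illustration.

```latex
\begin{aligned}
\text{order 1:}\quad & \beta_a \in \mathbb{R}^F && \Rightarrow\ F \text{ parameters} \\
\text{order 2:}\quad & \beta_a, \beta_b \in \mathbb{R}^F && \Rightarrow\ 2F \text{ parameters} \\
\text{order 3:}\quad & \beta_a, \beta_b, \beta_c \in \mathbb{R}^F && \Rightarrow\ 3F \text{ parameters} \\
\text{e.g. } F = 1000:\quad & 1000,\ 2000,\ 3000
\end{aligned}
```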

15 [Figure: instance space with labeled data]

16 degree 1, training MSE = 73.4

17 degree 2, training MSE = 71.9

18 degree 3, training MSE = 60.9

19 degree 4, training MSE = 60.6

20 degree 5, training MSE = 59.1

21 degree 6, training MSE = 50.2

22 degree 7, training MSE = 49.6

23 degree 8, training MSE = 46.8

24 degree 9, training MSE = 41.2

25 degree 10, training MSE = 35.8

26 degree 11, training MSE = 21.1

27 degree 12, training MSE = 18.4

33 Overfitting: memorizing the nuances (and noise) of the training data in a way that prevents generalizing to unseen data
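One way to see overfitting is to fit polynomials of increasing degree and compare the error on the training data with the error on held-out development data. The sketch below is an illustration on synthetic data, not the lecture's experiment: the true function, the noise level, and all names are assumptions, and the design matrix includes a constant column for convenience.

```python
import numpy as np

def design_matrix(x, degree):
    """Columns 1, x, x**2, ..., x**degree for a single input feature x."""
    return np.vander(x, degree + 1, increasing=True)

def fit(x, y, degree):
    """Least-squares fit of a polynomial of the given degree."""
    beta, *_ = np.linalg.lstsq(design_matrix(x, degree), y, rcond=None)
    return beta

def mse(x, y, beta):
    """Mean squared error of the fitted polynomial on (x, y)."""
    degree = len(beta) - 1
    residual = design_matrix(x, degree) @ beta - y
    return float(np.mean(residual ** 2))

# Synthetic data: a linear signal plus noise (all values are hypothetical).
rng = np.random.default_rng(0)
x_train = np.linspace(-1.0, 1.0, 20)
x_dev = np.linspace(-0.95, 0.95, 20)
true_f = lambda x: 2.0 * x + 1.0
y_train = true_f(x_train) + rng.normal(scale=0.5, size=x_train.size)
y_dev = true_f(x_dev) + rng.normal(scale=0.5, size=x_dev.size)

degrees = list(range(1, 10))
train_mse = [mse(x_train, y_train, fit(x_train, y_train, d)) for d in degrees]
dev_mse = [mse(x_dev, y_dev, fit(x_train, y_train, d)) for d in degrees]
best_degree = degrees[int(np.argmin(dev_mse))]  # model selection on the dev set
```

Training error can only go down as the degree grows, because each higher-degree model contains the lower-degree ones. The development error carries no such guarantee, which is why it is the one used to choose the degree.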

34 Sources of error. Bias: error due to mis-specifying the relationship between the input and the output [too few parameters, or the wrong kinds]. Variance: error due to sensitivity to random fluctuations in the training data. If you train on different data, do you get radically different predictions? [too many parameters]
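The variance question can be probed empirically: draw many training sets from the same process, refit each time, and see how much the prediction at one fixed input moves. The sketch below assumes a synthetic linear signal with Gaussian noise; the names, the query point, and the degrees compared are illustrative choices.

```python
import numpy as np

def fit_predict(x_train, y_train, degree, x_query):
    """Fit a polynomial of the given degree by least squares, predict at x_query."""
    A = np.vander(x_train, degree + 1, increasing=True)
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return float(sum(b * x_query ** k for k, b in enumerate(beta)))

rng = np.random.default_rng(0)
x_train = np.linspace(-1.0, 1.0, 20)
x_query = 0.9                          # hypothetical input to predict for
true_f = lambda x: 2.0 * x + 1.0       # hypothetical true relationship

def prediction_variance(degree, n_trials=200):
    """Variance of the prediction at x_query across freshly drawn training sets."""
    preds = []
    for _ in range(n_trials):
        y_train = true_f(x_train) + rng.normal(scale=0.5, size=x_train.size)
        preds.append(fit_predict(x_train, y_train, degree, x_query))
    return float(np.var(preds))

var_low = prediction_variance(degree=1)    # few parameters
var_high = prediction_variance(degree=9)   # many parameters
```

With many parameters the fitted curve chases the noise in each training set, so its prediction at the same input varies more from one training set to the next.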

35 [Figure: grid of low/high variance against low/high bias. Image from Flach 2012]

36 Example (geolocation on Twitter). High bias, low variance: always predict Berkeley. High bias, high variance: predict the most frequent city in the training data. Low bias, high variance: many features, some of which capture true signal but also capture random noise. Low bias, low variance: enough features to capture the true signal.

37 Ordinal regression: in between classification and regression. Y is categorical (e.g., [symbols not preserved]). Elements of Y are ordered ([symbols not preserved], joined by <).

38 Ordinal regression
task | x | Y
predicting star ratings | movie | {[rating symbols not preserved]}
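One simple way to handle ordered labels, sketched here as an assumption rather than the method taught in the lecture, is to map each label to its rank, predict a real-valued score, and snap that score back to the nearest valid label. The three-label set below is hypothetical, since the slide's symbols were not preserved.

```python
STAR_LABELS = ["1 star", "2 stars", "3 stars"]   # hypothetical ordered label set
RANK = {label: i for i, label in enumerate(STAR_LABELS)}

def to_rank(label):
    """Encode an ordered label as its position in the ordering."""
    return RANK[label]

def from_score(score):
    """Map a real-valued regression output to the nearest valid ordered label."""
    idx = min(max(int(round(score)), 0), len(STAR_LABELS) - 1)
    return STAR_LABELS[idx]

def ordinal_error(true_label, predicted_label):
    """Distance in rank: being off by two steps costs more than being off by one."""
    return abs(RANK[true_label] - RANK[predicted_label])
```

Unlike plain classification accuracy, `ordinal_error` distinguishes a near miss from a far miss, which is the property that separates ordinal regression from unordered classification.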

39 Computational Journalism. Sarah Cohen, James T. Hamilton, and Fred Turner, Computational Journalism, Communications of the ACM (2011). Sylvain Parasie, Data-Driven Revelation? Epistemological tensions in investigative journalism in the age of big data, Digital Journalism (2015)

40 Computational Journalism. Changing how stories are discovered, presented, aggregated, monetized and archived (Cohen et al. 2012). Draws on an earlier tradition of computer-assisted reporting and precision journalism (Meyer 1972)

41 Computational Journalism. Database linking, e.g.: voting records to the deceased; press releases from different members of Congress; indictments/settlements from U.S. attorneys; documents from SEC, Pentagon, defense contractors to note movement to industry (Cohen 2012); DSA database of safety status of CA public schools + US seismic zones + school list from CA Dept of […] (Parasie 2015)
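Database linking of this kind amounts to joining two record sets on shared identifying fields. The sketch below is a deliberately simple exact-match join after light normalization; real record linkage usually needs fuzzier matching. All records, field names, and function names are invented for illustration.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(text.lower().split())

def link_records(left, right, key_fields):
    """Return (left_row, right_row) pairs whose normalized key fields all match."""
    index = {}
    for row in right:
        key = tuple(normalize(row[f]) for f in key_fields)
        index.setdefault(key, []).append(row)
    matches = []
    for row in left:
        key = tuple(normalize(row[f]) for f in key_fields)
        for other in index.get(key, []):
            matches.append((row, other))
    return matches

# Hypothetical records: voting history and a death registry.
voters = [
    {"name": "Pat  Example", "birth_date": "1930-01-02", "voted": "2014-11-04"},
    {"name": "Sam Sample", "birth_date": "1975-06-30", "voted": "2014-11-04"},
]
deaths = [{"name": "pat example", "birth_date": "1930-01-02", "died": "2009-03-15"}]

# A lead is a vote recorded after the matched person's date of death.
leads = [(v, d) for v, d in link_records(voters, deaths, ["name", "birth_date"])
         if v["voted"] > d["died"]]
```

Dates are kept in ISO format so that plain string comparison orders them correctly. A match here is only a lead to check by hand, since two different people can share a name and a birth date.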

42 Computational Journalism. Information extraction: need to pull out people, places, organizations and their relationships from large (often sudden) dumps of documents. Analyzing the relationship between entities

43 Computational Journalism. Data-driven stories about large-scale trends: Relationship between birth year and political views, NY Times (Jul 7, 2014); Change in insured Americans under the ACA, NY Times (Oct 29, 2014)

44 Computational Journalism. Data-driven lead generation: the outliers in analysis that point to a story
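A rough sketch of outlier-driven lead generation: score each entity by how far it sits from the typical value and surface the extremes for a reporter to investigate. The robust z-score below (median and median absolute deviation, with the commonly used constants 0.6745 and 3.5) is one possible choice, not something specified in the lecture, and the district figures are made up.

```python
import statistics

def find_outliers(values_by_name, threshold=3.5):
    """Flag entries whose robust z-score (median/MAD based) exceeds threshold."""
    values = list(values_by_name.values())
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against; nothing is flagged
    flagged = []
    for name, value in values_by_name.items():
        robust_z = 0.6745 * (value - median) / mad
        if abs(robust_z) > threshold:
            flagged.append(name)
    return flagged

# Hypothetical figures: some per-district rate a reporter is examining.
rates = {"District A": 10, "District B": 12, "District C": 11, "District D": 9,
         "District E": 10, "District F": 11, "District G": 95}
leads = find_outliers(rates)
```

The median-based score is used because a single extreme value inflates the mean and standard deviation, which can hide the very outlier being sought.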

45 Computational Journalism. Demands: high precision; fast turnaround. Needs (Stray 2016): accurate document analysis; guided search; interactive methods

46 Project proposal, due 2/16. Collaborative project (involving up to 3 students), where the methods learned in class will be used to draw inferences about the world and critically assess the quality of those results. Proposal (2 pages): outline the work you're going to undertake; formulate a hypothesis to be examined; motivate its rationale as an interesting question worth asking; assess its potential to contribute new knowledge by situating it within related literature in the scientific community (cite 5 relevant sources); state who is on the team and what each of your responsibilities is (everyone gets the same grade)

Deconstructing Data Science

Deconstructing Data Science Deconstructing Data Science David Bamman, UC Berkele Info 29 Lecture 4: Regression overview Feb 1, 216 Regression A mapping from input data (drawn from instance space ) to a point in R (R = the set of

More information

Building an NFL performance metric

Building an NFL performance metric Building an NFL performance metric Seonghyun Paik (spaik1@stanford.edu) December 16, 2016 I. Introduction In current pro sports, many statistical methods are applied to evaluate player s performance and

More information

Predicting Horse Racing Results with Machine Learning

Predicting Horse Racing Results with Machine Learning Predicting Horse Racing Results with Machine Learning LYU 1703 LIU YIDE 1155062194 Supervisor: Professor Michael R. Lyu Outline Recap of last semester Object of this semester Data Preparation Set to sequence

More information

Projecting Three-Point Percentages for the NBA Draft

Projecting Three-Point Percentages for the NBA Draft Projecting Three-Point Percentages for the NBA Draft Hilary Sun hsun3@stanford.edu Jerold Yu jeroldyu@stanford.edu December 16, 2017 Roland Centeno rcenteno@stanford.edu 1 Introduction As NBA teams have

More information

BASKETBALL PREDICTION ANALYSIS OF MARCH MADNESS GAMES CHRIS TSENG YIBO WANG

BASKETBALL PREDICTION ANALYSIS OF MARCH MADNESS GAMES CHRIS TSENG YIBO WANG BASKETBALL PREDICTION ANALYSIS OF MARCH MADNESS GAMES CHRIS TSENG YIBO WANG GOAL OF PROJECT The goal is to predict the winners between college men s basketball teams competing in the 2018 (NCAA) s March

More information

Estimating the Probability of Winning an NFL Game Using Random Forests

Estimating the Probability of Winning an NFL Game Using Random Forests Estimating the Probability of Winning an NFL Game Using Random Forests Dale Zimmerman February 17, 2017 2 Brian Burke s NFL win probability metric May be found at www.advancednflstats.com, but the site

More information

Fun Neural Net Demo Site. CS 188: Artificial Intelligence. N-Layer Neural Network. Multi-class Softmax Σ >0? Deep Learning II

Fun Neural Net Demo Site. CS 188: Artificial Intelligence. N-Layer Neural Network. Multi-class Softmax Σ >0? Deep Learning II Fun Neural Net Demo Site CS 188: Artificial Intelligence Demo-site: http://playground.tensorflow.org/ Deep Learning II Instructors: Pieter Abbeel & Anca Dragan --- University of California, Berkeley [These

More information

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Wenbing Zhao. Department of Electrical and Computer Engineering

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Wenbing Zhao. Department of Electrical and Computer Engineering EEC 686/785 Modeling & Performance Evaluation of Computer Systems Lecture 6 Department of Electrical and Computer Engineering Cleveland State University wenbing@ieee.org Outline 2 Review of lecture 5 The

More information

Outline. Terminology. EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Steps in Capacity Planning and Management

Outline. Terminology. EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Steps in Capacity Planning and Management EEC 686/785 Modeling & Performance Evaluation of Computer Systems Lecture 6 Department of Electrical and Computer Engineering Cleveland State University wenbing@ieee.org Outline Review of lecture 5 The

More information

Introduction to Machine Learning NPFL 054

Introduction to Machine Learning NPFL 054 Introduction to Machine Learning NPFL 054 http://ufal.mff.cuni.cz/course/npfl054 Barbora Hladká hladka@ufal.mff.cuni.cz Martin Holub holub@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and

More information

PREDICTING the outcomes of sporting events

PREDICTING the outcomes of sporting events CS 229 FINAL PROJECT, AUTUMN 2014 1 Predicting National Basketball Association Winners Jasper Lin, Logan Short, and Vishnu Sundaresan Abstract We used National Basketball Associations box scores from 1991-1998

More information

A Novel Approach to Predicting the Results of NBA Matches

A Novel Approach to Predicting the Results of NBA Matches A Novel Approach to Predicting the Results of NBA Matches Omid Aryan Stanford University aryano@stanford.edu Ali Reza Sharafat Stanford University sharafat@stanford.edu Abstract The current paper presents

More information

Unit 4: Inference for numerical variables Lecture 3: ANOVA

Unit 4: Inference for numerical variables Lecture 3: ANOVA Unit 4: Inference for numerical variables Lecture 3: ANOVA Statistics 101 Thomas Leininger June 10, 2013 Announcements Announcements Proposals due tomorrow. Will be returned to you by Wednesday. You MUST

More information

PREDICTING THE NCAA BASKETBALL TOURNAMENT WITH MACHINE LEARNING. The Ringer/Getty Images

PREDICTING THE NCAA BASKETBALL TOURNAMENT WITH MACHINE LEARNING. The Ringer/Getty Images PREDICTING THE NCAA BASKETBALL TOURNAMENT WITH MACHINE LEARNING A N D R E W L E V A N D O S K I A N D J O N A T H A N L O B O The Ringer/Getty Images THE TOURNAMENT MARCH MADNESS 68 teams (4 play-in games)

More information

Title: 4-Way-Stop Wait-Time Prediction Group members (1): David Held

Title: 4-Way-Stop Wait-Time Prediction Group members (1): David Held Title: 4-Way-Stop Wait-Time Prediction Group members (1): David Held As part of my research in Sebastian Thrun's autonomous driving team, my goal is to predict the wait-time for a car at a 4-way intersection.

More information

Efficiency Wages in Major League Baseball Starting. Pitchers Greg Madonia

Efficiency Wages in Major League Baseball Starting. Pitchers Greg Madonia Efficiency Wages in Major League Baseball Starting Pitchers 1998-2001 Greg Madonia Statement of Problem Free agency has existed in Major League Baseball (MLB) since 1974. This is a mechanism that allows

More information

CSC242: Intro to AI. Lecture 21

CSC242: Intro to AI. Lecture 21 CSC242: Intro to AI Lecture 21 Quiz Stop Time: 2:15 Learning (from Examples) Learning Learning gives computers the ability to learn without being explicitly programmed (Samuel, 1959)... agents that can

More information

Evaluating and Classifying NBA Free Agents

Evaluating and Classifying NBA Free Agents Evaluating and Classifying NBA Free Agents Shanwei Yan In this project, I applied machine learning techniques to perform multiclass classification on free agents by using game statistics, which is useful

More information

Environmental Science: An Indian Journal

Environmental Science: An Indian Journal Environmental Science: An Indian Journal Research Vol 14 Iss 1 Flow Pattern and Liquid Holdup Prediction in Multiphase Flow by Machine Learning Approach Chandrasekaran S *, Kumar S Petroleum Engineering

More information

CS 221 PROJECT FINAL

CS 221 PROJECT FINAL CS 221 PROJECT FINAL STUART SY AND YUSHI HOMMA 1. INTRODUCTION OF TASK ESPN fantasy baseball is a common pastime for many Americans, which, coincidentally, defines a problem whose solution could potentially

More information

Influence of Forecasting Factors and Methods or Bullwhip Effect and Order Rate Variance Ratio in the Two Stage Supply Chain-A Case Study

Influence of Forecasting Factors and Methods or Bullwhip Effect and Order Rate Variance Ratio in the Two Stage Supply Chain-A Case Study International Journal of Engineering and Technical Research (IJETR) ISSN: 31-0869 (O) 454-4698 (P), Volume-4, Issue-1, January 016 Influence of Forecasting Factors and Methods or Bullwhip Effect and Order

More information

JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts

JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts Jan Kodovský, Jessica Fridrich May 16, 2012 / IH Conference 1 / 19 What is JPEG-compatibility steganalysis? Detects embedding

More information

Basketball data science

Basketball data science Basketball data science University of Brescia, Italy Vienna, April 13, 2018 paola.zuccolotto@unibs.it marica.manisera@unibs.it BDSports, a network of people interested in Sports Analytics http://bodai.unibs.it/bdsports/

More information

Predicting Season-Long Baseball Statistics. By: Brandon Liu and Bryan McLellan

Predicting Season-Long Baseball Statistics. By: Brandon Liu and Bryan McLellan Stanford CS 221 Predicting Season-Long Baseball Statistics By: Brandon Liu and Bryan McLellan Task Definition Though handwritten baseball scorecards have become obsolete, baseball is at its core a statistical

More information

Chapter 12 Practice Test

Chapter 12 Practice Test Chapter 12 Practice Test 1. Which of the following is not one of the conditions that must be satisfied in order to perform inference about the slope of a least-squares regression line? (a) For each value

More information

Legendre et al Appendices and Supplements, p. 1

Legendre et al Appendices and Supplements, p. 1 Legendre et al. 2010 Appendices and Supplements, p. 1 Appendices and Supplement to: Legendre, P., M. De Cáceres, and D. Borcard. 2010. Community surveys through space and time: testing the space-time interaction

More information

Announcements. % College graduate vs. % Hispanic in LA. % College educated vs. % Hispanic in LA. Problem Set 10 Due Wednesday.

Announcements. % College graduate vs. % Hispanic in LA. % College educated vs. % Hispanic in LA. Problem Set 10 Due Wednesday. Announcements Announcements UNIT 7: MULTIPLE LINEAR REGRESSION LECTURE 1: INTRODUCTION TO MLR STATISTICS 101 Problem Set 10 Due Wednesday Nicole Dalzell June 15, 2015 Statistics 101 (Nicole Dalzell) U7

More information

knn & Naïve Bayes Hongning Wang

knn & Naïve Bayes Hongning Wang knn & Naïve Bayes Hongning Wang CS@UVa Today s lecture Instance-based classifiers k nearest neighbors Non-parametric learning algorithm Model-based classifiers Naïve Bayes classifier A generative model

More information

Lecture 5. Optimisation. Regularisation

Lecture 5. Optimisation. Regularisation Lecture 5. Optimisation. Regularisation COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Andrey Kan Copyright: University of Melbourne Iterative optimisation Loss functions Coordinate

More information

Inferring land use from mobile phone activity

Inferring land use from mobile phone activity Inferring land use from mobile phone activity Jameson L. Toole (MIT) Michael Ulm (AIT) Dietmar Bauer (AIT) Marta C. Gonzalez (MIT) UrbComp 2012 Beijing, China 1 The Big Questions Can land use be predicted

More information

Universal Style Transfer via Feature Transforms

Universal Style Transfer via Feature Transforms Universal Style Transfer via Feature Transforms Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang UC Merced, Adobe Research, NVIDIA Research Presented: Dong Wang (Refer to slides by

More information

Introduction to Pattern Recognition

Introduction to Pattern Recognition Introduction to Pattern Recognition Jason Corso SUNY at Buffalo 12 January 2009 J. Corso (SUNY at Buffalo) Introduction to Pattern Recognition 12 January 2009 1 / 28 Pattern Recognition By Example Example:

More information

Predicting Horse Racing Results with TensorFlow

Predicting Horse Racing Results with TensorFlow Predicting Horse Racing Results with TensorFlow LYU 1703 LIU YIDE WANG ZUOYANG News CUHK Professor, Gu Mingao, wins 50 MILLIONS dividend using his sure-win statistical strategy. News AlphaGO defeats human

More information

Predicting NBA Shots

Predicting NBA Shots Predicting NBA Shots Brett Meehan Stanford University https://github.com/brettmeehan/cs229 Final Project bmeehan2@stanford.edu Abstract This paper examines the application of various machine learning algorithms

More information

CS 7641 A (Machine Learning) Sethuraman K, Parameswaran Raman, Vijay Ramakrishnan

CS 7641 A (Machine Learning) Sethuraman K, Parameswaran Raman, Vijay Ramakrishnan CS 7641 A (Machine Learning) Sethuraman K, Parameswaran Raman, Vijay Ramakrishnan Scenario 1: Team 1 scored 200 runs from their 50 overs, and then Team 2 reaches 146 for the loss of two wickets from their

More information

B. AA228/CS238 Component

B. AA228/CS238 Component Abstract Two supervised learning methods, one employing logistic classification and another employing an artificial neural network, are used to predict the outcome of baseball postseason series, given

More information

Neural Networks II. Chen Gao. Virginia Tech Spring 2019 ECE-5424G / CS-5824

Neural Networks II. Chen Gao. Virginia Tech Spring 2019 ECE-5424G / CS-5824 Neural Networks II Chen Gao ECE-5424G / CS-5824 Virginia Tech Spring 2019 Neural Networks Origins: Algorithms that try to mimic the brain. What is this? A single neuron in the brain Input Output Slide

More information

Announcements. Lecture 19: Inference for SLR & Transformations. Online quiz 7 - commonly missed questions

Announcements. Lecture 19: Inference for SLR & Transformations. Online quiz 7 - commonly missed questions Announcements Announcements Lecture 19: Inference for SLR & Statistics 101 Mine Çetinkaya-Rundel April 3, 2012 HW 7 due Thursday. Correlation guessing game - ends on April 12 at noon. Winner will be announced

More information

Predicting the Total Number of Points Scored in NFL Games

Predicting the Total Number of Points Scored in NFL Games Predicting the Total Number of Points Scored in NFL Games Max Flores (mflores7@stanford.edu), Ajay Sohmshetty (ajay14@stanford.edu) CS 229 Fall 2014 1 Introduction Predicting the outcome of National Football

More information

Lane changing and merging under congested conditions in traffic simulation models

Lane changing and merging under congested conditions in traffic simulation models Urban Transport 779 Lane changing and merging under congested conditions in traffic simulation models P. Hidas School of Civil and Environmental Engineering, University of New South Wales, Australia Abstract

More information

Player Availability Rating (PAR) - A Tool for Quantifying Skater Performance for NHL General Managers

Player Availability Rating (PAR) - A Tool for Quantifying Skater Performance for NHL General Managers Player Availability Rating (PAR) - A Tool for Quantifying Skater Performance for NHL General Managers Shuja Khalid 1 1 Department of Computer Science, University of Toronto arxiv:1811.02885v1 [cs.cy] 15

More information

Two Machine Learning Approaches to Understand the NBA Data

Two Machine Learning Approaches to Understand the NBA Data Two Machine Learning Approaches to Understand the NBA Data Panagiotis Lolas December 14, 2017 1 Introduction In this project, I consider applications of machine learning in the analysis of nba data. To

More information

Standard 3.1 The student will plan and conduct investigations in which

Standard 3.1 The student will plan and conduct investigations in which Teacher Name: Tammy Heddings Date: April 04, 2009 Grade Level: 3-6 Subject: Science Time: 30 minutes Concept: Scientific Investigation Topic: Variables SOLs: Standard 3.1 The student will plan and conduct

More information

Analysis of Variance. Copyright 2014 Pearson Education, Inc.

Analysis of Variance. Copyright 2014 Pearson Education, Inc. Analysis of Variance 12-1 Learning Outcomes Outcome 1. Understand the basic logic of analysis of variance. Outcome 2. Perform a hypothesis test for a single-factor design using analysis of variance manually

More information

Section I: Multiple Choice Select the best answer for each problem.

Section I: Multiple Choice Select the best answer for each problem. Inference for Linear Regression Review Section I: Multiple Choice Select the best answer for each problem. 1. Which of the following is NOT one of the conditions that must be satisfied in order to perform

More information

A computer program that improves its performance at some task through experience.

A computer program that improves its performance at some task through experience. 1 A computer program that improves its performance at some task through experience. 2 Example: Learn to Diagnose Patients T: Diagnose tumors from images P: Percent of patients correctly diagnosed E: Pre

More information

Machine Learning Methods for Climbing Route Classification

Machine Learning Methods for Climbing Route Classification Machine Learning Methods for Climbing Route Classification Alejandro Dobles Mathematics adobles@stanford.edu Juan Carlos Sarmiento Management Science & Engineering jcs10@stanford.edu Abstract Peter Satterthwaite

More information

intended velocity ( u k arm movements

intended velocity ( u k arm movements Fig. A Complete Brain-Machine Interface B Human Subjects Closed-Loop Simulator ensemble action potentials (n k ) ensemble action potentials (n k ) primary motor cortex simulated primary motor cortex neuroprosthetic

More information

Pairwise Comparison Models: A Two-Tiered Approach to Predicting Wins and Losses for NBA Games

Pairwise Comparison Models: A Two-Tiered Approach to Predicting Wins and Losses for NBA Games Pairwise Comparison Models: A Two-Tiered Approach to Predicting Wins and Losses for NBA Games Tony Liu Introduction The broad aim of this project is to use the Bradley Terry pairwise comparison model as

More information

Smart-Walk: An Intelligent Physiological Monitoring System for Smart Families

Smart-Walk: An Intelligent Physiological Monitoring System for Smart Families Smart-Walk: An Intelligent Physiological Monitoring System for Smart Families P. Sundaravadivel 1, S. P. Mohanty 2, E. Kougianos 3, V. P. Yanambaka 4, and M. K. Ganapathiraju 5 University of North Texas,

More information

Decision Trees. Nicholas Ruozzi University of Texas at Dallas. Based on the slides of Vibhav Gogate and David Sontag

Decision Trees. Nicholas Ruozzi University of Texas at Dallas. Based on the slides of Vibhav Gogate and David Sontag Decision Trees Nicholas Ruozzi University of Texas at Dallas Based on the slides of Vibhav Gogate and David Sontag Announcements Course TA: Hao Xiong Office hours: Friday 2pm-4pm in ECSS2.104A1 First homework

More information

Basketball field goal percentage prediction model research and application based on BP neural network

Basketball field goal percentage prediction model research and application based on BP neural network ISSN : 0974-7435 Volume 10 Issue 4 BTAIJ, 10(4), 2014 [819-823] Basketball field goal percentage prediction model research and application based on BP neural network Jijun Guo Department of Physical Education,

More information

A) The linear correlation is weak, and the two variables vary in the same direction.

A) The linear correlation is weak, and the two variables vary in the same direction. 1 Which of the following is NOT affected b outliers in a data set? A) Mean C) Range B) Mode D) Standard deviation 2 The following scatter plot represents a two-variable statistical distribution. Which

More information

Matrix-analog measure-cerrelatepredict

Matrix-analog measure-cerrelatepredict Matrix-analog measure-cerrelatepredict approach ICEM 2015 22-26 June 2015, Boulder David Hanslian Institute of Atmospheric Physics AS CR "Measure-correlate-predict" (MCP) = methods to estimate long-term

More information

Novel empirical correlations for estimation of bubble point pressure, saturated viscosity and gas solubility of crude oils

Novel empirical correlations for estimation of bubble point pressure, saturated viscosity and gas solubility of crude oils 86 Pet.Sci.(29)6:86-9 DOI 1.17/s12182-9-16-x Novel empirical correlations for estimation of bubble point pressure, saturated viscosity and gas solubility of crude oils Ehsan Khamehchi 1, Fariborz Rashidi

More information

Journal of Quantitative Analysis in Sports Manuscript 1039

Journal of Quantitative Analysis in Sports Manuscript 1039 An Article Submitted to Journal of Quantitative Analysis in Sports Manuscript 1039 A Simple and Flexible Rating Method for Predicting Success in the NCAA Basketball Tournament Brady T. West University

More information

One Way ANOVA (Analysis of Variance)

One Way ANOVA (Analysis of Variance) One Wa ANOVA (Analsis of Variance) The one-wa analsis of variance (ANOVA) is used to determine whether there are an significant differences between the means of two or more independent (unrelated) groups

More information

How Do Injuries in the NFL Affect the Outcome of the Game

How Do Injuries in the NFL Affect the Outcome of the Game How Do Injuries in the NFL Affect the Outcome of the Game Andy Sun STATS 50 2014 Motivation NFL injury surveillance shows upward trend in both the number and severity of injuries Packers won 2010 Super

More information

E STIMATING KILN SCHEDULES FOR TROPICAL AND TEMPERATE HARDWOODS USING SPECIFIC GRAVITY

E STIMATING KILN SCHEDULES FOR TROPICAL AND TEMPERATE HARDWOODS USING SPECIFIC GRAVITY P ROCESSES E STIMATING KILN SCHEDULES FOR TROPICAL AND TEMPERATE HARDWOODS USING SPECIFIC GRAVITY W ILLIAM T. SIMPSON S TEVE P. VERRILL A BSTRACT Dry-kiln schedules have been developed for many hardwood

More information

Planning and Design of Proposed ByPass Road connecting Kalawad Road to Gondal Road, Rajkot - Using Autodesk Civil 3D Software.

Planning and Design of Proposed ByPass Road connecting Kalawad Road to Gondal Road, Rajkot - Using Autodesk Civil 3D Software. Planning and Design of Proposed ByPass Road connecting Kalawad Road to Gondal Road, Rajkot - Using Autodesk Civil 3D Software. 1 Harshil S. Shah, 2 P.A.Shinkar 1 M.E. Student(Transportation Engineering),

More information

Citation for published version (APA): Canudas Romo, V. (2003). Decomposition Methods in Demography Groningen: s.n.

Citation for published version (APA): Canudas Romo, V. (2003). Decomposition Methods in Demography Groningen: s.n. University of Groningen Decomposition Methods in Demography Canudas Romo, Vladimir IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please

More information

Performance of Fully Automated 3D Cracking Survey with Pixel Accuracy based on Deep Learning

Performance of Fully Automated 3D Cracking Survey with Pixel Accuracy based on Deep Learning Performance of Fully Automated 3D Cracking Survey with Pixel Accuracy based on Deep Learning Kelvin C.P. Wang Oklahoma State University and WayLink Systems Corp. 2017-10-19, Copenhagen, Denmark European

More information

Modeling Salmon Behavior on the Umpqua River. By Scott Jordan 6/2/2015

Modeling Salmon Behavior on the Umpqua River. By Scott Jordan 6/2/2015 Modeling Salmon Behavior on the Umpqua River By Scott Jordan 6/2/2015 1 Importance of Salmon Delicious Recreation 631,000 people in Oregon went fishing in 2008 spent $264.6 Million on fishing trips Commercial

More information

Name May 3, 2007 Math Probability and Statistics

Name May 3, 2007 Math Probability and Statistics Name May 3, 2007 Math 341 - Probability and Statistics Long Exam IV Instructions: Please include all relevant work to get full credit. Encircle your final answers. 1. An article in Professional Geographer

More information

Staking plans in sports betting under unknown true probabilities of the event

Staking plans in sports betting under unknown true probabilities of the event Staking plans in sports betting under unknown true probabilities of the event Andrés Barge-Gil 1 1 Department of Economic Analysis, Universidad Complutense de Madrid, Spain June 15, 2018 Abstract Kelly

More information

A Machine Learning Approach to Predicting Winning Patterns in Track Cycling Omnium

A Machine Learning Approach to Predicting Winning Patterns in Track Cycling Omnium A Machine Learning Approach to Predicting Winning Patterns in Track Cycling Omnium Bahadorreza Ofoghi 1,2, John Zeleznikow 1, Clare MacMahon 1,andDanDwyer 2 1 Victoria University, Melbourne VIC 3000, Australia

More information

Navigate to the golf data folder and make it your working directory. Load the data by typing

Navigate to the golf data folder and make it your working directory. Load the data by typing Golf Analysis 1.1 Introduction In a round, golfers have a number of choices to make. For a particular shot, is it better to use the longest club available to try to reach the green, or would it be better

More information

Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement

Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement SPORTSCIENCE sportsci.org Original Research / Performance Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement Carl D Paton, Will G Hopkins Sportscience

More information

Systematic Review and Meta-analysis of Bicycle Helmet Efficacy to Mitigate Head, Face and Neck Injuries

Systematic Review and Meta-analysis of Bicycle Helmet Efficacy to Mitigate Head, Face and Neck Injuries Systematic Review and Meta-analysis of Bicycle Helmet Efficacy to Mitigate Head, Face and Neck Injuries Prudence Creighton & Jake Olivier MATHEMATICS & THE UNIVERSITY OF NEW STATISTICS SOUTH WALES Creighton

More information

FISH 415 LIMNOLOGY UI Moscow

FISH 415 LIMNOLOGY UI Moscow Sampling Equipment Lab FISH 415 LIMNOLOGY UI Moscow Purpose: - to familiarize you with limnological sampling equipment - to use some of the equipment to obtain profiles of temperature, dissolved oxygen,

More information
