STAT 625: 2000 Olympic Diving Exploration


Corey S Brier, Department of Statistics, Yale University

Abstract

This document contains a preliminary investigation of data from the 2000 Olympic diving events. In particular, we offer an explanation for the bimodality in the degree of difficulty. The assignment for 9/17 begins in Section 4.

1 Data import and formatting

The data are provided in an easy-to-use CSV file, so we may import them directly.

> library(YaleToolkit)
> some <- function(data, n = 7, replace = FALSE) {
+   sel <- sample(1:dim(data)[1], n, replace)
+   return(data[sel,])
+ }
> setwd("c:/users/corey/documents/yale/s3/625/week3")
> data <- read.csv("diving2000.csv", as.is = T)
> whatis(data)
   variable.name      type missing distinct.values precision
1          Event character       0               4        NA
2          Round character       0               3        NA
3          Diver character                                NA
4        Country character       0              42        NA
5           Rank   numeric
6         DiveNo   numeric
7     Difficulty   numeric
8         JScore   numeric
9          Judge character       0              25        NA
10      JCountry character       0              21        NA
                 min           max
1             M10mPF         W3mSB
2              Final          Semi
3  ABALLI Jesus-Iory ZHUPINA Olena
4                ARG           ZIM
5
6
7
8
9         ALT Walter  ZAITSEV Oleg
10               AUS           ZIM

It is useful to change some data types:

> data$Event <- as.factor(data$Event)
> data$Round <- as.factor(data$Round) # This could be left as numeric

Let's add a column for gender:

> menloc <- (data$Event == "M3mSB") | (data$Event == "M10mPF")
> femaleloc <- !menloc
> data$sex[menloc] <- "M"
> data$sex[femaleloc] <- "F"
> data$sex <- factor(data$sex)

Each row of the data corresponds to a score for a dive, not to a particular contestant, so we expect some amount of clustering. It could be useful to get all rows for a particular diver, so let's assign each distinct diver a different number:

> data$DiverNumber <- rep(NA, length(data$Diver))
> for (i in 1:length(unique(data$Diver))) {
+   dname <- (unique(data$Diver))[i]
+   data[data$Diver == dname,]$DiverNumber <- i
+ }

Also, for each dive, let us compute the average score and add it back into our data set. We use a vectorized method to avoid an unnecessary loop. This relies on the rows being ordered so that each consecutive group of seven rows holds the seven judges' scores for one dive.

> dmeans <- apply(matrix(data$JScore, ncol = 7, byrow = T), 1, mean)
> data$avg <- rep(dmeans, each = 7)

2 Graphical Exploration

We start with a simple histogram of the judges' scores:

[Figure: Histogram of data$JScore. x-axis: Judge's Score; y-axis: Frequency.]

We notice there is quite a bit of bimodality in the difficulty:

[Figure: Histogram of data$Difficulty. x-axis: Dive Difficulties; y-axis: Frequency.]

Constructing side-by-side box plots of the dive difficulties reveals that the difficulties of dives in the semifinal round are much lower than those in the final or preliminary rounds:

[Figure: side-by-side box plots of Difficulty for the Final, Prelim and Semi rounds.]

To confirm our suspicion that Round is a large source of the bimodality, we plot the difficulty against the judge's score, jittering each point to deal somewhat with the over-plotting. Additionally, all of the points from the semifinal round are colored red.

[Figure: scatter plot with axes Judge's Score (jittered) and Difficulty (jittered); semifinal points in red.]

This clearly indicates that the dives performed in the semifinal round had lower difficulties than those in the other two rounds. Knowledge of the exact dive requirements and scoring system for the 2000 Olympics would shed more light on why this is the case. Now, let us remove the data from the semifinal round and see whether any bimodality remains:

> datanosemi <- data[(data$Round != "Semi"),]
> hist(datanosemi$Difficulty, xlab = "Difficulty without semifinal round")

[Figure: Histogram of datanosemi$Difficulty. x-axis: Difficulty without semifinal round; y-axis: Frequency.]

Looking at the above figure, it certainly seems that bimodality is less of an issue, although there is still some concern which may merit further investigation.

Next, we consider 4 plots in which the difficulty is plotted on the vertical axis. First (top-left), we construct box plots that contrast the 4 different events. The men's events (left two box plots) seem to indicate slightly higher difficulties, so in the top-right plot we compare the men and women without considering the specific event. We see that perhaps there is a small difference, but nothing drastic is occurring. Of course, given the size of our data set, we should not be surprised if the standard statistical tests indicated a significant effect.

The bottom-left plot compares the difficulties by dive number across all of the contestants. There seems to be little difference, with perhaps higher than average difficulties on dive number six.

Finally, the bottom-right graphic plots rank against dive difficulty. Certainly, for each value on the horizontal axis, multiple dives are present, but what is more interesting is the cluster at the bottom, which seems to stop at the diver ranked number 20. The points that correspond to the semifinal round are colored red, and they in fact match this cluster. One possible explanation is that divers with a numerically higher rank (i.e. a lower position) only participated in the preliminary round.

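[Figure: four panels with difficulty on the vertical axis. Top-left: box plots by event; top-right: box plots by sex (F, M); bottom-left: box plots by dive number; bottom-right: jitter(data$Difficulty) against data$Rank, semifinal points in red.]

We can check which rounds the divers with a rank of 20 or worse (Rank >= 20) took part in as follows:

> table(data[data$Rank >= 20,]$Round)
 Final Prelim   Semi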
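The code for the rank-versus-difficulty panel is not shown in the source, so the following is only a minimal sketch of how such a plot and the accompanying table might be produced. It runs on invented toy data; the data frame `toy`, the vector `point_col`, and the cutoff `Rank >= 10` are assumptions made for illustration, and only the column names follow the whatis() listing.

```r
# Toy data only: column names follow the whatis() listing, values are invented.
set.seed(1)
toy <- data.frame(
  Round      = rep(c("Prelim", "Semi", "Final"), times = c(12, 6, 6)),
  Rank       = c(1:12, 1:6, 1:6),
  Difficulty = c(runif(12, 2.8, 3.5), runif(6, 1.8, 2.1), runif(6, 2.8, 3.5)),
  stringsAsFactors = FALSE
)

# Red for semifinal rows, black otherwise.
point_col <- ifelse(toy$Round == "Semi", "red", "black")

pdf(NULL)  # draw to a null device so the sketch also runs non-interactively
plot(toy$Rank, jitter(toy$Difficulty), col = point_col, pch = 20,
     xlab = "Rank", ylab = "Difficulty (jittered)")
invisible(dev.off())

# Which rounds do the lower-placed divers (Rank >= 10 in this toy set) appear in?
low_rounds <- table(toy[toy$Rank >= 10, ]$Round)
print(low_rounds)
```

In an interactive session the `pdf(NULL)` and `dev.off()` lines can be dropped so that the plot appears on screen.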

3 Considering the judges and the scoring

The data include the countries that the divers are from as well as the countries of the judges. One possible analysis might search for bias, such as a judge giving preferential treatment to a competitor from his or her own country. Although this section is not a complete analysis, we present some preliminary steps. First, it makes sense to find out whether any judge actually evaluated a competitor from his or her own country:

> finalsdata <- data[data$Round == "Final",]
> sum(as.numeric(finalsdata$Country == finalsdata$JCountry))
[1] 0
> prelimdata <- data[data$Round == "Prelim",]
> sum(as.numeric(prelimdata$Country == prelimdata$JCountry))
[1] 201
> semidata <- data[data$Round == "Semi",]
> sum(as.numeric(semidata$Country == semidata$JCountry))
[1] 113

A single diver is represented on multiple rows of our data set, but because each row corresponds to one judge's score for one dive, these counts are numbers of individual scores and nothing is counted twice. The results are clear: no judge scored a diver from his or her own country in the finals, but judges did so in the preliminary and semifinal rounds.

An additional option is to separate the rows where the diver's country and the judge's country are the same from the rows where they differ, to allow for a comparison:

> samecountry <- data[data$Country == data$JCountry,]
> diffcountry <- data[!(data$Country == data$JCountry),]
> summary(samecountry$JScore)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> summary(diffcountry$JScore)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
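As a possible shortcut, the three per-round counts and the same-country comparison above can be obtained without creating separate subsets. The sketch below uses invented toy data; the data frame `toy` and the helper column `same` are assumptions for illustration, not part of the original analysis.

```r
# Toy data only: names mirror the whatis() listing, values are invented.
toy <- data.frame(
  Round    = c("Prelim", "Prelim", "Prelim", "Semi", "Semi", "Final", "Final"),
  Country  = c("USA", "CHN", "AUS", "USA", "CHN", "USA", "CHN"),
  JCountry = c("USA", "USA", "AUS", "CHN", "CHN", "AUS", "GER"),
  JScore   = c(8.5, 8.0, 7.5, 8.0, 9.0, 8.5, 9.0),
  stringsAsFactors = FALSE
)

# TRUE when the judge and the diver share a country.
toy$same <- toy$Country == toy$JCountry

# Rows: round; columns: same-country flag. One call replaces three subsets.
counts <- table(toy$Round, toy$same)
print(counts)

# Mean score by same-country flag.
means <- tapply(toy$JScore, toy$same, mean)
print(means)
```

The `TRUE` column of `counts` plays the role of the three `sum(...)` calls, and `means` gives the group means that `summary()` reports for each subset.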

Of course, the data are very unbalanced now, but the univariate summaries indicate that scores are higher in both mean and median when a judge evaluated a diver from his or her own country. However, it is not yet clear how significant this relationship is.

Focusing only on data from the preliminary round, we can plot each diver on the horizontal axis (DiverNumber was generated above) and the score for each of their dives on the vertical axis. We have also colored and enlarged any point where a judge has given a score to a diver from the same country:

> plot(jitter(prelimdata$DiverNumber), jitter(prelimdata$JScore),
+      pch = 20,
+      col = 1 + as.numeric(prelimdata$Country == prelimdata$JCountry),
+      cex = 1 + 1*as.numeric(prelimdata$Country == prelimdata$JCountry),
+      xlab = "Diver Number", ylab = "Judges score")

[Figure: Judge's score against Diver Number for the preliminary round; same-country scores enlarged and drawn in red.]

Right away we wonder if something is wrong, because the graph appears to fall into 4 distinct regions, and within each region the scores of the divers seem to be decreasing. This actually makes sense, however! The data were given to us sorted first by event (starting with the men's springboard), and within each event the divers were ordered by rank. So the overall shape of the graph may be slightly distracting, but it should not be alarming. More importantly, we wish to look for patterns in the red points. It is very tempting to say that the scores corresponding to the red points seem artificially inflated, but the graph does not provide conclusive evidence (especially when compared to our investigation of the bimodality above).

4 Investigating Steve McFarland for Potential Bias

Although the comparison can be misleading, we begin our search for bias from Steve McFarland by computing his average score for US competitors and for non-US competitors:

> steveusa <- data[data$Judge == "McFARLAND Steve" & data$Country == "USA",]
> mean(steveusa$JScore)
[1]
> stevenousa <- data[data$Judge == "McFARLAND Steve" & data$Country != "USA",]
> mean(stevenousa$JScore)
[1]

We see that, on average, Steve McFarland scored American divers 1.1 points higher than non-American divers. We have to be careful, however. It could be the case that the American divers are actually better, on average, than the other competitors. Thus, we calculate the average score given to American divers by all of the judges except Steve McFarland:

> nosteve <- data[data$Judge != "McFARLAND Steve",]
> mean(nosteve[nosteve$Country == "USA",]$JScore)
[1]

This reveals that the scores for USA divers are indeed higher than Steve's scores for non-USA divers. However, McFarland's scores for the Americans are still about 0.34 points higher than the other judges' scores for the Americans. This might indicate some bias, so let's look more closely at the US divers that Steve McFarland judged.

We proceed by plotting, for each of those 7 divers, all of their scores. Black points indicate scores from judges other than McFarland, while points in red correspond to McFarland's scores. The green triangles represent the average of McFarland's scores for that diver, and the blue diamonds represent the average of all of the other judges' scores for that diver. Data from the final round are excluded, but some within-diver clustering is expected because, for each dive and within each event, we expect reasonably comparable scores:

[Figure: jittered scores against jittered diver number for the 7 US divers judged by McFarland, with the point colors and symbols described above.]

We see right away that McFarland's average score is above the average score from the other judges for every one of these 7 divers. The greatest absolute discrepancy between Steve's average score and the other judges' average score occurs for diver 5 on this chart, corresponding to DAVISON, Michelle.

To search for bias statistically, we can assume that all judges are unbiased and then permute the judges over the dives. This preserves the performance standard within countries and individual competitors, but tests against judges being extreme in their scoring:

> dataperm <- data[data$Round != "Final",]
> dataperm$Judge <- sample(dataperm$Judge)
> print(m1 <- mean(dataperm[dataperm$Judge == "McFARLAND Steve" &
+     dataperm$Country == "USA",]$JScore))
[1]
> print(m2 <- mean(dataperm[dataperm$Judge != "McFARLAND Steve" &
+     dataperm$Country == "USA",]$JScore))
[1]
> abs(m1 - m2)
[1]
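A single shuffle gives only one draw from the no-bias scenario. One common way to turn the idea into a test is to repeat the shuffle many times and see how often the shuffled difference is at least as large as the observed one. The sketch below is an illustration on invented toy data, not the author's code: the data frame `usa`, the label "JUDGE A", the helper `usa_gap`, and the choice of 2000 shuffles are all assumptions. As a simplification, labels are shuffled within the USA rows only, which keeps the number of scores attributed to the judge of interest fixed.

```r
# Toy data only: invented scores for USA dives; "JUDGE A" is the judge of interest.
set.seed(625)
usa <- data.frame(
  Judge  = rep(c("JUDGE A", paste("JUDGE", LETTERS[2:7])), times = 10),
  JScore = round(runif(70, 5, 9) * 2) / 2,
  stringsAsFactors = FALSE
)

# Mean score from the judge of interest minus the mean score from everyone else.
usa_gap <- function(d) {
  mean(d$JScore[d$Judge == "JUDGE A"]) - mean(d$JScore[d$Judge != "JUDGE A"])
}

observed <- usa_gap(usa)

# Shuffle the judge labels many times to approximate the no-bias distribution.
n_perm <- 2000
null_gaps <- replicate(n_perm, {
  shuffled <- usa
  shuffled$Judge <- sample(shuffled$Judge)
  usa_gap(shuffled)
})

# One-sided permutation p-value, with the usual +1 correction.
p_value <- (sum(null_gaps >= observed) + 1) / (n_perm + 1)
print(c(observed = observed, p_value = p_value))
```

A small `p_value` would mean that the observed gap is rarely matched when judge labels are assigned at random.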

As before, both results are mean scores for US competitors. The first treats the judge as McFarland (under permutation), while the second treats the judge as someone else. The two permuted means are very similar, which suggests that the difference we saw initially may be significant. Over many permutations the absolute difference remains very small, so we may reasonably suspect that McFarland has some amount of bias.

Earlier, we computed the mean score for each dive. We have also already found the dives for which McFarland scored a US diver (steveusa). So, for each of those dives, we can compare McFarland's score with the mean score for the dive:

> mean(steveusa$JScore - steveusa$avg)
[1]

We see that McFarland scored about 0.20 higher than the judges' average on dives performed by Americans. Let's see if he is simply enthusiastic and grades non-USA divers higher by 0.2 as well:

> discrep <- mean(stevenousa$JScore - stevenousa$avg)
> discrep
[1]

This is a value very close to zero! It is positive, so on average McFarland does score a particular dive higher than the other judges, but the amount is not nearly as great as the bias he seems to give to the Americans. Subtracting out this average deviation, we have an estimate of his actual bias:

> mean(steveusa$JScore - steveusa$avg) - mean(stevenousa$JScore - stevenousa$avg)
[1]

Now, if McFarland is really unbiased, subtracting the discrepancy from his scores and comparing the result with the average scores given to USA competitors should not yield a difference. Thus we have a one-sided hypothesis test:

> t.test(steveusa$JScore - discrep, steveusa$avg, alternative = "greater")

        Welch Two Sample t-test

data:  steveusa$JScore - discrep and steveusa$avg
t = , df = , p-value =
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
      Inf
sample estimates:
mean of x mean of y

This yields a p-value of about 0.111, so the difference may not be truly significant. There are a number of issues with this test, however, so we need to be careful. We would like both steveusa$JScore - discrep and steveusa$avg to be roughly normal, so we can plot some basic histograms:

[Figure: two histograms, one of steveusa$JScore - discrep and one of steveusa$avg; y-axes: Frequency.]

The first histogram appears roughly acceptable, though there is perhaps some cause for concern in the second. Also, the two samples here are not independent, since the average score over all of the judges includes McFarland's own score. We suspect that excluding McFarland's score from the average would make the difference slightly larger and the result slightly more significant.

A non-parametric test we could try is the (2-sample) Mann-Whitney U test:

> wilcox.test(steveusa$JScore - discrep, steveusa$avg, alternative = "greater",
+     exact = FALSE)

        Wilcoxon rank sum test with continuity correction

data:  steveusa$JScore - discrep and steveusa$avg
W = 941, p-value =
alternative hypothesis: true location shift is greater than 0

Again we see a result that does not seem significant. We could also try a permutation test, which likewise does not require the data to follow a normal distribution. Another option would be to create an indicator variable that designates whether McFarland is adjudicating a US diver:

> data$isSteveUSA <- rep(0, length(data$avg))
> data[data$Judge == "McFARLAND Steve" & data$Country == "USA",]$isSteveUSA <- 1

We could then fit a regression model that includes this indicator variable and see whether it is significant. Some care would need to be taken, because fitting JScore as the response would treat each dive as seven separate observations, which is not appropriate.

Further explorations might consider the bias of all judges toward their home countries. If most or all judges are biased, then it would be useful to compare McFarland's bias with that of the others. Perhaps he is not as biased as some of the other judges. Alternatively, if most judges are biased, then perhaps there is no net effect on the rankings, since each competitor's score will be similarly inflated. These are only speculations, but they provide direction for additional analyses.
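The regression itself is not carried out in the source. As one hedged possibility, the response could be each score's deviation from the mean of the other six scores on the same dive, which removes the dive's overall quality and also keeps a judge's own score out of the comparison average. The sketch below uses invented toy data; the columns `Dive`, `others_avg` and `dev`, and the way `isSteveUSA` is filled in, are assumptions for illustration only.

```r
# Toy data only: 10 invented dives, 7 scores each, stored dive by dive.
set.seed(2000)
n_dives <- 10
toy <- data.frame(
  Dive       = rep(seq_len(n_dives), each = 7),
  JScore     = round(runif(7 * n_dives, 6, 9) * 2) / 2,
  isSteveUSA = 0
)
# Pretend the first score on each of the first five dives is the pair of interest.
toy$isSteveUSA[toy$Dive <= 5 & seq_len(nrow(toy)) %% 7 == 1] <- 1

# Leave-one-out mean: average of the other six scores on the same dive.
dive_sum <- ave(toy$JScore, toy$Dive, FUN = sum)
toy$others_avg <- (dive_sum - toy$JScore) / 6

# Response: how far each score sits from the other judges on that dive.
toy$dev <- toy$JScore - toy$others_avg
fit <- lm(dev ~ isSteveUSA, data = toy)
print(coef(summary(fit)))
```

The deviations within a dive still sum to zero, so the rows are not independent and the reported standard errors should be read with caution; this only illustrates the mechanics of using the indicator.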


100-Meter Dash Olympic Winning Times: Will Women Be As Fast As Men? 100-Meter Dash Olympic Winning Times: Will Women Be As Fast As Men? The 100 Meter Dash has been an Olympic event since its very establishment in 1896(1928 for women). The reigning 100-meter Olympic champion

More information

A Hare-Lynx Simulation Model

A Hare-Lynx Simulation Model 1 A Hare- Simulation Model What happens to the numbers of hares and lynx when the core of the system is like this? Hares O Balance? S H_Births Hares H_Fertility Area KillsPerHead Fertility Births Figure

More information

NAME: A graph contains five major parts: a. Title b. The independent variable c. The dependent variable d. The scales for each variable e.

NAME: A graph contains five major parts: a. Title b. The independent variable c. The dependent variable d. The scales for each variable e. NAME: Graphing is an important procedure used by scientists to display the data that is collected during a controlled experiment. Line graphs demonstrate change over time and must be constructed correctly

More information

Understanding Winter Road Conditions in Yellowstone National Park Using Cumulative Sum Control Charts

Understanding Winter Road Conditions in Yellowstone National Park Using Cumulative Sum Control Charts 1 Understanding Winter Road Conditions in Yellowstone National Park Using Cumulative Sum Control Charts Amber Nuxoll April 1st, 2014 Contents 1 Introduction 2 2 Data Collection and Background 2 3 Exploratory

More information

ROSE-HULMAN INSTITUTE OF TECHNOLOGY Department of Mechanical Engineering. Mini-project 3 Tennis ball launcher

ROSE-HULMAN INSTITUTE OF TECHNOLOGY Department of Mechanical Engineering. Mini-project 3 Tennis ball launcher Mini-project 3 Tennis ball launcher Mini-Project 3 requires you to use MATLAB to model the trajectory of a tennis ball being shot from a tennis ball launcher to a player. The tennis ball trajectory model

More information

3D Turbulence at the Offshore Wind Farm Egmond aan Zee J.W. Wagenaar P.J. Eecen

3D Turbulence at the Offshore Wind Farm Egmond aan Zee J.W. Wagenaar P.J. Eecen 3D Turbulence at the Offshore Wind Farm Egmond aan Zee J.W. Wagenaar P.J. Eecen OWEZ_R_121_3Dturbulence_20101008 ECN-E--10-075 OCTOBER 2010 Abstract NoordzeeWind carries out an extensive measurement and

More information

WHAT IS THE ESSENTIAL QUESTION?

WHAT IS THE ESSENTIAL QUESTION? WHAT IS THE ESSENTIAL QUESTION? Essential Question Essential Question Essential Question Essential Question Essential Question Essential Question Essential Question Week 3, Lesson 1 1. Warm up 2. Notes

More information

INSTITUTE AND FACULTY OF ACTUARIES. Curriculum 2019 AUDIT TRAIL

INSTITUTE AND FACULTY OF ACTUARIES. Curriculum 2019 AUDIT TRAIL INSTITUTE AND FACULTY OF ACTUARIES Curriculum 2019 AUDIT TRAIL Subject CP2 Actuarial Modelling Paper One Institute and Faculty of Actuaries Triathlon model Objective Each year on the Island of IFoA a Minister

More information

Chapter 12 Practice Test

Chapter 12 Practice Test Chapter 12 Practice Test 1. Which of the following is not one of the conditions that must be satisfied in order to perform inference about the slope of a least-squares regression line? (a) For each value

More information

Section 5 Critiquing Data Presentation - Teachers Notes

Section 5 Critiquing Data Presentation - Teachers Notes Topics from GCE AS and A Level Mathematics covered in Sections 5: Interpret histograms and other diagrams for single-variable data Select or critique data presentation techniques in the context of a statistical

More information

% per year Age (years)

% per year Age (years) Stat 1001 Winter 1998 Geyer Homework 2 Problem 3.1 66 inches and 72 inches. Problem 3.2 % per year 0.0 0.5 1.0 1.5 0 20 40 60 80 Age (years) (a) Age 1. (b) More 31-year olds (c) More people age 35{44,

More information

Ozobot Bit Classroom Application: Boyle s Law Simulation

Ozobot Bit Classroom Application: Boyle s Law Simulation OZO AP P EAM TR T S BO RO VE D Ozobot Bit Classroom Application: Boyle s Law Simulation Created by Richard Born Associate Professor Emeritus Northern Illinois University richb@rborn.org Topics Chemistry,

More information

DO YOU KNOW WHO THE BEST BASEBALL HITTER OF ALL TIMES IS?...YOUR JOB IS TO FIND OUT.

DO YOU KNOW WHO THE BEST BASEBALL HITTER OF ALL TIMES IS?...YOUR JOB IS TO FIND OUT. Data Analysis & Probability Name: Date: Hour: DO YOU KNOW WHO THE BEST BASEBALL HITTER OF ALL TIMES IS?...YOUR JOB IS TO FIND OUT. This activity will find the greatest baseball hitter of all time. You

More information

Warm-up. Make a bar graph to display these data. What additional information do you need to make a pie chart?

Warm-up. Make a bar graph to display these data. What additional information do you need to make a pie chart? Warm-up The number of deaths among persons aged 15 to 24 years in the United States in 1997 due to the seven leading causes of death for this age group were accidents, 12,958; homicide, 5,793; suicide,

More information

Handicapping Process Series

Handicapping Process Series Frandsen Publishing Presents Favorite ALL-Ways TM Newsletter Articles Handicapping Process Series Part 1 of 6: Toolbox vs. Black Box Handicapping Plus Isolating the Contenders and the Most Likely Horses

More information

Golfers in Colorado: The Role of Golf in Recreational and Tourism Lifestyles and Expenditures

Golfers in Colorado: The Role of Golf in Recreational and Tourism Lifestyles and Expenditures Golfers in Colorado: The Role of Golf in Recreational and Tourism Lifestyles and Expenditures by Josh Wilson, Phil Watson, Dawn Thilmany and Steve Davies Graduate Research Assistants, Associate Professor

More information

9.3 Histograms and Box Plots

9.3 Histograms and Box Plots Name Class Date 9.3 Histograms and Box Plots Essential Question: How can you interpret and compare data sets using data displays? Explore Understanding Histograms Resource Locker A histogram is a bar graph

More information

Practice Test Unit 6B/11A/11B: Probability and Logic

Practice Test Unit 6B/11A/11B: Probability and Logic Note to CCSD Pre-Algebra Teachers: 3 rd quarter benchmarks begin with the last 2 sections of Chapter 6, and then address Chapter 11 benchmarks; logic concepts are also included. We have combined probability

More information

Running head: DATA ANALYSIS AND INTERPRETATION 1

Running head: DATA ANALYSIS AND INTERPRETATION 1 Running head: DATA ANALYSIS AND INTERPRETATION 1 Data Analysis and Interpretation Final Project Vernon Tilly Jr. University of Central Oklahoma DATA ANALYSIS AND INTERPRETATION 2 Owners of the various

More information

5.1 Introduction. Learning Objectives

5.1 Introduction. Learning Objectives Learning Objectives 5.1 Introduction Statistical Process Control (SPC): SPC is a powerful collection of problem-solving tools useful in achieving process stability and improving capability through the

More information

Exemplar for Internal Achievement Standard. Mathematics and Statistics Level 1

Exemplar for Internal Achievement Standard. Mathematics and Statistics Level 1 Exemplar for Internal Achievement Standard Mathematics and Statistics Level 1 This exemplar supports assessment against: Achievement Standard Investigate a given multivariate data set using the statistical

More information

Safety Assessment of Installing Traffic Signals at High-Speed Expressway Intersections

Safety Assessment of Installing Traffic Signals at High-Speed Expressway Intersections Safety Assessment of Installing Traffic Signals at High-Speed Expressway Intersections Todd Knox Center for Transportation Research and Education Iowa State University 2901 South Loop Drive, Suite 3100

More information

Lesson 16: More on Modeling Relationships with a Line

Lesson 16: More on Modeling Relationships with a Line Student Outcomes Students use the least squares line to predict values for a given data set. Students use residuals to evaluate the accuracy of predictions based on the least squares line. Lesson Notes

More information

Percentage. Year. The Myth of the Closer. By David W. Smith Presented July 29, 2016 SABR46, Miami, Florida

Percentage. Year. The Myth of the Closer. By David W. Smith Presented July 29, 2016 SABR46, Miami, Florida The Myth of the Closer By David W. Smith Presented July 29, 216 SABR46, Miami, Florida Every team spends much effort and money to select its closer, the pitcher who enters in the ninth inning to seal the

More information

SHOT ON GOAL. Name: Football scoring a goal and trigonometry Ian Edwards Luther College Teachers Teaching with Technology

SHOT ON GOAL. Name: Football scoring a goal and trigonometry Ian Edwards Luther College Teachers Teaching with Technology SHOT ON GOAL Name: Football scoring a goal and trigonometry 2006 Ian Edwards Luther College Teachers Teaching with Technology Shot on Goal Trigonometry page 2 THE TASKS You are an assistant coach with

More information

1 Streaks of Successes in Sports

1 Streaks of Successes in Sports 1 Streaks of Successes in Sports It is very important in probability problems to be very careful in the statement of a question. For example, suppose that I plan to toss a fair coin five times and wonder,

More information

SPATIAL STATISTICS A SPATIAL ANALYSIS AND COMPARISON OF NBA PLAYERS. Introduction

SPATIAL STATISTICS A SPATIAL ANALYSIS AND COMPARISON OF NBA PLAYERS. Introduction A SPATIAL ANALYSIS AND COMPARISON OF NBA PLAYERS KELLIN RUMSEY Introduction The 2016 National Basketball Association championship featured two of the leagues biggest names. The Golden State Warriors Stephen

More information

Lab Report Outline the Bones of the Story

Lab Report Outline the Bones of the Story Lab Report Outline the Bones of the Story In this course, you are asked to write only the outline of a lab report. A good lab report provides a complete record of your experiment, and even in outline form

More information

BEFORE YOU OPEN ANY FILES:

BEFORE YOU OPEN ANY FILES: Dive Analysis Lab * Make sure to download all the data files for the lab onto your computer. * Bring your computer to lab. * Bring a blank disk or memory stick to class to save your work and files. The

More information

PGA Tour Scores as a Gaussian Random Variable

PGA Tour Scores as a Gaussian Random Variable PGA Tour Scores as a Gaussian Random Variable Robert D. Grober Departments of Applied Physics and Physics Yale University, New Haven, CT 06520 Abstract In this paper it is demonstrated that the scoring

More information

Diameter in cm. Bubble Number. Bubble Number Diameter in cm

Diameter in cm. Bubble Number. Bubble Number Diameter in cm Bubble lab Data Sheet Blow bubbles and measure the diameter to the nearest whole centimeter. Record in the tables below. Try to blow different sized bubbles. Name: Bubble Number Diameter in cm Bubble Number

More information

STT 315 Section /19/2014

STT 315 Section /19/2014 Name: PID: A STT 315 Section 101 05/19/2014 Quiz 1A 50 minutes 1. A survey by an electric company contains questions on the following: Age of household head, Gender of household head and use of electric

More information

Full file at

Full file at Chapter 2 1. Describe the distribution. survival times of persons diagnosed with terminal lymphoma A) approximately normal B) skewed left C) skewed right D) roughly uniform Ans: C Difficulty: low 2. Without

More information

Evaluating and Classifying NBA Free Agents

Evaluating and Classifying NBA Free Agents Evaluating and Classifying NBA Free Agents Shanwei Yan In this project, I applied machine learning techniques to perform multiclass classification on free agents by using game statistics, which is useful

More information

BEFORE YOU OPEN ANY FILES:

BEFORE YOU OPEN ANY FILES: Dive Analysis Lab *If you are using a school computer bring a USB drive to class to save your work and the files for the lab. *If you are using your own computer, make sure to download the data and files

More information

8th Grade. Data.

8th Grade. Data. 1 8th Grade Data 2015 11 20 www.njctl.org 2 Table of Contents click on the topic to go to that section Two Variable Data Line of Best Fit Determining the Prediction Equation Two Way Table Glossary Teacher

More information

Compression Study: City, State. City Convention & Visitors Bureau. Prepared for

Compression Study: City, State. City Convention & Visitors Bureau. Prepared for : City, State Prepared for City Convention & Visitors Bureau Table of Contents City Convention & Visitors Bureau... 1 Executive Summary... 3 Introduction... 4 Approach and Methodology... 4 General Characteristics

More information

Equation 1: F spring = kx. Where F is the force of the spring, k is the spring constant and x is the displacement of the spring. Equation 2: F = mg

Equation 1: F spring = kx. Where F is the force of the spring, k is the spring constant and x is the displacement of the spring. Equation 2: F = mg 1 Introduction Relationship between Spring Constant and Length of Bungee Cord In this experiment, we aimed to model the behavior of the bungee cord that will be used in the Bungee Challenge. Specifically,

More information

MTB 02 Intermediate Minitab

MTB 02 Intermediate Minitab MTB 02 Intermediate Minitab This module will cover: Advanced graphing Changing data types Value Order Making similar graphs Zooming worksheet Brushing Multi-graphs: By variables Interactively upgrading

More information

Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present)

Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present) Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present) Jonathan Tung University of California, Riverside tung.jonathanee@gmail.com Abstract In Major League Baseball, there

More information

Supplemental Information

Supplemental Information Supplemental Information Supplemental Methods Principal Component Analysis (PCA) Every patient (identified by index k varying between 1 and n) was characterized by 4 cell-level measured features (quantitative

More information

Note that all proportions are between 0 and 1. at risk. How to construct a sentence describing a. proportion:

Note that all proportions are between 0 and 1. at risk. How to construct a sentence describing a. proportion: Biostatistics and Research Design in Dentistry Categorical Data Reading assignment Chapter 3 Summarizing data in Dawson-Trapp starting with Summarizing nominal and ordinal data with numbers on p 40 thru

More information

4-3 Rate of Change and Slope. Warm Up. 1. Find the x- and y-intercepts of 2x 5y = 20. Describe the correlation shown by the scatter plot. 2.

4-3 Rate of Change and Slope. Warm Up. 1. Find the x- and y-intercepts of 2x 5y = 20. Describe the correlation shown by the scatter plot. 2. Warm Up 1. Find the x- and y-intercepts of 2x 5y = 20. Describe the correlation shown by the scatter plot. 2. Objectives Find rates of change and slopes. Relate a constant rate of change to the slope of

More information

Unit 3 ~ Data about us

Unit 3 ~ Data about us Unit 3 ~ Data about us Investigation 3: Data Sets & Displays I can construct, interpret, and compare data sets and displays. I can find, interpret, and compare measures of center and variation for data

More information

FireWorks NFIRS BI User Manual

FireWorks NFIRS BI User Manual FireWorks NFIRS BI User Manual Last Updated: March 2018 Introduction FireWorks Business Intelligence BI and analytics tool is used to analyze NFIRS (National Fire Incident Reporting System) data in an

More information

Unit 6 Day 2 Notes Central Tendency from a Histogram; Box Plots

Unit 6 Day 2 Notes Central Tendency from a Histogram; Box Plots AFM Unit 6 Day 2 Notes Central Tendency from a Histogram; Box Plots Name Date To find the mean, median and mode from a histogram, you first need to know how many data points were used. Use the frequency

More information

Today s plan: Section 4.2: Normal Distribution

Today s plan: Section 4.2: Normal Distribution 1 Today s plan: Section 4.2: Normal Distribution 2 Characteristics of a data set: mean median standard deviation five-number summary 2 Characteristics of a data set: mean median standard deviation five-number

More information

Performance Task # 1

Performance Task # 1 Performance Task # 1 Goal: Arrange integers in order. Role: You are a analyzing a Julie Brown Anderson s dive. Audience: Reader of article. Situation: You are interviewing for a job at a sports magazine.

More information

STAT 155 Introductory Statistics. Lecture 2: Displaying Distributions with Graphs

STAT 155 Introductory Statistics. Lecture 2: Displaying Distributions with Graphs The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL STAT 155 Introductory Statistics Lecture 2: Displaying Distributions with Graphs 8/29/06 Lecture 2-1 1 Recall Statistics is the science of data. Collecting

More information