The pth percentile of a distribution is the value with p percent of the observations less than it.

Similar documents
Chapter 2: Modeling Distributions of Data

Scaled vs. Original Socre Mean = 77 Median = 77.1

CHAPTER 2 Modeling Distributions of Data

IHS AP Statistics Chapter 2 Modeling Distributions of Data MP1

CHAPTER 2 Modeling Distributions of Data

AP Stats Chapter 2 Notes

DESCRIBE the effect of adding, subtracting, multiplying by, or dividing by a constant on the shape, center, and spread of a distribution of data.

CHAPTER 2 Modeling Distributions of Data

How are the values related to each other? Are there values that are General Education Statistics

Mrs. Daniel- AP Stats Ch. 2 MC Practice

NUMB3RS Activity: Is It for Real? Episode: Hardball

Statistics. Wednesday, March 28, 2012

% per year Age (years)

STANDARD SCORES AND THE NORMAL DISTRIBUTION

Descriptive Statistics Project Is there a home field advantage in major league baseball?

9.3 Histograms and Box Plots

Chapter 3.4. Measures of position and outliers. Julian Chan. September 11, Department of Mathematics Weber State University

STAT 101 Assignment 1

1. The data below gives the eye colors of 20 students in a Statistics class. Make a frequency table for the data.

PSY201: Chapter 5: The Normal Curve and Standard Scores

STT 315 Section /19/2014

STAT 155 Introductory Statistics. Lecture 2-2: Displaying Distributions with Graphs

STAT 155 Introductory Statistics. Lecture 2: Displaying Distributions with Graphs

Reminders. Homework scores will be up by tomorrow morning. Please me and the TAs with any grading questions by tomorrow at 5pm

Quantitative Literacy: Thinking Between the Lines

Unit 3 - Data. Grab a new packet from the chrome book cart. Unit 3 Day 1 PLUS Box and Whisker Plots.notebook September 28, /28 9/29 9/30?

North Point - Advance Placement Statistics Summer Assignment

Full file at

3.3 - Measures of Position

Internet Technology Fundamentals. To use a passing score at the percentiles listed below:

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

Unit 6 Day 2 Notes Central Tendency from a Histogram; Box Plots

Measuring Relative Achievements: Percentile rank and Percentile point

CHAPTER 1 ORGANIZATION OF DATA SETS

5.1. Data Displays Batter Up. My Notes ACTIVITY

Lesson 2 Pre-Visit Slugging Percentage

Lesson 3 Pre-Visit Teams & Players by the Numbers

Quiz 1.1A AP Statistics Name:

46 Chapter 8 Statistics: An Introduction

Fundamentals of Machine Learning for Predictive Data Analytics

Statistical Studies: Analyzing Data III.B Student Activity Sheet 6: Analyzing Graphical Displays

Statistical Studies: Analyzing Data III.B Student Activity Sheet 6: Analyzing Graphical Displays

Running head: DATA ANALYSIS AND INTERPRETATION 1

In my left hand I hold 15 Argentine pesos. In my right, I hold 100 Chilean

Warm-up. Make a bar graph to display these data. What additional information do you need to make a pie chart?

Name Date Period. E) Lowest score: 67, mean: 104, median: 112, range: 83, IQR: 102, Q1: 46, SD: 17

Chapter 2 - Displaying and Describing Categorical Data

Chapter 3 - Displaying and Describing Categorical Data

(c) The hospital decided to collect the data from the first 50 patients admitted on July 4, 2010.

Analyzing Categorical Data & Displaying Quantitative Data Section 1.1 & 1.2

Practice Test Unit 6B/11A/11B: Probability and Logic

Histogram. Collection

Unit 3 ~ Data about us

Practice Test Unit 06B 11A: Probability, Permutations and Combinations. Practice Test Unit 11B: Data Analysis

Chapter 4 Displaying Quantitative Data

Lab 5: Descriptive Statistics

Draft - 4/17/2004. A Batting Average: Does It Represent Ability or Luck?

Year 10 Term 2 Homework

CHAPTER 1 Exploring Data

y ) s x x )(y i (x i r = 1 n 1 s y Statistics Lecture 7 Exploring Data , y 2 ,y n (x 1 ),,(x n ),(x 2 ,y 1 How two variables vary together

March Madness Basketball Tournament

Smoothing the histogram: The Normal Curve (Chapter 8)

Solutionbank S1 Edexcel AS and A Level Modular Mathematics

PRACTICE PROBLEMS FOR EXAM 1

0-13 Representing Data

Chapter 2 - Frequency Distributions and Graphs

Descriptive Statistics. Dr. Tom Pierce Department of Psychology Radford University

1wsSMAM 319 Some Examples of Graphical Display of Data

March Madness Basketball Tournament

Effective Use of Box Charts

Frequency Distributions

Averages. October 19, Discussion item: When we talk about an average, what exactly do we mean? When are they useful?

Average Runs per inning,

How Fast Can You Throw?

Bivariate Data. Frequency Table Line Plot Box and Whisker Plot

a) List and define all assumptions for multiple OLS regression. These are all listed in section 6.5

ACTIVITY: Drawing a Box-and-Whisker Plot. a. Order the data set and write it on a strip of grid paper with 24 equally spaced boxes.

ASTERISK OR EXCLAMATION POINT?: Power Hitting in Major League Baseball from 1950 Through the Steroid Era. Gary Evans Stat 201B Winter, 2010

Additional Reading General, Organic and Biological Chemistry, by Timberlake, chapter 8.

Psychology - Mr. Callaway/Mundy s Mill HS Unit Research Methods - Statistics

Age of Fans

Diameter in cm. Bubble Number. Bubble Number Diameter in cm

Dotplots, Stemplots, and Time-Series Plots

Algebra 1 Unit 7 Day 2 DP Box and Whisker Plots.notebook April 10, Algebra I 04/10/18 Aim: How Do We Create Box and Whisker Plots?

Why We Should Use the Bullpen Differently

Hitting with Runners in Scoring Position

Homework Exercises Problem Set 1 (chapter 2)

IGCSE - Cumulative Frequency Questions

Assignment. To New Heights! Variance in Subjective and Random Samples. Use the table to answer Questions 2 through 7.

Descriptive Stats. Review

Chapter. 1 Who s the Best Hitter? Averages

Movement and Position

NOTES: STANDARD DEVIATION DAY 4 Textbook Chapter 11.1, 11.3

Today s plan: Section 4.2: Normal Distribution

All AQA Unit 1 Questions Higher

Box-and-Whisker Plots

WHAT IS THE ESSENTIAL QUESTION?

Stats in Algebra, Oh My!

Section 3.2: Measures of Variability

5.1 Introduction. Learning Objectives

Transcription:

Describing Location in a Distribution (2.1) Measuring Position: Percentiles One way to describe the location of a value in a distribution is to tell what percent of observations are less than it. De#inition: The pth percentile of a distribution is the value with p percent of the observations less than it.

The stemplot below shows the number of wins for each of the 30 Major League Baseball teams in 2009. 5 9 6 2455 7 00455589 8 0345667778 9 123557 10 3 Problem: Find the percentiles for the following teams: (a) The Colorado Rockies, who won 92 games. (b) The New York Yankees, who won 103 games. The stemplot below shows the number of wins for each of the 30 Major League Baseball teams in 2009. 5 9 6 2455 7 00455589 8 0345667778 9 123557 10 3 Problem: Find the following - (c) The percentile for the Kansas City Royals, who won 65 games. (d) A MLB team represented the 75th percentile for number of wins. Assuming no team had the same number of wins that they did, approximately how many teams had more wins than they did?

A cumulative relative frequency graph (or ogive) displays the cumulative relative frequency of each class of a frequency distribution. Here is a table showing the distribution of median household incomes for the 50 states and the District of Columbia. Here is a table showing the distribution of median household incomes for the 50 states and the District of Columbia. Construct a cumulative relative frequency graph (ogive) of this data set.

Problem: Use the cumulative relative frequency graph for the state income data to answer each question. (a) At what percentile is California, with a median income of $57,445? (b) Estimate and interpret the ]irst quartile of this solution. (c) Are there any states whose median income can be considered an outlier? Measuring Position: z-scores A z-score tells us how many standard deviations from the mean an observation falls, and in what direction. De#inition: If x is an observation from a distribution that has known mean and standard deviation, the standardized value of x is: A standardized value is often called a z-score.

In 2009, the mean number of wins for teams in the MLB was 81 with a standard deviation of 11.4 wins. Problem: Find and interpret the z-scores for the following teams. (a) The New York Yankees, with 103 wins. (b) The Baltimore Orioles, with 64 wins. The single-season home run record for major league baseball has been set just three times since Babe Ruth hit 60 home runs in 1927. Roger Maris hit 61 in 1961, Mark McGwire hit 70 in 1998 and Barry Bonds hit 73 in 2001. In an absolute sense, Barry Bonds had the best performance of these four players, since he hit the most home runs in a single season. However, in a relative sense this may not be true. Baseball historians suggest that hitting a home run has been easier in some eras than others. This is due to many factors, including quality of batters, quality of pitchers, hardness of the baseball, dimensions of ballparks, and possible use of performance-enhancing drugs. To make a fair comparison, we should see how these performances rate relative to others hitters during the same year.

Problem: Compute the standardized scores for each performance. Which player had the most outstanding performance relative to his peers? Transforming Data (adding and subtracting constants) Transforming converts the original observations from the original units of measurements to another scale. Transformations can affect the shape, center, and spread of a distribution. EXAMPLE: Record your guess for the width of this room (length of blue wall) on a postit and bring up to the smart panel. Here are your classes guesses: Create a dotplot of the data and describe the distribution of guesses.

The actual width of the room is: Determine the error associated with each guess by subtracting the actual width from each guess and record here: Construct a dotplot of the errors and describe their distribution. In this example we added (a negative...) to each data point/observation. What did this do to the shape? spread? center?

Adding the same number a (either positive, zero, or negative) to each observation: adds a to measures of center and location/ position (mean, median, quartiles, percentiles), BUT does NOT change the shape of the distribution or measures of spread (range, IQR, standard deviation, variance) Transforming Data (multiplying and dividing by constants) EXAMPLE (con't): Suppose we wanted to convert your guesses for the width of the room from feet to inches. Multiply all guesses from the previous example by 12 in./ft. to convert the data to inches. Construct a dotplot of the new data and describe the distribution of guesses (in inches).

In this example we multiplied each data point/ observation by the same constant. What did this do to the shape? spread? center? Multiplying (or dividing) each observation by the same number b (either positive, zero, or negative): multiplies (divides) measures of center and location/position (mean, median, quartiles, percentiles) by b multiplies (divides) measures of spread and location/position (range, IQR, standard deviation) by b (and variance by b 2 ) does NOT change the shape of the distribution

EXAMPLE: In 2010, Taxi Cabs in New York City charged an initial fee of $2.50 plus $2 per mile. In equation form, fare = 2.50 + 2(miles). At the end of a month a businessman collects all of his taxi cab receipts and calculates some numerical summaries. The mean fare he paid was $15.45 with a standard deviation of $10.20. What are the mean and standard deviation of the lengths of his cab rides in miles? In Chapter 1, we developed a kit of graphical and numerical tools for describing distributions. Now, we ll add one more step to the strategy. Exploring Quantitative Data 1. Always plot your data: make a graph. 2. Look for the overall pattern (shape, center, and spread) and for striking departures such as outliers. 3. Calculate a numerical summary (summary statistics) to brie]ly describe center and spread. 4. Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve.

De#inition: A density curve is a curve that is always on or above the horizontal axis, and has area exactly 1 underneath it. A density curve describes the overall pattern of a distribution. The area under the curve and above any interval of values on the horizontal axis is the proportion of all observations that fall in that interval. The histogram below shows the distribution of batting average (proportion of hits) for the 432 Major League Baseball players with at least 100 plate appearances in the 2009 season. The smooth curve shows the overall shape of the distribution. In the ]irst graph below, the bars in red represent the proportion of players who had batting averages of at least 0.270. There are 177 such players out of a total of 432, for a proportion of 0.410. In the second graph below, the area under the curve to the right of 0.270 is shaded. This area is 0.391, only 0.019 away from the actual proportion of 0.410.