Descriptive Stats Review

Categorical Data
The Area Principle: violating it distorts the data, possibly making it harder to compare categories.
Everything should add up to 100%: when we add up all of our categorical data, we should get 100%.
Know how to read a contingency table.

Survival Contingency Table (rows: survival, columns: class)

         First  Second  Third  Crew  Total
Alive      203     118    178   212    711
Dead       122     167    528   673   1490
Total      325     285    706   885   2201

The percentage of passengers who were both in first class and survived? 203/2201 or 9.2%
The percentage of first-class passengers who survived? 203/325 or 62.5%
The percentage of the survivors who were in first class? 203/711 or 28.6%
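The three percentages above share a numerator and differ only in the denominator. A minimal Python sketch of that idea, using the counts from the table (the variable names are illustrative, not from the slides):

```python
# Counts copied from the survival contingency table.
counts = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead":  {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

grand_total = sum(sum(row.values()) for row in counts.values())  # everyone on board
first_total = sum(row["First"] for row in counts.values())       # first-class column total
alive_total = sum(counts["Alive"].values())                      # survivors row total

cell = counts["Alive"]["First"]  # first class AND survived

joint = cell / grand_total        # out of everyone
given_first = cell / first_total  # out of first-class passengers only
given_alive = cell / alive_total  # out of survivors only

print(f"{joint:.1%}  {given_first:.1%}  {given_alive:.1%}")
```

Reading the question carefully tells you which total goes in the denominator.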

Survival Contingency Table (same table as on the previous slide)
What are the marginal distributions? Survival (711 alive, 1490 dead) and Class (325 first, 285 second, 706 third, 885 crew)
Conditional distributions? They come from the middle (interior) values, taken one row or one column at a time.
Is survival independent of class? Why? NO! The percentage of first-class passengers who survived is 203/325 or 62.5%, whereas the percentage of crew members who survived is 212/885 or 24.0%.
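One way to check independence is to compare the conditional survival rate in each class with the overall (marginal) survival rate. This is a rough sketch using the table's counts, with illustrative variable names:

```python
classes = ["First", "Second", "Third", "Crew"]
alive = [203, 118, 178, 212]
dead = [122, 167, 528, 673]

# Marginal counts for Class: add the two rows column by column.
class_totals = [a + d for a, d in zip(alive, dead)]

# Marginal survival rate: all survivors out of everyone.
overall_rate = sum(alive) / sum(class_totals)

# Conditional survival rate within each class.
survival_by_class = {c: a / t for c, a, t in zip(classes, alive, class_totals)}

for c, rate in survival_by_class.items():
    print(f"{c:<6} {rate:.1%}")
print(f"{'All':<6} {overall_rate:.1%}")
```

If survival were independent of class, every conditional rate would be close to the overall rate. Here they are far apart.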

Categorical Data
Make sure you have enough individuals in your data. "I can make a free throw 75% of the time." (I just happened to make three out of four shots and then called it quits.)
Don't overstate your claim. Independence is an important concept, but it is rare for two variables to be entirely independent. We can't conclude that one variable has no effect whatsoever on another. Usually, all we know is that little effect was observed in our study. Other studies of other groups under other conditions might find different results.

Simpson's Paradox
Ronnie Belliard
2002: 61/289, .211 of his at-bats were hits
2003: 124/447, .277 of his at-bats were hits
Two-season average: 185/736, hits .2514 of the time
Casey Blake
2002: 4/20, .200 of his at-bats were hits
2003: 143/557, .257 of his at-bats were hits
Two-season average: 147/577, hits .2548 of the time
The two-season batting average for Belliard was lower than Blake's, but divided into separate seasons, Belliard had the higher batting average in both seasons. This is Simpson's Paradox.
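The reversal can be checked directly from the hits and at-bats on the slide. A small sketch (the function and variable names are made up for illustration):

```python
# (hits, at-bats) per season, copied from the slide.
belliard = {2002: (61, 289), 2003: (124, 447)}
blake = {2002: (4, 20), 2003: (143, 557)}

def average(hits, at_bats):
    """Batting average for one season."""
    return hits / at_bats

def combined(seasons):
    """Pool hits and at-bats across seasons, then divide once."""
    hits = sum(h for h, _ in seasons.values())
    at_bats = sum(ab for _, ab in seasons.values())
    return hits / at_bats

# Belliard is ahead in each season taken separately...
belliard_wins_each = all(
    average(*belliard[year]) > average(*blake[year]) for year in belliard
)
# ...yet Blake is ahead once the seasons are pooled.
blake_wins_combined = combined(blake) > combined(belliard)

print(belliard_wins_each, blake_wins_combined)
```

The pooled averages weight each season by its at-bats, and Blake's weak 2002 season counts for only 20 of his 577 at-bats.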

Quantitative Data
Have your calculator with you, and know how to use it!
Don't make a histogram of categorical data. Just because a zip code is a number does not automatically make it quantitative data.
Don't CUSS & BS a bar chart. Histograms are what cause us to swear.

CUSS & BS
Center
Unusual features (gaps, outliers)
Shape
Spread
& Be Specific

Measures of center? Mean, median
Shape? Unimodal, bimodal, multimodal, uniform, symmetric, skewed left, skewed right
Spread? IQR, SD; consistent or varied?

Histograms
Choose a bin width appropriate to the data.
[Figure: example histograms with bins that are too small and too large]

The 5-number summary: Min, Q1, Median, Q3, Max
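As a rough sketch, here is one way to compute a 5-number summary in Python. It assumes the "median of each half" convention for quartiles (the middle value is left out of both halves when the count is odd). Calculators and software may use slightly different quartile rules, and the data set is hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # values below the median position
    upper = data[half + n % 2:]  # values above it (middle value skipped when n is odd)
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )

scores = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical data
print(five_number_summary(scores))  # (1, 3.5, 7, 11.0, 15)
```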

Outliers!
Resistant? Median, IQR
Non-resistant? Mean, SD
How do we identify outliers? Any value beyond a fence:
Upper Fence: Q3 + (1.5)IQR
Lower Fence: Q1 - (1.5)IQR
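The fence rule is short enough to sketch in code. The quartiles and values below are hypothetical, chosen only to make the arithmetic easy to follow:

```python
def fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) for the k * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

q1, q3 = 10, 20                 # hypothetical quartiles, so IQR = 10
lower, upper = fences(q1, q3)   # 10 - 15 = -5 and 20 + 15 = 35

values = [-8, 2, 12, 18, 30, 36]  # hypothetical data
outliers = [v for v in values if v < lower or v > upper]
print(lower, upper, outliers)
```

Only values strictly beyond a fence are flagged, so -8 and 36 are outliers here while 30 is not.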

What is this variance you speak of?
How are variance and SD related?
The square root of variance is SD. Or, SD squared is variance.
[Formulas: variance, standard deviation]
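To make the relationship concrete, here is a small sketch that computes a sample variance by hand and then takes its square root. It assumes the usual sample formula (divide by n - 1), and the data are hypothetical.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
n = len(data)
mean = sum(data) / n             # 40 / 8 = 5.0

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
squared_deviations = [(x - mean) ** 2 for x in data]  # these sum to 32
variance = sum(squared_deviations) / (n - 1)          # 32 / 7

# Standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print(round(variance, 3), round(sd, 3))
```

Squaring `sd` gets you back to `variance`, which is the whole relationship in one line.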

Oh great, Greek letters
What is the difference between µ and x̄?
µ: population average
x̄: sample average
σ and s?
σ: population standard deviation
s: sample standard deviation
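Python's standard `statistics` module keeps the same distinction: `pstdev` treats the data as the whole population (σ) and `stdev` treats it as a sample (s). A brief sketch with hypothetical numbers:

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

# The arithmetic for the mean is the same either way; only the symbol changes
# (mu for a population, x-bar for a sample).
average = statistics.mean(values)

sigma = statistics.pstdev(values)  # population SD: divides by n
s = statistics.stdev(values)       # sample SD: divides by n - 1

print(average, sigma, round(s, 3))
```

Because s divides by n - 1 instead of n, it always comes out a little larger than σ for the same numbers.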

Comparing Distributions
Avoid inconsistent scales!
Label clearly!
Outliers!
CUSS & BS all distributions! Always mention center, unusual features, shape, spread.
BE SPECIFIC! (Provide a value where applicable.)

Comparing Distributions
When comparing histograms we can compare:
Center
Unusual features
Shape
Spread
Shocking, right?

Comparing Distributions
When comparing boxplots:
Compare the shapes. Do the boxes look symmetric or skewed?
Compare medians. Which group has a larger center? Any idea why?
Compare IQRs. Which group is more varied? More consistent?
Identify outliers, if any. Remember how to find the upper and lower fences!

Re-expressing Data
What do we do if our real data sucks? Make it not suck!
But how? Re-express the data by applying a function to it.
Common functions are the square root and the log.
Don't forget to convert our findings back from our re-expressed data!
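A minimal sketch of the re-express, summarize, convert back routine, assuming a base-10 log and a made-up, strongly right-skewed data set:

```python
import math
import statistics

incomes = [1_000, 10_000, 100_000, 1_000_000, 10_000_000]  # hypothetical, skewed right

# Re-express: take log base 10 of every value, which evens out the spacing.
logged = [math.log10(x) for x in incomes]

# Summarize on the re-expressed scale.
center_log = statistics.median(logged)

# Convert back: undo the log so the finding is in the original units.
center_original = 10 ** center_log

print(center_log, center_original)
```

The last step is the one that gets forgotten: a center of 5 on the log scale means nothing to a reader until it is turned back into 100,000.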

Categorical Practice
What percent of the class are females with Democratic political views?
What percent of the Democrats are female?
What percent of the females are Democrats?
What is the marginal frequency distribution of political views?
What is the conditional relative frequency distribution of gender among Republicans?
Are gender and political view independent?

Weight of Pennies
Make a histogram (while we CUSS & BS).
2.57, 2.56, 3.14, 3.03, 3.13, 2.47, 2.43, 3.11, 3.06, 2.48
2.51, 2.50, 3.07, 3.08, 3.01, 2.45, 2.50, 3.13, 2.51, 3.12
3.10, 3.08, 2.46, 2.44, 2.47, 2.54, 3.09, 3.13, 2.56, 2.49
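If you want to check your hand-drawn histogram, here is a rough text-histogram sketch. The bin width of 0.10 is an assumption (the slide does not specify one), and the function name is made up for illustration.

```python
from collections import Counter

weights = [
    2.57, 2.56, 3.14, 3.03, 3.13, 2.47, 2.43, 3.11, 3.06, 2.48,
    2.51, 2.50, 3.07, 3.08, 3.01, 2.45, 2.50, 3.13, 2.51, 3.12,
    3.10, 3.08, 2.46, 2.44, 2.47, 2.54, 3.09, 3.13, 2.56, 2.49,
]

def bin_counts(values, width=10):
    """Count values per bin, working in whole hundredths to avoid float edge cases."""
    counts = Counter()
    for v in values:
        hundredths = round(v * 100)
        counts[hundredths - hundredths % width] += 1  # key = left edge of the bin
    return counts

width = 10  # bin width in hundredths, i.e. 0.10
counts = bin_counts(weights, width)

# Print every bin from the lowest to the highest, including empty ones (the gap).
for start in range(min(counts), max(counts) + width, width):
    label = f"{start / 100:.2f}-{(start + width - 1) / 100:.2f}"
    print(f"{label} | {'*' * counts[start]}")
```

Try a larger `width` to see how bin choice changes the picture, then CUSS & BS what you see: two clusters, a gap in between, and no weights between 2.60 and 2.99.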

Comparing Distributions
Here are the weekly payrolls for two imaginary restaurants, Mooseburgers and McTofu.
1. Find the 5-number summaries for both.
2. Create parallel boxplots. Label your graph.
3. Write a few sentences comparing the distributions.
4. Which restaurant pays the higher average salary?
5. Why is the mean salary misleading?
6. Where would you rather work? Explain with stats!