Chapter 5 ATE: Probability: What Are the Chances?
Alternate Activities and Examples

[Page 283] Alternate Activity: Whose Book is This?
Suppose that 4 friends get together to study at Tim's house for their next test in AP Statistics. When they go for a snack in the kitchen, Tim's three-year-old brother makes a tower using their textbooks. Unfortunately, none of the students wrote his name in the book, so when they leave, each student takes one of the books at random. When the students returned the books at the end of the year and the clerk scanned their barcodes, the students were surprised that none of the four had their own book. How likely is it that none of the four students ended up with the correct book?
1. On four equally sized slips of paper, write the numbers 1, 2, 3, 4.
2. Shuffle the papers and lay them down one at a time in a row. If the number on a paper matches its position in the row (e.g., paper 2 ends up in the second position), this represents a student choosing his own book from the tower of textbooks. Count the number of students who get the correct book.
3. Repeat this several more times, recording the number of students who get the correct book in each trial.
4. Combine your results with your classmates' results and estimate how often none of the four end up with their own book.

[Page 284] Alternate Activity: Random Babies
The Whose Book is This? activity can also be explored using the Random Babies applet at www.rossmanchance.com/applets. This applet simulates a stork randomly delivering four babies to four different houses and counts the number of correct deliveries. Both of these problems involve derangements, so an Internet search for "derangements" will turn up more information.
1. Press Randomize to have the stork deliver the babies. If there is a correct match, the sun will shine on the house; otherwise there will be a storm cloud. The number of matches will be recorded in the histogram.
2. After animating several deliveries, change the Number of trials to 10 and press Randomize. Click inside the bar above 0 to see a plot that records the proportion of 0s after each trial.
3. Keep pressing Randomize to add more trials to the plot. What seems to be happening to the proportion of 0s?
4. The theoretical probability of 0 matches is 0.375. Based on this simulation, how would you interpret this value? (If we were to assign 4 babies to four houses at random over and over again, about 37.5% of the time none of the babies would end up at the correct house.)
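For classes without slips of paper or the applet, the same experiment can be sketched in Python. This is only an illustrative sketch: the function names, the seed, and the number of trials are assumptions, not part of the original activity. It lists all 24 orderings of four books to get the exact probability of no matches and also estimates that probability by repeated shuffling.

```python
import itertools
import random


def count_matches(order):
    """Count positions (1-based) where the slip shows its own position number."""
    return sum(1 for position, slip in enumerate(order, start=1) if position == slip)


def exact_prob_no_match(n):
    """Exact P(no matches), found by listing all n! equally likely orderings."""
    orderings = list(itertools.permutations(range(1, n + 1)))
    no_match = sum(1 for order in orderings if count_matches(order) == 0)
    return no_match / len(orderings)


def simulate_prob_no_match(n, trials, seed=0):
    """Estimate P(no matches) by shuffling n slips `trials` times."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    slips = list(range(1, n + 1))
    no_match = 0
    for _ in range(trials):
        rng.shuffle(slips)
        if count_matches(slips) == 0:
            no_match += 1
    return no_match / trials


if __name__ == "__main__":
    print(exact_prob_no_match(4))                    # 9 of the 24 orderings
    print(simulate_prob_no_match(4, trials=10_000))  # should land near the exact value
```

With four books, 9 of the 24 orderings have no matches, which is where the theoretical value 0.375 comes from; the simulated proportion should settle near it as the number of trials grows.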
[Page 285] Alternate Example: Whose Book is This?
The graphs below show the short-run and long-run behavior of the proportion of trials in which there are no matches when 4 students choose a book at random. The blue line is the correct probability of 0.375. As you can see, in the first 20 trials, there is quite a bit of variability. However, after 500 trials, the proportion of times there was no match is quite close to the actual value.

[Page 286] Alternate Example: Extended Warranties
How much should a company charge for an extended warranty for a specific type of cell phone? Suppose that 5% of these cell phones under warranty will be returned and the cost to replace the phone is $150. If the company knew which phones would go bad, it could charge $150 for these phones and $0 for the rest. However, since the company can't know which phones will be returned but knows that about 1 in every 20 will be returned, it should charge at least $150/20 = $7.50 for the extended warranty.

[Page 287] Alternate Activity: Streakiness
Suppose that a basketball announcer suggests that a certain player is streaky. That is, the announcer believes that if the player makes a shot, then he is more likely to make his next shot. As evidence, he points to a recent game where the player took 30 shots and had a streak of 7 made shots in a row. Is this evidence of streakiness or could it have occurred simply by chance? Assuming this player makes 50% of his shots and the results of a shot don't depend on previous shots, how likely is it for the player to have a streak of 7 or more made shots in a row?
1. Using a coin, let heads represent a made shot and tails represent a missed shot.
2. Flip a coin 30 times, writing down the outcome after each flip. Record the length of the longest streak of made shots.
3. Repeat many times and combine your results with your classmates' results. In what proportion of the trials did the player have a streak of at least 7 in a row?
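The coin-flipping procedure in the three steps above can also be carried out by computer. The sketch below is one possible version, assuming a 50% shooter taking 30 independent shots as described; the function names, default arguments, and seed are hypothetical choices made for illustration.

```python
import random


def longest_streak(shots):
    """Length of the longest run of consecutive made shots (True values)."""
    longest = current = 0
    for made in shots:
        current = current + 1 if made else 0  # a miss resets the current run
        longest = max(longest, current)
    return longest


def simulate_streaks(games, shots_per_game=30, p_make=0.5, threshold=7, seed=0):
    """Proportion of simulated games whose longest streak is at least `threshold`."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable results
    hits = 0
    for _ in range(games):
        shots = [rng.random() < p_make for _ in range(shots_per_game)]
        if longest_streak(shots) >= threshold:
            hits += 1
    return hits / games


if __name__ == "__main__":
    print(simulate_streaks(games=10_000))
```

Running many more simulated games than a class can flip by hand gives a steadier estimate to compare with the class's combined proportion.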
Here are the results of 50 trials of this simulation. The player had a streak of 7 or more made shots in 7 of the 50 simulated games (14%).
[Dot plot: LongestStreak for the 50 simulated games, axis from 2 to 10]

[Page 287] Alternate Example: Runs in Die Rolling
Roll a die 12 times and record the result of each roll. Which of the following outcomes is more probable?
123456654321
154524336126
These outcomes are equally (un)likely, even though the first set of rolls has a more noticeable pattern. Each specific sequence of 12 rolls has probability (1/6)^12.

[Page 288] Alternate Example: Joe DiMaggio's Hitting Streak
There was an interesting discussion of the hot hand in an article in the New York Times written by Samuel Arbesman and Steven Strogatz (http://www.nytimes.com/2008/03/30/opinion/30strogatz.html). In the article, the authors claim that one of the most remarkable streaks in baseball history, Joe DiMaggio's 56-game hitting streak, was actually not very remarkable at all. Obviously it is extremely unlikely for any particular individual to have a hitting streak this long. But when considering all the players and all the seasons in baseball history, we should expect some very unusual performances every now and then. To investigate, they simulated the performances of every baseball player in every season a total of 10,000 times. In each of those 10,000 simulated histories of baseball, they recorded the longest hitting streak. In about 42% of the trials of the simulation, someone had a hitting streak of at least 56 games in a row, with the longest being an amazing 109 games in a row!
[Page 289] Alternate Example: Red is Due!
In casinos, there is often a large display next to every roulette table showing the outcomes of the last several spins of the wheel. Since the results of previous spins reveal nothing about the results of future spins, why do the casinos pay for these displays? Because many players use the previous results to determine what bets to make, even though it won't help them win. And as long as the players keep making bets, the casino keeps making money.

[Page 290] Alternate Example: Stratified Sampling
Suppose I want to choose a simple random sample of size 6 from a group of 60 seniors and 30 juniors. To do this, I write each person's name on an equally sized piece of paper and mix them up in a large grocery bag. Just as I am about to select the first name, a thoughtful student suggests that I should stratify by class. I agree, and we decide it would be appropriate to select 4 seniors and 2 juniors. However, since I already mixed up the names, I don't want to have to separate them all again. Instead, I will select names one at a time from the bag until I get 4 seniors and 2 juniors. This means, however, that I may need to select more than 6 names (e.g., I may get more than 2 juniors before I get the 4 seniors). Design and carry out a simulation using Table D to estimate the probability that you must draw 10 or more names to get 4 seniors and 2 juniors.
State: What is the probability that it takes 10 or more selections to get 4 seniors and 2 juniors?
Plan: Using pairs of digits from Table D, we'll label the 60 seniors 01-60 and the 30 juniors 61-90. Numbers 00 and 91-99 will be skipped. Moving left to right across a row, we'll look at pairs of digits until we have 4 different labels from 01-60 and 2 different labels from 61-90. Then, we will count how many different labels from 01-90 we looked at.
Do: Here is an example of one repetition, using line 101 from Table D:
19 (senior) 22 (senior) 39 (senior) 57 (senior) 34 (senior) 05 (senior) 75 (junior) 62 (junior)
In this example, it took 8 selections to get at least 4 seniors and at least 2 juniors. Here are the results of 50 trials:
[Dot plot: Number of selections needed in each of the 50 trials, axis from 6 to 14]
Conclude: In the simulation, 11 of the 50 trials required 10 or more selections to get 4 seniors and 2 juniors, so the probability that it takes 10 or more selections is approximately 11/50 = 0.22.
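The Table D procedure can be mirrored in software by shuffling a list of 60 seniors and 30 juniors and drawing from the front until both quotas are met. The following is a sketch under that assumption; it replaces Table D with Python's random number generator, and the function names and seed are hypothetical.

```python
import random


def selections_needed(rng, seniors=60, juniors=30, need_seniors=4, need_juniors=2):
    """Draw names without replacement until both quotas are met; return the draw count."""
    bag = ["senior"] * seniors + ["junior"] * juniors
    rng.shuffle(bag)  # a shuffled bag drawn front to back is sampling without replacement
    got_seniors = got_juniors = 0
    for draws, name in enumerate(bag, start=1):
        if name == "senior":
            got_seniors += 1
        else:
            got_juniors += 1
        if got_seniors >= need_seniors and got_juniors >= need_juniors:
            return draws
    raise ValueError("quotas cannot be met with this bag")


def estimate_prob_at_least(threshold, trials, seed=0):
    """Estimate P(number of selections >= threshold) from repeated trials."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable results
    count = sum(1 for _ in range(trials) if selections_needed(rng) >= threshold)
    return count / trials


if __name__ == "__main__":
    print(estimate_prob_at_least(10, trials=10_000))
```

Because it runs thousands of trials rather than 50, the estimate will usually differ somewhat from the 0.22 found by hand above.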
[Page 291] Alternate Example: Picking Teams
At a department picnic, 18 students in the Mathematics/Statistics department at a university decide to play a softball game. Twelve of the 18 students are Math majors and 6 are Stats majors. To divide into two teams of 9, one of the professors put all the players' names into a hat and drew out 9 players to form one team, with the remaining 9 players forming the other team. The players were surprised when one team was made up entirely of Math majors. Is it possible that the names weren't adequately mixed in the hat, or could this have happened by chance? Design and carry out a simulation to help answer this question.
State: What is the probability that when randomly assigning 12 Math majors and 6 Stats majors to two teams, there will be one team with all Math majors?
Plan: Using 18 equally sized slips of paper, label 12 with M to represent the Math majors and the other 6 with S to represent the Stats majors. Shuffle the papers well and divide them into two piles of 9. Count the number of Math majors on each team and record the number of Math majors on the team with the most Math majors.
Do: Here is an example of one trial:
Team A: MMMSMMMMM (8 Math majors)
Team B: MSMMSSSSM (4 Math majors)
Since the team with the most Math majors had 8, we will record the value 8 for this trial. Here are the results of 30 trials:
[Dot plot: NumMath (number of Math majors on the team with the most Math majors) for the 30 trials]
Conclude: Since only 1 trial in 30 resulted in a team with all Math majors, the probability is only approximately 0.033. Since getting a team of all Math majors is unlikely, we can conclude that the names were probably not shuffled very well in the hat.

[Page 299] Alternate Example: Flipping Coins
Imagine flipping a fair coin three times.
Problem: Give a probability model for this chance process.
Solution: There are 8 possible outcomes when we flip the coin three times: HHH HHT HTH HTT TTT TTH THT THH Since the coin is fair, each of these eight outcomes will be equally likely and have a probability of 1/8.
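The sample space above can be generated mechanically, which is handy when the number of flips grows. This short sketch builds the eight outcomes and assigns each probability 1/8; the variable names and the helper `probability` are illustrative assumptions, as is the sample event "exactly two heads", which is not part of the original example.

```python
from fractions import Fraction
from itertools import product

# All sequences of H/T of length 3: the sample space for three flips.
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# Fair coin: every outcome is equally likely.
model = {outcome: Fraction(1, len(outcomes)) for outcome in outcomes}


def probability(event):
    """Add the probabilities of the outcomes for which `event` is true."""
    return sum(p for outcome, p in model.items() if event(outcome))


if __name__ == "__main__":
    print(outcomes)
    print(probability(lambda outcome: outcome.count("H") == 2))
```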
[Page 302] Alternate Example: AP Statistics Scores
Randomly select a student who took the 2010 AP Statistics exam and record the student's score. Here is the probability model:

Score        1      2      3      4      5
Probability  0.233  0.183  0.235  0.224  0.125

Problem: (a) Show that this is a legitimate probability model. (b) Find the probability that the chosen student scored 3 or better.
Solution:
(a) All of the probabilities are between 0 and 1 and the sum of the probabilities is 1, so this is a legitimate probability model.
(b) There are two ways to find this probability:
By the addition rule, P(3 or better) = 0.235 + 0.224 + 0.125 = 0.584.
By the complement rule and addition rule, P(3 or better) = 1 − P(2 or less) = 1 − (0.233 + 0.183) = 0.584.

[Page 303] Alternate Example: Who Owns a Home?
What is the relationship between educational achievement and home ownership? A random sample of 500 people who participated in the 2000 census was chosen. Each member of the sample was identified as a high school graduate (or not) and as a home owner (or not). The two-way table displays the data.

                   High School Graduate   Not a High School Graduate   Total
Homeowner          221                    119                          340
Not a Homeowner    89                     71                           160
Total              310                    190                          500

Problem: Suppose we choose a member of the sample at random. Find the probability that the member
(a) is a high school graduate
(b) is a high school graduate and owns a home
(c) is a high school graduate or owns a home
Solution: We will define event A as being a high school graduate and event B as being a homeowner.
(a) Since 310 of the 500 members of the sample graduated from high school, P(A) = 310/500.
(b) Since 221 of the 500 members of the sample graduated from high school and own a home, P(A and B) = 221/500.
(c) Since there are 221 + 89 + 119 = 429 people who graduated from high school or own a home, P(A or B) is 429/500. Note that it is inappropriate to compute P(A) + P(B) to find this probability since the events A and B are not mutually exclusive: there are 221 people who are both high school graduates and own a home. If you did add these probabilities, the result would be 650/500, which is clearly wrong since the probability is greater than 1.

[Page 306] Alternate Example: Who Owns a Home?
Here is the two-way table summarizing the relationship between educational status and home ownership from the previous example:

                   High School Graduate   Not a High School Graduate   Total
Homeowner          221                    119                          340
Not a Homeowner    89                     71                           160
Total              310                    190                          500

The four distinct regions in the Venn diagram below correspond to the four non-total cells in the two-way table as follows:

Region in Venn Diagram                In Words                            In Symbols    Count
In the intersection of two circles    HS grad and owns home               A ∩ B         221
Inside circle A, outside circle B     HS grad and doesn't own home        A ∩ B^c       89
Inside circle B, outside circle A     Not HS grad but owns home           A^c ∩ B       119
Outside both circles                  Not HS grad and doesn't own home    A^c ∩ B^c     71

[Venn diagram: circles A and B with counts 89 (A only), 221 (A and B), 119 (B only), and 71 (outside both circles)]
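The counts in the four regions are enough to recompute every probability in this example. The sketch below stores the four non-total cells and checks that counting directly agrees with the general addition rule P(A or B) = P(A) + P(B) − P(A and B); the dictionary layout and the helper name `prob` are assumptions made for illustration.

```python
from fractions import Fraction

# counts[(is_hs_grad, owns_home)] holds the four non-total cells of the table
counts = {
    (True, True): 221,    # A and B
    (True, False): 89,    # A only
    (False, True): 119,   # B only
    (False, False): 71,   # neither
}
total = sum(counts.values())


def prob(event):
    """P(event) for a randomly chosen member of the sample."""
    return Fraction(sum(n for cell, n in counts.items() if event(*cell)), total)


p_a = prob(lambda grad, home: grad)
p_b = prob(lambda grad, home: home)
p_a_and_b = prob(lambda grad, home: grad and home)
p_a_or_b = prob(lambda grad, home: grad or home)

if __name__ == "__main__":
    print(p_a_or_b, p_a + p_b - p_a_and_b)  # the two routes agree
    print(p_a + p_b)                        # greater than 1: double counts A and B
```

Using `Fraction` keeps the arithmetic exact, so the two routes to P(A or B) can be compared with `==` rather than a rounding tolerance.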
[Page 307] Alternate Example: Phone Usage
According to the National Center for Health Statistics (http://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless200905_tables.htm#t1), in December 2008, 78% of US households had a traditional landline telephone, 80% of households had cell phones, and 60% had both. Suppose we randomly selected a household in December 2008.
Problem:
(a) Make a two-way table that displays the sample space of this chance process.
(b) Construct a Venn diagram to represent the outcomes of this chance process.
(c) Find the probability that the household has at least one of the two types of phones.
(d) Find the probability the household has a cell phone only.
Solution: We will define events A: has a landline and B: has a cell phone.
(a)
               Cell Phone   No Cell Phone   Total
Landline       0.60         0.18            0.78
No Landline    0.20         0.02            0.22
Total          0.80         0.20            1.00

(b) [Venn diagram: circle A (Landline) and circle B (Cell phone) with probabilities 0.18 (A only), 0.60 (A and B), 0.20 (B only), and 0.02 (outside both circles)]
(c) To find the probability that the household has at least one of the two types of phones, we need to find the probability that the household has a landline, a cell phone, or both.
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.78 + 0.80 − 0.60 = 0.98.
There is a 98% chance that the household has at least one of the two types of phones.
(d) P(cell phone only) = P(A^c ∩ B) = 0.20
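Every cell of the two-way table in part (a) follows from the three given percentages by subtraction. As a sketch of that bookkeeping (the variable names are assumptions; `Fraction` is used only to avoid floating-point rounding):

```python
from fractions import Fraction

# The three facts given in the problem statement.
p_landline = Fraction(78, 100)
p_cell = Fraction(80, 100)
p_both = Fraction(60, 100)

# Remaining cells of the two-way table.
landline_only = p_landline - p_both          # landline, no cell phone
cell_only = p_cell - p_both                  # cell phone, no landline
at_least_one = p_landline + p_cell - p_both  # general addition rule
neither = 1 - at_least_one                   # complement rule

if __name__ == "__main__":
    for label, value in [
        ("landline only", landline_only),
        ("cell only", cell_only),
        ("at least one", at_least_one),
        ("neither", neither),
    ]:
        print(label, float(value))
```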
[Page 313] Alternate Example: Who Owns a Home?

                   High School Graduate   Not a High School Graduate   Total
Homeowner          221                    119                          340
Not a Homeowner    89                     71                           160
Total              310                    190                          500

1. If we know that a person owns a home, what is the probability that the person is a high school graduate? There are a total of 340 people in the sample who own a home. Because there are 221 high school graduates among the 340 home owners, the desired probability is P(is a high school graduate given owns a home) = 221/340, or 65%.
2. If we know that a person is a high school graduate, what is the probability that the person owns a home? There are a total of 310 people who are high school graduates. Because there are 221 home owners among the 310 high school graduates, the desired probability is P(owns a home given is a high school graduate) = 221/310, or about 71%.

[Page 315] Alternate Example: Who Owns a Home?
The events of interest in this scenario were A: is a high school graduate and B: owns a home. We already learned that P(B) = 340/500 = 68% and that P(B | A) = 221/310 ≈ 71.3%. That is, we know that a randomly selected member of the sample has a 68% probability of owning a home. However, if we know that the randomly selected member is a high school graduate, the probability of owning a home increases to 71.3%.

[Page 316] Alternate Example: Allergies
Is there a relationship between gender and having allergies? To find out, we used the random sampler at the United States Census at School website (www.amstat.org/censusatschool) to randomly select 40 US high school students who completed a survey. The two-way table shows the gender of each student and whether the student has allergies.

                Female   Male   Total
Allergies       10       8      18
No Allergies    13       9      22
Total           23       17     40

Problem: Are the events female and allergies independent? Justify your answer.
Solution: To check if two events are independent, we need to check if knowing a student's gender affects the probability that the student has allergies.
If a student is female, then the probability she has allergies is P(allergies | female) = 10/23 ≈ 0.435. However, the unconditional probability of having allergies is P(allergies) = 18/40 = 0.45. These two probabilities are close, but not equal, so the events female and allergies are not independent. Knowing that a student was female slightly lowered the probability that she has allergies.

[Page 318] Alternate Example: Picking Two Sneezers
In the previous alternate example, we used a two-way table that classified 40 students according to their gender and whether they had allergies. Here is the table again.

                Female   Male   Total
Allergies       10       8      18
No Allergies    13       9      22
Total           23       17     40

Problem: Suppose we chose 2 students at random.
(a) Draw a tree diagram that shows the sample space for this chance process.
(b) Find the probability that both students suffer from allergies.
Solution:
(a) [Tree diagram: the first set of branches shows whether the 1st student has allergies (18/40) or not (22/40); the second set shows the 2nd student's outcome given the 1st, with 17/39 on the top branch]
(b) To get two students who suffer from allergies, we need to get an allergy sufferer for the first student and an allergy sufferer for the second student. Following along the top branches of the tree, we see that the probability is:
P(two allergy sufferers) = P(1st student has allergies and 2nd student has allergies)
= P(1st student has allergies) × P(2nd student has allergies | 1st student has allergies)
= (18/40)(17/39) ≈ 0.196
There is about a 20% chance of selecting two students with allergies.
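Both allergy examples reduce to a few ratios of table counts. The sketch below, with hypothetical variable names, compares P(allergies | female) with P(allergies) to check independence and then applies the general multiplication rule to two students chosen without replacement.

```python
from fractions import Fraction

# counts[(gender, has_allergies)] from the two-way table
counts = {
    ("female", True): 10,
    ("male", True): 8,
    ("female", False): 13,
    ("male", False): 9,
}
total = sum(counts.values())
with_allergies = sum(n for (_, allergic), n in counts.items() if allergic)
females = sum(n for (gender, _), n in counts.items() if gender == "female")

# Independence check: does conditioning on "female" change the probability?
p_allergies = Fraction(with_allergies, total)
p_allergies_given_female = Fraction(counts[("female", True)], females)
independent = p_allergies_given_female == p_allergies

# Two students chosen without replacement: general multiplication rule.
p_first = Fraction(with_allergies, total)
p_second_given_first = Fraction(with_allergies - 1, total - 1)
p_two_sufferers = p_first * p_second_given_first

if __name__ == "__main__":
    print(float(p_allergies_given_female), float(p_allergies), independent)
    print(float(p_two_sufferers))
```

The exact equality test reflects the definition of independence used in the example; with sample data like these, a small difference between the two probabilities is expected even when there is no real relationship in the population.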
[Page 319] Alternate Example: Playing in the NCAA
About 55% of high school students participate in a school athletic team at some level and about 5% of these athletes go on to play on a college team in the NCAA. (http://www.washingtonpost.com/wp-dyn/content/article/2009/09/23/ar2009092301947.html, http://www.collegesportsscholarships.com/percentage-high-school-athletes-ncaa-college.htm)
Problem: What percent of high school students play a sport in high school and go on to play a sport in the NCAA?
Solution: We know P(high school sport) = 0.55 and P(NCAA sport | high school sport) = 0.05, so
P(high school sport and NCAA sport) = P(high school sport) × P(NCAA sport | high school sport) = (0.55)(0.05) = 0.0275.
Almost 3% of high school students will play a sport in high school and in the NCAA.

[Page 320] Alternate Example: Media Usage and Good Grades
In January 2010, the Kaiser Family Foundation released a study about the influence of media in the lives of young people ages 8-18 (http://www.kff.org/entmedia/mh012010pkg.cfm). In the study, 17% of the youth were classified as light media users, 62% were classified as moderate media users and 21% were classified as heavy media users. Of the light users who responded, 74% described their grades as good (A's and B's), while only 68% of the moderate users and 52% of the heavy users described their grades as good. According to this study, what percent of young people ages 8-18 described their grades as good?
State: What percent of young people get good grades?
Plan: If we choose a subject in the study at random, P(light user) = 0.17, P(moderate user) = 0.62, P(heavy user) = 0.21, P(good grades | light user) = 0.74, P(good grades | moderate user) = 0.68, P(good grades | heavy user) = 0.52. We want to find the unconditional probability P(good grades). A tree diagram should help.
Do: There are three groups of students who get good grades: those who are light users and get good grades, those who are moderate users and get good grades, and those who are heavy users and get good grades. Because these groups are mutually exclusive, we can add the probabilities of being in one of these three groups.
P(good grades) = (0.17)(0.74) + (0.62)(0.68) + (0.21)(0.52) = 0.1258 + 0.4216 + 0.1092 = 0.6566.
Conclude: About 66% of the students in the study described their grades as good.

[Page 321] Alternate Example: Perfect Games
In baseball, a perfect game is when a pitcher doesn't allow any hitters to reach base in all nine innings. Historically, pitchers throw a perfect inning (an inning where no hitters reach base) about 40% of the time (http://www.baseballprospectus.com/article.php?articleid=11110). So, to throw a perfect game, a pitcher needs to have nine perfect innings in a row.
Problem: What is the probability that a pitcher throws nine perfect innings in a row, assuming the pitcher's performance in an inning is independent of his performance in other innings?
Solution: The probability of having nine perfect innings in a row is
P(inning 1 perfect and inning 2 perfect and ... and inning 9 perfect)
= P(inning 1 perfect) × P(inning 2 perfect) × ... × P(inning 9 perfect)
= (0.4)(0.4)...(0.4) = (0.4)^9 ≈ 0.00026.
With 30 teams playing 162 games per season, this means we would expect to see about (30)(162)(0.00026) ≈ 1.3 perfect games per season. However, in baseball history, perfect games are much rarer, occurring once every 4 or 5 seasons, on average. This discrepancy suggests that our assumption of independence may not have been a good one.

[Page 322] Alternate Example: First Trimester Screen
The First Trimester Screen is a non-invasive test given during the first trimester of pregnancy to determine if there are specific chromosomal abnormalities in the fetus. According to a study published in the New England Journal of Medicine in November 2005 (http://www.americanpregnancy.org/prenataltesting/firstscreen.html), approximately 5% of normal pregnancies will receive a positive result. Among 100 women with normal pregnancies, what is the probability that there will be at least one false positive?
State: If 100 women with normal pregnancies are tested with the First Trimester Screen, what is the probability that at least one woman will receive a positive result?
Plan: It is reasonable to assume that the test results for different women are independent. To find the probability of at least one false positive, we can use the complement rule and the probability that none of the women will receive a positive test result.
P(at least one positive) = 1 − P(no positive results)
Do: For women with normal pregnancies, the probability that a single test is not positive is 1 − 0.05 = 0.95. The probability that all 100 women will get negative results is (0.95)(0.95)...(0.95) = (0.95)^100 ≈ 0.0059. Thus, P(at least one positive) = 1 − 0.0059 = 0.9941.
Conclude: There is over a 99% probability that at least one of the 100 women with normal pregnancies will receive a false positive on the First Trimester Screen.
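The "at least one" calculation is the same for any number of independent trials, so it can be wrapped in a small helper. The sketch below is illustrative only; the function names are assumptions. It also reuses the multiplication rule for independent events from the perfect-game example.

```python
def prob_all(p_single, n):
    """P(all n independent trials succeed): multiplication rule for independent events."""
    return p_single ** n


def prob_at_least_one(p_single, n):
    """P(at least one success in n independent trials), via the complement rule."""
    return 1 - prob_all(1 - p_single, n)


if __name__ == "__main__":
    # At least one false positive among 100 women, each with a 5% chance.
    print(round(prob_at_least_one(0.05, 100), 4))
    # Nine perfect innings in a row, each with a 40% chance.
    print(round(prob_all(0.4, 9), 5))
```

Both helpers rely on the independence assumption stated in the examples; if results are not independent, simple powers like these no longer apply.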
[Page 323] Alternate Example: Weather Conditions
Hacienda Heights and La Puente are two neighboring suburbs in the Los Angeles area. According to the local newspaper, there is a 50% chance of rain tomorrow in Hacienda Heights and a 50% chance of rain in La Puente. Does this mean there is a (0.5)(0.5) = 0.25 probability that it will rain in both cities tomorrow? No. It is not appropriate to multiply the two probabilities since the events aren't independent. If it is raining in one of these locations, there is a very high probability that it is raining in the other location. However, suppose that there was also a 50% chance of rain in New York tomorrow. To find the probability it will rain in Hacienda Heights and New York, it would be appropriate to multiply the probabilities since it is reasonable to believe that the weather in Hacienda Heights is independent of the weather in New York.

[Page 324] Alternate Example: Phone Usage
In an alternate example in section 5.2, we classified US households according to the types of phones they used.

               Cell Phone   No Cell Phone   Total
Landline       0.60         0.18            0.78
No Landline    0.20         0.02            0.22
Total          0.80         0.20            1.00

Problem: What is the probability that a randomly selected household with a landline also has a cell phone?
Solution: We want to find P(cell phone | landline). Using the conditional probability formula,
P(cell phone | landline) = P(cell phone and landline) / P(landline) = 0.60 / 0.78 ≈ 0.77.
There is a 77% chance that a landline user also has a cell phone.
[Page 325] Alternate Example: Media Usage and Good Grades
In an earlier alternate example, we looked at the relationship between media usage and grades for youth ages 8-18. What percent of students with good grades are heavy users of media? The tree diagram below summarizes the probabilities given earlier.
State: What percent of youth with good grades are heavy users of media?
Plan: We want to find P(heavy user | good grades) = P(heavy user and good grades) / P(good grades).
Do: Using the tree diagram,
P(heavy user | good grades) = P(heavy user and good grades) / P(good grades) = 0.1092 / (0.1258 + 0.4216 + 0.1092) = 0.1092 / 0.6566 ≈ 0.166.
Conclude: About 16.6% of youth with good grades are heavy users of media. Since this is less than the unconditional probability of being a heavy user of media (0.21), knowing that a youth gets good grades lowers the probability that the youth is a heavy user of media.
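The tree-diagram arithmetic for the media usage study can be written as two small functions: one that adds along the branches to get P(good grades), and one that divides a single branch by that total. This is a sketch with hypothetical names (`total_probability`, `posterior`), using the percentages quoted from the study.

```python
def total_probability(priors, likelihoods):
    """Sum of P(group) * P(outcome | group) over all groups (one term per tree branch)."""
    return sum(priors[group] * likelihoods[group] for group in priors)


def posterior(group, priors, likelihoods):
    """P(group | outcome): one branch's probability divided by the total."""
    return priors[group] * likelihoods[group] / total_probability(priors, likelihoods)


# Values quoted from the study in the example.
usage = {"light": 0.17, "moderate": 0.62, "heavy": 0.21}
good_given_usage = {"light": 0.74, "moderate": 0.68, "heavy": 0.52}

if __name__ == "__main__":
    print(round(total_probability(usage, good_given_usage), 4))
    print(round(posterior("heavy", usage, good_given_usage), 3))
```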
[Page 326] Alternate Example: False Positives and Drug Testing
Many employers require prospective employees to take a drug test. A positive result on this test indicates that the prospective employee uses illegal drugs. However, not all people who test positive actually use drugs. Suppose that 4% of prospective employees use drugs, the false positive rate is 5% and the false negative rate is 10%. (http://www.cbsnews.com/stories/2010/06/01/health/webmd/main6537635.shtml)
Problem: What percent of people who test positive actually use illegal drugs?
Solution: The tree diagram summarizes this situation.
P(took drugs | positive test) = P(took drugs and tested positive) / P(tested positive).
There are two groups who tested positive: those who take drugs and test positive (probability = (0.04)(0.90) = 0.036) and those who don't take drugs and test positive (probability = (0.96)(0.05) = 0.048), so the probability of testing positive is 0.036 + 0.048 = 0.084. Thus,
P(took drugs | positive test) = 0.036 / (0.036 + 0.048) = 0.036 / 0.084 ≈ 0.429.
That is, 42.9% of the prospective employees who test positive actually took drugs.
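Students often find this result easier to accept when it is restated as counts. The sketch below imagines a hypothetical group of 10,000 prospective employees (the group size and the function name are assumptions, not part of the example) and counts who tests positive, using the same 4%, 5%, and 10% rates.

```python
from fractions import Fraction


def positive_test_breakdown(population, p_user, false_positive_rate, false_negative_rate):
    """Return (users who test positive, non-users who test positive) as expected counts."""
    users = population * p_user
    non_users = population - users
    true_positives = users * (1 - false_negative_rate)
    false_positives = non_users * false_positive_rate
    return true_positives, false_positives


# Rates from the example, written as exact fractions to avoid rounding noise.
true_pos, false_pos = positive_test_breakdown(
    population=10_000,
    p_user=Fraction(4, 100),
    false_positive_rate=Fraction(5, 100),
    false_negative_rate=Fraction(10, 100),
)
share_users_among_positives = true_pos / (true_pos + false_pos)

if __name__ == "__main__":
    print(true_pos, false_pos)                 # expected counts of each kind of positive
    print(float(share_users_among_positives))  # matches the tree-diagram answer
```

Because non-users greatly outnumber users, even a small false positive rate produces more false positives than true positives in this scenario.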