Data Set 7: Bioerosion by Parrotfish


Background

Bioerosion of coral reefs results from animals taking bites out of the calcium-carbonate skeleton of the reef. Parrotfishes are major bioerosion agents, and excrete ingested coral or rock as sand. Ling Ong, a doctoral student in the Zoology department, has estimated the total rate of such bioerosion and sand production in Hanauma Bay by two species of parrotfishes:

- The Spectacled parrotfish, Chlorurus perspicillatus. This species is endemic to Hawai'i. It has two color phases: the initial phase (top at right), typically female and called ulu 'ahu'ula, and the terminal phase (second at right), always male and called uhu uli uli.
- The Redlip parrotfish, Scarus rubroviolaceus (palukaluka). It also has initial (third at right) and terminal (bottom right) color phases.

A nice summary of Ling's research is at http://www.friendsofhanaumabay.org/parrotfish.html.

To estimate the amount of bioerosion she has:

- estimated the number of fish in the bay, by species, phase, and size class;
- observed fish to estimate the rate of bites on (dead) coral, by species, phase, size class, time of day, and season; and
- measured the length, width, and depth of scars from bites by observed fish, again by fish species, phase, and size class.

This handout will use data from the last of these studies to compare the volume of bites by the two species. The Spectacled parrotfish has what is called an excavator jaw and muscle morphology, which is more suited to removing substrate than the scraper morphology of the Redlip parrotfish, so Ling expected the former to have larger bite volumes.

The question:

Do the two species of parrotfish differ in the volume of material removed per bite? In particular, are Spectacled parrotfish bites larger than Redlip parrotfish bites?

The data

Procedure

Individual fish were followed by a scuba diver, and the first visible bite each fish made was marked. The species and phase of the fish were recorded and its length was estimated in cm size classes. The diver then returned to the marked bite and used a caliper to measure the length, width, and depth, in mm. From these measurements the volume, in mm^3, of each bite was calculated. Data were obtained for 71 fish of each species.

Bite volume, not surprisingly, is positively correlated with the size of the fish, and there was a wide range of fish sizes in the data set. The distribution of bite sizes was skewed, with a longer right tail, and the variability of the bite sizes was greater for larger fish (and larger bites). The variable used in the following analysis therefore is the natural log of the bite volume, adjusted to remove the effect of fish size.*

The data

Spectacled parrotfish:
3.90123 3.9296 4.23013 3.0990 2.26680 4.99827 1.98299 2.26031 4.7842 1.6206
4.02137 4.26984 3.34218 3.10231 4.307 3.97369 3.41949 3.99316 3.86271 2.10427
4.21161 2.2071 2.6483 3.08779 4.06863 2.3987 3.68881 1.8489 2.0810 2.66496
3.89647 4.14366 2.92932 2.1677 3.61244 4.0219 4.79243 3.3723 3.82729 4.08780.18730
4.62477 2.0772.3040 4.08 3.96068 4.1966 3.97734 3.712 3.18461 2.00191 2.82289
1.62649 4.62933 4.0108 4.72144 1.9074 2.21872 2.72849 4.11236 3.68848.02292 3.74172
4.4668 2.98126 -0.431 3.7039 4.212 0.46403 2.72990 3.73361

Redlip parrotfish:
3.7222 2.71270 4.14807 2.1260 1.18742.31680 3.3437 3.7708 4.03709 4.33899 3.8496
4.48878 1.38480 3.09747 4.4060 2.806 3.4932 2.8036 4.22320 3.7798 3.8864
4.24197 3.4909 4.63324 4.10483 4.7647 1.84641 4.40461 4.9747 3.09007.049 3.06241
4.117 1.49617 3.98338 2.1846 3.2738 4.9638 3.21262 2.8966 3.02876 3.82693
3.4888 4.1311 2.8991 3.73866 2.6309 3.1481 4.13907 3.0837 3.48708 2.2681
4.407.086 2.8808 4.3780 4.7368 4.71623 3.0930.08019 4.4388 4.72041 2.16422 3.41687
3.7037 0.47616 3.69194 3.7491 3.20732 3.62361 4.3768

* The size adjustment was based on a regression of log-transformed bite volume against fish size, with a common slope for the two species but separate intercepts. This regression model fit the data well, according to the usual residual plots and other diagnostics. The slope of this regression was then used to adjust all bite volumes as if all fish were the same size (the overall mean): to each observation was added the product of this slope and the difference between the mean size and the individual fish's size.

Data Set 7: Bioerosion by Parrotfish (rev. October 19, 2005)
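The size adjustment described in the footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`common_slope`, `size_adjust`) and the tiny data set are hypothetical, and the common slope is computed with the standard within-group least-squares formula, which is what a regression with separate intercepts and a shared slope yields. The handout's own fit was done in a statistics package and is not reproduced here.

```python
from statistics import mean


def common_slope(sizes, log_volumes, species):
    """Least-squares slope shared by all species, with a separate intercept each."""
    groups = {}
    for x, y, g in zip(sizes, log_volumes, species):
        groups.setdefault(g, []).append((x, y))
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        # Pool the within-species cross-products and squared deviations.
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    return sxy / sxx


def size_adjust(sizes, log_volumes, species):
    """Shift each log volume to what it would be at the overall mean fish size."""
    slope = common_slope(sizes, log_volumes, species)
    overall_mean_size = mean(sizes)
    return [y + slope * (overall_mean_size - x)
            for x, y in zip(sizes, log_volumes)]


# Hypothetical toy data (NOT from the study): log volume = intercept + 0.1 * size.
toy_sizes = [10, 20, 30, 10, 20, 30]
toy_species = ["Redlip"] * 3 + ["Spectacled"] * 3
toy_log_volumes = [2.0, 3.0, 4.0, 3.0, 4.0, 5.0]
print(size_adjust(toy_sizes, toy_log_volumes, toy_species))
```

After adjustment, fish of the same species in the toy data all share one value, so only the between-species offset remains; that is the sense in which the adjusted variable removes the effect of fish size.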

Data exploration

Displays of the distributions

Histograms and boxplots

[Figure: histograms of the log of bite volume, adjusted for fish size (panel variable: Species), and side-by-side boxplots of the same variable for Redlip and Spectacled parrotfish.]

The distributions are similar between the two species. Both are skewed, more fish having large bites and smaller numbers of much smaller bites creating long left tails. The smallest bites are considered outliers in the boxplots; in the histograms the smallest Redlip value does not look outlying while the two smallest Spectacled values do. The centers of the two distributions appear to be very similar. The lower/left end of the Spectacled distribution is slightly more stretched out, but otherwise there seems to be little if any difference between the species.

NQQ plots

[Figure: superimposed NQQ plots (Percent vs. Data) for Spectacled and Redlip parrotfish.]

The superimposed NQQ plots show that the distributions are very similar, except that the Redlip distribution is smoother than the Spectacled distribution. Both clearly are skewed.

Statistical summary

Species      Mean    StDev   Minimum  Q1      Median  Q3      Maximum
Redlip       3.601   0.997   0.476    3.062   3.722   4.377   .0
Spectacled   3.409   1.11    -0.436   2.648   3.734   4.144   5.304

These statistics suggest that bites by Spectacled parrotfish are slightly smaller than those by Redlip parrotfish: the Spectacled maximum, third quartile, and mean all are about 0.2 units smaller than the corresponding Redlip statistics, and the first quartile and minimum are 0.4 and 0.9 units smaller. Interestingly, the Spectacled median is very slightly larger than the Redlip median.

The distances between quantiles get smaller going from the minimum to Q1 up through Q3 to the maximum, reflecting the skew of the distributions; the differences between means and medians also show this. The spreads of the distributions also are similar, but slightly smaller for Redlips (IQR: 1.315 vs. 1.496; standard deviation: 0.997 vs. 1.11).
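The comparisons in the paragraph above are simple differences between rows of the summary table. A minimal sketch of that arithmetic, using only the mean, minimum, quartiles, and median as printed in the table (the dictionary layout and the helper name `iqr` are illustrative choices, not part of the handout):

```python
# Summary statistics copied from the table above.
redlip = {"mean": 3.601, "minimum": 0.476, "q1": 3.062,
          "median": 3.722, "q3": 4.377}
spectacled = {"mean": 3.409, "minimum": -0.436, "q1": 2.648,
              "median": 3.734, "q3": 4.144}


def iqr(summary):
    """Interquartile range: Q3 minus Q1."""
    return summary["q3"] - summary["q1"]


# Redlip minus Spectacled for each statistic, rounded to the table's precision.
differences = {name: round(redlip[name] - spectacled[name], 3)
               for name in redlip}

print(differences)
print("IQR Redlip:", round(iqr(redlip), 3))
print("IQR Spectacled:", round(iqr(spectacled), 3))
```

A positive difference means the Redlip statistic is larger; the median is the one statistic for which the sign flips.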

Inference

The purpose of the study was to compare two populations, so the appropriate inference is two-sample hypothesis tests. Ling did have an a priori expectation that Spectacled parrotfish, with their excavator jaws, would have larger bites than the Redlip parrotfish, with scraper jaws. I feel, though, that it would be desirable to detect a difference regardless of which direction it was in. The tests therefore will be two-sided. They will be supplemented by estimates of the difference, with confidence intervals.

Scope of inference

The fish used in this study were selected haphazardly by the diver. Because they spanned a wide range of sizes, this selection probably was effectively independent of their (size-adjusted) bite volumes. In addition, there were many of them over the full range of sizes present, and of both phases. I therefore expect that it is safe to generalize the results of this analysis to the two species populations in Hanauma Bay, which is what Ling intends to do. Fish sizes, abundances, and behavior, coral abundance and composition, and many other relevant factors are likely to be different in Hanauma Bay, which has been protected from fishing for many years, than elsewhere, so extending the conclusions beyond the Bay would be unwise.

t procedures

The test is of $H_0: \mu_R - \mu_S = 0$ vs. $H_a: \mu_R - \mu_S \neq 0$. The results, from Minitab, are:

Difference = mu (Red) - mu (Spec)
Estimate for difference: 0.191455
95% CI for difference: (-0.159614, 0.542524)
T-Test of difference = 0 (vs not =): T-Value = 1.08 P-Value = 0.283 DF = 138

This test does not give evidence to reject the null hypothesis of no difference. Although the Redlip sample mean is 0.2 units larger than the Spectacled sample mean, the 95% confidence interval for the difference between the population means extends from about -0.16 units to about 0.54 units: the margin of error is nearly twice the difference between the sample means.
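As a rough cross-check, the t statistic and degrees of freedom can be recomputed from the summary statistics alone. The sketch below assumes the unpooled (Welch) form of the two-sample t test, which is what a DF of 138 rather than 140 suggests, and it uses the means and standard deviations as printed in the statistical summary. The function name `welch_from_summary` is hypothetical, and the P-value uses a normal approximation because the Python standard library has no t distribution, so it will come out slightly below Minitab's value.

```python
from math import sqrt
from statistics import NormalDist


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic, Welch degrees of freedom, and SE."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    se = sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, se


# Redlip first, Spectacled second, values as printed in the summary table.
t_stat, df, se = welch_from_summary(3.601, 0.997, 71, 3.409, 1.11, 71)

# Two-sided P-value from the standard normal (an approximation to the t result).
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.2f}, df = {df:.1f}, SE = {se:.4f}, approx P = {p_approx:.3f}")
```

Working from rounded summary statistics reproduces the reported T-Value and DF to the precision shown in the Minitab output; small discrepancies in later digits are expected.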
A retrospective power analysis can help interpret this non-significant result. Assuming parameters matching those observed in this data set ($n_R = n_S = 71$, $\sigma_R = \sigma_S = 1.1$, $\mu_R - \mu_S = 0.2$, and $\alpha = 0.05$), the power of the test is only 0.189. To get the power up to 0.5, with the same within-species variability, would require the true means to differ by 0.364 units, nearly twice the observed difference. Alternatively, with the same within-species variability and true means differing by 0.2 units, it would take 234 observations per species to achieve power of 0.5.

What these analyses indicate is that despite the large sample size (142 bites marked and measured) the test does not have much power for detecting small differences between the species. This is a result of the large within-species variability, relative to the between-species difference.
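The power figures above can be approximated with a short calculation. The sketch below is not the method used for the handout: it substitutes a normal approximation for the noncentral t distribution, so its results land close to, but not exactly on, the quoted values. The function names `approx_power` and `n_for_power` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def approx_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation)."""
    se = sigma * sqrt(2.0 / n_per_group)        # SE of the difference in means
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = delta / se                          # true difference in SE units
    # Probability of landing in either rejection region.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def n_for_power(delta, sigma, target=0.5, alpha=0.05):
    """Smallest per-group sample size whose approximate power reaches target."""
    n = 2
    while approx_power(delta, sigma, n, alpha) < target:
        n += 1
    return n


power_observed = approx_power(delta=0.2, sigma=1.1, n_per_group=71)
power_larger_gap = approx_power(delta=0.364, sigma=1.1, n_per_group=71)
n_needed = n_for_power(delta=0.2, sigma=1.1, target=0.5)

print(f"power at difference 0.2:   {power_observed:.3f}")
print(f"power at difference 0.364: {power_larger_gap:.3f}")
print(f"n per species for power 0.5 at difference 0.2: {n_needed}")
```

The approximation shows the same pattern as the text: power is low at the observed difference, and reaching even power 0.5 takes either a true difference near 0.36 or a per-species sample size in the low 230s.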

Nonparametric procedures

These procedures test $H_0: M_R - M_S = 0$ vs. $H_a: M_R - M_S \neq 0$ and estimate $M_R - M_S$ (where $M$ is the population median, which Minitab calls ETA).

Rank-sum:

Point estimate for ETA1-ETA2 is 0.109
95.0 Percent CI for ETA1-ETA2 is (-0.1977,0.4924)
W = 5289.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.3871

Mood's median test:

Chi-Square = 0.03 DF = 1 P = 0.867

Species  N<=  N>   Median  Q3-Q1
Red      36   35   3.722   1.314
Spec     35   36   3.734   1.49

Overall median = 3.728
A 95.0% CI for median(red) - median(spec): (-0.392,0.340)

These results are similar to those from the t procedures above. The P-values, especially for the median test (as expected due to its generally lower power), are even larger than for the t test. The confidence intervals for the difference in population medians are about as wide as the t CI for the difference in means, though they are shifted somewhat down.

Resampling procedures

The tests are of $H_0: \mu_R - \mu_S = 0$ vs. $H_a: \mu_R - \mu_S \neq 0$.

Minitab, bootstrap, unpooled:    P = 0.270
         randomization (pooled): P = 0.326
S-Plus,  randomization (pooled): P = 0.288

These tests all give results similar to those from the t test: P-values around 0.28. The 95% confidence intervals for the difference in the population means, from S-Plus, are:

Percentiles:            -0.141279  0.301874
BCa:                    -0.141962  0.296647
Tilting:                -0.182126  0.487306
T using Bootstrap SE:   -0.167813  0.39690

These confidence intervals are slightly narrower than the standard t CI, but they really are quite like each other and the t CI. The resampling distributions produced by S-Plus (next page) are quite close to normal, for both the bootstrap estimation of the CIs (top pair of plots) and the randomization test (bottom pair of plots). The bootstrap distribution shows little bias (its mean is very close to the mean of the actual sample), while the randomization distribution shows that the observed value is somewhat off the center of the distribution, but with a large fraction of randomizations producing statistics more extreme than the observed one.
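For readers who want to see what the resampling procedures do, here is a minimal sketch of a pooled randomization (permutation) test and an unpooled bootstrap percentile interval for a difference in means. It is not the Minitab or S-Plus implementation: the function names, the number of resamples, the seeds, and the two short samples are all hypothetical, and only the simple percentile interval is shown (not BCa or tilting).

```python
import random


def mean_of(values):
    return sum(values) / len(values)


def permutation_p_value(x, y, n_resamples=2000, seed=0):
    """Two-sided randomization test for a difference in means (pooled shuffle)."""
    rng = random.Random(seed)
    observed = mean_of(x) - mean_of(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = mean_of(pooled[:n_x]) - mean_of(pooled[n_x:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Count the observed arrangement itself so the P-value is never zero.
    return (extreme + 1) / (n_resamples + 1)


def bootstrap_percentile_ci(x, y, level=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap CI for mean(x) - mean(y), resampling each group."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        boot_x = rng.choices(x, k=len(x))  # sample with replacement
        boot_y = rng.choices(y, k=len(y))
        diffs.append(mean_of(boot_x) - mean_of(boot_y))
    diffs.sort()
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return diffs[lo_idx], diffs[hi_idx]


# Hypothetical toy samples (NOT the parrotfish data).
sample_a = [3.9, 4.2, 3.1, 2.3, 4.9, 2.0, 4.0, 3.3, 3.4, 4.1]
sample_b = [3.7, 2.7, 4.1, 2.1, 1.2, 3.3, 3.8, 4.0, 4.3, 3.5]

print("randomization P:", permutation_p_value(sample_a, sample_b))
print("bootstrap 95% CI:", bootstrap_percentile_ci(sample_a, sample_b))
```

The randomization test shuffles the pooled observations, which matches the null hypothesis of no difference, while the bootstrap resamples within each group, which is why the two are labeled pooled and unpooled in the output above.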

[Figure: S-Plus resampling distributions of the difference in means, Red - Spec (parrotbites2$adjl...). Top pair: bootstrap density with the observed mean marked, and its normal quantile plot. Bottom pair: permutation density with the observed value marked, and its normal quantile plot.]

Which procedure to use?

The sample distributions clearly are skewed, but not terribly strongly. With sample sizes of 71, the Central Limit Theorem ensures that the t procedures will be valid. The rank-sum procedure and the resampling procedures also should be valid. The median test is valid, but lacking in power. Given the preceding appeal to the CLT, supported by the near-normality of the resampling distributions, I feel the standard t test and CI are appropriate for these data.

Conclusions

Do the two species of parrotfish differ in the volume of material removed per bite? In particular, are Spectacled parrotfish bites larger than Redlip parrotfish bites?

There is little if any difference between the species in mean bite size (adjusted for fish size). This study does not provide evidence of a difference at any reasonable level of statistical significance. Furthermore, the difference in observed means is opposite to that expected: Redlip bites were slightly larger than Spectacled bites. Other factors such as abundance and size distribution probably have much greater effects on the relative contributions of the two species to bioerosion in Hanauma Bay than does bite size.