Page 1 Technical Math II Lab 5: Descriptive Stats

Lab 5: Descriptive Statistics

Purpose: To gain experience in the descriptive statistical analysis of a large (173 scores) data set. You should do most of this lab either in Excel or Winstats.

The placement scores of MATC students enrolled in Elementary Algebra were as follows:

36 39 29 37 50 32 38 36 34 32 32 45 32 45 34 31 48 48 54 38
44 44 38 38 39 42 44 42 44 36 36 29 38 47 44 52 31 36 40 48
32 34 49 31 34 38 31 38 36 32 42 40 32 34 44 38 47 44 36 40
38 34 36 45 38 49 34 23 40 47 32 47 38 38 34 36 32 48 38 48
32 23 42 34 47 42 36 38 29 42 31 29 42 31 29 36 34 36 44 45
36 38 44 42 29 34 44 42 34 34 36 34 38 40 38 29 42 38 51 48
34 44 48 38 38 34 42 47 44 34 27 34 36 38 40 32 29 44 34 42
44 49 47 32 40 42 26 36 34 45 38 44 36 45 36 40 42 42 48 27
36 32 45 44 36 42 36 27 48 40 38 42 38

Use of the free Winstats program available at http://faculty.madisoncollege.edu/alehnen/winptut/install_winplot.html makes this analysis particularly simple, but the data must be pasted in as a single column.
First organize the data into a stem-and-leaf diagram. For each stem use two lines of leaves. On the top line place the leaf digits 0 through 4 and on the bottom line the leaf digits 5 through 9.

Stem   Leaves
2
3
4
5

Next organize the data into a frequency distribution and enter the results into an Excel spreadsheet. This full data set will be referred to as the 'Ungrouped' Data. For this set of scores calculate and record to the nearest thousandth the descriptive statistics requested in the left side of Table 2.

Now, group the data so that the lowest class is 20.5-24.5. Fill in Table 1. Using this grouped data, construct a histogram of the scores, plotting relative frequency along the vertical axis and the classes along the horizontal axis. Use the midpoint of each class to represent all of the scores in that class and repeat the same calculations as for the Ungrouped Data. Fill in the right side of Table 2, labeled as Grouped Data. Finally, make a box plot of both the Ungrouped and Grouped data.

You should use Excel to present the frequency distributions, do the calculations, and graph the histogram. Excel does not have a built-in function to calculate the mean and standard deviation of a frequency distribution (the Excel functions AVERAGE, STDEV and STDEVP assume each score in the argument list occurs only once). However, by setting up a column of f*x and a column of f*x^2, the mean and standard deviation can be calculated from the formulas:

\[
\bar{x} = \frac{\sum_i f_i x_i}{n}; \qquad
s_x = \sqrt{\frac{\sum_i f_i x_i^2 - \dfrac{\left(\sum_i f_i x_i\right)^2}{n}}{n - 1}}; \qquad
n = \sum_i f_i
\]

The Excel sample spreadsheet shown below illustrates such a calculation for the following scores.

x     5    6    7    8    9   10
f     3   13   23   31   19    4
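As a cross-check on the spreadsheet, the frequency-weighted calculation can also be sketched in Python using only the standard library. This is a minimal sketch, not part of the lab procedure: the variable names are illustrative, the scores and frequencies are the sample values listed above, and the population standard deviation uses the standard n divisor in place of n - 1.

```python
import math

# Sample frequency distribution from the handout: score x and its frequency f.
scores = [5, 6, 7, 8, 9, 10]
freqs = [3, 13, 23, 31, 19, 4]

n = sum(freqs)                                            # n = sum of f
sum_fx = sum(f * x for x, f in zip(scores, freqs))        # total of the f*x column
sum_fx2 = sum(f * x * x for x, f in zip(scores, freqs))   # total of the f*x^2 column

mean = sum_fx / n

# Sum of squared deviations from the mean, written in the "computational" form
# used by the formula: sum(f*x^2) - (sum(f*x))^2 / n
sum_sq_dev = sum_fx2 - sum_fx ** 2 / n

sample_sd = math.sqrt(sum_sq_dev / (n - 1))   # divisor n - 1
population_sd = math.sqrt(sum_sq_dev / n)     # divisor n

print(f"n = {n}, mean = {mean:.3f}")
print(f"s = {sample_sd:.3f}, sigma = {population_sd:.3f}")
```

The two column totals correspond directly to the f*x and f*x^2 columns of the spreadsheet, so intermediate values can be compared cell by cell.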
The output of the above formulas is shown below.

To make the histogram bars fill up the class width as shown above, click on one of the rectangles in the histogram, then right-click and select Format Data Series from the right-click menu. In the Format Data Series menu select Series Options and set the Gap Width to 0%.

To generate the ogive graph in Excel, choose a chart type that is a line graph of connected points.
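An ogive plots cumulative (or relative cumulative) frequency, so the points to connect are running totals of the frequency column. The following is a minimal sketch of that running total, assuming the sample scores and frequencies from the spreadsheet example; the variable names are illustrative only.

```python
from itertools import accumulate

# Sample frequency distribution from the handout: score x and its frequency f.
scores = [5, 6, 7, 8, 9, 10]
freqs = [3, 13, 23, 31, 19, 4]

n = sum(freqs)
cumulative = list(accumulate(freqs))                # running total of f
relative_cumulative = [c / n for c in cumulative]   # fraction of scores <= x

# Each (x, relative cumulative f) pair is one point on the ogive.
for x, c, rc in zip(scores, cumulative, relative_cumulative):
    print(f"{x:>3} {c:>4} {rc:.3f}")
```

The last cumulative value should equal n and the last relative cumulative value should equal 1, which is a quick check on the frequency column.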
Excel can even generate the frequency distribution of the classes. This requires that the Data Analysis package be available under the Data menu. If Data Analysis is not shown in the Data menu, click on the Office Button and choose Excel Options, then choose Add-Ins. With Excel Add-ins selected in the Manage box, click Go. From the list of Add-Ins available, check Analysis ToolPak and click OK.

You will need to set up a column of right class boundaries for the grouped data. Excel calls the column of these boundaries a Bin Range. Once the Data Analysis tool is chosen from the Data menu, select Histogram and click OK. From the Histogram menu select the Input Range as the cells in the column of the ungrouped data, the Bin Range as the column of right class boundaries, and then pick a cell where you want the resulting frequency distribution to begin as the Output Range. Click OK to generate the frequency distribution.

Table 1: Algebra Placement Scores Grouped Data

Class        f    Class Mid Point    Relative f    Cumulative f    Rel. Cum. f
20.5-24.5         22.5
24.5-28.5
28.5-32.5
32.5-36.5
36.5-40.5
40.5-44.5
44.5-48.5
48.5-52.5
52.5-56.5
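The binning step can be sketched in Python to show how right class boundaries assign each score to a class. This is a minimal sketch under two assumptions: a score is counted in the first class whose right boundary is greater than or equal to it (which is how Excel's Histogram tool is understood to treat its bins), and only the first ten scores of the data set are used so that Table 1 is left for the reader to complete.

```python
from bisect import bisect_left

# Right class boundaries (the "Bin Range"); the lowest class is 20.5-24.5
# and every class is 4 units wide.
right_boundaries = [24.5, 28.5, 32.5, 36.5, 40.5, 44.5, 48.5, 52.5, 56.5]

# Illustrative subset: only the first ten scores of the data set.
scores = [36, 39, 29, 37, 50, 32, 38, 36, 34, 32]

counts = [0] * len(right_boundaries)
for score in scores:
    # bisect_left returns the index of the first boundary >= score,
    # which is the class that contains the score.
    # Assumes no score exceeds the last boundary (56.5).
    counts[bisect_left(right_boundaries, score)] += 1

for right, f in zip(right_boundaries, counts):
    print(f"{right - 4:.1f}-{right:.1f}: {f}")
```

Because the class boundaries end in .5 and the scores are whole numbers, no score can fall exactly on a boundary, so the choice between "less than" and "less than or equal to" does not affect the counts here.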
Table 2: Algebra Placement Scores

Descriptive Statistic                   Ungrouped Data    Grouped Data
Minimum
Maximum
Range
Mode
Median, M_d
Mean, x̄
Q_1
Q_3
IQR
60th Percentile, P_60
Sample Standard Deviation, s_x
Population Standard Deviation, σ_x

Box Plot of Ungrouped Data

Box Plot of Grouped Data
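The order statistics in Table 2 can be checked with Python's standard `statistics` module. This is a sketch only: the lab does not state which quartile convention to use, so the "inclusive" interpolation method below is an assumption (it is believed to agree with Excel's QUARTILE and PERCENTILE functions, and Winstats may use a different rule). Only the first ten scores are used for illustration.

```python
import statistics

# Illustrative subset: only the first ten scores of the data set.
scores = [36, 39, 29, 37, 50, 32, 38, 36, 34, 32]

minimum, maximum = min(scores), max(scores)
data_range = maximum - minimum
modes = statistics.multimode(scores)   # list, since a data set can have several modes
median = statistics.median(scores)

# Assumption: "inclusive" interpolation between neighboring sorted scores.
q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

# quantiles(n=100) returns the 1st through 99th percentiles; index 59 is the 60th.
p60 = statistics.quantiles(scores, n=100, method="inclusive")[59]

print(minimum, maximum, data_range, modes, median)
print(q1, q3, iqr, p60)
```

If the quartiles from Excel or Winstats differ slightly from these, the difference most likely comes from the interpolation rule rather than from an error in the data entry.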
Discussion Questions:

How closely do the descriptive statistics of the grouped and ungrouped scores compare? How well does grouping the scores into classes represent the actual data?

For the ungrouped data, what fraction of the scores is within one standard deviation of the mean?

For the ungrouped data, what fraction of the scores is within two standard deviations of the mean?

For the ungrouped data, what fraction of the scores is within three standard deviations of the mean?
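The last three questions all ask for the same count with a different multiple of the standard deviation. A minimal sketch of that count follows, assuming the sample standard deviation is the one intended (the questions do not say which) and using only the first ten scores for illustration; `fraction_within` is a hypothetical helper name.

```python
import statistics

# Illustrative subset: only the first ten scores of the data set.
scores = [36, 39, 29, 37, 50, 32, 38, 36, 34, 32]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)   # sample standard deviation (n - 1 divisor), an assumption

def fraction_within(k):
    """Return the fraction of scores x with |x - mean| <= k * sd."""
    inside = sum(1 for x in scores if abs(x - mean) <= k * sd)
    return inside / len(scores)

fractions = [fraction_within(k) for k in (1, 2, 3)]
for k, frac in zip((1, 2, 3), fractions):
    print(f"within {k} standard deviation(s): {frac:.3f}")
```

Replacing `statistics.stdev` with `statistics.pstdev` gives the same count based on the population standard deviation.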