SMAM 319

Some Examples of Graphical Display of Data

1. Lands' End employs numerous people to take phone orders. The computers on which orders are entered also automatically collect data on phone activity. One variable useful in predicting staffing levels is the number of calls per shift handled by each employee. Calls per shift for 25 workers are given in the Minitab output below.

Worksheet size: 100000 cells

MTB > set c1
DATA> 118 69 118 106 57 91 96 92 93 82 127 94 72 109 102 108 96 105 73 68 100 73 104 50 80
DATA> end
MTB > stem and leaf c1

Stem-and-leaf of C1   N = 25
Leaf Unit = 1.0

    2    5 07
    4    6 89
    7    7 233
    9    8 02
   (6)   9 123466
   10   10 0245689
    3   11 88
    1   12 7

MTB > histogram c1

[Histogram of C1]

MTB > %NormPlot c1;
SUBC> Title 'Mail Order Firm'.
Executing from file: NormPlot.MAC
Macro is running ... please wait

[Normal probability plot of C1, titled 'Mail Order Firm']

Average: 91.32
Std Dev: 19.6654
N of data: 25

Anderson-Darling Normality Test
A-Squared: 0.314
p-value: 0.523

MTB > dotplot c1
Character Dotplot

[Character dotplot of C1; axis ticks at 45, 60, 75, 90, 105, 120]

Descriptive Statistics

Variable      N     Mean   Median   TrMean    StDev   SEMean
C1           25    91.32    94.00    91.57    19.67     3.93

Variable    Min      Max       Q1       Q3
C1        50.00   127.00    73.00   105.50

MTB > boxplot c1

[Boxplot of C1]

Another Example

Physical education researchers interested in the development of the overarm throw measured the horizontal velocity of a thrown ball at the time of release. The results for first grade children (in feet/sec) are given below.
MTB > name c2='males'
MTB > name c3='females'
MTB > print c2 c3

Data Display

 Row   males   females
   1    54.2     30.3
   2    39.6     23.3
   3    52.3     43.0
   4    48.4     23.3
   5    35.9     25.7
   6    30.4     37.8
   7    25.2     26.7
   8    45.4     39.5
   9    48.9     27.3
  10    48.9     33.5
  11    45.8     31.9
  12    44.0     30.4
  13    52.5     53.7
  14    48.3     28.5
  15    59.9     32.9
  16    51.7     19.4
  17    38.6     23.7
  18    39.1
  19    49.9
  20    38.3

MTB > stem and leaf c2

Stem-and-leaf of males   N = 20
Leaf Unit = 1.0

    1    2 5
    2    3 0
    7    3 58899
    8    4 4
   (7)   4 5588889
    5    5 1224
    1    5 9

MTB > stem and leaf c3

Stem-and-leaf of females   N = 17
Leaf Unit = 1.0

    1    1 9
    4    2 333
    8    2 5678
   (5)   3 00123
    4    3 79
    2    4 3
    1    4
    1    5 3
MTB > histogram c2 c3

[Histograms of males and females]

MTB > boxplot c2 c3

[Boxplots of males and females]

MTB > dotplot c2 c3

Character Dotplot

[Character dotplot of males; axis ticks at 28.0, 35.0, 42.0, 49.0, 56.0, 63.0]
[Character dotplot of females; axis ticks at 21.0, 28.0, 35.0, 42.0, 49.0, 56.0]

MTB > describe c2 c3

Descriptive Statistics

Variable      N     Mean   Median   TrMean    StDev   SEMean
males        20    44.87    47.05    45.12     8.51     1.90
females      17    31.23    30.30    30.52     8.52     2.07

Variable    Min      Max       Q1       Q3
males     25.20    59.90    38.72    51.25
females   19.40    53.70    24.70    35.65
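As a cross-check on the Describe output above, the following Python sketch recomputes the summary for both groups. It rests on two assumptions that are not stated in the handout: quartiles are taken at position p(n + 1) in the sorted data with linear interpolation, and the trimmed mean drops round(0.05 n) observations from each end. The function names are illustrative only, and the velocities are transcribed by hand from the data display.

```python
import math
import statistics

# Velocities in feet/sec, transcribed from the data display (males N = 20, females N = 17).
males = [54.2, 39.6, 52.3, 48.4, 35.9, 30.4, 25.2, 45.4, 48.9, 48.9,
         45.8, 44.0, 52.5, 48.3, 59.9, 51.7, 38.6, 39.1, 49.9, 38.3]
females = [30.3, 23.3, 43.0, 23.3, 25.7, 37.8, 26.7, 39.5, 27.3, 33.5,
           31.9, 30.4, 53.7, 28.5, 32.9, 19.4, 23.7]


def quantile_n_plus_1(values, p):
    """Quantile at 1-based position p*(n+1), interpolating between neighbours."""
    xs = sorted(values)
    pos = p * (len(xs) + 1)
    k = math.floor(pos)
    if k < 1:
        return xs[0]
    if k >= len(xs):
        return xs[-1]
    return xs[k - 1] + (pos - k) * (xs[k] - xs[k - 1])


def trimmed_mean(values, proportion=0.05):
    """Mean after dropping round(proportion*n) values from each end (assumed rule)."""
    xs = sorted(values)
    k = int(proportion * len(xs) + 0.5)
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)


def describe(values):
    """Return the same quantities that appear in the Describe table."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return {
        "N": n,
        "Mean": statistics.mean(values),
        "Median": statistics.median(values),
        "TrMean": trimmed_mean(values),
        "StDev": sd,
        "SEMean": sd / math.sqrt(n),
        "Min": min(values),
        "Max": max(values),
        "Q1": quantile_n_plus_1(values, 0.25),
        "Q3": quantile_n_plus_1(values, 0.75),
    }


for name, data in (("males", males), ("females", females)):
    summary = describe(data)
    print(name, {key: round(val, 2) for key, val in summary.items()})
```

Under these assumptions the sketch agrees with the table to the precision shown, which suggests (but does not prove) that the two conventions match what the session used.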
An example of simulated data that is not normally distributed. This data is a simulation from the continuous uniform distribution

f(x) = 1/60,   20 < X < 80

MTB > random 100 c3;
SUBC> uniform 20 80.
MTB > stem and leaf c3

Stem-and-leaf of C3   N = 100
Leaf Unit = 1.0

    5    2 11222
   16    2 55567788889
   25    3 000122344
   32    3 5567899
   44    4 000011222224
  (10)   4 6667889999
   46    5 01344
   41    5 55556668899
   30    6 1233
   26    6 56666789999
   15    7 0012223334
    5    7 67999

MTB > describe c3

Descriptive Statistics

Variable      N     Mean   Median   TrMean    StDev   SEMean
C3          100    49.52    49.07    49.45    16.87     1.69

Variable    Min      Max       Q1       Q3
C3        21.28    79.49    35.1     65.99

MTB > nscores c3 c4
MTB > boxplot c3

[Boxplot of C3]

MTB > plot c4*c3

[Normal Plot: normal scores (C4) plotted against C3]
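The simulation and the normal scores used for the normal plot can be imitated outside Minitab. The Python sketch below is only an illustration: the interval 20 to 80 is taken from the range of the simulated values above, the formula (rank - 3/8) / (n + 1/4) for normal scores is an assumption about what nscores computes, and the seed and function names are arbitrary.

```python
import math
import random
from statistics import NormalDist

LOW, HIGH = 20.0, 80.0  # assumed interval; the uniform density is 1 / (HIGH - LOW)


def normal_scores(values):
    """Normal score of each value: inverse normal CDF of (rank - 3/8) / (n + 1/4).

    Tied values simply receive consecutive ranks in this sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))
    return scores


def pearson_r(xs, ys):
    """Plain Pearson correlation, used here as a rough straightness measure."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(319)  # arbitrary seed so the run is repeatable
sample = [rng.uniform(LOW, HIGH) for _ in range(100)]
scores = normal_scores(sample)

print("theoretical mean:", (LOW + HIGH) / 2)
print("theoretical sd:  ", round((HIGH - LOW) / math.sqrt(12), 2))
print("plot correlation:", round(pearson_r(sample, scores), 3))
```

Plotting `scores` against `sample` gives the same kind of picture as `plot c4*c3`: the points bend away from a straight line at both ends because the uniform distribution has shorter tails than the normal.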
Note that although the distribution is symmetric, there is considerable departure from normality.

The following scores represent the final examination grades for an elementary statistics course.

23 60 79 32 57 74 52 70 82 36
80 77 81 95 41 65 92 85 55 76
52 10 64 75 78 25 80 98 81 67
41 71 83 54 64 72 88 62 74 43
60 78 89 76 84 48 84 90 15 79
34 67 17 82 69 74 63 80 85 61

Using Minitab, make
A. a stem-and-leaf display;
B. a boxplot;
C. a frequency histogram;
D. a dotplot;
E. a five-number summary using the Describe command.

Answer the following questions.
A. Do the data appear to be normally distributed? Are they skewed in any particular direction? Are there any outliers? Are they curve breakers or people who probably have not been studying? Are the mean and the median much different? If so, what might that mean?
B. Assign letter grades A-F on the curve based on: (1) the places where there are breaks in the distribution; (2) ranking the grades and giving 10% A, % B, % C, % D and 10% F; (3) making intervals using the estimates of the mean and the standard deviation to compute the students' Z scores, using the percentiles of the normal distribution.
C. How do you account for whatever differences in the grade distribution exist among the three methods above? Which method of grading on the curve, if any, do you think is most sensible?

MTB > stem and leaf c5

Stem-and-leaf of C5   N = 60
Leaf Unit = 1.0

    1    1 0
    3    1 57
    4    2 3
    5    2 5
    7    3 24
    8    3 6
   11    4 113
   12    4 8
   15    5 224
   17    5 57
   24    6 0012344
   28    6 5779
   (6)   7 012444
   26    7 56678899
   18    8 0001122344
    8    8 5589
    4    9 02
    2    9 58
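The stem-and-leaf display above can be cross-checked with a few lines of Python. This is a sketch rather than a reimplementation of the Minitab command: it assumes a leaf unit of 1 and two lines per stem (an increment of 5), omits the depth column, and uses the 60 grades transcribed by hand from the list above. The function name is illustrative.

```python
from collections import defaultdict

# Final examination grades, transcribed from the list above (60 scores assumed).
grades = [23, 60, 79, 32, 57, 74, 52, 70, 82, 36,
          80, 77, 81, 95, 41, 65, 92, 85, 55, 76,
          52, 10, 64, 75, 78, 25, 80, 98, 81, 67,
          41, 71, 83, 54, 64, 72, 88, 62, 74, 43,
          60, 78, 89, 76, 84, 48, 84, 90, 15, 79,
          34, 67, 17, 82, 69, 74, 63, 80, 85, 61]


def stem_and_leaf(values, increment=5):
    """Return (stem, leaves) rows for integer data with leaf unit 1.

    Each row covers `increment` units, so increment=5 gives two rows per stem.
    Rows with no observations are kept so that gaps stay visible.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // increment].append(v % 10)
    first, last = min(rows), max(rows)
    return [((k * increment) // 10, "".join(str(d) for d in rows.get(k, [])))
            for k in range(first, last + 1)]


for stem, leaves in stem_and_leaf(grades):
    print(f"{stem:2d} {leaves}")
```

Keeping empty rows is a deliberate choice: a gap in the display is one of the "breaks in the distribution" that question B(1) asks about.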
MTB > boxplot c5

[Boxplot of C5]

MTB > histogram c5

[Histogram of C5]

MTB > dotplot c5

Character Dotplot

[Character dotplot of C5; axis ticks at 16, 32, 48, 64, 80, 96]

MTB > describe c5

Descriptive Statistics

Variable      N     Mean   Median   TrMean    StDev   SEMean
C5           60    65.48    71.50    66.70    21.13     2.73

Variable    Min      Max       Q1       Q3
C5        10.00    98.00    54.25    80.75

MTB > %NormPlot c5;
SUBC> Title 'Normal Plot for Grade Distribution'.
Executing from file: NormPlot.MAC
Macro is running ... please wait

MTB > let c6=65.48
MTB > let c7=21.13
MTB > let c8=(c5-c6)/c7
MTB > sort c8 c9
MTB > print c9

Data Display

C9
 -2.62565  -2.38902  -2.29437  -2.01041  -1.91576  -1.58448
 -1.48983  -1.39517  -1.15854  -1.15854  -1.06389  -0.82726
 -0.63796  -0.63796  -0.54330  -0.49598  -0.40133  -0.25935
 -0.25935  -0.21202  -0.16469  -0.11737  -0.07004  -0.07004
 -0.02272   0.07194   0.07194   0.16659   0.21391   0.26124
  0.30857   0.40322   0.40322   0.40322   0.45054   0.49787
  0.49787   0.54520   0.59252   0.59252   0.63985   0.63985
  0.68717   0.68717   0.68717   0.73450   0.73450   0.78183
  0.78183   0.82915   0.87648   0.87648   0.92380   0.92380
  1.06578   1.11311   1.16044   1.25509   1.39707   1.53904

MTB > sort c5 c10
MTB > print c10

Data Display

C10
  10  15  17  23  25  32  34  36  41  41
  43  48  52  52  54  55  57  60  60  61
  62  63  64  64  65  67  67  69  70  71
  72  74  74  74  75  76  76  77  78  78
  79  79  80  80  80  81  81  82  82  83
  84  84  85  85  88  89  90  92  95  98
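The Z scores and sorted grades above are the raw material for grading methods (2) and (3). The Python sketch below shows one way the two methods could be carried out. The shares of 10% F, 20% D, 40% C, 20% B and 10% A are a hypothetical choice made only for illustration, the mean and standard deviation are the rounded values entered in the session (65.48 and 21.13), and the function names are not from the handout.

```python
from collections import Counter
from statistics import NormalDist

grades = [23, 60, 79, 32, 57, 74, 52, 70, 82, 36,
          80, 77, 81, 95, 41, 65, 92, 85, 55, 76,
          52, 10, 64, 75, 78, 25, 80, 98, 81, 67,
          41, 71, 83, 54, 64, 72, 88, 62, 74, 43,
          60, 78, 89, 76, 84, 48, 84, 90, 15, 79,
          34, 67, 17, 82, 69, 74, 63, 80, 85, 61]

MEAN, SD = 65.48, 21.13  # rounded values used in the session (c6 and c7)

# Hypothetical shares, lowest letter first (an assumption, not from the handout).
SHARES = [("F", 0.10), ("D", 0.20), ("C", 0.40), ("B", 0.20), ("A", 0.10)]


def z_scores(values, mean=MEAN, sd=SD):
    """Standardize each score, as in `let c8=(c5-c6)/c7`."""
    return [(v - mean) / sd for v in values]


def grade_by_rank(values, shares=SHARES):
    """Method (2): sort the scores and hand out letters by position."""
    xs = sorted(values)
    result, start, cum = [], 0, 0.0
    for letter, share in shares:
        cum += share
        stop = round(cum * len(xs))
        result += [(x, letter) for x in xs[start:stop]]
        start = stop
    return result


def grade_by_z(values, shares=SHARES, mean=MEAN, sd=SD):
    """Method (3): cut the Z scale at the normal percentiles matching the shares."""
    cuts, cum = [], 0.0
    for letter, share in shares[:-1]:
        cum += share
        cuts.append((NormalDist().inv_cdf(cum), letter))
    top_letter = shares[-1][0]
    result = []
    for v in values:
        z = (v - mean) / sd
        letter = next((name for cut, name in cuts if z < cut), top_letter)
        result.append((v, letter))
    return result


rank_counts = Counter(letter for _, letter in grade_by_rank(grades))
z_counts = Counter(letter for _, letter in grade_by_z(grades))
print("by rank:", dict(rank_counts))
print("by Z:   ", dict(z_counts))
```

With these assumed shares the rank method gives exactly 6, 12, 24, 12 and 6 students per letter, but it separates the two students who scored 60 (one D, one C), so a tie rule would be needed in practice. The Z method gives unequal counts because the grades are skewed toward low scores, which is the kind of difference question C asks about.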
[Normal probability plot of C5, titled 'Normal Plot for Grade Distribution']

Average: 65.4833
Std Dev: 21.1335
N of data: 60

Anderson-Darling Normality Test
A-Squared: 1.5
p-value: