Background Information

Every year, the National Baseball Hall of Fame conducts an election to select new inductees from candidates nationally recognized for their talent or association with the sport of baseball. A final ballot usually consists of 25-40 candidates. Currently, 292 individuals are in the Hall of Fame.

Problem Statement

In this project, students will analyze statistics of National Baseball Hall of Fame inductees to determine if they truly had Hall of Fame-class performance.

Project Instructions

IMPORTANT: Complete the steps below in the order they are given. Completing the steps out of order may complicate the project or produce an incorrect result.

1. Download the following file onto your computer:
   a. inductees.xml
      Information on baseball statistics for selected National Baseball Hall of Fame inductees.

      Column Name     Type     Description
      Last Name       Text     Last name of player.
      First Name      Text     First name of player.
      Position        Text     Position this player generally played.
      Year Inducted   Text     Year the player was inducted into the National Baseball Hall of Fame.
      At Bats         Number   Times this player was at bat.
      Games           Number   Games this player participated in.
      Runs            Number   Runs scored by this player.
      Hits            Number   Hits achieved by this player.
      Home Runs       Number   Home runs hit by this player.
      RBI             Number   Runs batted in.
      Walks           Number   Times player was walked (base on balls).
      On Base         Number   Percentage of time the batter reached first base.
      Slugging        Number   Weighted measure which takes into account how many bases a player achieved for each at-bat.

2. Begin by creating a new Microsoft Excel workbook named lastname_firstname_bhofp.xlsx.
3. We must adjust the sheets in our workbook.
   a. Rename Sheet1 to Player List.
   b. Rename Sheet2 to Analysis Questions.

Page 1 of 6 Version 6.3
   c. Delete Sheet3.
4. We must import the inductee statistics into the Player List sheet. Using the DATA ribbon, import the data from inductees.xml and place it starting in cell A3. Excel will have to create a schema based on the XML source data. The data will be imported as an XML table.
5. We wish to apply some additional formatting to the Player List sheet.
   a. We need to add additional columns to store calculated statistics.
      i. Insert one new table column to the left of existing column L.
      ii. Insert two new table columns to the right of existing column N.
   b. For the table, turn on the Total Row option.
   c. Enter text in the cells as indicated below:
      i. A31: Average
      ii. L3: Batting Avg
      iii. O3: Hitter Class
      iv. P3: Hitter Deviation
      v. A1: Baseball Hall of Fame Statistics
   d. Merge-and-center cells A1 through P1.
   e. Apply the Heading 1 formatting style to cell A1.
   f. AutoFit the widths of columns A through P.
   g. Format the cells as indicated below:
      i. D31 through K31: number with no decimal places
      ii. L4 through O31: number with 3 decimal places
      iii. P4 through P31: percent with 1 decimal place
   h. Apply the Green-Yellow-Red color scale conditional formatting option to cells I4 through I30.
   i. Apply conditional formatting to the RBI data in cells J4 through J30.
      i. If the player has more than 1,500 RBIs, change the cell fill color to green and the text color to white.
      ii. If the player has fewer than 900 RBIs, change the cell fill color to red and the text color to white.
   j. Apply conditional formatting to batting averages in cells L4 through L30.
      i. If the player has an average greater than 0.340, change the cell fill color to green and the text color to white.
      ii. If the player has an average less than 0.300, change the cell fill color to red and the text color to white.
   k. Sort the RBI data from largest to smallest.
6. We need to perform some additional calculations to analyze the Player List sheet data.
   a. In column L, calculate the player's batting average by using the formula:
         Batting Avg = [Hits] / [At Bats]
   b. This step left intentionally empty.
   c. This step left intentionally empty.
   d. This step left intentionally empty.
   e. In column O, calculate the hitter class by nesting IF() functions to assign a ranking according to the following rules:
      i. Class of A if the player has more than 3,000 hits.
      ii. Class of B if the player has between 2,001 and 3,000 hits.
      iii. Class of C if the player has between 1,501 and 2,000 hits.
      iv. Class of D if the player has 1,500 or fewer hits.
   f. We would like to summarize our Baseball Hall of Fame data.
      i. In the total row, individually average columns E through N, and P.
   g. In column P, calculate the hitter deviation by using the following formula. You must modify the cell references as necessary to be column-absolute, row-absolute, or fully absolute. There are two relative references that do not need to be modified.
      i. P4: =(H4 - H31)/(H4+H31)
         [Figure callouts on this formula: "Fully Absolute References" and "Relative Reference (Do Not Change)"]
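The step 6 calculations can be cross-checked outside Excel. The following is a minimal Python sketch, not part of the assignment deliverable. The function names and the sample numbers are hypothetical illustrations, not values from inductees.xml. It assumes batting average means hits divided by at-bats, and it mirrors the hit cutoffs from step 6e and the P4 formula from step 6g.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at-bats (assumed definition for column L)."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return hits / at_bats


def hitter_class(hits: int) -> str:
    """Mirror of the nested IF() rules in step 6e, checked from the top class down."""
    if hits > 3000:
        return "A"
    if hits > 2000:
        return "B"
    if hits > 1500:
        return "C"
    return "D"


def hitter_deviation(hits: float, average_hits: float) -> float:
    """(hits - average) / (hits + average), the same shape as the P4 formula."""
    return (hits - average_hits) / (hits + average_hits)


# Hypothetical player: 3,000 hits in 10,000 at-bats.
# Hypothetical total-row average: 2,000 hits.
print(round(batting_average(3000, 10000), 3))
print(hitter_class(3000))
print(f"{hitter_deviation(3000, 2000):.1%}")
```

Testing each cutoff in descending order is what lets every branch use a single comparison, which is the same reason nested IF() functions in Excel are usually written from the highest class to the lowest.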
7. We wish to insert comments to fully define some statistics on the Player List sheet.
   a. M3: On-Base Percentage: A measure of how often a batter reaches base.
   b. N3: Slugging Percentage: A measure of the power of a hitter.
8. To better understand our data, we wish to create a PivotChart with an associated PivotTable.
   a. Create a new PivotChart based on the data in cells A3 to O30 of the Player List sheet. This will automatically create an associated PivotTable. Save the PivotChart to a new sheet named Player Stats PivotChart and the PivotTable to a new sheet named Player Stats PivotTable.
   b. On the PivotTable, do the following:
      i. Add the player position as a report filter field.
      ii. Add the at-bats data as a row labels field.
      iii. Add the hits data as a values field.
   c. We need to perform some formatting on the PivotTable.
      i. Group the at-bats data into sets of 1,000. Start at 4,000 and end at 12,000.
      ii. This step left intentionally empty.
      iii. Summarize the hits value field by finding the maximum.
      iv. This step left intentionally empty.
   d. We must perform some formatting on the PivotChart.
      i. Ensure that the chart is a 2-D clustered column chart.
      ii. Specify appropriate chart and axis titles.
9. We wish to create a chart to plot the number of hits and runs versus the number of at-bats by players in the Hall of Fame.
   a. Create a Scatter Plot with only markers based on the non-contiguous range of cells E3 through E30 and cells G3 through H30 of the Player List sheet. Save the chart to a new sheet named Batter Chart. Ensure that the at-bats are shown as labels for the horizontal (category) axis, not plotted as chart data. Specify appropriate chart and axis titles.
   b. Add a trendline based on the number of hits. Use the trendline type that best fits the data. Display the R-Squared value on the chart. NOTE: You cannot use the Moving Average type for your trendline.
   c. Add a second trendline based on the number of runs. Use the trendline type that best fits the data. Display the R-Squared value on the chart. NOTE: You cannot use the Moving Average type for your trendline.
10. We need to set up the Analysis Questions sheet so that it can store responses to the analysis questions.
   a. Enter text in the cells as indicated below:
      i. A1: Question Number
      ii. B1: Response
   b. Bold the contents of row 1.
   c. AutoFit the width of column A. Set column B to a width of 100.
   d. Set the height for rows 2 through 5 to 110.
   e. Change the vertical alignment setting for columns A and B so that the text is displayed at the top of each row.
   f. Turn on text wrapping for column B.
11. Beginning in cell B2 on the Analysis Questions sheet, type your answers to four of the five questions below. Respond to one question per row and indicate which question you are answering in column A.
   a. Based on the data on the Player List sheet, do you think there are any players that don't deserve to be in the Hall of Fame? If not, then why do you think they are actually in the Hall of Fame?
   b. What position is most represented in these 27 players? Can you say anything about why this position is represented the most?
   c. Is there any correlation between hits and at-bats?
   d. Our dataset only considers 27 players. Although there are currently 292 players in the Hall of Fame, is a dataset of only about 10% of the inductees sufficient to generalize to the entire Hall of Fame?
   e. A full list of players inducted into the National Baseball Hall of Fame is at http://en.wikipedia.org/wiki/list_of_members_of_the_baseball_hall_of_fame. Is the dataset provided with this project a representative sample of the full list of inductees?

Curriculum Information

Project Type
   Microsoft Excel spreadsheet
Relationship to GEC Objective 2
   In this project, students learn how the tools provided by Excel can be used to efficiently analyze and manage real-world statistical data.

Relationship to GEC Objective 4
   Sports are a major part of modern society. In this assignment, students apply objective measures to performance statistics to determine which players truly are great and stand out amongst their peers.
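Supplementary illustration for step 9 and analysis question 11c: the R-Squared value that accompanies a straight-line trendline can be reproduced by hand with an ordinary least-squares fit. The sketch below is a hedged example in plain Python, not part of the assignment deliverable. The function name and the sample at-bats and hits values are hypothetical and are not taken from inductees.xml. It covers the linear trendline type only; other trendline types fit a different curve and would need a different calculation.

```python
def linear_fit_r_squared(xs, ys):
    """Least-squares line y = slope * x + intercept, plus its R-squared."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared x deviations and sum of cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; no line can be fitted")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical sample data (made up for illustration only).
sample_at_bats = [5000, 7000, 9000, 11000]
sample_hits = [1450, 2150, 2650, 3350]

slope, intercept, r_squared = linear_fit_r_squared(sample_at_bats, sample_hits)
print(f"slope={slope:.4f} intercept={intercept:.1f} R^2={r_squared:.4f}")
```

An R-Squared close to 1 means the line accounts for most of the variation in the plotted values, which is one way to support an answer about whether hits and at-bats are correlated.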