Guide to Computing Minitab commands used in labs (mtbcode.out)

Similar documents
1wsSMAM 319 Some Examples of Graphical Display of Data

One-way ANOVA: round, narrow, wide

Navigate to the golf data folder and make it your working directory. Load the data by typing

Unit 4: Inference for numerical variables Lecture 3: ANOVA

Running head: DATA ANALYSIS AND INTERPRETATION 1

Lab 11: Introduction to Linear Regression

Driv e accu racy. Green s in regul ation

Analysis of Variance. Copyright 2014 Pearson Education, Inc.

CHAPTER 1 ORGANIZATION OF DATA SETS

Select Boxplot -> Multiple Y's (simple) and select all variable names.

Example 1: One Way ANOVA in MINITAB

Chapter 12 Practice Test

Section I: Multiple Choice Select the best answer for each problem.

y ) s x x )(y i (x i r = 1 n 1 s y Statistics Lecture 7 Exploring Data , y 2 ,y n (x 1 ),,(x n ),(x 2 ,y 1 How two variables vary together

Legendre et al Appendices and Supplements, p. 1

Lesson 14: Modeling Relationships with a Line

Announcements. Lecture 19: Inference for SLR & Transformations. Online quiz 7 - commonly missed questions

On the association of inrun velocity and jumping width in ski. jumping

ESP 178 Applied Research Methods. 2/26/16 Class Exercise: Quantitative Analysis

Building an NFL performance metric

Session 2: Introduction to Multilevel Modeling Using SPSS

Biostatistics & SAS programming

One-factor ANOVA by example

Note that all proportions are between 0 and 1. at risk. How to construct a sentence describing a. proportion:

Sample Final Exam MAT 128/SOC 251, Spring 2018

Setting up group models Part 1 NITP, 2011

AP Statistics Midterm Exam 2 hours

Class 23: Chapter 14 & Nested ANOVA NOTES: NOTES: NOTES:

Standard Operating Procedure

ANOVA - Implementation.

Lab 1. Adiabatic and reversible compression of a gas

Estimating the Probability of Winning an NFL Game Using Random Forests

Title: AJAE Appendix for Measuring Benefits from a Marketing Cooperative in the Copper

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA

A Study of Olympic Winning Times

In addition to reading this assignment, also read Appendices A and B.

Ideal gas law. Introduction

A few things to remember about ANOVA

Addendum to SEDAR16-DW-22

S.S.K. Haputhantri. Abstract

Boyle s Law. Pressure-Volume Relationship in Gases. Figure 1

Experimental Design and Data Analysis Part 2

United States Commercial Vertical Line Vessel Standardized Catch Rates of Red Grouper in the US South Atlantic,

Boston Marathon Data. Instructor: G. William Schwert

STT 315 Section /19/2014

Name May 3, 2007 Math Probability and Statistics

Chapter 13. Factorial ANOVA. Patrick Mair 2015 Psych Factorial ANOVA 0 / 19

GENETICS OF RACING PERFORMANCE IN THE AMERICAN QUARTER HORSE: II. ADJUSTMENT FACTORS AND CONTEMPORARY GROUPS 1'2

Lab 4: Transpiration

BIOL 101L: Principles of Biology Laboratory

MGB 203B Homework # LSD = 1 1

Exp. 5 Ideal gas law. Introduction

Updated and revised standardized catch rate of blue sharks caught by the Taiwanese longline fishery in the Indian Ocean

Guidelines for Applying Multilevel Modeling to the NSCAW Data

Math 230 Exam 1 Name October 2, 2002

APPENDIX A COMPUTATIONALLY GENERATED RANDOM DIGITS 748 APPENDIX C CHI-SQUARE RIGHT-HAND TAIL PROBABILITIES 754

Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement

Climate Change Scenarios for the Agricultural and Hydrological Impact Studies

Correlation and regression using the Lahman database for baseball Michael Lopez, Skidmore College

Confidence Interval Notes Calculating Confidence Intervals

STATISTICS ELEMENTARY MARIO F. TRIOLA. Descriptive Statistics EIGHTH EDITION

Week 7 One-way ANOVA

Unit4: Inferencefornumericaldata 4. ANOVA. Sta Spring Duke University, Department of Statistical Science

Data Set 7: Bioerosion by Parrotfish Background volume of bites The question:

Announcements. % College graduate vs. % Hispanic in LA. % College educated vs. % Hispanic in LA. Problem Set 10 Due Wednesday.

Draft. Hiroki Yokoi, Yasuko Semba, Keisuke Satoh, Tom Nishida

Math 121 Test Questions Spring 2010 Chapters 13 and 14

Novel empirical correlations for estimation of bubble point pressure, saturated viscosity and gas solubility of crude oils

Copy of my report. Why am I giving this talk. Overview. State highway network

Climate Change Scenarios for the Agricultural and Hydrological Impact Studies

Stat 139 Homework 3 Solutions, Spring 2015

Multilevel Models for Other Non-Normal Outcomes in Mplus v. 7.11

Highway & Transportation (I) ECIV 4333 Chapter (4): Traffic Engineering Studies. Spot Speed

PRACTICAL EXPLANATION OF THE EFFECT OF VELOCITY VARIATION IN SHAPED PROJECTILE PAINTBALL MARKERS. Document Authors David Cady & David Williams

Stats 2002: Probabilities for Wins and Losses of Online Gambling

ISDS 4141 Sample Data Mining Work. Tool Used: SAS Enterprise Guide

GLMM standardisation of the commercial abalone CPUE for Zones A-D over the period

Predicting Exchange Rates Out of Sample: Can Economic Fundamentals Beat the Random Walk?

Case Processing Summary. Cases Valid Missing Total N Percent N Percent N Percent % 0 0.0% % % 0 0.0%

Projecting Three-Point Percentages for the NBA Draft

CS 7641 A (Machine Learning) Sethuraman K, Parameswaran Raman, Vijay Ramakrishnan

INSTRUCTION FOR FILLING OUT THE JUDGES SPREADSHEET

Lecture 22: Multiple Regression (Ordinary Least Squares -- OLS)

Bayesian model averaging with change points to assess the impact of vaccination and public health interventions

Announcements. Unit 7: Multiple Linear Regression Lecture 3: Case Study. From last lab. Predicting income

If a fair coin is tossed 10 times, what will we see? 24.61% 20.51% 20.51% 11.72% 11.72% 4.39% 4.39% 0.98% 0.98% 0.098% 0.098%

Predicting the Total Number of Points Scored in NFL Games

Empirical Example II of Chapter 7

(c) The hospital decided to collect the data from the first 50 patients admitted on July 4, 2010.

Name Student Activity

COMPLETING THE RESULTS OF THE 2013 BOSTON MARATHON

Lab 13: Hydrostatic Force Dam It

The pth percentile of a distribution is the value with p percent of the observations less than it.

Standardized catch rates of Atlantic king mackerel (Scomberomorus cavalla) from the North Carolina Commercial fisheries trip ticket.

Lecture 5. Optimisation. Regularisation

Excel Solver Case: Beach Town Lifeguard Scheduling

IDENTIFYING SUBJECTIVE VALUE IN WOMEN S COLLEGE GOLF RECRUITING REGARDLESS OF SOCIO-ECONOMIC CLASS. Victoria Allred

Boyle s Law: Pressure-Volume Relationship in Gases

Statistical Analysis of PGA Tour Skill Rankings USGA Research and Test Center June 1, 2007

Lab 5: Descriptive Statistics

Transcription:

Guide to Computing Minitab commands used in labs (mtbcode.out) A full listing of Minitab commands can be found by invoking the HELP command while running Minitab. A reference card, with listing of available commands, can be purchased in the University Bookstore. Use the HELP command (or reference card) much as you would a dictionary or thesaurus, to find out how a command works or find a command to accomplish the calculation at hand. The commands listed here are for common routines used in this course: Reading Data into Minitab Writing Data and Output Files from Minitab Summarizing data Re-organizing Data for Analysis by the General Linear Model Executing the General Linear Model Calculating Residuals Using Residuals to Check Assumptions Randomization Tests with Minitab These routines are built up out of commands, just as sentences are built out of words. Most of the assignments in this course can be accomplished by modifying one of the following routines, depending on the situation or goals of computation. Many routines have been assigned a name, which is used in other parts of the lab manual. The name of the routine occurs in boldface italics to the right of the routine box.

Reading Data into Minitab. Data are read from a file (in quotes) into columns. For the data files in this course, you will need to state the number of lines of data, because information about the data is listed below the data in the same file. MTB > read 'srbx9_5.dat' c1 c2; SUBC> nobs = 7. MTB > name c1 'age(i)' c2 'age(ii)' Data are also typed directly from the keyboard. MTB > set into c1 DATA> 2.68 2.60 2.43 2.90 2.94 2.70 2.68 2.98 2.85 DATA> end MTB > set c2 DATA> 2.36 2.41 2.39 2.85 2.82 2.73 2.58 2.89 2.78 DATA> end MTB > name c1 'N(B)' c2 'N(13)' Define Data from file Define Data from keyboard Writing Data and Output Files from Minitab Data in columns can be saved by writing to a named system file, as in the box above. MTB > write [to] 'srbx1311.dat' c1 c2 Define Data to file This command prints output directly to the screen. MTB > print c1 c2 Output to screen Most commands send output to the screen automatically. MTB > describe c1 c2 MTB > histogram c1 c2 Results of computations need to be printed to the screen Compute a variance Output to screen Output that appears on the screen can be written to a named system file, this file can then be printed out, or moved to another computer. MTB > let k2 = std(c1) MTB > let k2 = k2*k2 MTB > print k2 MTB > outfile 'srbx9_5.out' MTB > print c1 c2 MTB > describe c1 c2 MTB > nooutfile Outfile open Outfile closed

Summarizing Data The following 4 commands calculate and display descriptive statistics on data in column 1 MTB > describe c1 MTB > histogram c1 MTB > dotplot c1 MTB > mean c1 This next routine calculates a cumulative relative relative frequency CRF(C1 < 2) = 0.09 from the following data: 1 2 3 3 4 4 4 5 5 6 7 MTB > histogram c1; MTB > start 2. MTB > let k1 = 1 MTB > let k2 = 11 MTB > let k3 = k1/k2 MTB > print k3 <--from screen display <--from screen display CRF(C1 < 2) Re-organizing Data for Analysis by the General Linear Model The following commands reorganize data from tabular format (7 columns) to model format (2 columns, response and explanatory variable). MTB > read 'srtab8_1.dat' c1-c7; SUBC> nobs = 5. MTB > stack c1-c7 c8 MTB > name c8 'wlength' MTB > set c9 DATA> (1 2 3 4 5 6 7)5 DATA> end MTB > name c9 'groups' MTB > print c8 c9 Reorganize

Executing the General Linear Model, Calculating Residuals ANOVA designs MTB > anova 'wlength' = 'groups'; SUBC> fits c10; SUBC> residuals c11. MTB > name c10 'fits' c11 'res' MTB > glm 'wlength' = 'groups'; SUBC> fits c10; SUBC> residuals c11. MTB > name c10 'fits' c11 'res' Run GLM ANOVA Run GLM ANOVA Regression designs MTB > read 'ryder.dat' c1 c2; SUBC> nobs = 23. MTB > name c1 'area' c2 'yield' MTB > regress 'yield' 1 predictor 'area'; SUBC> residuals c10. MTB > name c10 'res' MTB > let k1 = 837382 <---from screen display MTB > let k2 = 1.45 <---from screen display MTB > let c11 = k1 + k2*'area' MTB > name c11 'fits' Run GLM Regression MTB > read 'ryder.dat' c1 c2; SUBC> nobs = 23. MTB > name c1 'area' c2 'yield' MTB > glm 'yield' = 'area'; SUBC> covariate 'area'; SUBC> fits c9; SUBC> residuals c10. MTB > name c9 'fits' c10 'res' Run GLM Regression

Using Residuals to Check Assumptions A. linear relation of response to explanatory? (bowls and arches) MTB > plot 'res' vs 'fits' GLM linear? B1. Do errors sum to zero? Note that this assumption is listed for completeness. There is no need to check it if fitted values are calculated via least squares, as in statistical packages such as Minitab. MTB > let k1 = sum('res') MTB > print k1 B2. Are errors independent? MTB > let c20 = lag('res') MTB > plot c20 'res' MTB > runs 'res' MTB > corr c20 'res' <--graphical analysis <--runs test (optional) <--autocorrelation, lag 1(optional) Errors independent? MTB > acf 'res' <--autocorrelation, many lags (optional) B3. Are the errors homogeneous? (no cones facing left or right) MTB > plot 'res' vs 'fits' B4 Are the errors normally distributed? MTB > hist 'res' MTB > nscores 'res' c30 MTB > plot c30 'res' MTB > rootogram 'res' <---graphical analysis <--- 2nd graphical analysis <---3rd graphical analysis, shows confidence limits Errors homogeneous? Errors normal?

Randomization Tests with Minitab Statistic is mean (compared to zero) MTB > let k1 = mean(c1) MTB > nooutfile MTB > set into c2 DATA> -1 1 DATA> end MTB > let k2 = 0 Calculate statistic Set up MTB > store 'ran.ctl' STOR> let k2 = k2 + 1 STOR> sample 12 c2 c3; STOR> replace. STOR> let c4 = c1*c3 STOR> let k3 = mean(c4) - k1 STOR> let c5(k2) = k3 STOR> end Create control file MTB > execute 'ran.ctl' MTB > execute 'ran.ctl' 1000 MTB > outfile MTB > histogram c5; SUBC> start k1. Check file Run file CRF(c5 > k1) p-value is % of distribution greater than statistic in k1

Randomization Tests with Minitab Statistic is difference between two means. MTB > let k10 = mean(c6) - mean(c7) MTB > print k10 MTB > let c25(1) = k10 MTB > name c25 'F(st)' MTB > nooutfile MTB > store into 'srbx9_5.ctl' STOR> stack c1 c2 into c3; STOR> subscripts c4. STOR> sample 14 times from c3 into c5 STOR> unstack c5 c6 c7; STOR> subscripts c4. STOR> let k3 = mean(c6) - mean(c7) STOR> stack c25 k3 into c25 STOR> end Calculate statistic set up Create control file MTB > execute 'srbx9_5.ctl' MTB > print 'F(st)' MTB > execute 'srbx9_5.ctl' 10 times MTB > print 'F(st)' MTB > execute 'srbx9_5.ctl' 500 MTB > outfile MTB > hist 'F(st)' MTB > hist 'F(st)'; SUBC> start k10. Check file Run file CRF(c25 > k10) p-value is cumulative relative frequency (%) of outcomes in c25 that are larger than value in k10 CRF(c25 > k10) = %

Randomization Tests with Minitab Statistic is mean squared error (MSE) MTB > regress 'wloss' on 1 predictor 'humidity'; SUBC> residuals c5. MTB > name c5 'res' MTB > let k10 = std('res')**2 MTB > print k10 Calculate statistic MTB > let c25(1) = k10 MTB > name c25 'F(st)' MTB > nooutfile Set up MTB > store into 'srbx14_1.ctl' STOR> sample 9 times from 'wloss' into c8 STOR> regress c8 1 'humidity'; SUBC> residuals c5. STOR> let k3 = std('res')**2 STOR> stack c25 k3 into c25 STOR> end Create control file MTB > execute 'srbx14_1.ctl' MTB > print 'F(st)' MTB > execute 'srbx14_1.ctl' 10 times MTB > print 'F(st)' MTB > execute 'srbx14_1.ctl' 500 Check file Run file MTB > outfile MTB > hist 'F(stat)' MTB > hist 'F(stat)'; SUBC> start k10. CRF(c25 < k10) p-value is cumulative relative frequency (%) of outcomes in c25 that are smaller than value in k10. CRF(c25 < k10) = %