Session 2: Introduction to Multilevel Modeling Using SPSS

Exercise 1

Description of Data: exerc1

This is a dataset from Kasia Kordas's research. It contains data collected on 457 children clustered in schools.

Child level variables:
Case = Unique child identifier
Sex = Gender of the child; 0 = female, 1 = male
Suppl = Supplementation treatment; 4 different levels
Age = Age of the child in years
Interest = Parental interest in school work; 0 = yes, 1 = no
Ownhome = Does the family own their home; 0 = no, 1 = yes
SES = Socioeconomic status of the family; 0 = low, 1 = average, 2 = high
Crowding = Continuous crowding measure
Weight = Weight of the child
Height = Height of the child
Hb = Hemoglobin level of the child
BMI = Body Mass Index of the child
Pb = Lead level of the child
IQ = IQ of the child

School level variables:
School = Unique school identifier
Crowdings = Average crowding of the families in the school
Ownhomes = Proportion of families in the school that own their home
Distcat = Distance from the foundry; 0 = <=1700, 1 = >1700

Data Exploration

1. The SPSS menus are convenient for preliminary data exploration. To open a data file in SPSS, choose Open in the File menu and look for the file Exerc1. At the bottom of the spreadsheet you will see two tabs: Data View and Variable View. When you click on the Variable View tab, you will see a spreadsheet containing information about all the variables, such as variable labels and value labels. You can add or change this information later if needed.

2. During the first phase of the data exploration you might want to look at one variable at a time. This will allow you to answer the following types of questions:
- How many schools are there?
- How many children are there in each school?
- What is the mean age of these children and what is the range?
- How many males and females does this dataset contain?

Useful menus:
To obtain basic summary statistics for continuous variables, use the menu Analyze>Descriptive Statistics>Descriptives. For example, enter the following variables in the variable box: iq, pb. By clicking on the Options button, you can obtain additional statistics. You can also graph these variables with a histogram. To obtain a histogram, use the menu Graphs>Histogram.
To obtain frequency tables for categorical variables, use the menu Analyze>Descriptive Statistics>Frequencies. For example, enter the variable sex in the frequencies box. Clicking on the Statistics button will allow you to change some of the default output. You can easily obtain the corresponding bar graph with the Charts button.

3. In a second phase you can examine two variables at a time using Analyze>Compare Means>Means, Analyze>Descriptive Statistics>Crosstabs, or Analyze>Correlate>Bivariate. This will allow you to answer the following types of questions:
- What is the mean IQ for each school?
- How many males and females are there in each school?
- Are the IQ level and Pb level correlated?
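The menu steps above can also be written as syntax. The following is a minimal sketch of roughly equivalent commands, assuming the variable names in the file match the data description (iq, pb, sex, school); the options shown are illustrative and not necessarily what the Paste button would produce.

```spss
* Summary statistics for continuous variables.
DESCRIPTIVES VARIABLES=iq pb
  /STATISTICS=MEAN STDDEV MIN MAX.

* Histogram of IQ.
GRAPH
  /HISTOGRAM=iq.

* Frequency table and bar chart for a categorical variable.
FREQUENCIES VARIABLES=sex
  /BARCHART FREQ.

* Number of children in each school.
FREQUENCIES VARIABLES=school.
```

The last command answers the first two questions at once: the number of rows in the table is the number of schools, and each count is the number of children in that school.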

To obtain means across different categories, use Analyze>Compare Means>Means. In the dependent variable box enter, for example, IQ; in the independent variable list you can enter school.
To obtain cross tabulations, use the menu Analyze>Descriptive Statistics>Crosstabs. Enter school in the row box and suppl in the column box.
To obtain a correlation matrix showing the correlations between different variables, use the menu Analyze>Correlate>Bivariate. In the variable box enter the following variables: Pb, IQ, Hb, and weight.
To obtain scatterplots corresponding to the above correlations, use the menu Graphs>Scatter.

4. On the left hand side of the Output window you will see the output Navigator. This pane records the various outputs you have requested. By clicking on any of these entries, you can easily retrieve or delete a previous analysis.

5. To run a simple regression, use the menu Analyze>Regression>Linear. In the independent variable box enter Pb and in the dependent variable box enter IQ.

6. If you wish to run a separate but similar analysis for each school, such as the aforementioned simple regression, split the file before running the analysis. To achieve this, use the menu Data>Split File. In the menu, choose the bullet "Compare groups" and enter school in the "Groups based on:" box. The data file will remain split, and all subsequent analyses and graphs will be replicated for each school, until you go back to the Split File menu and choose the bullet "Analyze all cases, do not create groups".
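For reference, the two-variable menu steps above correspond roughly to the syntax below. This is a sketch under the assumption that the variable names match the data description (iq, pb, hb, weight, school, suppl); the subcommand choices are illustrative.

```spss
* Mean IQ for each school.
MEANS TABLES=iq BY school
  /CELLS=MEAN COUNT STDDEV.

* Cross tabulation of school by supplementation treatment.
CROSSTABS
  /TABLES=school BY suppl
  /CELLS=COUNT.

* Correlation matrix.
CORRELATIONS
  /VARIABLES=pb iq hb weight
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.

* Scatterplot of IQ against Pb.
GRAPH
  /SCATTERPLOT(BIVAR)=pb WITH iq.

* Simple regression of IQ on Pb.
REGRESSION
  /DEPENDENT iq
  /METHOD=ENTER pb.

* Return to analyzing all cases after a split-file analysis.
SPLIT FILE OFF.
```

SPLIT FILE OFF is the syntax counterpart of the "Analyze all cases, do not create groups" bullet; forgetting it is a common reason later output is unexpectedly repeated per school.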

Analysis

1. The menus in SPSS are convenient for exploratory analysis, but sometimes it is more useful to use SPSS in command mode. Not all possible commands are available through the menus, and commands are sometimes easier to use (certainly when you have to run a similar command repeatedly). Each menu has a Paste button along with the OK button. If you hit the Paste button rather than the OK button after you have filled out the menu, it will bring you to the syntax window, where the menu is translated into a command file. You can then highlight the part of the command file you wish to run and hit the arrow in the toolbar. This will generate exactly the same output as if you had hit the OK button. From now on, we will use SPSS in command mode.

2. Let us experiment with the SPSS connection between the menus and the commands by running an overall regression on all the children using IQ as the dependent variable and sex, age, interest, Pb, and distcat as independent variables. This can be done in the menus using Analyze>General Linear Model>Univariate. In the dependent variable box put IQ; all other variables can be brought over to the covariate box. Under the Options button, click on Parameter Estimates. After clicking Paste, you should obtain the following command file in the syntax window. You could also have typed this file directly. In the SPSS syntax editor, highlight the commands and submit them together:

UNIANOVA IQ WITH sex age interest Pb distcat
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = sex age interest Pb distcat.

When we run this analysis, we violate the assumption that the residuals are independent (i.e., uncorrelated), because children are clustered within schools. This would yield erroneous standard errors.

3. We could then run a separate regression for each school. To do this, we first split the data set by school, using either the menu or the following commands:

SORT CASES BY school.
SPLIT FILE LAYERED BY school.

4. Then we run a series of linear regressions, one for each individual school, saving the parameter estimates to a new SPSS file called regout.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT IQ
  /METHOD=ENTER sex age BMI interest Hb Pb
  /OUTFILE=COVB('c:\My Downloads\regout.sav').

You can open the regout.sav file using the menus.

5. After examining the parameter estimates of the previous regressions, close the regout.sav file and open the exerc1.sav data set again (if you closed it).

6. We could estimate the variability of IQ between schools and within schools with the following command file (after removing the split file if it is still on):

MIXED IQ BY school
  /CRITERIA = CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED = | SSTYPE(3)
  /RANDOM school
  /PRINT solutions
  /METHOD = REML.

7. Similar results can be obtained with the following commands, which fit a random intercept model:

MIXED IQ BY school
  /CRITERIA = CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED = | SSTYPE(3)
  /RANDOM intercept | SUBJECT(school)
  /PRINT solutions
  /METHOD = REML.

8. We can then add covariates at both the child and the school level:

MIXED IQ WITH sex age distcat BY school
  /CRITERIA = CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED = sex age distcat | SSTYPE(3)
  /RANDOM school
  /PRINT solutions
  /METHOD = REML.

or

MIXED IQ WITH sex age distcat BY school
  /CRITERIA = CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED = sex age distcat | SSTYPE(3)
  /RANDOM intercept | SUBJECT(school)
  /PRINT solutions
  /METHOD = REML.

9. A better approach yet would be to fit a random coefficient (intercept and slope) model using restricted maximum likelihood with an unstructured covariance matrix. This can also be done using MIXED.

MIXED IQ WITH sex age distcat BY school
  /CRITERIA = CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED = sex age distcat | SSTYPE(3)
  /RANDOM intercept age | SUBJECT(school) covtype(un)
  /PRINT solutions
  /METHOD = REML.
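It may help to see the models behind steps 6 to 9 written out. The notation below (indices i for child and j for school, and the symbols \gamma, u, e, \tau, \sigma^2) is an assumption introduced here for illustration; it does not come from the handout.

```latex
% Steps 6-7: random intercept model with no covariates
IQ_{ij} = \gamma_{00} + u_{0j} + e_{ij},
\qquad u_{0j} \sim N(0, \tau_{00}),
\qquad e_{ij} \sim N(0, \sigma^2)

% Share of the IQ variance that lies between schools
% (intraclass correlation)
\rho = \frac{\tau_{00}}{\tau_{00} + \sigma^2}

% Step 9: random intercept and random age slope,
% unstructured covariance (covtype(un))
IQ_{ij} = \gamma_{00} + \gamma_{10}\,sex_{ij} + \gamma_{20}\,age_{ij}
        + \gamma_{01}\,distcat_{j}
        + u_{0j} + u_{1j}\,age_{ij} + e_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}
\right)
```

In the MIXED output, the estimates of \tau_{00} and \sigma^2 appear in the covariance parameter table as the school (or intercept) variance and the residual variance, so \rho can be computed by hand from those two numbers.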

Exercise 2

This data is based on data collected by Julie Kikkert, a research associate in CALS. This was an experiment with beets. The data set is called Exerc4.

Block   Row Width
1       24 in
2       18 in
3       20 in
4       22 in
5       18 in
6       20 in
7       24 in
8       22 in

Plot = plant population A or B
Subplot = Harvest Date 1, 2, or 3

- Four row widths (18, 20, 22, 24 in) were randomly assigned to 8 blocks.
- Two plant populations (A, B) were randomly assigned to each of the 4 plots within the blocks.
- From each plot we take one measurement at each of the 3 harvest dates (1, 2, 3).
- There are 96 observations.
- The response is fresh weight (kg) of roots.

You can also consider this alternate scenario instead:
- Four trial medications for a lung disease were randomly assigned to be tested at 8 different hospitals.
- Patients are categorized in two groups: patients in group A are under 50 years old and patients in group B are over 50 years old.
- Each patient is repeatedly measured at the three stages of the disease: early, middle, and advanced.
- The response is a scaled measure of lung health.

1. The data set that we will be using is called exerc2. As an initial step, let us analyze only the first harvest date. We will select the first harvest date with the following commands:

USE ALL.
COMPUTE filter_$=(harvest_date =1).
VARIABLE LABEL filter_$ 'HARVEST_DATE =1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

If, in addition, we ignore the blocks in this experiment, we would treat the data as coming from a completely randomized experiment. In such a case we would run the following two-way ANOVA using GLM or Mixed:

UNIANOVA fw BY ROW_WIDTH POPULATION
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /CRITERIA=ALPHA(0.05)
  /DESIGN=ROW_WIDTH POPULATION.

MIXED fw BY ROW_WIDTH POPULATION
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=ROW_WIDTH POPULATION | SSTYPE(3)
  /METHOD=REML.

Because row widths were assigned to whole blocks, ignoring the blocking of this experiment does not yield an appropriate test for row_width.

2. Taking the blocks into account can be done using the following commands:

UNIANOVA fw BY ROW_WIDTH POPULATION BLOCK
  /RANDOM=BLOCK
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /CRITERIA=ALPHA(0.05)
  /DESIGN=ROW_WIDTH POPULATION BLOCK*ROW_WIDTH.

MIXED fw BY ROW_WIDTH POPULATION BLOCK
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=ROW_WIDTH POPULATION | SSTYPE(3)
  /METHOD=REML
  /RANDOM=BLOCK | COVTYPE(VC).

3. Since row_width is a significant variable, we would like to find the least squares means (estimated marginal means) for each row width and the corresponding standard errors. We can do this with the EMMEANS subcommand in both UNIANOVA and MIXED as follows:

UNIANOVA fw BY ROW_WIDTH POPULATION BLOCK
  /RANDOM=BLOCK
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /EMMEANS=TABLES(ROW_WIDTH) COMPARE ADJ(LSD)
  /CRITERIA=ALPHA(.05)
  /DESIGN=ROW_WIDTH POPULATION BLOCK*ROW_WIDTH.

MIXED fw BY ROW_WIDTH POPULATION BLOCK
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=ROW_WIDTH POPULATION | SSTYPE(3)
  /METHOD=REML
  /RANDOM=BLOCK | COVTYPE(VC)
  /EMMEANS=TABLES(ROW_WIDTH) COMPARE ADJ(LSD).

Are the outputs identical?

4. Suppose that we want to treat row_width as a continuous variable. We can do this simply by listing row_width after WITH (as a covariate) rather than after BY (as a factor). But notice the error message you obtain with the following UNIANOVA command, which includes row_width in the random BLOCK*ROW_WIDTH term.

UNIANOVA fw BY POPULATION BLOCK WITH ROW_WIDTH
  /RANDOM=BLOCK
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /CRITERIA=ALPHA(.05)
  /DESIGN=ROW_WIDTH POPULATION BLOCK*ROW_WIDTH.

MIXED fw BY POPULATION BLOCK WITH ROW_WIDTH
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=ROW_WIDTH POPULATION | SSTYPE(3)
  /METHOD=REML
  /RANDOM=BLOCK | COVTYPE(VC).

5. The dataset exerc2 is completely balanced. A similar data set with some missing observations is called exerc2b. Are the outputs from the following two programs still identical?

UNIANOVA fw BY ROW_WIDTH POPULATION BLOCK
  /RANDOM=BLOCK
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /CRITERIA=ALPHA(0.05)
  /DESIGN=ROW_WIDTH POPULATION BLOCK*ROW_WIDTH.

MIXED fw BY ROW_WIDTH POPULATION BLOCK
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=ROW_WIDTH POPULATION | SSTYPE(3)
  /METHOD=REML
  /RANDOM=BLOCK | COVTYPE(VC).

6. To analyze all harvest dates together, we need to take one more level of clustering into account.

MIXED fw BY ROW_WIDTH POPULATION BLOCK PLOT HARVEST_DATE
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=ROW_WIDTH POPULATION HARVEST_DATE | SSTYPE(3)
  /METHOD=REML
  /RANDOM=BLOCK BLOCK*PLOT | COVTYPE(VC).
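One practical point for step 6: the filter set in step 1 restricts every analysis to the first harvest date, so it presumably has to be switched off before all harvest dates can be analyzed together. A minimal sketch of how this could be done, assuming the filter from step 1 is still active:

```spss
* Switch off the harvest-date filter so that all cases are used again.
FILTER OFF.
USE ALL.
EXECUTE.
```

After running these commands, the number of cases used by MIXED should match the full data set rather than the one-harvest-date subset. The filter_$ variable stays in the file but no longer affects the analysis.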