Pertanika J. Sci. & Technol. 25 (2): 575-586 (2017)
SCIENCE & TECHNOLOGY
Journal homepage: http://www.pertanika.upm.edu.my/

Sample Size and Non-Normality Effects on Goodness of Fit Measures in Structural Equation Models

Ainur, A. K. 1*, Sayang, M. D. 1, Jannoo, Z. 2 and Yap, B. W. 1
1 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 UiTM, Shah Alam, Selangor, Malaysia
2 Faculty of Social Studies and Humanities, University of Mauritius, Reduit 80837 Mauritius

Article history: Received: 7 May 2016; Accepted: 14 November 2016
E-mail addresses: memier87@yahoo.com (Ainur, A. K.), sayang@tmsk.uitm.edu.my (Sayang, M. D.), z.jannoo@uom.ac.mu (Jannoo, Z.), beewah@tmsk.uitm.edu.my (Yap, B. W.)
*Corresponding Author
ISSN: 0128-7680 2017 Universiti Putra Malaysia Press.

ABSTRACT
A Structural Equation Model (SEM) is often used to test whether a hypothesised theoretical model agrees with data by examining the model fit. This study investigates the effect of sample size and distribution of data (normal and non-normal) on goodness of fit (GoF) measures in structural equation models. Simulation results confirm that the GoF measures are affected by sample size, whereas they are quite robust when data are not normal. Absolute measures (GFI, AGFI, RMSEA) are more affected by sample size, while incremental fit measures such as TLI and CFI are less affected by sample size and non-normality.

Keywords: Structural equation model, goodness-of-fit, non-normality, simulation

INTRODUCTION
The structural equation model, also known as the simultaneous equation model, is one of the advanced multivariate regression models. Structural equation modeling (SEM) is a statistical technique that combines elements of traditional multivariate models, such as regression analysis, factor analysis, and simultaneous equation modeling. In contrast with the more traditional multivariate linear model, in SEM the response variable in one regression equation can appear as a predictor in another regression equation. The variables in SEM may influence other variables either directly or through other variables as intermediaries (Fox, 2002). The aim of SEM is to test whether the hypothesised theoretical model is consistent with the data in reflecting the theory by examining the model fit (Hair Jr, Black, Babin, & Anderson, 2010). The model fits the data when the model-implied covariance matrix equals or approaches the sample covariance matrix (Lei & Wu, 2007). The most common software packages used for modeling data are LISREL, AMOS, EQS and MPLUS. Maximum Likelihood (ML) is the default parameter estimation method, while other estimation methods include Generalized Least Squares (GLS), Weighted Least Squares (WLS) and Asymptotically Distribution Free (ADF).

SEM is widely used in many fields such as business research (Lei & Wu, 2007; Hooper, Coughlan, & Mullen, 2008), psychology (Mulaik et al., 1989; Curran et al., 1996; Bandalos, 2002; Schermelleh-Engel, Moosbrugger, & Müller, 2003) and marketing (Bearden, Sharma, & Teel, 1982). In SEM, both the measurement model and the structural model need to be tested. The Goodness of Fit (GoF) measures commonly used to confirm whether the model fits the data are Chi-Square (χ²), Root Mean Square Error of Approximation (RMSEA), Goodness-of-Fit Index (GFI), Adjusted Goodness-of-Fit Index (AGFI), Comparative Fit Index (CFI), Normed Fit Index (NFI), and Non-Normed Fit Index (NNFI), also known as the Tucker-Lewis Index (TLI). Some GoF measures are affected by the sample size (Marsh, Balla, & McDonald, 1988; Rocha & Chelladurai, 2012), as well as by the distribution of data (Schermelleh-Engel et al., 2003; Sharma, Mukherjee, Kumar, & Dillon, 2005).

Computer-intensive procedures are used in simulation studies to generate random data that closely represent real-world data. Monte Carlo simulation is one of the techniques applied for generating data by random sampling from a probability distribution of interest in order to fit the model in SEM.

Bearden et al. (1982) performed a simulation to examine the effect of sample size on the Chi-Square statistic, NFI, Normalized Residual Index, and on the construct shared variance as well as reliability estimators. The simulation study was conducted using two-construct and four-construct models with three indicators for each construct. The simulation was conducted for sample sizes of 25, 50, 75, 100, 500, 1,000, 2,500, 5,000 and 10,000. It was found that the overall fit statistic provided by LISREL is accurate for large sample sizes under a simple model, and that the interpretation of Chi-Square can be affected by the sample size and model complexity.
However, this study was limited to two causal models and assumed multivariate normality. Hence, recommendations for future research included assessing the effects of violating the normality assumption and conducting studies on more complex models by varying the number of constructs and indicators.

The number of indicators per factor can affect the fit measures in structural equation modeling. A simulation study (Ding, Velicer, & Harlow, 1995) using the EQS computer program was conducted with five different numbers of indicators (2, 3, 4, 5, and 6), three levels of factor loading size (0.5, 0.7 and 0.9), four levels of sample size (50, 100, 200 and 500), and two estimation methods (ML and GLS). The fit measures considered in the study were the Chi-Square per degree of freedom ratio, NFI, NNFI, Centrality m Index, RNI, and CFI. It was found that GLS produced more improper solutions than ML, while the GoF measures were less affected by the improper solutions, except for NFI; NNFI was found to be unaffected by the estimation method. Improper solutions (zero or negative error variances) occur when fewer than three indicators per factor are combined with a small sample size and low factor loadings.

A study by Sharma et al. (2005) applied simulation analysis to investigate the effect of sample size, number of indicators, factor loading sizes and factor correlation sizes on the mean value and the percentage of models accepted when pre-specified cutoff values were compared for Normed Version (NNCP), Relative Non-Centrality Index (RNI), TLI, RMSEA and GFI. Correct and misspecified models of correlated two-factor, four-factor, six-factor and eight-factor confirmatory factor models with four indicators for each factor were considered. Data were generated using the GGNSM procedure in Fortran with sample sizes of 100, 200, 400 and 800 and 100 replications for each sample size. Three factor loadings (0.3, 0.5 and 0.7) and three correlations among the factors (0.3, 0.5 and 0.7) were employed. This study found that GFI, NNCP, RNI and TLI were sensitive to sample size in combination with the number of indicators, while RMSEA was dependent on sample size but independent of the number of indicators. Based on performance and sensitivity to misspecification, RNI and TLI were found to be better than NNCP, followed by RMSEA and GFI. These findings suggest that GFI should not be used to evaluate model fit. RNI and TLI are recommended for factor loadings of 0.5 and above, while NNCP is recommended to be used in conjunction with RNI and TLI. Additionally, RMSEA is recommended to be used in conjunction with NNCP, TLI and RNI.

A simulation study was conducted by Iacobucci (2010) to test the effect of sample size on certain fit measures such as Chi-Square, Standardized Root Mean Square Residual (SRMR) and CFI. The study consisted of six different sample sizes (30, 50, 100, 200, 500 and 1000) with 2000 replications for each sample size. It was reported that the sample size should be more than 50, since the GoF measures improve when sample size increases. The ideal number of indicators for each construct was three, and Maximum Likelihood estimation performed better because it was relatively robust to violation of the multivariate normality assumption.

A Monte Carlo simulation procedure to compare the performance of Covariance-Based SEM (CB-SEM) and Partial Least Squares SEM (PLS-SEM) under normality and non-normality conditions was carried out by Jannoo, Yap, Auchoybur, & Lazim (2014). Under the normality condition, data were generated from a normal distribution for the latent constructs and residuals.
The model consisted of one endogenous and four exogenous constructs with three indicators for each. Under the non-normality condition, data for the indicators were generated from a Chi-Square distribution with 2 degrees of freedom. The skewness was reported at 1.983 and the kurtosis at 4.375. The sample sizes generated for both data sets were 20, 40, 90, 150 and 200, with 500 replications for both CB-SEM and PLS-SEM, using the R programming language. Their results revealed that when the data are normal and the sample size is small, CB-SEM becomes inaccurate while PLS-SEM estimates are closer to the true parameter values. On the other hand, when the sample is large, CB-SEM is better than PLS-SEM due to its lower variability. Under the non-normality condition, CB-SEM is inaccurate when the sample is small, but has better accuracy than PLS-SEM when the sample size is large.

Previous simulation studies involved different sample sizes and GoF measures. However, these studies could not provide comprehensive and conclusive findings on six common GoF measures. Thus, this study examined the effect of different sample sizes and distributions on GoF measures in the structural equation model. The objectives of this study were:
(a) to identify the effect of different sample sizes on GoF measures
(b) to identify the effect of different distributions on GoF measures
METHODS
Various types of model fit measures reported by Hair Jr et al. (2010) can be applied to examine the model. There are three GoF categories: absolute fit measures, incremental fit measures and parsimony fit measures. The first category, absolute fit measures, determines how well an a priori model fits the sample data and demonstrates which proposed model has the most superior fit. The measures in this category are the Chi-Square test (χ²), RMSEA, Root Mean Square Residual (RMR), SRMR and GFI. The second category is incremental fit measures, also known as comparative fit measures. These GoF measures compare the fitted model with a baseline model. They include NFI, TLI and CFI. The third category is parsimony fit measures, which include AGFI, Parsimony Goodness-of-Fit Index (PGFI) and Parsimonious Normed Fit Index (PNFI). Parsimony fit measures adjust for the number of parameters in the estimated model. However, this study only covered six common GoF measures, which are GFI, AGFI, RMSEA, NFI, TLI and CFI.

Maximum Likelihood (ML) is the most commonly used method of parameter estimation and is set as the default parameter estimation technique in most statistical software. It is obtained by comparing the actual covariance matrices representing the relationship between variables with the estimated covariance matrices of the best fitted model. The minimum fit function is:

\[ F_{ML} = \log|\hat{\Sigma}| - \log|S| + \mathrm{tr}(S\hat{\Sigma}^{-1}) - p \tag{1} \]

where $F_{ML}$ is the minimum value of the fit function, $\hat{\Sigma}$ is the estimated covariance matrix, $S$ is the sample covariance matrix, $\mathrm{tr}$ is the trace of a matrix and $p$ is the number of observed variables.

The RMSEA is based on the difference between $S$ and $\hat{\Sigma}$ to indicate the equivalence between the structural equation model and the empirical data. RMSEA ($\hat{\varepsilon}$) is the square root of the estimated discrepancy due to approximation per degree of freedom (Schermelleh-Engel et al., 2003), as shown in the equation below:

\[ \hat{\varepsilon} = \sqrt{\max\left\{ \frac{F_{ML}}{df} - \frac{1}{N-1},\; 0 \right\}} \tag{2} \]

where $df$ is the number of degrees of freedom and $N$ is the sample size.

The GFI and AGFI are comparison measures in which the model of interest is compared with a baseline model.
The baseline model is the null model in which all measured variables are unrelated to each other. The GFI and AGFI are obtained using the following equations (Schermelleh-Engel et al., 2003):

\[ GFI = 1 - \frac{F_t}{F_n} = 1 - \frac{\chi^2_t}{\chi^2_n} \tag{3} \]

where $\chi^2_t$ is the Chi-Square of the target model, $\chi^2_n$ is the Chi-Square of the null model (baseline model) and $F$ is the corresponding minimum fit function value.

\[ AGFI = 1 - \frac{df_n}{df_t}\,(1 - GFI) = 1 - \frac{\chi^2_t / df_t}{\chi^2_n / df_n} \tag{4} \]

where $df_t$ is the number of degrees of freedom for the target model and $df_n$ is the number of degrees of freedom for the null model.

NFI and NNFI, also known as TLI, are obtained using the formulae below (Schermelleh-Engel et al., 2003):

\[ NFI = \frac{\chi^2_i - \chi^2_t}{\chi^2_i} = 1 - \frac{\chi^2_t}{\chi^2_i} = 1 - \frac{F_t}{F_i} \tag{5} \]

\[ NNFI = \frac{\chi^2_i / df_i - \chi^2_t / df_t}{\chi^2_i / df_i - 1} \tag{6} \]

where $\chi^2_i$ is the Chi-Square of the independence model (baseline model) and $df_i$ is its number of degrees of freedom.

The CFI was introduced to overcome the problem of NFI underestimating fit when the sample is too small (Bentler, 1990). The CFI can be obtained using the following equation (Schermelleh-Engel et al., 2003):

\[ CFI = 1 - \frac{\max\left[(\chi^2_t - df_t),\; 0\right]}{\max\left[(\chi^2_t - df_t),\; (\chi^2_i - df_i),\; 0\right]} \tag{7} \]

where max is the maximum of the values in brackets.

The recommended levels used to represent an acceptable model fit for the GoF measures are presented in Table 1.

Table 1
Acceptable Threshold Levels

Fit Indices    Acceptable Threshold Levels
GFI            >0.90 (Hooper et al., 2008)
AGFI           >0.85 (Schermelleh-Engel et al., 2003); >0.95 (Hooper et al., 2008)
RMSEA          <0.06 (Hu & Bentler, 1999)
NFI            >0.90 (Arbuckle, 1995)
TLI            >0.95 (Hu & Bentler, 1999)
CFI            >0.95 (Hu & Bentler, 1999)
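As a worked illustration of Equations (2) to (7) and the thresholds in Table 1, the following Python sketch computes the six GoF measures from Chi-Square values. It is a minimal sketch, not the procedure used in this study: the formulas are the standard forms given by Schermelleh-Engel et al. (2003), the relation Chi-Square = (N - 1) F is assumed, the function names `fit_indices` and `acceptable` are hypothetical, and the input values are made up for illustration. The Chi-Square used as the GFI baseline is passed separately from the independence-model Chi-Square, which is also an assumption of this sketch.

```python
import math

# Acceptable threshold levels from Table 1: (direction, cutoff).
THRESHOLDS = {
    "GFI": (">", 0.90), "AGFI": (">", 0.85), "RMSEA": ("<", 0.06),
    "NFI": (">", 0.90), "TLI": (">", 0.95), "CFI": (">", 0.95),
}


def fit_indices(chi2_t, df_t, chi2_i, df_i, chi2_n, df_n, n):
    """GoF measures from target (t), independence (i) and null (n) model chi-squares."""
    f_t = chi2_t / (n - 1)  # assumption: chi-square = (N - 1) * F
    rmsea = math.sqrt(max(f_t / df_t - 1.0 / (n - 1), 0.0))  # Equation (2)
    gfi = 1.0 - chi2_t / chi2_n                              # Equation (3)
    agfi = 1.0 - (df_n / df_t) * (1.0 - gfi)                 # Equation (4)
    nfi = 1.0 - chi2_t / chi2_i                              # Equation (5)
    ratio_i = chi2_i / df_i
    tli = (ratio_i - chi2_t / df_t) / (ratio_i - 1.0)        # Equation (6)
    denom = max(chi2_t - df_t, chi2_i - df_i, 0.0)
    # Equation (7); a zero denominator means neither model shows misfit
    cfi = 1.0 if denom == 0.0 else 1.0 - max(chi2_t - df_t, 0.0) / denom
    return {"GFI": gfi, "AGFI": agfi, "RMSEA": rmsea,
            "NFI": nfi, "TLI": tli, "CFI": cfi}


def acceptable(indices):
    """Compare each measure with its Table 1 threshold."""
    result = {}
    for name, (direction, cutoff) in THRESHOLDS.items():
        value = indices[name]
        result[name] = value > cutoff if direction == ">" else value < cutoff
    return result


# Made-up chi-square values for a model with 15 observed variables, N = 200.
example = fit_indices(chi2_t=90.0, df_t=80, chi2_i=1000.0, df_i=105,
                      chi2_n=1250.0, df_n=120, n=200)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
print(acceptable(example))
```

With these illustrative inputs every measure clears its Table 1 threshold; lowering `n` while keeping the Chi-Square values fixed raises only the RMSEA, since it is the only one of the six formulas in which N appears explicitly.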
SIMULATION DESIGN
This study investigated the performance of GoF measures under normal and non-normal conditions in SEM. Data were generated using a specified model given by Goodhue, Lewis, & Thompson (2012). The model included five constructs (one endogenous variable and four exogenous variables) with three indicators for each construct. For the three indicators of each construct, the standardised loadings were fixed at 0.70, 0.80 and 0.90, respectively. Data were generated from the Normal distribution N(0,1) and from two non-normal distributions: the Chi-Square distribution with 2 degrees of freedom, χ²(2), and the Chi-Square distribution with 0.5 degrees of freedom, χ²(0.5). The N(0,1) distribution is symmetric, while χ²(2) and χ²(0.5) are skewed distributions with skewness of 2 and 4, respectively. For each data set, various sample sizes were considered to investigate their effect on GoF measures, namely n = 100, 150, 200, 250, 300, 500, 1000, 1500 and 2000. The simulation involved 500 replications for each sample size and condition, using R, an open source programming language. The minimum sample size of 100 was suggested by Hair Jr et al. (2010) for models containing fewer than six constructs, each with more than three indicators and high item communalities (≥ 0.6). The theoretical model used in this study is shown in Figure 1.

Figure 1. Theoretical model - (Source: Goodhue, 2012)

RESULTS AND DISCUSSIONS
This section reports the results of GoF measures under normal and non-normal distributions. Box-plots were drawn to represent the GoF measures for various sample sizes under normal and non-normal conditions. The performance of GoF measures for various sample sizes under normal and non-normal conditions is shown in Table 2. The GoF measures improved as the sample size increased for all distributions. The TLI and CFI showed better fit compared with GFI and AGFI for all sample sizes. Thus, incremental fit measures were found to be less affected by sample size.

Table 2
Mean of GoF measures

n GFI AGFI RMSEA NFI TLI CFI
100 100 100 0.904 0.90 0.903 0.856 0.853 0.855 0.03 0.08 0.06 0.897 0.896 0.898 0.991 0.986 0.990 0.986 0.986
150 150 150 0.933 0.93 0.933 0.900 0.899 0.900 0.018 0.00 0.019 0.99 0.98 0.930 0.994 0.99 0.99
200 200 200 0.949 0.949 0.949 0.93 0.93 0.94 0.015 0.015 0.015 0.946 0.946 0.947 0.997 0.996 0.997 0.994
250 250 250 0.959 0.959 0.959 0.939 0.939 0.938 0.01 0.013 0.013 0.957 0.957 0.957 0.998 0.998 0.997 0.996 0.996
300 300 300 0.966 0.965 0.966 0.948 0.948 0.948 0.011 0.011 0.011 0.964 0.964 0.964 0.997 0.997 0.997
500 500 500 0.979 0.979 0.979 0.968 0.968 0.969 0.008 0.008 0.008 0.978 0.978 0.978 0.998 0.998 0.998
1000 1000 1000 0.984 0.984 0.984 0.005 0.006 0.006
1500 1500 1500 0.004 0.004 0.004
2000 2000 2000 0.99 0.99 0.99 0.004 0.004 0.004 0.994 0.994 0.994
Note: Normal N(0,1); Chi-Square (2); Chi-Square (0.5)
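The three data-generating conditions described in the simulation design can be sketched in Python as follows. This is only an illustration under stated assumptions, not the R code used in the study: it assumes the draws are standardised to mean 0 and variance 1, that they are used as latent scores, and that each indicator equals loading × score plus independent normal noise. The function names are hypothetical. The theoretical skewness of a Chi-Square variable with k degrees of freedom is sqrt(8/k), which gives the values 2 and 4 quoted for χ²(2) and χ²(0.5).

```python
import math
import random


def chi_square_skewness(df):
    """Theoretical skewness of a chi-square variable: sqrt(8 / df)."""
    return math.sqrt(8.0 / df)


def draw_standardised(dist, n, rng):
    """Draw n values with mean 0 and variance 1.

    dist is "normal" or a positive number giving chi-square degrees of freedom.
    """
    if dist == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    df = float(dist)
    # chi-square(df) is Gamma(shape=df/2, scale=2): mean df, variance 2*df
    sd = math.sqrt(2.0 * df)
    return [(rng.gammavariate(df / 2.0, 2.0) - df) / sd for _ in range(n)]


def make_indicators(latent, loadings, rng):
    """Assumed indicator model: x = loading * score + sqrt(1 - loading^2) * noise."""
    return [[lam * score + math.sqrt(1.0 - lam * lam) * rng.gauss(0.0, 1.0)
             for lam in loadings]
            for score in latent]


def sample_skewness(xs):
    """Moment-based sample skewness m3 / m2^1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


rng = random.Random(2017)  # fixed seed so the sketch is repeatable
for dist in ("normal", 2, 0.5):
    scores = draw_standardised(dist, 2000, rng)
    indicators = make_indicators(scores, (0.70, 0.80, 0.90), rng)
    print(dist, round(sample_skewness(scores), 2), len(indicators))
```

In a full replication, one such data set would be generated for each construct, sample size and replication, and the SEM would then be fitted to each data set to collect the GoF measures.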
The results in Table 2 are represented as box-plots in Figure 2 for comparison purposes. Only the box-plots for GFI, RMSEA, TLI and CFI are presented because the patterns for AGFI and NFI were found to be similar to that of GFI. Results show that GFI and AGFI are affected by sample size. These results contradict the findings of Joreskog & Sorbom (1982) and Bagozzi & Yi (1988), who reported that GFI and AGFI are independent of sample size. However, our results confirm the findings of Fan, Thompson, & Wang (1999), who reported that GFI and AGFI are overly affected by sample size. GFI and AGFI are found to be robust against non-normality, and these results are consistent with the findings of Joreskog & Sorbom (1982) and Bagozzi & Yi (1988). The behaviour of NFI is similar to that of GFI and AGFI.

Figure 2(a). Boxplots for GFI

Figure 2(b) and Figure 2(c) show the results of RMSEA and TLI for various sample sizes under normal and non-normal conditions. The boxplot in Figure 2(b) shows that RMSEA is affected by sample size and is less affected by non-normality when the sample size is large. Figure 2(c) shows that TLI is less affected by sample size, and this result supports the findings of Fan et al. (1999). In addition, TLI is not affected by the distribution of the data. According to Bentler (1990), TLI is difficult to interpret since its value can exceed the range (0-1).

Figure 2(b). Boxplots for RMSEA
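Bentler's (1990) point that the TLI can leave the 0-1 range, while the CFI cannot, follows directly from the definitions in Equations (6) and (7). The short Python sketch below illustrates this with made-up Chi-Square values in which the target model's Chi-Square is smaller than its degrees of freedom; the function names are hypothetical and the formulas are the standard forms from Schermelleh-Engel et al. (2003).

```python
def tli(chi2_t, df_t, chi2_i, df_i):
    """Tucker-Lewis Index (NNFI); not bounded above by 1."""
    ratio_i = chi2_i / df_i
    return (ratio_i - chi2_t / df_t) / (ratio_i - 1.0)


def cfi(chi2_t, df_t, chi2_i, df_i):
    """Comparative Fit Index; the max(...) terms keep it within 0 and 1."""
    numerator = max(chi2_t - df_t, 0.0)
    denominator = max(chi2_t - df_t, chi2_i - df_i, 0.0)
    return 1.0 if denominator == 0.0 else 1.0 - numerator / denominator


# Illustrative values only: target chi-square (70) is below its df (80).
over = tli(70.0, 80, 1000.0, 105)
capped = cfi(70.0, 80, 1000.0, 105)
print(round(over, 4), capped)
```

Whenever the ratio of Chi-Square to degrees of freedom for the target model drops below 1, which can happen by chance for a correctly specified model, the TLI exceeds 1 whereas the CFI is truncated at 1.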
Figure 2(c). Boxplots for TLI

Figure 2(d) displays the performance of CFI for various sample sizes under normal and non-normal conditions. The boxplot shows that CFI is less affected by sample size, and this result supports the findings of Fan et al. (1999). CFI is also less affected by the distribution of the data.

Figure 2(d). Boxplots for CFI

These boxplots show that GoF measures improved with the increase in sample size, as expected. However, the simulation results indicate that GoF measures are not severely affected if distributions are non-normal.

CONCLUSIONS
This simulation study compared the effects of sample size and distribution of data on GoF measures for SEM. We found that the incremental fit measures are less affected by sample size and by non-normality of the data. However, severe non-normality due to the presence of outliers might affect GoF for SEM. Future simulation studies can investigate the GoF measures in the presence of outliers and when data are non-normal with high multivariate skewness and kurtosis.
REFERENCES
Arbuckle, J. L. (1995). Amos 17 User's Guide (pp. 1995-2005). Chicago, IL: SmallWaters Corporation.
Bagozzi, R. P., & Yi, Y. (1988). On the Evaluation of Structural Equation Models. Journal of the Academy of Marketing Science, 16(1), 74-94.
Bandalos, D. L. (2002). The Effects of Item Parceling on Goodness-of-Fit and Parameter Estimate Bias in Structural Equation Modeling. Structural Equation Modeling, 9(1), 78-102.
Bearden, W. O., Sharma, S., & Teel, J. E. (1982). Sample Size Effects on Chi Square and Other Statistics Used in Evaluating Causal Models. Journal of Marketing Research, 19(4), 425-430.
Bentler, P. M. (1990). Comparative Fit Indexes in Structural Models. Psychological Bulletin, 107(2), 238-246.
Curran, P. J., West, S. G., Finch, J. F., Aiken, L., Bentler, P., & Kaplan, D. (1996). The Robustness of Test Statistics to Nonnormality and Specification Error in Confirmatory Factor Analysis. Psychological Methods, 1(1), 16-29.
Ding, L., Velicer, W. F., & Harlow, L. L. (1995). Effects of Estimation Methods, Number of Indicators per Factor, and Improper Solutions on Structural Equation Modeling Fit Indices. Structural Equation Modeling, 2(2), 119-144.
Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation methods, and model specification on structural equation modeling fit indexes. Structural Equation Modeling, 6(1), 56-83. doi:10.1080/10705519909540119
Fox, J. (2002). Structural Equation Models. Appendix to An R and S-PLUS Companion to Applied Regression, 1-20.
Goodhue, D. L., Lewis, W., & Thompson, R. (2012). Does PLS Have Advantages for Small Sample Size or Non-Normal Data? MIS Quarterly, 36(3), 981-A16.
Hair Jr, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate Data Analysis (7th Ed.). Upper Saddle River, NJ: Pearson Prentice Hall.
Hooper, D., Coughlan, J., & Mullen, M. R. (2008). Structural Equation Modelling: Guidelines for Determining Model Fit. The Electronic Journal of Business Research Methods, 6(1), 53-60.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1-55. doi:10.1080/10705519909540118
Iacobucci, D. (2010). Structural equations modeling: Fit Indices, sample size, and advanced topics. Journal of Consumer Psychology, 20(2010), 90-98. doi:10.1016/j.jcps.2009.09.003
Jannoo, Z., Yap, B. W., Auchoybur, N., & Lazim, M. A. (2014). The Effect of Nonnormality on CB-SEM and PLS-SEM Path Estimates. International Journal of Mathematical, Computational, Physical and Quantum Engineering, 8(2), 285-291.
Joreskog, K. G., & Sorbom, D. (1982). Recent developments in structural equation modeling. Journal of Marketing Research, 19(4), 404-416.
Lei, P. W., & Wu, Q. (2007). Introduction to structural equation modeling: Issues and practical considerations. Educational Measurement: Issues and Practice, 26(3), 33-43.
Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-Fit Indexes in Confirmatory Factor Analysis: The Effect of Sample Size. Psychological Bulletin, 103(3), 391-410. doi:10.1037//0033-2909.103.3.391
Mulaik, S. A., James, L. R., Alstine, J. Van, Bennett, N., Lind, S., & Stilwell, C. D. (1989). Evaluation of Goodness-of-Fit Indices for Structural Equation Models. Psychological Bulletin, 105(3), 430-445.
Rocha, C. M., & Chelladurai, P. (2012). Item Parcels in Structural Equation Modeling: an Applied Study in Sport Management. International Journal of Psychology and Behavioral Sciences, 2(1), 46-53. doi:10.5923/j.ijpbs.20120201.07
Schermelleh-Engel, K., Moosbrugger, H., & Müller, H. (2003). Evaluating the Fit of Structural Equation Models: Tests of Significance and Descriptive Goodness-of-Fit Measures. Methods of Psychological Research Online, 8(2), 23-74.
Sharma, S., Mukherjee, S., Kumar, A., & Dillon, W. R. (2005). A simulation study to investigate the use of cutoff values for assessing model fit in covariance structure models. Journal of Business Research, 58(7), 935-943.