Positive-Definite Sparse Precision Matrix Estimation

Advaces i Pure Mathematics, 7, 7, -3 http://www.scirp.org/joural/apm ISSN Olie: 6-384 ISSN Prit: 6-368 Positive-Defiite Sparse Precisio Matrix Estimatio i Xia, Xudog Huag, Guapeg Wag, Tao Wu School of Mathematics ad Computer Sciece, Ahui Normal Uiversity, Wuhu, Chia School of Mathematics ad Statistics, Huazhog Uiversity of Sciece ad Techology, Wuha, Chia How to cite this paper: Xia,., Huag, X.D., Wag, G.P. ad Wu, T. (7) Positive-Defiite Sparse Precisio Matrix Estimatio. Advaces i Pure Mathematics, 7, -3. http://dx.doi.org/.436/apm.7.7 Received: December 9, 6 Accepted: Jauary 9, 7 Published: Jauary 3, 7 Copyright 7 by authors ad Scietific Research Publishig Ic. This wor is licesed uder the Creative Commos Attributio Iteratioal icese (CC BY 4.). http://creativecommos.org/liceses/by/4./ Ope Access Abstract The positive-defiiteess ad sparsity are the most importat property of high-dimesioal precisio matrices. To better achieve those property, this paper uses a sparse lasso pealized D-trace loss uder the positive-defiiteess costrait to estimate high-dimesioal precisio matrices. This paper derives a efficiet accelerated gradiet method to solve the challegig optimizatio problem ad establish its coverges rate as O. The umerical simulatios illustrated our method have competitive advatage tha other methods. Keywords Positive-Defiiteess, Sparsity, D-Trace oss, Accelerated Gradiet Method. Itroductio I the past twety years, the most popular directio of statistics is highdimesioal data. I fuctioal magetic resoace imagig (MRI), bioiformatics, Web miig, climate research, ris maagemet ad social sciece, it ot oly has a wide rage of applicatios, but also the mai directio of scietific research at preset. I theoretical ad practical, high-dimesioal precisio matrix estimatio always plays a very importat role ad has wide applicatios i may fields. Thus, estimatio of high-dimesioal precisio matrix is icreasigly becomig a crucial questio i may field. However, estimatio of highdimesioal precisio matrix has two difficulty: ) sparsity of estimator; (ii) the positive-defiiteess costrait. Huag et al. [] cosidered usig Cholesy decompositio to estimate the precisio matrix. Although the regularized DOI:.436/apm.7.7 Jauary 3, 7

. Xia et al. Cholesy decompositio approach ca achieve a positive-semidefiiteess, it ca ot guaratee sparsity of estimator. Meishause et al. [] use a eigh- bourhood selectio scheme i which oe ca sequetially estimate the support of each row of precisio matrix by fittig lasso pealized least squares regressio model. Peg et al. [3] cosidered a joit eighbourhood estimator by usig the lasso pealizatio. Yua [4] cosidered the Datzig selector to replace the lasso pealized least squares i the eighbourhood selectio scheme. Cai et al. [5] cosidered a costraied miimizatio estimator for estimatig sparse precisio matrices. However, this methods metioed are ot always achieve a positive-defiiteess. To overcome the difficulty (ii), oe possible method is usig the eigedecompositio of ˆΘ ad desigig ˆΘ to satisfy coditio { Θ }. Assume p that ˆΘ has the eige-decompositio ˆ T Θ= λ i ivv = i i ad the a positive semip defiite estimator was gaied by settig ˆ T Θ= max ( λ,) i i vv = i i. However, this strategy destroys the sparsity patter of ˆΘ for sparse precisio matrix estimatio. Yua et al. [6] cosidered the lasso pealized lielihood criterio ad used the maxd et al. gorithm to compute the estimator. riedma et al. [7] cosidered the graphical lasso algorithm for solvig the lasso pealized Gaussia lielihood estimator. Witte et al. [8] optimized the graphical lasso. By the lasso or pealized Gaussia lielihood estimator, thoses methods simultaeously achieve positive-defiiteess ad sparsity. Recetly, Zhag et al. [9] cosider a costraied covex optimizatio framewor for high-dimesioal precisio matrix. They used lasso pealized D-trace loss replace traditioal lasso fuctio, ad eforced the positive-defiite costrait { Θ εi } for some arbitrarily small ε >. I their wor, focusig o solvig problem as follow: ˆ + Θ = arg mi Θ, Σ ˆ tr ( Θ ) + λ Θ (),off Θ ε I It is importat to ote that ε is ot a tuig parameter lie λ. We simply iclude ε i the procedure to esure that the smallest eigevalue of the estimator is at least ε. They developed a efficiet alteratig directio method of multipliers (ADMM) to solve the challegig optimizatio problem () ad establish its covergece properties. To gai a better estimator for high-dimesioal precisio matrix ad achieve the more optimal covergece rate, this paper maily propose a effective algorithm, a accelerated gradiet method ([]), with fast global covergece rates to solve problem (). This method maily basis o the Nesterov's method for acceleratig the gradiet method ([] []), showig that by exploitig the special structure of the trace orm, the classical gradiet method for smooth problems ca be adapted to solve the trace regularized osmooth problems. However, for our problem (), we have ot trace orm, istead of is orm form, but this method have the similar efficietly result for our problem. Numerical results show that this method for our problem () ot oly has

. Xia et al. sigificat computatioal advatages, but also achieves the optimal covergece rate as O. The paper is orgaized as follows: Sectio itroduces our methodology, icludig model establishig i Sectio.; step size estimatio i Sectio.; a accelerate gradiet method algorithm i Sectio.3; the covergece aalysis results of this algorithm i Sectio.4. Sectio 3 itroduced umerical results for our method i comparig with other methods. Ad discussio are made i Sectio 4. All proofs are give i the Appedix.. A Accelerated Gradiet Method.. Model Establishig Accordig to itroductio, our optimizatio problem D-trace oss fuctio as follow: mi ( Θ ) : = Θ, Σˆ tr ( Θ ) + λ Θ (),off Θ ε I where λ is a oegative pealizatio parameter, Σ ˆ is the sample cova- riace matrix. ˆ T Σ = XX i= i i, ad Θ = Θ =, off Θ i j ij is the off diagoal pealty. Defiig f ( Θ ) = Θ, Σˆ tr ( Θ ), ad f ( Θ ) is a co- tiuously differetiable fuctio. Cosiderig the gradiet step Θ =Θ f ( Θ ) (3) t t is a stepsize, f ( ˆ ˆ ) where Θ = ΘΣ + Σ Θ I. The smooth part (3) ca be reformulated equivalet as a proximal regularizatio of the liearized fuctio f ( Θ ) at Θ : Θ = arg miφ ΘΘ, (4) where t Θ ε I t Φt ( Θ, Θ ), = f Θ + Θ Θ f Θ + Θ Θ (5) Based o this equivalece relatioship, solvig the optimizatio problem () by the followig iterative step: (, ) arg mi (, ) Θ = arg miψ ΘΘ Φ ΘΘ + λ Θ t t Θ εi Θ εi = arg mi Θ, Σˆ tr Θ + Θ Θ, Θ Σ ˆ +Σˆ Θ Θ ε I t I + Θ Θ + λ Θ t ( ˆ ˆ ) I λ Θ ε I t = arg mi Θ Θ Θ Σ +Σ Θ + Θ (6) 3

. Xia et al. with equality i the last lie by igorig terms that do ot deped o Θ. Defiig + { ε } the + C as the projectio of a matrix C oto the covex coe C I. Assumig that C has the eige-decompositio T λ jvv j= j j, ad p C ca be obtaied as max (, T λj ε j ) vv = j j. Defiig a etry-wise soft-thresholdig rule for all the off-diagoal elemets of a matrix (, τ) = { ( j, τ) } with j, p ( j, τ) = sig ( j ) max ( j τ, ) I { } + j I j l { j= } S Z s z s z z z z. Thus, the above problem ca be summarized i the followig theorem: Theorem : et B S ( B) is give by ( B) ( ( B, )) S S λ + s B, λ = sig B max B λ,, ad Θ is symmetric covariace matrix, the: = arg mi Θ B + λ Θ Θ ε I =, where S( B, ) { s( B j, )} jl, ( j ) ( j) ( j ) I{ j } + B j I { j = }. λ = λ with The proof of this theorem is easy by applyig the soft-thresholdig method... Step Size Estimatio To guaratee the covergece rate of the resultig iterative sequece, irstly givig the relatioship betwee our proximal fuctio Ψ t ad the objectio fuctio at the certai poit. emma : et Θ = arg miψ ΘΘ, (8) T µ µ Θ ε I where Ψ is defied i Equatio (6). Assumig the followig iequality holds: T Θ Ψ T Θ, Θ (9) the for ay Θ, the: ( µ ) u ( µ ) µ ( µ ) µ µ, µ Θ T Θ T Θ Θ + Θ Θ T Θ Θ () This lemma is proved i the Appedix. At each iterative step of the algorithm, a appropriate step size µ is eeded Θ = T Θ ad to satisfy u (, ) µ p p (7) Θ Ψ Θ Θ () Sice the gradiet of f( ) satisfies ipschitz cotiuous, accordig to Nesterov et al. [] wor, havig follow lemma. emma : Supposig that f ( X ) is a covex fuctio, ad the gradiet of f ( X ) deote f ( X) is ipschitz cotiuous with costat, the: so, we have Hece, whe f ( X) f ( Y) + X Y, f ( Y) + X Y X, Y () ( ), f T Θ f Θ + T Θ Θ f Θ + T Θ Θ (3) µ, the: 4

( µ ( Θ) ) Φµ ( µ ( Θ), Θ ) + λ µ ( Θ ) = Ψµ ( µ ( Θ), Θ). Xia et al. T T T T (4) The above results show that the coditio i Equatio () is always satisfied whe the update rule Θ = T Θ (5).3. A Accelerate Gradiet Method Algorithm I practice, may be uow or it is expesive to compute. To use the followig step size estimatio method, usually, givig a iitial estimate of as ad icreasig this estimate with a multiplicative factor γ > repeatedly util the coditio i Equatio () is satisfied. It is well ow ([] []) that if the objectio fuctio is smooth, the the accelerate gradiet method ca achieve the optimal covergece rate of O. Recetly, there have other similar methods applyig i problems cosisted a smooth part ad a o-smooth part ([] [3] [4] [5]). The givig the accelerate gradiet algorithm to solve the optimizatio problem i Equatio (). Algorithm:A accelerate gradiet method algorithm for high-dimesioal precisio matrix ) Iitialize:, γ, Θ =Θ, α = ) Iterate: 3) set = 4) While ( T ( Θ ) ) ( ( ), >Ψ T Θ Θ ), set : = γ 5) Set = ad update Θ = T Θ + + 4α α + = α Θ =Θ + Θ Θ + α +.4. Covergece Aalysis I our method, two sequeces Θ ad Θ are updated recursively. I particular, Θ is the approximate solutio at the th step ad Θ is called the search poit ([] []), which is costructed as a liear combiatio of the latest two approximate solutios Θ ad Θ. I this sectio, the covergece rate of the method ca be showed as O. This result is sum- marized i the followig theorem. Theorem : et { Θ } ad { Θ } be the covariace matrices sequece geerated by our algorithm. The for ay, havig ( ) ( + ) γ Θ Θ Θ Θ (6) 5

. Xia et al. where arg mi Θ = Θ. Θ ε I 3. Simulatio I this sectio, providig umerical results for our algorithm which will show our algorithmic advatages by three model. I the simulatio study, data were geerated from N (,Σ ), where Θ = ( Σ ). Ad the sample size was tae to be = 4 i all models, ad let p =5 i Models ad, ad p = 484 i Model 3, which is similar to Zhag et al. [9]. The umerical results of three models as follow: Model : Θ ii, =, Θ i, j =. for i j ad Θ i, j= otherwise. Model : Θ ii, =, Θ i, j =. for i j 4 ad Θ i, j= otherwise. Model 3: Θ ii, =, Θ ii+, =. for mod ( i, p ), Θ =. ad ii, + p Θ i, j= otherwise; this is the grid model i Raviumar et al. [6] ad requires p to be a iteger. Simulatio results based o idepedet replicatios are showed i Table. This paper maily compare the three methods i terms of four quatities: the ˆΘ Θ, the matrix ˆ, Θ Θ, ad the operator ris E ris E (, ) percetages of correctly estimated ozeros ad zeros (TP ad TN), where, orm max i ( X j i, j ) is writte as X,. I the first two colums smaller umbers are better; i the last two colums larger umbers are better. I geeral, Table shows that our estimator performs better tha Zhag et al. s method estimator ad the lasso pealized Gaussia lielihood estimator. 4. Coclusio This paper maily estimate positive-defiite sparse precisio matrix estimatio Table. Compariso of our method with Zhag et al. s method ad graphical lasso. Operator, TP TN Model our method.77.8 94.9 99. Zhag et al. s method.77.6 88.8 98.77 Graphical lasso.78.6 88. 97.65 Model our method.59.6 7. 98.4 Zhag et al. s method.59.9 63.47 98.66 Graphical lasso.6. 64.88 97.4 Model 3 our method.56.8 99. Zhag et al. s method.56.9 99.4 98.57 Graphical lasso.58.6 99.76 97.48 6

. Xia et al. via lasso pealized D-trace loss by a efficiet accelerated gradiet method. The positive-defiiteess ad sparsity are the most importat property of large covariace matrices, our method ot oly efficietly achieves these property, but also shows a better covergece rate. Numerical results have show that our estimator also have a better performace, comparig to Zhag et al. s method ad the Graphical lasso method. udig This project was supported by Natioal Natural Sciece oudatio of Chia (763) ad the Natioal Statistical Scietific Research Projects (5Z54). Refereces [] Huag, J., iu, N., Pourahmadi, M. ad iu,. (6) Covariace Matrix Selectio ad Estimatio via Pealised Normal ielihood. Biometria, 93, 85-98. https://doi.org/.93/biomet/93..85 [] Meishause, N. ad Bühlma, P. (6) High-Dimesioal Graphs ad Variable Selectio with the asso. Aals of Statist, 34, 436-46. https://doi.org/.4/953668 [3] Peg, J., Wag, P., Zhou, N. ad Zhu, J. (9) Partial Correlatio Estimatio by Joit Sparse Regressio Models. Joural of the America Statistical Associatio, 4, 735-746. https://doi.org/.98/jasa.9.6 [4] Yua, M. () High Dimesioal Iverse Covariace Matrix Estimatio via iear Programmig. Joural of Machie earig Research,, 6-86. [5] Cai, T., iu, W. ad uo, X. () A Costraied Miimizatio Approach to Sparse Precisio Matrix Estimatio. Joural of the America Statistical Associatio, 6, 594-67. https://doi.org/.98/jasa..tm55 [6] Yua, M. ad i, Y. (7) Model Selectio ad Estimatio i the Gaussia Graphical Model. Biometria, 94, 9-35. https://doi.org/.93/biomet/asm8 [7] riedma, J.H., Hastie, T.J. ad Tibshirai, R.J. (8) Sparse Iverse Covariace Estimatio with the Graphical asso. Biostatistics, 9, 43-44. https://doi.org/.93/biostatistics/xm45 [8] Witte, D., riema, J.H. ad Simo, N. () New Isights ad aster Computatios for the Graphical asso. Joural of Computatioal ad Graphical Statistics,, 89-9. https://doi.org/.98/jcgs..5a [9] Zhag, T. ad Zou, H. (4) Sparse Precisio Matrix Estimatio via asso Pealized D-Trace oss. Biometria,, 3-. https://doi.org/.93/biomet/ast59 [] Ji, S. ad Ye, J. (9) A Accelerated Gradiet Method for Trace Norm Miimizatio. Iteratioal Coferece o Machie earig, 58, 457-464. https://doi.org/.45/553374.553434 [] Nesterov, Y. (983) A Method for Solvig a Covex Programmig Problem with Covergece Rate O. Soviet Mathematics Dolady, 7, 37-367. [] Nesterov, Y. (3) Itroductory ectures o Covex Optimizatio: A Basic Course. Kluwer Academic. [3] Toh, K.C. ad Yu, S. () A Accelerated Proximal Gradiet Algorithm for 7

. Xia et al. Nuclear Norm Regularized iear east Squares Problems. Pacific Joural of Optimizatio, 6, 65-64. [4] Tseg, P. (8) O Accelerated Proximal Gradiet Methods for Covex-Cocave Optimizatio. Siam Joural o Optimizatio. [5] Bec, A. ad Teboulle, M. (9) A ast Iterative Shriage-Thresholdig Algorithm for iear Iverse Problems. Siam Joural o Imagig Scieces,, 83-. https://doi.org/.37/87654 [6] Raviumar, P., Waiwrighr, M., Rasutti, G. ad Yu, B. () High-Dimesioal Covariace Estimatio by Miimizig -Pealized og-determiat Divergece. Electroic Joural of Statistics, 5, 935-98. https://doi.org/.4/-ejs63 8

. Xia et al. Appedix: Proff of Theorems ad emmas Appedix: Proof of emma Sice that both the trace fuctio ad orm are all covex fuctio, so Θ, Σˆ ( ˆ tr Θ = tr Θ Σ) tr ( Θ) tr ( Θ Σˆ ) tr, ( ˆ ˆ Θ + Θ Θ ΘΣ +ΣΘ ) I λ Θ λ T ( Θ ) + λ Θ T( Θ ), g( T( Θ )) (8) where g( T( Θ )) T( Θ ), is the sub-gradiet of orm at poit T ( Θ ). Sice ( Tµ ( Θ )) Ψµ ( Tµ ( Θ ), Θ ) ad combig i Equatios (7), (8) the ( Θ) ( ( Θ )) ( Θ) Ψ( ( Θ ), Θ ) Θ Θ, ( ΘΣ ˆ ˆ +ΣΘ ) I + λ Θ T( Θ ), g( T( Θ )) T T ˆ ( ˆ T Θ Θ, Θ +ΣΘ ) I T ( Θ ) Θ ( ˆ ˆ ) λ ( ) = Θ T Θ, ΘΣ +ΣΘ I + g T Θ T Θ Θ Sice T ( Θ ) is a miimizer of Ψ ( ΘΘ, ), thus ( ) I T ( ) λgt ( ) (7) (9) ΘΣ ˆ +Σˆ Θ + Θ Θ + Θ = () So the Equatio (9) ca be simplified as: ( ) ( ) λ ( ) Θ T Θ Θ T Θ, ΘΣ +Σ Θ I + g T Θ T Θ Θ, ( ) = Θ T Θ T Θ Θ T Θ Θ = Θ Θ, T( Θ ) Θ + T( Θ ) Θ Appedix: Proof of Theorem Defiig () U α α V = Θ Θ, easily obtaiig ( α V α + V + ) U + U () = Θ Θ Θ, + sice +. so αv α + V + U + U (3) + By applyig emma, easily obtaiig: ( Θ ) ( Θ ) = ( Θ ) ( ( Θ ) ) T Θ Θ Θ Θ (4) 9

. Xia et al. so: V Θ Θ Θ Θ (5) Applyig (3) (5), the: V + + Θ Θ (6) α + Combiig the Equatio (6) ad the relatio α taiig: ( ) ( + ) γ Θ Θ Θ Θ Θ Θ ( + ) ( + ) 4, easily ob- (7) Submit or recommed ext mauscript to SCIRP ad we will provide best service for you: Acceptig pre-submissio iquiries through Email, aceboo, iedi, Twitter, etc. A wide selectio of jourals (iclusive of 9 subjects, more tha jourals) Providig 4-hour high-quality service User-friedly olie submissio system air ad swift peer-review system Efficiet typesettig ad proofreadig procedure Display of the result of dowloads ad visits, as well as the umber of cited articles Maximum dissemiatio of your research wor Submit your mauscript at: http://papersubmissio.scirp.org/ Or cotact apm@scirp.org 3