Partial Correlation Screening for Estimating Large Precision Matrices, with Applications to Classifi

Source: The 24th International Workshop on Matrices and Statistics
  We propose Partial Correlation Screening (PCS) as a new row-by-row approach. To estimate the i-th row of Ω, 1 ≤ i ≤ p, PCS uses a Screen step and a Clean step. In the Screen step, PCS recruits a (small) subset of indices using a stage-wise algorithm: in each stage, the algorithm updates the set of recruited indices by adding the index j that has the largest empirical partial correlation (in magnitude) with i, given the set of indices recruited so far. In the Clean step, PCS first re-investigates all recruited indices in hopes of removing false positives, and then uses the resulting set of indices to reconstruct the i-th row of Ω. PCS is computationally efficient and modest in memory use: to estimate a row of Ω, it needs only a few rows (determined sequentially) of the empirical covariance matrix. This enables PCS to estimate a large precision matrix (e.g., p = 10K) in a few minutes, and opens doors to estimating much larger precision matrices. We use PCS for classification. Higher Criticism Thresholding (HCT) is a recent classifier that enjoys optimality, but to exploit its full potential in practice, one needs a good estimate of the precision matrix. Combining HCT with any approach to estimating Ω gives a new classifier: examples include HCT-PCS and HCT-glasso. We set up a general theoretical framework and show that in a broad context, PCS fully recovers the support of Ω and HCT-PCS yields optimal classification behavior. Our proofs shed interesting light on the behavior of stage-wise procedures.
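
The Screen and Clean steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `partial_corr` and `pcs_row`, the tolerances `screen_tol` and `clean_tol`, the cap `max_size`, the stopping rule, and the thresholding used in the Clean step are all assumptions made for illustration, since the abstract does not specify them. For simplicity the sketch also takes the full covariance matrix as input, whereas the abstract states that PCS needs only a few of its rows.

```python
import numpy as np

def partial_corr(cov, i, j, cond):
    """Partial correlation of variables i and j given the index list `cond`."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

def pcs_row(cov, i, screen_tol=0.1, clean_tol=0.05, max_size=10):
    """Estimate row i of the precision matrix (illustrative sketch only)."""
    p = cov.shape[0]
    recruited = []
    # Screen: stage-wise, add the index with the largest |partial correlation|
    # with i, given the indices recruited so far.
    while len(recruited) < max_size:
        candidates = [j for j in range(p) if j != i and j not in recruited]
        if not candidates:
            break
        scores = [abs(partial_corr(cov, i, j, recruited)) for j in candidates]
        best = int(np.argmax(scores))
        if scores[best] < screen_tol:  # assumed stopping rule
            break
        recruited.append(candidates[best])
    # Clean: re-examine recruited indices via the local inverse and drop
    # those with a small entry (assumed rule for removing false positives).
    idx = [i] + recruited
    local = np.linalg.inv(cov[np.ix_(idx, idx)])[0]
    kept = [i] + [idx[k] for k in range(1, len(idx)) if abs(local[k]) >= clean_tol]
    # Reconstruct row i from the covariance restricted to the kept indices.
    row = np.zeros(p)
    row[kept] = np.linalg.inv(cov[np.ix_(kept, kept)])[0]
    return row

# Toy check with an exact (population) covariance from a tridiagonal Ω.
p = 6
omega = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
cov = np.linalg.inv(omega)
row = pcs_row(cov, i=2)
```

In this toy setting the covariance is exact rather than empirical, so the sketch only shows the mechanics of the two steps; with sample covariances the tolerances would need to be chosen with the noise level in mind.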