【Affiliation】: Carnegie Mellon University and National University of Singapore, Singapore
【Source】: The 24th International Workshop on Matrices and Statistics
Excerpt from the paper:
We propose Partial Correlation Screening (PCS) as a new row-by-row approach to estimating a p × p precision matrix Ω. To estimate the i-th row of Ω, 1 ≤ i ≤ p, PCS uses a Screen step and a Clean step. In the Screen step, PCS recruits a (small) subset of indices using a stage-wise algorithm: in each stage, the algorithm updates the set of recruited indices by adding the index j that has the largest empirical partial correlation (in magnitude) with i, given the set of indices recruited so far. In the Clean step, PCS first re-investigates all recruited indices in hopes of removing false positives, and then uses the resulting set of indices to reconstruct the i-th row of Ω. PCS is computationally efficient and modest in memory use: to estimate a row of Ω, it needs only a few rows (determined sequentially) of the empirical covariance matrix. This enables PCS to estimate a large precision matrix (e.g., p = 10K) in a few minutes, and opens the door to estimating much larger precision matrices. We use PCS for classification. Higher Criticism Thresholding (HCT) is a recent classifier that enjoys optimality, but to exploit its full potential in practice, one needs a good estimate of the precision matrix. Combining HCT with any approach to estimating Ω gives a new classifier: examples include HCT-PCS and HCT-glasso. We set up a general theoretical framework and show that, in a broad context, PCS fully recovers the support of Ω and HCT-PCS yields optimal classification behavior. Our proofs shed interesting light on the behavior of stage-wise procedures.
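The Screen and Clean steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`partial_corr`, `pcs_row`), the stopping rule, the thresholds `screen_thresh` and `clean_thresh`, and the cap `max_size` are all assumptions made for illustration, since the abstract does not specify them. For simplicity the sketch also takes the full empirical covariance matrix as input, whereas the abstract notes that PCS needs only a few of its rows.

```python
import numpy as np


def partial_corr(cov, i, j, cond):
    """Partial correlation of variables i and j given the indices in cond.

    Computed by inverting the covariance submatrix on {i, j} and cond.
    """
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


def pcs_row(cov, i, screen_thresh=0.1, clean_thresh=0.05, max_size=10):
    """Illustrative Screen + Clean estimate of row i of the precision matrix.

    The thresholds and max_size are hypothetical tuning parameters.
    """
    p = cov.shape[0]

    # Screen: stage-wise, recruit the index with the largest partial
    # correlation (in magnitude) with i, given the indices recruited so far.
    recruited = []
    while len(recruited) < max_size:
        candidates = [j for j in range(p) if j != i and j not in recruited]
        if not candidates:
            break
        scores = [abs(partial_corr(cov, i, j, recruited)) for j in candidates]
        best = int(np.argmax(scores))
        if scores[best] < screen_thresh:  # assumed stopping rule
            break
        recruited.append(candidates[best])

    # Clean: re-examine every recruited index given all the others and
    # drop those whose partial correlation with i is small.
    idx = [i] + recruited
    local = np.linalg.inv(cov[np.ix_(idx, idx)])
    kept = [
        j
        for k, j in enumerate(recruited, start=1)
        if abs(local[0, k]) / np.sqrt(local[0, 0] * local[k, k]) >= clean_thresh
    ]

    # Reconstruct row i from the small submatrix on {i} and the kept indices.
    idx = [i] + kept
    local = np.linalg.inv(cov[np.ix_(idx, idx)])
    row = np.zeros(p)
    row[idx] = local[0]
    return row
```

Calling `pcs_row(cov, i)` for each i from 0 to p - 1 and stacking the results would give a row-by-row estimate of Ω. Each call inverts only small submatrices, which is in the spirit of the low computational and memory cost the abstract describes.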