The use of a generalized fold change for ranking differentially expressed genes from RNA-seq data

Source: The Fifth National Conference on Bioinformatics and Systems Biology (第五届全国生物信息学与系统生物学学术大会)
Background: There is currently no satisfactory method for detecting differentially expressed genes from RNA-seq data when only a single biological replicate is available. Surprisingly, even as sequencing costs decrease, most published RNA-seq studies do not have biological replicates. For example, over the last four years, almost 70% of all human RNA-seq samples in the Gene Expression Omnibus (GEO) had no biological replicates. From 2010 to 2011, the number of unreplicated RNA-seq samples increased even faster than the number of replicated RNA-seq samples.

Methods: In this paper, we describe a technique for measuring fold change that takes into account the uncertainty of gene expression measurement by RNA-seq. Our representation of fold change is derived from the posterior distribution of the raw fold change. This representation, denoted GFOLD, balances the estimated degree of change against the significance of that change. We also built a hierarchical model for cases in which biological replicates are available. The calculation is based on Markov chain Monte Carlo (MCMC).

Results: We applied GFOLD to five datasets with biological replicates (four RNA-seq and one GRO-seq) and compared it with edgeR, DESeq, DEGseq, Poisson, Cufflinks, and fold change with offset. The comparisons show that GFOLD outperforms all other methods in most cases when only a single replicate is available. When biological replicates are available, GFOLD provides results comparable to those of existing methods.

Conclusions: For RNA-seq data without biological replicates, GFOLD provides a more consistent and more biologically meaningful approach to ranking differentially expressed genes than other commonly used methods. The concept of GFOLD can be applied broadly, beyond RNA-seq and GRO-seq, to other types of genomic data, including ChIP-seq.
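The idea of deriving a fold change from a posterior distribution, so that the size of a change is weighed against its significance, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single replicate per condition, Poisson read counts with a flat prior on each rate (giving a Gamma(count + 1, 1) posterior), and plain Monte Carlo sampling rather than MCMC. The function name `gfold_sketch`, its parameters, and the cutoff `c` are hypothetical and chosen for illustration only.

```python
import math
import random


def gfold_sketch(count_a, count_b, depth_a, depth_b, c=0.01, draws=20000, seed=0):
    """Conservative log2 fold change (condition B over A) for one gene.

    Assumption (not stated in the abstract): counts are Poisson with a flat
    prior on the rate, so each posterior rate is Gamma(count + 1, 1).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        # Draw one posterior rate per condition, normalized by sequencing depth.
        rate_a = rng.gammavariate(count_a + 1, 1.0) / depth_a
        rate_b = rng.gammavariate(count_b + 1, 1.0) / depth_b
        samples.append(math.log2(rate_b / rate_a))
    samples.sort()
    lower = samples[int(c * (draws - 1))]        # c-quantile of the posterior
    upper = samples[int((1 - c) * (draws - 1))]  # (1 - c)-quantile
    if lower > 0:
        return lower   # confidently up: report the conservative lower bound
    if upper < 0:
        return upper   # confidently down: report the conservative upper bound
    return 0.0         # the interval spans zero: no reliable change


if __name__ == "__main__":
    # Same raw 2-fold ratio, very different amounts of evidence.
    print(gfold_sketch(10, 20, 1.0, 1.0))
    print(gfold_sketch(1000, 2000, 1.0, 1.0))
```

Under these assumptions, a gene with counts of 10 versus 20 scores zero because its posterior interval spans zero, while a gene with counts of 1000 versus 2000 scores somewhat below its raw log2 fold change of 1. Ranking genes by such a score favors changes that are both large and well supported.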