A knowledge-based approach for KEGG pathway expansion by enrichment analysis of GO terms

Source: 5th National Conference on Bioinformatics and Systems Biology (第五届全国生物信息学与系统生物学学术大会)
  Background: Pathway databases, especially KEGG, are widely used by biomedical scientists as a reference knowledge base for interpreting experimental findings. Nevertheless, our knowledge of existing biological pathways is incomplete, and a large number of pathways need to be expanded further. Computational pathway expansion offers a cheap and reliable way to approach this challenging task. Several existing approaches rely on the analysis of large-scale datasets generated by genome sequencing and other high-throughput experiments; they are limited by variation across studies and by the information that single experiments can provide. In this study, we propose a pathway expansion algorithm based on systematic learning from functional knowledge bases (PPI and GO) that describe genes (or their products) and their relations with one another.

  Methods: In essence, pathway expansion is equivalent to testing whether a target gene belongs to a specific pathway or pathways. First, we identified all genes interacting with the target gene using two large protein-protein interaction databases, HPRD and BioGRID. Then, we identified all candidate KEGG pathways to which these interacting genes belong. Finally, for each candidate pathway, all genes in the pathway together with the target gene were subjected to enrichment analysis at each GO term of the target gene. We considered the target gene to belong to the candidate pathway if all GO terms of the target gene were enriched among the genes of this pathway.

  Results: The proposed knowledge-based approach achieved excellent performance in predicting a gene's pathway(s) with either of the two PPI databases. The average consistent rate (defined as the proportion of correctly predicted pathways among all annotated pathways) increased with the number of interacting genes and reached its highest value of 0.95 when the number of interacting genes was 22. However, the relative precision rates (RP, defined as the proportion of target genes for which all annotated pathways were correctly predicted) remained largely at the same level regardless of the number of interacting genes, at 0.867 and 0.802 for HPRD and BioGRID, respectively.

  Conclusions: The proposed knowledge-based approach for KEGG pathway expansion achieved high performance in inferring the pathway(s) to which a gene belongs, making it a useful tool for expanding our knowledge of both the target gene and the predicted pathways.
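The three-step procedure and the two evaluation measures described in the abstract can be sketched roughly as follows. This is only an illustrative reading, not the authors' implementation: the abstract does not name the enrichment statistic, so a one-sided hypergeometric test with a 0.05 cutoff is assumed here, and all function names, the `background` gene universe, and the dictionary-based inputs (`ppi`, `pathways`, `go`) are hypothetical stand-ins for HPRD/BioGRID, KEGG, and GO data.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric(population M, n marked, N drawn)."""
    total = comb(M, N)
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def predict_pathways(target, ppi, pathways, go, background, alpha=0.05):
    """Return candidate pathways in which every GO term of `target` is enriched.

    ppi:        gene -> set of interacting genes (assumed PPI lookup)
    pathways:   pathway id -> set of member genes (assumed KEGG lookup)
    go:         gene -> set of GO terms (assumed GO annotation lookup)
    background: set of all genes used as the test universe (assumption)
    alpha:      significance cutoff (assumption, not given in the abstract)
    """
    target_terms = go.get(target, set())
    if not target_terms:
        return []
    # Steps 1 and 2: interacting genes, then the pathways they belong to.
    interactors = ppi.get(target, set())
    candidates = [p for p, genes in pathways.items() if genes & interactors]
    predicted = []
    for p in candidates:
        # Step 3: test pathway genes plus the target at each GO term.
        members = (pathways[p] | {target}) & background
        enriched_everywhere = True
        for term in target_terms:
            annotated = {g for g in background if term in go.get(g, set())}
            k = len(members & annotated)
            p_value = hypergeom_sf(k, len(background), len(annotated), len(members))
            if p_value >= alpha:
                enriched_everywhere = False
                break
        if enriched_everywhere:
            predicted.append(p)
    return sorted(predicted)


def consistent_rate(predicted, annotated):
    """Share of a gene's annotated pathways that were predicted."""
    return len(predicted & annotated) / len(annotated)


def relative_precision(predictions, annotations):
    """Share of target genes whose annotated pathways were all predicted."""
    hits = sum(1 for g in annotations if annotations[g] <= predictions.get(g, set()))
    return hits / len(annotations)
```

Requiring enrichment at every GO term of the target gene makes the rule conservative: a single non-enriched term rejects the candidate pathway. Averaging `consistent_rate` over all target genes would correspond to the average consistent rate reported in the results, under the interpretation assumed here.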