Searching for co-mutated gene pairs makes it possible to study the biological functions that are jointly perturbed during the onset and progression of cancer, and provides new clues to the mechanisms of carcinogenesis. At present, such studies rely mainly on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Because KEGG tends to define broad pathways, it cannot be used to determine whether a pathway as a whole, or only part of it, is associated with cancer. In contrast, the Gene Ontology database defines biological functions at multiple levels, from broad to specific, so studying the co-perturbation of biological functions in cancer on the basis of Gene Ontology functional categories is a reasonable choice. This paper proposes an algorithm that searches for pairs of Gene Ontology functional categories between which the number of annotated co-mutated gene pairs is significantly larger than expected by chance. Because Gene Ontology functional categories depend on one another, the function pairs found in this way are partly redundant; this paper therefore also proposes a redundancy-removal algorithm that identifies non-redundant, typical function pairs. Using somatic mutation screening data from the lung adenocarcinoma genome, we identified 78 typical co-mutated function pairs. These pairs include both broad and specific biological functions, delimit more precisely the scope of the jointly perturbed biological functions, and provide new clues for studying the mechanism of lung adenocarcinoma.
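The abstract does not state which statistical test is used to decide that two functional categories share "non-randomly many" co-mutated gene pairs. As a minimal sketch only, the following assumes a one-sided hypergeometric test: the population is every unordered pair of screened genes, the "successes" are the co-mutated pairs, and the "draws" are the gene pairs that link the two categories. The function names, the toy gene identifiers, and the choice of test are all illustrative assumptions, not the paper's method.

```python
from itertools import product
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, successes K, draws n)."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def candidate_pairs(genes_a, genes_b):
    """Unordered gene pairs with one gene from each category, no self-pairs."""
    return {frozenset((a, b)) for a, b in product(genes_a, genes_b) if a != b}


def comutation_enrichment(genes_a, genes_b, comutated, all_genes):
    """Return (observed co-mutated pairs, p-value) for one function pair.

    Assumption: the null model draws gene pairs uniformly from all
    unordered pairs of screened genes.
    """
    N = comb(len(all_genes), 2)          # all possible gene pairs
    K = len(comutated)                   # co-mutated gene pairs overall
    pairs = candidate_pairs(genes_a, genes_b)
    n = len(pairs)                       # pairs linking the two categories
    k = len(pairs & comutated)           # of those, how many are co-mutated
    return k, hypergeom_sf(k, N, K, n)


# Hypothetical toy data: six screened genes, three co-mutated pairs.
all_genes = {"g1", "g2", "g3", "g4", "g5", "g6"}
comutated = {frozenset(p) for p in [("g1", "g4"), ("g2", "g4"), ("g1", "g5")]}
term_a = {"g1", "g2"}   # genes annotated to a hypothetical category A
term_b = {"g4", "g5"}   # genes annotated to a hypothetical category B

k, p_value = comutation_enrichment(term_a, term_b, comutated, all_genes)
print(k, p_value)
```

In this toy case, three of the four gene pairs linking the two categories are co-mutated, giving a p-value of 12/1365 (about 0.0088). A real analysis would also need multiple-testing correction across all category pairs, and the redundancy-removal step described in the abstract is not sketched here because its details are not given.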