Transcriptome Data Analysis of Stellera chamaejasme Based on High-Throughput Sequencing

Source: Chinese Traditional and Herbal Drugs (中草药)
Objective: To obtain metabolic pathway gene sequences, simple sequence repeats (SSRs), and transposon information from a transcriptome database of Stellera chamaejasme.

Methods: Roots of S. chamaejasme were used as the test material. The transcriptome was sequenced on the Illumina HiSeq 2000 second-generation sequencing platform and then subjected to systematic bioinformatic analysis.

Results: A total of 26 785 872 clean reads were obtained and assembled into 47 053 unigenes with an average length of 419 nt. The assembled unigene sequences were compared against the Nr, Swiss-Prot, KEGG, COG, and GO databases using BLAST; 11 138 and 24 744 unigenes were annotated in the Nr and Swiss-Prot databases, respectively. The annotated unigenes fell into 36 GO categories and mapped to 119 KEGG standard metabolic pathways, and further analysis identified 15 key enzyme genes of terpenoid biosynthesis pathways. MISA detected 3 480 SSRs. Mononucleotide repeats were the most abundant type (1 986, 57.07%), and hexanucleotide repeats were the rarest (5, 0.14%). Transposon prediction on the transcriptome sequences with the RepeatMasker online tool found 1 497 transposons, of which 827 had an E value < 1 × 10⁻⁵ and covered 22 transposon types. LINE/L1 was the most numerous type (405, 48.97%), while DNA/Ginger, DNA/hAT, DNA/PIF-ISL2EU, LINE/Jockey, and LTR/Lenti were the rarest, with one sequence each.

Conclusion: High-throughput sequencing of S. chamaejasme yielded a large amount of gene sequence, SSR, and transposon information. These data provide a resource and a theoretical basis for the future isolation and cloning of key enzyme genes in the biosynthesis of active constituents such as phorbol esters, and for studies of the related molecular mechanisms.
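To make the SSR statistics above concrete, the following is a minimal Python sketch of how perfect SSRs of motif length 1 to 6 can be located and how a per-type frequency is computed. It is not MISA's implementation. The names `find_ssrs`, `is_primitive`, `percent`, and the minimum repeat counts in `MIN_REPEATS` are illustrative assumptions; the abstract does not state the thresholds that were actually used.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical; not given in the abstract).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA' is not)."""
    n = len(motif)
    return not any(
        motif == motif[:k] * (n // k) for k in range(1, n) if n % k == 0
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for unit in sorted(min_repeats):
        # A motif of `unit` bases followed by at least (minimum - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats[unit] - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // unit))
                pos = m.end()
            else:
                # Tract belongs to a shorter motif class; step forward one base.
                pos = m.start() + 1
    return hits


def percent(count, total):
    """Frequency of one SSR type as a percentage, rounded to two decimals."""
    return round(100.0 * count / total, 2)


if __name__ == "__main__":
    demo = "GG" + "A" * 12 + "CT" + "AC" * 6 + "TT" + "AGC" * 5 + "G"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
    # Frequencies recomputed from the counts reported in the abstract.
    print(percent(1986, 3480), percent(5, 3480))
```

The `percent` helper reproduces the reported shares from the reported counts (1 986 and 5 out of 3 480 SSRs), which is a quick consistency check on the figures in the abstract.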