In QSAR studies, assessing a model's predictive ability is essential. For a long time, predictive ability was determined by internal validation such as leave-one-out or leave-k-out cross-validation, but the OECD principles formulated in 2004 explicitly stipulate that an external validation set must be used to evaluate a model's predictive ability. To examine how internal and external validation relate to predictive ability, this paper takes 45 testosterone and dihydrotestosterone derivatives and 37 naphthoquinone ester derivatives as its study objects. Molecular descriptors calculated by E-Dragon serve as the independent variables, and variables are selected with the plus-n minus-l algorithm (add n variables, then remove l). On that basis, the SVM algorithm is used to build QSAR models for different activities of the same compounds and for different activities of different compounds, in order to study how the choice of validation method relates to predictive ability in QSAR/QSPR modeling. The results show that a model's predictive ability has no necessary connection with how good its internal validation results are, whereas test results that incorporate external validation are a reliable basis for judging predictive ability.
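The two kinds of validation contrasted in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: a one-descriptor least-squares line stands in for the SVM so that the example needs only the standard library, the function names (`fit_line`, `q2_loo`, `r2_external`) are hypothetical, and the numbers are made-up toy values rather than data from the study. Internal validation is computed as the leave-one-out q² (1 − PRESS / SS_total on the training set), and external validation uses one common definition of predictive R², which compares test-set errors with the spread of the test values around the training mean.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit y = a*x + b on one descriptor (stand-in for the SVM)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def predict(model, x):
    a, b = model
    return a * x + b

def q2_loo(xs, ys):
    """Internal validation: leave-one-out q^2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        model = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - predict(model, xs[i])) ** 2
    my = mean(ys)
    return 1.0 - press / sum((y - my) ** 2 for y in ys)

def r2_external(model, xs_test, ys_test, y_train_mean):
    """External validation: 1 - sum((y - y_hat)^2) / sum((y - mean_train)^2)."""
    sse = sum((y - predict(model, x)) ** 2 for x, y in zip(xs_test, ys_test))
    sst = sum((y - y_train_mean) ** 2 for y in ys_test)
    return 1.0 - sse / sst

# Toy values for illustration only (not taken from the paper).
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1]
test_x = [1.5, 3.5, 6.0]
test_y = [3.0, 7.1, 11.5]

model = fit_line(train_x, train_y)
q2 = q2_loo(train_x, train_y)
r2_ext = r2_external(model, test_x, test_y, mean(train_y))
print(f"internal LOO q2 = {q2:.3f}, external R2 = {r2_ext:.3f}")
```

Because the two statistics are computed on different compounds, a high q² does not by itself guarantee a high external R², which is the point the abstract makes.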