Objective To compare how well several traditional models and machine learning methods predict syphilis incidence in Gansu Province, and to forecast future incidence as a basis for planning control measures. Methods Using MATLAB 2014a, polynomial regression, smoothing spline interpolation, grey system GM(1,1), autoregressive integrated moving average (ARIMA), artificial neural network (ANN), and support vector machine regression (SVR) models were fitted to syphilis incidence data for Gansu Province from 2004 to 2015. The actual 2016 incidence was then used to test predictive performance and select the best model, and that model was used to forecast incidence for 2017-2020. Results The mean relative errors in fitting the 2004-2015 incidence were 20.04%, 22.44%, 8.10%, 24.89%, 11.00%, 17.61%, and 24.72% for the first-degree polynomial, second-degree polynomial, smoothing spline, GM(1,1), ARIMA, ANN, and SVR models, respectively, with the smoothing spline giving the smallest error. Of the seven models, ARIMA predicted the 2016 incidence best; its forecasts for 2017-2020 were 19.11, 18.21, 18.57, and 19.94 per 100,000, respectively. Conclusion Different mathematical models differ in fitting and predictive performance, and a model should be chosen according to the actual data. The ARIMA model performed well in predicting recent syphilis incidence in Gansu Province, and it forecasts a relatively stable incidence for 2017-2020.
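To make the evaluation procedure concrete, the following is a minimal Python sketch of one of the listed models, GM(1,1), together with the mean relative error used to compare fits. It is not the authors' MATLAB code. The function names (`mean_relative_error`, `gm11_fit`, `gm11_predict`) are hypothetical, the mean relative error is assumed to be the average of |predicted - actual| / actual expressed as a percentage, and the series is synthetic and stands in for the incidence data, which the abstract does not give.

```python
import math


def mean_relative_error(actual, predicted):
    """Mean of |predicted - actual| / |actual|, as a percentage."""
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) by least squares."""
    # Accumulated series x1 and its adjacent-mean background values z.
    x1 = []
    total = 0.0
    for v in x0:
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Fit y = s*z + b; the grey equation x0[k] + a*z[k] = b gives a = -s.
    n = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    s = (n * szy - sz * sy) / (n * szz - sz * sz)
    b = (sy - s * sz) / n
    return -s, b


def gm11_predict(x0, steps=0):
    """Return fitted values for x0 followed by `steps` forecasts."""
    a, b = gm11_fit(x0)  # assumes a != 0, i.e. the series is not flat
    c = x0[0] - b / a
    # Time response of the accumulated series, then difference it back.
    x1_hat = [c * math.exp(-a * k) + b / a for k in range(len(x0) + steps)]
    return [x0[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, len(x1_hat))]


# Synthetic, illustrative series only (NOT the Gansu incidence data).
train = [10.0 * 1.1 ** k for k in range(8)]
holdout = 10.0 * 1.1 ** 8

fitted_and_forecast = gm11_predict(train, steps=1)
fit_error = mean_relative_error(train, fitted_and_forecast[:8])
holdout_error = mean_relative_error([holdout], fitted_and_forecast[8:])
print(f"fit MRE = {fit_error:.3f}%, hold-out error = {holdout_error:.3f}%")
```

The sketch keeps the two error figures separate because the abstract does the same: the smoothing spline had the lowest fitting error on 2004-2015, while ARIMA was chosen for its error on the held-out 2016 value.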