Proximal policy optimization with an integral compensator for quadrotor control

Source: Frontiers of Information Technology & Electronic Engineering (English edition)
We use the advanced proximal policy optimization (PPO) reinforcement learning algorithm to optimize the stochastic control strategy and achieve speed control of the "model-free" quadrotor. The model is controlled by four learned neural networks, which directly map the system states to control commands in an end-to-end style. Introducing an integral compensator into the actor-critic framework greatly enhances the speed tracking accuracy and robustness. In addition, a two-phase learning scheme that includes both offline and online learning is developed for practical use. A model with strong generalization ability is learned in the offline phase. Then, the flight policy of the model is continuously optimized in the online learning phase. Finally, the performance of our proposed algorithm is compared with that of the traditional PID algorithm.
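The abstract does not spell out how the integral compensator enters the actor-critic framework, so the following is only a minimal sketch under stated assumptions. It shows the standard PPO clipped surrogate objective, plus one plausible reading of the compensator: the accumulated velocity tracking error is appended to the state seen by the networks. The names `ppo_clip_objective` and `IntegralCompensator`, the clamp `limit`, and the choice to append the integral to the observation are all hypothetical and are not taken from the paper.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old are log-probabilities of the taken actions under
    the current and the data-collecting policy; advantages are estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


class IntegralCompensator:
    """Hypothetical compensator: integrates the velocity tracking error
    and appends it to the observation fed to the actor and critic."""

    def __init__(self, dim, dt, limit=1.0):
        self.dt = dt              # control period in seconds (assumed)
        self.limit = limit        # anti-windup clamp (an assumption)
        self.integral = [0.0] * dim

    def reset(self):
        self.integral = [0.0] * len(self.integral)

    def augment(self, state, target_velocity, velocity):
        for i, (v_ref, v) in enumerate(zip(target_velocity, velocity)):
            acc = self.integral[i] + (v_ref - v) * self.dt
            self.integral[i] = max(-self.limit, min(acc, self.limit))
        return list(state) + list(self.integral)
```

In such a setup, `augment` would be called once per control step before the policy is queried, and `reset` at the start of each episode. A persistent steady-state velocity error then shows up as a growing input feature that the policy can learn to cancel, which is the role an integral term plays in a PID controller.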