In recent years, agent learning based on DFL [1] has attracted considerable research attention. Reference [2] presents a multi-agent learning model with immediate rewards in a DF environment, reference [3] gives a single-agent learning algorithm for a DF environment, and reference [4] describes the mental model of an agent in a DF environment. Building on this work, this paper constructs a DFL-based cooperative multi-agent learning model with non-immediate rewards. It covers the structure of the model, its main data structures, and the corresponding algorithms, and closes with a verification example.