In a multi-agent environment, an agent's optimal strategy depends on the strategies of the other agents, which makes the learning objective hard to define clearly. Methods that model behavior from objective observation alone do not adequately reflect an agent's individual rationality. This paper proposes an efficient online learning method for agents in multi-agent environments based on introspective reasoning. The method combines objective observation of behavior, based on an opponent model, with subjective inference of intentions, based on perspective-taking reasoning, so that through introspective reasoning the agent obtains more information about its opponents. Simulation experiments on classical coordination games show that the method achieves relatively good coordination performance.
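The combination of an observed opponent model with a perspective-taking estimate can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details the abstract does not give. The payoff matrix, the mixing parameter `weight`, the class name `IntrospectiveAgent`, and the best-response rule are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical symmetric 2x2 coordination game (not taken from the paper):
# both agents earn PAYOFF[a, b] when one plays a and the other plays b.
PAYOFF = np.array([[2.0, 0.0],
                   [0.0, 1.0]])


def best_response(belief):
    """Action with the highest expected payoff against a belief over the other agent's actions."""
    return int(np.argmax(PAYOFF @ belief))


class IntrospectiveAgent:
    """Illustrative agent mixing an empirical opponent model with a perspective-taking estimate."""

    def __init__(self, weight=0.5):
        self.weight = weight          # assumed mixing weight for the introspective estimate
        self.opp_counts = np.ones(2)  # counts of observed opponent actions (uniform prior)
        self.own_counts = np.ones(2)  # counts of own actions, which the opponent also observes

    def belief(self):
        # Objective part: empirical frequency of the opponent's past actions.
        empirical = self.opp_counts / self.opp_counts.sum()
        # Subjective part: take the opponent's position and ask what a rational
        # opponent would play against our own observed behavior. Because PAYOFF
        # is symmetric, the same best_response function serves both sides.
        own_freq = self.own_counts / self.own_counts.sum()
        introspective = np.zeros(2)
        introspective[best_response(own_freq)] = 1.0
        return (1 - self.weight) * empirical + self.weight * introspective

    def act(self):
        return best_response(self.belief())

    def observe(self, own_action, opp_action):
        self.own_counts[own_action] += 1
        self.opp_counts[opp_action] += 1


def simulate(rounds=20):
    """Let two illustrative agents play the game repeatedly and return the joint actions."""
    a, b = IntrospectiveAgent(), IntrospectiveAgent()
    history = []
    for _ in range(rounds):
        x, y = a.act(), b.act()
        a.observe(x, y)
        b.observe(y, x)
        history.append((x, y))
    return history
```

Under these assumed payoffs, two such agents coordinate on the joint action (0, 0) from the first round. Setting `weight=0` reduces the agent to a purely observation-based learner, which gives a baseline for comparison.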