A primary challenge of agent-based policy learning in complex and uncertain environments is escalating computational complexity with the size of the task space (action choices and world states) and the number of agents. Nonetheless, there is ample evidence in the natural world that high-functioning social mammals learn to solve complex problems with ease, both individually and cooperatively. This ability to solve computationally intractable problems stems both from brain circuits for hierarchical representation of state and action spaces and learned policies, and from constraints imposed by social cognition. Using biologically derived mechanisms for state representation and mammalian social intelligence, we constrain state-action choices in reinforcement learning in order to improve learning efficiency. Analysis results bound the reduction in computational complexity due to state abstraction, hierarchical representation, and socially constrained action selection in agent-based learning problems that can be described as variants of Markov decision processes. Investigation of two task domains, single-robot herding and multirobot foraging, shows that the theoretical bounds hold and that acceptable policies emerge, reducing task completion time, computational cost, and/or memory resources compared to learning without hierarchical representations and without social knowledge.
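To make the core idea concrete, the following is a minimal sketch of tabular Q-learning in which a per-state constraint function prunes the action set before selection, shrinking the number of state-action pairs the learner must explore. The 5×5 grid world, the `allowed_actions` mask, and all names here are illustrative assumptions, not the paper's actual herding or foraging domains or its specific social-constraint mechanism.

```python
import random

# Toy grid world: agent starts at (0, 0), goal at (4, 4).
ACTIONS = ["up", "down", "left", "right"]
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
SIZE, GOAL = 5, (4, 4)

def allowed_actions(state):
    """Hypothetical constraint: only permit actions that move toward the goal.
    This prunes each state's action set (here to at most 2 of 4 actions),
    standing in for the paper's socially constrained action selection."""
    x, y = state
    acts = []
    if x < GOAL[0]: acts.append("right")
    if x > GOAL[0]: acts.append("left")
    if y < GOAL[1]: acts.append("down")
    if y > GOAL[1]: acts.append("up")
    return acts or ACTIONS  # at the goal, fall back to the full set

def step(state, action):
    """Deterministic transition with a small step cost and a goal reward."""
    dx, dy = MOVES[action]
    nxt = (min(max(state[0] + dx, 0), SIZE - 1),
           min(max(state[1] + dy, 0), SIZE - 1))
    return nxt, (1.0 if nxt == GOAL else -0.01), nxt == GOAL

def q_learn(episodes=200, alpha=0.5, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {}  # only constrained (state, action) pairs ever get entries
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            acts = allowed_actions(state)  # constrained choice set
            if rng.random() < eps:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda act: Q.get((state, act), 0.0))
            nxt, r, done = step(state, a)
            best_next = 0.0 if done else max(
                Q.get((nxt, b), 0.0) for b in allowed_actions(nxt))
            old = Q.get((state, a), 0.0)
            Q[(state, a)] = old + alpha * (r + gamma * best_next - old)
            state = nxt
    return Q
```

In this toy setting the constraint caps the Q-table at roughly 40 entries versus the 100 (25 states × 4 actions) of unconstrained learning, a small-scale analogue of the complexity reductions the analysis bounds for hierarchical and socially constrained learners.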