AAII Day Presentation 13: Han Zheng
Competitive and Cooperative Heterogeneous Deep Reinforcement Learning
TIME: 4:20pm - 4:40pm
SPEAKER: Han Zheng, Australian Artificial Intelligence Institute
ABSTRACT:
Numerous deep reinforcement learning methods have been proposed, including deterministic, stochastic, and evolutionary-based hybrid methods. However, among these methodologies, no single one consistently outperforms the others across all tasks in terms of effective exploration, sample efficiency, and stability.
In this work, we present a competitive and cooperative heterogeneous deep reinforcement learning framework called C2HRL. C2HRL aims to learn a superior agent that exceeds the capabilities of any individual agent in an agent pool through two agent management mechanisms: one competitive, the other cooperative. The competitive mechanism forces agents to compete for computing resources and to explore and exploit diverse regions of the solution space. Under this strategy, resources are allocated to the agent best suited to the specific task and random seed setting, which results in better sample efficiency and stability. The other mechanism, cooperation, asks heterogeneous agents to share their exploration experiences so that all agents can learn from a diverse set of policies. These experiences are stored in a two-level replay buffer, and the result is an overall more effective exploration strategy.
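The abstract does not specify implementation details, so the following Python sketch is only a rough illustration of the two ideas it names: a two-level replay buffer (a shared level pooled across agents plus a per-agent level) and a competitive allocation of a training budget to the currently best-performing agent. All names, the buffer capacities, and the shared-sampling ratio are assumptions for illustration, not the authors' actual design.

```python
import random
from collections import deque

class TwoLevelReplayBuffer:
    """Hypothetical two-level buffer: a global level shared by all agents
    (cooperation) and a local level holding each agent's own transitions."""

    def __init__(self, global_capacity=100_000, local_capacity=10_000):
        self.global_buffer = deque(maxlen=global_capacity)  # shared experiences
        self.local_buffers = {}                             # one deque per agent
        self.local_capacity = local_capacity

    def add(self, agent_id, transition):
        # Every transition is visible to all agents via the global level.
        self.global_buffer.append(transition)
        self.local_buffers.setdefault(
            agent_id, deque(maxlen=self.local_capacity)
        ).append(transition)

    def sample(self, agent_id, batch_size, shared_ratio=0.5):
        # Mix shared (other agents') and own experience; the 50/50 split
        # is an arbitrary assumption for this sketch.
        n_shared = int(batch_size * shared_ratio)
        local = self.local_buffers.get(agent_id, self.global_buffer)
        batch = random.sample(self.global_buffer,
                              min(n_shared, len(self.global_buffer)))
        batch += random.sample(local,
                               min(batch_size - n_shared, len(local)))
        return batch

def allocate_resources(agents, scores, step_budget):
    """Competitive mechanism sketch: grant the round's training budget
    (e.g. environment steps) to the best-scoring agent; the rest wait."""
    best = max(agents, key=lambda a: scores[a])
    return {a: (step_budget if a == best else 0) for a in agents}
```

In this reading, cooperation happens through the shared global buffer at sampling time, while competition happens through the winner-takes-the-budget allocation each round; how C2HRL actually scores agents and schedules resources is not stated in the abstract.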
We evaluated C2HRL on a range of continuous control tasks from the MuJoCo benchmark. The experimental results demonstrate that C2HRL achieves better sample efficiency and greater stability than three state-of-the-art DRL baselines.