We present a hierarchical Deep Q-Network with Forgetting (HDQF) that took first place in the MineRL competition. HDQF works with imperfect demonstrations, utilizing the hierarchical structure of expert trajectories to extract effective sequences of meta-actions and subgoals. We introduce a structured, task-dependent replay buffer and a forgetting technique that allow the HDQF agent to gradually erase poor-quality expert data from the buffer. In this paper we present the details of the HDQF algorithm and give experimental results in the Minecraft domain.

Deep reinforcement learning (RL) has achieved compelling success on many complex sequential decision-making problems, especially in simple domains. In examples such as AlphaStar [6], AlphaZero [2], and OpenAI Five, human or superhuman levels of performance were attained. However, RL algorithms usually require a huge number of environment samples for training t
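The forgetting idea from the abstract — a replay buffer that keeps expert and agent data separate and gradually erases poor-quality expert transitions — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name, the return-based quality ranking, and the decaying expert sampling fraction are all assumptions made for the example.

```python
import random
from collections import deque

class ForgettingReplayBuffer:
    """Hypothetical sketch of a replay buffer with forgetting:
    expert episodes are ranked by return, the worst are dropped
    over time, and sampling shifts from expert to agent data."""

    def __init__(self, capacity, expert_fraction=0.5, decay=0.999):
        self.capacity = capacity
        self.expert_fraction = expert_fraction  # share of each batch drawn from expert data
        self.decay = decay                      # per-batch decay of that share
        self.expert = []                        # list of (episode_return, transition)
        self.agent = deque(maxlen=capacity)     # agent's own experience, FIFO

    def add_expert(self, transition, episode_return):
        # keep expert data sorted by return (best first) and capped at capacity
        self.expert.append((episode_return, transition))
        self.expert.sort(key=lambda x: x[0], reverse=True)
        del self.expert[self.capacity:]

    def add_agent(self, transition):
        self.agent.append(transition)

    def forget(self, keep_ratio=0.9):
        # "forgetting": drop the lowest-return tail of the expert data
        keep = max(1, int(len(self.expert) * keep_ratio))
        self.expert = self.expert[:keep]

    def sample(self, batch_size):
        batch = []
        for _ in range(batch_size):
            if self.expert and random.random() < self.expert_fraction:
                batch.append(random.choice(self.expert)[1])
            elif self.agent:
                batch.append(random.choice(self.agent))
            else:
                batch.append(random.choice(self.expert)[1])
        # rely less on demonstrations as agent experience accumulates
        self.expert_fraction *= self.decay
        return batch
```

A training loop would call `forget()` periodically (e.g. every few thousand steps), so that low-return demonstration data is erased gradually rather than all at once.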