Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games

Shengyi Huang; Santiago Ontañón

doi:10.48550/arxiv.2010.03956

Preprint

Open access

arXiv (Cornell University)

05 Oct 2020

Files and links (1)

https://doi.org/10.48550/arxiv.2010.03956

Preprint (Author's original), arXiv.org - Non-exclusive license to distribute, Open

Abstract

Computer Science - Learning

Statistics - Machine Learning

Training agents using Reinforcement Learning in games with sparse rewards is a challenging problem, since large amounts of exploration are required to retrieve even the first reward. To tackle this problem, a common approach is to use reward shaping to help exploration. However, an important drawback of reward shaping is that agents sometimes learn to optimize the shaped reward instead of the true objective. In this paper, we present a novel technique that we call action guidance that successfully trains agents to eventually optimize the true objective in games with sparse rewards while maintaining most of the sample efficiency that comes with reward shaping. We evaluate our approach in a simplified real-time strategy (RTS) game simulator called $\mu$RTS.

Details

Title: Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games
Creators: Shengyi Huang; Santiago Ontañón
Publication Details: arXiv (Cornell University)
Resource Type: Preprint
Language: English
Academic Unit: Computer Science (Computing)
Other Identifier: 991021869009804721
