Distributed Deep Reinforcement Learning (DRL) aims to leverage more
computational resources to train autonomous agents in less training time.
Despite recent progress in the field, reproducibility issues have not been
sufficiently explored. This paper first shows that the typical actor-learner
framework can have reproducibility issues even if hyperparameters are
controlled. We then introduce Cleanba, a new open-source platform for
distributed DRL that proposes a highly reproducible architecture. Cleanba
implements highly optimized distributed variants of PPO and IMPALA. Our Atari
experiments show that these variants can obtain equivalent or higher scores
than strong IMPALA baselines in moolib and torchbeast, as well as a strong
PPO baseline in CleanRL, while exhibiting 1) shorter training times and 2) more
reproducible learning curves across different hardware settings. Cleanba's
source code is available at \url{https://github.com/vwxyzjn/cleanba}.
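To illustrate the reproducibility issue the abstract refers to, here is a minimal sketch (not Cleanba's actual code, and the function and variable names are hypothetical): in a typical asynchronous actor-learner loop, the learner consumes rollouts in whatever order actors deliver them, so if actor speeds vary across hardware, batch composition differs between runs even with identical hyperparameters and data.

```python
# Hypothetical sketch of why an asynchronous actor-learner setup can be
# irreproducible: the learner batches rollouts in arrival order, and
# arrival order depends on hardware timing, not on hyperparameters.

def learner_batches(arrival_order, rollouts, batch_size=2):
    """Group rollouts into learner batches in the order actors deliver them."""
    batches, buf = [], []
    for actor_id in arrival_order:
        buf.append(rollouts[actor_id])
        if len(buf) == batch_size:
            batches.append(tuple(buf))
            buf = []
    return batches

rollouts = {0: "r0", 1: "r1", 2: "r2", 3: "r3"}

# Same rollout data, same hyperparameters -- only the (hardware-dependent)
# arrival order differs between the two simulated runs:
run_a = learner_batches([0, 1, 2, 3], rollouts)
run_b = learner_batches([1, 0, 3, 2], rollouts)

# Different batch compositions imply a different sequence of gradient
# updates, hence diverging learning curves across hardware settings.
assert run_a != run_b
```

A synchronous design, by contrast, fixes the mapping from rollouts to batches ahead of time, which is the kind of architectural choice that makes learning curves reproducible across hardware.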
Details
Title
Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform
Creators
Shengyi Huang
Jiayi Weng
Rujikorn Charakorn
Min Lin
Zhongwen Xu
Santiago Ontañón
Publication Details
arXiv.org
Resource Type
Preprint
Language
English
Academic Unit
Computer Science (Computing)
Other Identifier
991021869112604721