An Efficient Distributed Reinforcement Learning Architecture for Long-Haul Communication Between Actors and Learner

Shin Morishima, Hiroki Matsutani

Research output: Contribution to journal › Article › peer-review

Abstract

A computing cluster that interconnects multiple compute nodes is used to accelerate distributed reinforcement learning based on DQN (Deep Q-Network). In distributed reinforcement learning, actor nodes acquire experiences by interacting with a given environment, while a learner node optimizes the DQN model. When distributed reinforcement learning is used in practical applications such as robotics, we can assume that the actor nodes are located on the edge side while the learner node is located on the cloud side. In this case, the long-haul communication between them imposes significant communication overheads. However, most prior works simply assume that the actors and learner are located close together and do not take these overheads into account. In this paper, we focus on a practical environment in which the actors and learner are located remotely and interact via a buffer node that collects information from multiple actor nodes. We implement a prototype system in which the buffer and learner nodes are connected via a 25GbE (Gigabit Ethernet) switch and a 10km optical fiber cable. Although the replay memory functionality is closely associated with the learner side, in this paper we propose to integrate the replay memory into the buffer node. In experiments using the prototype system, the proposed approach is compared with an existing approach in terms of training efficiency (i.e., training loss) and transfer efficiency over the long-haul communication (i.e., average priority of transferred experiences). As a result, the training loss of the proposed approach is reduced to 26% of that of the existing approach, and the average priority is 3.92 times higher than that of the existing approach after the training loss has converged. These results demonstrate that the proposed approach can improve the training/communication efficiency compared with the existing approach in a practical system that imposes long-haul communication between the actors and learner.
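The core idea above — placing a prioritized replay memory on the buffer node so that only high-priority experiences cross the long-haul link to the learner — can be illustrated with a minimal sketch. All class and method names here are assumptions for illustration, not the paper's actual implementation, and the priority scheme is a simplified stand-in for prioritized experience replay (e.g., TD-error-based priorities).

```python
import random

class BufferNodeReplayMemory:
    """Illustrative sketch: a replay memory co-located with the buffer node.

    Actors push experiences with priorities; the learner pulls prioritized
    batches, so low-priority experiences never traverse the long-haul link.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.storage = []  # list of (priority, experience) pairs

    def add(self, experience, priority):
        # Actors push experiences with an initial priority (e.g., TD error).
        self.storage.append((priority, experience))
        if len(self.storage) > self.capacity:
            # Evict the lowest-priority experience when full.
            self.storage.sort(key=lambda pair: pair[0])
            self.storage.pop(0)

    def sample(self, batch_size):
        # Prioritized sampling: selection probability proportional to priority.
        priorities = [p for p, _ in self.storage]
        batch = random.choices(self.storage, weights=priorities, k=batch_size)
        return [exp for _, exp in batch]

# Example: actors fill the buffer; the learner requests a prioritized batch.
memory = BufferNodeReplayMemory(capacity=4)
for i, prio in enumerate([0.1, 2.0, 0.5, 3.0, 0.05]):
    memory.add(f"exp{i}", prio)
batch = memory.sample(batch_size=2)
```

In this arrangement the buffer node, rather than the learner, performs prioritization and eviction, which is one way the average priority of experiences transferred over the long-haul link can be raised.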

Original language: English
Pages (from-to): 71479-71491
Number of pages: 13
Journal: IEEE Access
Volume: 12
Publication status: Published - 2024

Keywords

  • Distributed deep reinforcement learning
  • deep Q-network
  • prioritized experience replay

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering
