Cooperative Multi-agent Reinforcement Learning on Sparse Reward Battlefield Environment using QMIX and RND in Ray RLlib (Ray RLlib 기반 QMIX와 RND를 이용한 희소 보상 전장 환경에서의 멀티에이전트 강화학습 협업)

Minkyoung Kim (김민경)

Journal of The Korea Society of Computer and Information
Abbr : JKSCI
2024, 29(1), pp.11~19
DOI : 10.9708/jksci.2024.29.01.011
Publisher : The Korean Society Of Computer And Information
Research Area : Engineering > Computer Science
Received : November 23, 2023
Accepted : December 28, 2023
Published : January 31, 2024

Minkyoung Kim ¹

¹Hanwha Systems (한화시스템)

Accredited

ABSTRACT

Multi-agent systems can be applied to a range of real-world cooperative settings, such as battlefield engagements and unmanned transport vehicles. In battlefield engagements, limited domain knowledge makes dense reward design difficult, so it is important to consider learning from explicit sparse rewards. This paper explores the potential for cooperation among allied agents in a battlefield scenario. Using the Multi-Robot Warehouse Environment (RWARE) as a sparse-reward environment, we define analogous problems and establish evaluation criteria. We build a learning environment around the QMIX algorithm from the reinforcement learning library Ray RLlib, enhance the Agent Network of QMIX, and integrate Random Network Distillation (RND). This enables the extraction of patterns and temporal features from agents' partial observations and confirms the potential of intrinsic rewards to improve the acquisition of sparse-reward experience.
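
The intrinsic-reward idea behind RND can be illustrated with a minimal NumPy sketch. This is not the paper's implementation and does not use the Ray RLlib API. The class `RNDSketch`, its parameters, and the one-hot observations are hypothetical and chosen only for illustration. A fixed, randomly initialised target network is imitated by a trainable predictor, and the predictor's error on an observation serves as the exploration bonus.

```python
import numpy as np


class RNDSketch:
    """Toy Random Network Distillation: fixed random target, trainable predictor."""

    def __init__(self, obs_dim, feat_dim=8, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Target network: randomly initialised and never updated.
        self.w_target = rng.normal(size=(obs_dim, feat_dim))
        # Predictor network (linear, for simplicity): trained to imitate the target.
        self.w_pred = np.zeros((obs_dim, feat_dim))
        self.lr = lr

    def _target(self, obs):
        return np.tanh(obs @ self.w_target)

    def intrinsic_reward(self, obs):
        """Per-observation prediction error; larger for unfamiliar observations."""
        err = obs @ self.w_pred - self._target(obs)
        return np.mean(err ** 2, axis=-1)

    def update(self, obs):
        """One gradient-descent step on the predictor's mean squared error."""
        err = obs @ self.w_pred - self._target(obs)
        grad = 2.0 * obs.T @ err / err.size
        self.w_pred -= self.lr * grad


rnd = RNDSketch(obs_dim=4)
seen = np.array([[1.0, 0.0, 0.0, 0.0]])   # observation visited repeatedly
novel = np.array([[0.0, 0.0, 0.0, 1.0]])  # observation never trained on

before = rnd.intrinsic_reward(seen)[0]
for _ in range(200):
    rnd.update(seen)
after = rnd.intrinsic_reward(seen)[0]
novel_bonus = rnd.intrinsic_reward(novel)[0]

print(f"seen before training: {before:.4f}")
print(f"seen after training:  {after:.4f}")
print(f"novel observation:    {novel_bonus:.4f}")
```

In this sketch the bonus for the repeatedly visited observation shrinks as the predictor is trained, while the unvisited observation keeps a larger bonus. In a sparse-reward setting, such a bonus would typically be scaled and added to the extrinsic reward; the scaling and the way it is combined with QMIX are left open here.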

KEYWORDS

Multi-agent Reinforcement Learning, Sparse Reward, Cooperative Battlefield Engagement, Ray RLlib, QMIX, RND

Citation status

[journal] C. G. Lee / 2022 / Multi-agent based Manned/unmanned Collaboration System for Combatant’s Battlefield Situation Awareness / Journal of the Institute of Electronics and Information Engineers 59(2) : 126~134

[confproc] U. H. Choi / 2021 / A Study on Multi-agent Deep Reinforcement Learning for Distributed Cooperative Multi-UAV Mission / Proc. of the KSAS 2021 Fall Conf : 958~959

[confproc] A. P. Pope / 2021 / Hierarchical Reinforcement Learning for Air-to-Air Combat / Proc. of the International Conference on Unmanned Aircraft System

[journal] M. H. Park / 2021 / The Specification of Air-to-Air Combat Tactics Using UML Sequence Diagram / Journal of the KIMST 24(6) : 664~675

[confproc] G. Papoudakis / 2021 / Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks / Proc. of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS)

[confproc] T. Rashid / 2018 / QMIX : Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning / Proc. of the 35th International Conference on Machine Learning

[web] Y. Burda / 2018 / Exploration by Random Network Distillation / 10.48550/arXiv.1810.12894

[web] Ray Project / Anyscale / https://github.com/ray-project/ray

[confproc] M. Samvelyan / 2019 / The StarCraft Multi-Agent Challenge / Proc. Deep Reinforcement Learning at the 33rd Conference on Neural Information Processing Systems (NeurIPS)

[confproc] M. Tan / 1993 / Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents / Proc. of the Tenth International Conference on Machine Learning : 330~337

[confproc] P. Sunehag / 2017 / Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward / Proc. of the 17th International Conference on Autonomous Agents and Multiagent Systems

[web] / Random network distillation / OpenAI / https://github.com/openai/random-network-distillation

[journal] H. J. Kim / 2022 / The Fault Diagnosis Model of Ship Fuel System Equipment Reflecting Time Dependency in Conv1D Algorithm Based on the Convolution Network / J. Navig. Port Res 46(4) : 367~374

Journal of The Korea Society of Computer and Information 2025 KCI Impact Factor : 1.01
