Open Research Newcastle

A comparison study of cooperative Q-learning algorithms for independent learners

journal contribution
posted on 2025-05-10, 12:27, authored by Bilal H. Abed-Alguni, David J. Paul, Stephan Chalup, Frans Henskens
Cooperative reinforcement learning algorithms such as BEST-Q, AVE-Q, PSO-Q, and WSS use Q-value sharing strategies between reinforcement learners to accelerate the learning process. This paper presents a comparison study of the performance of these cooperative algorithms as well as an algorithm that aggregates their results. In addition, this paper studies the effect of the frequency of Q-value sharing on the learning speed of the independent learners that share their Q-values with each other. The algorithms are compared using the taxi problem (a multi-task problem) and different instances of the shortest path problem (a single-task problem). The experimental results when learners have equal levels of experience suggest that sharing Q-values is not beneficial and produces results similar to single-agent Q-learning. However, the experimental results when learners have different levels of experience suggest that most of the cooperative Q-learning algorithms perform similarly, but better than single-agent Q-learning, especially when Q-values are shared frequently. This paper then places Q-value sharing in the context of modern reinforcement learning techniques and suggests some future directions for research.
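
The sharing strategies named in the abstract can be illustrated with a minimal tabular sketch. The code below is not the paper's implementation: the table sizes, learning constants, sharing interval, and the exact BEST-Q/AVE-Q update formulas shown here are illustrative assumptions, included only to show how independent learners might periodically merge their Q-tables.

```python
import numpy as np

# Illustrative tabular Q-learning setup (hypothetical sizes and constants,
# not taken from the paper).
N_STATES, N_ACTIONS = 25, 4
ALPHA, GAMMA = 0.1, 0.95

def q_update(q, s, a, r, s_next):
    """Standard single-agent Q-learning update for one transition."""
    q[s, a] += ALPHA * (r + GAMMA * np.max(q[s_next]) - q[s, a])

def best_q_share(q_tables):
    """BEST-Q-style sharing (assumed form): every learner adopts the
    element-wise best Q-value found across all learners."""
    best = np.maximum.reduce(q_tables)
    return [best.copy() for _ in q_tables]

def ave_q_share(q_tables):
    """AVE-Q-style sharing (assumed form): each learner blends its own
    Q-values with the element-wise best Q-values."""
    best = np.maximum.reduce(q_tables)
    return [(q + best) / 2.0 for q in q_tables]

# Independent learners act and update on their own; every `share_interval`
# episodes their tables are merged. Varying `share_interval` corresponds to
# the sharing-frequency question studied in the paper.
q_tables = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(3)]
share_interval = 10  # illustrative value
```

A usage loop would run each learner's episodes independently, then call one of the sharing functions whenever the episode count is a multiple of `share_interval`; with a very large interval the scheme degenerates to plain single-agent Q-learning.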

History

Journal title

International Journal of Artificial Intelligence

Volume

14

Issue

1

Pagination

71-93

Publisher

Centre for Environment, Social and Economic Research Publications

Language

English

College/Research Centre

Faculty of Engineering and Built Environment

School

School of Electrical Engineering and Computer Science
