2021 IEEE International Conference on Acoustics, Speech and Signal Processing

Technical Program

Paper ID	SPCOM-9.4
Paper Title	ON INFORMATION ASYMMETRY IN ONLINE REINFORCEMENT LEARNING
Authors	Ezra Tampubolon, Haris Ceribasic, Holger Boche, Technical University of Munich, Germany
Session	SPCOM-9: Online and Active Learning for Communications
Location	Gather.Town
Session Time	Friday, 11 June, 14:00 - 14:45
Presentation Time	Friday, 11 June, 14:00 - 14:45
Presentation	Poster
Topic	Signal Processing for Communications and Networking: [SPCN-NETW] Networks and Network Resource allocation
Abstract	In this work, we study a system of two interacting non-cooperative Q-learning agents, in which one agent has the privilege of observing the other's actions. We show that this information asymmetry can lead to a stable outcome of population learning, which does not generally occur in an environment of independent learners. Furthermore, we discuss the resulting post-learning policies, show that they are almost optimal in the sense of the underlying game, and provide numerical indications that the resulting policies are almost welfare-optimal.
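
The setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or experimental setup: the 2x2 payoff matrices, the stateless (bandit-style, zero-discount) Q-update, the epsilon-greedy exploration, and all names (`PAYOFF_A`, `PAYOFF_B`, `epsilon_greedy`, `train`) are assumptions chosen only to show what "one agent observes the other's actions" might look like in code. The uninformed agent keeps one Q-value per own action, while the informed agent indexes its Q-table by the observed action of the other agent.

```python
import random

# Hypothetical 2x2 payoffs (assumption, not from the paper).
# Rows: agent A's action, columns: agent B's action.
PAYOFF_A = [[3.0, 0.0], [5.0, 1.0]]
PAYOFF_B = [[3.0, 5.0], [0.0, 1.0]]


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def train(steps=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Run two stateless Q-learners with one-sided action observation."""
    rng = random.Random(seed)
    q_a = [0.0, 0.0]                # uninformed agent: one value per own action
    q_b = [[0.0, 0.0], [0.0, 0.0]]  # informed agent: indexed by A's action
    for _ in range(steps):
        a = epsilon_greedy(q_a, epsilon, rng)
        # Information asymmetry: B sees A's action before choosing its own.
        b = epsilon_greedy(q_b[a], epsilon, rng)
        # Zero-discount Q-update: move each estimate toward the received payoff.
        q_a[a] += alpha * (PAYOFF_A[a][b] - q_a[a])
        q_b[a][b] += alpha * (PAYOFF_B[a][b] - q_b[a][b])
    return q_a, q_b


if __name__ == "__main__":
    q_a, q_b = train()
    print("Q-values of uninformed agent A:", q_a)
    print("Q-values of informed agent B (per observed action of A):", q_b)
```

Running the sketch with different seeds and payoff matrices gives a rough feel for how conditioning on the other agent's action changes what the informed learner can represent; it makes no claim about the stability or optimality results stated in the abstract.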