| MLSP-27: Reinforcement Learning 3 | 
| Session Type: Poster | 
| Time: Thursday, 10 June, 13:00 - 13:45 | 
| Location: Gather.Town | 
| Session Chair: Seung-Jun Kim, University of Maryland, Baltimore County | 
| MLSP-27.1: GAUSSIAN PROCESS TEMPORAL-DIFFERENCE LEARNING WITH SCALABILITY AND WORST-CASE PERFORMANCE GUARANTEES | 
| Qin Lu; University of Minnesota | 
| Georgios B. Giannakis; University of Minnesota | 
| MLSP-27.2: SELF-INFERENCE OF OTHERS' POLICIES FOR HOMOGENEOUS AGENTS IN COOPERATIVE MULTI-AGENT REINFORCEMENT LEARNING | 
| Qifeng Lin; Sun Yat-sen University | 
| Qing Ling; Sun Yat-sen University | 
| MLSP-27.3: SEMI-SUPERVISED BATCH ACTIVE LEARNING VIA BILEVEL OPTIMIZATION | 
| Zalán Borsos; ETH Zurich | 
| Marco Tagliasacchi; Google | 
| Andreas Krause; ETH Zurich | 
| MLSP-27.4: KERNEL-BASED LIFELONG POLICY GRADIENT REINFORCEMENT LEARNING | 
| Rami Mowakeaa; University of Maryland, Baltimore County | 
| Seung-Jun Kim; University of Maryland, Baltimore County | 
| Darren Emge; Combat Capabilities Development Command | 
| MLSP-27.5: POLICY AUGMENTATION: AN EXPLORATION STRATEGY FOR FASTER CONVERGENCE OF DEEP REINFORCEMENT LEARNING ALGORITHMS | 
| Arash Mahyari; Florida Institute for Human and Machine Cognition (IHMC) | 
| MLSP-27.6: GRAPHCOMM: A GRAPH NEURAL NETWORK BASED METHOD FOR MULTI-AGENT REINFORCEMENT LEARNING | 
| Siqi Shen; Xiamen University | 
| Yongquan Fu; National University of Defense Technology | 
| Huayou Su; National University of Defense Technology | 
| Hengyue Pan; National University of Defense Technology | 
| Qiao Peng; National University of Defense Technology | 
| Yong Dou; National University of Defense Technology | 
| Cheng Wang; Xiamen University |