Memory-Based Reinforcement Learning
Date: 1.30pm – 4.30pm (UTC/GMT+8, AWST) on Monday, 05 December 2022
Location: Hyatt Regency Perth, Perth, WA, Australia.
Virtual: Microsoft Teams
Reinforcement learning (RL) is a branch of artificial intelligence wherein autonomous agents learn to maximise predefined rewards from the environment. Despite immense successes in surpassing human records, training RL agents remains prohibitively expensive in terms of time, computing resources, and samples. For example, it can take trillions of playing sessions for an agent to reach human-level performance on simple video games. Sample inefficiency is exacerbated in stochastic, partially observable, noisy or long-horizon real-world environments, where humans perform well without much training. This shortcoming can be attributed to RL agents' lack of efficient, human-like memory mechanisms that accelerate learning by smartly reusing past observations.
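For concreteness, the quantity such agents maximise is usually formalised as the expected discounted return; this is the standard textbook objective rather than anything specific to this tutorial:

$$
J(\pi) = \mathbb{E}_{\tau \sim \pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_t \right], \qquad 0 \le \gamma < 1,
$$

where $\pi$ is the agent's policy, $\tau$ is a trajectory generated by acting in the environment, $r_t$ is the reward at step $t$, and $\gamma$ is the discount factor.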
This tutorial presents recent advances in memory-based reinforcement learning, where emerging memory systems enable sample-efficient, adaptive and human-like RL agents. The first part of the tutorial covers the basics of RL and raises the sample-inefficiency issue. The second part presents a taxonomy of the memory mechanisms that recent sample-efficient RL methods employ to reduce the number of training samples and to resemble human memory. The subsequent four sections study the benefits that memory can provide to RL agents, categorised as (1) quick access to critical experiences; (2) better representation of observation contexts; (3) intrinsic motivation to explore; and (4) optimisation. Finally, the tutorial concludes with a discussion of open challenges and promising future research on memory-based RL.
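As a concrete illustration of benefit (1), below is a minimal sketch of a uniform experience replay buffer, the simplest memory-as-experiences mechanism used by DQN-style agents; the class and parameter names are illustrative, not taken from the tutorial materials.

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal uniform experience replay: past transitions are stored and
    reused many times, instead of being discarded after a single update."""

    def __init__(self, capacity=100_000):
        # A bounded deque evicts the oldest transitions once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform sampling; prioritised replay instead weights "critical"
        # experiences (e.g. those with high TD-error) more heavily.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

Replacing the uniform `sample` with a priority-weighted one is, roughly, what turns this sketch into prioritised experience replay.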
| Time | Topic |
|---|---|
| 13:30 – 13:40 | Introduction and Background |
| 13:40 – 13:50 | Taxonomy of Memory in RL |
| 13:50 – 14:10 | Memory as Experiences |
| 14:10 – 14:30 | Memory for Better Context |
| 14:30 – 14:50 | Q&A and Break |
| 14:50 – 15:10 | Memory in Exploration |
| 15:10 – 15:30 | Memory for Optimisation |
| 15:30 – 15:50 | Demo |
| 15:50 – 16:30 | Conclusion and Q&A |
Hung Le is a research lecturer at Deakin University, Australia. He is a member of the Applied Artificial Intelligence Institute (A2I2), where he works on various topics in machine learning, deep learning and artificial memory. In particular, Hung is keen to invent new deep models with access to artificial neural memory, and has created a body of work advancing this area, including multi-modal and generative memory, theoretical foundations for memory operations, general-purpose neural computers and memory-based reinforcement learning agents. Applications include healthcare, dialogue systems, reinforcement learning, machine reasoning and natural language processing. He publishes regularly in top ML/RL/AI venues such as ICLR, NeurIPS, ICML, AAAI, KDD, NAACL, ECCV, AAMAS, ICPR, ICONIP and PAKDD. He obtained a Bachelor of Engineering (Honours) from Hanoi University of Science and Technology and a PhD in Computer Science from Deakin University, in 2015 and 2020, respectively.
Benchmark environments (example instantiations are sketched below):
- Toy environments: Classic Control
- Discrete-action: Atari games
- Continuous-action: MuJoCo
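A minimal sketch of how these three environment families can be instantiated with the OpenAI Gym API; it assumes a pre-0.26 Gym plus the Atari and MuJoCo extras are installed, and the environment IDs are just representative examples of each family.

```python
import gym

# Toy / Classic Control: low-dimensional states, no extras needed.
cartpole = gym.make("CartPole-v1")

# Discrete-action Atari: pixel observations (requires Gym's atari extras).
pong = gym.make("PongNoFrameskip-v4")

# Continuous-action MuJoCo: torque control (requires the mujoco extras).
cheetah = gym.make("HalfCheetah-v2")

# Random-agent rollout on CartPole using the classic (pre-0.26) Gym API;
# newer Gym/Gymnasium returns (obs, info) from reset() and a 5-tuple from step().
obs = cartpole.reset()
for _ in range(100):
    obs, reward, done, info = cartpole.step(cartpole.action_space.sample())
    if done:
        obs = cartpole.reset()
```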