Sitemap

A list of all the posts and pages found on the site. For you robots out there is an XML version available for digesting as well.

Page Not Found

Page not found. Your pixels are in another canvas.

Jupyter notebook markdown generator

Posts

Reason on the Fly: How RL Boosts LLM Reasoning On the Spot

less than 1 minute read

Published: June 04, 2025

In our last post, we warmed up with why reinforcement learning (RL) is a powerful paradigm for building smarter AI reasoners. Today, we zoom in on an exciting approach: using RL at inference time to improve large language model (LLM) reasoning on the spot. In particular, we explore ways to inject real-time reasoning into static LLMs. Let’s break down how RL can transform a frozen LLM into a more dynamic, reasoned thinker at runtime. Read more

Think Before You Speak: Reinforcement Learning for LLM Reasoning

1 minute read

Published: May 21, 2025

Large Language Models (LLMs) have shown remarkable capabilities across a range of natural language tasks. Yet, when you give them a problem that needs a bit of careful thinking, like a tricky math question or understanding a complicated document, suddenly they can stumble. It’s like they can talk the talk, but when it comes to really putting things together step-by-step, they can get lost. Read more

The Best of Time-Series Forecasting (Part II): Advancements in Time Series Modeling Through Large Language Models

less than 1 minute read

Published: April 09, 2025

Part 1 of my blog looked at how time-series forecasting has evolved—from traditional models like ARIMA to deep learning methods like Transformers. These approaches brought big improvements, especially in handling complex and long-range patterns. However, they also have limits, especially when it comes to adapting to new data or working well across very different domains. Read more

The Best of Time-Series Forecasting (Part I): From Seasonal Patterns to Transformer Models

less than 1 minute read

Published: March 11, 2025

From finance to healthcare, energy, and climate science, time-series forecasting is a cornerstone of critical decision-making: Read more

Rethinking Memory: A Unified Linear Approach for Mindful Agents

less than 1 minute read

Published: January 22, 2025

In reinforcement learning (RL), memory isn’t just a bonus—it’s a necessity. When agents operate in environments where they can’t directly see everything they need (think navigating a maze), they must rely on memory to make decisions. This is where things get tricky: most current memory models fail under the weight of complex, long-term tasks where agents must selectively retain and erase memories based on relevance. Read more

Many Hands Make Light Work: Leveraging Collective Intelligence to Align Large Language Models

1 minute read

Published: December 16, 2024

Multi-Reference Preference Optimization (MRPO) for Large Language Models (AAAI 2025) Read more

Memory-Augmented Large Language Models

1 minute read

Published: December 03, 2024

Why and How Memory Matters for LLMs? Read more

Extending Neural Networks to New Lengths: Enhancing Symbol Processing and Generalization

1 minute read

Published: October 23, 2024

Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory Read more

XLSTM vs LSTM: How the new LSTM Scale Sequence Prediction without Attention?

less than 1 minute read

Published: October 11, 2024

xLSTM: Extended Long Short-Term Memory Read more

Uncertainty, Confidence, and Hallucination in Large Language Models

1 minute read

Published: July 23, 2024

How to Spot When Your Large Language Model is Misleading You Read more

Cheap Large Language Models via Eliminating Matrix Multiplications

less than 1 minute read

Published: July 11, 2024

Scalable MatMul-free Language Modeling Read more

The Mamba Effect: State Space Models Taking on Transformers

less than 1 minute read

Published: July 05, 2024

Mamba: Linear-Time Sequence Modeling with Selective State Spaces Read more

Curious Agents Saga: Part 3, Beyond Surprise: Direct and Causal Exploration in Deep Reinforcement Learning

1 minute read

Published: June 27, 2024

Beyond Surprise: Direct and Causal Exploration in Deep Reinforcement Learning Read more

Human-Aligned Large Language Models

1 minute read

Published: February 19, 2024

About recent LLM alignment finetuning techniques such as RLHF, DPO, KTO, IPO and SPIN Read more

Curious Agents Saga (Part 2), Novelty or Surprise: How to Make Your Deep Reinforcement Learning Agents Curious?

less than 1 minute read

Published: February 03, 2024

Novelty or Surprise: How to Make Your Deep Reinforcement Learning Agents Curious? Read more

Curious Agents Saga (Part 1), Legacy of Exploration in Reinforcement Learning

less than 1 minute read

Published: January 25, 2024

Legacy of Exploration in Reinforcement Learning Read more

Complex System Forecasting with Expert Knowledge

3 minute read

Published: September 11, 2023

Why complex time-series system?

Read more

Memory in Reinforcement Learning: Overview

3 minute read

Published: September 19, 2022

Memory is just storage. Whenever computation needs to store interim results, it must ask for memory. This fundamental principle applies to any scenario where memory is required, yet a closer interpretation of memory’s role in each domain reveals a different understanding of its functionality and benefit. Read more

Program Memory: Method (part 2)

4 minute read

Published: July 12, 2022

When a human programmer codes, he often uses core libraries to construct his programs. Most of the time, the program memory stores these static libraries, and let big programs be created dynamically during computation. The libraries are unitary components constructing bigger programs. Maintaining small and functionally independent sub-programs such as libraries encourage program utilisations since an immense program must refer to different libraries to complete its task. Indeed, it also eliminates redundancies as the stored programs-the core libraries are not overlapping each other. Read more

Program Memory: Method (part 1)

6 minute read

Published: May 31, 2022

A neural network uses its weight to compute inputs and return outputs as computation results. Hence, the weight can be viewed as the neural network’s program. If we maintain a program memory of different weights responsible for various computation functions, we have a neural Universal Turing Machine. Obvious scenarios where a Program Memory may help: Read more

Program Memory: Introduction

4 minute read

Published: November 24, 2021

Memory-augmented neural networks (MANNs) store data in their external memory, resembling Turning Machines. Despite being theoretically Turing-Complete, MANNs cannot be trained flexibly to solve any task due to the lack of program memory. Without storing programs, it is hard to perform complicated tasks such as simulating recursive functional calls or implementing divide-and-conquer algorithms. As long as programs are not treated as data, the computing capability of neural networks is still limited. Read more

Multi-memory Architecture

1 minute read

Published: November 13, 2021

Imagine this, you only have short-term memory. You can only remember what happen during the day, and when you wake up, your mind refreshes. Without long-term memory, you cannot remember even your birthday, your last month’s payment or where had you been last week. To survive, you must note down every thing, and re-learn these facts every morning. That would be so inconvenient for dementia patients who suffer this kind of disease. In the same vein, if your mind only revolves around System 2 (slow and sophisticated), you will fail to think quickly and intuitively, and process life events effortfully. No matter how simple System 1, it is required and stands apart from System 2. It seems like a good model of memory should treat the memory as a bunch of modules, representing different functions and collaborating to deliver the desired outcome. Inspired by this observations, I wrote couple of papers based on multi-memory systems that analyze various aspects of memory functions such as item-relational storage, view/channel fusion, information encoder-decoder. Read more

Memory for Memorization and Reasoning

less than 1 minute read

Published: November 13, 2021

Check our papers: Read more

Deep Generative Memory

less than 1 minute read

Published: November 13, 2021

Check our papers: Read more

Human-AI Symbiosis

1 minute read

Published: November 13, 2021

AI should be our partner, not competitor

Read more

Lean Reinforcement Learning

1 minute read

Published: November 11, 2021

Despite huge successes in breaking human records, current training of RL agents is prohibitively expensive in terms of time, GPUs, and samples. For example, it requires hundreds of millions or even billions of environment steps to reach human-level performance on Atari games-a common benchmark in modern RL. That is only doable with simulation, not real-world problems like robotics or industrial planning. The problem of sample-inefficiency is exacerbated in real environments, which can be stochastic, partially observable, noisy or long-term. Another issue is model complexity. RL algorithms are getting more complicated, coupled with numerous hyperparameters that need to be tuned carefully. That again accelerates the cost of training RL agents. Read more

Neural Memory Architecture

2 minute read

Published: November 10, 2021

Memory is the core of intelligence. Thanks to memory, human can effortlessly recognize objects, recall past events, plan the future, explain surrounding environments and reason from facts. From cognitive perspective, memory can take many forms and functionalities (see figure below). Read more

grants

Elucidating Human Brain Connectivity Through Deep Learning and Network Analysis

Published: January 08, 2021

Mini ARC Analog Programme (MAAP)
Amount: $254,200 AUD
Duration: 2021-2023

Complex Time-series System Forecasting Reinforced by Expert Knowledge

Published: August 25, 2024

Discovery Early Career Researcher Award (DECRA)
Amount: $428,331 AUD
Duration: 2025-2028

Multi-Agent Collaborative Language Model Alignment

Published: December 21, 2024

Cohere For AI Research Grant
Amount: $10,000 USD
Duration: 2025-2026

Complex Time-series System Forecasting Reinforced by Expert Knowledge

Published: January 03, 2025

Deakin DECRA Project Support Grant
Amount: $413,712 AUD
Duration: 2025-2028

lectures

My Random Notes

less than 1 minute read

Published: February 15, 2022

Just some of my incomplete ideas: Read more

Variational Inference in Generative Models

less than 1 minute read

Published: September 15, 2022

Variational Inference (VI) first starts as a handy tool in Bayesian inference to approximate intractable posterior. Now, its usage goes beyond Bayesian inference, and we can see VI everywhere in classic learning, and deep learning-anywhere needs an approximation. After this lecture, we should: Read more

preprints

Memory-Augmented Neural Networks for Predictive Process Analytics

Published: March 02, 2019

Authors: Asjad Khan*, Hung Le*, Kien Do, Truyen Tran, Aditya Ghose, Hoa Dam, Renuka Sindhgatta.
Link

Improving Long Handwritten Text Line Recognition with Convolutional Multi-way Associative Memory

Published: November 05, 2019

Authors: Duc Nguyen, Nhan Tran, Hung Le.
Link

Plug and Play, Model-Based Reinforcement Learning

Published: August 20, 2021

Authors: Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta, Santu Rana, Svetha Venkatesh.
Link

publications

Dual Control Memory Augmented Neural Networks for Treatment Recommendations

Published/Accepted at PAKDD, 2018

Authors: Hung Le, Truyen Tran, Svetha Venkatesh
Code•PDF•Link

Dual Memory Neural Computer for Asynchronous Two-view Sequential Learning

Published/Accepted at KDD, 2018

Authors: Hung Le, Truyen Tran, Svetha Venkatesh
Code•Poster•PDF•Link

Variational Memory Encoder-Decoder

Published/Accepted at NeurIPS, 2018

Authors: Hung Le, Truyen Tran, Thin Nguyen, Svetha Venkatesh
Code•Poster•PDF•Link

Learning to Remember More with Less Memorization

Published/Accepted at ICLR (Oral), 2019

Authors: Hung Le, Truyen Tran, Svetha Venkatesh
Code•Poster•Blog•PDF•Link

Neural Stored-Program Memory

Published/Accepted at ICLR, 2020

Authors: Hung Le, Truyen Tran, Svetha Venkatesh
Code•Talk•Slides•PDF•Link

LODENet: A Holistic Approach to Offline Handwritten Chinese and Japanese Text Line Recognition

Published/Accepted at ICPR, 2020

Authors: Huu Tin Hoang, Chun-Jen Peng, Hung Tran, Hung Le, Huy Hoang Nguyen.
Link

Self-attentive Associative Memory

Published/Accepted at ICML, 2020

Authors: Hung Le, Truyen Tran, Svetha Venkatesh
Code•Talk•Slides•PDF•Link

A New Representation of Successor Features for Transfer across Dissimilar Environments

Published/Accepted at ICML (Spotlight), 2021

Authors: Majid Abdolshah, Hung Le, Thommen George Karimpanal, Sunil Gupta, Santu Rana, Svetha Venkatesh.
Link

DeepProcess: Supporting Business Process Execution Using a MANN-based Recommender System

Published/Accepted at ICSOC, 2021

Authors: Asjad Khan, Aditya Ghose, Hoa Dam, Hung Le, Truyen Tran, Kien Do.
Link

From Deep Learning to Deep Reasoning

Published/Accepted at KDD (Tutorial), 2021

Authors: Truyen Tran, Vuong Le, Hung Le, Thao M Le
Link

Robust Deep Reinforcement Learning for Extractive Legal Summarization

Published/Accepted at ICONIP, 2021

Authors: Duy-Hung Nguyen, Bao-Sinh Nguyen, Nguyen Viet Dung Nghiem, Dung Tien Le, Mim Amina Khatun, Minh-Tien Nguyen, Hung Le.
Link

Model-Based Episodic Memory Induces Dynamic Hybrid Controls

Published/Accepted at NeurIPS, 2021

Authors: Hung Le, Thommen Karimpanal George, Majid Abdolshah, Truyen Tran, Svetha Venkatesh
Code•Talk•Poster•Slides•PDF•Link

Episodic Policy Gradient Training

Published/Accepted at AAAI (Oral), 2022

Authors: Hung Le, Thommen Karimpanal George, Majid Abdolshah, Kien Do, Dung Nguyen, Svetha Venkatesh
Code•Talk•Slides•PDF•Link

Learning Theory of Mind via Dynamic Traits Attribution

Published/Accepted at AAMAS, 2022

Authors: Dung Nguyen, Phuoc Nguyen, Hung Le, Kien Do, Truyen Tran, Svetha Venkatesh
Link

Generative Pseudo-Inverse Memory

Published/Accepted at ICLR, 2022

Authors: Kha Pham, Hung Le, Man Ngo, Truyen Tran, Bao Ho, Svetha Venkatesh
Link

Make The Most of Prior Data: A Solution for Interactive Text Summarization with Preference Feedback

Published/Accepted at NAACL-Findings, 2022

Authors: Duy-Hung Nguyen, Nguyen Viet Dung Nghiem, Bao-Sinh Nguyen, Tien Dung Le, Minh-Tien Nguyen, Shahab Sabahi, Hung Le
Link

Neurocoder: General-Purpose Computation Using Stored Neural Programs

Published/Accepted at ICML (Spotlight), 2022

Authors: Hung Le, Svetha Venkatesh.
Code•PDF•Link

Towards Effective and Robust Neural Trojan Defenses via Input Filtering

Published/Accepted at ECCV, 2022

Authors: Kien Do, Haripriya Harikumar, Hung Le, Dung Nguyen, Truyen Tran, Santu Rana, Dang Nguyen, Willy Susilo, Svetha Venkatesh
Link

HYCEDIS: HYbrid Confidence Engine for Deep Document Intelligence System

Published/Accepted at ICONIP (Oral), 2022

Authors: Bao-Sinh Nguyen, Quang-Bach Tran, Tuan-Anh Nguyen Dang, Duc Nguyen, Hung Le.
Link

Improving Document Image Understanding with Reinforcement Finetuning

Published/Accepted at ICONIP (Oral), 2022

Authors: Bao-Sinh Nguyen, Dung Tien Le, Hieu M. Vu, Tuan-Anh D. Nguyen, Minh-Tien Nguyen, Hung Le
Link

Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation

Published/Accepted at NeurIPS, 2022

Authors: Kien Do, Hung Le, Dung Nguyen, Dang Nguyen, HARIPRIYA HARIKUMAR, Truyen Tran, Santu Rana, Svetha Venkatesh
Link

Functional Indirection Neural Estimator for Better Out-of-distribution Generalization

Published/Accepted at NeurIPS, 2022

Authors: Kha Pham, Hung Le, Man Ngo, Truyen Tran
Link

Learning to Constrain Policy Optimization with Virtual Trust Region

Published/Accepted at NeurIPS (Spotlight), 2022

Authors: Hung Le, Thommen Karimpanal George, Majid Abdolshah, Dung Nguyen, Kien Do, Sunil Gupta, Svetha Venkatesh
PDF•Link

Memory-Augmented Theory of Mind Network

Published/Accepted at AAAI, 2023

Authors: Dung Nguyen, Phuoc Nguyen, Hung Le, Kien Do, Svetha Venkatesh, Truyen Tran
Link

Improving Out-of-distribution Generalization with Indirection Representations

Published/Accepted at ICLR, 2023

Authors: Kha Pham, Hung Le, Man Ngo, Truyen Tran
Link

Social Motivation for Modelling Other Agents under Partial Observability in Decentralised Training

Published/Accepted at IJCAI, 2023

Authors: Dung Nguyen, Hung Le, Kien Do, Svetha Venkatesh, Truyen Tran
Link

The Application of Machine Learning in Micrometeoroid and Orbital Debris Impact Protection and Risk Assessment for Spacecraft

Published/Accepted at International Journal of Impact Engineering, 2023

Authors: Shannon Ryan, Neeraj Mohan Sushma, Hung Le, Arun Kumar A V, Santu Rana, Sevvandi Kandanaarachchi, Svetha Venkatesh
Link

Improving Domain Generalization with Interpolation Robustness

Published/Accepted at ACML, 2023

Authors: Ragja Palakkadavath, Thanh Nguyen-Tang, Hung Le, Svetha Venkatesh, Sunil Gupta
Link

Balanced Q-learning: Combining the Influence of Optimistic and Pessimistic Targets

Published/Accepted at Artificial Intelligence, 2023

Authors: Thommen George Karimpanal, Hung Le, Majid Abdolshah, Santu Rana, Sunil Gupta, Truyen Tran, Svetha Venkatesh.
Link

Universal Graph Continual Learning

Published/Accepted at Transactions on Machine Learning Research (TMLR) , 2023

Authors: Thanh Duc Hoang, Do Viet Tung, Duy-Hung Nguyen, Bao-Sinh Nguyen, Huy Hoang Nguyen, and Hung Le.
Link

Beyond Surprise: Improving Exploration Through Surprise Novelty

Published/Accepted at AAMAS (Oral), 2024

Authors: Hung Le, Kien Do, Dung Nguyen and Svetha Venkatesh
Code•PDF•Link

Diversifying Training Pool Predictability for Zero-shot Coordination: A Theory of Mind Approach

Published/Accepted at IJCAI, 2024

Authors: Dung Nguyen, Hung Le, Kien Do, Sunil Gupta, Svetha Venkatesh and Truyen Tran
Link

Variable-Agnostic Causal Exploration for Reinforcement Learning

Published/Accepted at ECML-PKDD, 2024

Authors: Minh Hoang Nguyen, Hung Le, Svetha Venkatesh
Code•Link

VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction

Published/Accepted at ISSTA, 2024

Authors: Thanh-Dat Nguyen, Tung Do-Viet, Hung Nguyen-Duy, Tuan-Hai Luu, Hung Le, Bach Le, Patanamon (Pick) Thongtanunam
Link

Revisiting the Dataset Bias Problem from a Statistical Perspective

Published/Accepted at ECAI, 2024

Authors: Kien Do, Dung Nguyen, Hung Le, Thao Le, Dang Nguyen, Haripriya Harikumar, Truyen Tran, Santu Rana and Svetha Venkatesh
Link

Large Language Model Prompting With Episodic Memory

Published/Accepted at ECAI (Oral), 2024

Authors: Van Dai Do, Quan Tran, Svetha Venkatesh and Hung Le
PDF•Link

Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory

Published/Accepted at Transactions on Machine Learning Research (TMLR) , 2024

Authors: Hung Le, Dung Nguyen, Kien Do, Svetha Venkatesh, Truyen Tran
Link•Code

Fair Domain Generalization with Heterogeneous Sensitive Attributes Across Domains

Published/Accepted at WACV, 2024

Authors: Ragja Palakkadavath, Hung Le, Thanh Nguyen-Tang, Svetha Venkatesh, Sunil Gupta
Link

Multi-Reference Preference Optimization for Large Language Models

Published/Accepted at AAAI, 2025

Authors: Hung Le, Quan Hung Tran, Dung Nguyen, Kien Do, Saloni Mittal, Kelechi Ogueji, Svetha Venkatesh
Link•Code•Blog

Navigating Social Dilemmas with LLM-based Agents via Consideration of Future Consequences

Published/Accepted at AAMAS (Extended Abstract), 2025

Authors: Dung Nguyen, Hung Le, Kien Do, Sunil Gupta, Svetha Venkatesh, Truyen Tran
Link

SimSMoE: Toward Efficient Training Mixture of Experts via Solving Representational Collapse

Published/Accepted at NAACL-Findings, 2025

Authors: Giang Do, Hung Le, Truyen Tran
Link

Rapid Selection and Ordering of In-Context Demonstrations via Prompt Embedding Clustering

Published/Accepted at ICLR, 2025

Authors: Kha Pham, Hung Le, Man Ngo, Truyen Tran
Link

Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning

Published/Accepted at ICLR, 2025

Authors: Hung Le, Dung Nguyen, Kien Do, Sunil Gupta, Svetha Venkatesh
Link•Code•Blog

SuperRAG: Beyond RAG with Layout-Aware Graph Modeling

Published/Accepted at NAACL (Industry Track), 2025

Authors: Chening Yang, Duy-Khanh Vu, Minh-Tien Nguyen, Xuan-Quang Nguyen, Linh Nguyen, Hung Le
Link

Automatic Prompt Selection for Large Language Models

Published/Accepted at PAKDD, 2025

Authors: Viet Tung Do, Xuan-Quang Nguyen, Van Khanh Hoang, Duy-Hung Nguyen, Shahab Sabahi, Chening Yang, Hajime Hotta, Minh-Tien Nguyen, Hung Le
Link

Accelerated Experimental Design Using a Human-Ai Teaming Framework

Published/Accepted at Knowledge-Based Systems, 2025

Authors: Arun Kumar Anjanapura Venkatesh, Alistair Shilton, Sunil Gupta, Shannon Ryan, Majid Abdolshah, Hung Le, Santu Rana, Julian Berk, Mahad Rashid, Svetha Venkatesh
Link

Lagr-seq: Language-guided Reinforcement Learning with Sample-Efficient Querying

Published/Accepted at Neural Computing and Applications, 2025

Authors: Thommen George Karimpanal, Laknath Buddhika Semage, Santu Rana, Hung Le, Truyen Tran, Sunil Gupta, Svetha Venkatesh
Link

Navigating Social Dilemmas with LLM-based Agents via Consideration of Future Consequences

Published/Accepted at IJCAI, 2025

Authors: Dung Nguyen, Hung Le, Kien Do, Sunil Gupta, Svetha Venkatesh, Truyen Tran
Link

Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer

Published/Accepted at IJCAI, 2025

Authors: Minh Hoang Nguyen, Linh Le Pham Van, Thommen George Karimpanal, Sunil Gupta, Hung Le
Link

ALIAS: DAG Learning with Efficient Unconstrained Policies

Published/Accepted at Transactions on Machine Learning Research (TMLR) , 2025

Authors: Bao Duong, Hung Le, Biwei Huang, Thin Nguyen
Link

Physics-Informed Machine Learning for Predicting the Ballistic Limit of Whipple Shields

Published/Accepted at International Journal of Impact Engineering, 2025

Authors: Shannon Ryan, Hung Le, Julian Berk, AV Arun Kumar, Svetha Venkatesh
Link

Dynamic Steering With Episodic Memory For Large Language Models

Published/Accepted at ACL-Findings, 2025

Authors: Van Dai Do, Quan Hung Tran, Svetha Venkatesh, and Hung Le
Link

Navigation in Brain Networks Is Possible Using Only Local Information

Published/Accepted at Scientific Reports, 2025

Authors: Alex Hocking, Lena Vogelsang, Frank Jiang, Hung Le, Kerri Morgan, Nicholas Parsons, Govinda Poudel, Sergiy Shelyag, Julien Ugon
Link•Project