Improving LLM Reasoning with RL Post-Training
Published:
Large language models are getting better at reasoning not because we made them bigger, but because we finally learned how to teach them after pre-training, a.k.a. post-training. Continuing our series on RL for LLM reasoning, today's blog reviews recent papers that boost LLM reasoning capability through RL-based post-training. If you care about strengthening a model's intrinsic reasoning abilities rather than bolting on expensive test-time scaling or multi-sample decoding, this overview highlights the methods that genuinely transform the model.