RLG Lab Research Publications

CG 2026

WallZero: Mastering the Game of WallGo with Strategic Analysis

This paper presents WallZero, an AlphaZero-based agent for WallGo, a board game popularized by the Netflix series The Devil's Plan. WallZero defeats professional Go players and is further used to assess game fairness and identify key strategies for mastering WallGo.

ICLR 2026

Regret-Guided Search Control for Efficient Learning in AlphaZero

This paper proposes regret-guided search control, extending AlphaZero with regret-guided restarts that yield more efficient and robust learning in board games.

IEEE ToG 2025

A Study of Solving Life-and-Death Problems in Go Using Relevance-Zone Based Solvers

This paper analyzes how current state-of-the-art computer Go solvers behave when solving Life-and-Death (L&D) problems in the game of Go.

NeurIPS 2025

Learning Human-Like RL Agents Through Trajectory Optimization With Action Quantization

This paper aims to learn human-like RL agents by distilling demonstrations into macro actions and optimizing trajectories for both reward and human-likeness, achieving top human-likeness on D4RL Adroit.

IJCAI 2025

Bridging Local and Global Knowledge via Transformer in Board Games

This paper proposes ResTNet, an AlphaZero backbone that interleaves residual and Transformer blocks to fuse local and global board knowledge, improving strength on Go/Hex and better recognizing long-sequence patterns.

IEEE TAI 2025

Demystifying MuZero Planning: Interpreting the Learned Model

This paper interprets MuZero planning through its learned latent states, with analysis across two board games (9x9 Go and Gomoku) and three Atari games (Breakout, Ms. Pacman, and Pong).

ICLR 2025

Strength Estimation and Human-Like Strength Adjustment in Games

This paper proposes a strength system that can estimate player strength from games and provide various playing strengths while simultaneously offering human-like behavior in both Go and chess.

ICLR 2025

OptionZero: Planning with Learned Options

This paper presents OptionZero, a method that integrates options into the MuZero algorithm, which autonomously discovers options through self-play games and utilizes options during planning.

IEEE ToG 2025

MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games

This paper presents MiniZero, a zero-knowledge learning framework that supports four state-of-the-art algorithms: AlphaZero, MuZero, Gumbel AlphaZero, and Gumbel MuZero.

NeurIPS 2023

Game Solving with Online Fine-Tuning

This paper proposes methods for game solving with online fine-tuning: while solving, an online fine-tuning trainer simultaneously fine-tunes the heuristics to provide more accurate evaluations.