Notes

2026.06

Two Theoretical Spectra of Modern Optimizers: Fisher-Approximated and Norm-Constrained

A note revisiting two theoretical lenses for modern optimizer design, connecting Fisher-approximated methods, norm-constrained updates, and recent optimizer trends.

2026.03

Mano: Restriking Manifold Optimization for LLM Training

An introduction to Mano, the optimizer we proposed, covering its theoretical analysis and empirical results.

2025.12

Latent Reasoning & Iterative Refinement in Large Language Models

Latent reasoning methods for LLMs, from layer skipping and looping to token refinement, along with the diffusion language model paradigm.

2025.09

Matrix Decomposition & Dimensionality Reduction in Deep Learning

Matrix decomposition, dimensionality reduction, and visualization techniques for high-dimensional features and spaces in deep learning.

2025.08

Data-centric Methods in Deep Learning

An overview of data-centric methods in deep learning, including data valuation, selection, synthesis, and online pruning.

2025.08

From Adam to Muon: Effective 'Deep' Optimizer Preconditioners

An introduction to the different approaches to preconditioner design in optimizers for training deep neural networks.