PhD Student, Learning Theory, Optimization

Yufei Gu / 顾宇飞

Bridging the gap between theoretical understanding and empirical practice in deep learning.

My enthusiasm lies in the computational aspects of deep learning theory and in advanced methodologies for optimization and generalization. My current research focuses on analyzing learning dynamics and developing efficient optimizers for pre-training and post-training LLMs.

Publications | Notes | Reading Lists | CV & Contact

Recent Updates

2026-05 Starting Research Internship at Hunyuan Pretraining, Rhinoceros Bird Elite Talent Program.
2026-03 Release of Mano_v2 at GitHub:Mano-Restriking-Manifold-Optimization-for-LLM-Training!
2026-02 Release of the Mano optimizer and manuscript preprint at arXiv:2601.23000!
2026-01 Two accepted papers at [ICLR'26] and [CVPR'26].
2025-09 Starting PhD at HKUST-GZ, XLeaf Lab.

Highlighted Work

Tang Qian-Yuan†, Yufei Gu†, Cai Yunfeng, Sun Mingming, Li Ping, Xie Zeke, et al. Investigating the Overlooked Hessian Structure: From CNNs to LLMs. In Proceedings of the 42nd International Conference on Machine Learning (ICML 2025). <poster>
Yufei Gu, Xiaoqing Zheng, Tomaso Aste. Unraveling the Enigma of Double Descent: an in-depth Analysis Through the Lens of Learned Feature Space. In Proceedings of the 12th International Conference on Learning Representations (ICLR 2024). <poster>
Zhao Ji, Yufei Gu, Shao Shitong, Zhou Xun, Xiang Liang, Xie Zeke. Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better. In Proceedings of the 14th International Conference on Learning Representations (ICLR 2026). <poster>

See the full publication list →