Reading List · Archived · 2026.01
Efficient Diffusion Language Models — Paper List
Inference acceleration, sampling, caching, and post-training techniques for diffusion language models.
Foundational Models
- LLaDA: Large Language Diffusion Models. <arXiv 2025.2> <NIPS 2025 Oral>
- LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models. <arXiv 2025.5>
- Dream: Diffusion Large Language Models. <arXiv 2025.8>
Analytics
- Diffusion Language Models Know the Answer Before Decoding. <arXiv 2025.8>
- The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models. <arXiv 2026.1>
- Mechanism Shift During Post-training from Autoregressive to Masked Diffusion Language Models. <arXiv 2026.1>
KV-Cache / Sparse Attention
- dKV-Cache: The Cache for Diffusion Language Models. <arXiv 2025.5>
- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding. <arXiv 2025.5>
- Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction. <arXiv 2025.8>
- SparseD: Sparse Attention for Diffusion Language Models. <arXiv 2025.9>
- dCache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching. <arXiv 2025.9>
- Attention is All You Need for KV Cache in Diffusion LLMs. <arXiv 2025.10>
Step Distillation
- Progressive Distillation for Fast Sampling of Diffusion Models. <ICLR 2022>
- Distillation of Discrete Diffusion through Dimensional Correlations. <ICML 2025>
- Learning Few-Step Diffusion Models by Trajectory Distribution Matching. <arXiv 2025.3>
- DLM-One: Diffusion Language Models for One-Step Sequence Generation. <arXiv 2025.6>
- Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing. <arXiv 2025.8>
- FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models. <arXiv 2025.9>
- Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Steps. <arXiv 2025.9>
- Adaptive (Block) Length:
  - CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block Prediction and Controllable Generation. <arXiv 2025.5>
  - Beyond Fixed: Training-free Variable-Length Denoising for Diffusion Large Language Models. <arXiv 2025.8>
  - AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size. <arXiv 2025.9>
Training-free Sampler
- Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking. <arXiv 2025.5>
- Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs. <arXiv 2025.7>
- Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States. <arXiv 2025.10> <ICLR 2026 Review>
- KLASS: KL-Guided Fast Inference in Masked Diffusion Models. <arXiv 2025.11>
- Beyond Confidence: Adaptive and Coherent Decoding for Diffusion Language Models. <arXiv 2025.11>
- Optimal Inference Schedules for Masked Diffusion Models. <arXiv 2025.11>
- Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules. <arXiv 2025.12>
- Optimizing Decoding Paths in Masked Diffusion Models by Quantifying Uncertainty. <arXiv 2025.12>
- Decoding Large Language Diffusion Models with Foreseeing Movement. <arXiv 2025.12>
Long Context
- LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs. <arXiv 2025.6>
- UltraLLaDA: Scaling the Context Length to 128k for Diffusion Large Language Models. <arXiv 2025.10>
Unmasking / Remasking
- Unmasking:
  - Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models. <NIPS 2025>
  - Path Planning for Diffusion Language Model Sampling. <ICLR 2026 Review>
  - Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes. <ICLR 2026 Review>
  - Learning Unmasking Policies for Diffusion Language Models. <arXiv 2025.12>
  - dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning. <arXiv 2025.12>
  - Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models. <arXiv 2026.1>
- Remasking:
  - Remasking Discrete Diffusion Models with Inference-Time Scaling. <NIPS 2025>
  - Don’t Settle Too Early: Self-Reflective Remasking for Diffusion Language Models. <ICLR 2026 Review>
  - Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model. <arXiv 2025.10>
- Thinking Inside the Mask: In-Place Prompting in Diffusion LLMs. <arXiv 2025.8>
AR-to-DLM Transfer / Transformation
- Autoregressive Models Rival Diffusion Models at ANY-ORDER Generation. <arXiv 2026.1>
- Mechanism Shift During Post-training from Autoregressive to Masked Diffusion Language Models. <arXiv 2026.1>