Hi there!

I am Xuanlei Zhao, a third-year PhD student in Computer Science at the National University of Singapore, advised by Yang You, where I also completed my master’s studies. I obtained my bachelor’s degree in CS & EE from Huazhong University of Science and Technology. Previously, I interned at Tencent Hunyuan with Kai Wang, at Adobe Research with Yan Kang and Yuanjun Xiong, at Pika with Chenlin Meng, and at Colossal-AI with Jiarui Fang.

My current research mainly focuses on efficient AI, including:

  • Efficient parameter generation, for scaling and customizing foundation models.
  • Efficient diffusion and autoregressive models, e.g., for video generation.
  • Efficient machine learning systems, with parallelism and low-level optimization.
  • Co-optimization of algorithms and infrastructure.

📝 Selected Publications (all)

🕹️ Efficient Parameter Generation

  • HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing
    Tencent HY Team
  • NeurIPS 2025 Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
    Zhiyuan Liang*, Dongwen Tang, Yuhao Zhou, Xuanlei Zhao, Mingjia Shi, Wangbo Zhao, Zekai Li, Peihao Wang, Konstantin Schürholt, Damian Borth, Michael M. Bronstein, Yang You, Zhangyang Wang*, Kai Wang*

🎬 Efficient Video Generation

  • ICLR 2025 Real-Time Video Generation with Pyramid Attention Broadcast
    Xuanlei Zhao*, Xiaolong Jin*, Kai Wang*†, Yang You
  • ICML 2025 DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers
    Xuanlei Zhao, Shenggan Cheng, Chang Chen, Zangwei Zheng, Ziming Liu, Zheming Yang, Yang You
  • Training Variable Sequences with Data-Centric Parallel
    Geng Zhang*, Xuanlei Zhao*, Kai Wang, Yang You

⚙️ Efficient System Optimization

  • ICLR 2024 AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference
    Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Jiarui Fang, Haotian Zhou, Bin Jia, Ziming Liu, Yang You
  • MLSys 2024 HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
    Xuanlei Zhao*, Bin Jia*, Haotian Zhou*, Ziming Liu, Shenggan Cheng, Yang You
  • PPoPP 2024 FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters
    Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Jiarui Fang, Tian Zheng, Ruidong Wu, Xiwen Zhang, Jian Peng, Yang You

💡 Open-Source Projects

  • HY-WU (Lead for algorithms and infrastructure): An Extensible Functional Neural Memory Framework
  • VideoSys (Project Lead): An Easy and Efficient System for Video Generation
  • Colossal-AI (Top Contributor): Making large AI models cheaper, faster and more accessible
  • FastFold (Top Contributor): Optimizing AlphaFold Training and Inference on GPU Clusters

💻 Internships

📖 Education

  • 2024.01 - present, PhD in Computer Science, National University of Singapore
  • 2022.08 - 2023.12, Master's in Computer Science, National University of Singapore
  • 2018.09 - 2022.06, Bachelor's in Computer Science & Electrical Information, Huazhong University of Science and Technology

💬 Invited Talks

  • 2024.07, Real-Time Video Generation with Pyramid Attention Broadcast, Ventures [video]
  • 2024.07, Speedup for Video Generation, ByteDance internal talk