Publications
See a full list on Google Scholar.
2024
Training Any-Size Videos with Data-Centric Parallel
Geng Zhang*, Xuanlei Zhao*, Kai Wang†, Yang You†
arXiv
| code | blog |
Real-Time Video Generation with Pyramid Attention Broadcast
Xuanlei Zhao*, Xiaolong Jin*, Kai Wang*, Yang You
arXiv
| paper | code | blog |
Wallfacer: Guiding transformer model training out of the long-context dark forest with n-body problem
Ziming Liu, Shaoyu Wang, Shenggan Cheng, Zhongkai Zhao, Kai Wang, Xuanlei Zhao, James Demmel, Yang You
arXiv
| paper |
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers
Xuanlei Zhao, Shenggan Cheng, Chang Chen, Zangwei Zheng, Ziming Liu, Zheming Yang, Yang You
arXiv
| paper | code |
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
Xuanlei Zhao*, Bin Jia*, Haotian Zhou*, Ziming Liu, Shenggan Cheng, Yang You
MLSys 2024
| paper |
FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters
Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Jiarui Fang, Tian Zheng, Ruidong Wu, Xiwen Zhang, Jian Peng, Yang You
PPoPP 2024
| paper | code |
AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference
Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Jiarui Fang, Haotian Zhou, Bin Jia, Ziming Liu, Yang You
ICLR 2024
| paper | code |
* indicates equal contribution; † indicates co-corresponding authors.