I am a second-year Ph.D. student at the University of Michigan, advised by Prof. Mosharaf Chowdhury. I received my Bachelor's and Master's degrees in computer science from Renmin University of China (RUC) under the supervision of Prof. Feng Zhang. My research interests lie in machine learning compilers and scalable machine learning systems; my recent and upcoming work builds energy-efficient execution stacks for large model training, particularly for generative AI workloads.
Email: ruofanw@umich.edu
Experience
- 2023 - 2024: Microsoft, DeepSpeedBing, Research Intern, Mentor: Dr. Zhen Zheng
- 2022 - 2023: Alibaba Cloud, Platform of Artificial Intelligence (PAI), Research Intern, Mentor: Dr. Zhen Zheng
- 2021: Microsoft Research Asia (MSRA), Systems Research Group, Research Intern, Mentors: Dr. Fan Yang and Dr. Jilong Xue
- 2019 - 2020: North Carolina State University (NCSU), PICTure Research Group, Remote Intern, Advisor: Prof. Xipeng Shen
- 2019: DELL EMC China Technology R&D Center, Intern
News
- Dec 2025: Presented a talk at the NeurIPS 2025 Tutorial on Energy and Power as First‑Class ML Design Metrics.
Selected Publications
Where Do the Joules Go? Diagnosing Inference Energy Consumption
Jae-Won Chung, Ruofan Wu, Jeff J. Ma, Mosharaf Chowdhury
Preprint, 2026.
Kareus: Joint Reduction of Dynamic and Static Energy in Large Model Training
Ruofan Wu, Jae-Won Chung, Mosharaf Chowdhury
Preprint, 2026.
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung, Jeff J. Ma, Ruofan Wu, Jiachen Liu, Oh Jun Kweon, Yuxuan Xia, Zhiyu Wu, Mosharaf Chowdhury
In NeurIPS Datasets and Benchmarks (Spotlight), 2025.
TetriServe: Efficient DiT Serving for Heterogeneous Image Generation
Runyu Lu*, Shiqi He*, Wenxuan Tan, Shenggui Li, Ruofan Wu, Jeff J. Ma, Ang Chen, Mosharaf Chowdhury
In ASPLOS, 2026.
PluS: Highly Efficient and Expandable ML Compiler with Pluggable Graph Schedules
Ruofan Wu, Zhen Zheng, Feng Zhang, Chuanjie Liu, Zaifeng Pan, Jidong Zhai, Xiaoyong Du
In USENIX ATC, 2025.
ROLLER: Fast and Efficient Tensor Compilation for Deep Learning
Hongyu Zhu, Ruofan Wu, Yijia Diao, Shanbin Ke, Haoyu Li, Chen Zhang, Jilong Xue, Lingxiao Ma, Yuqing Xia, Wei Cui, Fan Yang, Mao Yang, Lidong Zhou, Asaf Cidon, Gennady Pekhimenko
In OSDI, 2022.
DREW: Efficient Winograd CNN Inference with Deep Reuse
Ruofan Wu, Feng Zhang, Jiawei Guan, Zhen Zheng, Xiaoyong Du, Xipeng Shen
In WWW/TheWebConf, 2022.
Cite Kareus: Joint Reduction of Dynamic and Static Energy in Large Model Training
@misc{kareus:arxiv26,
  author = {Ruofan Wu and Jae-Won Chung and Mosharaf Chowdhury},
  title = {{Kareus}: Joint Reduction of Dynamic and Static Energy in Large Model Training},
  year = {2026},
  eprint = {2601.17654},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
}
Cite The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
@inproceedings{mlenergy-benchmark:neuripsdb25,
  title = {The {ML.ENERGY} Benchmark: Toward Automated Inference Energy Measurement and Optimization},
  author = {Jae-Won Chung and Jeff J. Ma and Ruofan Wu and Jiachen Liu and Oh Jun Kweon and Yuxuan Xia and Zhiyu Wu and Mosharaf Chowdhury},
  year = {2025},
  month = {Dec},
  booktitle = {NeurIPS Datasets and Benchmarks},
}
Cite Where Do the Joules Go? Diagnosing Inference Energy Consumption
@misc{joules:arxiv26,
  author = {Jae-Won Chung and Ruofan Wu and Jeff J. Ma and Mosharaf Chowdhury},
  title = {Where Do the Joules Go? Diagnosing Inference Energy Consumption},
  year = {2026},
  eprint = {2601.22076},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
}
Cite TetriServe: Efficient DiT Serving for Heterogeneous Image Generation
@inproceedings{tetriserve:asplos26,
  author = {Runyu Lu and Shiqi He and Wenxuan Tan and Shenggui Li and Ruofan Wu and Jeff J. Ma and Ang Chen and Mosharaf Chowdhury},
  booktitle = {ASPLOS},
  title = {{TetriServe}: Efficiently Serving Mixed {DiT} Workloads},
  year = {2026},
  month = {March},
}
Cite PluS: Highly Efficient and Expandable ML Compiler with Pluggable Graph Schedules
@inproceedings{plus:atc25,
  author = {Ruofan Wu and Zhen Zheng and Feng Zhang and Chuanjie Liu and Zaifeng Pan and Jidong Zhai and Xiaoyong Du},
  booktitle = {USENIX ATC},
  title = {{PluS}: Highly Efficient and Expandable {ML} Compiler with Pluggable Graph Schedules},
  year = {2025},
  month = {July},
}
Cite ROLLER: Fast and Efficient Tensor Compilation for Deep Learning
@inproceedings{roller:osdi22,
  author = {Hongyu Zhu and Ruofan Wu and Yijia Diao and Shanbin Ke and Haoyu Li and Chen Zhang and Jilong Xue and Lingxiao Ma and Yuqing Xia and Wei Cui and Fan Yang and Mao Yang and Lidong Zhou and Asaf Cidon and Gennady Pekhimenko},
  booktitle = {OSDI},
  title = {{ROLLER}: Fast and Efficient Tensor Compilation for Deep Learning},
  year = {2022},
  month = {July},
}
Cite DREW: Efficient Winograd CNN Inference with Deep Reuse
@inproceedings{drew:www22,
  author = {Ruofan Wu and Feng Zhang and Jiawei Guan and Zhen Zheng and Xiaoyong Du and Xipeng Shen},
  booktitle = {WWW},
  title = {{DREW}: Efficient Winograd {CNN} Inference with Deep Reuse},
  year = {2022},
  month = {April},
}