Boyuan Zhang

Boyuan Zhang (张博源)

PhD Candidate, Intelligent Systems Engineering · Indiana University Bloomington

About

I am a Ph.D. candidate in the Department of Intelligent Systems Engineering at Indiana University Bloomington, advised by Dr. Fengguang Song and previously by Dr. Dingwen Tao.

I earned a B.Eng. in Information Engineering from Shanghai Jiao Tong University (2018) and an M.S. in Electrical Engineering from the University of Southern California (2020).

I collaborate with Argonne National Laboratory (Dr. Sheng Di, Dr. Franck Cappello) on scientific data compression, Pacific Northwest National Laboratory (Dr. Nathan R. Tallent) on quantum simulation and AI-based compression, and Meta (Dr. Min Si) on GPU-based compression for distributed training. In the summer of 2025, I interned at ByteDance (Seed, ML Systems) working on LLM inference optimization.

For further details or to contact me, please email me at bozhan@iu.edu.

Education

2022 – Present

Indiana University Bloomington

Ph.D. in Intelligent Systems Engineering
Advisors: Prof. Fengguang Song and Prof. Dingwen Tao
2021 – 2022

Washington State University

Ph.D. in Computer Science (transferred to IU)
Advisor: Prof. Dingwen Tao
2018 – 2020

University of Southern California

M.S. in Electrical Engineering
2014 – 2018

Shanghai Jiao Tong University

B.Eng. in Information Engineering

Research

My research primarily focuses on High-Performance Computing (HPC), with particular emphasis on:

Data Compression

Design of GPU-based lossy and lossless compression algorithms for various workloads, with an emphasis on efficiency, fidelity, and scalability.

Parallel Computing

Development of optimized GPU kernels, parallel algorithms, and distributed workflows to accelerate large-scale scientific and engineering applications.

ML Systems

Research on system-level optimizations for training and inference, including memory footprint reduction and GPU-accelerated data pipelines.

Quantum Computing

Investigation of high-performance methods to support large-scale quantum circuit simulation and emerging quantum computing applications.

Experience

May 2025 – Aug. 2025

ByteDance

Research Scientist Intern, Seed — Machine Learning Systems
Developed Vayne, a unified GPU kernel for quantization and compression of LLM KV caches, enabling high-throughput inference with reduced memory footprint.
May 2023 – Aug. 2023

Pacific Northwest National Laboratory

Ph.D. Intern in High-Performance Computing
Advised by Dr. Nathan R. Tallent. Evaluated lossy compression for microscopy images and built a GPU workflow for AI-based compression targeting high compression ratios, reconstruction quality, and throughput.

Selected Publications

IPDPS'26
Near-Zero Cost KV Cache Compression for Large Language Model Inference
Boyuan Zhang, Ding Zhou, Yafan Huang, Shihui Song, Hao Feng, Jinda Jia, Chengming Zhang, and Zhi Zhang.
Proceedings of the 40th IEEE International Parallel and Distributed Processing Symposium, 2026.
IPDPS'26
Accelerating AI Compression through Lightweight Lossless Encoding and Pipelined Workflows
Boyuan Zhang, Luanzheng Guo, Jiannan Tian, Jinyang Liu, Daoce Wang, Chengming Zhang, Bo Fang, Fengguang Song, Jan Strube, Nathan R. Tallent, and Dingwen Tao.
Proceedings of the 40th IEEE International Parallel and Distributed Processing Symposium, 2026.
ICS'25 Best Paper Runner-Up
BMQSim: Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework
Boyuan Zhang, Bo Fang, Fanjiang Ye, Luanzheng Guo, Fengguang Song, Nathan R. Tallent, and Dingwen Tao.
Proceedings of the 39th ACM International Conference on Supercomputing, pp. 689–704, 2025.
ICS'25 Best Paper Candidate
Pushing the Limits of GPU Lossy Compression: A Hierarchical Delta Approach
Boyuan Zhang, Yafan Huang, Sheng Di, Fengguang Song, Guanpeng Li, and Franck Cappello.
Proceedings of the 39th ACM International Conference on Supercomputing, pp. 654–669, 2025.
SC'24
Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression
Hao Feng*, Boyuan Zhang*, Fanjiang Ye, Min Si, Ching-Hsiang Chu, Jiannan Tian, Chunxing Yin, Summer Deng, Yuchen Hao, Pavan Balaji, Tong Geng, and Dingwen Tao.
SC24: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1–16, IEEE, 2024.
HPDC'23
FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs
Boyuan Zhang, Jiannan Tian, Sheng Di, Xiaodong Yu, Yunhe Feng, Xin Liang, Dingwen Tao, and Franck Cappello.
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, pp. 129–142, 2023.
ICS'23
GPULZ: Optimizing LZSS Lossless Compression for Multi-Byte Data on Modern GPUs
Boyuan Zhang, Jiannan Tian, Sheng Di, Xiaodong Yu, Martin Swany, Dingwen Tao, and Franck Cappello.
Proceedings of the 37th ACM International Conference on Supercomputing, pp. 348–359, 2023.

* denotes equal contribution. See Google Scholar for the full publication list.

Awards & Honors

2025
Best Paper Runner-Up, ACM ICS'25 (BMQSim) — Top 3 of 320 submissions
2025
Best Paper Candidate, ACM ICS'25 (Aatrox) — Top 6 of 320 submissions
2023
Student Travel Grant ($1,500), HPDC'23
2022
Graduate Conference Grant ($3,000), Indiana University Bloomington

Teaching

Spring 2026
Assistant Instructor — ENGR-516: "Engineering Cloud Computing," Indiana University
Fall 2024
Assistant Instructor — ENGR-516: "Engineering Cloud Computing," Indiana University
Spring 2022
Teaching Assistant — CPTS 360: "Systems Programming," Washington State University
Fall 2021
Teaching Assistant — CPTS 360: "Systems Programming," Washington State University

Professional Service