Boyuan Zhang (张博源)

About

I am a Ph.D. candidate in the Department of Intelligent Systems Engineering at Indiana University Bloomington, advised by Dr. Fengguang Song and previously by Dr. Dingwen Tao.

I earned a B.Eng. in Information Engineering from Shanghai Jiao Tong University (2018) and an M.S. in Electrical Engineering from the University of Southern California (2020).

I collaborate with Argonne National Laboratory (Dr. Sheng Di, Dr. Franck Cappello) on scientific data compression, Pacific Northwest National Laboratory (Dr. Nathan R. Tallent) on quantum simulation and AI-based compression, and Meta (Dr. Min Si) on GPU-based compression for distributed training. In the summer of 2025, I interned at ByteDance (Seed, ML Systems), where I worked on LLM inference optimization.

For further details, please email me at bozhan@iu.edu.

Education

2022 – Present

Indiana University Bloomington

Ph.D. in Intelligent Systems Engineering

Advisors: Prof. Dingwen Tao and Prof. Fengguang Song

2021 – 2022

Washington State University

Ph.D. in Computer Science (transferred to IU)

Advisor: Prof. Dingwen Tao

2018 – 2020

University of Southern California

M.S. in Electrical Engineering

2014 – 2018

Shanghai Jiao Tong University

B.Eng. in Information Engineering

Research

My research primarily focuses on High-Performance Computing (HPC), with particular emphasis on:

Data Compression

Design of GPU-based lossy and lossless compression algorithms for various workloads, with an emphasis on efficiency, fidelity, and scalability.

Parallel Computing

Development of optimized GPU kernels, parallel algorithms, and distributed workflows to accelerate large-scale scientific and engineering applications.

ML Systems

Research on system-level optimizations for training and inference, including memory footprint reduction and GPU-accelerated data pipelines.

Quantum Computing

Investigation of high-performance methods to support large-scale quantum circuit simulation and emerging quantum computing applications.

Experience

May 2025 – Aug. 2025

ByteDance

Research Scientist Intern, Seed — Machine Learning Systems

Developed Vayne, a unified GPU kernel for quantization and compression of LLM KV caches, enabling high-throughput inference with reduced memory footprint.

May 2023 – Aug. 2023

Pacific Northwest National Laboratory

Ph.D. Intern in High-Performance Computing

Advised by Dr. Nathan R. Tallent. Evaluated lossy compression for microscopy images and built a GPU workflow for AI-based compression targeting high ratio, quality, and throughput.

Selected Publications

IPDPS'26

Near-Zero Cost KV Cache Compression for Large Language Model Inference

Boyuan Zhang, Ding Zhou, Yafan Huang, Shihui Song, Hao Feng, Jinda Jia, Chengming Zhang, and Zhi Zhang.

Proceedings of the 40th IEEE International Parallel and Distributed Processing Symposium, 2026.

IPDPS'26

Accelerating AI Compression through Lightweight Lossless Encoding and Pipelined Workflows

Boyuan Zhang, Luanzheng Guo, Jiannan Tian, Jinyang Liu, Daoce Wang, Chengming Zhang, Bo Fang, Fengguang Song, Jan Strube, Nathan R. Tallent, and Dingwen Tao.

Proceedings of the 40th IEEE International Parallel and Distributed Processing Symposium, 2026.

ICS'25 Best Paper Runner-Up

BMQSim: Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework

Boyuan Zhang, Bo Fang, Fanjiang Ye, Luanzheng Guo, Fengguang Song, Nathan R. Tallent, and Dingwen Tao.

Proceedings of the 39th ACM International Conference on Supercomputing, pp. 689–704, 2025.

ICS'25 Best Paper Candidate

Pushing the Limits of GPU Lossy Compression: A Hierarchical Delta Approach

Boyuan Zhang, Yafan Huang, Sheng Di, Fengguang Song, Guanpeng Li, and Franck Cappello.

Proceedings of the 39th ACM International Conference on Supercomputing, pp. 654–669, 2025.

SC'24

Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression

Hao Feng*, Boyuan Zhang*, Fanjiang Ye, Min Si, Ching-Hsiang Chu, Jiannan Tian, Chunxing Yin, Summer Deng, Yuchen Hao, Pavan Balaji, Tong Geng, and Dingwen Tao.

SC24: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1–16, IEEE, 2024.

HPDC'23

FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs

Boyuan Zhang, Jiannan Tian, Sheng Di, Xiaodong Yu, Yunhe Feng, Xin Liang, Dingwen Tao, and Franck Cappello.

Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, pp. 129–142, 2023.

ICS'23

GPULZ: Optimizing LZSS Lossless Compression for Multi-Byte Data on Modern GPUs

Boyuan Zhang, Jiannan Tian, Sheng Di, Xiaodong Yu, Martin Swany, Dingwen Tao, and Franck Cappello.

Proceedings of the 37th ACM International Conference on Supercomputing, pp. 348–359, 2023.

* denotes equal contribution. Please refer to the full list on Google Scholar.

Awards & Honors

2025

Best Paper Runner-Up, ACM ICS'25 (BMQSim) — Top 3 of 320 submissions

2025

Best Paper Candidate, ACM ICS'25 (Aatrox) — Top 6 of 320 submissions

2023

Student Travel Grant ($1,500), HPDC'23

2022

Graduate Conference Grant ($3,000), Indiana University Bloomington

Teaching

Spring 2026

Assistant Instructor — ENGR-516: "Engineering Cloud Computing," Indiana University

Fall 2024

Assistant Instructor — ENGR-516: "Engineering Cloud Computing," Indiana University

Spring 2022

Teaching Assistant — CPTS 360: "Systems Programming," Washington State University

Fall 2021

Teaching Assistant — CPTS 360: "Systems Programming," Washington State University

Professional Service

Reviewer: IEEE TPDS (2024–2025), CCGRID'25, QCE (2024–2025), CLOUD'23, ISSRE'23
Program Committee: ISSRE'23 Artifact Evaluation
Web Chair: QCCC'24
Student Volunteer: SC'25 (St. Louis, MO), SC'23 (Denver, CO)