Hi there! I’m a Research Scientist at ByteDance (San Jose), focusing on AI infrastructure and system efficiency at scale. I received my Ph.D. in Computer Science from City University of Hong Kong. My work centers on designing next-generation data compression techniques (including neural approaches) and building high-performance systems for neural network acceleration and edge computing.

Research interests

Neural & Learned Compression · System Efficiency at Scale · Hardware-Aware ML Acceleration

🔥 News

  • 2026.04: Joined ByteDance San Jose as a Research Scientist.
  • 2026.01: BAHOP and ECM accepted at ICASSP 2026.
  • 2025.12: Hitcher accepted at KDD 2026; Parallel-SA accepted at DATE 2026.
  • 2025.05: WISE accepted at CVPR 2025, PMR at USENIX ATC 2025, and Easz and DAWN at DAC 2025.

📔 Blog

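Hello, Blog · The First Post
This blog is a place for informal notes on compression, system efficiency, and AI infrastructure. It supports both Chinese and English, and comments are powered by GitHub Discussions.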

📝 Publications

Compression · LLM Efficiency

ECM: Enhancing Compressibility of Quantized Vision Encoder and LLM for Large Vision-Language Models
W. Wang, Yu Mao, D. Tang, N. Guan, C.J. Xue
ICASSP 2026 · Corresponding author
WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression
Yu Mao, J. Wang, N. Guan, C.J. Xue
CVPR 2025
Easz: A Transformer-based Image Compression Framework for IoT
Yu Mao, J. Li, J. Wang, H. Xu, T.-W. Kuo, N. Guan, C.J. Xue
DAC 2025
Lossless Compression of Large Language Model-Generated Text via Next-Token Prediction
Yu Mao, H. Pirk, C.J. Xue
Preprint 2025
When Compression Meets Model Compression
W. Wang, Yu Mao, D. Tang, H. Du, N. Guan, C.J. Xue
EMNLP Findings 2024 · Corresponding author
On the Compressibility of Quantized Large Language Models
Yu Mao, W. Wang, H. Du, N. Guan, C.J. Xue
Preprint 2024
Faster and Stronger Lossless Compression with Optimized Autoregressive Framework
Yu Mao, J. Li, Y. Cui, C.J. Xue
DAC 2023
Accelerating General-purpose Lossless Compression via Simple and Scalable Parameterization
Yu Mao, Y. Cui, T.-W. Kuo, C.J. Xue
ACM MM 2022
Trace: A Fast Transformer-based General-purpose Lossless Compressor
Yu Mao, Y. Cui, T.-W. Kuo, C.J. Xue
The Web Conference (WWW) 2022
Variational Nested Dropout
Y. Cui, Yu Mao, Z. Liu, Q. Li, A.B. Chan, X. Liu, T.-W. Kuo, C.J. Xue
IEEE TPAMI 2023

Systems · Edge AI

Parallel-SA: Point Cloud Processing Acceleration via Parallel Set Abstraction
D. Tang, W. Wang, Yu Mao, W. Xie, N. Guan, T.-W. Kuo, C.J. Xue
DATE 2026 · Corresponding author
PMR: Fast Application Response via Parallel Memory Reclaim on Mobile Devices
W. Li, L.-P. Chang, Yu Mao, L. Shi
USENIX ATC 2025
DAWN: Accelerating Point Cloud Object Detection
D. Tang, Yu Mao, W. Wang, N. Guan, T.-W. Kuo, C.J. Xue
DAC 2025 · Corresponding author
STEM: Streaming-based FPGA Acceleration for Large-Scale Compactions in LSM KV
D. Tang, W. Wang, Yu Mao, J. Yu, T.-W. Kuo, C.J. Xue
ICDE 2024 · Corresponding author

Whole-Slide-Image · Medical Imaging

BAHOP: Similarity-based Basin Hopping for A fast hyper-parameter search in WSI classification
J. Wang, Yu Mao, Y. Cui, N. Guan, C.J. Xue
ICASSP 2026 · Corresponding author
SHAP-CAT: An Interpretable Multi-Modal Framework Enhancing WSI Classification via Virtual Staining and Shapley-Value-Based Multimodal Fusion
J. Wang, Yu Mao, N. Guan, C.J. Xue
Preprint 2024 · Corresponding author
Advances in Multiple Instance Learning for Whole Slide Image Analysis: Techniques, Challenges, and Future Directions
J. Wang, Yu Mao, N. Guan, C.J. Xue
Preprint 2024 · Corresponding author

🎖 Honors and Awards

  • 2025.11 DAAD Interpretable AI Postdoctoral Fellowship
  • 2024.12 NeurIPS Outstanding Reviewer
  • 2023.10 TinyML Contest ICCAD 2023, Second Place, San Francisco, USA
  • 2023.09 Outstanding Academic Performance Award, City University of Hong Kong
  • 2023.07 DAC Young Research Fellow, San Francisco, USA
  • 2022.10 EDAthon 2022, Second Place, Hong Kong

🤝 Services

  • ML conference referee: NeurIPS 24/25/26, ACM MM 23/24, ICLR 25/26, ICML 25/26, CVPR 25/26, AAAI 26, ARR
  • System TPC: USENIX ATC 25, GLSVLSI, RTCSA
  • Journal referee: ACM TECS, TMLR, IEEE TKDE

👩‍🏫 Supervision

  • Shashwat Jaiswal, summer intern (Ph.D. student at UIUC), 2026
  • Yusheng Zheng, summer intern (Ph.D. student at UCSC), 2026
  • Jun Wang, Ph.D. student with Prof. Jason Xue (2024–present)
  • Weilan Wang, Ph.D. student with Prof. Jason Xue (2024–2026)
  • Dongdong Tang, Ph.D. student with Prof. Jason Xue (2024–2025)