Publications
2026
- ICML’26. WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving. In 43rd International Conference on Machine Learning (ICML 26) (to appear), 2026
- NSDI’26. HydraServe: Minimizing Cold Start Latency for Serverless LLM Serving in Public Clouds. In 23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI 26), 2026
2025
- TON
2024
- ASPLOS’24. SoCFlow: Efficient and Scalable DNN Training on SoC-Clustered Edge Servers. In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 24), 2024
- TMC. Efficient, Scalable, and Sustainable DNN Training on SoC-Clustered Edge Servers. IEEE Transactions on Mobile Computing, 2024