Publications

2026

  1. NSDI’26
    HydraServe: Minimizing Cold Start Latency for Serverless LLM Serving in Public Clouds
    Chiheng Lou, Sheng Qi, Chao Jin, and 5 more authors
    In the 23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI 26) (to appear), 2026

2025

  1. arXiv
    WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving
    Chiheng Lou, Sheng Qi, Rui Kang, and 6 more authors
    arXiv:2512.09472, 2025
  2. TON
    Efficient Far Memory-Aware Scheduling With FaMAS
    Chiheng Lou and Xin Jin
    IEEE Transactions on Networking, 2025

2024

  1. ASPLOS’24
    SoCFlow: Efficient and Scalable DNN Training on SoC-Clustered Edge Servers
    Daliang Xu, Mengwei Xu, Chiheng Lou, and 4 more authors
    In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 24), 2024
  2. TMC
    Efficient, Scalable, and Sustainable DNN Training on SoC-Clustered Edge Servers
    Mengwei Xu, Daliang Xu, Chiheng Lou, and 4 more authors
    IEEE Transactions on Mobile Computing, 2024