Showing 1–4 of 4 results for author: Wilkins, G

  1. arXiv:2407.04014  [pdf, other]

    cs.DC

    Offline Energy-Optimal LLM Serving: Workload-Based Energy Models for LLM Inference on Heterogeneous Systems

    Authors: Grant Wilkins, Srinivasan Keshav, Richard Mortier

    Abstract: The rapid adoption of large language models (LLMs) has led to significant advances in natural language processing and text generation. However, the energy consumed through LLM model inference remains a major challenge for sustainable AI deployment. To address this problem, we model the workload-dependent energy consumption and runtime of LLM inference tasks on heterogeneous GPU-CPU systems. By con…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 7 pages, appearing at HotCarbon 2024

  2. arXiv:2407.00010  [pdf, other]

    cs.DC cs.AI

    Hybrid Heterogeneous Clusters Can Lower the Energy Consumption of LLM Inference Workloads

    Authors: Grant Wilkins, Srinivasan Keshav, Richard Mortier

    Abstract: Both the training and use of Large Language Models (LLMs) require large amounts of energy. Their increasing popularity, therefore, raises critical concerns regarding the energy efficiency and sustainability of data centers that host them. This paper addresses the challenge of reducing energy consumption in data centers running LLMs. We propose a hybrid data center model that uses a cost-based sche…

    Submitted 25 April, 2024; originally announced July 2024.

  3. arXiv:2404.02840  [pdf, ps, other]

    cs.DC

    A Survey on Error-Bounded Lossy Compression for Scientific Datasets

    Authors: Sheng Di, Jinyang Liu, Kai Zhao, Xin Liang, Robert Underwood, Zhaorui Zhang, Milan Shah, Yafan Huang, Jiajun Huang, Xiaodong Yu, Congrong Ren, Hanqi Guo, Grant Wilkins, Dingwen Tao, Jiannan Tian, Sian Jin, Zizhe Jian, Daoce Wang, MD Hasanur Rahman, Boyuan Zhang, Jon C. Calhoun, Guanpeng Li, Kazutomo Yoshii, Khalid Ayed Alharthi, Franck Cappello

    Abstract: Error-bounded lossy compression has been effective in significantly reducing the data storage/transfer burden while preserving the reconstructed data fidelity very well. Many error-bounded lossy compressors have been developed for a wide range of parallel and distributed use cases for years. These lossy compressors are designed with distinct compression models and design principles, such that each…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: submitted to ACM Computing journal, required to be 35 pages including references

  4. arXiv:2312.13461  [pdf, other]

    cs.DC

    FedSZ: Leveraging Error-Bounded Lossy Compression for Federated Learning Communications

    Authors: Grant Wilkins, Sheng Di, Jon C. Calhoun, Zilinghan Li, Kibaek Kim, Robert Underwood, Richard Mortier, Franck Cappello

    Abstract: With the promise of federated learning (FL) to allow for geographically-distributed and highly personalized services, the efficient exchange of model updates between clients and servers becomes crucial. FL, though decentralized, often faces communication bottlenecks, especially in resource-constrained scenarios. Existing data compression techniques like gradient sparsification, quantization, and p…

    Submitted 24 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Appearing at the 44th IEEE International Conference on Distributed Computing Systems (ICDCS)