Open Research Newcastle

Optimizing the utilization of Large Language Models via schedule optimization: an exploratory study

Conference contribution

Authors

Yueyue Liu, Hongyu Zhang, Zhiqiang Li, Yuantian Miao
Abstract

Background: Large Language Models (LLMs) have gained significant attention in machine-learning-as-a-service (MLaaS) offerings. In-context learning (ICL) is a technique that guides LLMs towards accurate query processing by providing additional information. However, longer prompts lead to higher LLM service costs, creating a performance-cost trade-off.

Aims: We aim to investigate the potential of combining schedule optimization with ICL to optimize LLM utilization.

Method: We conduct an exploratory study. First, we frame the performance-cost trade-off in LLM utilization as a multi-objective optimization problem, aiming to select the most suitable prompt template for each LLM job so as to maximize accuracy (the percentage of correctly processed jobs) and minimize invocation cost. Next, we investigate three methods for prompt performance prediction to address the challenge of evaluating the accuracy objective in the fitness function, since the actual result can only be determined after submitting a job to the LLM. Finally, we apply widely used search-based techniques and evaluate their effectiveness.

Results: The results indicate that the machine-learning-based technique is an effective approach for prompt performance prediction and fitness function calculation. Compared to submitting all jobs with a single prompt template, schedule optimization can achieve higher accuracy or lower cost by selecting a suitable prompt template for each job, e.g., yielding cost savings of 21.33% to 86.92% in our experiments on LLM-based log parsing. However, the performance of the evaluated search-based techniques varies across instances and metrics, with no single technique consistently outperforming the others.

Conclusions: This study demonstrates the potential of combining schedule optimization with ICL to improve LLM utilization. However, there is still ample room for improving the search-based and prompt performance prediction techniques for more cost-effective LLM utilization.
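To make the problem framing in the abstract concrete, the following minimal Python sketch illustrates the two-objective fitness evaluation described in the Method: a schedule assigns one prompt template to each job, a stand-in predictor estimates the accuracy objective (since the true outcome is only known after submission), and cost is computed from prompt length. Everything here is hypothetical and not from the paper: the names (Template, Job, predict_success), the flat per-token price, the longer-prompt-helps heuristic, and the toy random search with scalarization are all illustrative assumptions; the paper evaluates real ML-based predictors and search-based techniques.

import random
from dataclasses import dataclass

@dataclass
class Template:
    name: str
    prompt_tokens: int  # fixed prompt overhead, e.g. few-shot examples (assumed)

@dataclass
class Job:
    query_tokens: int

COST_PER_TOKEN = 0.00002  # assumed flat per-token price; real MLaaS pricing varies

def predict_success(job: Job, template: Template) -> float:
    """Stand-in for an ML-based prompt performance predictor: estimate the
    probability that `template` processes `job` correctly. Faked here with
    a longer-prompt-helps heuristic purely for illustration."""
    return min(0.99, 0.5 + template.prompt_tokens / 2000.0)

def fitness(schedule, jobs, templates):
    """Evaluate a candidate schedule (one template index per job) on the two
    objectives: predicted accuracy (maximize) and invocation cost (minimize)."""
    acc = sum(predict_success(j, templates[t])
              for j, t in zip(jobs, schedule)) / len(jobs)
    cost = sum((templates[t].prompt_tokens + j.query_tokens) * COST_PER_TOKEN
               for j, t in zip(jobs, schedule))
    return acc, cost

if __name__ == "__main__":
    random.seed(0)
    templates = [Template("zero-shot", 50), Template("few-shot", 600),
                 Template("many-shot", 1500)]
    jobs = [Job(random.randint(20, 120)) for _ in range(100)]

    # Toy random search with a simple weighted scalarization of the two
    # objectives; the paper applies proper multi-objective search techniques.
    def scalar(s):
        acc, cost = fitness(s, jobs, templates)
        return acc - cost

    best = max((tuple(random.randrange(len(templates)) for _ in jobs)
                for _ in range(200)), key=scalar)
    acc, cost = fitness(best, jobs, templates)
    print(f"predicted accuracy={acc:.3f}, cost=${cost:.4f}")

The sketch shows why per-job template selection can beat a single global template: cheap templates suffice for easy jobs, while expensive, example-rich templates are reserved for jobs where the predictor expects them to pay off.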

Funding

Australian Research Council (ARC)

DP200102940

DP220103044

History

Source title

Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement

Name of conference

ESEM '24: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement

Location

Spain

Start date

2024-10-24

End date

2024-10-25

Pagination

84-95

Editors

Franch, X., & Daneva, M.

Publisher

Association for Computing Machinery

Place published

United States

Language

English

College/Research Centre

College of Engineering, Science and Environment

School

School of Information and Physical Sciences

Rights statement

© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
