Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-30279
Publication type: Conference paper
Type of review: Peer review (publication)
Title: So you want your private LLM at home? : a survey and benchmark of methods for efficient GPTs
Authors: Tuggener, Lukas
Sager, Pascal
Taoudi-Benchekroun, Yassine
Grewe, Benjamin F.
Stadelmann, Thilo
et al.: No
DOI: 10.21256/zhaw-30279
Conference details: 11th IEEE Swiss Conference on Data Science (SDS), Zurich, Switzerland, 30-31 May 2024
Issue Date: 31-May-2024
Publisher / Ed. Institution: ZHAW Zürcher Hochschule für Angewandte Wissenschaften
Language: English
Subjects: Large language model; LlamaV2; Fine-tuning; LLM quantization; LLM deployment
Subject (DDC): 006: Special computer methods
Abstract: At least since the introduction of ChatGPT, the abilities of generative large language models (LLMs), sometimes called GPTs, have been at the center of attention for AI researchers, entrepreneurs, and others. However, for many applications, it is not possible to call an existing LLM service via an API, either due to data protection concerns or because no task-appropriate LLM exists. On the other hand, deploying or training a private LLM is often prohibitively computationally expensive. In this paper, we give an overview of the most important recent methodologies that help reduce the computational footprint of LLMs. We further present extensive benchmarks for seven methods from two of the most important areas of recent progress: model quantization and low-rank adapters, showcasing how it is possible to leverage state-of-the-art LLMs with limited resources. Our benchmarks include resource consumption metrics (e.g., GPU memory usage), a state-of-the-art quantitative performance evaluation, as well as a qualitative performance study conducted by eight individual human raters. Our evaluations show that quantization has a profound effect on GPU memory requirements. However, we also show that these quantization methods, contrary to how they are advertised, cause a noticeable loss in text quality. We further show that low-rank adapters allow effective model fine-tuning with moderate compute resources. For methods that require less than 16 GB of GPU memory, we provide easy-to-use Jupyter notebooks that allow anyone to deploy and fine-tune state-of-the-art LLMs on the Google Colab free tier within minutes without any prior experience or infrastructure.
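To make the two benchmarked technique families concrete, the following is a minimal sketch of loading a 4-bit quantized Llama-2 model and attaching low-rank adapters (LoRA) with the Hugging Face transformers, bitsandbytes, and peft libraries. The model ID, LoRA rank, and target modules are illustrative assumptions, not the configuration of the paper's actual notebooks.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-7b-hf"  # assumed model; gated on Hugging Face

# 4-bit NF4 quantization via bitsandbytes: stores weights in 4 bits,
# cutting weight memory roughly 4x compared to fp16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # float16 suits the Colab free-tier T4
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # places layers on the available GPU automatically
)

# LoRA: train small rank-r update matrices instead of the full weights.
# Hyperparameters below are common illustrative choices, not the paper's.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections, a typical target
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # adapters are typically well under 1% of all weights

With this setup, only the adapter weights receive gradients while the quantized base model stays frozen, which is what keeps fine-tuning within a roughly 16 GB GPU memory budget such as the Google Colab free tier mentioned in the abstract.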
URI: https://digitalcollection.zhaw.ch/handle/11475/30279
Fulltext version: Accepted version
License (according to publishing contract): Licence according to publishing contract
Departement: School of Engineering
Organisational Unit: Centre for Artificial Intelligence (CAI)
Published as part of the ZHAW project: Practical data efficient deep learning through contrastive self-supervised learning
Appears in collections: Publikationen School of Engineering

Files in This Item:
File: 2024_Tuggener-etal_Survey-and-benchmark-of-methods-for-efficient-GPTs_SDS.pdf
Size: 1.99 MB
Format: Adobe PDF
Citation formats:
Tuggener, L., Sager, P., Taoudi-Benchekroun, Y., Grewe, B. F., & Stadelmann, T. (2024, May 31). So you want your private LLM at home? : a survey and benchmark of methods for efficient GPTs. 11th IEEE Swiss Conference on Data Science (SDS), Zurich, Switzerland, 30-31 May 2024. https://doi.org/10.21256/zhaw-30279
Tuggener, L. et al. (2024) ‘So you want your private LLM at home? : a survey and benchmark of methods for efficient GPTs’, in 11th IEEE Swiss Conference on Data Science (SDS), Zurich, Switzerland, 30-31 May 2024. ZHAW Zürcher Hochschule für Angewandte Wissenschaften. Available at: https://doi.org/10.21256/zhaw-30279.
L. Tuggener, P. Sager, Y. Taoudi-Benchekroun, B. F. Grewe, and T. Stadelmann, “So you want your private LLM at home? : a survey and benchmark of methods for efficient GPTs,” in 11th IEEE Swiss Conference on Data Science (SDS), Zurich, Switzerland, 30-31 May 2024, May 2024. doi: 10.21256/zhaw-30279.
TUGGENER, Lukas, Pascal SAGER, Yassine TAOUDI-BENCHEKROUN, Benjamin F. GREWE and Thilo STADELMANN, 2024. So you want your private LLM at home? : a survey and benchmark of methods for efficient GPTs. In: 11th IEEE Swiss Conference on Data Science (SDS), Zurich, Switzerland, 30-31 May 2024. Conference paper. ZHAW Zürcher Hochschule für Angewandte Wissenschaften. 31 May 2024
Tuggener, Lukas, Pascal Sager, Yassine Taoudi-Benchekroun, Benjamin F. Grewe, and Thilo Stadelmann. 2024. “So You Want Your Private LLM at Home? : A Survey and Benchmark of Methods for Efficient GPTs.” Conference paper. In 11th IEEE Swiss Conference on Data Science (SDS), Zurich, Switzerland, 30-31 May 2024. ZHAW Zürcher Hochschule für Angewandte Wissenschaften. https://doi.org/10.21256/zhaw-30279.
Tuggener, Lukas, et al. “So You Want Your Private LLM at Home? : A Survey and Benchmark of Methods for Efficient GPTs.” 11th IEEE Swiss Conference on Data Science (SDS), Zurich, Switzerland, 30-31 May 2024, ZHAW Zürcher Hochschule für Angewandte Wissenschaften, 2024, https://doi.org/10.21256/zhaw-30279.

