Efficient Fine-Tuning for Llama-v2-7b on a Single GPU
The first problem you’re likely to encounter when fine-tuning an LLM is the “host out of memory” error. It’s even more of a challenge when fine-tuning the 7B-parameter Llama-2 model, which requires more memory than smaller models. In this talk, Piero Molino and Travis Addair from the open-source Ludwig project show you how to tackle this problem.
The good news is that, with an optimized LLM training framework like Ludwig, you can bring the host memory overhead back down to a reasonable level, even when training on multiple GPUs.
In this hands-on workshop, we’ll discuss the unique challenges of fine-tuning LLMs and show you how to tackle them with open-source tools through a demo.
By the end of this session, attendees will understand:
- How to fine-tune LLMs like Llama-2-7b on a single GPU
- Techniques like parameter-efficient fine-tuning and quantization, and how they can help
- How to train a 7B-parameter model on a single T4 GPU using QLoRA (see the sketch after this list)
- How to deploy fine-tuned models
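
To give a concrete sense of how these pieces fit together, here is a minimal sketch of QLoRA-style fine-tuning with Ludwig. It assumes Ludwig 0.8 or later; the dataset file, prompt/output column names, and hyperparameters are illustrative placeholders, not values from the talk.

```python
# Minimal sketch: QLoRA-style fine-tuning of Llama-2-7b with Ludwig.
# Assumes Ludwig 0.8+ (pip install "ludwig[llm]") and access to the
# meta-llama/Llama-2-7b-hf weights on Hugging Face.
from ludwig.api import LudwigModel

config = {
    "model_type": "llm",
    "base_model": "meta-llama/Llama-2-7b-hf",
    # 4-bit quantization keeps the frozen base model small enough for a T4.
    "quantization": {"bits": 4},
    # LoRA trains a small set of adapter weights instead of all 7B parameters.
    "adapter": {"type": "lora"},
    # Column names below are placeholders for your own dataset's fields.
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "output", "type": "text"}],
    "trainer": {
        "type": "finetune",
        "batch_size": 1,
        # Accumulate gradients to simulate a larger effective batch size.
        "gradient_accumulation_steps": 16,
        "learning_rate": 1e-4,
        "epochs": 3,
    },
}

model = LudwigModel(config=config)
results = model.train(dataset="my_dataset.csv")  # hypothetical dataset file
```

Once training finishes, predictions can be generated from the same object with `model.predict()`, and a saved model can be exposed behind a REST endpoint with Ludwig’s `ludwig serve` command for deployment.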