The advent of large language models has taken the AI world by storm. Beyond proprietary foundation models like GPT-4, open-source models are playing a pivotal role in driving the AI revolution forward, democratizing access for anyone looking to run these models in production. One of the biggest challenges in getting high-quality output from open-source models lies in fine-tuning, where we improve their outputs using a dataset of instructions.
In this session, we take a step-by-step approach to fine-tuning a Llama 2 model on a custom dataset. First, we build our own dataset, removing duplicate samples and analyzing token counts. Then, we fine-tune the Llama 2 model using state-of-the-art techniques from the Axolotl library. Finally, we see how to run our fine-tuned model and evaluate its performance.
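The dataset-preparation step described above (deduplication and token analysis) can be sketched roughly as follows. The sample strings are illustrative, and whitespace splitting stands in for a real tokenizer such as Llama 2's; the session itself covers the full pipeline:

```python
# Minimal sketch of dataset preparation: exact-match deduplication
# followed by per-sample token counting.
# NOTE: the samples below are made up for illustration, and the
# whitespace split is a stand-in for a real tokenizer.

samples = [
    "Explain gradient descent.",
    "Explain gradient descent.",   # exact duplicate, will be dropped
    "What is a transformer?",
]

# Remove exact duplicates while preserving the original order.
seen = set()
deduped = [s for s in samples if not (s in seen or seen.add(s))]

# Count tokens per sample (replace .split() with a real tokenizer
# in practice to match the model's vocabulary).
token_counts = [len(s.split()) for s in deduped]

print(deduped)        # ['Explain gradient descent.', 'What is a transformer?']
print(token_counts)   # [3, 4]
```

In a real pipeline you would also consider near-duplicate detection (e.g. hashing normalized text) and use the token counts to filter samples that exceed the model's context length.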
Maxime Labonne is a Senior Machine Learning Scientist at JPMorgan and holds a Ph.D. in machine learning from the Polytechnic Institute of Paris. Since 2019, he has worked with Large Language Models and Graph Neural Networks, applying his expertise across R&D, industry, finance, and academia. Beyond his practical work, Maxime is a prolific writer, contributing technical articles on machine learning and data science to his blog as well as Towards Data Science. He is also the author of "Hands-On Graph Neural Networks Using Python," published by Packt.