A Hackers' Guide to Language Models

Published 2023/09/24
496,164 views
In this deeply informative video, Jeremy Howard, co-founder of fast.ai and creator of the ULMFiT approach on which all modern language models (LMs) are based, takes you on a comprehensive journey through the fascinating landscape of LMs. Starting with the foundational concepts, Jeremy introduces the architecture and mechanics that make these AI systems tick. He then delves into critical evaluations of GPT-4, illuminates practical uses of language models in code writing and data analysis, and offers hands-on tips for working with the OpenAI API. The video also provides expert guidance on technical topics such as fine-tuning, decoding tokens, and running private instances of GPT models.

As we move further into the intricacies, Jeremy unpacks advanced strategies for model testing and optimization, utilizing tools like GPTQ and Hugging Face Transformers. He also explores the potential of specialized datasets like Orca and Platypus for fine-tuning and discusses cutting-edge trends in Retrieval Augmented Generation and information retrieval. Whether you're new to the field or an established professional, this presentation offers a wealth of insights to help you navigate the ever-evolving world of language models.

(The above summary was, of course, created by an LLM!)

For the notebook used in this talk, see https://github.com/fastai/lm-hackers.

00:00:00 Introduction & Basic Ideas of Language Models
00:18:05 Limitations & Capabilities of GPT-4
00:31:28 AI Applications in Code Writing, Data Analysis & OCR
00:38:50 Practical Tips on Using OpenAI API
00:46:36 Creating a Code Interpreter with Function Calling
00:51:57 Using Local Language Models & GPU Options
00:59:33 Fine-Tuning Models & Decoding Tokens
01:05:37 Testing & Optimizing Models
01:10:32 Retrieval Augmented Generation
01:20:08 Fine-Tuning Models
01:26:00 Running Models on Macs
01:27:42 Llama.cpp & Its Cross-Platform Abilities

This is an extended version of the keynote given at posit::conf(2023). Thanks to @wolpumba4099 for the chapter titles.