GPT in PyTorch

In this video, we are going to implement the GPT-2 model from scratch. We are only going to focus on inference, not on the training logic. We will cover concepts like self-attention, decoder blocks, and generating new tokens.

Paper:
Code minGPT:
Code transformers: #L946
Code from the video:

00:00 Intro
01:32 Overview: Main goal [slides]
02:06 Overview: Forward pass [slides]
03:39 Overview: GPT module (part 1) [slides]
04:28 Overview: GPT module (part 2) [slides]
05:25 Overview: Decoder block [slides]
06:10 Overview: Masked self-attention [slides]
07:52 Decoder module [code]
13:40 GPT module [code]
18:19 Copying a tensor [code]
19:26 Copying a Decoder module [code]
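To give a feel for the masked self-attention idea covered in the video: in a decoder, token t may only attend to positions 0..t, never to future tokens. Below is a minimal, dependency-free sketch of that causal masking on toy vectors (the video itself uses PyTorch; the function names and the per-token list representation here are illustrative assumptions, not the video's code).

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of floats
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def masked_self_attention(q, k, v):
    """Causal (masked) scaled dot-product attention on toy data.

    q, k, v: lists of T vectors (each a list of d floats).
    Token t only sees keys/values at positions <= t -- this is the
    "mask" that makes a GPT-style decoder autoregressive.
    """
    T = len(q)
    d = len(q[0])
    out = []
    for t in range(T):
        # scores against the visible prefix 0..t only
        scores = [
            sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        w = softmax(scores)  # attention weights over the prefix
        out.append([sum(w[s] * v[s][j] for s in range(t + 1)) for j in range(d)])
    return out
```

Note that the first token can only attend to itself, so its output is exactly its own value vector; in the real model the same masking is done in parallel by adding -inf to the upper triangle of the score matrix before the softmax.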