From the course: Applied AI: Getting Started with Hugging Face Transformers
Transformer training and inference
- [Instructor] Training a transformer follows a process similar to training any other deep learning model. We will briefly discuss these steps in this video. The first step in training a transformer is creating the transformer architecture. This requires decisions on a variety of parameters and hyperparameters: the number of encoder and decoder layers, the number of attention heads, the feed-forward network architecture, and the normalization techniques are some of the key decision points. Then, we initialize the weights and other parameters. Please note that there are weights in both the attention block and the feed-forward network block, and this holds across multiple layers of the encoder and decoder stack. Then, we pass the training data through the encoder-decoder pipeline and predict the output. The output is then compared with the true labels and the cost is determined. The cost is then used to update the weights across the…
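The steps described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the course's own code: the hyperparameter values, vocabulary size, and random toy batch are all placeholders chosen for demonstration.

```python
import torch
import torch.nn as nn

# Hypothetical hyperparameters, chosen only for illustration.
d_model, vocab_size = 64, 100

# Step 1: create the architecture. The encoder/decoder depth,
# attention heads, and feed-forward width are key design decisions.
model = nn.Transformer(
    d_model=d_model,
    nhead=4,                 # number of attention heads
    num_encoder_layers=2,    # depth of the encoder stack
    num_decoder_layers=2,    # depth of the decoder stack
    dim_feedforward=128,     # feed-forward network width
    batch_first=True,
)
embed = nn.Embedding(vocab_size, d_model)
head = nn.Linear(d_model, vocab_size)  # projects back to the vocabulary

# Step 2: weights in the attention and feed-forward blocks (across
# all layers) are initialized when the modules are constructed.
params = (list(model.parameters()) + list(embed.parameters())
          + list(head.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy batch: random token ids standing in for real training data.
src = torch.randint(0, vocab_size, (8, 10))  # (batch, source length)
tgt = torch.randint(0, vocab_size, (8, 9))   # (batch, target length)

# Step 3: pass the data through the encoder-decoder pipeline.
out = model(embed(src), embed(tgt))          # (batch, tgt_len, d_model)
logits = head(out)

# Step 4: compare predictions with the true labels to get the cost.
loss = loss_fn(logits.reshape(-1, vocab_size), tgt.reshape(-1))

# Step 5: use the cost to update the weights across all layers.
loss.backward()
optimizer.step()
```

In practice, steps 3 through 5 repeat over many batches and epochs until the cost converges; libraries such as Hugging Face Transformers wrap this loop for you.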