From the course: Applied AI: Getting Started with Hugging Face Transformers


The GPT Transformer

- The GPT Transformer is another popular transformer implementation, used in a variety of text generation tasks and scenarios. GPT stands for Generative Pre-trained Transformer; it was created by OpenAI. GPT generates new text sequences from an initial prompt, which makes it popular for natural language generation as well as tasks like question answering. The GPT architecture uses only the decoder side of the transformer, with the masked self-attention we discussed earlier in the course; there is no encoder hidden-state input. The model predicts the next word in a sequence, so in NLG tasks the generated output can be iteratively fed back as input to produce longer token sequences. Using transfer learning, the foundational GPT model can be adapted to a variety of NLP tasks. There are many variants of GPT. They are essentially improvements over the previous…
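The two ideas above, masked self-attention and feeding generated output back as input, can be sketched in plain Python. This is a toy illustration, not the real GPT network: the `next_token` scorer below is a hypothetical stand-in that just cycles through a tiny vocabulary. The point is the shape of the loop: a causal mask lets each position attend only to earlier positions, and each predicted token is appended to the context and fed back for the next step.

```python
# Toy sketch of GPT-style decoding (hypothetical stand-in scorer,
# NOT the actual model or the Hugging Face API).

def causal_mask(n):
    """Masked self-attention pattern for a decoder-only transformer:
    position i may attend to positions 0..i, never to later ones."""
    return [[j <= i for j in range(n)] for i in range(n)]

def next_token(context):
    """Stand-in for the GPT decoder: picks the next token using only
    the tokens generated so far. A real model would score the full
    vocabulary; here we just cycle through a tiny one for illustration."""
    vocab = ["the", "cat", "sat", "on", "mat"]
    return vocab[len(context) % len(vocab)]

def generate(prompt_tokens, max_new_tokens):
    """Autoregressive loop: each output token is appended to the
    context and fed back as input for the next prediction."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        tokens.append(next_token(tokens))
    return tokens

print(causal_mask(3))        # lower-triangular attention pattern
print(generate(["the"], 4))  # → ['the', 'cat', 'sat', 'on', 'mat']
```

With the real library, the same loop is hidden behind a one-liner such as `pipeline("text-generation", model="gpt2")`, which downloads a pre-trained GPT-2 and handles tokenization and decoding for you.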