Seven Questions About Attention and the Transformer
Introduction
Recently, ChatGPT and other chatbots have pushed large language models (LLMs) into the spotlight, prompting many people outside the ML and NLP communities to study attention and the Transformer. In this article, we pose several questions about the structure of the Transformer model and dig into the technical reasoning behind its design. The intended audience is readers who have already read the Transformer paper and have a general understanding of how attention works.