Seven Questions on Attention and the Transformer

Beck Moulton
7 min read · 3 days ago

Introduction

Recently, ChatGPT and other chatbots have pushed large language models (LLMs) into the spotlight, drawing many people from outside the ML and NLP fields to study attention and Transformer models. In this article, we raise several questions about the structure of the Transformer model and dig into the technical theory behind them. This article is aimed at readers who have already read the Transformer paper and have a general understanding of how attention works.

Beck Moulton

Focused on the back-end field, sharing hands-on technical content. Buy me a Coffee if you appreciate my hard work: https://www.buymeacoffee.com/BeckMoulton