Attention Mechanism: The Core of Transformer Models
In this blog post, I will focus on the core principles of transformer models, specifically the self-attention mechanism. To keep the discussion straightforward, I will approach the concepts from the perspective of decoder-only models like GPT.