What is self-attention in LLMs?

Updated May 16, 2026

Short answer

Self-attention lets each token in a sequence weigh every token in that sequence, itself included, when building its own representation, so the result reflects the surrounding context.

Deep explanation

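Self-attention relates every token to every other token in the input. Each token's embedding is projected into three vectors: a query, a key, and a value. The attention score between two tokens is the dot product of one token's query with the other's key, scaled by the square root of the key dimension. A softmax over each token's scores turns them into weights that sum to 1, and the token's output is the weighted sum of all value vectors. Because these weights depend on the actual tokens present, the same word gets a different, context-aware representation in different sentences.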
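The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular library implements it: it assumes a single attention head, no causal mask, and no batching, and the names `self_attention`, `matmul`, `w_q`, `w_k`, and `w_v` are hypothetical. The toy input uses identity projections so the queries, keys, and values all equal the input embeddings.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (no mask, no batch).

    x is n_tokens x d_model; w_q, w_k, w_v are d_model x d_head
    projection matrices (hypothetical, normally learned in training).
    Returns (outputs, weights).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_head = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    # scores[i][j]: how well token i's query matches token j's key
    scores = matmul(q, k_t)
    # Scale by sqrt(d_head), then normalize each row with softmax
    weights = [softmax([s / math.sqrt(d_head) for s in row])
               for row in scores]
    # Each output row is a weighted sum of the value vectors
    outputs = matmul(weights, v)
    return outputs, weights

# Toy input: three tokens with 2-dimensional embeddings (made-up values)
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = self_attention(x, identity, identity, identity)
```

Row `i` of `weights` shows how much token `i` draws from each token in the sequence, and row `i` of `outputs` is its new context-aware vector. Decoder-style LLMs additionally mask out future positions before the softmax, which this sketch omits.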

Real-world example

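In 'The cat sat on the mat', when the model builds its representation of 'sat', it can place a high attention weight on 'cat', so the verb's representation carries information about its subject.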

Common mistakes

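  • Confusing attention with memory. Attention only relates tokens inside the current context window; it does not store information across separate inputs.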

Follow-up questions

  • What are Q, K, V matrices?
  • What is multi-head attention?
