junior | ChatGPT
What are tokens in ChatGPT and why are they important?
Updated May 15, 2026
Short answer
Tokens are the small units of text (whole words, word fragments, or single characters) that the model reads as input and produces as output.
Deep explanation
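Depending on the encoding, a token can be a whole word, a subword, or a single character. ChatGPT does not operate on raw strings: the text is first split into tokens, and each token is replaced by an integer ID from a fixed vocabulary. Subword tokenization keeps that vocabulary at a manageable size, because rare or unseen words can be built from smaller known pieces instead of each needing its own entry. Each token ID is then mapped to an embedding vector, a learned numeric representation that captures aspects of the token's meaning.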
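To make the text-to-tokens-to-vectors pipeline concrete, here is a minimal sketch in Python. It is not ChatGPT's tokenizer: the vocabulary, the greedy longest-match rule, the UNK_ID fallback, and the 2-dimensional embedding table are all invented for illustration. Production tokenizers learn their vocabulary from data (for example with byte pair encoding) and usually attach the leading space to the following word instead of treating it as its own token.

```python
# Toy illustration only: this vocabulary is hypothetical and far smaller
# than anything a real model uses.
VOCAB = ["token", "ization", "un", "believ", "able", " ", "s"]
TOKEN_TO_ID = {piece: i for i, piece in enumerate(VOCAB)}
UNK_ID = len(VOCAB)  # hypothetical ID for text outside the toy vocabulary


def tokenize(text):
    """Split text with a greedy longest-match rule over VOCAB.

    Characters that match no vocabulary entry become single-character tokens.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        match = None
        # Try the longest candidate first, then shrink it.
        for end in range(len(text), pos, -1):
            if text[pos:end] in TOKEN_TO_ID:
                match = text[pos:end]
                break
        if match is None:
            match = text[pos]
        tokens.append(match)
        pos += len(match)
    return tokens


def encode(text):
    """Map each token to its integer ID (UNK_ID for unknown pieces)."""
    return [TOKEN_TO_ID.get(tok, UNK_ID) for tok in tokenize(text)]


# Made-up embedding table: one small vector per token ID, plus one for UNK_ID.
EMBEDDINGS = [[0.1 * i, -0.1 * i] for i in range(len(VOCAB) + 1)]


def embed(ids):
    """Look up the embedding vector for each token ID."""
    return [EMBEDDINGS[i] for i in ids]


if __name__ == "__main__":
    text = "unbelievable tokens"
    print(tokenize(text))
    print(encode(text))
    print(len(embed(encode(text))), "vectors")
```

With this toy vocabulary, the two-word input "unbelievable tokens" splits into six tokens, which shows why token counts and word counts differ.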
Real-world example
API billing for ChatGPT models is based on the number of tokens processed, counting both the tokens sent in the prompt and the tokens generated in the response.
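The arithmetic behind token-based billing can be sketched as follows. The function name and the prices are hypothetical placeholders, not actual rates. The sketch assumes input and output tokens are priced separately per million tokens, so check the provider's current pricing page for real numbers.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of one request from its token counts.

    Prices are expressed per one million tokens, in any currency.
    """
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical prices: 2.00 per million input tokens,
    # 8.00 per million output tokens.
    cost = estimate_cost(1_000, 500, 2.00, 8.00)
    print(f"Estimated cost: {cost:.4f}")
```

Because cost scales with tokens and not with words or characters, a verbose prompt or a long generated answer raises the bill directly.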
Common mistakes
- Assuming one token always equals one word. A single word may be split into several tokens, and punctuation and spaces are tokenized as well.
Follow-up questions
- Why does tokenization matter for cost?
- What is subword tokenization?