What is Q-Learning in reinforcement learning?

Updated May 17, 2026

Short answer

Q-Learning is a model-free, value-based reinforcement learning algorithm that learns the value of each action in each state and obtains an optimal policy by choosing the highest-valued action.

Deep explanation

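Q-Learning estimates the optimal action-value function Q(s,a), which represents the expected cumulative discounted reward of taking action a in state s and following the optimal policy afterward. After each observed transition, it moves its estimate toward a target given by the Bellman optimality equation: the immediate reward plus the discounted value of the best action in the next state. Because it learns from sampled transitions alone, it does not require a model of the environment.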
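
For reference, the update described above is usually written as follows. The symbols are standard notation rather than taken from this page: α is an assumed learning rate, γ an assumed discount factor, r_{t+1} the reward observed after taking a_t in s_t, and s_{t+1} the resulting state.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

The bracketed term is the temporal-difference error: the gap between the current estimate and the one-step target. The max over a' is what makes the method learn about the greedy policy regardless of which action the agent actually takes next.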

Real-world example

Q-Learning is used in game AI, where an agent learns which moves are best by playing repeatedly, relying only on the rewards it observes rather than on prior knowledge of the rules.
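
To make the idea of learning by repeated play concrete, here is a minimal tabular sketch on a made-up corridor game. Everything in it is an illustrative assumption: the five-state corridor, the `step` and `train` functions, and the hyperparameter values (`alpha`, `gamma`, `epsilon`) are hypothetical and not drawn from any particular library or game.

```python
import random

N_STATES = 5      # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right

def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the two values tie;
            # otherwise take the action with the higher current estimate.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # One-step target: reward plus discounted best next value
            # (no bootstrapping from a terminal state).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

The agent never sees the rules inside `step`; it only observes the transitions and rewards that come back. In this toy setup the learned greedy policy is to step right in every state, and the learned values grow as the state gets closer to the goal.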

Common mistakes

  • Confusing Q-Learning with policy-based methods or assuming it requires environment modeling.

Follow-up questions

  • What does 'model-free' mean?
  • What is the Q-function?
