What is the SARSA algorithm?

Updated May 17, 2026

Short answer

SARSA (State-Action-Reward-State-Action) is an on-policy temporal-difference control algorithm, closely related to Q-learning.

Deep explanation

SARSA updates Q(s, a) toward r + γ·Q(s', a'), where a' is the action the current policy actually takes in the next state s', not the greedy (highest-valued) action that Q-learning assumes.
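
The update can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the tabular Q stored as a dictionary, the epsilon-greedy policy, and the names `epsilon_greedy`, `sarsa_update`, `alpha`, and `gamma` are all assumptions made for the example.

```python
import random
from collections import defaultdict


def epsilon_greedy(Q, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [Q[(state, a)] for a in range(n_actions)]
    return values.index(max(values))


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    """Apply one SARSA step and return the new Q(s, a).

    The target bootstraps from Q(s_next, a_next), where a_next is the action
    the behaviour policy actually selected, not the best available action.
    """
    target = r if done else r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]


# Hypothetical two-action example: the policy explores and picks action 0
# in the next state even though action 1 currently has the higher value.
Q = defaultdict(float)
Q[(1, 0)] = 2.0
Q[(1, 1)] = 10.0
new_value = sarsa_update(Q, s=0, a=0, r=1.0, s_next=1, a_next=0, alpha=0.5, gamma=0.9)
```

In a full training loop, `a_next` would be chosen with `epsilon_greedy` before the update and then reused as the action executed on the next step, which is what makes the method on-policy.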

Real-world example

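SARSA suits settings where mistakes made while exploring are costly, such as robotics, because its value estimates account for the exploratory actions the policy will actually take.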

Common mistakes

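  • Confusing SARSA with off-policy Q-learning: SARSA bootstraps from the next action the policy actually takes, while Q-learning bootstraps from the maximum Q-value over the next actions.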
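
The distinction comes down to the bootstrap target, which a small side-by-side sketch may make concrete. The function names and the numbers below are made up for illustration; `q_next` is assumed to hold the current Q-value estimates for each action in the next state.

```python
def sarsa_target(q_next, a_next, r, gamma):
    """On-policy target: uses the value of the action actually chosen next."""
    return r + gamma * q_next[a_next]


def q_learning_target(q_next, r, gamma):
    """Off-policy target: uses the best next-action value, whatever is chosen."""
    return r + gamma * max(q_next)


# Hypothetical next-state estimates for two actions.
q_next = [2.0, 10.0]
reward, gamma = 1.0, 0.9

# Suppose exploration picks action 0 in the next state.
on_policy = sarsa_target(q_next, a_next=0, r=reward, gamma=gamma)    # about 2.8
off_policy = q_learning_target(q_next, r=reward, gamma=gamma)        # about 10.0
```

When the chosen next action happens to be the greedy one, the two targets coincide; they differ only on exploratory steps.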

Follow-up questions

  • Why is SARSA safer?
  • What is on-policy learning?
