What is Doubly Robust estimation in offline evaluation?

Updated May 17, 2026

Short answer

Doubly robust estimation combines a model-based reward estimate with importance sampling, and stays unbiased if either of the two components is correctly specified.

Deep explanation

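Doubly Robust (DR) estimators are used in Offline Policy Evaluation (OPE), where a new policy is evaluated using only data logged by a different policy. They combine direct reward modeling with inverse propensity scoring (IPS): the reward model supplies a baseline estimate, and an importance-weighted residual corrects that estimate on the actions that were actually logged. The estimator remains unbiased if either the reward model or the propensity model is correct, provided the logging policy assigns nonzero probability to every action the evaluated policy can take. This tolerance to one misspecified model is why DR is commonly used for recommender systems and ads ranking evaluation.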
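
To make the combination concrete, here is a minimal sketch of a DR value estimate, assuming a contextual-bandit setting with a finite set of actions. The function name, argument names, and array layout are hypothetical choices for illustration, not part of any particular library.

```python
import numpy as np


def doubly_robust_value(rewards, actions, logging_propensities,
                        target_probs, reward_model_preds):
    """Sketch of a doubly robust estimate of a target policy's value.

    All names and shapes are illustrative assumptions:
      rewards              (n,)   observed reward for each logged action
      actions              (n,)   integer index of each logged action
      logging_propensities (n,)   probability the logging policy gave
                                  to the logged action (must be > 0)
      target_probs         (n, k) target policy probabilities per action
      reward_model_preds   (n, k) predicted reward for every action
    """
    rewards = np.asarray(rewards, dtype=float)
    actions = np.asarray(actions, dtype=int)
    logging_propensities = np.asarray(logging_propensities, dtype=float)
    target_probs = np.asarray(target_probs, dtype=float)
    reward_model_preds = np.asarray(reward_model_preds, dtype=float)

    rows = np.arange(len(rewards))

    # Direct-method term: predicted reward averaged over the target policy.
    direct = np.sum(target_probs * reward_model_preds, axis=1)

    # Importance weight of the logged action under the target policy.
    weights = target_probs[rows, actions] / logging_propensities

    # IPS-style correction applied to the reward model's residual.
    residual = rewards - reward_model_preds[rows, actions]
    correction = weights * residual

    return float(np.mean(direct + correction))
```

Two special cases show where the two components come from. If the reward model predicts zero everywhere, the expression reduces to plain IPS. If the reward model matches the observed rewards exactly on the logged actions, the correction vanishes and only the direct-method term remains.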
