How does PyTorch optimizer.step() interact with autograd gradients internally?

Updated May 17, 2026

Short answer

optimizer.step() reads the gradients that loss.backward() stored in each parameter's .grad attribute and updates the parameters in place. It neither reads nor modifies the computation graph.

Deep explanation

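After loss.backward(), gradients are stored in the .grad attributes of leaf tensors (the model parameters). optimizer.step() iterates over its parameter groups, applies the update rule (SGD, Adam, etc.) using each group's hyperparameters, and modifies the parameters in place. The built-in optimizers skip any parameter whose .grad is None and perform the update with gradient tracking disabled, so the update itself is not recorded by autograd. step() never walks the autograd graph; it depends only on the values in .grad. By the time it runs, backward() has normally freed the graph's saved intermediate buffers already (unless retain_graph=True was passed), and step() works the same either way. Because backward() accumulates into .grad rather than overwriting it, call optimizer.zero_grad() before the next backward pass unless you want gradients to add up across iterations.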
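
The behaviour described above can be checked with a small experiment. The sketch below is illustrative, not taken from PyTorch's source: it assumes plain SGD with no momentum or weight decay, and the names w_opt, w_manual, and the toy loss are made up for the example. It compares optimizer.step() against a hand-written in-place update that uses only .grad.

```python
import torch

torch.manual_seed(0)

# Two identical parameters: one updated by torch.optim.SGD, one by hand.
# (w_opt and w_manual are hypothetical names for this example.)
w_opt = torch.nn.Parameter(torch.randn(3))
w_manual = torch.nn.Parameter(w_opt.detach().clone())

lr = 0.1
optimizer = torch.optim.SGD([w_opt], lr=lr)  # plain SGD: no momentum, no weight decay

x = torch.tensor([1.0, 2.0, 3.0])

# backward() fills .grad on the leaf tensors.
((w_opt * x).sum() - 1.0).pow(2).backward()
((w_manual * x).sum() - 1.0).pow(2).backward()

grad_before = w_opt.grad.clone()
ptr_before = w_opt.data_ptr()

# The optimizer only needs the values in .grad.
optimizer.step()

# Hand-written equivalent of one plain SGD step, outside autograd.
with torch.no_grad():
    w_manual -= lr * w_manual.grad

same_update = torch.allclose(w_opt, w_manual)           # same result as the manual rule
grad_unchanged = torch.equal(w_opt.grad, grad_before)   # step() leaves .grad as it was
updated_in_place = w_opt.data_ptr() == ptr_before       # same storage, no new tensor
still_leaf = w_opt.is_leaf and w_opt.grad_fn is None    # update was not recorded by autograd

# Gradients stay in .grad until they are cleared explicitly.
optimizer.zero_grad(set_to_none=True)
grad_cleared = w_opt.grad is None
```

All five flags should come out True: the optimizer's result matches the manual rule, .grad is left untouched by step(), the parameter keeps its storage and its leaf status, and the gradient only disappears once zero_grad() is called.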
