
In PyTorch, a model can operate in two main modes:

  • Training mode — activated by calling model.train()

  • Evaluation mode — activated by calling model.eval()

These two modes affect how certain layers behave, not the computation graph or gradients directly.
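Under the hood, both calls do the same simple thing: they flip a boolean `training` flag on the module and, recursively, on all of its submodules. A minimal sketch (the layer sizes are arbitrary):

```python
import torch.nn as nn

# train()/eval() toggle the boolean `training` flag on the module
# and on every submodule, recursively.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.train()
print(model.training, model[1].training)   # True True

model.eval()
print(model.training, model[1].training)   # False False
```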

Here’s the detailed difference 👇


🔹 1. model.train()

This sets the model to training mode.

  • Used during training (i.e., when you call loss.backward() and optimizer.step()).

  • Some layers behave differently in training mode:

| Layer type | Behavior in train() |
| --- | --- |
| Dropout | Randomly zeroes some neurons (according to the dropout probability p). This adds noise to help prevent overfitting. |
| BatchNorm | Uses mini-batch statistics (the current batch's mean and variance) to normalize activations, and updates the running averages. |

So during training, Dropout injects randomness into every forward pass, while BatchNorm both normalizes with the current batch's statistics and updates its internal running averages.
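Both effects are easy to observe directly. A minimal sketch, with an arbitrary seed and tensor shapes chosen purely for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only to make the run reproducible

# Dropout in train mode: a fresh random mask on every forward pass;
# surviving activations are scaled by 1/(1-p) (inverted dropout).
drop = nn.Dropout(p=0.5).train()
x = torch.ones(1, 8)
print(drop(x))   # some elements zeroed, the rest scaled to 2.0
print(drop(x))   # a different random mask this time

# BatchNorm in train mode: normalizes with the batch's own mean/variance
# and nudges running_mean/running_var toward them (default momentum=0.1).
bn = nn.BatchNorm1d(num_features=3).train()
print(bn.running_mean)        # starts at zeros
bn(torch.randn(4, 3))
print(bn.running_mean)        # updated toward the batch mean
```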


🔹 2. model.eval()

This sets the model to evaluation (inference) mode.

  • Used during validation or testing.

  • You typically also wrap evaluation code in torch.no_grad() to save memory and speed up computation. Note that this is separate from eval(): no_grad() disables gradient tracking, while eval() only changes layer behavior.

  • Some layers change behavior:

| Layer type | Behavior in eval() |
| --- | --- |
| Dropout | Disabled (no random dropout; all neurons are active). |
| BatchNorm | Uses the running (moving-average) statistics collected during training, not the current batch's mean/variance. |

This ensures deterministic and consistent outputs during inference.
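A minimal sketch of this determinism, again with an arbitrary model layout:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))
model.eval()                     # Dropout becomes a no-op (identity)

x = torch.ones(1, 8)
with torch.no_grad():            # separately turns off gradient tracking
    y1 = model(x)
    y2 = model(x)

print(torch.equal(y1, y2))       # True — identical outputs on every call
```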



✅ Summary

| Mode | Command | Dropout | BatchNorm | Use case |
| --- | --- | --- | --- | --- |
| Training | model.train() | Active (random neuron drop) | Uses batch stats and updates running stats | When training |
| Evaluation | model.eval() | Inactive | Uses stored running stats | When validating/testing |
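Putting it all together, the typical pattern toggles the mode once per phase of every epoch. A minimal sketch, where model, train_loader, val_loader, criterion, and optimizer are hypothetical placeholders you would define yourself:

```python
import torch

def run_epoch(model, train_loader, val_loader, criterion, optimizer):
    # -- training phase: Dropout active, BatchNorm uses batch stats --
    model.train()
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()

    # -- evaluation phase: Dropout off, BatchNorm uses running stats --
    model.eval()
    total = 0.0
    with torch.no_grad():        # no graph needed when only validating
        for inputs, targets in val_loader:
            total += criterion(model(inputs), targets).item()
    return total / len(val_loader)
```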


posted on 2025-11-11 00:48 by ZhangZhihuiAAA