
In PyTorch, a model can operate in two main modes:

  • Training mode — activated by calling model.train()

  • Evaluation mode — activated by calling model.eval()

These two modes affect how certain layers behave, not the computation graph or gradients directly.
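Under the hood, both calls do the same simple thing: they flip a boolean `training` flag on the module and, recursively, on all of its submodules. A minimal sketch (the layer sizes are arbitrary):

```python
import torch.nn as nn

# train()/eval() toggle the boolean `training` flag on the module
# and on every submodule, recursively.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.train()
print(model.training, model[1].training)   # True True

model.eval()
print(model.training, model[1].training)   # False False
```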

Here’s the detailed difference 👇


🔹 1. model.train()

This sets the model to training mode.

  • Used during training (i.e., when you call loss.backward() and optimizer.step()).

  • Some layers behave differently in training mode:

| Layer type | Behavior in train() |
| --- | --- |
| Dropout | Randomly zeroes some neurons (according to the dropout probability p). This adds noise to help prevent overfitting. |
| BatchNorm | Uses mini-batch statistics (the current batch's mean and variance) to normalize activations, and updates the running averages. |

So during training, Dropout injects randomness into every forward pass, while BatchNorm both normalizes with the current batch's statistics and updates its internal running averages.
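Both effects are easy to observe directly. A minimal sketch, with an arbitrary seed and tensor shapes chosen purely for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only to make the run reproducible

# Dropout in train mode: a fresh random mask on every forward pass;
# surviving activations are scaled by 1/(1-p) (inverted dropout).
drop = nn.Dropout(p=0.5).train()
x = torch.ones(1, 8)
print(drop(x))   # some elements zeroed, the rest scaled to 2.0
print(drop(x))   # a different random mask this time

# BatchNorm in train mode: normalizes with the batch's own mean/variance
# and nudges running_mean/running_var toward them (default momentum=0.1).
bn = nn.BatchNorm1d(num_features=3).train()
print(bn.running_mean)        # starts at zeros
bn(torch.randn(4, 3))
print(bn.running_mean)        # updated toward the batch mean
```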


🔹 2. model.eval()

This sets the model to evaluation (inference) mode.

  • Used during validation or testing.

  • You typically also wrap evaluation code in torch.no_grad() to save memory and speed up computation. Note that this is separate from eval(): no_grad() disables gradient tracking, while eval() only changes layer behavior.

  • Some layers change behavior:

| Layer type | Behavior in eval() |
| --- | --- |
| Dropout | Disabled (no random dropout; all neurons are active). |
| BatchNorm | Uses the running (moving-average) statistics collected during training, not the current batch's mean/variance. |

This ensures deterministic and consistent outputs during inference.
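A minimal sketch of this determinism, again with an arbitrary model layout:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))
model.eval()                     # Dropout becomes a no-op (identity)

x = torch.ones(1, 8)
with torch.no_grad():            # separately turns off gradient tracking
    y1 = model(x)
    y2 = model(x)

print(torch.equal(y1, y2))       # True — identical outputs on every call
```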



✅ Summary

| Mode | Command | Dropout | BatchNorm | Use case |
| --- | --- | --- | --- | --- |
| Training | model.train() | Active (random neuron drop) | Uses batch stats and updates running stats | When training |
| Evaluation | model.eval() | Inactive | Uses stored running stats | When validating/testing |
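Putting it all together, the typical pattern toggles the mode once per phase of every epoch. A minimal sketch, where model, train_loader, val_loader, criterion, and optimizer are hypothetical placeholders you would define yourself:

```python
import torch

def run_epoch(model, train_loader, val_loader, criterion, optimizer):
    # -- training phase: Dropout active, BatchNorm uses batch stats --
    model.train()
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()

    # -- evaluation phase: Dropout off, BatchNorm uses running stats --
    model.eval()
    total = 0.0
    with torch.no_grad():        # no graph needed when only validating
        for inputs, targets in val_loader:
            total += criterion(model(inputs), targets).item()
    return total / len(val_loader)
```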


posted on 2025-11-11 00:48 by ZhangZhihuiAAA