How the "magic constants" (the so-called golden values) for data preprocessing in deep-learning vision were actually computed: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]

======================================================

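The issue https://github.com/pytorch/vision/issues/3657 discusses what the code for the original preprocessing computation looked like.

The code below follows the guess given there at how the original statistics were computed: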

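import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms as T

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
dataset = datasets.ImageNet(".", split="train", transform=transform)

# The original sample size is unknown; 10000 random images is an assumption.
indices = torch.randperm(len(dataset))[:10000].tolist()

means = []
variances = []
for img, _ in Subset(dataset, indices):  # the dataset yields (image, label) pairs
    # Per-channel statistics over the height and width of one image.
    means.append(img.mean(dim=(1, 2)))
    variances.append(img.var(dim=(1, 2)))

# Average the per-image statistics over the sampled images.
mean = torch.stack(means).mean(dim=0)
std = torch.stack(variances).mean(dim=0).sqrt()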
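For context, these statistics are consumed by a per-channel normalization of the form (value - mean) / std, applied to pixel values already scaled to [0, 1]. The following is a minimal standard-library sketch of that arithmetic; the helper name normalize_pixel is hypothetical and is not a torchvision API.

```python
# The published per-channel constants (R, G, B).
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Normalize one RGB pixel whose channels are already scaled to [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, MEAN, STD)]


# A pixel sitting exactly at the channel means maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))  # [0.0, 0.0, 0.0]

# A pure white pixel ends up a little over two standard deviations above zero.
print(normalize_pixel([1.0, 1.0, 1.0]))
```

In torchvision the same operation is normally expressed with T.Normalize(mean, std) placed after T.ToTensor().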

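The reply in that issue indicates that the original values were computed in this general form. Part of the background is in https://github.com/pytorch/vision/pull/1965, which gives a more detailed explanation.

Key point: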

We know that they were calculated them on a random subset of the train split of the ImageNet2012 dataset. Which images were used or even the sample size as well as the used transformation are unfortunately lost.

The author also offers a guess at why the reproduced results differ from the original values:

In #1439 my calculated stds differed significantly from the values we used. This resulted from the fact that I previously used sqrt(mean([var(img) for img in dataset])) while we probably used mean([std(img) for img in dataset]). You can find the script I've used for all calculations here.

The form the author used in the earlier reproduction attempt:

sqrt(mean([var(img) for img in dataset]))

The form that probably produced the original values:

mean([std(img) for img in dataset])
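The two forms are not interchangeable. Because the square root is concave, the mean of per-image standard deviations can never exceed the square root of the mean per-image variance (Jensen's inequality), and the two agree only when every image has the same spread. A minimal sketch using only the Python standard library; the three tiny single-channel "images" are made-up values, not ImageNet data:

```python
import math
import statistics

# Hypothetical single-channel "images" with different amounts of contrast.
images = [
    [0.40, 0.50, 0.60],  # low contrast
    [0.10, 0.50, 0.90],  # high contrast
    [0.45, 0.50, 0.55],  # very low contrast
]

# Per-image sample variance and sample standard deviation.
variances = [statistics.variance(img) for img in images]
stds = [statistics.stdev(img) for img in images]

# Form used in the earlier reproduction: sqrt(mean([var(img) ...])).
sqrt_mean_var = math.sqrt(statistics.mean(variances))

# Form the original values probably used: mean([std(img) ...]).
mean_std = statistics.mean(stds)

print(f"sqrt(mean(var)) = {sqrt_mean_var:.4f}")
print(f"mean(std)       = {mean_std:.4f}")
```

With these toy values the mean(std) form comes out noticeably smaller, which illustrates how switching between the two aggregations alone can shift the reported stds.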

The author then published a new calculation script:

https://gist.github.com/pmeier/f5e05285cd5987027a98854a5d155e27

============================================================

posted on 2021-11-23 10:32 Angry_Panda
