How the "magic constants" of vision data preprocessing in deep learning were computed: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]

======================================================
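For context, these constants are normally applied as a per-channel normalization of RGB values that have already been scaled to [0, 1]. The following is a minimal pure-Python sketch of that arithmetic. The helper names `normalize_pixel` and `denormalize_pixel` are made up for illustration; torchvision's `transforms.Normalize(mean, std)` applies the same per-channel formula to whole tensors.

```python
# ImageNet channel statistics quoted in the title (RGB order, inputs in [0, 1]).
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Hypothetical helper: apply (x - mean) / std to one RGB pixel."""
    return [(x - m) / s for x, m, s in zip(rgb, MEAN, STD)]


def denormalize_pixel(rgb):
    """Inverse of normalize_pixel: x * std + mean."""
    return [x * s + m for x, m, s in zip(rgb, MEAN, STD)]


# A pixel equal to the channel means maps to zero in every channel.
centered = normalize_pixel(MEAN)
# Round trip for an arbitrary mid-gray pixel.
restored = denormalize_pixel(normalize_pixel([0.5, 0.5, 0.5]))
```

Because the operation is a simple affine map per channel, slightly different constants change the network input only slightly, which is why the exact provenance of the values matters mostly for reproducibility.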

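The discussion in https://github.com/pytorch/vision/issues/3657 covers the code that was originally used to compute these values.

It gives the following code, which is a guess at the original preprocessing and calculation: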

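import torch
from torchvision import datasets, transforms as T

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
dataset = datasets.ImageNet(".", split="train", transform=transform)

# Random subset of the train split. The original sample size is unknown,
# so 1000 is an arbitrary choice.
indices = torch.randperm(len(dataset))[:1000].tolist()

means = []
variances = []
for i in indices:
    img, _ = dataset[i]  # the dataset yields (image, label) pairs
    # Per-channel statistics over the spatial dimensions (H, W).
    means.append(img.mean(dim=(1, 2)))
    variances.append(img.var(dim=(1, 2)))

# Average the per-image statistics; std is the root of the mean variance.
mean = torch.stack(means).mean(dim=0)
std = torch.stack(variances).mean(dim=0).sqrt()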

The reply indicates that the original calculation took this form. https://github.com/pytorch/vision/pull/1965 explains it in more detail.

Key point:

We know that they were calculated them on a random subset of the train split of the ImageNet2012 dataset. Which images were used or even the sample size as well as the used transformation are unfortunately lost.

The author also offers a guess at why the reproduced values differ from the original ones:

In #1439 my calculated stds differed significantly from the values we used. This resulted from the fact that I previously used sqrt(mean([var(img) for img in dataset])) while we probably used mean([std(img) for img in dataset]). You can find the script I've used for all calculations here.

 

 

In the earlier reproduction attempt, the author computed the std as:

sqrt(mean([var(img) for img in dataset]))

 

whereas the original values were probably computed as:

mean([std(img) for img in dataset])
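The two formulas generally give different numbers. Because the square root is concave, the mean of the per-image standard deviations can never exceed the square root of the mean per-image variance, and the two agree only when every image has the same variance. A small sketch with synthetic single-channel "images" shows the gap. The data are lists of random floats, not ImageNet, and the helper name `make_image`, the spreads, and the pixel count are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable


def make_image(spread, n_pixels=256):
    """Synthetic stand-in for one image channel: uniform noise around 0.5."""
    return [0.5 + random.uniform(-spread, spread) for _ in range(n_pixels)]


# Images with very different contrast, so their variances differ a lot.
images = [make_image(spread) for spread in (0.05, 0.1, 0.2, 0.4)]

# Earlier reproduction: sqrt(mean([var(img) for img in dataset]))
sqrt_mean_var = math.sqrt(
    statistics.mean(statistics.variance(img) for img in images)
)

# Probable original: mean([std(img) for img in dataset])
mean_std = statistics.mean(statistics.stdev(img) for img in images)

print(f"sqrt(mean(var)) = {sqrt_mean_var:.4f}")
print(f"mean(std)       = {mean_std:.4f}")
```

The direction of the inequality holds for any dataset; the size of the gap depends on how much the per-image variances differ from one another.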

 

 

The author then published a new calculation script:

https://gist.github.com/pmeier/f5e05285cd5987027a98854a5d155e27

============================================================

 

Posted on 2021-11-23 10:32 by Angry_Panda
