The grid_sample() function and bilinear sampling

1. Interface of the grid_sample function

torch.nn.functional.grid_sample(input, grid, mode='bilinear', padding_mode='zeros', align_corners=None)

The official documentation describes what the function does as follows:

Given an input and a flow-field grid, computes the output using input values and pixel locations from grid.

The shapes of input, grid, and output are:

input: (N, C, $H_{in}$, $W_{in}$)

grid : (N, $H_{out}$, $W_{out}$, 2)

output: (N, C, $H_{out}$, $W_{out}$)
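To make the shape rule concrete, here is a minimal sketch in plain Python. grid_sample_output_shape is a hypothetical helper written for this post, not a PyTorch function; it only encodes the relationship listed above: the batch size and channel count come from input, the spatial size comes from grid.

```python
def grid_sample_output_shape(input_shape, grid_shape):
    """Hypothetical helper (not part of PyTorch): shape rule for 4-D grid_sample."""
    n, c, h_in, w_in = input_shape
    n_grid, h_out, w_out, last = grid_shape
    if n_grid != n:
        raise ValueError("input and grid must share the batch size N")
    if last != 2:
        raise ValueError("the last grid dimension must hold an (x, y) pair")
    # Channels come from input; the spatial size comes from grid.
    return (n, c, h_out, w_out)


output_shape = grid_sample_output_shape((8, 3, 32, 48), (8, 16, 20, 2))
print(output_shape)  # (8, 3, 16, 20)
```

Note that $H_{in}$ and $W_{in}$ do not appear in the result: the output can be larger or smaller than the input, depending only on grid.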

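2. Converting grid coordinates to input coordinates

Each position in grid supplies a coordinate, which is converted to a pixel location in input; the pixel value found there is written to the same position of the output. In other words, output[n, :, h, w] is input sampled at the location stored in grid[n, h, w]. The key to the process is grid: its last dimension has size 2 and holds the location (x, y) of a pixel in input, with x and y normally normalized to the range [-1, 1]. How do these values map onto input? Let's look at PyTorch's underlying source code.
Lines 66 to 71 of the source read x and y from grid and then transform them into the coordinate system of input. IW and IH are the width and height of input. This particular mapping sends -1 and 1 to the centers of the first and last pixels, which is the behavior selected by align_corners=True.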

real ix = THTensor_fastGet4d(grid, n, h, w, 0);
real iy = THTensor_fastGet4d(grid, n, h, w, 1);

// normalize ix from [-1, 1] to [0, IW-1] and iy from [-1, 1] to [0, IH-1]
ix = ((ix + 1) / 2) * (IW-1);
iy = ((iy + 1) / 2) * (IH-1);
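The sketch below puts the mapping from the quoted source next to the align_corners=False variant. The helper name unnormalize is my own, and the align_corners=False formula is written from my understanding of how current PyTorch versions treat -1 and 1 as the outer edges of the border pixels, so treat it as an illustration rather than the library's code.

```python
def unnormalize(coord, size, align_corners):
    """Map a normalized coordinate in [-1, 1] to a pixel coordinate along an axis of length `size`."""
    if align_corners:
        # -1 and +1 refer to the centers of the first and last pixels.
        return (coord + 1) / 2 * (size - 1)
    # -1 and +1 refer to the outer edges of the first and last pixels.
    return ((coord + 1) * size - 1) / 2


IW = 5
for x in (-1.0, 0.0, 1.0):
    print(x, unnormalize(x, IW, True), unnormalize(x, IW, False))
# -1.0 0.0 -0.5
# 0.0 2.0 2.0
# 1.0 4.0 4.5
```

Both variants agree at the center of the image and differ by half a pixel at the extremes, which is why the same grid can give slightly different results depending on align_corners.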

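3. Linear interpolation

After the conversion, ix and iy may be fractional, and they may also fall outside input (for example when grid contains values outside [-1, 1]). Out-of-bound locations are handled according to the padding_mode parameter:

padding_mode='zeros': out-of-bound locations are filled with a pixel value of 0.
padding_mode='border': out-of-bound locations are filled with the pixel value at the border.
padding_mode='reflection': out-of-bound locations are filled with the value at the location mirrored about the border.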
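The three padding modes can be sketched on a single row of pixels. This is a simplified illustration on integer indices: resolve_index and read are hypothetical helpers, and the reflection here mirrors about the centers of the first and last pixels, which is an assumption matching the align_corners=True convention. As far as I know, PyTorch itself applies these rules to the continuous coordinates before interpolating.

```python
def resolve_index(i, size, padding_mode):
    """Return the in-bounds index to read for integer index i, or None when the value should be 0."""
    if 0 <= i < size:
        return i
    if padding_mode == "zeros":
        return None
    if padding_mode == "border":
        # Clamp to the nearest valid index.
        return min(max(i, 0), size - 1)
    if padding_mode == "reflection":
        if size == 1:
            return 0
        # Mirror about the centers of the first and last pixels.
        period = 2 * (size - 1)
        i %= period  # Python's modulo is non-negative for a positive period
        return i if i < size else period - i
    raise ValueError(f"unknown padding_mode: {padding_mode!r}")


def read(row, i, padding_mode):
    """Read row[i], applying the padding rule when i is out of bounds."""
    j = resolve_index(i, len(row), padding_mode)
    return 0 if j is None else row[j]


row = [10, 20, 30, 40]
indices = (-2, -1, 0, 3, 4, 5)
results = {m: [read(row, i, m) for i in indices] for m in ("zeros", "border", "reflection")}
for mode_name, values in results.items():
    print(mode_name, values)
# zeros [0, 0, 10, 40, 0, 0]
# border [10, 10, 10, 40, 40, 40]
# reflection [30, 20, 10, 40, 30, 20]
```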

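The mode parameter defines how pixel values are interpolated at the specified location in input. Why is interpolation needed? As noted above, x and y in grid are floating-point values in [-1, 1], so after conversion they generally land between pixels, and we have to sample the pixel values of input at a floating-point coordinate. The two modes covered here are nearest and bilinear. nearest fills the output with the value of the pixel closest to (x, y), while bilinear uses bilinear interpolation. In short, nearest considers only the closest pixel value, whereas bilinear fills the output with a weighted average of the four pixel values surrounding (x, y), where a pixel's weight grows as the sample point gets closer to it.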
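The difference between the two modes can be seen in a small pure-Python sketch. The function sample is a hypothetical helper that works on a single-channel image stored as a list of rows, takes coordinates already converted to pixel units (ix, iy), and assumes zero padding for out-of-bound neighbors; it illustrates the idea and is not PyTorch's implementation.

```python
import math


def sample(image, ix, iy, mode="bilinear"):
    """Sample a 2-D list-of-rows `image` at pixel coordinates (ix, iy); out-of-bound reads give 0."""
    ih, iw = len(image), len(image[0])

    def pixel(x, y):
        return image[y][x] if 0 <= x < iw and 0 <= y < ih else 0.0

    if mode == "nearest":
        # Python's round() rounds halves to the nearest even integer.
        return pixel(round(ix), round(iy))

    # The four neighbors surrounding (ix, iy).
    x0, y0 = math.floor(ix), math.floor(iy)
    x1, y1 = x0 + 1, y0 + 1
    # Fractional offsets: the closer the point is to a neighbor, the larger that neighbor's weight.
    wx, wy = ix - x0, iy - y0
    return ((1 - wx) * (1 - wy) * pixel(x0, y0)
            + wx * (1 - wy) * pixel(x1, y0)
            + (1 - wx) * wy * pixel(x0, y1)
            + wx * wy * pixel(x1, y1))


image = [[0, 10],
         [20, 30]]
print(sample(image, 0.5, 0.5))             # 15.0, the average of all four pixels
print(sample(image, 0.25, 0.0))            # 2.5, three quarters of the way toward pixel (0, 0)
print(sample(image, 0.6, 0.2, "nearest"))  # 10, the value of pixel (1, 0)
```

The four weights always sum to 1, so sampling exactly at a pixel center returns that pixel's value unchanged.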

References

https://zhuanlan.zhihu.com/p/112030273
https://blog.csdn.net/qq_34914551/article/details/107559031

posted @ 2022-06-27 22:35 qufang
