A Brief Look at the InversionNet Network

Note: to understand how FWI network models of this kind are constructed, I decided to study in detail one of the earlier models in this field, hoping that through this single network we can generalize how models for this class of problems are built.

First, here is the source code for the model: https://github.com/lanl/OpenFWI (there is a lot to say about OpenFWI itself, and I may come back to it later; today's goal is simply to see how this model is constructed). The official datasets are at: https://openfwi-lanl.github.io/docs/data.html#vel

(I am learning these models from scratch, so some things here may still be unclear; please point out any mistakes :))

Let's read it piece by piece:

from collections import OrderedDict
import torch.nn as nn

NORM_LAYERS = { 'bn': nn.BatchNorm2d, 'in': nn.InstanceNorm2d, 'ln': nn.LayerNorm }

# Replace the key names in the checkpoint in which legacy network building blocks are used 
def replace_legacy(old_dict):
    li = []
    for k, v in old_dict.items():
        k = (k.replace('Conv2DwithBN', 'layers')
              .replace('Conv2DwithBN_Tanh', 'layers')
              .replace('Deconv2DwithBN', 'layers')
              .replace('ResizeConv2DwithBN', 'layers'))
        li.append((k, v))
    return OrderedDict(li)
  • NORM_LAYERS: a mapping from short names to normalization layers.
    • Roughly speaking, normalization helps prevent exploding gradients and speeds up training.
    • bn: batch normalization; in: instance normalization; ln: layer normalization
  • replace_legacy(): this mainly performs renaming, unifying the names of the layers we need to handle so that old checkpoints load into the new module names.
    • An example (courtesy of ChatGPT) makes this easier to understand:
class Conv2DwithBN(nn.Module):
    def __init__(self, in_fea, out_fea, 
                kernel_size=3, stride=1, padding=1,
                bn=True, relu_slop=0.2, dropout=None):
        super(Conv2DwithBN,self).__init__()
        layers = [nn.Conv2d(in_channels=in_fea, out_channels=out_fea, kernel_size=kernel_size, stride=stride, padding=padding)]
        if bn:
            layers.append(nn.BatchNorm2d(num_features=out_fea))
        layers.append(nn.LeakyReLU(relu_slop, inplace=True))
        if dropout:
            layers.append(nn.Dropout2d(0.8))
        self.Conv2DwithBN = nn.Sequential(*layers)

    def forward(self, x):
        return self.Conv2DwithBN(x)
  • Conv2DwithBN():
    • Defaults: a 3×3 convolution kernel with stride 1 and padding 1 (so the spatial size is unchanged), followed by BatchNorm2d for normalization and a LeakyReLU with negative slope 0.2 (y = 0.2x for x < 0); dropout is off by default (when enabled, Dropout2d drops with probability 0.8).
    • The class defines a network layer with the structure above.
    • The if statements on bn/dropout decide whether those pieces get added.
    • forward() then just runs the sequential stack.
  • Thought: compared with the FCNVMB model studied earlier, this is much like the initial 3×3 intra-layer convolution there; its main job is to extract features within the layer.
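To make replace_legacy() concrete, here is a minimal sketch (with hypothetical checkpoint key names, not taken from a real checkpoint) showing how legacy block prefixes in a state dict get renamed to the unified `layers` prefix:

```python
from collections import OrderedDict

def replace_legacy(old_dict):
    # Rename legacy block names to the unified 'layers' prefix.
    li = []
    for k, v in old_dict.items():
        k = (k.replace('Conv2DwithBN', 'layers')
              .replace('Conv2DwithBN_Tanh', 'layers')
              .replace('Deconv2DwithBN', 'layers')
              .replace('ResizeConv2DwithBN', 'layers'))
        li.append((k, v))
    return OrderedDict(li)

# Hypothetical legacy keys (values would normally be weight tensors)
old_ckpt = OrderedDict([
    ('convblock1.Conv2DwithBN.0.weight', 'w1'),
    ('deconv6.Conv2DwithBN.2.bias', 'b6'),
])
new_ckpt = replace_legacy(old_ckpt)
print(list(new_ckpt.keys()))
# ['convblock1.layers.0.weight', 'deconv6.layers.2.bias']
```

After the renaming, `model.load_state_dict(replace_legacy(old_ckpt))` would match the new module names.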
class Conv2DwithBN_Tanh(nn.Module):
    def __init__(self, in_fea, out_fea, kernel_size=3, stride=1, padding=1):
        super(Conv2DwithBN_Tanh, self).__init__()
        layers = [nn.Conv2d(in_channels=in_fea, out_channels=out_fea, kernel_size=kernel_size, stride=stride, padding=padding)]
        layers.append(nn.BatchNorm2d(num_features=out_fea))
        layers.append(nn.Tanh())
        self.Conv2DwithBN = nn.Sequential(*layers)

    def forward(self, x):
        return self.Conv2DwithBN(x)
  • Conv2DwithBN_Tanh:
    • Structure: Conv2d (convolution) + BatchNorm2d (stabilizes training) + Tanh (activation, output in [-1, 1])
    • nn.Sequential stitches the pieces together.
  • Thought: feature processing; the bounded output also helps suppress noise.
class ConvBlock(nn.Module):
    def __init__(self, in_fea, out_fea, kernel_size=3, stride=1, padding=1, norm='bn', relu_slop=0.2, dropout=None):
        super(ConvBlock,self).__init__()
        layers = [nn.Conv2d(in_channels=in_fea, out_channels=out_fea, kernel_size=kernel_size, stride=stride, padding=padding)]
        if norm in NORM_LAYERS:
            layers.append(NORM_LAYERS[norm](out_fea))
        layers.append(nn.LeakyReLU(relu_slop, inplace=True))
        if dropout:
            layers.append(nn.Dropout2d(0.8))
        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return self.layers(x)
  • Structurally this is almost the same as the class above: again a standard 3×3 convolution block, but normalization is more flexible here (any of the three types in NORM_LAYERS), and dropout can be added.
class ConvBlock_Tanh(nn.Module):
    def __init__(self, in_fea, out_fea, kernel_size=3, stride=1, padding=1, norm='bn'):
        super(ConvBlock_Tanh, self).__init__()
        layers = [nn.Conv2d(in_channels=in_fea, out_channels=out_fea, kernel_size=kernel_size, stride=stride, padding=padding)]
        if norm in NORM_LAYERS:
            layers.append(NORM_LAYERS[norm](out_fea))
        layers.append(nn.Tanh())
        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return self.layers(x)
  • Much like the block above, but with Tanh as the activation function (which means the output range is now fixed to [-1, 1], whereas the LeakyReLU version's output range is unbounded and can be large).
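A quick numeric sketch (plain Python, no torch required) of why the choice of activation fixes the output range: tanh squashes any input into (-1, 1), while LeakyReLU passes large values through unchanged:

```python
import math

def leaky_relu(x, slope=0.2):
    # LeakyReLU: identity for x >= 0, scaled by `slope` for x < 0
    return x if x >= 0 else slope * x

for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    t = math.tanh(x)
    assert -1.0 < t < 1.0  # tanh output is always strictly inside (-1, 1)
    print(f"x={x:+.1f}  tanh={t:+.4f}  leaky_relu={leaky_relu(x):+.1f}")
```

So ConvBlock_Tanh is a natural choice for the final layer, where the target velocity maps are normalized to a fixed range.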

Up to this point, all of the ConvBlock-style structures above are used for downsampling.

class DeconvBlock(nn.Module):
    def __init__(self, in_fea, out_fea, kernel_size=2, stride=2, padding=0, output_padding=0, norm='bn'):
        super(DeconvBlock, self).__init__()
        layers = [nn.ConvTranspose2d(in_channels=in_fea, out_channels=out_fea, kernel_size=kernel_size, stride=stride, padding=padding, output_padding=output_padding)]
        if norm in NORM_LAYERS:
            layers.append(NORM_LAYERS[norm](out_fea))
        layers.append(nn.LeakyReLU(0.2, inplace=True))
        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return self.layers(x)
  • DeconvBlock(): used for upsampling
    • Defaults: input/output channels, a 2×2 kernel with stride 2 (i.e., the spatial size exactly doubles), padding 0.
    • ConvTranspose2d: no need to dwell on it for now; just remember it is a transposed convolution used for upsampling, it can increase resolution, and a larger stride gives a larger output.
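The output size of ConvTranspose2d follows a simple formula; a minimal sketch (plain Python) verifies that the defaults here (kernel 2, stride 2, padding 0) exactly double the input size:

```python
def deconv_out(n, k, s=1, p=0, out_p=0):
    # ConvTranspose2d output size along one dimension:
    # (n - 1) * stride - 2 * padding + kernel_size + output_padding
    return (n - 1) * s - 2 * p + k + out_p

# DeconvBlock defaults: kernel_size=2, stride=2, padding=0
for n in [5, 16, 35]:
    assert deconv_out(n, k=2, s=2) == 2 * n  # exact doubling
print(deconv_out(35, k=2, s=2))  # 70
```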
class ResizeBlock(nn.Module):
    def __init__(self, in_fea, out_fea, scale_factor=2, mode='nearest', norm='bn'):
        super(ResizeBlock, self).__init__()
        layers = [nn.Upsample(scale_factor=scale_factor, mode=mode)]
        layers.append(nn.Conv2d(in_channels=in_fea, out_channels=out_fea, kernel_size=3, stride=1, padding=1))
        if norm in NORM_LAYERS:
            layers.append(NORM_LAYERS[norm](out_fea))
        layers.append(nn.LeakyReLU(0.2, inplace=True))
        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return self.layers(x)
  • ResizeBlock():
    • nn.Upsample: interpolation-based resampling (nearest-neighbor by default here)
    • Conv2d: convolution
    • choice of normalization layer
    • LeakyReLU activation

To summarize: both serve the upsampling path, and the difference is this: DeconvBlock upsamples with a learnable transposed convolution, whereas ResizeBlock first enlarges the feature map with fixed (non-learnable) interpolation and only then applies a learnable convolution for feature extraction.

At this point we have covered essentially all the building blocks of the network. A quick summary: there are two broad kinds of modules, downsampling modules and upsampling modules.

The basic recipe for both is the same: convolution, normalization, activation; we just fill in the details according to our needs.

With these in hand, we can turn to the computational core of the network.

First, the initialization:

class InversionNet(nn.Module):
    def __init__(self, dim1=32, dim2=64, dim3=128, dim4=256, dim5=512, sample_spatial=1.0, **kwargs):
        super(InversionNet, self).__init__()

This first defines the channel widths used throughout the network (this also came up in the simpler FCNVMB model). How to think of them? A bit like resolution levels: the features become progressively more refined, step by step.

The downsampling path:

self.convblock1 = ConvBlock(5, dim1, kernel_size=(7, 1), stride=(2, 1), padding=(3, 0))
self.convblock2_1 = ConvBlock(dim1, dim2, kernel_size=(3, 1), stride=(2, 1), padding=(1, 0))
self.convblock2_2 = ConvBlock(dim2, dim2, kernel_size=(3, 1), padding=(1, 0))
self.convblock3_1 = ConvBlock(dim2, dim2, kernel_size=(3, 1), stride=(2, 1), padding=(1, 0))
self.convblock3_2 = ConvBlock(dim2, dim2, kernel_size=(3, 1), padding=(1, 0))
self.convblock4_1 = ConvBlock(dim2, dim3, kernel_size=(3, 1), stride=(2, 1), padding=(1, 0))
self.convblock4_2 = ConvBlock(dim3, dim3, kernel_size=(3, 1), padding=(1, 0))

self.convblock5_1 = ConvBlock(dim3, dim3, stride=2)
self.convblock5_2 = ConvBlock(dim3, dim3)
self.convblock6_1 = ConvBlock(dim3, dim4, stride=2)
self.convblock6_2 = ConvBlock(dim4, dim4)
self.convblock7_1 = ConvBlock(dim4, dim4, stride=2)
self.convblock7_2 = ConvBlock(dim4, dim4)
self.convblock8 = ConvBlock(dim4, dim5, kernel_size=(8, ceil(70 * sample_spatial / 8)), padding=0)

The first four stages are like half-steps: only the height (the time axis) is halved, while the width (the receiver axis) is left alone. After that, both dimensions shrink proportionally.

What do these first four steps achieve? They let the network concentrate features along the time axis first.

Finally there is convblock8, which is different: its kernel (8, ceil(70 * sample_spatial / 8)) roughly fixes the size, rounding the width up and collapsing the remaining spatial extent (this also corresponds to the bottleneck in a U-Net).
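These encoder shapes can be checked with the standard Conv2d size formula, floor((n + 2p - k)/s) + 1. The following sketch (plain Python, assuming the default sample_spatial=1.0 and a 1000×70 input) reproduces the spatial sizes stage by stage, matching the printed trace later in the post:

```python
from math import ceil, floor

def conv_out(n, k, s=1, p=0):
    # Conv2d output size along one dimension
    return floor((n + 2 * p - k) / s) + 1

h, w = 1000, 70  # (time samples, receivers)
# (kernel, stride, padding) per axis for each size-changing stage
stages = [
    ('convblock1',   (7, 2, 3), (1, 1, 0)),
    ('convblock2_1', (3, 2, 1), (1, 1, 0)),
    ('convblock3_1', (3, 2, 1), (1, 1, 0)),
    ('convblock4_1', (3, 2, 1), (1, 1, 0)),
    ('convblock5_1', (3, 2, 1), (3, 2, 1)),
    ('convblock6_1', (3, 2, 1), (3, 2, 1)),
    ('convblock7_1', (3, 2, 1), (3, 2, 1)),
    ('convblock8',   (8, 1, 0), (ceil(70 * 1.0 / 8), 1, 0)),
]
for name, (kh, sh, ph), (kw, sw, pw) in stages:
    h, w = conv_out(h, kh, sh, ph), conv_out(w, kw, sw, pw)
    print(f"{name}: {h} x {w}")
assert (h, w) == (1, 1)  # the bottleneck collapses to 1x1
```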

Next, the upsampling path:

self.deconv1_1 = DeconvBlock(dim5, dim5, kernel_size=5)
self.deconv1_2 = ConvBlock(dim5, dim5)
self.deconv2_1 = DeconvBlock(dim5, dim4, kernel_size=4, stride=2, padding=1)
self.deconv2_2 = ConvBlock(dim4, dim4)
self.deconv3_1 = DeconvBlock(dim4, dim3, kernel_size=4, stride=2, padding=1)
self.deconv3_2 = ConvBlock(dim3, dim3)
self.deconv4_1 = DeconvBlock(dim3, dim2, kernel_size=4, stride=2, padding=1)
self.deconv4_2 = ConvBlock(dim2, dim2)
self.deconv5_1 = DeconvBlock(dim2, dim1, kernel_size=4, stride=2, padding=1)
self.deconv5_2 = ConvBlock(dim1, dim1)
# note: this line actually lives in forward(), not __init__:
#   x = F.pad(x, [-5, -5, -5, -5], mode="constant", value=0)
self.deconv6 = ConvBlock_Tanh(dim1, 1)

Roughly, it works like this:

First upsample (DeconvBlock, which effectively rescales the feature map), then a ConvBlock performs intra-level convolution for feature extraction.

Each DeconvBlock doubles the spatial size (via stride=2), and the following ConvBlock extracts further features. By progressively restoring the spatial resolution in this way, the final output ends up with the right size and channel count.

Note the final two steps:

x = F.pad(x, [-5, -5, -5, -5], mode="constant", value=0): despite the name, the negative padding values crop rather than pad, trimming 5 pixels from each side (80×80 → 70×70).

self.deconv6 = ConvBlock_Tanh(dim1, 1): still an ordinary convolution, just with Tanh as the activation; the final output has a single channel (a grayscale-style velocity map).
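The same kind of arithmetic confirms the decoder shapes: each kernel-4/stride-2/padding-1 transposed convolution doubles the size, and the negative F.pad then crops 80×80 down to 70×70. A plain-Python sketch:

```python
def deconv_out(n, k, s=2, p=0):
    # ConvTranspose2d output size along one dimension:
    # (n - 1) * stride - 2 * padding + kernel_size
    return (n - 1) * s - 2 * p + k

n = 1                        # bottleneck is 1x1
n = deconv_out(n, k=5)       # deconv1_1: 1 -> 5
for _ in range(4):           # deconv2_1 .. deconv5_1 each double the size
    n = deconv_out(n, k=4, s=2, p=1)   # 5 -> 10 -> 20 -> 40 -> 80
n = n - 2 * 5                # F.pad with -5 on each side crops 80 -> 70
print(n)  # 70
```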

That is the overall flow.

(Here is a trace of how the tensor sizes actually change:)

Input size: torch.Size([5, 5, 1000, 70])
After convblock1: torch.Size([5, 32, 500, 70])
After convblock2_1: torch.Size([5, 64, 250, 70])
After convblock2_2: torch.Size([5, 64, 250, 70])
After convblock3_1: torch.Size([5, 64, 125, 70])
After convblock3_2: torch.Size([5, 64, 125, 70])
After convblock4_1: torch.Size([5, 128, 63, 70])
After convblock4_2: torch.Size([5, 128, 63, 70])
After convblock5_1: torch.Size([5, 128, 32, 35])
After convblock5_2: torch.Size([5, 128, 32, 35])
After convblock6_1: torch.Size([5, 256, 16, 18])
After convblock6_2: torch.Size([5, 256, 16, 18])
After convblock7_1: torch.Size([5, 256, 8, 9])
After convblock7_2: torch.Size([5, 256, 8, 9])
After convblock8: torch.Size([5, 512, 1, 1])
After deconv1_1: torch.Size([5, 512, 5, 5])
After deconv1_2: torch.Size([5, 512, 5, 5])
After deconv2_1: torch.Size([5, 256, 10, 10])
After deconv2_2: torch.Size([5, 256, 10, 10])
After deconv3_1: torch.Size([5, 128, 20, 20])
After deconv3_2: torch.Size([5, 128, 20, 20])
After deconv4_1: torch.Size([5, 64, 40, 40])
After deconv4_2: torch.Size([5, 64, 40, 40])
After deconv5_1: torch.Size([5, 32, 80, 80])
After deconv5_2: torch.Size([5, 32, 80, 80])
After deconv6: torch.Size([5, 1, 70, 70])

One more thing worth mentioning:

class Discriminator(nn.Module):
    def __init__(self, dim1=32, dim2=64, dim3=128, dim4=256, **kwargs):
        super(Discriminator, self).__init__()
        self.convblock1_1 = ConvBlock(1, dim1, stride=2)
        self.convblock1_2 = ConvBlock(dim1, dim1)
        self.convblock2_1 = ConvBlock(dim1, dim2, stride=2)
        self.convblock2_2 = ConvBlock(dim2, dim2)
        self.convblock3_1 = ConvBlock(dim2, dim3, stride=2)
        self.convblock3_2 = ConvBlock(dim3, dim3)
        self.convblock4_1 = ConvBlock(dim3, dim4, stride=2)
        self.convblock4_2 = ConvBlock(dim4, dim4)
        self.convblock5 = ConvBlock(dim4, 1, kernel_size=5, padding=0)

    def forward(self, x):
        x = self.convblock1_1(x)
        x = self.convblock1_2(x)
        x = self.convblock2_1(x)
        x = self.convblock2_2(x)
        x = self.convblock3_1(x)
        x = self.convblock3_2(x)
        x = self.convblock4_1(x)
        x = self.convblock4_2(x)
        x = self.convblock5(x)
        x = x.view(x.shape[0], -1)
        return x

This is the familiar discriminator. But since this network is not itself a GAN (generative adversarial network), my reading is that it is included here to push the model's accuracy further. A discriminator's job is to tell real data from generated data; with one in the loop, the fitted outputs can become more accurate.
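For completeness, the same size formula shows what the Discriminator produces from a 70×70 velocity map: the four stride-2 ConvBlocks shrink 70 → 35 → 18 → 9 → 5, and the final 5×5 convolution with no padding leaves a single 1×1 score per sample. A quick check in plain Python:

```python
from math import floor

def conv_out(n, k, s=1, p=0):
    # Conv2d output size along one dimension
    return floor((n + 2 * p - k) / s) + 1

n = 70
for _ in range(4):                 # convblock1_1 .. convblock4_1 (stride 2)
    n = conv_out(n, k=3, s=2, p=1)
print(n)                           # 5
n = conv_out(n, k=5, s=1, p=0)     # convblock5: 5x5 kernel, no padding
assert n == 1                      # one scalar score per sample
```

The `x.view(x.shape[0], -1)` at the end then flattens that score into shape (batch, 1).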

That wraps up the overall structure (some details and extensions were not explored in depth). InversionNet is among the earlier models in the FWI line, and it is well worth studying.

posted @ 2025-03-27 20:32 stribik