9-layer transformer

train loss 31.904207130958294
train loss 16.793884386961487
this dev loss: 13.495090179443359
unseen dev loss: 14.614468665350051
        # two parallel 9-layer transformer encoder branches
        self.transformer = TransformerModel(
            d_model=32,
            nhead=4,
            num_encoder_layers=9,
            dim_feedforward=128,
            max_len=512
        )
        self.transformer2 = TransformerModel(
            d_model=32,
            nhead=4,
            num_encoder_layers=9,
            dim_feedforward=128,
            max_len=512
        )
        # large-kernel conv branch ("BigConv", UniRepLKNet)
        self.BigConv1 = unireplknet.UniRepLKNet(
            in_chans=2,
            num_classes=128,
            depths=(3, 3, 27, 3),
            dims=(48, 48 * 2, 48 * 4, 48 * 4),
            drop_path_rate=0.,
            layer_scale_init_value=1e-6,
            head_init_scale=1.,
            kernel_sizes=None,
            deploy=False,
            with_cp=False,
            init_cfg=None,
            attempt_use_lk_impl=False,
            use_sync_bn=False,
        )

9-layer transformer, feature dimension increased from 128 to 196

train loss 31.223137216134504
train loss 17.56274213032289
this dev loss: 14.279271240234374
unseen dev loss: 13.952949603398642

6-layer transformer, 256-dim, with BigConv attached after the transformer

train loss 30.28354885763751
train loss 17.151049870930745
this dev loss: 11.611524124940237
unseen dev loss: 10.601472668025805
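A minimal sketch of the "BigConv after transformer" wiring above. The real `TransformerModel` and `UniRepLKNet` modules are stood in for by placeholder layers, and the reshape between the two stages is an assumption about the actual code:

```python
import torch
import torch.nn as nn

class TransThenConv(nn.Module):
    """Placeholder sketch: 6-layer, 256-dim transformer followed by a conv stage."""
    def __init__(self, d_model=256, seq_len=64):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=4, dim_feedforward=1024, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)
        # stand-in for the BigConv (UniRepLKNet) branch: a small conv stack
        # applied to the transformer output treated as a (B, d_model, seq_len) signal
        self.conv = nn.Sequential(
            nn.Conv1d(d_model, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )

    def forward(self, x):                 # x: (B, seq_len, d_model)
        h = self.encoder(x)               # (B, seq_len, d_model)
        h = h.transpose(1, 2)             # (B, d_model, seq_len) for Conv1d
        return self.conv(h).squeeze(-1)   # (B, 128)

model = TransThenConv()
y = model(torch.randn(2, 64, 256))
print(y.shape)  # torch.Size([2, 128])
```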

6-layer transformer, 256-dim, with BigConv attached after the transformer, BigConv initial channels 64; instead of simple element-wise addition, the two feature vectors are concatenated and passed through a linear layer

train loss 31.781492029825845
train loss 18.02604025522868
this dev loss: 14.87015537988572
unseen dev loss: 13.945914615284313
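The concat-then-linear fusion described above can be sketched as follows. The feature dimensions (256 for the transformer branch, 128 for the BigConv branch) are assumptions for illustration; the real extractors come from the experiment's own modules:

```python
import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    """Fuse two branch features by concatenation + linear projection
    (replacing the earlier simple element-wise addition)."""
    def __init__(self, trans_dim=256, conv_dim=128, out_dim=256):
        super().__init__()
        self.proj = nn.Linear(trans_dim + conv_dim, out_dim)

    def forward(self, trans_feat, conv_feat):
        fused = torch.cat([trans_feat, conv_feat], dim=-1)  # (B, 384)
        return self.proj(fused)                             # (B, out_dim)

fusion = ConcatFusion()
t = torch.randn(4, 256)   # transformer branch features
c = torch.randn(4, 128)   # BigConv branch features
out = fusion(t, c)
print(out.shape)  # torch.Size([4, 256])
```

Compared with addition, concatenation keeps the two branches' features separate and lets the linear layer learn how to weight them, at the cost of extra parameters.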
posted on 2023-12-28 11:42 FrostyForest