Notes on fixing a PINN bug

In the train function, if we use a validation set, we typically have a code fragment like this:

model.eval()  # Set your model to evaluation mode.
loss_record = []
for x, y in valid_loader:
    x, y = x.to(device), y.to(device)
    with torch.no_grad():
        pred = model(x)
        loss = criterion(pred, y) + compute_physical_loss(x[:, 1], x[:, 2], x[:, 0], model, device)

    loss_record.append(loss.item())

And in our physics loss function, compute_physical_loss, there is a differentiation step like this:

T_x = torch.autograd.grad(
    T,
    x,
    grad_outputs=torch.ones_like(T),
    retain_graph=True,
    create_graph=True,
)[0]

Running this raises an error:

RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior

When we remove compute_physical_loss the code runs fine, which led me to believe the loss function itself was designed wrong. It was not.

If you actually set allow_unused=True as the error message suggests, you will only trigger a different error.
After several days of searching the internet and asking GPT without finding an answer, I finally discovered through step-by-step debugging that this line:

with torch.no_grad():

conflicts with the torch.autograd.grad call we need inside compute_physical_loss. Its intent is to skip gradient computation in evaluation mode, but under torch.no_grad() the forward pass records no computation graph, so autograd cannot trace T back to x and the grad call fails. As soon as this line is removed, the code runs.
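A minimal reproduction of the conflict, using a hypothetical toy model rather than the post's PINN (the exact RuntimeError message can differ from the one above depending on how the inputs are wrapped, but the cause is the same):

```python
import torch

model = torch.nn.Linear(3, 1)
x = torch.randn(4, 3, requires_grad=True)

# Under torch.no_grad() the forward pass records no computation graph:
with torch.no_grad():
    T = model(x)
print(T.requires_grad)  # False: T is detached from x

# ...so differentiating T with respect to x raises a RuntimeError:
try:
    torch.autograd.grad(T, x, grad_outputs=torch.ones_like(T))
except RuntimeError as e:
    print("autograd.grad fails under no_grad:", e)

# Outside no_grad the graph exists and the derivative works:
T = model(x)
T_x = torch.autograd.grad(T, x, grad_outputs=torch.ones_like(T),
                          create_graph=True)[0]
print(T_x.shape)  # torch.Size([4, 3])
```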

In the end, this is all it takes:

model.eval()  # Set your model to evaluation mode.
loss_record2 = []

for batch in valid_loader:
    inputs, labels = batch

    inputs = inputs.to(device)
    labels = labels.to(device)

    outputs = model(inputs)

    loss = criterion(outputs, labels) + compute_physical_loss(inputs[:, 0], inputs[:, 1], inputs[:, 2], model, device)

    loss_record2.append(loss.item())
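An alternative worth knowing (my own suggestion, not from the original debugging session): instead of dropping torch.no_grad() for the whole validation step, you can keep it for the cheap data-loss forward pass and re-enable graph building only where the physics term needs it, by nesting torch.enable_grad():

```python
import torch

model = torch.nn.Linear(3, 1)
criterion = torch.nn.MSELoss()
inputs, labels = torch.randn(8, 3), torch.randn(8, 1)

with torch.no_grad():
    # Data loss: no graph needed, so no_grad saves memory here.
    data_loss = criterion(model(inputs), labels)

    # Physics term: re-enable graph building just for this block.
    with torch.enable_grad():
        pts = inputs.detach().clone().requires_grad_(True)
        T = model(pts)
        T_grad = torch.autograd.grad(T, pts,
                                     grad_outputs=torch.ones_like(T),
                                     create_graph=True)[0]
    # T_grad is a valid derivative even inside the outer no_grad block.
```

torch.enable_grad() overrides the surrounding no_grad context, so everything computed inside it has a graph and autograd.grad works as usual.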
posted @ 2024-10-11 20:05  srrdhy