Notes on fixing a PINN bug

In the train function, if we use a validation set, we typically have a code fragment like this:

model.eval()  # Set your model to evaluation mode.
loss_record = []
for x, y in valid_loader:
    x, y = x.to(device), y.to(device)
    with torch.no_grad():
        pred = model(x)
        loss = criterion(pred, y) + compute_physical_loss(x[:, 1], x[:, 2], x[:, 0], model, device)

    loss_record.append(loss.item())

And in our physics loss function, compute_physical_loss, there is a differentiation step like this:

T_x = torch.autograd.grad(
    T,
    x,
    grad_outputs=torch.ones_like(T),
    retain_graph=True,
    create_graph=True,
)[0]

Running this raises an error:

RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior

When we remove compute_physical_loss the code runs fine, which led me to believe the loss function itself was designed wrong. It was not.

If you actually set allow_unused=True as the error message suggests, you will only trigger a different error.
After several days of searching the internet and asking GPT without finding an answer, I finally discovered through step-by-step debugging that this line:

with torch.no_grad():

conflicts with the torch.autograd.grad call we need inside compute_physical_loss. Its intent is to skip gradient computation in evaluation mode, but under torch.no_grad() the forward pass records no computation graph, so autograd cannot trace T back to x and the grad call fails. As soon as this line is removed, the code runs.
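A minimal reproduction of the conflict, using a hypothetical toy model rather than the post's PINN (the exact RuntimeError message can differ from the one above depending on how the inputs are wrapped, but the cause is the same):

```python
import torch

model = torch.nn.Linear(3, 1)
x = torch.randn(4, 3, requires_grad=True)

# Under torch.no_grad() the forward pass records no computation graph:
with torch.no_grad():
    T = model(x)
print(T.requires_grad)  # False: T is detached from x

# ...so differentiating T with respect to x raises a RuntimeError:
try:
    torch.autograd.grad(T, x, grad_outputs=torch.ones_like(T))
except RuntimeError as e:
    print("autograd.grad fails under no_grad:", e)

# Outside no_grad the graph exists and the derivative works:
T = model(x)
T_x = torch.autograd.grad(T, x, grad_outputs=torch.ones_like(T),
                          create_graph=True)[0]
print(T_x.shape)  # torch.Size([4, 3])
```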

In the end, this is all it takes:

model.eval()  # Set your model to evaluation mode.
loss_record2 = []

for batch in valid_loader:
    inputs, labels = batch

    inputs = inputs.to(device)
    labels = labels.to(device)

    outputs = model(inputs)

    loss = criterion(outputs, labels) + compute_physical_loss(inputs[:, 0], inputs[:, 1], inputs[:, 2], model, device)

    loss_record2.append(loss.item())
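An alternative worth knowing (my own suggestion, not from the original debugging session): instead of dropping torch.no_grad() for the whole validation step, you can keep it for the cheap data-loss forward pass and re-enable graph building only where the physics term needs it, by nesting torch.enable_grad():

```python
import torch

model = torch.nn.Linear(3, 1)
criterion = torch.nn.MSELoss()
inputs, labels = torch.randn(8, 3), torch.randn(8, 1)

with torch.no_grad():
    # Data loss: no graph needed, so no_grad saves memory here.
    data_loss = criterion(model(inputs), labels)

    # Physics term: re-enable graph building just for this block.
    with torch.enable_grad():
        pts = inputs.detach().clone().requires_grad_(True)
        T = model(pts)
        T_grad = torch.autograd.grad(T, pts,
                                     grad_outputs=torch.ones_like(T),
                                     create_graph=True)[0]
    # T_grad is a valid derivative even inside the outer no_grad block.
```

torch.enable_grad() overrides the surrounding no_grad context, so everything computed inside it has a graph and autograd.grad works as usual.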
posted @ 2024-10-11 20:05  srrdhy