Threadripper GPU Passthrough and RAID Driver for Linux

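Maybe the platform is too new, or maybe AMD is prioritizing EPYC; either way, PCI passthrough on Threadripper is buggy. I finally tracked down a hack that works around it, so I am recording it here. In addition, a Linux driver for the onboard RAID has not been released yet, so on Linux the only option is to clone the EFI boot partition by hand.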

Original links

https://techblog.jeppson.org/2018/04/vga-passthrough-threadripper/

https://gist.github.com/freebroccolo/d501b32563891e7a97aa642818f5bfe3

An unfortunate bug affects the AMD Threadripper family of CPUs and prevents VGA passthrough from working properly. Fortunately, some very clever people have implemented a workaround that allows VGA passthrough until a proper Linux kernel patch can be accepted and merged. See here for the whole story.

Right now my Threadripper 1950X has working GPU passthrough thanks to HyenaCheeseHeads' “java hack” applet. I went this route because I really didn’t want to recompile my ProxMox kernel to get passthrough to work. Per the description: “It is a small program that runs as any user with read/write access to sysfs (this small guide assumes “root”). The program monitors any PCIe device that is connected to VFIO-PCI when the program starts, if the device disconnects due to the issues described in this post then the program tries to re-connect the device by rewriting the bridge configuration.” The instructions below are taken from the above Reddit post.

  • Go to https://pastebin.com/iYg3Dngs and hit “Download” (the MD5 sum is supposed to be 91914b021b890d778f4055bcc5f41002)
  • Rename the downloaded file to “ZenBridgeBaconRecovery.java” and put it in a new folder somewhere
  • Go to the folder in a terminal and type “javac ZenBridgeBaconRecovery.java”, this should take a short while and then complete with no errors. You may need to install the Java 8 JDK to get the javac command (use your distribution’s software manager)
  • In the same folder type “sudo java ZenBridgeBaconRecovery”
  • Make sure that the PCIe device that you intend to passthru is listed as monitored with a bridge
  • Now start your VM
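
The checksum mentioned in the first step can be verified before compiling. Below is a minimal sketch of one way to do that in Java; the class name Md5Check and its helper md5Hex are hypothetical names made up for this illustration, and the default file name simply assumes the download was renamed as described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the original instructions.
public class Md5Check {
    // MD5 sum quoted in the instructions above
    static final String EXPECTED = "91914b021b890d778f4055bcc5f41002";

    // Returns the MD5 digest of the given bytes as a lowercase hex string
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        // Assumed default: the file name used in the steps above
        String file = args.length > 0 ? args[0] : "ZenBridgeBaconRecovery.java";
        String actual = md5Hex(Files.readAllBytes(Paths.get(file)));
        System.out.println(actual.equals(EXPECTED) ? "MD5 OK" : "MD5 MISMATCH: " + actual);
    }
}
```

Compile it with “javac Md5Check.java” and run “java Md5Check” in the folder holding the download. A mismatch does not necessarily mean tampering (line endings can change during download), but it is a reason to inspect the file before running it as root.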

In my case (Debian Stretch, ProxMox) I needed to install openjdk-8-jdk-headless

sudo apt install openjdk-8-jdk-headless
javac ZenBridgeBaconRecovery.java

Next, I have a little script that runs on startup and spawns this as root in a detached tmux session, so I don’t have to remember to run it. (If you try to start your VM before running this, it will hose passthrough on your system until you reboot.) Be sure to change the script to point to wherever you compiled ZenBridgeBaconRecovery.

#!/bin/bash
cd /home/nicholas  #change me to suit your needs
sudo java ZenBridgeBaconRecovery

And here is the command I use to run on startup:

tmux new -d '/home/nicholas/passthrough.sh'

Again, be sure to modify the above to point to the path of wherever you saved the above script.

So far this works pretty well for me. I hate having to run a java process as sudo, but it’s better than recompiling my kernel.


Update 6/27/2018: I’ve created a systemd service unit so that ZenBridgeBaconRecovery runs at boot. Here is my file, placed in /etc/systemd/system/zenbridge.service (change the working directory to match the location of the compiled ZenBridgeBaconRecovery class, and don’t forget to run systemctl daemon-reload):

[Unit]
Description=Zen Bridge Bacon Recovery
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/home/nicholas
ExecStart=/usr/bin/java ZenBridgeBaconRecovery
# Restart may also be set to always, on-abort, etc.
Restart=on-failure

[Install]
WantedBy=multi-user.target

Update 8/18/2018: Finally solved for everyone!

Per an update on the Reddit thread, motherboard manufacturers have finally put out BIOS updates that resolve the PCI passthrough problems. I updated my X399 Taichi to the latest version of its UEFI BIOS (3.20), and indeed PCI passthrough worked without any more wonky workarounds!

/**
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, version 3 of the License.
 * 
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 * 
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Path;
import java.util.Date;
import java.util.HashMap;
import java.util.Map.Entry;

/**
 * Try to recover Zen's PCIe bridge after a secondary bus 
 * reset kicks it out of whack.
 * 
 * @author HyenaCheeseHeads
 */
public class ZenBridgeBaconRecovery {
    public static void log(String text){
        System.out.println(new Date() + ": "+text);
    }
    
    public static void main(String[] args) throws IOException, InterruptedException {        
        System.out.println("-------------------------------------------");
        System.out.println("Zen PCIe-Bridge BAR/Config Recovery Tool, rev 1, 2018, HyenaCheeseHeads");
        System.out.println("-------------------------------------------");
        HashMap<Path, Path> bridgeMapping = new HashMap<>();
        
        File vfioDir = new File("/sys/bus/pci/drivers/vfio-pci");
        if (!vfioDir.exists()){
            log("!!! Cannot find the VFIO-PCI sysfs at "+vfioDir);
            System.exit(-1);
        }
        if (!(vfioDir.canRead() && vfioDir.canWrite())){
            log("!!! This tool requires R/W access to the VFIO-PCI sysfs at "+vfioDir);
            log("!!! Make sure to run it as root (or similar super user)");
            System.exit(-1);
        }

        // Detect list of devices using VFIO-PCI sysfs entry
        log("Detecting VFIO-PCI devices");
        for (File f : vfioDir.listFiles()){            
            if (f.isDirectory() && f.getName().startsWith("00")){
                Path devicePath = f.toPath().toRealPath();
                log("\tDevice: "+devicePath);
                
                try {
                    Path bridgePath = devicePath.getParent();
                    byte[] id = new byte[4];
                    bridgePath.resolve("config").toUri().toURL().openStream().read(id);
                    if (id[0]==0x22 && id[1]==0x10 && id[2]==0x53 && id[3]==0x14){
                        log("\t\tBridge: "+bridgePath);
                        bridgeMapping.put(devicePath, bridgePath);
                    } else {
                        log("\t\t!!! Unknown bridge type! Skipping...");
                    }
                } catch (IOException ex){
                    log("\t\t!!! Exception: "+ex.getMessage());
                    log("\t\t!!! Skipping...");
                }
            }
        }
        
        // Monitor devices for bridge failure pattern
        log("Monitoring "+bridgeMapping.size()+" device(s)...");
        while (true){
            for (Entry<Path,Path> bridgeEntry : bridgeMapping.entrySet()){
                try (FileInputStream deviceConfig = new FileInputStream(bridgeEntry.getKey().resolve("config").toFile())){                    
                    byte[] id = new byte[4];
                    deviceConfig.read(id);
                    if (id[0]==-1 && id[1]==-1){
                        // Failure detected, recover bridge by rewriting its config
                        log("Lost contact with "+bridgeEntry.getKey());
                        byte[] data = new byte[512];
                        try (
                                FileInputStream bridgeConfigIn = new FileInputStream(bridgeEntry.getValue().resolve("config").toFile());
                                FileOutputStream bridgeConfigOut = new FileOutputStream(bridgeEntry.getValue().resolve("config").toFile());
                                ){                                                
                            log("\tRecovering "+bridgeConfigIn.read(data)+" bytes");
                            bridgeConfigOut.write(data);
                            log("\tBridge config write complete");
                        } catch (IOException ex){
                            log("\t!!! Exception: "+ex.getMessage());
                            Thread.sleep(10000);
                        }
                        try (FileInputStream deviceRecoveryConfig = new FileInputStream(bridgeEntry.getKey().resolve("config").toFile())){                    
                            // Read from the freshly opened stream, not the one opened before recovery
                            deviceRecoveryConfig.read(id);
                            if (id[0]==-1 && id[1]==-1){
                                log("\tFailed to recover bridge secondary bus");
                            } else {
                                log("\tRecovered bridge secondary bus");
                                log("Re-acquired contact with "+bridgeEntry.getKey());
                            }
                        }
                    }
                }
            }
            Thread.sleep(100);
        }
    }
    
}
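
The byte comparisons in the listing above can look cryptic, so here is a small sketch of what they check. The first four bytes of a PCI config space hold the vendor ID and device ID as little-endian 16-bit values, so the bytes 0x22 0x10 0x53 0x14 decode to vendor 0x1022 (AMD) and device 0x1453, which I assume is the Zen PCIe bridge the tool targets. A vendor ID of 0xFFFF (both bytes equal to -1 as signed Java bytes) is what a read returns when the device no longer responds. The class PciIdDecode and its method names are hypothetical and are not part of the original tool.

```java
// Hypothetical illustration, not part of ZenBridgeBaconRecovery.
public class PciIdDecode {
    // Vendor ID: config bytes 0-1, little-endian
    static int vendorId(byte[] cfg) {
        return (cfg[0] & 0xFF) | ((cfg[1] & 0xFF) << 8);
    }

    // Device ID: config bytes 2-3, little-endian
    static int deviceId(byte[] cfg) {
        return (cfg[2] & 0xFF) | ((cfg[3] & 0xFF) << 8);
    }

    // Same test as "id[0]==-1 && id[1]==-1" in the listing above
    static boolean looksDisconnected(byte[] cfg) {
        return vendorId(cfg) == 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] bridge = {0x22, 0x10, 0x53, 0x14};
        System.out.printf("vendor=0x%04x device=0x%04x%n", vendorId(bridge), deviceId(bridge));

        byte[] lost = {-1, -1, -1, -1};
        System.out.println("disconnected=" + looksDisconnected(lost));
    }
}
```

Running it prints vendor=0x1022 device=0x1453 followed by disconnected=true.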

 

posted @ 2018-11-15 19:56  Robird