GPU Servers
Resources
AS -5126GS-TNRT
- Motherboard: H14DSG-O-CPU (H14)
- Chassis: CSE-528G2TSR000NDP
- User manual
AS -4125GS-TNRT
- Motherboard: H13DSG-O-CPU (H13)
- Chassis: CSE-418G2TS
- User manual
- BIOS/BMC Downloads
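- BMC manual
Management Tools
- Product key tool (not applicable to H13 motherboards): zsrv/supermicro-product-key
Starting with the X13 series, every motherboard uses a JSON key. Generating that key requires a signature made with Supermicro's private key, so the tool cannot produce it. See: What about SFT-DCMS-SINGLE License key? | GitHub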
Redfish
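Redfish is an open management standard defined by the DMTF (Distributed Management Task Force). It is built on a RESTful API and is used for out-of-band server management in modern data centers.
Set up Redfish: download the BIOS/BMC package, then run: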
sudo bash saa*/script/Linux_enable_RHI.sh
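- In-Band: management performed from the host operating system
- OOB (Out-of-Band): management performed through the BMC, independent of the host operating system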
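Once Redfish is reachable, the BMC exposes its resources under the standard `/redfish/v1/` service root. The following is a minimal sketch of querying it with curl, assuming HTTP basic authentication and a self-signed BMC certificate. `redfish_url`, `redfish_get`, `BMC_IP`, `BMC_USER`, and `BMC_PASS` are hypothetical names chosen for illustration, not part of any Supermicro tool.

```shell
# Build the URL of a resource under the standard Redfish service root.
redfish_url() {
    # $1 = BMC address, $2 = resource path relative to /redfish/v1/
    printf 'https://%s/redfish/v1/%s\n' "$1" "$2"
}

# Fetch a resource as JSON. Assumes basic auth; -k skips TLS verification
# because BMCs commonly ship with self-signed certificates.
redfish_get() {
    curl -sk -u "$BMC_USER:$BMC_PASS" "$(redfish_url "$BMC_IP" "$1")"
}

# Example (not run here), after exporting BMC_IP, BMC_USER and BMC_PASS:
#   redfish_get Systems    # list the computer systems the BMC manages
```

Keeping URL construction separate from the request makes it easy to check the path before sending credentials to the BMC.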
IPMIView
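sudo ./IPMIView20 # use sudo -E when running over X11 forwarding
IPMIView has a remote console feature that forwards the server console window to an HTML5 viewer, which helps when debugging boot problems.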
SSM
sudo ./SSMInstaller_*_linux_x64_*.bin
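PCIe Configuration
- Single Root: one CPU communicates with all GPUs through two PLX switches. Suited to general-purpose workloads.
- Dual Root: each of the two CPUs communicates with its GPUs through its own PLX switch. Suited to deep learning and HPC.
- Direct Attached: no PLX switch; all PCIe lanes connect directly to the CPUs. Suited to graphics rendering and video editing.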
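To see which of these layouts a machine actually has, the PCIe tree can be inspected from the host. This is a sketch that assumes `lspci` (from pciutils) and the NVIDIA driver's `nvidia-smi` may or may not be installed; `show_topology` is a hypothetical helper name.

```shell
show_topology() {
    # PCIe tree: GPUs grouped under a shared bridge suggest a PCIe switch.
    if command -v lspci >/dev/null 2>&1; then
        lspci -tv
    else
        echo "lspci not found (install pciutils)"
    fi
    # GPU-to-GPU link matrix. PIX/PXB mean the path stays on PCIe bridges;
    # PHB/NODE/SYS mean the path goes through a CPU host bridge.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi topo -m
    else
        echo "nvidia-smi not found (NVIDIA driver not installed)"
    fi
    return 0
}

show_topology
```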
References:
- Differences in Single Root, Dual Root, Direct Attached Servers | Exxact Blog
- product-brief-GPU-Servers-Root-Explained.pdf
- What is the difference between PCIe 5.0 x16" Switch Dual-Root" (X13DEG-OA) and "PCIe 5.0 x16 CPU-to-GPU Interconnect" (X13DEG-QT)? | Support - Super Micro Computer, Inc.
SOL
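- Enable the serial console through kernel parameters. Edit /etc/default/grub and set:
sudoedit /etc/default/grub
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8"
GRUB_TERMINAL="console serial"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"
Then regenerate the GRUB configuration (for example, sudo update-grub on Debian or Ubuntu) and reboot.
- Enable the serial terminal service:
sudo systemctl enable serial-getty@ttyS0.service
sudo systemctl start serial-getty@ttyS0.service
- Connect to SOL:
sudo ipmitool -I lanplus -H <BMC_IP> -U <username> -P <password> sol activate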
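Passing the password with `-P` leaves it in the shell history and the process list. The sketch below relies instead on ipmitool's `-E` option, which reads the password from the `IPMI_PASSWORD` environment variable. `sol_cmd` is a hypothetical helper that only prints the command to run; it does not contact the BMC.

```shell
# Print the ipmitool command for an SOL action (activate, deactivate, info).
sol_cmd() {
    # $1 = BMC address, $2 = user name, $3 = SOL action (default: activate)
    printf 'ipmitool -I lanplus -H %s -U %s -E sol %s\n' "$1" "$2" "${3:-activate}"
}

# Example (not run here):
#   export IPMI_PASSWORD=<password>
#   sol_cmd <BMC_IP> <username>              # command that opens the console
#   sol_cmd <BMC_IP> <username> deactivate   # command that clears a stuck session
```

Inside an active session, typing `~.` ends the SOL connection.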
Interfaces
Network Interfaces
Interface Types
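- RJ45 (Registered Jack 45): the most common electrical (copper) network interface, supporting up to 10GbE (10 Gigabit Ethernet).
- SFP+ (enhanced Small Form-factor Pluggable): an optical network interface supporting 10GbE, 8G Fibre Channel, and similar applications.
- SFP28: the upgraded version of SFP+, mainly used for 25GbE.
- QSFP28 (Quad Small Form-factor Pluggable): an optical module interface standard for multi-lane, high-bandwidth links. The "28" means each of the four lanes provides up to 28 Gbps.
- OSFP (Octal Small Form-factor Pluggable): an optical module interface standard originally designed for data rates of 400 Gbps.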
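Interface Vendors
- 100-Gigabit MCX623106AN-CDAT (2 x QSFP56): Mellanox, acquired by NVIDIA in 2020. The MCX623106AN-CDAT belongs to the ConnectX-6 Dx series.
- 100-Gigabit BCM57508 (2 x QSFP28): Broadcom
- 100-Gigabit CX-4 (2 x QSFP28): Mellanox ConnectX-4 series. Note that the ConnectX-4 Lx variant tops out at 50GbE, so a 100GbE card is a ConnectX-4 rather than an Lx.
Power Connectors
- The RTX PRO 6000 uses a 12V-2x6 connector (marked H++ on the outside)
- The motherboard uses a MicroHi 2x4Y connector
- The power supply is a PWS-2K08A-1R
For more on the 12V-2x6 connector, see: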
- 12VHPWR and 12V-2x6 Compared | CORSAIR
- Will my 12VHPWR cable work with 12V-2x6? | CORSAIR
- 12V-2x6 adapter, or direct cable - what is best? | CORSAIR
- H+ vs H++ on 12V-2x6 / 12VHPWR cables: What does it mean? | CORSAIR
- Native 12V-2x6 vs 8-pin to 12V-2x6 - what is the difference? | CORSAIR
- 2x 8-pin to 12V-2x6 vs dual PCIe 6+2 to 12V-2x6: What's the difference? | CORSAIR
Naming Conventions
Server (A+ Server)
| Character | Representation | Options |
|---|---|---|
| Prefix | Product Category | • AS = A+ Server or Workstation • ASG = A+ Storage System |
| 1st | Form Factor | • 1 = 1U • 2 = 2U • 3 = 3U • 4 = 4U • 8 = 8U • A = 10U |
| 2nd | HD Tray Type / Chassis | • 0 = 3.5" • 1 = 2.5" or EDSFF |
| 3rd | Processor Count | • 1 = Single • 2 = Dual • 4 = Quad |
| 4th | Generation | • 6 = 6th Gen: EPYC™ 9005 series (Socket SP5) or Instinct™ MI300A APUs (Socket SH5) • 5 = 5th Gen: EPYC 9004 series (Socket SP5) or EPYC 8004 series (Socket SP6) or EPYC 4004 series (Socket AM5) or Ryzen 7000/8000/9000 series (Socket AM5) or Ryzen Threadripper™ PRO 7000WX series (Socket sTR5) |
| 5th & 6th | Server Platform or MB platform/Chipset/Socket | • A = Socket AM5 (Zen 4) • C = CloudDC (5th) • FT = FlexTwin™ • H = Hyper architecture (5th); Socket SH5 (6th) • G = GPU • GT = GrandTwin® • MR = MicroCloud • S = Socket SP5/SP6 (Zen 4) • V = Socket sTR5 / WRX90 Chipset |
| 7th/8th/9th and beyond | Additional Features | • 1 = Single Root • 2 = Dual Root • B = NVIDIA Blackwell Architecture • H = NVIDIA Hopper Architecture • LCC = Liquid cooling • M = MI300A/X • N = NVMe • T = SATA • R = Redundant Power |
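The table can be applied mechanically to the leading characters of a model number. This is a minimal sketch in POSIX shell that decodes only the first four characters of an `AS` model; `decode_as_model` is a hypothetical helper, and form-factor digits missing from the table (such as 5) are simply rendered as "<digit>U" on the assumption that they follow the same pattern.

```shell
# Decode the first four characters of an A+ Server model number.
decode_as_model() {
    # "AS -4125GS-TNRT" -> "4125GSTNRT": drop the AS prefix, spaces, hyphens.
    model=$(printf '%s' "$1" | sed 's/^AS//' | tr -d ' -')
    form=$(printf '%s' "$model" | cut -c1)
    tray=$(printf '%s' "$model" | cut -c2)
    cpus=$(printf '%s' "$model" | cut -c3)
    gen=$(printf '%s' "$model" | cut -c4)
    case $form in A) form=10U ;; *) form=${form}U ;; esac
    case $tray in 0) tray='3.5"' ;; 1) tray='2.5" or EDSFF' ;; *) tray=unknown ;; esac
    case $cpus in 1) cpus=single ;; 2) cpus=dual ;; 4) cpus=quad ;; *) cpus=unknown ;; esac
    printf '%s, %s drive trays, %s CPU, generation %s\n' "$form" "$tray" "$cpus" "$gen"
}

decode_as_model "AS -4125GS-TNRT"
# prints: 4U, 2.5" or EDSFF drive trays, dual CPU, generation 5
```

The variables are global because POSIX shell has no `local`; the platform and feature letters after the fourth character are left undecoded.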
SuperServer (2.5" HDDs)
| Character | Representation | Options |
|---|---|---|
| Prefix | Product Category | • SYS = Intel SuperServer |
| 1st | Chassis Form Factor | • 1 = 1U • 2 = 2U • 4 = 4U • 7 = Tower |
| 2nd | CPU/Socket Quantity | • 1 = Single Processor (UP) • 2 = Dual Processor (DP) • 4 = Dual Processor (DP), 4U |
| 3rd | Generation | • 1 = X13 Series • 0 = X12 Series |
| 4th | GPU Info. | • GP = GPU • GQ = Quad GPU |
| 5th | Storage Features | • T = SATA • N = NVMe |
| 6th | GPU (Optional) | • A = A100 SXM4 |
| 7th | Power Type | • R = Redundant Power Supplies |
| 8th | Connectivity | • T = 10GbE |
Reference: Best GPU Server from Supermicro for Modern Data Center | Supermicro
GPU Servers
Universal GPU systems
- Intel SXM x 8: SYS-821GE-TNHR
- AMD SXM x 8: AS -8125GS-TNHR
- Intel SXM x 4: SYS-421GU-TNXR
GPU Lines with PCIe 5.0
- Intel PCIe All NVMe: SYS-522GA-NRT
- Intel PCIe 8 x NVMe: SYS-521GE-TNRT
- Intel PCIe 4 x NVMe: SYS-421GE-TNRT3
- AMD PCIe: AS -5126GS-TNRT
- AMD PCIe: AS -4125GS-TNRT
NVIDIA MGX Systems
- Intel PCIe E1.S: SYS-422GL-NR
See: GPU server pricing | Supermicro eStore
Screws
During installation I noticed two similar screws:
#6-32x5/16" Phillips Hex Head Serrated Flange Bolts


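The two screws differ in whether the flange is serrated. The serrated version secures parts that vibrate, such as fans; the plain version secures ordinary parts.
- Flanged Hex: flanged hexagonal head (Head Style)
- Phillips: cross-shaped drive recess (Drive Style)
- #6-32: thread size (Thread Size)
- 5/16": bolt length
See: Computer case screws | Wikipedia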
CPU
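Intel Xeon 6952P
           ││││└─ Product suffix
           │└┴┴── SKU number
           └───── Generation digit
- Generation digit (6): denotes the Xeon 6 generation; a larger digit means a newer architecture.
- SKU number (952): the first digit, 9, usually indicates the performance tier, with larger digits meaning higher performance. The last two digits, 52, distinguish the specific model.
- Product suffix letter: P = Performance, E = Efficiency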
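The breakdown above maps onto fixed character positions. This is a small sketch that assumes a five-character model number with one generation digit, three SKU digits, and a suffix; `decode_xeon` is a hypothetical helper name.

```shell
# Split an Intel Xeon 6 model number into generation, SKU, and suffix.
decode_xeon() {
    gen=$(printf '%s' "$1" | cut -c1)       # first character, e.g. "6"
    sku=$(printf '%s' "$1" | cut -c2-4)     # next three characters, e.g. "952"
    suffix=$(printf '%s' "$1" | cut -c5-)   # everything after, e.g. "P"
    printf 'generation=%s sku=%s suffix=%s\n' "$gen" "$sku" "$suffix"
}

decode_xeon 6952P
# prints: generation=6 sku=952 suffix=P
```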
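Drives
- SIE: Secure Instant Erase, an enterprise feature for fast data erasure
- ISE: Instant Secure Erase, similar to SIE; simplifies data destruction
- SED: Self-Encrypting Drive, which protects data at rest
- OPAL: support for the TCG Opal specification, a standard for self-encrypting drives
- Dual-Port: two ports, useful for server redundancy and hot standby
- Single-Port: one port, suited to single-rack or cost-sensitive deployments
- DWPD: Drive Writes Per Day, a measure of endurance. 1 DWPD suits read-heavy workloads; more than 2 DWPD suits write-intensive workloads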
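A DWPD rating can be converted into total terabytes written (TBW) over the warranty period with the usual relation TBW = DWPD x capacity in TB x 365 x warranty years. The sketch below uses awk for the floating-point arithmetic; `tbw` is a hypothetical helper, and the 3.84 TB capacity and 5-year warranty in the example are assumed values for illustration only.

```shell
# TBW = DWPD x capacity (TB) x 365 days x warranty years
tbw() {
    # $1 = DWPD rating, $2 = capacity in TB, $3 = warranty period in years
    awk -v d="$1" -v c="$2" -v y="$3" 'BEGIN { printf "%.0f\n", d * c * 365 * y }'
}

tbw 1 3.84 5
# prints: 7008
```

With the same assumed capacity and warranty, a 3 DWPD drive works out to three times that figure.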
