(翻译 gafferongames) Client Server Connection 客户端服务器连接
https://gafferongames.com/post/client_server_connection/
So far in this article series we’ve discussed how games read and write packets, how to unify packet read and write into a single function, how to fragment and re-assemble packets, and how to send large blocks of data over UDP.
Now in this article we’re going to bring everything together and build a client/server connection on top of UDP.
到目前为止,在本系列文章中我们已经讨论了游戏是如何读取和写入数据包的,如何将数据包的读取和写入统一为一个函数,如何对数据包进行分片与重组,以及如何通过 UDP 发送大块数据。
现在,在这篇文章中,我们将把这些内容整合起来,构建一个基于 UDP 的客户端/服务器连接。
Background
Developers from a web background often wonder why games go to such effort to build a client/server connection on top of UDP, when for many applications, TCP is good enough. *
The reason is that games send time critical data.
Why don’t games use TCP for time critical data? The answer is that TCP delivers data reliably and in-order, and to do this on top of IP (which is unreliable, unordered) it holds more recent packets hostage in a queue while older packets are resent over the network.
This is known as head of line blocking and it’s a huuuuuge problem for games. To understand why, consider a game server broadcasting the state of the world to clients 10 times per-second. Each client advances time forward and wants to display the most recent state it receives from the server.
But if the packet containing state for time t = 10.0 is lost, under TCP we must wait for it to be resent before we can access t = 10.1 and 10.2, even though those packets have already arrived and contain the state the client wants to display.
有 Web 开发背景的开发者常常会疑惑,为什么游戏要费这么大力气在 UDP 之上构建一个客户端/服务器连接?对于很多应用来说,TCP 已经足够用了。*
原因是:游戏需要发送的是对时间敏感的数据。
那为什么游戏不使用 TCP 来发送这些时间敏感的数据呢?答案是:TCP 为了实现可靠、有序的数据传输,会在 IP(本身不可靠、无序)之上引入机制,将较新的数据包“扣押”在队列中,直到较老的丢包被重新传输并成功收到为止。
这就是所谓的 “队头阻塞(Head-of-Line Blocking)”,而对游戏来说,这是一个巨大的问题。
要理解为什么这很严重,我们可以想象一个游戏服务器每秒向客户端广播世界状态 10 次。客户端不断推进游戏时间,并希望展示自己收到的最新服务器状态。
但是如果包含时间 t = 10.0 状态的数据包丢失了,在 TCP 下,客户端必须等待该数据包被重传并成功接收,然后才能访问时间 t = 10.1 和 10.2 的数据,即使这些数据包已经到达,并且它们包含了客户端想要展示的游戏状态。
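This is the side effect of TCP’s reliable, in-order delivery: even when newer data has already arrived, you can’t use it until the older data has been filled in. For a real-time game, that delay is unacceptable.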

Worse still, by the time the resent packet arrives, it’s far too late for the client to actually do anything useful with it. The client has already advanced past 10.0 and wants to display something around 10.3 or 10.4!
So why resend dropped packets at all? BINGO! What we’d really like is an option to tell TCP: “Hey, I don’t care about old packets being resent, by the time they arrive I can’t use them anyway, so just let me skip over them and access the most recent data”.
Unfortunately, TCP simply does not give us this option :(
更糟糕的是,当那个被重传的数据包最终抵达时,客户端早就已经过了 10.0,现在正想显示 10.3 或 10.4 附近的状态了! 这个时候再拿到 10.0 的状态数据,根本没什么用了。
所以我们不禁要问:为什么还要重传丢失的数据包呢?
BINGO! 我们真正想要的是这样一种能力,能告诉 TCP:“嘿,我不关心旧的数据包是否重传,它们到的时候我已经用不上了,干脆跳过它们,直接给我最新的数据吧!”
不幸的是,TCP 并不提供这种选择
All data must be delivered reliably and in-order.
This creates terrible problems for time critical data where packet loss and latency exist. Situations like, you know, The Internet, where people play FPS games.
Large hitches corresponding to multiples of round trip time are added to the stream of data as TCP waits for dropped packets to be resent, which means additional buffering to smooth out these hitches, or long pauses where the game freezes and is non-responsive.
Neither option is acceptable for first person shooters, which is why virtually all first person shooters are networked using UDP. UDP doesn’t provide any reliability or ordering, so protocols built on top of it can access the most recent data without waiting for lost packets to be resent, implementing whatever reliability they need in radically different ways to TCP.
所有数据必须被可靠且按顺序地传输。
这对存在丢包和延迟的时间敏感数据造成了严重问题。比如,你懂的,互联网——人们在上面玩 FPS 游戏的地方。
由于 TCP 等待丢失的数据包重传,数据流中会出现相当于多个往返时延的大幅卡顿,
这意味着要么增加额外缓冲来平滑这些卡顿,要么出现游戏卡住、无响应的长时间暂停。
对于第一人称射击游戏来说,这两种方案都是不可接受的,这就是为什么几乎所有的第一人称射击游戏都使用 UDP 进行联网的原因。
UDP 不提供任何可靠性和顺序保证,因此在其上构建的协议可以无需等待丢失的数据包被重传,就能访问最新的数据,并能用与 TCP 完全不同的方式,实现自己所需的可靠性。
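To make "access the most recent data without waiting for lost packets" concrete, here is a minimal sketch of a receiver that keeps only the newest world state it has seen. The `Snapshot` and `LatestSnapshot` names are hypothetical and not from the original article; the sketch assumes each state packet is stamped with the server time it describes.

```cpp
#include <cassert>

// Hypothetical snapshot of world state, stamped with the server time it describes.
struct Snapshot
{
    double time;
    // ... game-specific state would go here
};

// Keeps only the newest snapshot received so far.
// Late or duplicate snapshots are dropped instead of being waited for.
class LatestSnapshot
{
public:
    LatestSnapshot() : m_hasSnapshot( false ) { m_latest.time = 0.0; }

    // Returns true if the snapshot was newer than the stored one and was kept.
    bool Receive( const Snapshot & snapshot )
    {
        if ( m_hasSnapshot && snapshot.time <= m_latest.time )
            return false;               // stale or duplicate: ignore it
        m_latest = snapshot;
        m_hasSnapshot = true;
        return true;
    }

    bool HasSnapshot() const { return m_hasSnapshot; }

    double LatestTime() const
    {
        assert( m_hasSnapshot );
        return m_latest.time;
    }

private:
    bool m_hasSnapshot;
    Snapshot m_latest;
};
```

With this kind of receiver, losing the packet for t = 10.0 costs nothing once 10.1 has arrived: the client simply displays the newer state, and a late copy of 10.0 is discarded.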
But, using UDP comes at a cost:
UDP doesn’t provide any concept of connection.
We have to build that ourselves. This is a lot of work! So strap in, get ready, because we’re going to build it all up from scratch using the same basic techniques first person shooters use when creating their protocols over UDP. You can use this client/server protocol for games or non-gaming applications and, provided the data you send is time critical, I promise you, it’s well worth the effort.
* These days even web servers are transitioning to UDP via Google’s QUIC. If you still think TCP is good enough for time critical data in 2016, I encourage you to put that in your pipe and smoke it :)
但是,使用 UDP 是有代价的:
UDP 不提供任何“连接”的概念。
我们必须自己构建这一部分。这是一项繁重的工作!所以,系好安全带,准备好了,因为我们将从零开始构建,使用第一人称射击游戏在构建基于 UDP 协议时所采用的基础技术。
你可以将这个客户端/服务器协议用于游戏或非游戏应用,只要你发送的数据是时间敏感的,我向你保证,这一切努力都非常值得。
* 如今,甚至 Web 服务器也通过 Google 的 QUIC 协议转向使用 UDP。如果你仍然认为 TCP 足以胜任 2016 年的时间敏感型数据传输,我劝你好好琢磨一下这个事实 :)
Client/Server Abstraction 客户端/服务器抽象层
The goal is to create an abstraction on top of a UDP socket where our server presents a number of virtual slots for clients to connect to:
目标是在 UDP 套接字之上创建一个抽象层,使得服务器可以提供多个虚拟连接槽位,供客户端连接:

When a client requests a connection, it gets assigned to one of these slots:
当客户端请求连接时,它将被分配到其中一个虚拟槽位:

If a client requests connection, but no slots are available, the server is full and the connection request is denied:
如果客户端请求连接,但没有可用的槽位,则服务器已满,连接请求将被拒绝:

Once a client is connected, packets are exchanged in both directions. These packets form the basis for the custom protocol between the client and server which is game specific.
一旦客户端连接,数据包将在双向交换。这些数据包构成了客户端和服务器之间的自定义协议的基础,该协议是特定于游戏的。

In a first person shooter, packets are sent continuously in both directions. Clients send input to the server as quickly as possible, often 30 or 60 times per-second, and the server broadcasts the state of the world to clients 10, 20 or even 60 times per-second.
Because of this steady flow of packets in both directions there is no need for keep-alive packets. If at any point packets stop being received from the other side, the connection simply times out. No packets for 5 seconds is a good timeout value in my opinion, but you can be more aggressive if you want.
When a client slot times out on the server, it becomes available for other clients to connect. When the client times out, it transitions to an error state.
在第一人称射击游戏中,数据包持续在双向传输。客户端尽可能快速地将输入发送到服务器,通常每秒发送30次或60次,而服务器则以每秒10次、20次甚至60次的频率广播世界状态给客户端。
由于数据包在双向稳定流动,因此不需要保活(keep-alive)数据包。如果在任何时候不再从另一端接收到数据包,连接就会超时。在我看来,5秒没有收到数据包是一个不错的超时值,但如果你愿意,也可以设置得更激进一些。
当服务器上的客户端插槽超时时,它将变为可供其他客户端连接的状态。当客户端超时后,它会转变为错误状态。
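The timeout rule described above can be sketched as per-slot bookkeeping on the server. This is only an illustration: `ClientSlots`, `OnPacketReceived` and `CheckForTimeouts` are hypothetical names, and time is assumed to be a monotonically increasing value in seconds supplied by the caller.

```cpp
#include <cassert>

const int MaxClients = 64;
const double TimeoutSeconds = 5.0;      // the timeout value suggested in the text

// Hypothetical per-slot bookkeeping for connection timeouts.
struct ClientSlots
{
    bool connected[MaxClients];
    double lastPacketReceiveTime[MaxClients];

    ClientSlots()
    {
        for ( int i = 0; i < MaxClients; ++i )
        {
            connected[i] = false;
            lastPacketReceiveTime[i] = 0.0;
        }
    }

    // Call whenever any packet arrives from the client in this slot.
    void OnPacketReceived( int clientIndex, double time )
    {
        assert( clientIndex >= 0 && clientIndex < MaxClients );
        lastPacketReceiveTime[clientIndex] = time;
    }

    // Frees every connected slot that has been silent for the timeout period.
    // Returns the number of slots that timed out.
    int CheckForTimeouts( double time )
    {
        int numTimedOut = 0;
        for ( int i = 0; i < MaxClients; ++i )
        {
            if ( connected[i] && time - lastPacketReceiveTime[i] >= TimeoutSeconds )
            {
                connected[i] = false;   // slot is free for another client
                ++numTimedOut;
            }
        }
        return numTimedOut;
    }
};
```

Because payload packets already flow continuously in both directions, refreshing the timestamp on every received packet is all the keep-alive logic this needs.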
Simple Connection Protocol
Let’s get started with the implementation of a simple protocol. It’s a bit basic and more than a bit naive, but it’s a good starting point and we’ll build on it during the rest of this article, and the next few articles in this series.
First up we have the client state machine.
The client is in one of three states:
- Disconnected
- Connecting
- Connected
Initially the client starts in disconnected.
简单连接协议
让我们开始实现一个简单的协议。它有点基础,也有点天真,但它是一个很好的起点,在接下来的这篇文章以及本系列接下来的几篇文章中,我们将基于它进行构建。
首先,我们来看客户端的状态机。
客户端处于三种状态之一:
- 断开连接 (Disconnected)
- 正在连接 (Connecting)
- 已连接 (Connected)
客户端初始时处于“断开连接”状态。
When a client connects to a server, it transitions to the connecting state and sends connection request packets to the server:
当客户端连接到服务器时,它会切换到“正在连接”状态,并向服务器发送连接请求包:

The CRC32 and implicit protocol id in the packet header allow the server to trivially reject UDP packets not belonging to this protocol or from a different version of it.
Since connection request packets are sent over UDP, they may be lost, received out of order or in duplicate.
Because of this we do two things: 1) we keep sending packets for the client state until we get a response from the server or the client times out, and 2) on both client and server we ignore any packets that don’t correspond to what we are expecting, since a lot of redundant packets are flying over the network.
包头中的CRC32和隐式协议ID允许服务器轻松地拒绝不属于该协议或来自不同版本的UDP数据包。
由于连接请求包是通过UDP发送的,它们可能会丢失、乱序或重复接收。
因此,我们采取了两个措施:1) 我们会继续发送客户端状态包,直到收到来自服务器的响应或客户端超时;2) 在客户端和服务器端,我们会忽略任何不符合预期的数据包,因为网络上会有很多冗余的数据包。
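One way to picture the "CRC32 and implicit protocol id" check is sketched below. It assumes the checksum is computed over the protocol id followed by the packet bytes, while only the checksum (never the id itself) is written to the packet, so a receiver with a different protocol id computes a different value and drops the packet. `Crc32` is the standard reflected CRC-32; `PacketChecksum` and the byte order used for the id are assumptions made for this sketch, not the article's actual wire format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit for brevity.
// Pass a previous result as `crc` to continue the checksum across several buffers.
uint32_t Crc32( const uint8_t * data, size_t length, uint32_t crc = 0 )
{
    crc = ~crc;
    for ( size_t i = 0; i < length; ++i )
    {
        crc ^= data[i];
        for ( int bit = 0; bit < 8; ++bit )
            crc = ( crc >> 1 ) ^ ( 0xEDB88320u & ( 0u - ( crc & 1u ) ) );
    }
    return ~crc;
}

// Checksum over the protocol id followed by the packet bytes.
// The id is fed in as four little-endian bytes (an arbitrary choice for this sketch).
uint32_t PacketChecksum( uint32_t protocolId, const uint8_t * packet, size_t length )
{
    const uint8_t idBytes[4] =
    {
        uint8_t( protocolId ),
        uint8_t( protocolId >> 8 ),
        uint8_t( protocolId >> 16 ),
        uint8_t( protocolId >> 24 )
    };
    return Crc32( packet, length, Crc32( idBytes, 4 ) );
}

// Receiver side: recompute with our own protocol id and compare.
bool IsPacketForThisProtocol( uint32_t ourProtocolId, uint32_t receivedChecksum,
                              const uint8_t * packet, size_t length )
{
    return PacketChecksum( ourProtocolId, packet, length ) == receivedChecksum;
}
```

Bumping the protocol id with each incompatible protocol version gives the "different version" rejection for free, since old clients will produce checksums the new server does not accept.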
On the server, we have the following data structure:
    const int MaxClients = 64;

    class Server
    {
        int m_maxClients;
        int m_numConnectedClients;
        bool m_clientConnected[MaxClients];
        Address m_clientAddress[MaxClients];
    };
Which lets the server lookup a free slot for a client to join (if any are free):
    int Server::FindFreeClientIndex() const
    {
        for ( int i = 0; i < m_maxClients; ++i )
        {
            if ( !m_clientConnected[i] )
                return i;
        }
        return -1;
    }
Find the client index corresponding to an IP address and port: 根据 IP 地址和端口查找对应的客户端索引:
    int Server::FindExistingClientIndex( const Address & address ) const
    {
        for ( int i = 0; i < m_maxClients; ++i )
        {
            if ( m_clientConnected[i] && m_clientAddress[i] == address )
                return i;
        }
        return -1;
    }
Check if a client is connected to a given slot: 检查某个客户端是否已连接到指定的槽位:
    bool Server::IsClientConnected( int clientIndex ) const
    {
        return m_clientConnected[clientIndex];
    }
… and retrieve a client’s IP address and port by client index: 并通过客户端索引获取该客户端的 IP 地址和端口:
    const Address & Server::GetClientAddress( int clientIndex ) const
    {
        return m_clientAddress[clientIndex];
    }
Using these queries we implement the following logic when the server processes a connection request packet:
- If the server is full, reply with connection denied.
- If the connection request is from a new client and we have a slot free, assign the client to a free slot and respond with connection accepted.
- If the sender corresponds to the address of a client that is already connected, also reply with connection accepted. This is necessary because the first response packet may not have gotten through due to packet loss. If we don’t resend this response, the client gets stuck in the connecting state until it times out.
当服务器处理连接请求数据包时,使用这些查询我们可以实现以下逻辑:
- 如果服务器已满,回复连接被拒绝。
- 如果连接请求来自一个新客户端,且我们有空闲的插槽,则将该客户端分配到一个空闲插槽,并回复“连接已接受”。
- 如果发送者对应的地址是一个已经连接的客户端,也回复“连接已接受”。这是必要的,因为第一个响应数据包可能因为丢包没有到达客户端。如果我们不重新发送该响应,客户端就会卡在“连接中”状态,直到超时为止。
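The three rules above can be sketched as a single function built on the queries shown earlier. This is a minimal, self-contained illustration: the `Address` struct, the `ConnectionResponse` enum and the `ProcessConnectionRequest` name are assumptions for this sketch, and actually sending the reply packet is left to the caller. Note that the already-connected check runs first, so an existing client still gets its accepted reply resent even when the server is full.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical minimal address type: IPv4 address plus UDP port.
struct Address
{
    uint32_t ip;
    uint16_t port;
    Address() : ip( 0 ), port( 0 ) {}
    Address( uint32_t ip_, uint16_t port_ ) : ip( ip_ ), port( port_ ) {}
    bool operator == ( const Address & other ) const { return ip == other.ip && port == other.port; }
};

enum class ConnectionResponse { Accepted, Denied };

const int MaxClients = 64;

class Server
{
public:
    explicit Server( int maxClients ) : m_maxClients( maxClients ), m_numConnectedClients( 0 )
    {
        assert( maxClients > 0 && maxClients <= MaxClients );
        for ( int i = 0; i < MaxClients; ++i )
            m_clientConnected[i] = false;
    }

    // Decides how to reply to a connection request from `from`.
    // clientIndex is only meaningful when the result is Accepted.
    ConnectionResponse ProcessConnectionRequest( const Address & from, int & clientIndex )
    {
        // Already connected: reply accepted again, the first reply may have been lost.
        clientIndex = FindExistingClientIndex( from );
        if ( clientIndex != -1 )
            return ConnectionResponse::Accepted;

        // New client: deny if the server is full, otherwise assign a free slot.
        clientIndex = FindFreeClientIndex();
        if ( clientIndex == -1 )
            return ConnectionResponse::Denied;

        m_clientConnected[clientIndex] = true;
        m_clientAddress[clientIndex] = from;
        ++m_numConnectedClients;
        return ConnectionResponse::Accepted;
    }

    int GetNumConnectedClients() const { return m_numConnectedClients; }

private:
    int FindFreeClientIndex() const
    {
        for ( int i = 0; i < m_maxClients; ++i )
            if ( !m_clientConnected[i] )
                return i;
        return -1;
    }

    int FindExistingClientIndex( const Address & address ) const
    {
        for ( int i = 0; i < m_maxClients; ++i )
            if ( m_clientConnected[i] && m_clientAddress[i] == address )
                return i;
        return -1;
    }

    int m_maxClients;
    int m_numConnectedClients;
    bool m_clientConnected[MaxClients];
    Address m_clientAddress[MaxClients];
};
```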
The connection accepted packet tells the client which client index it was assigned, which the client needs in order to know which player it is in the game:

Once the server sends a connection accepted packet, from its point of view it considers that client connected. As the server ticks forward, it watches connected client slots, and if no packets have been received from a client for 5 seconds, the slot times out and is reset, ready for another client to connect.
Back to the client. While the client is in the connecting state the client listens for connection denied and connection accepted packets from the server. Any other packets are ignored.
If the client receives connection accepted, it transitions to connected. If it receives connection denied, or after 5 seconds hasn’t received any response from the server, it transitions to disconnected.
Once the client hits connected it starts sending connection payload packets to the server. If no packets are received from the server in 5 seconds, the client times out and transitions to disconnected.
一旦服务器发送了“连接接受”数据包,从服务器的角度来看,它就认为该客户端已经连接成功。随着服务器持续运行,它会持续监视已连接的客户端槽位,如果某个客户端在 5 秒内没有发送任何数据包,则该槽位会超时并被重置,准备接受另一个客户端的连接。
回到客户端这边。当客户端处于“连接中”状态时,它会监听来自服务器的“连接拒绝”和“连接接受”数据包,其他类型的数据包则会被忽略。
如果客户端收到了“连接接受”数据包,它就会转入“已连接”状态。如果收到“连接拒绝”,或者在 5 秒内没有收到服务器的任何回应,它就会转入“断开连接”状态。
一旦客户端进入“已连接”状态,它就开始向服务器发送连接负载数据包。如果在之后的 5 秒内仍然没有收到服务器的数据包,客户端也会超时并转为“断开连接”状态。
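The client side of this simple protocol can be sketched as a small state machine. The `Client` class, the `PacketType` enum and the method names are hypothetical; the sketch only tracks state and leaves socket I/O to the caller. It assumes that, once connected, only payload packets from the server refresh the timeout.

```cpp
#include <cassert>

enum class ClientState { Disconnected, Connecting, Connected };

enum class PacketType { ConnectionRequest, ConnectionAccepted, ConnectionDenied, ConnectionPayload };

const double TimeoutSeconds = 5.0;

class Client
{
public:
    Client() : m_state( ClientState::Disconnected ), m_clientIndex( -1 ), m_lastPacketReceiveTime( 0.0 ) {}

    void Connect( double time )
    {
        m_state = ClientState::Connecting;
        m_clientIndex = -1;
        m_lastPacketReceiveTime = time;     // start the timeout clock
    }

    // The packet type to (re)send on every tick while in the current state.
    PacketType PacketToSend() const
    {
        assert( m_state != ClientState::Disconnected );
        return m_state == ClientState::Connecting ? PacketType::ConnectionRequest
                                                  : PacketType::ConnectionPayload;
    }

    // Handle a packet from the server; anything unexpected for the current state is ignored.
    void ProcessPacket( PacketType type, int clientIndex, double time )
    {
        if ( m_state == ClientState::Connecting )
        {
            if ( type == PacketType::ConnectionAccepted )
            {
                m_state = ClientState::Connected;
                m_clientIndex = clientIndex;
                m_lastPacketReceiveTime = time;
            }
            else if ( type == PacketType::ConnectionDenied )
            {
                m_state = ClientState::Disconnected;
            }
        }
        else if ( m_state == ClientState::Connected && type == PacketType::ConnectionPayload )
        {
            m_lastPacketReceiveTime = time;
        }
    }

    // Call once per tick: times out while connecting or connected.
    void Update( double time )
    {
        if ( m_state != ClientState::Disconnected && time - m_lastPacketReceiveTime >= TimeoutSeconds )
            m_state = ClientState::Disconnected;
    }

    ClientState GetState() const { return m_state; }
    int GetClientIndex() const { return m_clientIndex; }

private:
    ClientState m_state;
    int m_clientIndex;
    double m_lastPacketReceiveTime;
};
```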
Naive Protocol is Naive
While this protocol is easy to implement, we can’t use a protocol like this in production. It’s way too naive. It simply has too many weaknesses to be taken seriously:
虽然这个协议实现起来很简单,但我们不能在生产环境中使用这样的协议。它实在是太天真(naive)了。它有太多的弱点,无法被认真对待:
- Spoofed packet source addresses can be used to redirect connection accepted responses to a target (victim) address. If the connection accepted packet is larger than the connection request packet, attackers can use this protocol as part of a DDoS amplification attack.
  伪造的数据包源地址可以用来将连接接受响应重定向到目标(受害者)地址。如果连接接受包比连接请求包大,攻击者可以利用这个协议进行DDoS放大攻击。
- Spoofed packet source addresses can be used to trivially fill all client slots on a server by sending connection request packets from n different IP addresses, where n is the number of clients allowed per-server. This is a real problem for dedicated servers. Obviously you want to make sure that only real clients are filling slots on servers you are paying for.
  伪造的数据包源地址还可以通过从n个不同的IP地址发送连接请求包来轻松填满服务器上的所有客户端插槽,其中n是每个服务器允许的客户端数量。这对专用服务器来说是一个实际问题。显然,你需要确保只有真实的客户端才能占用你所支付的服务器上的插槽。
- An attacker can trivially fill all slots on a server by varying the client UDP port number on each client connection. This is because clients are considered unique on an address + port basis. This isn’t easy to fix because, due to NAT (network address translation), different players behind the same router collapse to the same IP address with only the port being different, so we can’t just consider clients to be unique at the IP address level sans port.
  攻击者可以通过在每个客户端连接中更改客户端的UDP端口号,轻松填满服务器上的所有插槽。这是因为客户端是基于地址+端口来唯一标识的。这不容易修复,因为由于NAT(网络地址转换),同一个路由器后面的不同玩家会共享相同的IP地址,只有端口不同,因此我们不能仅仅将客户端视为基于IP地址唯一的。
- Traffic between the client and server can be read and modified in transit by a third party. While the CRC32 protects against packet corruption, an attacker would simply recalculate the CRC32 to match the modified packet.
  客户端和服务器之间的流量可以被第三方读取和修改。虽然CRC32可以防止数据包损坏,攻击者仍然可以重新计算CRC32以匹配修改后的数据包。
- If an attacker knows the client and server IP addresses and ports, they can impersonate the client or server. This gives an attacker the power to completely hijack a client’s connection and perform actions on their behalf.
  如果攻击者知道客户端和服务器的IP地址和端口,他们就可以伪装成客户端或服务器。这使得攻击者能够完全劫持客户端的连接,并代表客户端执行操作。
- Once a client is connected to a server there is no way for them to disconnect cleanly, they can only time out. This creates a delay before the server realizes a client has disconnected, or before a client realizes the server has shut down. It would be nice if both the client and server could indicate a clean disconnect, so the other side doesn’t need to wait for timeout in the common case.
  一旦客户端连接到服务器,就没有办法干净地断开连接,它们只能超时。这会导致延迟,直到服务器意识到客户端已断开连接,或者直到客户端意识到服务器已经关闭。如果客户端和服务器都能指示一个干净的断开连接,那将会很好,这样在常见情况下另一方就不需要等待超时了。
- Clean disconnection is usually implemented with a disconnect packet, however because an attacker can impersonate the client and server with spoofed packets, doing so would give the attacker the ability to disconnect a client from the server whenever they like, provided they know the client and server IP addresses and the structure of the disconnect packet.
  干净的断开连接通常通过断开连接数据包来实现,然而,由于攻击者可以通过伪造数据包来冒充客户端和服务器,进行这种操作将使得攻击者能够在任何时候断开客户端与服务器的连接,只要他们知道客户端和服务器的IP地址以及断开连接数据包的结构。
- If a client disconnects dirty and attempts to reconnect before their slot times out on the server, the server still thinks that client is connected and replies with connection accepted to handle packet loss. The client processes this response and thinks it’s connected to the server, but it’s actually in an undefined state.
  如果客户端断开连接时没有正确断开,并尝试在其插槽超时之前重新连接,服务器仍然认为该客户端已连接,并回复连接接受,以处理数据包丢失。客户端处理此响应并认为自己已连接到服务器,但实际上它处于未定义的状态。
While some of these problems require authentication and encryption before they can be fully solved, we can make some small steps forward to improve the protocol before we get to that. These changes are instructive.
虽然其中一些问题需要认证和加密才能完全解决,但在我们达到这一点之前,我们可以采取一些小步骤来改进协议。这些改进是具有指导意义的。
Improving The Connection Protocol
The first thing we want to do is only allow clients to connect if they can prove they are actually at the IP address and port they say they are.
To do this, we no longer accept client connections immediately on connection request, instead we send back a challenge packet, and only complete connection when a client replies with information that can only be obtained by receiving the challenge packet.
我们想要做的第一件事是,只允许客户端在能够证明自己确实位于所声明的 IP 地址和端口时才允许连接。
为此,我们不再在连接请求时立即接受客户端连接,而是先发送一个挑战包,只有当客户端回复一个只有通过接收挑战包才能获得的信息时,我们才完成连接。
The sequence of operations in a typical connect now looks like this:

To implement this we need an additional data structure on the server. Somewhere to store the challenge data for pending connections, so when a challenge response comes in from a client we can check against the corresponding entry in the data structure and make sure it’s a valid response to the challenge sent to that address.
While the pending connect data structure can be made larger than the maximum number of connected clients, it’s still ultimately finite and is therefore subject to attack. We’ll cover some defenses against this in the next article. But for the moment, be happy at least that attackers can’t progress to the connected state with spoofed packet source addresses.
为了实现这一点,我们需要在服务器上增加一个额外的数据结构,用于存储待处理连接的挑战数据。这样,当来自客户端的挑战响应到达时,我们可以与数据结构中的相应条目进行比对,确保它是对该地址发送的挑战的有效响应。
虽然待处理连接数据结构可以比最大连接数更大,但它毕竟是有限的,因此仍然容易受到攻击。我们将在下一篇文章中讨论一些防御方法。但目前,至少可以确保攻击者无法通过伪造包源地址进入连接状态。
Next, to guard against our protocol being used in a DDoS amplification attack, we’ll inflate client to server packets so they’re large relative to the response packet sent from the server. This means we add padding to both connection request and challenge response packets and enforce this padding on the server, ignoring any packets without it. Now our protocol effectively has DDoS minification for requests -> responses, making it highly unattractive for anyone thinking of launching this kind of attack.
Finally, we’ll do one last small thing to improve the robustness and security of the protocol. It’s not perfect, we need authentication and encryption for that, but at least it ups the ante, requiring attackers to actually sniff traffic in order to impersonate the client or server. We’ll add some unique random identifiers, or ‘salts’, to make each client connection unique from previous ones coming from the same IP address and port.
接下来,为了防止我们的协议被用于DDoS放大攻击,我们将使客户端到服务器的数据包变得相对于服务器发送的响应包较大。为此,我们在连接请求和挑战响应数据包中添加填充,并在服务器端强制执行这一填充,忽略任何没有填充的包。这样,我们的协议实际上实现了DDoS请求->响应的最小化,从而使其对任何想要发动这种攻击的人变得非常不具吸引力。
最后,我们再做一件小事来提高协议的健壮性和安全性。虽然这并不完美,仍然需要认证和加密来确保安全,但至少它增加了攻击者的难度,要求他们必须嗅探流量才能伪装成客户端或服务器。我们将添加一些独特的随机标识符,或称为“盐”,使每个客户端连接与来自相同IP地址和端口的先前连接不同。
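The padding rule described above (client to server packets are inflated, and the server ignores any that arrive without the padding) can be sketched as a simple size check. The byte counts below are made-up values chosen only for illustration; the article does not state actual packet sizes here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sizes, for illustration only.
const size_t PaddedRequestBytes = 1000;     // connection request and challenge response, after padding
const size_t MaxResponseBytes = 100;        // largest packet the server sends in reply during connect

// The whole point: a request must cost the sender more bytes than the reply it can trigger.
static_assert( MaxResponseBytes < PaddedRequestBytes, "responses must be smaller than requests" );

// Server-side check: ignore any connection request or challenge response
// that does not arrive at the full padded size.
bool HasRequiredPadding( size_t packetBytes )
{
    return packetBytes == PaddedRequestBytes;
}

// Bytes sent back per byte received. Below 1.0 means the protocol shrinks
// traffic reflected off the server rather than amplifying it.
double ResponseToRequestRatio()
{
    return double( MaxResponseBytes ) / double( PaddedRequestBytes );
}
```

With these example numbers an attacker spoofing a victim’s address has to send ten bytes for every byte the server reflects, which is the opposite of amplification.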
The connection request packet now looks like this:

The client salt in the packet is a random 64 bit integer rolled each time the client starts a new connect. Connection requests are now uniquely identified by the IP address and port combined with this client salt value. This distinguishes packets from the current connection from any packets belonging to a previous connection, which makes connection and reconnection to the server much more robust.
Now when a connection request arrives and a pending connection entry can’t be found in the data structure (according to IP, port and client salt) the server rolls a server salt and stores it with the rest of the data for the pending connection before sending a challenge packet back to the client. If a pending connection is found, the salt value stored in the data structure is used for the challenge. This way there is always a consistent pair of client and server salt values corresponding to each client session.
数据包中的客户端盐是每次客户端启动新连接时生成的一个随机64位整数。连接请求现在通过IP地址和端口与该客户端盐值唯一标识。这将当前连接的数据包与任何属于先前连接的数据包区分开来,从而使连接和重新连接到服务器变得更加稳健。
现在,当一个连接请求到达并且在数据结构中找不到对应的待处理连接条目(根据IP、端口和客户端盐值)时,服务器会生成一个服务器盐,并将其与待处理连接的其余数据一起存储,然后发送一个挑战数据包回客户端。如果找到了待处理连接,数据结构中存储的盐值将用于挑战。通过这种方式,总是会有一对一致的客户端和服务器盐值与每个客户端会话对应。
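The find-or-create behavior for pending connections described above might look roughly like this. `PendingConnections`, `FindOrAdd` and `GenerateSalt` are hypothetical names, the `Address` struct is a stand-in, and `std::random_device` is used only as a convenient source of random bits for the sketch; how entries expire and how the table is defended against flooding are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical minimal address type: IPv4 address plus UDP port.
struct Address
{
    uint32_t ip;
    uint16_t port;
    Address() : ip( 0 ), port( 0 ) {}
    Address( uint32_t ip_, uint16_t port_ ) : ip( ip_ ), port( port_ ) {}
    bool operator == ( const Address & other ) const { return ip == other.ip && port == other.port; }
};

// Rolls a random 64 bit salt from two 32 bit draws.
// Assumption: a production protocol would pick its random source more carefully.
uint64_t GenerateSalt()
{
    static std::random_device device;
    const uint64_t high = device() & 0xFFFFFFFFu;
    const uint64_t low = device() & 0xFFFFFFFFu;
    return ( high << 32 ) | low;
}

struct PendingConnection
{
    Address address;
    uint64_t clientSalt;
    uint64_t serverSalt;
};

class PendingConnections
{
public:
    explicit PendingConnections( size_t maxEntries ) : m_maxEntries( maxEntries ) {}

    // Looks up the entry for (address, clientSalt). If none exists, creates one
    // with a freshly rolled server salt. Returns false if the table is full.
    bool FindOrAdd( const Address & address, uint64_t clientSalt, uint64_t & serverSalt )
    {
        for ( size_t i = 0; i < m_entries.size(); ++i )
        {
            if ( m_entries[i].address == address && m_entries[i].clientSalt == clientSalt )
            {
                serverSalt = m_entries[i].serverSalt;   // reuse: same challenge for resent requests
                return true;
            }
        }
        if ( m_entries.size() >= m_maxEntries )
            return false;
        PendingConnection entry = { address, clientSalt, GenerateSalt() };
        m_entries.push_back( entry );
        serverSalt = entry.serverSalt;
        return true;
    }

    size_t GetCount() const { return m_entries.size(); }

private:
    size_t m_maxEntries;
    std::vector<PendingConnection> m_entries;
};
```

Reusing the stored server salt for repeated requests is what keeps the client and server salt pair consistent for a session even when the client resends its connection request many times.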

The client state machine has been expanded so connecting is replaced with two new states: sending connection request and sending challenge response, but it’s the same idea as before. Client states repeatedly send the packet corresponding to that state to the server while listening for the response that moves it forward to the next state, or back to an error state. If no response is received, the client times out and transitions to disconnected.
客户端状态机已经扩展,连接状态被替换为两个新状态:发送连接请求和发送挑战响应,但它与之前的概念相同。客户端状态会重复发送对应状态的包到服务器,同时监听响应,响应会将客户端状态推进到下一个状态,或者返回到错误状态。如果没有收到响应,客户端会超时并转到断开连接状态。
The challenge response sent from the client to the server looks like this:

The utility of this being that once the client and server have established connection, we prefix all payload packets with the xor of the client and server salt values and discard any packets with the incorrect salt values. This neatly filters out packets from previous sessions and requires an attacker to sniff packets in order to impersonate a client or server.
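As a rough sketch of that filter: once connected, each side remembers the two salts for the session and drops any payload packet whose prefix is not their xor. The `Session` struct and function names are hypothetical, and the 64 bit prefix is assumed to have already been read out of the packet.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-connection record of the salt pair agreed during connect.
struct Session
{
    uint64_t clientSalt;
    uint64_t serverSalt;
};

// The value prefixed to every payload packet for this session.
uint64_t SessionSalt( const Session & session )
{
    return session.clientSalt ^ session.serverSalt;
}

// Receiver-side filter: discard payload packets carrying the wrong salt,
// for example leftovers from a previous session on the same address and port.
bool AcceptPayloadPacket( const Session & session, uint64_t packetSalt )
{
    return packetSalt == SessionSalt( session );
}
```

An attacker who has not seen either the challenge or a payload packet has to guess a 64 bit value, which is why impersonation now requires sniffing traffic.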


