RFC 1323 Using the Window Scale Option

This is the code in the kernel protocol stack that assigns the send window:

tp->snd_wnd = ntohs(th->window) << tp->rx_opt.snd_wscale;
snd_wscale : 4,    /* Window scaling received from sender    */
rcv_wscale : 4;    /* Window scaling to send to receiver    */
/* Maximal number of window scale according to RFC1323 */
#define TCP_MAX_WSCALE        14U

Let's look at what RFC 1323 says:

 Using the Window Scale Option

      A model implementation of window scaling is as follows, using the
      notation of RFC-793 [Postel81]:

      *    All windows are treated as 32-bit quantities for storage in
           the connection control block and for local calculations.
           This includes the send-window (SND.WND) and the receive-
           window (RCV.WND) values, as well as the congestion window.

      *    The connection state is augmented by two window shift counts,
           Snd.Wind.Scale and Rcv.Wind.Scale, to be applied to the
           incoming and outgoing window fields, respectively.

      *    If a TCP receives a <SYN> segment containing a Window Scale
           option, it sends its own Window Scale option in the <SYN,ACK>
           segment.

      *    The Window Scale option is sent with shift.cnt = R, where R
           is the value that the TCP would like to use for its receive
           window.

      *    Upon receiving a SYN segment with a Window Scale option
           containing shift.cnt = S, a TCP sets Snd.Wind.Scale to S and
           sets Rcv.Wind.Scale to R; otherwise, it sets both
           Snd.Wind.Scale and Rcv.Wind.Scale to zero.

      *    The window field (SEG.WND) in the header of every incoming
           segment, with the exception of SYN segments, is left-shifted
           by Snd.Wind.Scale bits before updating SND.WND:
              SND.WND = SEG.WND << Snd.Wind.Scale
           (assuming the other conditions of RFC793 are met, and using
           the "C" notation "<<" for left-shift).

      *    The window field (SEG.WND) of every outgoing segment, with
           the exception of SYN segments, is right-shifted by
           Rcv.Wind.Scale bits:
              SEG.WND = RCV.WND >> Rcv.Wind.Scale.
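The model implementation above can be sketched in C as follows. This is only an illustration under stated assumptions: `struct wscale_state`, `wscale_on_syn`, `wscale_on_segment` and `wscale_outgoing_window` are hypothetical names following the RFC notation, not the kernel's actual structures, and the clamp to 14 mirrors the `TCP_MAX_WSCALE` constant shown earlier.

```c
#include <stdint.h>

#define MAX_WSCALE 14U /* same limit as TCP_MAX_WSCALE in the kernel excerpt */

/* Hypothetical per-connection state, named after the RFC variables. */
struct wscale_state {
    uint8_t  snd_wind_scale; /* Snd.Wind.Scale: shift for incoming SEG.WND */
    uint8_t  rcv_wind_scale; /* Rcv.Wind.Scale: shift for outgoing SEG.WND */
    uint32_t snd_wnd;        /* SND.WND, stored as a 32-bit quantity */
    uint32_t rcv_wnd;        /* RCV.WND, stored as a 32-bit quantity */
};

/* SYN processing: s is the peer's shift.cnt, r is the one we offered.
 * Without a Window Scale option from the peer, both shifts are zero. */
static void wscale_on_syn(struct wscale_state *st, int peer_has_option,
                          uint8_t s, uint8_t r)
{
    if (!peer_has_option) {
        st->snd_wind_scale = 0;
        st->rcv_wind_scale = 0;
        return;
    }
    st->snd_wind_scale = s > MAX_WSCALE ? MAX_WSCALE : s;
    st->rcv_wind_scale = r > MAX_WSCALE ? MAX_WSCALE : r;
}

/* Incoming non-SYN segment: SND.WND = SEG.WND << Snd.Wind.Scale */
static void wscale_on_segment(struct wscale_state *st, uint16_t seg_wnd)
{
    st->snd_wnd = (uint32_t)seg_wnd << st->snd_wind_scale;
}

/* Outgoing non-SYN segment: SEG.WND = RCV.WND >> Rcv.Wind.Scale.
 * Assumes rcv_wnd was chosen so that the shifted value fits in 16 bits. */
static uint16_t wscale_outgoing_window(const struct wscale_state *st)
{
    return (uint16_t)(st->rcv_wnd >> st->rcv_wind_scale);
}
```

The cast to `uint32_t` before the left shift matters: shifting a promoted 16-bit value as a plain `int` would work here, but widening first makes the 32-bit storage requirement from the first bullet explicit.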
TCP determines if a data segment is "old" or "new" by testing whether its sequence number is within 2**31 bytes of the left edge of the window, and if it is not, discarding the data as "old". To insure that new data is never mistakenly considered old and vice-versa, the left edge of the sender's window has to be at most 2**31 away from the right edge of the receiver's window. Similarly with the sender's right edge and receiver's left edge. Since the right and left edges of either the sender's or receiver's window differ by the window size, and since the sender and receiver windows can be out of phase by at most the window size, the above constraints imply that 2 * the max window size must be less than 2**31, or

      max window < 2**30

Since the max window is 2**S (where S is the scaling shift count) times at most 2**16 - 1 (the maximum unscaled window), the maximum window is guaranteed to be < 2**30 if S <= 14. Thus, the shift count must be limited to 14 (which allows windows of 2**30 = 1 Gbyte). If a Window Scale option is received with a shift.cnt value exceeding 14, the TCP should log the error but use 14 instead of the specified value.

The scale factor applies only to the Window field as transmitted in the TCP header; each TCP using extended windows will maintain the window values locally as 32-bit numbers. For example, the "congestion window" computed by Slow Start and Congestion Avoidance is not affected by the scale factor, so window scaling will not introduce quantization into the congestion window.
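The bound in the quoted passage can be checked with a little arithmetic. The symbol W_max below is introduced here only for illustration; S is the shift count from the RFC text.

```latex
\[
W_{\max}(S) = \left(2^{16} - 1\right) \cdot 2^{S},
\qquad
\text{required: } 2\,W_{\max}(S) < 2^{31}
\iff W_{\max}(S) < 2^{30}.
\]
\[
S = 14:\quad W_{\max} = 2^{30} - 2^{14} = 1\,073\,725\,440 < 2^{30}
\qquad\text{(allowed)}
\]
\[
S = 15:\quad W_{\max} = 2^{31} - 2^{15} = 2\,147\,450\,880 \ge 2^{30}
\qquad\text{(violates the bound)}
\]
```

So 14 is the largest shift count for which twice the maximum window still stays below 2**31.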

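To tell whether one sequence number comes before or after another, the window must not exceed half of the 2^32-byte sequence space, so the upper bound is 2^31 bytes, i.e. 2 GB.

The reasoning is that on the ring formed by 32-bit unsigned ints, two numbers can only be compared unambiguously when their clockwise distance stays within half the ring (see the implementation of the before and after macros in the Linux kernel).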

 

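The TCP header has two fields, seq and ack_seq, which hold the sequence number and the acknowledgment number. TCP uses the sequence number to identify the segments it sends. seq has type __u32, so once it passes the maximum __u32 value it wraps around to 0.

   PS: the initial sequence number (ISN) of a TCP flow does not start at 0. It is generated by a randomizing algorithm, so the ISN can be very large (for example 2^32-10), and the seq of a single TCP flow may therefore wrap around to 0.

TCP's detection of loss, reordering and similar problems all relies on comparing sequence numbers. This is where the so-called TCP sequence wraparound problem comes from.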

 

/*
 * The next routines deal with comparing 32 bit unsigned ints
 * and worry about wraparound (automatic with unsigned arithmetic).
 */

static inline bool before(__u32 seq1, __u32 seq2)
{
        return (__s32)(seq1-seq2) < 0;
}
#define after(seq2, seq1)     before(seq1, seq2)

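Why does (__s32)(seq1-seq2) < 0 tell us that seq1 < seq2? Here __s32 is a signed 32-bit integer, while __u32 is an unsigned 32-bit integer.

Take an 8-bit unsigned char as an example:

Suppose seq1=255, seq2=1 (seq2 has wrapped around)

seq1 = 1111 1111  seq2 = 0000 0001

seq1 - seq2 = 1111 1110

Interpreted as a signed number, the result is negative because the most significant bit is 1, so seq1 is judged to be before seq2.

If seq2=128, i.e. 1000 0000, we find:

 seq1 - seq2 = 0111 1111

Now the result is positive again, and the verdict is seq1 > seq2. If seq2 had actually been reached by advancing 129 past seq1 and wrapping around, this verdict would be wrong.

Therefore the algorithm is only correct as long as the increment after wraparound is at most 2^(n-1)-1.

For the u32 seq and ack_seq, the wraparound distance that can be handled is therefore 2^31-1.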
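The 8-bit walk-through can be reproduced directly. The sketch below is an illustration, not kernel code: `before8` is a made-up 8-bit analogue of the kernel's `before()`, and `before32` restates the kernel macro with standard C types. It assumes the usual two's complement conversion from unsigned to signed, which every mainstream compiler provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 8-bit analogue of before(): the difference is taken
 * modulo 2^8 and then reinterpreted as a signed 8-bit value. */
static bool before8(uint8_t seq1, uint8_t seq2)
{
    return (int8_t)(uint8_t)(seq1 - seq2) < 0;
}

/* Same comparison on 32 bits, equivalent to the kernel's before(). */
static bool before32(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq1 - seq2) < 0;
}
```

With these definitions, `before8(255, 1)` is true (seq2 wrapped, distance 2) and `before8(255, 128)` is false (distance 129 exceeds 127). At a distance of exactly 128 the comparison becomes ambiguous: `before8(255, 127)` and `before8(127, 255)` are both true. On 32 bits, an ISN of 2^32-10 compares correctly against a wrapped value: `before32(0xFFFFFFF6u, 5u)` is true.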

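In other words, the TCP window size must not exceed 2^31, yet in practice the maximum window is 2^30. Why? Page 10 of RFC 1323 states the reason explicitly, and it was quoted at the beginning of this post. Even so, it is still a little confusing!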

Consider this example from the mailing-list thread "Re: [tcpm] TCP window max size":

 

 

But mostly the place for trouble is at the boundary conditions, and if
you had a 2^31 window and the sender/receiver windows were fully skewed,
any outstanding packet from before left edge of the senders window could
show up at the receiver and be viewed as new data at the right edge of
the receivers window, which would be a bad thing.

1) Sender has a full 2^31 window of unacknowledged data,
   receiver is a full 2^31 ahead of the sender.
2) Sender retransmits a packet at left edge of it's window.
3) In-transit Ack from Receiver arrives, acks all data, and
   advances window by 2^31.
4) Sender sends a new packet.
5) New packet gets reordered in transit with the retransmitted packet.
6) New packet arrives, and receiver advances window.
7) Retransmitted packet arrives, and it now appears to be at the
   right edge of the receivers window as new data.

A 2^14 max window keeps things well away from those boundary conditions.
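The seven steps can be replayed with modular arithmetic. The sketch below is a deliberately simplified model, not how any real stack validates segments: `in_receive_window` and `stale_retransmit_accepted` are hypothetical helpers, the receiver is assumed to accept any segment whose sequence number falls inside [RCV.NXT, RCV.NXT + window), and PAWS or other checks are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified acceptance test: seg_seq lies in [rcv_nxt, rcv_nxt + rcv_wnd),
 * computed modulo 2^32 so that wraparound is handled automatically. */
static bool in_receive_window(uint32_t rcv_nxt, uint32_t rcv_wnd,
                              uint32_t seg_seq)
{
    return (uint32_t)(seg_seq - rcv_nxt) < rcv_wnd;
}

/* Replays the scenario: the sender's old left edge is left_edge, the
 * receiver has advanced by one full window plus one new segment of
 * new_len bytes, and then the delayed retransmission (sequence number
 * left_edge) arrives. Returns true if it would be accepted as new data. */
static bool stale_retransmit_accepted(uint32_t left_edge, uint32_t window,
                                      uint32_t new_len)
{
    uint32_t rcv_nxt = left_edge + window + new_len; /* wraps modulo 2^32 */
    return in_receive_window(rcv_nxt, window, left_edge);
}
```

With a 2^31 window, `stale_retransmit_accepted(1000, 0x80000000u, 1460)` is true: the old segment lands just inside the right edge of the receiver's window. With the largest window that a shift count of 14 allows, `stale_retransmit_accepted(1000, 65535u << 14, 1460)` is false, because the old segment is far outside the window.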

   

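In theory, even with the window limited to 2^30, an old retransmitted segment can still be mistaken for new data if it lingers in the network long enough. A smaller window simply makes such a collision less likely, and I suspect this is also why PAWS was proposed.

  PAWS stands for Protection Against Wrapped Sequences. It is built on the timestamp option, which also limits where PAWS can be applied.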

 

posted @ 2023-04-08 17:08  codestacklinuxer