Thoughts on reading Cliff Cummings's "Async FIFO simulation & synthesis"

An expert really is an expert.

After reading it, everything suddenly clicked for me again.

It only covers the basics of async FIFOs, but I now realize I had only half understood those basics before. They finally make sense to me.

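With a synchronous FIFO there is no need for two counters. A single counter is enough: +1 on a write, -1 on a read, hold when a write and a read happen together, and that is all.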
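The single-counter rule can be sketched as a small behavioural model. This is only an illustration in Python, not code from the paper: the function name `next_count` and its arguments are made up here, and I assume that a write into a full FIFO and a read from an empty FIFO are simply ignored.

```python
def next_count(count, wr, rd, depth):
    """Next value of the single fill counter of a synchronous FIFO.

    count: current number of stored words
    wr/rd: write and read requests in this clock cycle
    depth: FIFO depth (assumed parameter)
    """
    do_wr = wr and count < depth   # ignore writes while full (assumption)
    do_rd = rd and count > 0       # ignore reads while empty (assumption)
    # +1 on write only, -1 on read only, hold when both or neither happen
    return count + int(do_wr) - int(do_rd)


def is_full(count, depth):
    return count == depth


def is_empty(count):
    return count == 0
```

The memory still needs its own read and write addresses; the counter only replaces the pointer comparison for full/empty.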

The small FIFOs I wrote before were always stuck in the async-FIFO way of thinking, and they came out a mess.

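Next, when using a FIFO you can also consider synchronizing the pointers of the two sides with a handshake and then comparing them to decide full/empty. Which method to use for getting the pointers across the clock domains depends on the application.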

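The reason for using Gray code rather than binary comes from the multi-bit synchronization problem. Only one bit changes on each increment, so synchronizing the pointer is effectively a single-bit synchronization, and that solves the problem.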
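A quick way to convince yourself of the one-bit property is to count how many bits flip per increment. This is a minimal Python sketch; the 4-bit width is picked only for illustration and the helper names are my own.

```python
def bin2gray(n):
    """Standard binary-to-Gray conversion."""
    return n ^ (n >> 1)


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


WIDTH = 4               # illustrative pointer width
COUNT = 1 << WIDTH      # power-of-2 sequence length

# Bits that flip on each increment, including the wrap from 15 back to 0
gray_steps = [bits_changed(bin2gray(i), bin2gray((i + 1) % COUNT))
              for i in range(COUNT)]
binary_steps = [bits_changed(i, (i + 1) % COUNT) for i in range(COUNT)]
```

Every entry of `gray_steps` is 1, while `binary_steps` contains steps such as 0111 to 1000 where all four bits change at once.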

Another small detail the paper mentions:

"... Gray code counter is that most useful Gray code counters must have power-of-2 counts in the sequence. It is possible to make a Gray code counter that counts an even number of sequences but conversions to and from these sequences are generally not as simple to do as the standard Gray code. Also note that there are no odd-count-length Gray code sequences so one cannot make a 23-deep Gray code. This means that the technique described in this paper is used to make a FIFO that is power-of-2 deep."

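Another thing I learned: modules should not be partitioned by function alone. The author splits a single FIFO into several small modules so that each one sits in a single clock domain, which makes timing easy to handle in STA. And because the signals follow a specific naming convention, applying set_false_path is easy too. That immediately feels like a different level!!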

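Why the full flag is generated in the wr_clk domain and the empty flag in the rd_clk domain should be clear to everyone. But the author also introduces the notion of pessimistic full & pessimistic empty: the full/empty you see are very conservative signals. full can only assert at or before the true full condition and deassert after it; likewise, empty can only assert at or before the true empty condition and deassert after it. Conservative, but it does leave margin and works well.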

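Now for how it generates the empty/full flags. First, it compares the Gray codes directly, unlike what I wrote before, where I converted Gray back to binary and then compared. Empty is when the synchronized write pointer equals the read pointer. Because of how Gray code works, full is when the 2 MSBs are inverted and the remaining bits are equal. Why is that? Look at the table below.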

No.     binary      gray
-------------------------
0       0000        0000
1       0001        0001
2       0010        0011
3       0011        0010
4       0100        0110
5       0101        0111
6       0110        0101
7       0111        0100
-------------------------
8       1000        1100
9       1001        1101
10      1010        1111
11      1011        1110
12      1100        1010
13      1101        1011
14      1110        1001
15      1111        1000

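The MSB acts as a wrap indicator rather than an address bit; the low 3 bits are the address. Counting in binary, two entries one lap apart at the same position differ only in the MSB, with the other bits equal. Counting in Gray code, two entries one lap apart have their top 2 bits inverted and the remaining bits equal (for example, 2 is 0011 and 10 is 1111). This is the special point to remember when comparing Gray-coded pointers for full and empty.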
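The rule can be checked exhaustively against the binary definition "write pointer is exactly one lap ahead of the read pointer". This is a Python sketch for the 4-bit pointers of the table above; the function names are my own and the synchronizers are left out, so both pointers are taken as already being in the same clock domain.

```python
ADDR_BITS = 3                    # low 3 bits address the memory
PTR_BITS = ADDR_BITS + 1         # one extra MSB as the wrap indicator
DEPTH = 1 << ADDR_BITS
MASK = (1 << PTR_BITS) - 1
TOP2 = 0b11 << (PTR_BITS - 2)    # mask selecting the two MSBs


def bin2gray(n):
    return n ^ (n >> 1)


def gray_empty(rgray, wgray):
    """Empty: the Gray pointers are identical."""
    return rgray == wgray


def gray_full(wgray, rgray):
    """Full: the two MSBs are inverted, all remaining bits are equal."""
    return wgray == (rgray ^ TOP2)


def rule_matches_binary():
    """Compare the Gray rules with the binary fill level for every pointer pair."""
    for rd in range(1 << PTR_BITS):
        for wr in range(1 << PTR_BITS):
            fill = (wr - rd) & MASK
            if gray_full(bin2gray(wr), bin2gray(rd)) != (fill == DEPTH):
                return False
            if gray_empty(bin2gray(rd), bin2gray(wr)) != (fill == 0):
                return False
    return True
```

The reason it works: adding DEPTH flips only the binary MSB, and since gray = bin ^ (bin >> 1), that single flip shows up in the two top Gray bits.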

I think this is quite nice. It probably saves a good chunk of combinational logic compared with comparing in binary.

At the end of the paper the author also raises two questions, which I think are very instructive, so I am sharing them here:

1. Noting that a synchronized Gray code that increments twice but is only sampled once will show multi-bit changes in the synchronized value, will this cause multi-bit synchronization problems?

The answer is no. Synchronizing multi-bit changes is only a problem when multiple bits are changing near the rising edge of the synchronizing clock. The fact that a Gray code counter could increment twice (or more) between slower synchronization clock edges means that the first Gray code change will occur well before the rising edge of the slower clock and only the second Gray code transition could change near the rising clock edge. There is no multi-bit synchronization problem with Gray code counters.

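The question is roughly this: if the write clock is very fast, so that the write pointer changes twice within one read clock period, does Gray coding lose its protection against the multi-bit sampling problem? The answer is no. The multi-bit synchronization problem arises at the rising edge of the sampling clock: different bits change with different delays, so the bits that get captured do not necessarily belong to the same count. But if the pointer changes twice within one read clock period, the first change has settled well away from the clock edge, so at the sampling edge it is still only 1 bit that is changing, and there is no problem.

2. Again noting that a faster Gray code counter could increment more than once between the rising edge of a slower clock signal, is it possible that the Gray code counter from the faster clock domain could increment to a full-state and to a full+1-state before full is detected, causing the FIFO to overflow without recognizing that the FIFO was ever full?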

Again, the answer is no using the implementation described in this paper. Consider first the generation of FIFO full. The FIFO goes full when the write pointer catches up to the synchronized read pointer and the FIFO-full state is detected in the write clock domain. If the wclk-domain is faster than the rclk-domain, the write pointer will eventually catch up to the synchronized read pointer, the FIFO will be full, the wfull bit will be set and the FIFO will quit writing until the synchronized read pointer advances again. The write pointer cannot advance past the synchronized read pointer in the wclk-domain.

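In other words: if the write clock is faster, can the write pointer pass the read pointer and overwrite unread data? Clearly not. With faster writes the FIFO will eventually go full, but the read pointer synchronized into the write domain is an old read pointer. By the time the write domain sees the read and write pointers satisfy the full relationship, the reader may already have moved on past that address; the write domain just has not seen the new state yet and considers the FIFO full. Full therefore always asserts early, so there is no need to worry about the write pointer overtaking the read pointer. This design rules it out.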
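A toy simulation makes the argument concrete. This is a heavily simplified Python model, not the paper's RTL: both sides advance in the same loop step, the synchronizer is modelled as a plain delay of `sync_delay` steps, the reader checks emptiness against the true write pointer, and all names and rates are my own assumptions.

```python
import random
from collections import deque


def simulate(depth=8, sync_delay=2, steps=2000, seed=0):
    """Writer only sees a stale read pointer; return (max_fill, early_full).

    max_fill:   highest true fill level ever reached
    early_full: steps where wfull was set although the FIFO was not truly full
    sync_delay must be >= 1.
    """
    rng = random.Random(seed)
    mask = 2 * depth - 1               # pointers carry one extra wrap bit
    wptr = rptr = 0
    pipe = deque([0] * sync_delay, maxlen=sync_delay)  # models the synchronizer
    max_fill = early_full = 0
    for _ in range(steps):
        rptr_seen = pipe[0]            # read pointer as it was sync_delay steps ago
        wfull = ((wptr - rptr_seen) & mask) == depth
        true_fill = (wptr - rptr) & mask
        if wfull and true_fill < depth:
            early_full += 1            # pessimistic: full flag while space exists
        do_wr = not wfull              # the writer writes whenever it is allowed to
        do_rd = true_fill > 0 and rng.random() < 0.3   # slower reader (idealized empty)
        pipe.append(rptr)              # shift the synchronizer pipeline
        if do_wr:
            wptr = (wptr + 1) & mask
        if do_rd:
            rptr = (rptr + 1) & mask
        max_fill = max(max_fill, (wptr - rptr) & mask)
    return max_fill, early_full
```

Because the stale read pointer can only be behind the real one, the fill level the writer computes is never smaller than the true one, so the true fill level stays at or below `depth` while `wfull` is sometimes set with room to spare.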

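Near the end the author also mentions how to generate almost full/almost empty signals. This needs gray2bin combinational logic: convert to binary, add an offset such as 4, and compare with the other pointer. When the distance between the pointers gets small enough, the almost full/almost empty signal should assert.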
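As a rough Python illustration of that idea (not the paper's code): convert the synchronized Gray pointer back to binary, compute the fill level with ordinary arithmetic, and compare it against a threshold. The function names and the `margin` parameter are hypothetical; the default of 4 just follows the "+4" example above.

```python
def bin2gray(n):
    return n ^ (n >> 1)


def gray2bin(g):
    """Gray-to-binary: XOR together all right shifts of g."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def almost_full(wbin, rgray_sync, addr_bits, margin=4):
    """True when no more than `margin` free slots remain (write-domain view).

    wbin:       binary write pointer, addr_bits + 1 bits wide
    rgray_sync: Gray read pointer after synchronization into the write domain
    """
    depth = 1 << addr_bits
    mask = (1 << (addr_bits + 1)) - 1
    fill = (wbin - gray2bin(rgray_sync)) & mask
    return fill >= depth - margin


def almost_empty(rbin, wgray_sync, addr_bits, margin=4):
    """True when no more than `margin` words remain to read (read-domain view)."""
    mask = (1 << (addr_bits + 1)) - 1
    fill = (gray2bin(wgray_sync) - rbin) & mask
    return fill <= margin
```

Like full and empty, these flags are computed from a delayed copy of the other side's pointer, so they carry the same pessimism.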

After that there is a comparison of binary pointers versus Gray code pointers.

Some advantages of using binary pointers over Gray code pointers:
•  The technique of sampling a multi-bit value into a holding register and using synchronized handshaking control signals to pass the multi-bit value into a new clock domain can be used for passing ANY arbitrary multi-bit value across clock domains. This technique can be used to pass FIFO pointers or any multi-bit value.
•  Each synchronized Gray code pointer requires 2n flip-flops (2 per pointer bit). The sampled multi-bit register requires 2n+4 flip-flops (1 per holding register bit in each clock domain, 2 flip-flops to synchronize a ready bit and 2 flip-flops to synchronize an acknowledge bit). There is no appreciable difference in the chance that either pointer style would experience metastability.
•  The sampled multi-bit binary register allows arbitrary pointer changes. Gray code pointers can only increment and decrement.
•  The sampled multi-bit register technique permits arbitrary FIFO depths; whereas, a Gray code pointer requires power-of-2 FIFO depths. If a design required a FIFO depth of at least 132 words, using a standard Gray code pointer would employ a FIFO depth of 256 words. Since most instantiated dual-port RAM blocks are power-of-2 words deep, this may not be an issue.
•  Using binary pointers makes it easy to calculate “almost-empty” and “almost-full” status bits using simple binary arithmetic between the pointer values.

One small disadvantage to using binary pointers over Gray code pointers is:
•  Sampling and holding a binary FIFO pointer and then handshaking it across a clock boundary can delay the capture of new samples by at least two clock edges from the receiving clock domain and another two clock edges from the sending clock domain. This latency is generally not a problem but it will typically add more pessimism to the assertion of full and empty and might require additional FIFO depth to compensate for the added pessimism. Since most FIFOs are typically specified with excess depth, it is not likely that extra registers or a larger dual-port FIFO buffer size would be required.

I still plan to write a template of my own to test all this. What relieves one's worries? Only coding.

posted @ 2012-08-24 20:43  poiu_elab