This blog provides a worked example of deriving the device physics parameters for Xilinx UltraScale devices, so that the effect of multiple synchronisation stages can be understood and the synchroniser chain length selected. To achieve this, we need to understand how to calculate the Mean Time Between Failures (MTBF) of a Clock Domain Crossing (CDC) synchroniser chain, then how to translate the overall MTBF design goal into the MTBF requirement for a single synchroniser, and hence determine the length of each synchroniser required.

- Essential Background
- Reverse Engineering Device Physics
- Change Frequency
- Confirming the MTBF Calculation
- Metastability Window's Relationship to Setup and Hold Times
- Design MTBF
- Choosing the Number of Stages in a Basic CDC Synchroniser
- Conclusions
- References

## Essential Background

If any of this section is unfamiliar, read one of the introductory texts first. Here I just state our starting point in order to support subsequent sections. There are several texts on the subject, and Performance Analysis of Synchronization Circuits provides a good explanation.

\[ MTBF = \frac {e^{S / \tau}} {W F_c F_d} \]

This equation describes the interface between clock domains. The destination clock domain's frequency is given by \(F_c\), but we also need to know something about the source clock domain: the "change frequency", \(F_d\). This rate of change depends on the data presented, so if we assume the data (strictly, a control signal) changes once in every *n* clock cycles, and we know the launch clock domain's frequency, we can calculate \(F_d\). More on this later, as Xilinx's Vivado has a TCL command to specify it.

The term \(W F_c\) is a probability more easily recognised as \(\frac W {P_c}\), where \(P_c\) is the sampling clock period and hence this term is the probability of data arrival during the metastability window for a given destination clock cycle. The settling time, \(S\), can then be spread over multiple clock cycles, and the rate of decay is governed by \(\tau\).
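To make the terms concrete, here is a minimal Python sketch that evaluates the relationship in its logarithmic form, \(\ln(MTBF) = S / \tau - \ln(W F_c F_d)\). The function name and the numeric values are illustrative assumptions: the values are borrowed from the tables later in this post, not from any Xilinx data sheet.

```python
import math

def ln_mtbf(settling_s, tau_s, window_s, f_dest_hz, f_change_hz):
    """Natural log of the MTBF in seconds: S/tau - ln(W * Fc * Fd)."""
    return settling_s / tau_s - math.log(window_s * f_dest_hz * f_change_hz)

# Illustrative values only, taken from the results tables later in this post.
TAU = 41.37e-12   # settling time constant, seconds
W = 2.16e-12      # metastability window, seconds
F_C = 250.0e6     # destination clock frequency, Hz
F_D = 12.5e6      # change frequency, transitions per second

# Probability that data arrives inside the window on a given destination clock edge.
p_window = W * F_C

# Two-stage synchroniser settling time reported in the results table.
result = ln_mtbf(7.44e-9, TAU, W, F_C, F_D)

print(f"P(arrival in window) = {p_window:.2e}")
print(f"ln(MTBF) = {result:.2f}")
```

Working with \(\ln(MTBF)\) rather than the MTBF itself keeps the numbers manageable, since \(e^{S / \tau}\) grows extremely quickly with settling time.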

## Reverse Engineering Device Physics

Xilinx does not publish values for these physical quantities for any of its devices. This might be because they are updated with new releases of the device libraries in Vivado. The last publication is "XAPP 094 Metastable Recovery", dated 24 Nov 1997, which is not available via DocNav. We can, however, reverse engineer the values by using Vivado's TCL command report_synchronizer_mtbf and analysing the results it produces. For this we need a design to be synthesised by Vivado from which to make measurements. We can use a generic *n*-stage synchroniser, assign *n* using TCL, and then create a report of the MTBF calculated by Vivado.

Using out of context synthesis seemed to affect the settling times used in the results. So, an outer wrapper of registers needs to be provided to isolate the synchroniser from any direct and manual specification of boundary timing constraints.

Next, we supply the Out of Context (OOC) constraints for timing analysis.

Finally, we can derive measurement results for analysis using the following TCL script.

The plan here is to measure the MTBF of the synchroniser with between 2 and 5 flip-flop stages in the synchroniser chain. For practical reasons, Xilinx will not recognise a single flip-flop synchroniser; that is just a CDC error. The report_synchronizer_mtbf TCL command help text is provided here for reference. The output is a log file that needs parsing, as the values are not available directly to TCL for use in script-based calculations. I copied and pasted the results into Excel for analysis with various formulae.

Refer to the report_synchronizer_mtbf documentation.

### Change Frequency

The source clock domain's change frequency, \(F_d\), needs a little explanation. Vivado's default switching rate is 12.5% (=0.125). That is the rate at which the output of a synchronous logic element switches compared to a given clock input. A toggle rate of 100% means that on average the output toggles once during every clock cycle, changing on either the rising or falling clock edges, and making the effective output signal frequency half of the clock frequency. For clock and DDR signals only, the toggle rate can be specified up to 200%. You convert the switching activity value to a "*change frequency*" by multiplying the fractional value (percentage as a fraction) by half the source clock domain's frequency. The factor of a half is because 100% (=1.0) would be "half of the clock frequency".
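The conversion described above can be sketched in a few lines of Python. The function name is my own, and the default of 12.5% is simply the Vivado default switching rate quoted in the text.

```python
def change_frequency(source_clock_hz, toggle_rate=0.125):
    """Convert a switching activity (as a fraction) into a change frequency.

    A toggle rate of 1.0 (100%) corresponds to half the clock frequency,
    hence the division by two.
    """
    return toggle_rate * source_clock_hz / 2.0

# Source clock frequencies used in the experiments reported later in this post.
for f_src in (200.0e6, 166.7e6, 142.9e6):
    f_d = change_frequency(f_src)
    print(f"{f_src / 1e6:6.1f} MHz -> {f_d / 1e6:5.2f} M transitions/s")
```

With a 200 MHz source clock and the default 12.5% rate this gives 12.5 M transitions per second.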

Refer to the set_switching_activity documentation.

## Confirming the MTBF Calculation

Note this is not about confirming the correctness of any method of calculating MTBF, just confirming how Xilinx does it. Experimental data was created for a variety of source and destination clock frequencies, with the destination clock always faster than the source. The results did vary, meaning that the values of \(\tau\) and \(W\) derived from Vivado's TCL function have not been precisely determined. However, we do get sensible values in the ranges expected. For brevity a single data set is given below.

NB. Take care with the units used, years vs ns, MHz etc.

Stages | MTBF (s) | ln(MTBF) | Settling Time (s) |
---|---|---|---|
2 | 111.08E+72 | 170.50 | 7.44E-9 |
3 | 154.00E+111 | 260.62 | 11.10E-9 |
4 | 212.07E+150 | 350.74 | 14.90E-9 |
5 | 292.85E+189 | 440.87 | 18.60E-9 |

The settling time is actually the "sum of all slack" in the *n* stages of the CDC synchroniser, i.e. not just a multiple of the destination clock domain's clock period. This is confirmed by reading the Quartus Prime documentation, Synchronization Register Chains. As it happens, timing analysis in Vivado consistently reports a slack (after synthesis only) of 3.711 ns in each stage of the synchroniser, but this does not match the accumulating values used in the table above. It's close, but the settling times used here are clearly different, and variable too. I expect the best settling times to use will be those from post-implementation, for the precise path delays achieved (if the approximation is not inherent in the function). Plotting \(\ln(MTBF)\) against settling time, \(S\), does yield a clear linear relationship.

Using the linear relationship we can map the gradient and axis crossing point back to the device's physical values. Here I used linear regression in an Excel spreadsheet to calculate the gradient and offset and substitute those values to derive sensible physical values.

\[ \begin{align} \ln(MTBF) &= \frac S \tau + \ln\left(\frac 1 {W F_c F_d}\right) \\ \text{Mapping to:} \quad y &= mx + c \\ \tau &= \frac 1 m = 41.37 \text{ ps} \\ W &= \frac {e^{-c}} {F_c F_d} = 2.16 \text{ ps} \end{align} \]

Parameter Set | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
Source Clock Frequency (MHz) | 200.0 | 166.7 | 166.7 | 142.9 | 142.9 |
Change frequency, \(F_d\) (M transitions / s) | 12.50 | 10.42 | 10.42 | 8.93 | 8.93 |
Destination Clock Frequency, \(F_c\) (MHz) | 250.0 | 250.0 | 200.0 | 200.0 | 166.7 |
Settling time constant, \(\tau\) (ps) | 41.37 | 41.37 | 41.33 | 41.33 | 41.40 |
Metastability window, \(W\) (ps) | 2.16 | 2.15 | 2.16 | 2.16 | 0.619 |

We can see a trend in the figures as well as a loss in numerical precision caused by exponentiation of a value that may not have been stated sufficiently precisely. We have however reverse engineered the physical properties of a Xilinx xczu2cg-sbva484-2-e part. We have also demonstrated the relationship used by Xilinx to calculate the MTBF of an *n*-stage CDC synchroniser. Each synchroniser stage increases the settling time by approximately a destination clock period, so \(S\) is approximately a multiple of \(1 / F_c\).
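The fit can be reproduced outside Excel. The following Python sketch runs an ordinary least-squares fit of \(\ln(MTBF)\) against settling time for the four measurements tabulated earlier, then maps the gradient \(m\) and offset \(c\) back to \(\tau = 1/m\) and \(W = e^{-c} / (F_c F_d)\), the latter following from \(c = \ln(1 / (W F_c F_d))\). I have assumed that the tabulated data set corresponds to parameter set 1 (200 MHz source, 250 MHz destination); the helper names are my own.

```python
import math

# (stages, MTBF in seconds, settling time in seconds) from the results table.
MEASUREMENTS = [
    (2, 111.08e72, 7.44e-9),
    (3, 154.00e111, 11.10e-9),
    (4, 212.07e150, 14.90e-9),
    (5, 292.85e189, 18.60e-9),
]
F_C = 250.0e6   # destination clock frequency, Hz (assumed: parameter set 1)
F_D = 12.5e6    # change frequency, transitions/s (assumed: parameter set 1)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + c, returning (m, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x

settling = [s for _, _, s in MEASUREMENTS]
log_mtbf = [math.log(mtbf) for _, mtbf, _ in MEASUREMENTS]

gradient, offset = fit_line(settling, log_mtbf)
tau = 1.0 / gradient                        # settling time constant, seconds
window = math.exp(-offset) / (F_C * F_D)    # metastability window, seconds

print(f"tau = {tau * 1e12:.2f} ps, W = {window * 1e12:.2f} ps")
```

The results should land close to the values quoted for parameter set 1. Because \(W\) comes from exponentiating the offset, small rounding differences in the tabulated settling times shift it noticeably, which is the precision problem mentioned above.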

The experimental data confirms that Xilinx assumes MTBF improves exponentially with the number of stages in the synchroniser. This is worth confirming since a literature search on the Internet struggled to find any reference article willing to make this claim. Instead, they offer much less precise claims like *"For most synchronization applications, the two flip-flop synchronizer is sufficient to remove all likely metastability."* I have of course assumed Xilinx have modelled their calculations correctly. The paper MTBF Bounds for Multistage Synchronizers does confirm this, includes the subtraction of the intra-synchroniser flip-flop delays from the settling time, and then develops a more accurate model for MTBF prediction of an *n*-stage synchroniser. A simpler explanation for the *n*-stage synchroniser can be found at Lecture 11 - Mitigating Metastability. These sources took me some time to find on the Internet.

### Metastability Window's Relationship to Setup and Hold Times

Quantity | Min (ps) | Max (ps) | Average (ps) |
---|---|---|---|
\(T_{SU}\), Setup_FDRE_C_D | 23 | 25 | 24 |
\(T_{H}\), Hold_FDRE_C_D | 46 | 60 | 53 |
Sum | 69 | 85 | 77 |

We talk about violating the setup and hold times of a flip-flop as causing a metastable state. Now we see that the metastability window, \(W\), of 2-3 ps is much smaller than the sum of the setup and hold times of ~77 ps. So which window does cause a flip-flop to go metastable? Also, I have carelessly talked about extracting "*physical properties*" of the devices, and here I stand corrected.

\(W\) "is an extrapolated value and is nonphysical; however, is related to the setup/hold window of a flop."

"It is important to understand that [\(W\)] is a mathematical tool to enable one to determine the MTBF of a circuit - it is not a physical property of the circuit in the same manner as the setup/hold window."

Miller and Noise Effects in a Synchronizing Flip-Flop, Charles Dike and Edward Burton

I am not currently aware of a particular relationship that can determine the "mathematical tool" that is \(W\) from the actual physical parameters of setup and hold times.

"All flip-flops have a metastability window around the clock edge which lies somewhere between the setup and hold times. When the data changes during this metastability window a flipflop takes longer to reach its final output value than the normal propagation delay."

Asynchronous Inputs and Flip-Flop Metastability in the CLAS Trigger at CEBAF, David Doughty, Stephan Lemon, Peter Bonneau

"The window of time relative to the clock edge where metastability will actually be triggered is much smaller than the window defined by the setup and hold times (on the order of femtoseconds in modern FPGAs), however it’s exact location is not known and is a function of a number of variables including temperature and voltage. Meeting the setup and hold requirements guarantee a metastable state will not be triggered."

Metastability and Clock Uncertainty in FPGA Designs, Ray Andraka

The dependence of the metastability window's location on external factors like voltage and temperature explains how we have a movable window within a guaranteed range. Some statistical distribution of its location, with high probabilities, perhaps completes the explanation. But don't get hung up on the metastability window size because...

"Because the time-resolving constant \(\tau\) has the greatest impact on the mean-time-between-failure (MTBF) of the flip-flop due to its exponential relationship, the design of metastable-hardened flip-flops is focused exclusively on the optimization of \(\tau\)."

Design and Analysis of Metastable-Hardened, High-Performance, Low-Power Flip-Flops, PhD Thesis, David Li

## Design MTBF

There are two approaches to this section. Firstly, the text Performance Analysis of Synchronization Circuits provides a derivation of how to calculate the MTBF for multiple CDCs. Secondly, Xilinx offer a formula they use for accumulating the MTBFs of different CDCs into a single result. Thankfully the two agree.

You will note the second becomes the first when each CDC being aggregated has the same MTBF. So, the second is more general, but the first equation demonstrates clearly that MTBF gets worse linearly as the number of CDCs increases. The VHDL analysed in this CDC example uses 4 parallel CDCs for no particular reason, and the results for each are shown along with the aggregated MTBF. The values used for analysis here have been taken from a single CDC.
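As a rough illustration of the two forms, the Python sketch below assumes that the first divides the single-CDC MTBF by the number of CDCs, \(MTBF_{design} = MTBF_{single} / N\), and that the second sums the failure rates, \(1 / MTBF_{design} = \sum_i 1 / MTBF_i\). Both are my assumptions, chosen to be consistent with the description above, rather than quotations from either source. The function names and example figures are hypothetical.

```python
import math

def design_mtbf_identical(single_mtbf, num_cdcs):
    """First form: N identical CDCs degrade the MTBF linearly."""
    return single_mtbf / num_cdcs

def design_mtbf_aggregate(mtbfs):
    """Second form: failure rates (1/MTBF) of independent CDCs add."""
    return 1.0 / sum(1.0 / m for m in mtbfs)

# Four identical CDCs (arbitrary illustrative MTBF of 1e9 seconds each).
same = design_mtbf_aggregate([1.0e9] * 4)
print(same, design_mtbf_identical(1.0e9, 4))

# With mixed CDCs the result is dominated by the worst one.
mixed = design_mtbf_aggregate([1.0e3, 1.0e12])
print(mixed)
```

The mixed case shows why the weakest synchroniser in a design matters most: the aggregate can never exceed the smallest individual MTBF.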

## Choosing the Number of Stages in a Basic CDC Synchroniser

The choice for the number of stages in each CDC synchroniser needs to be goal driven. Here the goal is set by the overall design, and we work back to a single synchroniser, using knowledge of the MTBF for a 1-stage synchroniser, which Xilinx's report_synchronizer_mtbf TCL command will not give us. We now know how this is calculated from the MTBF equation, or how to "fudge" it from the results that are returned by report_synchronizer_mtbf. A similar analysis ought to be possible in Intel's Quartus Prime using their report_metastability TCL command.

The steps are:

- Estimate the number of CDCs in your design, which could be tough.
- Decide the overall MTBF you should be aiming to achieve.
- Calculate the MTBF you need for a single CDC.
- Calculate the number of synchroniser stages required.
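The steps above can be sketched in Python. This is only a rule-of-thumb calculation under stated assumptions: all CDCs are identical, so the single-CDC requirement is the design goal multiplied by the number of CDCs; the settling time grows by a fixed amount per stage; and at least two stages are always used, since Vivado will not recognise a single flip-flop synchroniser. The function name, the design goal of 1000 years and the count of 100 CDCs are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def stages_required(design_mtbf_years, num_cdcs, tau_s, window_s,
                    f_dest_hz, f_change_hz, settling_per_stage_s,
                    min_stages=2):
    """Estimate the synchroniser chain length needed to meet a design MTBF."""
    # Identical CDCs: each one must be num_cdcs times better than the design goal.
    single_mtbf_s = design_mtbf_years * SECONDS_PER_YEAR * num_cdcs
    # Rearranged MTBF equation: S = tau * ln(MTBF * W * Fc * Fd).
    settling_needed_s = tau_s * math.log(
        single_mtbf_s * window_s * f_dest_hz * f_change_hz)
    # Assumed: every stage contributes the same settling time.
    stages = math.ceil(settling_needed_s / settling_per_stage_s)
    return max(min_stages, stages)

# Values derived earlier in this post, with the 3.711 ns per-stage slack.
n_relaxed = stages_required(1000, 100, 41.37e-12, 2.16e-12,
                            250.0e6, 12.5e6, 3.711e-9)
# A hypothetical, much tighter 0.5 ns of settling time per stage.
n_tight = stages_required(1000, 100, 41.37e-12, 2.16e-12,
                          250.0e6, 12.5e6, 0.5e-9)
print(n_relaxed, n_tight)
```

With the settling times measured here, the minimum of two stages already exceeds the goal by a wide margin; the chain only needs to grow when the settling time available per stage becomes small relative to \(\tau\).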

In practice it will be tempting to get Vivado to calculate the MTBF for the whole design and then just add a stage if we are not happy. However, this requires the number of stages in each CDC to be driven by a single generic value, and changes in this generic value must not upset any functional timing. Hence some "rule of thumb" might be helpful in advance.

## Conclusions

It is now possible to understand the MTBF calculation method and relationships with precision, even if in practice it is non-trivial to put into use. Adding a synchroniser stage to all CDCs in a design gives a multiplicative benefit, while increasing the number of CDC synchronisers gives a linear degradation in MTBF.

## References

- GitHub Source Code.
- Performance Analysis of Synchronization Circuits, MPhil Thesis, Zhen Zhang.
- Clock Domain Crossing (CDC) Design & Verification Techniques Using System Verilog, Clifford E. Cummings.
- Wikipedia Metastability (electronics)
- A survey and taxonomy of GALS design styles, Paul Teehan, Mark Greenstreet, Guy G. Lemieux
- Quartus Prime Synchronization Register Chains
- Understanding Metastability in FPGAs, Altera
- Metastability and Synchronizers: A Tutorial, Ran Ginosar
- Lecture 11 - Mitigating Metastability, Ryan Robucci
- MTBF Bounds for Multistage Synchronizers, Salomon Beer, Jerome Cox, Tom Chaney and David Zar
- Metastability and Clock Uncertainty in FPGA Designs, Ray Andraka
