Calculating the best stage effort

According to the theory of logical effort, the fastest circuit occurs when each stage bears an equal effort. That is, the values of gh in the expression d = τ × (p + gh) are the same for every stage. The best stage effort is the one that minimises the delay, and it is used to determine the best number of stages in a logic network.
We label the best stage effort ρ. It occurs when the parasitic delay of an inverter, p_inv, equals ρ × (ln(ρ) - 1):
p_inv = ρ × (ln(ρ) - 1)
or, equivalently,
p_inv + ρ × (1 - ln(ρ)) = 0
If the inverter parasitic delay is 0, i.e. an ideal inverter, then the best stage effort occurs when ρ = e, or about 2.7. If the inverter parasitic delay is 1 (as it is in the text on logical effort referenced by the patent), then the best stage effort occurs when ρ = 3.6. For a deep-submicron (DSM) technology like 0.13 µm, the inverter parasitic delay is much higher. For the vsclib characterised in 0.13 µm, the inverter parasitic delay is 3.6, which gives a best stage effort of ρ = 5.4.
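The relation between the inverter parasitic delay and ρ has no closed-form solution, so ρ is usually found numerically. The following is a minimal sketch, assuming a simple bisection search; the function names `best_stage_effort` and `best_num_stages` are hypothetical and do not come from the patent or from vsclib. The stage-count helper uses the standard logical-effort rule N ≈ ln(F) / ln(ρ) for a path effort F, which this page mentions but does not spell out.

```python
import math


def best_stage_effort(p_inv, tol=1e-9):
    """Solve p_inv = rho * (ln(rho) - 1) for rho by bisection.

    p_inv is the inverter parasitic delay in units of tau (assumed >= 0).
    """
    if p_inv < 0:
        raise ValueError("p_inv must be non-negative")

    def f(rho):
        return rho * (math.log(rho) - 1.0) - p_inv

    # f is increasing for rho > 1 and f(e) = -p_inv <= 0, so the root
    # lies at or above e. Grow the upper bound until it brackets the root.
    lo, hi = math.e, 2.0 * math.e
    while f(hi) < 0:
        hi *= 2.0

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def best_num_stages(path_effort, rho):
    """Nearest whole number of stages for a path effort F (assumed rule:
    N = ln(F) / ln(rho), with at least one stage)."""
    return max(1, round(math.log(path_effort) / math.log(rho)))


if __name__ == "__main__":
    for p_inv in (0.0, 1.0, 3.6):
        print(p_inv, round(best_stage_effort(p_inv), 2))
```

With p_inv = 0 and p_inv = 1 the search returns values close to e and 3.6 respectively. With p_inv = 3.6 it returns a value slightly below the quoted 5.4, which may simply reflect rounding of the characterised parasitic delay.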
Using the best stage effort to calculate fixed gate timing

In the method of the patent, an initial fixed delay is given to each function. This delay equals that of the best stage effort:
d = τ × (p + ρ)
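As a small illustration of the formula above, the sketch below evaluates the fixed delay of an inverter for the two cases quoted in the text (p = 1 with ρ = 3.6, and p = 3.6 with ρ = 5.4). The function name `fixed_delay` is hypothetical, and τ defaults to 1 so that the result is expressed in units of τ.

```python
def fixed_delay(p, rho, tau=1.0):
    """Fixed gate delay d = tau * (p + rho) at the best stage effort rho.

    p is the parasitic delay of the function; with tau = 1.0 the result
    is in units of tau.
    """
    return tau * (p + rho)


# Inverter values quoted in the text (in units of tau).
d_textbook = fixed_delay(p=1.0, rho=3.6)  # textbook inverter
d_vsclib = fixed_delay(p=3.6, rho=5.4)    # vsclib inverter, 0.13 um

if __name__ == "__main__":
    print(d_textbook, d_vsclib)
```

Other functions would use their own parasitic delay p with the same ρ, so the fixed delays differ only by the parasitic term.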
The parasitic delay, logical effort and the two fixed delays are shown in the table on the right. The fixed delay used in the method described in the patent is in the column with ρ=3.6.
Consideration of the errors applying logical effort to non-inverting gates

An advantage of the logical effort theory is the simplification of timing information so that each function has one set of timing numbers. If the standard cell library is sufficiently rich, then the predicted timing will be met by choosing cells which are close to the desired drive strength.
This really only works for single-stage inverting cells. For 2-stage non-inverting cells, the approximation is poor once the gain h = C_OUT / C_IN exceeds a value of about 4. The graph on the right shows a good clustering of parasitic delay and logical effort around a mean for the inverting NAND and NOR gates and inverters, but a worse clustering for the non-inverting AND and OR gates and buffers. For these gates, the weak drive strengths have less delay when lightly loaded, shown by a smaller value of parasitic delay, p. But when heavily loaded, the weak drive strengths have a larger delay, as shown by the larger value of logical effort, g.
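The parasitic delay p and logical effort g of a cell are the intercept and slope of a straight-line fit of delay against gain h. The sketch below shows one way such a fit could be done with ordinary least squares; it is an assumption about the procedure, not the method used to characterise vsclib. The data points are synthetic, generated from an exactly linear cell with p = 2.0 and g = 1.5, and are not measurements from any library.

```python
def fit_p_g(gains, delays):
    """Least-squares fit of d = p + g * h, with delays in units of tau.

    Returns (p, g): the intercept is the parasitic delay and the slope
    is the logical effort.
    """
    n = len(gains)
    mean_h = sum(gains) / n
    mean_d = sum(delays) / n
    sxx = sum((h - mean_h) ** 2 for h in gains)
    sxy = sum((h - mean_h) * (d - mean_d) for h, d in zip(gains, delays))
    g = sxy / sxx
    p = mean_d - g * mean_h
    return p, g


def max_residual(gains, delays, p, g):
    """Largest absolute deviation of the data from the fitted line."""
    return max(abs(d - (p + g * h)) for h, d in zip(gains, delays))


# Synthetic, exactly linear data (illustrative only, not library data).
gains = [1, 2, 3, 4, 6, 8]
delays = [2.0 + 1.5 * h for h in gains]

p_fit, g_fit = fit_p_g(gains, delays)
worst = max_residual(gains, delays, p_fit, g_fit)

if __name__ == "__main__":
    print(p_fit, g_fit, worst)
```

For a single-stage inverting cell the residual stays small across the range of h. A 2-stage cell whose delay curve bends at higher gain would show a larger residual, which is one way to quantify how well the single (p, g) pair describes it.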
The nd2ab, or2v0 and or2v4 are all 2-OR gates implemented in different ways. As a result, their values of parasitic delay and logical effort are different. The patent does not teach how to handle this situation. The starting netlist of the 4-bit adder does not have any 2-OR gates, so the problem would only arise if a 2-NOR gate needs to be inverted.
There are also three different types of 2-XOR gate implementation. Here average values are taken across all the variants, which leads to more substantial errors between the estimated and actual values. For example, there is a 15% difference between the estimated and actual parasitic delay for the worst-case cell, the xor2v1x05. For the 2-XNOR gate, where the different drive strengths all have a similar implementation, the maximum difference between the estimated and actual parasitic delays is 10%.