
## Library analysis to derive the typical load value
A library analysis is performed which derives a typical load value $C_S$ for each function, using the following quantities:

Symbol | Meaning |
---|---|
$A$ | the estimate of the cell area |
$C_{OUT}$ | the load capacitance |
$C_S$ | the typical load value for the function |
$d$ | the delay fixed during the timing algorithm |
$\tau$ | the characteristic delay of the technology (9.7 ps for the 0.13 µm vsclib) |
$p$ | the parasitic delay of the gate |

$C_S$ =

Deriving the drive strength from the estimated cell area, pin a of a 2-XOR gate ($C_S = 0.47$, $g = 1.65$):

Drive strength | x05 | x1 | x2 | x3 | x4 |
---|---|---|---|---|---|
$C_{IN}$ | 3.34 | 5.34 | 10.30 | 15.07 | 20.29 |
Adjusted $C_{IN}$ | 2.97 | 5.34 | 10.22 | 16.23 | 21.23 |
Actual width | 8 | 8 | 15 | 19 | 25 |
Est. width $A = C_{IN}/(g\,C_S)$ | 3.8 | 6.9 | 13.1 | 20.9 | 27.4 |

Mapping from estimated area to drive strength:

Estimated area | Mapped drive strength |
---|---|
0 to 5.1 | x05 |
5.1 to 9.5 | x1 |
9.5 to 16.6 | x2 |
16.6 to 23.9 | x3 |
above 23.9 | x4 |
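As a rough sketch of how the symbols listed above could fit together, assume the standard logical-effort delay model. That model is an assumption here, not something this page states. Writing $g$ for the logical effort, solving the delay for $C_{IN}$ and substituting into the estimated width formula from the table gives an area that depends only on the load, the fixed delay and $C_S$:

$$
\begin{aligned}
d &= \tau\left(g\,\frac{C_{OUT}}{C_{IN}} + p\right)
\;\Rightarrow\;
C_{IN} = \frac{g\,C_{OUT}}{d/\tau - p} \\
A &= \frac{C_{IN}}{g\,C_S} = \frac{C_{OUT}}{C_S\,(d/\tau - p)}
\end{aligned}
$$

Under this assumption the logical effort cancels, which is consistent with $C_S$ being a per-function constant that converts a load and a delay budget directly into an area estimate.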

So for example, if the estimated area is 8 tracks,
which lies between 5.1 and 9.5, then the mapped cell is
the **xor2v0x1**.
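The lookup in this example can be sketched in a few lines of Python. The boundary values and drive strengths come from the mapping table above; the function name `map_drive` is hypothetical, and sending an area that lands exactly on a boundary to the larger cell is an assumption, since the text does not say how ties are resolved.

```python
from bisect import bisect_right

# Boundary areas (in tracks) and drive strengths from the mapping table
# for pin a of the 2-input XOR.
BOUNDARIES = [5.1, 9.5, 16.6, 23.9]
DRIVES = ["x05", "x1", "x2", "x3", "x4"]


def map_drive(area, boundaries=BOUNDARIES, drives=DRIVES):
    """Return the drive strength whose area interval contains `area`.

    Assumption: an area exactly on a boundary maps to the larger cell.
    """
    return drives[bisect_right(boundaries, area)]


# An estimated area of 8 tracks lies between 5.1 and 9.5.
print("xor2v0" + map_drive(8.0))
```

A binary search keeps the lookup independent of how many drive strengths the library offers for a function.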

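The boundary areas, such as 9.5, are the geometric mean of the two neighbouring estimated widths, so that each boundary is the same ratio away from the cell below it and the cell above it:

$$\frac{A}{6.9} = \frac{13.1}{A}; \quad A^2 = 6.9 \times 13.1 \approx 90.4; \quad A \approx 9.5$$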
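The whole set of boundaries can be reproduced with a short sketch. It takes the adjusted $C_{IN}$ values, $C_S$ and $g$ from the table, computes each estimated width as $C_{IN}/(g\,C_S)$, and places each boundary at the geometric mean of adjacent widths. The function names are hypothetical, and using unrounded widths rather than the rounded table entries is a choice made here, so the last digit can differ slightly from a hand calculation.

```python
import math

C_S = 0.47  # typical load value for the function (from the table)
G = 1.65    # logical effort (from the table)

# Adjusted input capacitance of pin a for each drive strength (from the table).
ADJUSTED_C_IN = {"x05": 2.97, "x1": 5.34, "x2": 10.22, "x3": 16.23, "x4": 21.23}


def estimated_widths(c_in=ADJUSTED_C_IN, g=G, c_s=C_S):
    """Estimated width A = C_IN / (g * C_S) for each drive strength."""
    return {name: c / (g * c_s) for name, c in c_in.items()}


def boundaries(widths):
    """Geometric mean of each pair of adjacent estimated widths."""
    vals = list(widths.values())
    return [math.sqrt(lo * hi) for lo, hi in zip(vals, vals[1:])]


widths = estimated_widths()
for name, width in widths.items():
    print(f"{name}: estimated width {width:.1f}")
for bound in boundaries(widths):
    print(f"boundary at {bound:.1f}")
```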

The patent does not describe the precise method of getting the drive strength of a cell from its estimated area. It limits itself to describing the use of $C_S$ to estimate the area, as though this were sufficient for mapping to the standard cell library. The method above uses the logical effort of the individual cell, compared to the overall average, to adjust the value of $C_{IN}$ (the "Adjusted $C_{IN}$" row of the table).

A correct mapping is important for the quality of the final netlist timing, and the method used here is an improvement over the one used in the original paper on July 25, 2005.

The second schematic on the right **Fig 4b** shows the
same netlist as **Fig 4a** but using the vsclib
timing instead of an idealised approximation.
There is a good correlation between the estimated
delays coming from the constant delay model
and the actual library delays.
The critical path is slightly down at 368ps from 371ps,
and the maximum input capacitance is up from
86fF to 95fF.
Although the 4-bit adder timing spec of
350ps is almost met, the input capacitances are too large.

The estimated area of the 4-bit adder is 83.1 gates and the actual area is 85.0 gates, a 2.2% inaccuracy.

With the stage effort raised to ρ=5.4, the critical path delay is longer, up from 371ps to 441ps, but the input pin capacitances are lower, down from a maximum of 86fF to 52fF.

When the timing from the actual library cells replaces
the constant timing in **Fig 4d**, the delays again remain similar.
The critical path is up from 441ps to 449ps
and the maximum input capacitance is down slightly from
52fF to 45fF, as shown in the fourth schematic on the right.

The estimated area of the 4-bit adder is 48.7 gates and the actual area is 52.3 gates, a 7% inaccuracy.
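The two inaccuracy figures can be checked with a small sketch. It assumes the error is taken relative to the actual area; the text does not state the denominator, and the function name is hypothetical.

```python
def area_error_pct(estimated, actual):
    """Percentage error of the estimated area, relative to the actual area."""
    return 100.0 * abs(actual - estimated) / actual


# (estimated, actual) gate counts quoted in the text for the two netlists.
for estimated, actual in [(83.1, 85.0), (48.7, 52.3)]:
    print(f"{estimated} vs {actual}: {area_error_pct(estimated, actual):.1f}%")
```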

 | Critical Path (ps) | Input Capacitance (fF) | Gate Count |
---|---|---|---|
Initial schematic | 545 | 45 | 40 |
Stage effort ρ=5.4 | 449 | 45 | 52 |
Stage effort ρ=3.6 | 368 | 95 | 85 |

This poor fit occurs because a single coefficient

Later we will see what difference mapping with two area coefficients makes.


4-AUG-05