
Other Papers on the Subject

The conclusion of this experiment is that a reasonably large standard cell library is needed to get full performance from a technology. This is similar to the conclusion reached by Masgonty, Cserveny, Arm, Pfister and Piguet in a paper describing their CSEL-LIB 5 library. That library has 22 functions and 92 cells, compared to the minimum vsclib library which has 33 functions and 92 cells.
The Piguet paper does not list the cells in their library, but the list is given here for the vsclib.

Other papers on the subject of which cells to include in a library conclude that a very small library is adequate. A paper by Nguyen and Sakurai describing the CyHP library concludes that a library of 17 combinatorial cells gives the same performance as a library of 400 cells. Another paper, "Do we need so many cells for digital ASIC synthesis?" by Noullet and Ferreira-Noullet (unfortunately the paper cannot be read at this site), describes the RACL library of 18 cells and arrives at a similar conclusion. These conclusions seem to be wrong.

A critical path will always have some high fanout nets. These nets are driven by high drive cells. If the library does not have a high drive cell, then the load is split and each part is driven by a separate weak drive cell. Note that the Alliance synthesis flow cannot make this optimisation. If you design a library that requires it, then you will have to use a commercial tool like Synopsys for synthesis.

[Figure: fanout example]

Splitting the load in this way works if no allowance is made for interconnect capacitance. As shown in the drawing, the 2-NAND gate driving the cell with a high fanout has a fanout of one with a single high drive destination cell, but a fanout of two when driving two weaker cells. If an allowance is made for interconnect capacitance, for example by estimating it with a wireload capacitance of 6fF, then the 2-NAND with a fanout of two has an extra 6fF load, and this slows it down.
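The fanout argument above can be put into numbers with a rough sketch. The 6fF wireload figure comes from the text; everything else is an assumption made for illustration: the input capacitances are hypothetical values (not vsclib data), the two weaker cells are assumed to present the same total input capacitance as the one high drive cell, and the wireload is assumed to be charged once per fanout.

```python
# Rough sketch of the load seen by the 2-NAND in the fanout example.
# All capacitances except the 6fF wireload are hypothetical, not vsclib data.

WIRELOAD_PER_FANOUT_FF = 6.0  # wireload estimate quoted in the text
CIN_HIGH_DRIVE_FF = 20.0      # assumed input capacitance of the high drive cell
CIN_WEAK_DRIVE_FF = 10.0      # assumed input capacitance of each weaker cell


def nand_load_ff(cell_input_caps_ff, wireload_per_fanout_ff=0.0):
    """Total load on the driving gate: cell inputs plus wireload per fanout."""
    fanout = len(cell_input_caps_ff)
    return sum(cell_input_caps_ff) + fanout * wireload_per_fanout_ff


single = [CIN_HIGH_DRIVE_FF]                    # one high drive destination
split = [CIN_WEAK_DRIVE_FF, CIN_WEAK_DRIVE_FF]  # two weaker destinations

# Without interconnect capacitance the two arrangements load the NAND equally.
no_wire_single = nand_load_ff(single)
no_wire_split = nand_load_ff(split)

# With the wireload estimate, the split arrangement carries one extra wireload.
wire_single = nand_load_ff(single, WIRELOAD_PER_FANOUT_FF)
wire_split = nand_load_ff(split, WIRELOAD_PER_FANOUT_FF)
extra_load_ff = wire_split - wire_single

print(f"no wireload:   single={no_wire_single}fF split={no_wire_split}fF")
print(f"with wireload: single={wire_single}fF split={wire_split}fF")
print(f"extra load on the 2-NAND when split: {extra_load_ff}fF")
```

Under these assumptions the extra load on the 2-NAND is exactly one wireload, independent of the cell input capacitances chosen.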

If no allowance is made for interconnect capacitance, then the performance could still be the same. But the area will generally be larger, and the routing congestion worse. For the small example above, implemented with the vsclib, splitting the high drive cell increases the area from 4 to 4.7 gates and reduces the porosity from 42% to 21%.

A weakness of both these papers is their reliance on an existing commercial standard cell library. These aren't always very good. The Noullet paper gives the initial cell list of their standard cell library. There are 168 combinational logic cells, a maximum of three drive strengths per function except for inverters and buffers, and importantly no carry generator cell.

High performance design was difficult even with the complete library, so it isn't surprising that the performance degradation when removing cells was small.