Which cells to include in a standard cell library is an important question. A complete library can be thought of as containing a very large number of cells, and a good way to deliver it is in staged releases. The first release contains a few important cells that can be designed quickly, yet are chosen so that as much as possible of a technology's performance is available to the user. Further releases add cells that improve that performance, but the benefit of each new cell becomes increasingly marginal. The library designer decides how many cells to include by weighing design time against the utility of the cells.
Which cells should be included in a first library release?
How big should the final library be?
Three papers covering this subject are described below. Two of them claim that a very small library gives similar performance to a large one. The third reaches a similar conclusion, although its reduced library is larger.
On the other hand, commercial libraries run to hundreds of cells, which are presumably there because they improve the quality of the final synthesized netlist. The book "Closing the Gap Between ASIC and Custom" by David Chinnery and Kurt Keutzer estimates that a cell library with only two drive strengths could be 25% slower than a library with a rich selection of drive strengths (see Chapter 1, section 9.1).
Companies like Zenasis exist to provide on-the-fly optimisations of combinational logic, which are claimed to improve performance by around 15%. These companies would not survive if this claim were completely false.
Paper which concludes that a standard cell library with only 20 cells has the same performance as libraries with 400 cells. This assertion is tested with 31 benchmark designs. These designs are structural netlists that use simple primitives, which are then re-targeted to three vendor libraries for the analysis. The least used cells are successively removed and the circuit performance is measured. Starting with 400 cells, the authors found no significant loss in speed even with a library of 20 cells. Of these 20 cells, 17 are combinational logic, with just 11 different functions being used.
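The pruning procedure described above (remove the least used cell, re-target, measure again) can be sketched as a simple loop. This is only an illustrative sketch, not the paper's actual flow: the names `prune_library` and `toy_synthesize`, the cell names, and the usage and delay numbers are all hypothetical, and a real `synthesize` callback would have to run a synthesis tool and timing analysis. The sketch also ignores the requirement that the remaining cells stay functionally complete.

```python
from collections import Counter
from typing import Callable

# Hypothetical callback type: given a set of cell names, re-target the
# benchmark designs and return (cell usage counts, critical path delay).
Synthesize = Callable[[set[str]], tuple[Counter, float]]


def prune_library(library: set[str], synthesize: Synthesize,
                  min_size: int) -> list[tuple[int, float]]:
    """Repeatedly drop the least used cell and record (library size, delay)."""
    current = set(library)
    history = []
    while True:
        usage, delay = synthesize(current)
        history.append((len(current), delay))
        if len(current) <= min_size:
            break
        # Least used cell goes first; sorting makes tie-breaking deterministic.
        victim = min(sorted(current), key=lambda cell: usage.get(cell, 0))
        current.remove(victim)
    return history


def toy_synthesize(cells: set[str]) -> tuple[Counter, float]:
    """Stand-in for a real synthesis run; all numbers here are made up."""
    weights = {"inv": 50, "nand2": 40, "nor2": 30, "aoi21": 5, "xor2": 3}
    usage = Counter({cell: weights[cell] for cell in cells})
    # Assume each missing cell costs a fixed 10% in delay (pure illustration).
    delay = 1.0 + 0.1 * (len(weights) - len(cells))
    return usage, delay


if __name__ == "__main__":
    full = {"inv", "nand2", "nor2", "aoi21", "xor2"}
    for size, delay in prune_library(full, toy_synthesize, min_size=2):
        print(f"{size} cells: relative delay {delay:.2f}")
```

Plotting the recorded delay against library size shows where the curve starts to rise, which is the point at which removing further cells begins to cost performance.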
Paper which concludes that one can reduce the number of combinational cells in a library from 168 down to 16 with no significant loss of performance. Eight libraries, from the initial 168 right down to the smallest library of just 8 combinational cells, were created. Synopsys was then used to synthesise six benchmark circuits (including a 10×22 multiplier), recording the area and critical path delay. The conclusion was that a library of 16 combinational cells gave virtually the same performance as the full library of 168 cells. For the multiplier, the area was 5.6% more and the critical path delay was 18.4% more (which is actually quite significant!).
Paper which describes a library design exercise with the aim of including only the most useful cells in a library. The authors had already designed one library, the CSEL-LIB 4, and the speed of the new library, the CSEL-LIB 5, was compared to the previous one as new cells were added. The conclusion is that just 22 different logic functions are needed, but with 92 different cells because each function is supplied with many different drive strengths. The paper doesn't say how many of these 22 functions are combinational and no cell list is given.
In the following pages, an experiment is conducted to try to find out how many combinational cells are needed in a standard cell library. The Alliance software is used to synthesise an 8×8 multiplier with the vsclib. The library contents are gradually increased up to 188 cells, then decreased to 92 cells, which is considered the minimum size library before significant performance degradation occurs.