| gate count | 1751 |
| number of cells | 568 |
| number of library cells | 92 |
| number of used cells | 50 |
| max fanin | 4 |
| max input capacitance | 94 |
| max internal fanout | 34 |
| critical path 0fF | 2123 |
| critical path 6fF | 2462 |
By selectively removing cells, the library size can be more than halved with only a 0.9% loss in performance.
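The two figures in this claim can be checked with a little arithmetic. The sketch below assumes the full library result is the 188 cell run with a 2441 ps critical path and the reduced library result is the 92 cell run with a 2462 ps critical path, both taken from the table of synthesis results on this page; the function and variable names are illustrative only.

```python
def percent_change(new: float, old: float) -> float:
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


# Figures copied from the table of synthesis results (assumed pairing:
# 188 cell library at 2441 ps versus 92 cell library at 2462 ps).
full_cells, reduced_cells = 188, 92
full_path_ps, reduced_path_ps = 2441, 2462

size_ratio = reduced_cells / full_cells
slowdown = percent_change(reduced_path_ps, full_path_ps)

print(f"library size ratio: {size_ratio:.2f}")
print(f"critical path increase: {slowdown:.1f}%")
```

A size ratio below 0.5 corresponds to "more than halved", and the critical path increase rounds to the 0.9% quoted.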
The interesting observation here is that removing the x1 drive strength cells does not worsen performance significantly. For most functions, these are the cells with the largest transistors that can be used, without folding, in the smallest cell area. The conclusion, however, is that cells with half-sized transistors are better, because they load the critical path less when driving non-critical outputs, while x2 or stronger drive strengths are chosen for the critical path itself.
The critical path itself is shown below on the left, with the full library critical path on the right.
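| | 92 cell library critical path | | | | | 188 cell library critical path | | | | |
| gate | cell | | arc | arrival (ps) | delay (ps) | cell | | arc | arrival (ps) | delay (ps) |
| | x 1 3 51 | | | | | x 1 3 61 | | | | |
| 1 | bf1v0x12 | 15 | a->z | 191 | 140 | bf1v0x12 | 15 | a->z | 208 | 147 |
| 2 | nd4v0x3 | 1 | d->z | 294 | 103 | nd4v0x3 | 1 | d->z | 311 | 103 |
| 3 | oai21v0x8 | 4 | b->z | 375 | 81 | oai21v0x8 | 4 | b->z | 392 | 81 |
| 4 | iv1v0x12 | 1 | a->z | 424 | 49 | xor2v0x4 | 1 | b->z | 473 | 81 |
| 5 | oai21v0x8 | 4 | a2->z | 513 | 89 | cgi2v0x3 | 3 | c->z | 578 | 105 |
| 6 | xor2v0x4 | 1 | b->z | 610 | 97 | iv1v0x6 | 1 | a->z | 627 | 49 |
| 7 | cgi2v0x3 | 3 | a->z | 729 | 119 | cgi2v0x3 | 3 | c->z | 730 | 103 |
| 8 | iv1v0x4 | 1 | a->z | 784 | 55 | iv1v0x6 | 1 | a->z | 779 | 49 |
| 9 | cgi2v0x3 | 3 | c->z | 888 | 104 | cgi2v0x3 | 3 | c->z | 889 | 110 |
| 10 | iv1v0x4 | 1 | a->z | 943 | 55 | iv1v0x6 | 1 | a->z | 938 | 49 |
| 11 | cgi2v0x3 | 3 | c->z | 1039 | 96 | cgi2v0x3 | 3 | c->z | 1055 | 117 |
| 12 | iv1v0x4 | 1 | a->z | 1094 | 55 | iv1v0x6 | 1 | a->z | 1104 | 49 |
| 13 | cgi2v0x3 | 3 | c->z | 1203 | 109 | cgi2v0x3 | 3 | c->z | 1209 | 105 |
| 14 | iv1v0x4 | 1 | a->z | 1257 | 54 | iv1v0x6 | 1 | a->z | 1258 | 49 |
| 15 | cgi2v0x3 | 3 | c->z | 1362 | 105 | cgi2v0x3 | 3 | c->z | 1361 | 103 |
| 16 | iv1v0x4 | 1 | a->z | 1416 | 54 | iv1v0x6 | 1 | a->z | 1410 | 49 |
| 17 | cgi2v0x3 | 4 | c->z | 1534 | 118 | cgi2v0x3 | 4 | c->z | 1529 | 119 |
| 18 | xnr2v0x3 | 1 | a->z | 1638 | 104 | xnr2v0x3 | 1 | a->z | 1633 | 104 |
| 19 | xor2v0x4 | 1 | b->z | 1727 | 89 | xor2v0x4 | 1 | b->z | 1722 | 89 |
| 20 | cgi2v0x3 | 2 | a->z | 1822 | 95 | cgi2v0x3 | 2 | a->z | 1817 | 95 |
| 21 | iv1v0x4 | 1 | a->z | 1877 | 55 | iv1v0x4 | 1 | a->z | 1871 | 54 |
| 22 | cgi2v0x3 | 2 | c->z | 1960 | 83 | cgi2v0x3 | 2 | c->z | 1960 | 89 |
| 23 | iv1v0x4 | 1 | a->z | 2010 | 50 | iv1v0x6 | 1 | a->z | 2009 | 49 |
| 24 | cgi2v0x2 | 2 | c->z | 2096 | 86 | cgi2v0x3 | 2 | c->z | 2090 | 81 |
| 25 | an2v0x4 | 2 | b->z | 2205 | 109 | an2v0x8 | 2 | b->z | 2194 | 104 |
| 26 | an2v0x8 | 2 | b->z | 2315 | 110 | an2v0x8 | 2 | b->z | 2304 | 110 |
| 27 | xor2v0x2 | 0 | b->z | 2462 | 147 | xaon21v0x3 | 0 | a2->z | 2441 | 137 |
| | r 14 | | | | | r 15 | | | | |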
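These two critical paths are nearly the same. In the listing, the gate function differs only at gates 4, 5, 6 and 27; gate 7 uses a different input pin, and the remaining differences are in drive strength alone (for example iv1v0x4 in place of iv1v0x6). It is useful to analyse the differences and see where the loss in speed occurs.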
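One way to locate the differences mechanically is to compare the two listings gate by gate. The sketch below transcribes the cell name and input pin of each gate from the two critical paths as `cell:pin` tokens and classifies every mismatch as a change of function, of input pin, or of drive strength only. It assumes the cell names follow the pattern `<function>v0x<drive>`, which is inferred from the names in the listing rather than stated here; `split_cell` and `classify` are hypothetical helper names.

```python
# Cell and input pin for gates 1..27, transcribed from the two listings.
REDUCED = (
    "bf1v0x12:a nd4v0x3:d oai21v0x8:b iv1v0x12:a oai21v0x8:a2 xor2v0x4:b "
    "cgi2v0x3:a iv1v0x4:a cgi2v0x3:c iv1v0x4:a cgi2v0x3:c iv1v0x4:a "
    "cgi2v0x3:c iv1v0x4:a cgi2v0x3:c iv1v0x4:a cgi2v0x3:c xnr2v0x3:a "
    "xor2v0x4:b cgi2v0x3:a iv1v0x4:a cgi2v0x3:c iv1v0x4:a cgi2v0x2:c "
    "an2v0x4:b an2v0x8:b xor2v0x2:b"
).split()

FULL = (
    "bf1v0x12:a nd4v0x3:d oai21v0x8:b xor2v0x4:b cgi2v0x3:c iv1v0x6:a "
    "cgi2v0x3:c iv1v0x6:a cgi2v0x3:c iv1v0x6:a cgi2v0x3:c iv1v0x6:a "
    "cgi2v0x3:c iv1v0x6:a cgi2v0x3:c iv1v0x6:a cgi2v0x3:c xnr2v0x3:a "
    "xor2v0x4:b cgi2v0x3:a iv1v0x4:a cgi2v0x3:c iv1v0x6:a cgi2v0x3:c "
    "an2v0x8:b an2v0x8:b xaon21v0x3:a2"
).split()


def split_cell(token):
    """Split 'nd4v0x3:d' into (function, pin, drive) = ('nd4', 'd', '3')."""
    name, pin = token.split(":")
    function, drive = name.split("v0x")  # assumed naming pattern
    return function, pin, drive


def classify(a, b):
    """Return the most significant kind of difference between two gates."""
    fa, pa, da = split_cell(a)
    fb, pb, db = split_cell(b)
    if fa != fb:
        return "function"
    if pa != pb:
        return "pin"
    if da != db:
        return "drive"
    return "same"


# Map each kind of difference to the list of gate numbers where it occurs.
diffs = {}
for gate, (a, b) in enumerate(zip(REDUCED, FULL), start=1):
    kind = classify(a, b)
    if kind != "same":
        diffs.setdefault(kind, []).append(gate)

for kind, gates in diffs.items():
    print(f"{kind}: gates {gates}")
```

Separating function changes from drive strength changes makes it easier to see which gates reflect a different mapping and which reflect only a different sizing of the same cell.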
If further cells are now removed, they will be either the high drive cells needed for the critical path, or cells like the xaon21v0x3, which can both appear on the critical path and significantly reduce the cell count. From this analysis, the minimum set of combinatorial cells that gives the best performance is 92 cells. Increasing the library to 188 cells gives a slight performance benefit, 0.9% measured with the multiplier. Whether to include the extra cells is a choice for the library developer.
| Table of synthesis results |
| | critical path (ps) | gate count | cell count | porosity | library cells | used cells | |
| synthesis 1 | 4279 | 1561 | 923 | 43% | 9 | 8 | basic inverters, NAND & NOR gates |
| synthesis 2 | 4236 | 1472 | 792 | 45% | 15 | 12 | AND & OR gates |
| synthesis 3 | 4157 | 1357 | 696 | 46% | 19 | 16 | AOI & OAI gates, 2/1 and 2/2 |
| synthesis 4 | 4157 | 1357 | 696 | 46% | 20 | 16 | mxi2 2-way inverting mux |
| synthesis 5 | 3983 | 1343 | 668 | 48% | 21 | 16 | cgi2 carry generator inverting |
| synthesis 6 | 3948 | 1352 | 668 | 48% | 28 | 18 | inverters with multiple drive strengths |
| synthesis 7 | 3061 | 1433 | 666 | 51% | 70 | 27 | x2 drive strengths for all functions |
| synthesis 8 | 3056 | 1456 | 666 | 52% | 70 | 30 | BOOG with x1 drive strengths |
| synthesis 9 | 2960 | 1476 | 666 | 53% | 70 | 32 | BOOG with x05 drive strengths |
| synthesis 10 | 2963 | 1480 | 666 | 53% | 76 | 34 | nd2a and nr2a cells |
| synthesis 11 | 2963 | 1480 | 666 | 53% | 79 | 34 | nd2ab type of 2-OR |
| CyHP library | 3778 | 1539 | 832 | 46% | 18 | 17 | Minimum size library |
| synthesis 12 | 2908 | 1362 | 553 | 54% | 91 | 38 | AND/OR into XOR/XNOR |
| synthesis 13 | 2893 | 1378 | 551 | 55% | 103 | 39 | aoi211, aoi31, oai211 & oai31 |
| synthesis 14 | 2931 | 1400 | 562 | 55% | 104 | 38 | 3-XOR gate, 1/2 stage delays |
| synthesis 15 | 2886 | 1390 | 536 | 56% | 109 | 40 | 3-XOR/XNOR gates as 2×2-I/P gates |
| synthesis 16 | 2665 | 1514 | 538 | 60% | 136 | 46 | x3 drive strength cells |
| synthesis 17 | 2567 | 1571 | 540 | 61% | 155 | 49 | x4 drive strength cells |
| synthesis 18 | 2523 | 1611 | 540 | 62% | 167 | 49 | x6 drive strength cells |
| synthesis 19 | 2497 | 1625 | 538 | 62% | 179 | 54 | x8 drive strength cells |
| synthesis 20 | 2493 | 1628 | 541 | 62% | 188 | 55 | buffers to decouple non-critical paths |
| synthesis 21 | 2441 | 1758 | 563 | 64% | 188 | 55 | input buffers |
| synthesis 22 | 2550 | 1717 | 535 | 64% | 188 | 55 | optimised Alliance flow |
| synthesis 23 | 2439 | 1695 | 560 | 63% | 188 | 58 | current 209 cell vsclib |
| synthesis 24 | 2462 | 1751 | 568 | 64% | 92 | 50 | reduced 92 cell library |
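One way to read the table is to compute, for each step, how much critical path was saved relative to the previous row and how many library cells were added to get there. The sketch below does this for the drive strength steps only (synthesis 15 to 19), using the critical path and library cell figures copied from the rows above; `step_gains` is a hypothetical helper, and "ps saved per added cell" is an illustrative metric, not one used in the table.

```python
# (step, critical path in ps, library cells), copied from the table rows.
ROWS = [
    ("synthesis 15", 2886, 109),
    ("synthesis 16", 2665, 136),  # x3 drive strength cells
    ("synthesis 17", 2567, 155),  # x4 drive strength cells
    ("synthesis 18", 2523, 167),  # x6 drive strength cells
    ("synthesis 19", 2497, 179),  # x8 drive strength cells
]


def step_gains(rows):
    """For each row after the first, compare it with the row before it."""
    gains = []
    for (_, prev_ps, prev_cells), (name, ps, cells) in zip(rows, rows[1:]):
        saved_ps = prev_ps - ps          # critical path reduction
        added = cells - prev_cells       # library cells added in this step
        gains.append((name, saved_ps, added, saved_ps / added))
    return gains


GAINS = step_gains(ROWS)
for name, saved_ps, added, per_cell in GAINS:
    print(f"{name}: {saved_ps} ps saved, {added} cells added, "
          f"{per_cell:.1f} ps per cell")
```

The same function can be applied to any run of consecutive rows, provided each step adds at least one library cell.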