

# A Design of Threshold Logic Flip-Flops for Minimizing Power, Leakage and Area of Standard Cell ASIC

Ms. T. Archana, Electronics and Communication Engineering Department, Saveetha Engineering College, Chennai, India.

**Dr. S. Praveenkumar,** Electronics and Communication Engineering Department, Saveetha Engineering College, Chennai, India

Article Info Volume 82 Page Number: 2961 - 2966 Publication Issue: January-February 2020

Article History

Article Received: 14 March 2019 Revised: 27 May 2019 Accepted: 16 October 2019 Publication: 19 January 2020

#### Abstract

In modern VLSI design, power has become one of the major concerns. The clock network is a dominant consumer of power in recent integrated circuits. In the proposed design, power is reduced by replacing groups of flip-flops with fewer multi-bit flip-flops. However, this replacement can affect the performance of the integrated circuit, and it becomes a complex problem when flip-flops are replaced without considering timing and placement capacity constraints. We propose techniques to address this problem. To identify the flip-flops that can be merged and their legal regions, we perform a coordinate transformation. We then build a combination table that lists the possible combinations of flip-flops provided by the library. Finally, we use a hierarchical approach to merge flip-flops in an encryption standard register. In addition to power reduction, reducing the number of registers is also considered. In a test case consisting of 1000 flip-flops, we reduced the time needed to replace flip-flops and achieved a power reduction of up to 21%.

Keywords: VLSI design, transformation, integrated circuits, flip-flop.

#### I. PROPOSED SYSTEM

The notation used in the problem formulation is as follows.

1) Let fi correspond to a flip-flop and bi correspond to its bit width.

2) Let A( fi ) stand for the area of fi .

3) Let P( fi ) correspond to all the pins connected to fi .

4) Let M(pi, fi) correspond to the Manhattan distance between a pin pi and fi, where pi is an I/O pin that connects to fi.

5) Let S(pi ) symbolize the restriction of maximum wire length for a net that connects to a pin pi of a flip-flop.

6) We split the placement region into several bins [see Fig. 3(b) for an example], and each bin is denoted by Bk.

7) Let RA(Bk) symbolize the remaining area of the bin Bk that can be used to place additional cells.

8) Let L represent a cell library which contains different flip-flop types.

To reduce total power consumption, we merge as many flip-flops as possible using the flip-flop types available in the given cell library. To replace flip-flops f1, ..., fj-1 with a new flip-flop fj, the sum of the bit widths of the original flip-flops must equal the bit width of fj. Replacing flip-flops changes their routing lengths and therefore the timing of some paths. The replacement must also guarantee that a legal placement can be obtained. To address these issues, we define two constraints as follows.



1) Timing Constraint for a Net Connecting a Pin pi to a Flip-Flop fj: To avoid timing violations after the replacement, the Manhattan distance between pi and fj must not exceed the maximum wire length specified on pi (i.e., M(pi, fj) $\leq$ S(pi)). Each timing constraint defined on a pin therefore gives a feasible placement region for the flip-flop fj. See Fig. 4 for an example, in which a one-bit flip-flop f1 is connected to pins p1 and p2. The feasible placement region of f1 constrained by a pin pi, i = 1 or 2, is a diamond region denoted by Rp(pi) and drawn with dashed lines in the figure. The legal placement region of f1, denoted by R(f1), is the overlap of Rp(p1) and Rp(p2) and is drawn with solid lines.

2) Capacity Constraint for Each Bin Bk: The total area of the flip-flops placed in the bin Bk must not be greater than the remaining area of the bin (i.e., $\sum A(f_i) \leq RA(B_k)$).
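The two constraints can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes that the coordinate transformation mentioned in the abstract is the 45-degree rotation (x, y) -> (x + y, y - x), which turns each diamond region Rp(pi) into an axis-aligned square so that overlaps reduce to interval intersections. All function names and sample values are hypothetical.

```python
def rotate(x, y):
    """Assumed transformation: (x, y) -> (x + y, y - x)."""
    return x + y, y - x


def pin_region(pin, max_len):
    """Diamond {q : M(pin, q) <= max_len} as a square (u_lo, u_hi, v_lo, v_hi)
    in the rotated coordinates."""
    u, v = rotate(*pin)
    return (u - max_len, u + max_len, v - max_len, v + max_len)


def intersect(r1, r2):
    """Overlap of two rotated squares, or None if they do not overlap."""
    u_lo, u_hi = max(r1[0], r2[0]), min(r1[1], r2[1])
    v_lo, v_hi = max(r1[2], r2[2]), min(r1[3], r2[3])
    if u_lo > u_hi or v_lo > v_hi:
        return None
    return (u_lo, u_hi, v_lo, v_hi)


def legal_region(pins):
    """pins: list of ((x, y), S) pairs. Returns R(f) in rotated coordinates."""
    region = pin_region(*pins[0])
    for pin, max_len in pins[1:]:
        region = intersect(region, pin_region(pin, max_len))
        if region is None:
            return None
    return region


def contains(region, point):
    """True if a point (in original coordinates) lies inside the region."""
    u, v = rotate(*point)
    return region[0] <= u <= region[1] and region[2] <= v <= region[3]


def fits_in_bin(flip_flop_areas, remaining_area):
    """Capacity constraint: total flip-flop area must not exceed RA(Bk)."""
    return sum(flip_flop_areas) <= remaining_area


# Hypothetical example: f1 is connected to p1 and p2, each with S = 4.
region_f1 = legal_region([((0, 0), 4), ((4, 0), 4)])
print(region_f1)                    # overlap in rotated coordinates
print(contains(region_f1, (2, 0)))  # M = 2 to both pins
print(contains(region_f1, (5, 0)))  # M = 5 to p1, violates S(p1)
```

Working in rotated coordinates keeps the overlap test to four comparisons per pin, which matters when many candidate flip-flops are examined.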

#### **II. CLOCK DISTRIBUTION NETWORKS**

There are different techniques employed in Integrated Circuit design to minimize the clock skew. A few of the techniques are listed below:

- a. Clock Trees
- b. Single Clock Mesh/Grid
- c. Clock Trees with Multiple Local Meshes

Clock trees work on the principle that the relative phase of the clock at two sinks is more important than the absolute delay in the clock path from the source to the sink. Clock trees are balanced clock distribution networks. The most commonly used clock distribution trees are H-Tree networks.

Clock distribution networks with an H-Tree structure are constructed by recursively connecting H-Tree structures to each other, as shown in Figure 4 : H-Tree Network. The H-Tree structure at each level has lengths equal to half of those of the previous level. For example, if a level 1 H-Tree has lengths $l_1$ and $l_2$, then the level 2 H-Tree will have lengths $l_3 = l_1/2$ and $l_4 = l_2/2$. A simple H-Tree structure is shown in the figure below:



Figure 4 : H-Tree Network

The main advantages of the H-Tree clock distribution network are ideally zero skew, low power, low area and ease of generation. It also has some disadvantages. The sink distribution in an integrated circuit may not be as uniform as the H-Tree structure. There might be a large concentration of sinks at one location and a very low concentration at another. In such a case the capacitive load connected to the H-Tree vertices is highly irregular, which results in non-zero skew among sinks connected to different vertices. However, techniques such as the introduction of dummy loads and wire snaking can be employed to overcome these disadvantages.

Another approach used for clock distribution is the single clock mesh. In this method a grid or mesh is used in the final stage of the clock distribution network. A balanced tree like an H-Tree is used to connect the main clock source to the local grid. The grid/mesh helps to achieve a low local skew, i.e. the skew close to the sinks, and the H-Tree structures help to achieve a low skew in the path from the clock source to the clock grid. Therefore this structure combines the advantages of both the regular H-Tree structure and a grid.

One of the major advantages of the single clock mesh structure is that it achieves very low skew compared to an H-Tree-only structure. This structure also accommodates late design changes, as the grid is easily accessible from various points on the integrated circuit. Therefore the clock mesh can be designed in the early stages.

The major disadvantages of the single clock mesh structure are its huge power consumption and wire length. Also, in this approach it is not possible to selectively turn off the clock (for power reduction) to a certain area of the integrated circuit because of the single mesh structure i.e. clock gating is not possible in this approach. The clock mesh structure is shown in Figure 5-Level H-Tree with Single 8X8 Mesh

Figure 5-Level H-Tree with Single 8X8 Mesh



A third method of clock distribution is Clock Tree with Local Mesh. This is an improvement over the previous clock mesh scheme. In this case instead of having a single mesh, there will be a local mesh for a group of sinks. Thus we would have a number of local meshes. All these local meshes would be connected to the main clock source using balanced trees like H-Trees.

The major benefit of this scheme is that power can be reduced by selectively turning off certain local meshes when they are not in use. This is clock gating. The disadvantage of this scheme is that the clock skew is slightly worse than that of the single clock mesh scheme.

This architecture can be made reconfigurable by connecting two adjacent local meshes by transmission gates. The transmission gates will be turned on only if the two meshes connecting to it are turned on. The insertion of transmission gates brings

### **III. H-TREE CLOCK NETWORK DESIGN**

The clock network based on the H-Tree can be constructed using all the sub-circuits described above. We start by placing the first level H-Tree at the centre of the given sink distribution. The X and Y coordinates of the centre of the first level H-Tree are given by

$$X = X_{\min} + (X_{\max} - X_{\min})/2$$

$$Y = Y_{\min} + (Y_{\max} - Y_{\min})/2$$

where

$X_{\min}$ = the minimum X-coordinate of the given sink distribution

$X_{\max}$ = the maximum X-coordinate of the given sink distribution

$Y_{\min}$ = the minimum Y-coordinate of the given sink distribution

$Y_{\max}$ = the maximum Y-coordinate of the given sink distribution

The lengths of the first level H-Tree are given by

$$l_1 = 2(X_{\max} - X)$$

$$l_2 = 2(Y_{\max} - Y)$$

The first level H-Tree is constructed using an H-Tree structure that supports clock gating, and the subsequent level H-Trees are constructed using a second H-Tree structure. The lengths of the H-Tree branches are halved as we move to the higher H-Tree levels.
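The centre and branch-length formulas above can be worked through in a few lines. The sketch below is illustrative only. The function name `h_tree_parameters` and the sample sink coordinates are hypothetical, and it computes only the centre and the per-level lengths, not the full tree geometry.

```python
def h_tree_parameters(sinks, levels):
    """Return the first-level centre (X, Y) and the branch lengths per level.

    sinks  : list of (x, y) sink coordinates
    levels : number of H-Tree levels
    """
    xs = [x for x, _ in sinks]
    ys = [y for _, y in sinks]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    # Centre of the sink distribution.
    cx = x_min + (x_max - x_min) / 2
    cy = y_min + (y_max - y_min) / 2

    # First-level lengths, then halve them at every subsequent level.
    l1 = 2 * (x_max - cx)
    l2 = 2 * (y_max - cy)
    lengths = [(l1 / 2**k, l2 / 2**k) for k in range(levels)]
    return (cx, cy), lengths


# Hypothetical sink distribution with three sinks and a 3-level H-Tree.
centre, lengths = h_tree_parameters([(0, 0), (8, 4), (2, 2)], levels=3)
print(centre)   # (4.0, 2.0)
print(lengths)  # [(8.0, 4.0), (4.0, 2.0), (2.0, 1.0)]
```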



The First level H-Tree includes a central buffer which is driven by the main clock source. It also consists of clock gates to selectively turn off power to a specific region of the integrated circuit. The signals clk\_cntl0, clk\_cntl1, clk\_cntl2 and clk\_cntl3 are used to enable clock gating. If the clock gating is enabled at any of the leaf nodes of the first level H-Tree then the higher level H-Tree structures connected to that particular node will also be turned off thus saving power.

The connection to the sinks should be made from the vertices (leaf nodes) of the last level H-Tree. In order to connect the sinks, the vertex of the H-Tree closest to each sink must first be determined. Since the benchmark circuits are described in the Manhattan plane, we need to consider the Manhattan distance when connecting the sinks to the H-Tree vertices. The Manhattan distance between two points is obtained by adding the lengths of the projections of the line segment between them onto the coordinate axes. In other words, if point P<sub>1</sub> has coordinates P<sub>1</sub>(X1,Y1) and point P<sub>2</sub> has coordinates P<sub>2</sub>(X2,Y2), then the Manhattan distance between the two points is given by |X1-X2|+|Y1-Y2|.
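The sink-to-vertex assignment described above can be sketched as follows. This is a minimal illustration that picks the closest vertex by exhaustive search. The function names and coordinates are hypothetical, and the source does not state how ties or large sink counts are handled.

```python
def manhattan(p1, p2):
    """Manhattan distance |X1 - X2| + |Y1 - Y2| between two points."""
    return abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])


def assign_sinks(sinks, vertices):
    """Map each sink to (closest H-Tree vertex, Manhattan distance to it)."""
    assignment = {}
    for sink in sinks:
        vertex = min(vertices, key=lambda v: manhattan(sink, v))
        assignment[sink] = (vertex, manhattan(sink, vertex))
    return assignment


# Hypothetical leaf vertices of the last-level H-Tree and two sinks.
leaf_vertices = [(0, 0), (10, 0), (0, 10), (10, 10)]
result = assign_sinks([(2, 3), (9, 8)], leaf_vertices)
print(result[(2, 3)])  # ((0, 0), 5)
print(result[(9, 8)])  # ((10, 10), 3)
```

The returned distance is the value that would be used as the wire length between the vertex and the sink.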

Buffer chains are inserted between the H-Tree vertex and the sink. Buffer chains are used to isolate the load from the H-Tree, so the H-Tree only sees a load equal to the input capacitance of the buffer chain. At the same time, by setting a correct stage ratio we are able to drive the large load capacitance presented by the sinks. The buffers also help to improve the clock slew rate. Without the buffers the clock slew would be very large.
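To illustrate how a stage ratio lets a small input capacitance drive a large load, the sketch below sizes a geometrically tapered buffer chain. This is a common textbook sizing rule and is an assumption here, since the source does not state how its buffers were sized. The function name and the capacitance values, expressed in arbitrary units, are hypothetical.

```python
def buffer_chain(c_in, c_load, stage_ratio):
    """Input capacitance of each stage in a tapered buffer chain.

    Each stage is stage_ratio times larger than the previous one. Stages are
    added until the last stage drives the load with a fan-out of at most
    stage_ratio.
    """
    if stage_ratio <= 1:
        raise ValueError("stage_ratio must be greater than 1")
    stages = [c_in]
    while stages[-1] * stage_ratio < c_load:
        stages.append(stages[-1] * stage_ratio)
    return stages


# Hypothetical example: unit input capacitance, load of 64 units, ratio 4.
chain = buffer_chain(1, 64, 4)
print(chain)  # [1, 4, 16]
```

In this example the H-Tree vertex sees only the first stage (1 unit), while the last stage (16 units) drives the 64-unit sink load.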

A PI-Model is used to connect the output of the buffer chain to the sink. The PI-Model emulates the behavior of the wire segment connecting the buffer chain output to the sink. The length parameter of the PI-Model sub circuit is set equal to the above calculated Manhattan distance. This is indicated in the figure on the next page.
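A PI-Model of a wire is commonly built by placing the total wire resistance between two capacitors that each hold half of the total wire capacitance. The sketch below assumes this standard form and a first-order (Elmore) delay estimate. The per-unit-length resistance and capacitance values are hypothetical and are not taken from the paper.

```python
def pi_model(length, r_per_unit, c_per_unit):
    """Return (C_near, R, C_far) for a wire segment of the given length."""
    r = r_per_unit * length
    c_half = c_per_unit * length / 2
    return c_half, r, c_half


def elmore_delay(length, r_per_unit, c_per_unit, c_sink):
    """First-order delay from the buffer output to the sink.

    Only the far-end capacitance and the sink load are charged through
    the wire resistance.
    """
    _, r, c_far = pi_model(length, r_per_unit, c_per_unit)
    return r * (c_far + c_sink)


# Hypothetical wire: length 8 (the Manhattan distance), r = 0.5, c = 0.25.
print(pi_model(8, 0.5, 0.25))           # (1.0, 4.0, 1.0)
print(elmore_delay(8, 0.5, 0.25, 3.0))  # 16.0
```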

Starting from approximate initial values for the number of H-Tree levels, the buffer sizes and the stage ratio, several simulations had to be run to find the values that give the best clock skew, slew and power.

## IV. SIMULATION RESULTS

The table below summarizes the simulation results for the two test cases r1 (267 sinks) and r5 (3101 sinks) with the H-Tree based clock distribution network. For r1 the results were obtained using three levels of H-Tree, and for r5 the results were obtained using five levels of H-Tree.

| Test case | Number of H-Tree levels | Max skew (ps) | Max rise slew (ps) | Min rise slew (ps) | Max fall slew (ps) | Min fall slew (ps) | Power consumed |
|-----------|-------------------------|---------------|--------------------|--------------------|--------------------|--------------------|----------------|
| R1        | 3                       | 25.46         | 238.17             | 198.9              | 218.16             | 166.0              | 55.08          |
| R5        | 5                       | 28.75         | 226.78             | 166.7              | 197.92             | 144.6              | 650.18         |

Table 1: Phase-I Simulation Results



Figure 1. Waveform showing Maximum and Minimum skew points for R1



Figure 7. Waveform showing Maximum and Minimum skew points for R5

# V. RESULTS AND DISCUSSION

Clock Distribution:



The waveform above shows the different clock signals, with their skews, generated by the clock distribution network. A total of 20 regional clocks are used and applied to different flip-flops. The clock skew is the difference between the maximum and minimum delay.
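The skew definition can be expressed directly in code. The sketch below uses hypothetical sink names and delay values in picoseconds. They are not measurements from the simulations reported here.

```python
def clock_skew(delays):
    """Return (skew, latest sink, earliest sink) for a dict of clock delays."""
    latest = max(delays, key=delays.get)
    earliest = min(delays, key=delays.get)
    return delays[latest] - delays[earliest], latest, earliest


# Hypothetical source-to-sink delays in ps.
sample_delays = {"sink0": 100, "sink1": 125, "sink2": 110}
print(clock_skew(sample_delays))  # (25, 'sink1', 'sink0')
```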

# VI. CONCLUSION

The locations of the flip-flops have been determined and flip-flops have been replaced with multi-bit flip-flops in order to reduce the power consumed. A smaller number of flip-flops means a smaller number of clock sinks during clock tree synthesis. Hence the clock network consumes less power and uses fewer routing paths. Furthermore, replacing smaller flip-flops with larger multi-bit flip-flops leads to device size variation. The driving capability of an inverter-based clock buffer increases appreciably as CMOS process technology advances. The driving capability of a clock buffer can be expressed as the number of minimum-sized inverters that it can drive for given rise and fall times. To avoid unnecessary power waste, several flip-flops can share a common clock buffer.


#### **AUTHORS PROFILE**



T. ARCHANA received her master's degree (M.E., Applied Electronics) from JJ College of Engineering in 2012 and her bachelor's degree in EIE Engineering from Mookambigai College of Engineering in 2007. She has over 10.5 years of teaching experience. She has published many papers in various international journals, international conferences, national journals and national conferences. Her research interests are MEMS, VLSI design and embedded systems. She is currently working as an Assistant Professor in the Department of Electronics and Communication Engineering, Saveetha Engineering College, Thandalam, Chennai.



S. PRAVEENKUMAR is a Professor of ECE at Saveetha Engineering College, Chennai. He has 13 years of teaching experience. He has presented his work at 10 international conferences, and has published 15 international and 5 national proceedings papers and 15 international journal papers. He has received the Best Teacher award in consecutive years. His research areas include MEMS, VLSI, BioMEMS and satellite communication.