# Congestion and Power Integrity Aware Placement and Routing for 3D Packaging

Jacob Minz, Jinwoo Choi, Madhavan Swaminathan, and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology {jrminz, jwchoi, madhavan.swaminathan, limsk}@ece.gatech.edu

*Abstract*— One of the major concerns for a 3-D package is to deal with power supply noise. Decoupling Capacitances (decap) allocation is a powerful technique to suppress power supply noise. In this work we integrate noise analysis and decap estimation in the floorplanning process. We also use the global routers results directly to estimate congestion and tight couple global routing with floorplanning to get a better area/congestion tradeoff. Our experiments prove the quality of our approach. We obtain improvements in both decap amount and congestion with only small increase in area, wirelength and runtime.

# I. INTRODUCTION

The true potential of System-On-Package (SOP) technology illustrated in Figure 1 lies in its capability to integrate both active and passive components into a single high speed/density 3D packaging substrate. 3D packaging offers an order of magnitude saving in area, delay, and power compared to the conventional PCB and MCM technology. We leverage our recently developed 3D packaging layout tool to tackle power supply noise and congestion problem that are seriously threatening the performance and reliability of 3D packaging. Existing approaches consider power supply noise and congestion as an afterthought, which may require excessive amount of decoupling capacitance (= decap) to suppress the Simultaneous Switching Noise (SSN) and increase the overall layout area to alleviate congestion problem. In addition, many iterations are required between full-length SSN/congestion simulation and manual layout repair until we converge to a satisfactory result. Our goal is to overcome this problem with decap/thermalaware 3D layout tools.

Power supply noise is a crucial factor deciding the reliability and ensuring the correct functioning of any circuit. The area of 3-D packaging has seen lot of recent developments. Prototypes have been developed and studies suggest that 3-D integration technologies have a lot of advantages to offer in terms of reduced area of package and system integration. In simple terms, 3-D package can be seen as a multi-layer placement and routing system. The continuing trend of shrinking feature size in recent design, have also led to reduced power supply. This results in reduced noise margin which effects reliability and may even cause functional failures due to spurious transitions. Signal integrity is an important aspect of 3-D packaging systems and must be addressed early on in the design process. Two of the dominating factors reducing signal integrity are power-supply noise and crosstalk. Circuits draw a large volume of instantaneous current during switching which causes voltage swings at the power sources. The swings are compounded by the presence of several switching entities which cause simultaneous switching noise (SSN). A transistor drawing current from a noisy source is likely to cause logic failures due to its decreased drive capability. Hence in order to ensure a high quality design SSN must be suppressed. Recent works have addressed the issue of decoupling capacitor allocation and power supply noise suppression for general 2-D floorplans [1] and standard cell layout [2]. The exact calculation of power-supply noise is however too time consuming to be used for floorplan evaluation. The major contribution of these works is modeling power/ground (P/G) network and an efficient scheme to calculate power supply noise.

The process of Global Routing for 3D packaging is also very different from that of the more conventional technologies (PCB, Standard Cells and MCM). The 3-D global routing is multi-layer like MCM, but unlike MCM, routing must be done between many placement layers. The routers for MCM, described in the literature [3], [4], [5], can be used to develop routing tools for 3-D but issues such as signal integrity, crosstalk, via-minimization,pin assignment and layer assignment must be addressed in the context of the 3-D technology. Estimating congestion to a reasonable level of accuracy during the floorplaning process is atleast as hard as global routing itself. The congestion profile itself is very sensitive to the objectives in global routing. In order to achieve good placement and routing, congestion must be involved during floorplanning.

The contribution of the paper is threefold:

- We model the power supply network of a 3-D package based on a corresponding elegant technique for 2-D floorplans.
- We also study the inclusion of accurate congestion metric in the cost function of floorplaning.
- Simulated Annealing and an intelligent schedule for invoking the exact analyzers is used to do decap and congestion aware floorplanning.

The problem formulation is dealt in Section 2. Noise analysis and decap estimation techniques have been outlined in the Section 3. Congestion estimation has been explained in Section 4. The algorithm overview is given in Section 5. Experimental results are presented in Section 6. Section 7 concludes our



Fig. 1. Illustration of embedded passives and 3D packaging

paper.

## **II. PROBLEM FORMULATION**

Given a set of blocks  $B = \{B_1, B_2, \dots, B_m\}$ , a netlist  $N = \{N_1, N_2, \dots, N_n\}$ , width, height, and maximum switching currents for each block, and the number of placement layers L, and constants  $\alpha$ ,  $\beta$ ,  $\gamma$ , find the location of each block  $x(B_k)$ ,  $y(B_k)$  and placement layer  $l(B_k)$  such that

$$Cost = \alpha \cdot A + \beta \cdot C + \gamma \cdot D$$

is minimized, where A, C and D are the area, congestion and decap allocation costs for the floorplan.

## **III. DECAP ESTIMATION**

#### A. SSN Modeling

The blocks are modeled as a time-varying current source for the purpose of noise analysis [2]. Figure 2 illustrates the model. and consists of voltage sources, current sources, conductances and capacitances. The value of the current  $I_{load}$ is proportional to the switching activity of the block and is given by the function:

$$I_{load} = \begin{cases} 0 & \text{if } t < 0\\ \mu \cdot t & \text{if } t < t_P\\ \mu(2t_p - t) & \text{if } t < 2t_p\\ 0 & \text{if } t > 2t_p \end{cases}$$

The noise  $V_n$ , can be defined in simple terms as the amount of voltage drop that the block sees, from the actual supply voltage of V dd.  $V_{dd} - V_n$  is the effective voltage source value as seen by the block. The noise may be detrimental to the circuit functioning if the noise goes below the supply-voltage tolerance of the block. Allocating on-chip decoupling capacitor is a powerful technique to keep the noise to a tolerable limit. With the allocation of decap  $C_d$ , the load voltage seen by the circuit is given by

$$\frac{V_{load}}{V_{dd}} = 1 - \frac{\mu}{G_g} \left( t - \frac{C_d}{G_g} (1 - e^{-t/\tau}) \right)$$



Fig. 2. Representation of the blocks and the power supply network

where

$$\tau = \frac{(C_g + G_d)C_d}{G_g G_d}$$

The magnitude of the maximum noise  $V_{noise} = V_{dd} - V_{min}$ is given by

$$\frac{V_{noise}}{V_{dd}} = \frac{\mu}{G_g} \left( t_p - \frac{C_d}{G_g} (1 - e^{-t_p/\tau}) \right)$$

From the given equation it is clear that in order to reduce noise for a given block we should drastically increase the second term in the above equation. In order to reduce

$$\frac{C_d}{G_g}(1 - e^{-t_p/\tau})$$

we should increase  $C_d$  while decreasing  $G_g$  and  $G_d$ .  $G_g$  and  $G_d$  can be decreased by placing the decap and the current source closer to the power source. The decap  $C_d$  can be allocated by dedicating a small area in the floorplan. Since this means an overhead in terms of increased floorplan area, it is also desired that the decap allocation is minimized while satisfying the power supply noise tolerance.

## B. 3D Power Supply Noise Modeling

The analysis above computed the noise and decap for a single block. In case of a floorplan containing numerous blocks, the noise experienced by a single block is also affected by the presence of other current consuming blocks. This phenomenon is called the *Simultaneous Switching Noise* (SSN). To compute the exact SSN for the blocks is a computationally intensive process, requiring detailed modeling of the layout and lengthy simulations. In order to do a quick evaluation of SSN for a given floorplan, a simpler model is required. The model of interest is described in [1].

The power is provided to the multilayer floorplans by multilayer power/ground (P/G) planes. The power planes are connected to each other through highly conducting vias. One of the extremal placement layers (usually the bottom) connect to the actual power pins. The P/G network for a 3-dimensional floorplan is modeled as a 3-D grid graph as shown in Figure 3. Each placement layer in the multi-layer floorplan is represented as a mesh. The mesh is connected by edges which represent the via in the P/G network. The edges in the mesh have inductive and resistive impedances. The mesh contains power-supply points and connection points. The connection



Fig. 3. 3D power supply modeling

points consume currents. The current is drawn from all the sources by the consumers and the current drawn along a path is inversely proportional to the impedance of the path in the power supply mesh. If for a particular block  $I_1, I_2, \dots, I_N$  be the currents drawn from N power sources in the grid then

$$I_1 + I_2 + \dots + I_N = I$$

where I is the switching current demand of the block. Then,

$$I_1 Z_1 = I_2 Z_2 = \cdots = I_N Z_N$$

where  $Z_i$  is the impedances of path *i*. If  $Y_j = 1/Z_j$  then

$$I_j = \frac{Y_j}{\sum_{i=1}^N Y_i} I, \ 1 \le j \le N$$

The current distribution for all blocks can be calculated using the above equations. If  $\{P_1, P_2, \dots, P_j\}$  be the current paths under consideration for a consumer block  $B_k$ , then the current distribution on the paths can be found by the above equations.

Let  $P_k$  be the dominant current path. Then  $T^k = \{P_j : P_j \cup P_k \neq \emptyset\}$  denotes the set of paths overlapping with  $P_k$ . Let  $P_{jk}$  be the overlapping between path  $P_j$  and  $P_k$ . After the current paths and their values have been determined for all blocks, the SSN for  $B_k$  is given by

$$V_{noise}^{k} = \sum_{P \in T^{k}} (i_{j} \cdot R_{P_{jk}} + L_{P_{jk}} \frac{di_{j}}{dt})$$

where  $i_j$  is the current in the path  $P_{jk}$ , which is the sum of all currents through this path to various consumer. The weigths of  $i_j$  and its ate of change are the resistive and inductive components of the path.

Let  $Q^k$  denote the maximum charge drawn from the power supply by block  $B_k$ . If  $\theta = \max(1, V_{noise}^k/V_{noise}^{lim})$ , where  $V_{noise}^{lim}$  is the noise tolerance, the decap allocated to block  $B_k$ is given by

$$C^{k} = \frac{(1 - 1/\theta)Q^{k}}{V_{noise}^{lim}}, \ 1 \le k \le M$$

The decap cost is given by

$$D = \sum_{k=1}^{M} C^k \tag{1}$$

Fig. 4. SSN and Decap Estimation

It has been shown experimentally in [1] that a fair enough estimate of the SSN can be done by considering the shortest and the second shortest paths of currents from the power supply points to the connection points.

The pseudocode for decap estimation is given in Figure 4. The dominant source is defined as the voltage source supplying significantly more power to the block than the other neighbouring sources. Dominant Path for the block is a path from the supply to the block causing the most drop in voltage. In the noise analysis of the 3-D floorplans, the dominant sources and paths for each block is found out. The current is distributed in the paths according to the expressions outlined earlier. After this procedure, the noise and decap can be estimated using the formula given.

## **IV. CONGESTION ESTIMATION**

## A. 3D Congestion Modeling

In the 3-D package we also allow routing to be multi-layer between the placement layers. We model the routing resource between placement layers, as a 3-D grid graph. The nodes in the graph represent the routing area and the edges denote the adjacency between the bondaries of the routing area. A route taken by the net is a path in this grid graph in case the net has two pins or a tree in case the net is multi-pin. We observe that a net can have its entire pin in the single placement layer or the pins may be located in different placement layers. For the purpose of congestion analysis we segment the nets, such that the subnets reside in a single routing interval, which is defined as the region between the two placement layers. The routing density of the edge in the grid graph is defined to be the total number of nets utilizing the edge. Let  $G_i = (V_i, E_i)$ be the grid graph representing the routing resource model of routing interval i. Suppose  $T_i(N)$  be the set of subnets for the routing interval i. Then the routing density of an edge e in  $G_i$ is

$$d_e^i = \sum_{n \in T_i(N)} x_e, \ e \in E_i$$

Then,  $d^i = \max_{e \in E_i} d^i_e$  is the local congestion in routing interval *i*. The congestion of the 3-D floorplan with *L* placement layers is given by

$$C = \sum_{i=0}^{L-1} d^{i}$$
 (2)

The value of the congestion D determines the number of layers required for routing and is a very important factor during the manufacturing process. A more uniform usage of the grid graph results in lesser congestion. We use results of a fast global router to estimate the exact value of congestion. In the following paragraph we outline the steps of the router.

The global routing is done in many steps [6], [7], [8] as illustrated in Figure 5, the essential ones of which are:

- Net Segmentation: Nets traversing multiple routing intervals are segmented into subnets which are part of the net confined to the single interval.
- Pin Generation: Pins are generated for nets which also defines the locations where the net enters or exits the interval.
- 3) Net Distribution: Routing Interval is selected for the nets which have a choice of the routing regions. For example, nets having all its pin the same floorplan layer can either be routed above or below this layer, if this layer is not the bottommost or topmost of the package.
- 4) Detailed Pin Distribution: The pins are assigned a legal location in the routing interval. Special care is taken to distribute the pins uniformly in the routing interval.
- 5) Topology Generation: A topology for each subnet is generated in the local grid graph [5].

#### B. Fast 3D Congestion Estimation Algorithm

Since congestion is a part of the cost function used in the floorplanner, its value needs to be determined after each move. However calling the full-length global router at each move to estimate congestion is not feasible. In order to tackle this problem, the optimization routine runs complete global routing on the candidate configuration only at certain points according to the scheduling policy. At all other moves the value of congestion must be estimated. We have a very simple method of estimating congestion based on the previous values of actual congestions. Let us assume a very simple policy for calling the global router for accurate congestion analysis, i.e after every fixed number of moves (say m). Let the values of area, wirelength, decap and congestion at the  $n^{th}$  call be a1, w1, d1, c1 and at  $n + 1^{th}$  is a1, w1, d1, c1. If at move b (n + m < b < n + 2m), the parameters of the configuration is a3, w3, d3, c3, c3 can be estimated as

$$c3 = c2 + \Delta \left[ \alpha (\frac{a2 - a3}{a2 - a1}) + \beta (\frac{w2 - w3}{w2 - w1}) + \gamma (\frac{d2 - d3}{d2 - d1}) \right]$$

where  $\Delta c = |c2 - c1|$ . Its to be noted that c3 is only an approximate value for congestion at move b. The estimated value of congestion is reasonable close to actual value only if certain conditions are fulfilled. Namely, the next move is a



small perturbation of the current one and doesn't result in value swings of the individual cost parameters. Since in a simulated annealing based floorplanner there is no sure and easy way of ensuring the second condition, the estimated value maybe quite erroneous in practice. A serious consequence of this technique is that a wrongly estimated cost function is selected as the best solution while the actual value of the cost is quite disparate with the estimated one. The solution is to allow the value of estimated congestion to vary only within a certain range  $(c^2 - L < c^3 < c^2 + L)$ . This prevents a pathological configuration from becoming the best solution.

#### V. DECAP AND CONGESTION MINIMIZATION

Simulated Annealing is a very popular approach for floorplanning [9] because of it high solution quality. Another motivation for using this approach is its flexibility in handling various types of cost functions. As in 2-D floorplan the efficacy of the florplanning algorithm depends on its representation. Several floorplan representations have been presented in the literature. Some of the well known ones are polish expression for slicing floorplans [9], sequence-pairs [10], O-tree [11], B\* tree [12] for general non-slicing floorplans. The N-layered 3-D floorplan [13] is represented by N sequence pairs, each one representing the floorplan at each level.

A new floorplan is generated by defining some moves based on the floorplan representations. Since calling the global router is too expensive at each cost evaluation, a schedule is defined for calling the global router at some predefined intervals.

Figure 6 presents the pseudocode for the optimization routine.Simulated annealing procedure starts with an initial multilayer floorplan and old cost of infinity. Based on the



Fig. 5. 3D global routing

current configuration the area, wirelength and congestion is calculated. If the schedule allows for the current move, the global router is called and the exact congestion is calculated, or else an approximate value of the congestion is estimated is based on previous accurate values of congestion and other metrics as has been outlined in the previous section. The new cost is the weighted sum of all the metrics. If the new cost is lower than the old one, the solution is accepted and the old cost is assigned the new cost. If the new cost is more than the old cost, the solution is accepted with a probability given by an exponential function of the difference of the old and the new cost. Temperature levels are defined before hand and at each level certain numbers of moves are made. The temperature levels are decreased by a multiplicative factor. The algorithm stops when the stopping or the freezing temperature is reached.

# VI. EXPERIMENTAL RESULTS

We implemented the proposed algorithms and analysis tools using C++/STL. Our program was evaluated using the GSRC benchmarks. The current demands for the blocks were randomly generated. We designed several experiments to test the efficiency of our algorithms. Twenty temperature levels are defined and hundred moves are made in each temperature level. We chose our baseline as the layout optimized for wirelength. We define area utilization as the ratio of the area of the sum of the block areas divided by the number of layers to the area of the multi-layer floorplan. Therefore higher utilization means better solution quality in terms of area.

## A. Decap/Congestion Optimization Results

We compare our congestion driven and congestion driven floorplanning in terms of the four quality metrics, area utilization, wirelength, decap amount and congestion in Table I. The baseline is compared with the area, wirelength, decap and congestion driven floorplanning using frequency of analyzer calls per 50 moves, in Table II. The impact of the frequency of global router calls is tabulated in Table III, where the solution is measured in terms of decap and congestion. For the calculation of the cost function, the area and wirelength are normalized using a fixed constant. The weights used for the area, wirelength, decap and congestion are 1,1,2 and 1 for the multi-objective floorplanning. For the case of congestion driven optimization we let the weight of congestion in the cost function to vary between 1 and 4.

The results obtained are highly sensitive to the weights of the metrics in the cost factor. In the tables we report the solution, keeping weights of the solution quality metrics constant across the benchmarks. However we noticed that the solution of all circuits can be improved by fine tuning the parameters. We do notice some predictable trends in the tables which can be summarized as follows.

- The decap driven floorplaning gives reduced decap (average of 20% and maximum of 25%) for all circuits with only a small percentage decrease in area utilization(19%), since the decap is calculated at each move.
- The congestion optimization is highly sensitive to parameters, since we are also using approximate values to accept solutions. There is 3% improvement in congestion. The maximum improvement of congestion for the benchmarks is 15%.
- We were able to achieve improvements in both congestion and decap over the baseline with only a slight increase in area (23% on the average) and wirelength (15% on the

TABLE IV DECAP-CONGESTION CORRELATION

|       | 1           |         | 1 /              |      |  |  |
|-------|-------------|---------|------------------|------|--|--|
|       | decap · con | gestion | decap/congestion |      |  |  |
| ckt   | range       | ave     | range            | ave  |  |  |
| n50   | [0, 25.8]   | 5.05    | [0, 0.41]        | 0.11 |  |  |
| n50b  | [0, 17.7]   | 3.33    | [0, 0.81]        | 0.13 |  |  |
| n50c  | [0, 31.2]   | 4.53    | [0, 0.44]        | 0.10 |  |  |
| n100  | [0, 59.4]   | 9.05    | [0, 0.72]        | 0.11 |  |  |
| n100b | [0, 52.0]   | 9.32    | [0, 0.50]        | 0.12 |  |  |
| n100c | [0, 64.8]   | 10.55   | [0, 1.00]        | 0.11 |  |  |
| n200  | [0, 123.7]  | 19.03   | [0, 1.12]        | 0.10 |  |  |
| n200c | [0, 131.6]  | 17.19   | [0, 1.30]        | 0.14 |  |  |
| n300  | [0, 175.5]  | 28.28   | [0, 1.00]        | 0.11 |  |  |

average) with only a small increase in runtime.

• There is a general trend for congestion values to get better with increased frequency of full-length calls but each circuit has an "inflexion point" where the solution is better than others. With decreasing frequency of the calls, congestion gets worse (4% on the average per calls per 25 moves) and decap gets better (2% on the average per calls per 25 moves).

The numbers in the table suggest a correlation between decap and congestion. In order to test the accuracy of this observation we ran tests for investigating correlation between congestion and decap. This was done on a per block basis. We defined local congestion of the block as the congestion in the area spanned by the block. Table IV gives the range of the decap-congestion products and range and their average over the number of blocks. The difference of the average from the midpoint of the range gives us a measure of correlation between decap and congestion. As can be seen from the table there is poor correlation between the two metrics. This makes congestion analysis necessary during multi-objective optimization during floorplanning

## B. Power Supply Noise Simulation Results

In this section, power supply noise was computed for three different cases. The modeling method in [14] was applied to model vdd/gnd planes. This modeling method uses a cavity resonator model to represent a plane pair as an electromagnetic system. Since vdd/gnd plane pair acts as a cavity resonator at high frequencies, it uses the equivalent circuit in [14] expressed in terms of C, L, G parameters whose values are directly derived from the analytical expression in [15]. The capacitance 'C' is used for storage of electric energy and inductance 'L' is used for storage of magnetic energy, where at the resonant frequency, there is an exchange of energy between the two elements, which forms a resonator circuit. The conductance 'G' is used to account for losses in conductor and dielectric. The equivalent circuit in [14] models the vdd/gnd plane pair as a waveguide that is coupled to the various natural modes of the resonators through transformers. Based on the physical parameters such as width and length of the vdd/gnd plane, dielectric thickness, permittivity and permeability of the dielectric, loss tangent of dielectric, conductance of the plane, port location and size, the circuit for the planes can be constructed with other components. Ports represent positions on the plane pair where either a current source exists or a voltage is to be measured. This method was then extended to multiple plane pairs under the skin effect approximation which allows each plane pair to be modeled separately and recombined during simulation using a conventional circuit simulator such as HSPICE.

Table V shows power supply noise simulation results for non-optimized case without any decoupling capacitor. In this case, plane structure size is 246 mm x 254 mm and vdd/gnd plane pair was modeled using cavity resonator model and simulated in HSPICE. The plane pair has 14 ports. The DC 5 V source are located at 4 edges in the plane pair and 14 current sources exist in the plane pair. The highest current source is located at (x=46 mm, y=31 mm) with value of 2.87241 [A], the second highest current source is located at (x=89 mm, y=213 mm) with value of 2.31646 [A], the third highest current source is located at (x=21.5 mm, y=198 mm) with value of 1.73231 [A]. The power supply noise was computed at 14 ports on the plane pair. Table 1 shows the locations of the current sources and power supply noise values. Table V shows power supply simulation results for optimized case without any decoupling capacitor. In this case, plane structure size is 292 mm x 295 mm and vdd/gnd plane pair was modeled using cavity resonator model and simulated in HSPICE. The plane pair has also 14 ports. The DC 5 V sources are also located at the edges in the plane pair and 14 current sources exist in the plane pair. The highest current source is located at (x=46 mm, y=31 mm) with value of 2.87241 [A], the second highest current source is located at (x=229 mm, y=25 mm) with value of 2.31646 [A], the third highest current source is located at (x=21.5 mm, y=245 mm) with value of 1.73231 [A].

The power supply noise was computed at 14 ports in the the plane pair. Table V shows power supply noise simulation results for optimized case with decoupling capacitors. The decoupling capacitor of 32 nF is connected with each current source in parallel in the plane pair. So, a total of 14 decoupling capacitors were used in this simulation. Compared to the case without any decoupling capacitor, power supply noise is reduced pretty much, which proves the effectiveness of decoupling capacitor for reducing power supply noise.

As can be seen from the Table V, the SSN for the decap optimized case is lower than the one optimized only for area. The SSN of the noisiest block blk4 with noise of 1.58 V is reduced to 1.42 V which is still above the tolerance level of 1 V. With the inclusion of decap the noise is supressed to 0.22 V, which is within the tolerance of the layout. The noise of blk11 is increased from 1.43 V to 1.44 Vwhich is the result of change of locations but is again brought down within limits using a small amount of decap. The largest amount of decap is used for blk5 (0.50 nF), because of an increase in its SSN after optimization. The numbers show that the SSN was efficiently suppressed and the amount of decap reduced by using our algorithms. Figure 7 shows the comparison between SSN with and without decap insertion.

| TABLE I                                               |
|-------------------------------------------------------|
| AREA/WIRE-DRIVEN VS DECAP-DRIVEN VS CONGESTION-DRIVEN |

|       | 1.    |     | 1    | · · ·     | 1 .      |      | 1    | 1      |        |      |      |           |           |      |
|-------|-------|-----|------|-----------|----------|------|------|--------|--------|------|------|-----------|-----------|------|
|       | ckts  |     |      | area/wire | e-driven |      |      | decap- | driven |      |      | congestio | on-driven |      |
| name  | size  | lyr | util | wire      | decap    | cong | util | wire   | decap  | cong | util | wire      | decap     | cong |
| n50   | 50    | 4   | 0.81 | 47599     | 26.7     | 23   | 0.57 | 58503  | 19.9   | 25   | 0.44 | 62514     | 49.1      | 23   |
| n50b  | 50    | 4   | 0.73 | 45711     | 27.1     | 27   | 0.62 | 52566  | 18.5   | 24   | 0.32 | 61222     | 51.45     | 23   |
| n50c  | 50    | 4   | 0.71 | 52804     | 30.0     | 26   | 0.60 | 51638  | 16.4   | 24   | 0.49 | 54306     | 37.0      | 24   |
| n100  | 100   | 4   | 0.78 | 84469     | 90.6     | 40   | 0.59 | 112131 | 75.5   | 38   | 0.58 | 105136    | 106.60    | 44   |
| n100b | 100   | 4   | 0.71 | 69554     | 98.6     | 34   | 0.60 | 88277  | 78.3   | 33   | 0.49 | 89359     | 102.0     | 38   |
| n100c | 100   | 4   | 0.74 | 82728     | 100.6    | 41   | 0.66 | 105769 | 71.7   | 32   | 0.41 | 117786    | 116.9     | 36   |
| n200  | 200   | 4   | 0.81 | 171096    | 226.3    | 87   | 0.64 | 211171 | 209.6  | 67   | 0.26 | 246638    | 264.0     | 82   |
| n200b | 200   | 4   | 0.81 | 181526    | 233.2    | 83   | 0.67 | 221619 | 214.5  | 71   | 0.43 | 249897    | 257.6     | 82   |
| n200c | 200   | 4   | 0.80 | 168831    | 237.4    | 62   | 0.69 | 195118 | 214.4  | 67   | 0.80 | 168831    | 237.4     | 62   |
| n300  | 300   | 4   | 0.84 | 286218    | 393.8    | 100  | 0.63 | 178150 | 382.7  | 90   | 0.61 | 331029    | 402.9     | 94   |
| F     | RATIO |     | 1.00 | 1.00      | 1.00     | 1.00 | 0.81 | 1.12   | 0.80   | 0.92 | 0.62 | 1.26      | 1.25      | 0.97 |
| -     | ГІМЕ  |     |      | 32        | 2        |      |      | 40     | C      |      |      | 7         | 0         |      |

#### TABLE II

AREA/WIRE-DRIVEN VS AREA/WIRE/DECAP/CONGESTION-DRIVEN

|       |      | area/wire | e-driven |      | area/wire/decap/congestion-driven |        |       |      |
|-------|------|-----------|----------|------|-----------------------------------|--------|-------|------|
| ckt   | util | wire      | decap    | cong | util                              | wire   | decap | cong |
| n50   | 0.81 | 47599     | 26.7     | 23   | 0.64                              | 53751  | 35.0  | 24   |
| n50b  | 0.73 | 45711     | 27.1     | 27   | 0.59                              | 48185  | 34.0  | 23   |
| n50c  | 0.71 | 52804     | 30.0     | 26   | 0.54                              | 58224  | 26.9  | 28   |
| n100  | 0.78 | 84469     | 90.6     | 40   | 0.63                              | 101458 | 83.8  | 38   |
| n100b | 0.71 | 69554     | 98.6     | 34   | 0.60                              | 81151  | 86.2  | 31   |
| n100c | 0.74 | 82728     | 100.6    | 41   | 0.73                              | 89316  | 91.4  | 46   |
| n200  | 0.81 | 171096    | 226.3    | 87   | 0.62                              | 207946 | 224.1 | 79   |
| n200b | 0.81 | 181526    | 233.2    | 83   | 0.65                              | 219148 | 228.2 | 86   |
| n200c | 0.80 | 168831    | 237.4    | 62   | 0.66                              | 200902 | 230.6 | 69   |
| n300  | 0.84 | 286218    | 393.8    | 100  | 0.62                              | 334278 | 398.5 | 95   |
| RATIO | 1.00 | 1.00      | 1.00     | 1.00 | 0.81                              | 1.15   | 1.01  | 0.99 |
| TIME  |      | 32        | 2        |      | 67                                |        |       |      |

TABLE III

RUNTIME VS QUALITY TRADEOFF. IMPACT OF THE FREQUENCY OF FULL-LENGTH ANALYSIS ON DECAP AND CONGESTION

|       | frq = 25 |      | frq = 50 |      | fra = | = 75 | frq = 100 |      |
|-------|----------|------|----------|------|-------|------|-----------|------|
| ckt   | decap    | cong | decap    | cong | decap | cong | decap     | cong |
| n50   | 35.1     | 24   | 35.0     | 24   | 34.9  | 31   | 30.7      | 26   |
| n50b  | 34.7     | 26   | 34.0     | 23   | 27.8  | 23   | 31.0      | 26   |
| n50c  | 27.1     | 24   | 26.9     | 28   | 31.0  | 25   | 28.7      | 26   |
| n100  | 93.6     | 43   | 83.8     | 38   | 83.3  | 49   | 85.4      | 43   |
| n100b | 99.5     | 33   | 86.2     | 31   | 92.0  | 39   | 84.3      | 44   |
| n100c | 88.2     | 38   | 91.4     | 46   | 87.4  | 50   | 84.3      | 50   |
| n200  | 226.0    | 71   | 224.1    | 79   | 224.9 | 89   | 229.3     | 83   |
| n200b | 234.9    | 81   | 228.2    | 86   | 225.1 | 80   | 226.8     | 81   |
| n200c | 237.4    | 62   | 230.6    | 69   | 228.5 | 68   | 231.5     | 73   |
| n300  | 399.4    | 93   | 398.5    | 95   | 395.6 | 86   | 400.6     | 98   |
| RATIO | 1.00     | 1.00 | 0.97     | 1.04 | 0.96  | 1.06 | 0.95      | 1.12 |
| TIME  | 75       |      | 67       |      | 60    |      | 53        |      |

## VII. CONCLUSION

We have extended power-supply network modeling technique for 3-D floorplans. We also tried to address the issue of congestion during the floorplanning process. We have outlined a technique to use runtime intensive analyzers during the optimization process and making intelligent approximation of the metrics based on those analysis. Our results show that good results can be obtained and runtime reduced by carefully choosing the weights in the cost function. We also validated the results for decap optimization using accurate analytical tools.

## References

- S. Zhao, C.-K. Koh, and K. Roy, "Decoupling capacitance allocation and its application to power supply noise aware floorplanning," *IEEE Trans. on Computer-Aided Design*, pp. 81–92, 2002.
- [2] H. Su, S. Sapatnekar, and S. R. Nassif, "An algorithm for optimal decoupling capacitor sizing and placement for standard cell layouts," in *Proc. Int. Symp. on Physical Design*, 2002, pp. 68–73.
- [3] J. Cong, "Pin assignment with global routing for general cell design," *IEEE Trans. on Computer-Aided Design*, pp. 1401–1412, 1991.
- [4] J. Cong, A. B. Kahng, and K.-S. Leung, "Efficient algorithms for the minimum shortest path steiner arborescence problem with applications to VLSI physical design," *IEEE Trans. on Computer-Aided Design*, pp. 24–39, 1999.
- [5] K. Khoo and J. Cong, "An efficient multilayer mcm router based on fourvia routing," *IEEE Trans. on Computer-Aided Design*, pp. 1277–1290, 1995.
- [6] J. Minz and S. K. Lim, "Layer assignment for system on packages," in Proc. Asia and South Pacific Design Automation Conf., 2004.



Fig. 7. Power supply noise simulation with and without decap insertion.

#### TABLE V

Power supply noise simulation results. We report SSN noise for each block placed in the top placement layer. (A)

Non-optimized case without any decoupling capacitor. From Table I we need 26.7 decap units to suppress SSN noise. (b) Optimized case without any decoupling capacitor. From Table I we need 19.9 decap units. (c) Optimized case with decoupling

CAPACITORS INSERTED.

|       | no decap | decap | decap |
|-------|----------|-------|-------|
| blk   | aware    | aware | used  |
| blk1  | 1.21     | 1.01  | 0.16  |
| blk2  | 1.31     | 1.15  | 0.17  |
| blk3  | 1.33     | 1.18  | 0.20  |
| blk4  | 1.58     | 1.42  | 0.22  |
| blk5  | 1.26     | 1.41  | 0.50  |
| blk6  | 1.54     | 1.27  | 0.31  |
| blk7  | 1.47     | 1.33  | 0.43  |
| blk8  | 1.33     | 1.21  | 0.36  |
| blk9  | 1.44     | 1.16  | 0.32  |
| blk10 | 1.34     | 1.38  | 0.32  |
| blk11 | 1.43     | 1.44  | 0.20  |
| blk12 | 1.41     | 1.35  | 0.33  |
| blk13 | 1.37     | 1.27  | 0.27  |
| blk14 | 1.05     | 1.15  | 0.22  |

- [7] J. Minz, M. Pathak, and S. K. Lim, "Net and pin distribution for 3D package global routing," in *Proc. Design, Automation and Test in Europe*, 2004.
- [8] R. Ravichandran, J. Minz, M. Pathak, S. Easwar, and S. K. Lim, "Physical layout automation for system-on-packages," in *IEEE Electronic Components and Technology Conference*, 2004.
- [9] D. F. Wong and C. L. Liu, "Floorplan design of VLSI circuits," *Algorithmica*, pp. 263–291, 1989.
- [10] H. Murata, K. Fujiyoshi, S. Nakatake, and Y. Kajitani, "Rectangle packing based module placement," in *Proc. IEEE Int. Conf. on Computer-Aided Design*, 1995, pp. 472–479.
- [11] P.-N. Guo, C.-K. Cheng, and T. Yoshimura, "An o-tree representation of nonslicing floorplan and its applications," in *Proc. ACM Design Automation Conf.*, 1999, pp. 268–273.
- [12] Y. Chang, Y. Chang, G. Wu, and S. Wu, "B-trees: A new representation for nonslicing floorplans," in *Proc. ACM Design Automation Conf.*, 2000, pp. 458–463.
- [13] P. H. Shiu, R. Ravichandran, S. Easwar, and S. K. Lim, "Multi-layer floorplanning for reliable system-on-package," in *Proc. IEEE Int. Symp.* on *Circuits and Systems*, 2004.
- [14] N. Na, J. Choi, M. Swaminathan, J. P. Libous, and D. P. O'Connor, "Modeling and simulation of core switching noise for asics," *IEEE Trans. Advanced Packaging*, pp. 4–11, 2002.

[15] T. Okoshi, Planar Circuits for Microwaves and Lightwaves. Springer-Verlag, 1985, ch. 2.