Maßberg, Jens Uwe: Facility Location and Clock Tree Synthesis. - Bonn, 2010. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.

Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5N-20942

@phdthesis{handle:20.500.11811/4558,

urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5N-20942},

author = {{Jens Uwe Maßberg}},

title = {Facility Location and Clock Tree Synthesis},

school = {Rheinische Friedrich-Wilhelms-Universität Bonn},

year = 2010,

month = apr,

note = {The construction of clock trees and repeater trees is a major challenge in chip design. Such trees distribute an electrical clock signal from a source to a set of sinks on a chip. Recent designs can contain millions of repeater trees, each with a few up to several hundred sinks, and several clock trees with up to several hundred thousand sinks. In repeater trees the signal has to arrive at each sink no later than an individual required arrival time, while in clock trees it has to arrive at each sink within an individual required arrival time window. In this thesis, we present new theory and algorithms for the construction of clock trees and repeater trees and an essential sub-problem, the Sink Clustering Problem. We also describe our clock tree construction tool BonnClock, which has been used by IBM Microelectronics for the design of hundreds of highly complex chips.

First, we introduce the Sink Clustering Problem, the main sub-problem of clock tree design. Given a metric space $(V,c)$, a finite set $D$ of terminals with positions $p(v) \in V$ and demands $d(v) \in \mathbb{R}_{\ge 0}$ for all $v \in D$, a facility opening cost $f \in \mathbb{R}_{>0}$ and a load limit $u \in \mathbb{R}_{>0}$, the task is to find a partition $D = D_1 \cup \dots \cup D_k$ of $D$ and, for all $1 \le i \le k$, a Steiner tree $S_i$ for $\{p(v) \mid v \in D_i\}$. Each cluster $(D_i, S_i)$, $1 \le i \le k$, has to keep the load limit, that means $\sum_{e \in E(S_i)} c(e) + \sum_{s \in D_i} d(s) \le u$. The goal is to minimize the weighted sum of the length of all Steiner trees plus the number of clusters, i.e. minimize $\sum_{i=1,\dots,k} \left( \sum_{e \in E(S_i)} c(e) \right) + kf$.

We present the first constant-factor approximation algorithm for the Sink Clustering Problem. It is based on decomposing a minimum spanning tree on the sinks and has an approximation guarantee of $1 + 2\alpha$, where $\alpha$ is the Steiner ratio of the underlying metric. Moreover, we introduce two variants of the algorithm that rely on decomposing an approximate minimum Steiner tree and an approximate minimum traveling salesman tour. These algorithms have approximation guarantees of $3\beta$ and $3\gamma$, respectively, where $\beta$ and $\gamma$ are the approximation guarantees of the Steiner tree and TSP approximation algorithms, respectively. We also propose two post-optimization algorithms that can further improve an existing clustering.

We analyze the structure of the Sink Clustering Problem and exhibit its connections to matroid theory. In particular, we use the property of matroids that for any two bases $B_1, B_2$ there is a bijection $p : B_1 \to B_2$ so that $(B_1 \setminus \{b\}) \cup \{p(b)\}$ is again a basis for each $b \in B_1$.

We replace each Steiner tree of an optimum solution by a minimum spanning tree and connect all trees to a new artificial vertex $s$ and get a tree $S$. In a modified metric the total length of $S$ is a good lower bound for the cost of an optimum solution. Due to the matroid property we can compare a minimum spanning tree $T$ on $D \cup \{s\}$ with $S$; the length of any edge of $T$ is bounded by the length of an edge of $S$. We introduce the concept of $K$-dominated functions that helps us to increase the `cost' of certain edges of $T$ while still having the property that the total length of all edges of $T$ ending in a vertex of $K \subseteq D$ is bounded by the total length of all edges of $S$ ending in a vertex of $K$. Applying this procedure to the sets of a laminar family on $D$ yields an improved lower bound.

The bound can be further improved by combining it with a lower bound for the length of a minimum Steiner tree on $D$. For this bound we prove the following lemma: For any family of trees $\mathcal{T} = \{T_1, \dots, T_k\}$ with $V(T_i) \subset D$, $1 \le i \le k$, with the property that for any subset $\mathcal{T}' \subseteq \mathcal{T}$ the trees in $\mathcal{T}'$ cover at least $|\mathcal{T}'| + 1$ vertices, there exists an edge $e_i \in E(T_i)$ for $i = 1, \dots, k$ such that these edges $E = \{e_i \mid 1 \le i \le k\}$ form a forest, i.e. the set does not contain an edge twice and it does not contain a circuit.

Our experimental results on real-world instances from clock tree design show that the cost of the solutions computed by our algorithms is on average only 10\% over the best lower bound. Moreover, we compare our algorithm to another clustering algorithm used in industry. The results show that the total cost of our solutions is 10\% less than the cost of the solutions computed by the competing tool.

Clock trees have to satisfy several timing constraints. More precisely, the signal has to reach each sink within an individual required arrival time window. Sinks can only be clustered together if their required arrival time windows have a point of time in common. Typically, all required arrival time windows are the same. In this case we have the Sink Clustering Problem defined above. However, there are clock trees where the sinks have different required arrival time windows. This motivates a generalization of the Sink Clustering Problem where each sink additionally has an individual time window. As a further constraint, the time windows of the sinks of a cluster must have at least one point of time in common. We study the Sink Clustering Problem with Time Windows and present a polynomial $O(\log s)$-approximation algorithm for this problem, where $s$ is the size of a minimum clique partition in the interval graph induced by the time windows. Our algorithm is based on a divide and conquer approach and uses the approximation algorithms for the Sink Clustering Problem on subsets of the instance. We show that the approximation guarantee of the algorithm is tight.

For the practical construction of clock trees we present our algorithm BonnClock. BonnClock builds a clock tree combining a bottom-up clustering and a top-down partitioning strategy. In the bottom-up phase BonnClock uses the Sink Clustering Algorithm to determine the drivers of unconnected sinks or inverters. The `global' topology of the tree is determined by the top-down partitioning considering big blockages and timing restrictions. BonnClock uses a dynamic program to determine the sizes of the inverters that are inserted. All components of the algorithm are discussed in detail.

As part of this thesis, we have also implemented this algorithm. BonnClock has become the standard tool to construct clock trees within IBM. We show experimental results with comparisons to another industrial clock tree construction tool and to lower bounds for the power consumption. It turns out that, mainly due to the Sink Clustering Algorithm, our power consumption is much smaller than with the other tool and only one third over the lower bound.

Finally, we consider the repeater tree construction problem. In contrast to clock trees, each sink has a latest required arrival time instead of a time window. We describe a simple algorithm to build such trees where we insert the sinks one by one into an existing tree. Depending on the optimization goal we show a variant of the algorithm computing trees of almost optimal length or trees with guaranteed best possible performance.

Moreover, we analyze the topology of trees with best or almost best performance more closely. Such trees are equivalent to minimax and almost minimax trees: Let $a_1, \dots, a_n \in \mathbb{N}_{\ge 0}$ be a set of numbers. The weight of a tree with $n$ leaves is the maximum over all leaves $i$ of the depth of leaf $i$ plus $a_i$. For a non-negative integral constant $c$ the goal is to build a binary tree with weight at most the optimum weight plus $c$. This problem can be solved optimally by a greedy algorithm. However, we are interested in the online version of this problem where we have to insert the leaf $i$ with weight $a_i$ into the tree without knowing $n$ and the following weights $a_j$, $j > i$. We give necessary and sufficient conditions for an online algorithm to compute trees of weight at most the optimum weight plus $c$. Moreover, we show how these conditions can be verified efficiently. We obtain an online algorithm that computes an optimum tree in $O(n \log n)$ time.

Finally, we study a further mathematical model of repeater trees that considers that additional delay caused by a bifurcation of a tree can be distributed partially to the two branches. For $c \in \mathbb{R}_{>0}$ and a set $\mathcal{L} \subseteq \{(l_1, l_2) \in \mathbb{R}^2_{\ge 0} \mid l_1 + l_2 = c\}$ of two-element sets of non-negative real numbers we consider rooted binary trees with the property that the two edges emanating from every non-leaf are assigned lengths $l_1$ and $l_2$ with $\{l_1, l_2\} \in \mathcal{L}$. We study the asymptotic growth of the maximum number of leaves of bounded depths in such trees and the existence of such trees with leaves at individually specified maximum depths. Our results yield better lower bounds for repeater trees.},

url = {http://hdl.handle.net/20.500.11811/4558}

}
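The offline minimax tree problem described in the abstract can be illustrated with a short sketch. The abstract states only that the problem is solved optimally by a greedy algorithm; the rule used below (repeatedly merge the two smallest weights a and b into a subtree of weight max(a, b) + 1, in the style of Huffman coding) is an assumption about which greedy is meant and is not taken from the thesis. The function name `minimax_tree_weight` is hypothetical.

```python
import heapq
from typing import Sequence


def minimax_tree_weight(weights: Sequence[int]) -> int:
    """Return the minimum, over all binary trees, of max_i (depth(i) + a_i).

    Assumed greedy rule (not quoted from the thesis): repeatedly merge the
    two smallest weights a and b into one subtree of weight max(a, b) + 1.
    """
    if not weights:
        raise ValueError("at least one leaf weight is required")
    heap = list(weights)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)  # smallest remaining weight
        b = heapq.heappop(heap)  # second smallest remaining weight
        # Joining two subtrees under a new root pushes both one level deeper.
        heapq.heappush(heap, max(a, b) + 1)
    return heap[0]


if __name__ == "__main__":
    print(minimax_tree_weight([1, 2, 3]))  # prints 4
```

With a binary heap this offline rule takes O(n log n) time. It needs all weights in advance, so it does not cover the online setting studied in the thesis, where leaf i must be inserted without knowing n or the later weights.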