# **Optimal Positions of Twists in Global On-Chip Differential Interconnects**

Eisse Mensink, Student Member, IEEE, Daniël Schinkel, Student Member, IEEE, Eric A. M. Klumperink, Member, IEEE, Ed van Tuijl, Member, IEEE, and Bram Nauta, Senior Member, IEEE

Abstract: Crosstalk limits the achievable data rate of global on-chip interconnects on large CMOS ICs. This is especially the case if low-swing signaling is used to reduce power consumption. Differential interconnects provide a solution for most crosstalk and noise sources, but not for neighbor-to-neighbor crosstalk in a data bus. This neighbor-to-neighbor crosstalk can be reduced with twists in the differential interconnect pairs. To reduce via resistance and metal layer use, we use as few twists as possible by placing only one twist in every even interconnect pair and only two twists in every odd interconnect pair. Analysis shows that there are optimal positions for the twists, which depend on the termination impedances of the interconnects. Theory and measurements on a 10-mm-long bus in 0.13-$\mu$m CMOS show that only one twist at 50% of the even interconnect pairs, two twists at 30% and 70% of the odd interconnect pairs, and both a low-ohmic source and a low-ohmic load impedance are very effective in mitigating the crosstalk.

Index Terms: Crosstalk, data bus, interconnect, on-chip communication, twists.

# I. INTRODUCTION

Due to continued scaling of CMOS device feature sizes, operating frequencies of several gigahertz are possible in system-on-chip (SoC) designs. Despite the higher functional density, the total chip size of many CMOS ICs remains large, because more and more functionality is added on-chip. In order to connect various system blocks on these chips, long (global) interconnects are needed that are able to support high data rates. However, due to shrinking dimensions, the total resistance of these global interconnects is becoming very large. This large total resistance, combined with a large total capacitance, results in a very low bandwidth. Thus, the data capacity of the global interconnects is limited and they are becoming a key limiting factor for performance and reliability [1].

To avoid the need for a high number of repeaters [2], various techniques have been developed to increase the data capacity of global on-chip interconnects [3]-[7]. These are mainly focused on counteracting the intersymbol interference (ISI) caused by the limited bandwidth. However, next to ISI, crosstalk is also a severe problem, since it reduces the noise margin at the receiver.

Techniques like shielding can provide a solution, but if low-swing signaling is used to reduce power consumption [8], noise margins become smaller. Then, differential signaling is preferred. In differential systems, most noise sources appear as common mode and can be rejected by the receiver. In this way, differential interconnects are, for instance, not sensitive to crosstalk from orthogonal crossing metal layers. However, differential interconnects do not solve the problem of neighbor-to-neighbor crosstalk: one of the single-ended halves will be closer to an aggressor in the same metal layer and will hence receive more crosstalk.

Digital Object Identifier 10.1109/TVLSI.2007.893661

In CMOS memory cells, twists in differential interconnect pairs are already widely used to cancel crosstalk between bitlines [9]-[12]. Twists are also used on printed circuit boards [13], [14]. In general, many twists are placed at evenly spaced intervals along the interconnects. To cancel neighbor-to-neighbor crosstalk, twists have also been proposed for on-chip global interconnects [15]-[18]. Again, many twists are placed at evenly spaced intervals along the interconnects. However, the many vias needed to make the twists add to the already troublesome interconnect resistance [3]. Moreover, each twist requires the use of another metal layer. Therefore, the question arises: What is the minimum number of twists needed?

In [19], we show that only one twist in the even interconnects and two twists in the odd interconnects are sufficient. Furthermore, we show that due to the distributed RC nature of the interconnect, there are optimal positions for these twists. These optimal positions are not necessarily at 50% for the single twist and at 25% and 75% for the double twist, as one might expect. It turns out that the optimal positions for the twists depend on the termination impedances of the interconnects.

In this paper, the analysis of [19] is treated in depth and is expanded. To explain the results of this analysis, an intuitive low frequency model is added. Furthermore, eye-diagram properties are calculated. With these eye-diagram properties it is possible to predict the achievable data rate, which can be done both for interconnects with and without twisting. The eye-diagram properties are also used to calculate the sensitivity of the crosstalk reduction to variations in twist position and termination impedances. Finally, more simulation results and an explanation of the measurement method are added.

In Section II, first, the properties and problems of global interconnects are reviewed. After that, in Section III, the optimal positions for the twists, depending on the termination, are calculated. Section IV gives simulation results that verify the results of Section III. Section V calculates eye-diagram properties of the interconnects. Section VI explains the measurement method and shows measurement results from our test chip, while Section VII gives conclusions.

Manuscript received January 25, 2006; revised December 6, 2006. This work was supported in part by the Technology Foundation STW, by the Applied Science Division of NWO, and by the Technology Programme of the Ministry of Economic Affairs.

The authors are with the University of Twente, 7500 AE Enschede, The Netherlands (e-mail: e.mensink@ewi.utwente.nl; d.schinkel@ewi.utwente.nl).



Fig. 1. Interconnect model for 3-D EM-field simulations.

# **II. GLOBAL ON-CHIP INTERCONNECTS**

#### A. Interconnect Model

Fig. 1 shows a model of a global bus (cross section). The global bus is placed in metal 5 as we assume the top metal layer (metal 6) to be reserved for power and clock routing. The perpendicular interconnects in metal 4 and metal 6 are modeled by two metal plates.

In a 0.13-$\mu$m CMOS process, the width ($w$) and spacing ($s$) are chosen for the highest bandwidth per cross-sectional area: $w = s = 0.4\ \mu\mathrm{m}$ [7]. For these narrow interconnects, the distributed resistance $R' = 0.15\ \mathrm{k\Omega/mm}$ and the total distributed capacitance $C' = 2C'_G + 2C'_M = 0.23\ \mathrm{pF/mm}$ are extracted with a 3-D EM-field simulator. The distributed inductance $L' = 0.35\ \mathrm{nH/mm}$ only starts to dominate over the distributed resistance at a frequency of $R'/(2\pi L') = 68\ \mathrm{GHz}$. For 10-mm-long interconnects, the attenuation at this frequency is very large (>150 dB), so inductance does not play an important role.
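As a quick numerical check of these figures, the following Python sketch evaluates the frequency at which $\omega L'$ equals $R'$ and the attenuation of a 10-mm line at that frequency. It assumes a matched (reflection-free) distributed *RLC* line, which is a simplification introduced here for illustration; the function names are hypothetical and not taken from the original work.

```python
import cmath
import math

# Per-unit-length values quoted in the text (per mm).
R_PER_MM = 150.0      # ohm/mm (0.15 kOhm/mm)
C_PER_MM = 0.23e-12   # F/mm   (2*C'_G + 2*C'_M)
L_PER_MM = 0.35e-9    # H/mm
LENGTH_MM = 10.0


def corner_frequency_hz(r_per_mm, l_per_mm):
    """Frequency at which omega * L' equals R'."""
    return r_per_mm / (2.0 * math.pi * l_per_mm)


def matched_line_attenuation_db(freq_hz, length_mm):
    """Attenuation of a matched RLC line: 20*log10(e) * Re(gamma) * length.

    Assumption: reflections at source and load are ignored.
    """
    omega = 2.0 * math.pi * freq_hz
    gamma = cmath.sqrt((R_PER_MM + 1j * omega * L_PER_MM) * (1j * omega * C_PER_MM))
    return 20.0 * math.log10(math.e) * gamma.real * length_mm


f_corner = corner_frequency_hz(R_PER_MM, L_PER_MM)
loss_db = matched_line_attenuation_db(f_corner, LENGTH_MM)
print(f"corner frequency: {f_corner / 1e9:.1f} GHz")
print(f"attenuation over {LENGTH_MM:.0f} mm: {loss_db:.0f} dB")
```

With the values above, the sketch gives a corner frequency of about 68 GHz and an attenuation slightly above 150 dB, consistent with the numbers quoted in the text.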

The distributed capacitance between two interconnects,  $C'_M = 0.05$  pF/mm, results in crosstalk (see Fig. 2): a signal from source  $V_S$  will not only appear at its own line output ( $V_O$ ), but also at the output of a neighboring victim line ( $V_X$ ). The transfer functions  $H = V_O/V_S$  and  $H_X = V_X/V_S$ , calculated with a distributed *RLC* model, for 10-mm-long interconnects are shown in Fig. 2 (low-ohmic  $R_S$  and high-ohmic  $R_L$ , modeling an open or a small capacitive load).

#### B. Interconnect Problems

The transfer functions of Fig. 2 show two problems of global interconnects. First, the interconnect has a limited bandwidth due to the large distributed resistance and capacitance. The bandwidth of only 100 MHz (82 MHz for the differentially driven case) creates delay and limits the achievable data rate due to ISI. Second, crosstalk from neighboring interconnects also limits the data rate and deteriorates data integrity. Fig. 2 shows that, especially for high frequencies, the transfer functions $H$ and $H_X$ are almost equal. In [7], we propose solutions for the limited bandwidth of the interconnect: pulse-width (PW) equalization in combination with low-ohmic termination at both the transmitter and receiver side increases the achievable data rate by a factor of six. In this paper, we address the second problem: crosstalk.

#### C. Solving the Crosstalk Problem

In differential systems, most noise sources appear as common mode and can be rejected by the receiver. In this way, differential interconnects are, for instance, not sensitive to crosstalk from orthogonal crossing metal layers. Furthermore, twisting of differential interconnects can also reduce neighbor-to-neighbor crosstalk in a bus. We will show how this reduction can be achieved with only one twist in the even interconnects and only two twists in the odd interconnects. The question that will be addressed in this paper is: At what positions along the interconnect should we place the twists? Due to the distributed nature of the interconnects, the answer to this question is not trivial. It will turn out that the optimal positions of the twists depend on the termination impedances.

Fig. 2. Transfer functions of 10-mm-long interconnects. $R_L = \infty$, modeling a small capacitive load (termination on a gate).

Fig. 3 shows how the twists are organized. The interconnects are in metal 5 (light gray) and part of each twist is in metal 4 (dark gray). For illustrative purposes, the sizes of the interconnects and twists are exaggerated; the area of a twist is actually very small compared to the total area of the interconnect. Suppose now that the twists are positioned at $x_1 * l_T$, $x_2 * l_T$, and $x_3 * l_T$, with $l_T$ the total length of the interconnect. The optimal positions of the twists are those that minimize the crosstalk. Therefore, we need to find the transfer function ($H$) and crosstalk transfer function ($H_X$) of the interconnect structure of Fig. 3 as a function of $x_1$, $x_2$, and $x_3$. As only one or two twists are used in the interconnects, the total via resistance is small compared to the total interconnect resistance and is assumed to be zero in the analysis.

# **III. TWIST ANALYSIS**

#### A. Transfer Functions

Fig. 3. General model for twisted interconnects.

In this section, we calculate the transfer functions $H_{\rm DM}$ (victim is driven differentially and measured differentially at the victim output), $H_{\rm XDM}$ for differential mode crosstalk (crosstalk from the differential aggressor is measured differentially at the victim output), and $H_{\rm XCM}$ for common mode crosstalk (crosstalk from the differential aggressor is measured common mode at the victim output). These transfer functions are calculated as a function of the twist positions. In Fig. 3, lines 3 and 4 are the aggressor and are driven differentially with a voltage $V_{S2}$. Lines 5 and 6 are the victim and are driven differentially with a voltage $V_{S1}$. We will use a modal analysis with even and odd modes. In even mode, $V_{S2} = V_{S1}$ and in odd mode $V_{S2} = -V_{S1}$. First, we define four transfer functions (see Fig. 3)

$$\begin{aligned}
H_{\text{even}+} &= \frac{\text{out}_{+}}{V_{S1}} \Big|_{V_{S2}=V_{S1}} \\
H_{\text{even}-} &= \frac{\text{out}_{-}}{V_{S1}} \Big|_{V_{S2}=V_{S1}} \\
H_{\text{odd}+} &= \frac{\text{out}_{+}}{V_{S1}} \Big|_{V_{S2}=-V_{S1}} \\
H_{\text{odd}-} &= \frac{\text{out}_{-}}{V_{S1}} \Big|_{V_{S2}=-V_{S1}}.
\end{aligned} \tag{1}$$

In Section III-B, we will show how these transfer functions can be calculated. With these transfer functions, we define

$$\begin{aligned}
H_{+} &= \frac{\operatorname{out}_{+}}{V_{S1}}\Big|_{V_{S2}=0} = \frac{1}{2}(H_{\operatorname{even}+} + H_{\operatorname{odd}+}) \\
H_{-} &= \frac{\operatorname{out}_{-}}{V_{S1}}\Big|_{V_{S2}=0} = \frac{1}{2}(H_{\operatorname{even}-} + H_{\operatorname{odd}-}) \\
H_{X+} &= \frac{\operatorname{out}_{+}}{V_{S2}}\Big|_{V_{S1}=0} = \frac{1}{2}(H_{\operatorname{even}+} - H_{\operatorname{odd}+}) \\
H_{X-} &= \frac{\operatorname{out}_{-}}{V_{S2}}\Big|_{V_{S1}=0} = \frac{1}{2}(H_{\operatorname{even}-} - H_{\operatorname{odd}-}).
\end{aligned} \tag{2}$$

Finally, we define  $H_{\rm DM}$ ,  $H_{\rm XDM}$ , and  $H_{\rm XCM}$ 

$$\begin{aligned}
H_{\rm DM} &= \frac{{\rm out}_{+} - {\rm out}_{-}}{V_{S1}} \bigg|_{V_{S2}=0} = H_{+} - H_{-} \\
H_{\rm XDM} &= \frac{{\rm out}_{+} - {\rm out}_{-}}{V_{S2}} \bigg|_{V_{S1}=0} = H_{X+} - H_{X-} \\
H_{\rm XCM} &= \frac{1}{2} \frac{{\rm out}_{+} + {\rm out}_{-}}{V_{S2}} \bigg|_{V_{S1}=0} = \frac{1}{2} (H_{X+} + H_{X-}).
\end{aligned} \tag{3}$$

TABLE I. Values of $M_k$ and $l_k$ for Every Section $k$

| $k$ | $M_k$, line 5, even mode | $M_k$, line 5, odd mode | $M_k$, line 6, even mode | $M_k$, line 6, odd mode | $l_k$ |
|---|---|---|---|---|---|
| 1 | 4 | 2 | 3 | 3 | $x_1 * l_T$ |
| 2 | 2 | 4 | 3 | 3 | $(x_2 - x_1) * l_T$ |
| 3 | 3 | 3 | 4 | 2 | $(x_3 - x_2) * l_T$ |
| 4 | 3 | 3 | 2 | 4 | $(1 - x_3) * l_T$ |

Combining (2) and (3), we see that  $H_{DM}$ ,  $H_{XDM}$ , and  $H_{XCM}$  can be calculated if  $H_{even+}$ ,  $H_{even-}$ ,  $H_{odd+}$ , and  $H_{odd-}$  are known.
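As a worked step that is not spelled out in the original text, substituting (2) into (3) expresses the three transfer functions directly in terms of the four modal transfer functions of (1):

$$\begin{aligned}
H_{\rm DM} &= \frac{1}{2}\left[(H_{\text{even}+} - H_{\text{even}-}) + (H_{\text{odd}+} - H_{\text{odd}-})\right] \\
H_{\rm XDM} &= \frac{1}{2}\left[(H_{\text{even}+} - H_{\text{even}-}) - (H_{\text{odd}+} - H_{\text{odd}-})\right] \\
H_{\rm XCM} &= \frac{1}{4}\left[(H_{\text{even}+} + H_{\text{even}-}) - (H_{\text{odd}+} + H_{\text{odd}-})\right]
\end{aligned}$$

Written this way, $H_{\rm XDM}$ vanishes when the differential output is the same in even and odd mode, and $H_{\rm XCM}$ vanishes when the common-mode output is the same in both modes.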

#### B. Even- and Odd-Mode Analysis

In this section, we show how we can calculate $H_{\text{even}+}$, $H_{\text{even}-}$, $H_{\text{odd}+}$, and $H_{\text{odd}-}$ with the help of *s*-parameters. The characteristic impedance $Z_{Ck}$ and propagation constant $\gamma_k$ of section $k$ of a distributed *RC*-line are [20]

$$\begin{aligned}
Z_{Ck} &= \sqrt{\frac{R'}{j\omega \left(2C'_G + M_k C'_M\right)}} \\
\gamma_k &= \sqrt{j\omega R' \left(2C'_G + M_k C'_M\right)}.
\end{aligned} \tag{4}$$

$R'$, $C'_G$, and $C'_M$ are defined in Section II-A. $M_k$ is a Miller multiplication factor and depends on the signal that is on the neighboring interconnects. Since we use a modal analysis, the signals on neighboring lines are correlated. Because the twists divide the interconnect into four sections (see Fig. 3), $M_k$ can have a different value for every section $k$.

As an example, let us look at line 6 (see Fig. 3) in even mode  $(V_{S2} = V_{S1})$ . For the first section, the capacitance to line 7, and for the second section, the capacitance to line 8, are seen once (no signals on these lines). As the capacitance to line 5 is seen double (line 5 and 6 are differentially driven),  $M_1 = M_2 = 3$ . For the third section, the capacitance to line 3 is seen double (the signal on line 3 has opposite sign) and the capacitance to line 5 is also seen double, thus,  $M_3 = 4$ . Finally, for the fourth section, the capacitance to line 4 is not seen (the signal on line 4 has equal sign) and, again, the capacitance to line 5 is seen double. Therefore,  $M_4 = 2$ .

All values of  $M_k$  for the lines 5 and 6 (both in even and odd mode) are shown in Table I. Also, the length  $l_k$  of every section is given.

With these values for $M_k$ and $l_k$, the *s*-parameters $S_{ijk}$ [20] of every section $k$ in the signal flow graph in Fig. 3 are

$$S_{ijk} = \frac{1}{2Z_0 Z_{Ck} \cosh(\gamma_k l_k) + (Z_0^2 + Z_{Ck}^2) \sinh(\gamma_k l_k)} \begin{bmatrix} (Z_{Ck}^2 - Z_0^2) \sinh(\gamma_k l_k) & 2Z_0 Z_{Ck} \\ 2Z_0 Z_{Ck} & (Z_{Ck}^2 - Z_0^2) \sinh(\gamma_k l_k) \end{bmatrix}. \tag{5}$$

 $Z_0$  is a reference impedance and can be chosen freely. The source and load impedance are reflected in  $S_S$  and  $S_L$ 

$$S_S = \frac{Z_S - Z_0}{Z_S + Z_0}, \quad S_L = \frac{Z_L - Z_0}{Z_L + Z_0}. \tag{6}$$

Note that in the s-parameter model  $V_S = -(1/2)V_{S1}$  for line 5 and  $V_S = +(1/2)V_{S1}$  for line 6.
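To make (4) and (5) concrete, the following Python sketch evaluates the line constants and the *s*-parameter matrix of a single section. It is a minimal illustration, not the authors' implementation: the function names are hypothetical, $C'_G = 0.065$ pF/mm is derived here from $C' = 2C'_G + 2C'_M$, and the cascading of sections with Mason's Rule is not included.

```python
import cmath
import math

R_PER_MM = 150.0        # ohm/mm, Section II-A
CM_PER_MM = 0.05e-12    # F/mm, Section II-A
CG_PER_MM = 0.065e-12   # F/mm, derived from C' = 2C'_G + 2C'_M = 0.23 pF/mm


def line_constants(freq_hz, miller_factor):
    """Return (Z_Ck, gamma_k) of an RC section, following (4)."""
    jw = 1j * 2.0 * math.pi * freq_hz
    c_eff = 2.0 * CG_PER_MM + miller_factor * CM_PER_MM
    z_c = cmath.sqrt(R_PER_MM / (jw * c_eff))
    gamma = cmath.sqrt(jw * R_PER_MM * c_eff)
    return z_c, gamma


def section_s_params(freq_hz, miller_factor, length_mm, z0=50.0):
    """Return the 2x2 s-parameter matrix of one section, following (5)."""
    z_c, gamma = line_constants(freq_hz, miller_factor)
    gl = gamma * length_mm
    denom = 2.0 * z0 * z_c * cmath.cosh(gl) + (z0**2 + z_c**2) * cmath.sinh(gl)
    s_refl = (z_c**2 - z0**2) * cmath.sinh(gl) / denom
    s_trans = 2.0 * z0 * z_c / denom
    return [[s_refl, s_trans], [s_trans, s_refl]]


# Example: line 6 in even mode (Table I) with twists at x1, x2, x3 = 0.3, 0.5, 0.7.
miller = [3, 3, 4, 2]
lengths_mm = [3.0, 2.0, 2.0, 3.0]  # l_k for l_T = 10 mm
for k, (m_k, l_k) in enumerate(zip(miller, lengths_mm), start=1):
    s21 = section_s_params(100e6, m_k, l_k)[1][0]
    print(f"section {k}: |S21| = {abs(s21):.3f}")
```

A useful sanity check on (5) is that choosing $Z_0 = Z_{Ck}$ makes the reflection terms vanish and reduces the transmission terms to $e^{-\gamma_k l_k}$, as expected for a matched line.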



Fig. 4. Calculated DM SCR as a function of  $x_2$  and  $R_L$  ( $x_1 = 0, x_3 = 1, l_T = 10$  mm, and  $R_S = 50 \Omega$ ).

Now, with the help of Mason's Rule [20], the transfer functions  $H_{\text{even}+}$ ,  $H_{\text{even}-}$ ,  $H_{\text{odd}+}$ , and  $H_{\text{odd}-}$  can be found and, thus,  $H_{\text{DM}}$ ,  $H_{\text{XDM}}$ , and  $H_{\text{XCM}}$  can be calculated [see (1)–(3)]. As the formulas are very complex, they are not shown here.

#### C. Signal-to-Crosstalk Ratio (SCR)

With the help of these transfer functions, the optimal positions for  $x_1, x_2$ , and  $x_3$  can be found. As optimization criterion, we use the SCR, which is defined as

$$\mathrm{SCR} = \frac{\text{signal power}}{\text{crosstalk power}} = \frac{\int_{0}^{\infty} X(f) |H_{\rm DM}(f)|^2 df}{\int_{0}^{\infty} X(f) |H_X(f)|^2 df} \tag{7}$$

where X(f) is the power spectral density of the input signal and  $H_X$  can be either  $H_{XDM}$  or  $H_{XCM}$  giving differential mode (DM) SCR or common mode (CM) SCR.
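The ratio in (7) can be approximated numerically once the transfer functions are available. The Python sketch below uses the trapezoidal rule and truncates the integrals at a finite upper frequency; both choices, the function name `scr_db`, and the placeholder transfer functions in the example are assumptions made here for illustration.

```python
import math


def scr_db(h_dm, h_x, psd, f_max_hz, num_points=2001):
    """Approximate (7) with the trapezoidal rule on [0, f_max_hz]; result in dB.

    h_dm, h_x: callables returning the (complex) transfer function at f.
    psd:       callable returning the input power spectral density X(f).
    """
    step = f_max_hz / (num_points - 1)
    signal = 0.0
    crosstalk = 0.0
    for i in range(num_points):
        f = i * step
        weight = 0.5 if i in (0, num_points - 1) else 1.0
        signal += weight * psd(f) * abs(h_dm(f)) ** 2
        crosstalk += weight * psd(f) * abs(h_x(f)) ** 2
    # The common factor `step` cancels in the ratio.
    return 10.0 * math.log10(signal / crosstalk)


# Illustrative placeholders only (not the transfer functions of the bus):
def flat_psd(f):
    return 1.0


def unity_gain(f):
    return 1.0


def tenth_gain(f):
    return 0.1


print(scr_db(unity_gain, tenth_gain, flat_psd, 1e9))  # about 20 dB
```

In practice, `h_dm` and `h_x` would be the $H_{\rm DM}$ and $H_{\rm XDM}$ (or $H_{\rm XCM}$) obtained from the *s*-parameter analysis, and `psd` the spectrum of the transmitted data.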

Fig. 4 shows the DM SCR as a function of $x_2$. In this case, $x_1 = 0$, $x_3 = 1$, $l_T = 10$ mm, and $R_S = 50\ \Omega$. The SCR is highest if $x_2$ is 0.5 and $R_L = R_S$. However, if $R_L$ is larger than $R_S$ (as found in conventional on-chip termination, where the wire is terminated to a gate), the optimum value for $x_2$ shifts toward 0.7 and the peak value of the SCR decreases. Note that the optimal case, one twist at $x_2 = 0.5$ and choosing $R_L = R_S$, nicely coincides with the fact that for highest bandwidth, both $R_S$ and $R_L$ should be chosen low-ohmic [7].

DM crosstalk can be canceled with the twist at $x_2$, but there will still be a CM crosstalk component at the output. This can be removed by the twists at $x_1$ and $x_3$. Fig. 5 shows the SCR for both DM crosstalk and CM crosstalk as a function of $x_1$ and $x_3$ with $x_2 = 0.5$ and $R_L = R_S$. The figure shows that the DM crosstalk is canceled if $x_3 = 1 - x_1$. On this line, the CM crosstalk is minimal at $x_1 \approx 0.3$ and $x_3 \approx 0.7$. Fig. 6 again shows the SCR for both DM and CM crosstalk as a function of $x_1$ and $x_3$, but now $x_2 = 0.7$ and $R_L = \infty$. The figure shows that the optimal positions for a high-ohmic termination are at $x_1 \approx 0.5$ and $x_3 \approx 0.87$.



Fig. 5. Calculated contour plot of SCR (in decibels) as a function of  $x_1$  and  $x_3$  ( $x_2 = 0.5$ ,  $l_T = 10$  mm, and  $R_S = R_L = 50 \Omega$ ).



Fig. 6. Calculated contour plot of SCR (in decibels) as a function of  $x_1$  and  $x_3$  ( $x_2 = 0.7, l_T = 10$  mm, and  $R_S = 50 \ \Omega, R_L = \infty$ ).

#### D. Simplified Low-Frequency Model

To gain an intuitive insight and give an explanation for these optimal positions, a simple, low-frequency model is developed. In Fig. 7, two interconnects are drawn. The top interconnect is the aggressor line and the bottom, the victim line. The graph above the aggressor line shows the voltage along this aggressor line. For low frequencies, this voltage simply shows a linear decrease along the line, given by a resistive divider

$$\frac{v_A(x)}{V_S} = \frac{(1-x)R_T + R_L}{R_S + R_T + R_L} \tag{8}$$

where  $R_T = R' * l_T$  is the total resistance of the interconnect. Assuming the crosstalk voltage at the victim line is much smaller, the voltage  $v_A(x)$  will be present across the distributed capacitance  $C'_M$  and this will generate a current

$$i_C(x) = v_A(x) \cdot j\omega C'_M \cdot dx \tag{9}$$



over a small length $dx$ at position $x$. This current sees a resistive divider and only part of the current will contribute to the voltage at the output of the victim line (see bottom graph)

$$\frac{dV_X(x)}{i_C(x)} = R_L \frac{R_S + xR_T}{R_S + R_T + R_L}. \tag{10}$$

Fig. 7. Simplified low-frequency model.

Fig. 8. Simplified low-frequency model, low-ohmic termination.

Fig. 9. Simplified low-frequency model, high-ohmic termination.

By using (8), (9), and (10), we can find an expression for the change in  $V_X$  over a small length dx at position x (see Figs. 8 and 9)

$$\frac{d\frac{V_X(x)}{V_S}}{dx} = j\omega C'_M R_L \frac{(R_S + xR_T) \cdot ((1-x)R_T + R_L)}{(R_S + R_T + R_L)^2}. \tag{11}$$

The crosstalk voltage  $V_X$  can be found by integrating over x.

By placing a twist in the interconnect, the integration of the second part of the graph (in Figs. 8 and 9) will have an opposite sign. If both halves have equal area, there will be no crosstalk. For a low-ohmic  $R_L$  this is at x = 0.5 (Fig. 8) and for a high-ohmic  $R_L$  this is at x = 0.7 (Fig. 9). Dividing the graph into four equal parts shows that the optimal positions for  $x_1$  and  $x_3$  are at 0.3 and 0.7 for a low-ohmic  $R_L$  and at 0.5 and 0.87 for a high-ohmic  $R_L$ .
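The equal-area argument can be reproduced numerically. The Python sketch below integrates the $x$-dependent part of (11) in closed form and uses bisection to find the positions that split the area into four equal parts. The function names are hypothetical, and an infinite $R_L$ is approximated by a very large finite value; this is an illustration of the simplified model only, not of the full *s*-parameter analysis.

```python
def crosstalk_area(x, r_s, r_t, r_l):
    """Integral from 0 to x of (R_S + u*R_T) * ((1 - u)*R_T + R_L) du.

    This is the x-dependent part of (11); the constant factors cancel
    when areas are compared.
    """
    a, b, c = r_s, r_t, r_t + r_l
    return a * c * x + (b * c - a * b) * x**2 / 2.0 - b * b * x**3 / 3.0


def position_for_fraction(fraction, r_s, r_t, r_l, tol=1e-9):
    """Bisection for the x at which the area reaches `fraction` of the total."""
    target = fraction * crosstalk_area(1.0, r_s, r_t, r_l)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if crosstalk_area(mid, r_s, r_t, r_l) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def twist_positions(r_s, r_t, r_l):
    """Return (x1, x2, x3) that split the crosstalk area into four equal parts."""
    return tuple(position_for_fraction(q, r_s, r_t, r_l) for q in (0.25, 0.5, 0.75))


# 10-mm line (R_T = 1.5 kOhm) with low-ohmic and with (nearly) open termination.
print(twist_positions(50.0, 1500.0, 50.0))
print(twist_positions(50.0, 1500.0, 1e9))
```

For $R_S = R_L = 50\ \Omega$ and $R_T = 1.5$ k$\Omega$ this sketch gives positions near 0.32, 0.50, and 0.68; with a very large $R_L$ it gives roughly 0.48, 0.70, and 0.86, close to the values read from Figs. 8 and 9.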



Fig. 10. 3-D EM-field simulation step responses for different positions of the twist  $(x_1 - x_2 - x_3)$  and for two different load resistances. Both differential mode and common-mode step responses are shown. The length of the interconnects  $l_T = 1$  mm.

# **IV. SIMULATIONS**

#### A. 3-D EM-Field Simulations

In order to check the analytical results on optimal twist positions, a configuration with two differential interconnects has been simulated in a 3-D EM-field simulator. The length $l_T$ is only 1 mm to limit the simulation time. Note that for $l_T = 1$ mm, the crosstalk voltage is much lower than for $l_T = 10$ mm.

One of the differential interconnects has one twist and the other has two twists. Fig. 10 shows the simulated crosstalk voltage (step response) for different positions of the twists  $(R_S = 50 \ \Omega)$ . For DM crosstalk, the optimal position of the twist  $(x_2)$  is at 0.5 for an  $R_L$  of 50  $\Omega$  and between 0.6 and 0.7 for an  $R_L$  of 20 k $\Omega$ . This coincides with the theory: the simplified model of Section III predicts 0.5 and 0.64, respectively.

For CM crosstalk, the optimal positions of the twists ($x_1$ and $x_3$) are at 0.3 and 0.7 for an $R_L$ of 50 $\Omega$ and at 0.35 and 0.8 for an $R_L$ of 20 k$\Omega$. Again, this agrees well with the theory that predicts $x_1 = 0.27$ and $x_3 = 0.73$ for an $R_L$ of 50 $\Omega$ and $x_1 = 0.37$ and $x_3 = 0.82$ for an $R_L$ of 20 k$\Omega$.

Fig. 11. Lumped model step responses for different positions of the twist $(x_1 - x_2 - x_3)$ and for two different load resistances. Both differential-mode (DM) and common-mode (CM) step responses are shown. The length of the interconnects $l_T = 10$ mm.

#### B. Lumped Model

For circuit simulations, a lumped *RC* model of the structure of Fig. 3 was built with 200 lumps per interconnect. For 10-mm-long interconnects, we chose $R_{\text{lump}} = 7.5\ \Omega$, $C_{G,\text{lump}} = 3.25$ fF, and $C_{M,\text{lump}} = 2.5$ fF. The twist positions were varied and the step responses are shown in Fig. 11. The figure shows that $x_2 = 0.5$ gives minimal DM crosstalk for $R_L = R_S$ and $x_2 = 0.7$ for $R_L$ high-ohmic. The figure also shows that CM crosstalk is reduced by twists at $x_1 = 0.3$ and $x_3 = 0.7$ for $R_L = R_S$ and by twists at $x_1 = 0.5$ and $x_3 = 0.87$ for $R_L$ high-ohmic. These twist positions agree with the analytical and 3-D EM-field solver results (see Sections III and IV-A).
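The lump values follow directly from the per-unit-length parameters of Section II-A. The short Python sketch below shows the arithmetic; the function name is hypothetical, and $C'_G$ is derived here from $C' = 2C'_G + 2C'_M$ rather than quoted from the text.

```python
R_PER_MM = 150.0            # ohm/mm
C_TOTAL_PER_MM = 0.23e-12   # F/mm, C' = 2C'_G + 2C'_M
CM_PER_MM = 0.05e-12        # F/mm
CG_PER_MM = (C_TOTAL_PER_MM - 2.0 * CM_PER_MM) / 2.0  # derived: 0.065 pF/mm


def lump_values(length_mm, num_lumps):
    """Return (R_lump, C_G_lump, C_M_lump) for a uniform lumped RC model."""
    segment_mm = length_mm / num_lumps
    return (R_PER_MM * segment_mm,
            CG_PER_MM * segment_mm,
            CM_PER_MM * segment_mm)


r_lump, cg_lump, cm_lump = lump_values(10.0, 200)
print(f"R_lump = {r_lump:.2f} ohm")
print(f"C_G,lump = {cg_lump * 1e15:.2f} fF")
print(f"C_M,lump = {cm_lump * 1e15:.2f} fF")
```

With 200 lumps over 10 mm, each lump represents 50 $\mu$m of interconnect, which reproduces the 7.5 $\Omega$, 3.25 fF, and 2.5 fF used above.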

# V. EYE-DIAGRAM PROPERTIES

#### A. Achievable Data Rate

By using a method similar to the method described in [7], it is possible to extract eye-diagram properties from the impulse responses $h_{\rm DM}(t)$ and $h_{\rm XDM}(t)$, which are the inverse Fourier transforms of $H_{\rm DM}(f)$ and $H_{\rm XDM}(f)$, respectively. We will look at the differential output ${\rm out}_{+} - {\rm out}_{-}$ (see Fig. 3). It is possible to find the eye height (relative to the maximum received value) and the eye width (relative to one symbol period) for different data rates. For a 10-mm-long differential interconnect in a bus with $R_S = 65\ \Omega$, $R_L = 150\ \Omega$ (values that were realized on our test chip, see Section VI), $x_1 = 0$, $x_3 = 1$, and using PW equalization [7], Fig. 12 shows the relative eye width and the relative eye height as a function of data rate. Two cases are shown: $x_2 = 0$ and $x_2 = 0.5$. The figure shows clearly that without twisting, the crosstalk limits the data rate. If we look at 50% eye height, a two times higher data rate is possible by using the twist at $x_2 = 0.5$.

Fig. 12. Eye-diagram properties with $(x_2 = 0.5)$ and without $(x_2 = 0)$ twist. At a relative eye height of 0.5, a 2 times higher data rate is possible due to the twist.

Fig. 13. Relative eye height as a function of $x_2$ and $R_L$.

#### B. Sensitivity

In Fig. 13, the relative eye height is plotted against  $x_2$  (upper) and  $R_L$  (lower). As the eye height is the limiting factor (see Fig. 12), the eye width is not plotted in Fig. 13. Again, a 10-mm-long interconnect is simulated with  $R_S = 65 \Omega$ ,  $R_L = 150 \Omega$ ,  $x_1 = 0, x_2 = 0.5, x_3 = 1$ , and using PW equalization. The data rate is 3 Gb/s.

The upper part of the figure shows that the relative eye height is not much reduced by small deviations of $x_2$ from its optimal position. If the position of $x_2$ is varied by 1% (100 $\mu$m), the eye height is only reduced from 0.73 to 0.71. So, the exact placement of the twists is not critical. The lower part of Fig. 13 shows that it is also not critical to exactly match $R_S$ and $R_L$. The relative eye height even increases for $R_L < R_S$: although there is slightly more crosstalk, the bandwidth of the interconnect is increased by the smaller load resistance.



Fig. 14. Bus configuration of test chip.

# VI. MEASUREMENTS

#### A. Measurement Method

On a test chip [7] in a 0.13-$\mu$m CMOS process, a bus of seven 10-mm-long differential interconnects is measured with a configuration as shown in Fig. 14. The seven channels are driven by inverters with an $R_S$ of about 65 $\Omega$. The $R_L$ of about 150 $\Omega$ is made with inverters with a feedback resistor. So both $R_S$ and $R_L$ are low-ohmic compared to the 1.5 k$\Omega$ interconnect. The low-ohmic termination in combination with PW equalization is used to achieve a data rate of 3 Gb/s. This data rate is measured on channel 4, as described in [7].

In this paper, we show the results of measurements on channels 1 and 6. By measuring crosstalk transfer functions, the effectiveness of the twists will be demonstrated. In order to measure these crosstalk transfer functions, we use the same transmitters and receivers that are used for the data rate measurements. Since we only have one data generator available, the transmitters are all driven by the same data. In order to create pseudo-independent data on the seven channels (needed for data rate measurements on channel 4), there is a delay of ten clock periods between every transmitter, realized via on-chip shift registers. Because of the PW equalization in the transmitters, the data is multiplied by a rectangular wave with controllable PW. By setting the PW to 50%, the transmitters transmit a square wave. For a "zero," the square wave is first half a clock period low and after that half a clock period high; for a "one" the square wave is inverted (Manchester code).

To understand how crosstalk information is extracted, assume that the data generator has been generating a "zero" for longer than 70 clock delays. Then, all seven transmitters transmit the same square wave. If the data generator then starts transmitting a "one," first the square wave of channel 1 is inverted. Ten clock delays later, also the square wave of channel 2 is inverted. Another ten clock delays, and the square wave of channel 3 is inverted, and so on.
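The stimulus described above can be sketched in a few lines of Python. The code below is an illustration only: the function names are hypothetical, levels are represented as 0 (low) and 1 (high), and channel 1 is assumed to have zero delay relative to the data generator.

```python
def manchester_half_periods(bits):
    """Map each bit to two half-period levels: 0 -> (low, high), 1 -> (high, low)."""
    levels = []
    for bit in bits:
        levels.extend((1, 0) if bit else (0, 1))
    return levels


def inverted_channels(clock_index, num_channels=7, delay_per_channel=10):
    """Channels already sending the inverted square wave, `clock_index` clock
    periods after the generator switches from a long run of zeros to ones.

    Assumption: channel 1 switches at clock_index 0 and every following
    channel switches `delay_per_channel` clock periods after its predecessor.
    """
    return [ch for ch in range(1, num_channels + 1)
            if clock_index >= (ch - 1) * delay_per_channel]


print(manchester_half_periods([0, 1]))  # [0, 1, 1, 0]
for n in (0, 10, 25, 60):
    print(n, inverted_channels(n))
```

Each time another channel joins the list, the Miller factor seen by its neighbors changes, which is what produces the amplitude and phase steps exploited in the measurement.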



Fig. 15. Measured voltage waveform at channel 6.



Fig. 16. Measured transfer functions to the output of channel 6. $H_{66+}$ and $H_{66-}$ are the transfer functions of channel 6 to out 6+ and out 6-, $H_{Xi6+}$ and $H_{Xi6-}$ are the crosstalk transfer functions from channel *i* to out 6+ and out 6-. The differential output voltages are measured as out 6+ minus out 6-.

The square wave on the channels is filtered by the interconnect. Fig. 15 shows the result as measured on channel 6 (out 6+ minus out 6-). The figure shows that the amplitude (and phase) of the resulting sine wave changes every ten clock delays. This is because the amplitude and phase depend on the total resistance and capacitance of the interconnect, and the capacitance of the interconnect depends on the signals on neighboring interconnects. If the signals of two neighboring interconnects are equal, the capacitance between these interconnects is not seen, but if the signals have opposite signs, the capacitance is seen double (Miller multiplication).

By carefully correlating the output voltage with the clock frequency and filtering the results, the amplitude and phase steps are found. These steps are a measure for the crosstalk transfer functions at the clock frequency. The crosstalk transfer functions are found by repeating these measurements for other clock frequencies.

Fig. 17. Measured transfer functions to the output of channel 1. $H_{11+}$ and $H_{11-}$ are the transfer functions of channel 1 to out 1+ and out 1-, $H_{X21+}$ and $H_{X21-}$ are the crosstalk transfer functions from channel 2 to out 1+ and out 1-. The differential output voltages are measured as out 1+ minus out 1-.

Fig. 18. SE and DIFF eye-diagram measurements on channel 6.

#### B. Measurement Results

Fig. 16 shows the measured transfer function from channel 6 and the crosstalk transfer functions from channels 5 and 7 to channel 6 (low-ohmic termination). As expected, the crosstalk from channel 5 is less than the crosstalk from channel 7: the double twist in channel 5 at $x_1 = 0.3$ and $x_3 = 0.7$ reduces CM crosstalk (see top of Fig. 16). The crosstalk from both channels 5 and 7 is reduced for the differential output: the single twist in channel 6 at $x_2 = 0.5$ reduces DM crosstalk (see bottom of Fig. 16).

The transfer functions of Fig. 17 have a smaller bandwidth due to the high-ohmic termination of channel 1. There is more crosstalk from channel 2 on out1+ than on out1-, because out1- has no signal-carrying neighbor. The bottom graph shows that the crosstalk is not reduced for the differential output, as there is no twist in channel 1.

In Fig. 18, the measured single-ended (SE) output and the differential (DIFF) output of channel 6 are plotted in eye-diagrams for a data rate of 2.5 Gb/s. For reliable communication, the eye should be open. The eye-diagram for the SE output is almost closed (crosstalk from channel 7). Looking at the DIFF output, the influence of the twist is seen: the eye is almost completely open.

# VII. CONCLUSION

Crosstalk limits the achievable data rate of on-chip global interconnects on large CMOS ICs, especially if low-swing signaling is used. Differential interconnects can be used to suppress certain external noise sources. In order to cancel neighbor-to-neighbor crosstalk, twists are placed in the differential interconnects. It turns out that only one twist in every even interconnect pair and only two twists in every odd interconnect pair reduce the crosstalk by more than 40 dB.

Our analysis of twists in global on-chip interconnects shows that the optimal positions of the twists depend on the termination of the interconnect. Differential mode crosstalk can be canceled with only one twist at 50% by choosing both a low-ohmic source and a low-ohmic load resistance. Two twists in the neighboring interconnects at 30% and 70% reduce common mode crosstalk. If the source resistance is low-ohmic, but the load resistance is high-ohmic (conventional termination), the optimal positions shift to 70% for the single twist and 50% and 87% for the double twist. However, compared to a low-ohmic load resistance, the bandwidth of the interconnect is reduced and the crosstalk reduction is less. So, low-ohmic termination has both a benefit for bandwidth and for crosstalk suppression.

An analysis with eye-diagram properties shows that a two times higher data rate is possible due to the twisting. This analysis also shows that the exact placement of the twists is not critical. For a practical design [7], the eye height is only 2% smaller with a variation of 100  $\mu$ m in the twist position. In addition, it turns out that it is not critical to exactly match  $R_S$  and  $R_L$ . Normal process spreading of about 25% only reduces the eye opening by a few percent. Measurements show the effectiveness of the twists.

# ACKNOWLEDGMENT

The authors would like to thank Philips Research for chip fabrication.

#### REFERENCES

- [1] R. Ho, K. W. Mai, and M. A. Horowitz, "The future of wires," *Proc. IEEE*, vol. 89, no. 4, pp. 490–504, Apr. 2001.
- [2] H. Bakoglu, *Circuits, Interconnections and Packaging for VLSI*. Reading, MA: Addison-Wesley, 1990.
- [3] R. Ho, K. Mai, and M. Horowitz, "Efficient on-chip global interconnects," in *Proc. IEEE Symp. VLSI Circuits*, 2003, pp. 271–274.
- [4] H. Kaul and D. Sylvester, "Low-power on-chip communication based on transition-aware global signaling (TAGS)," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 12, no. 5, pp. 464–476, May 2004.
- [5] D. Schinkel, E. Mensink, E. Klumperink, E. van Tuijl, and B. Nauta, "A 3 Gb/s/ch transceiver for *RC*-limited on-chip interconnects," in *ISSCC Dig. Tech. Papers*, 2005, pp. 386–387.
- [6] A. Katoch, H. Veendrick, and E. Seevinck, "High speed current-mode signaling circuits for on-chip interconnects," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, 2005, pp. 4138–4141.

- [7] D. Schinkel, E. Mensink, E. Klumperink, E. van Tuijl, and B. Nauta, "A 3 Gb/s/ch transceiver for 10 mm uninterrupted RC-limited global on-chip interconnects," *IEEE J. Solid-State Circuits*, vol. 41, no. 1, pp. 297–306, Jan. 2006.
- [8] H. Zhang, V. George, and J. M. Rabaey, "Low-swing on-chip signaling techniques: Effectiveness and robustness," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 8, no. 2, pp. 264–272, Jun. 2000.
- [9] M. Aoki et al., "A 60-ns 16-Mbit CMOS DRAM with a transposed data-line structure," *IEEE J. Solid-State Circuits*, vol. 23, no. 5, pp. 1113–1119, Oct. 1988.
- [10] H. Hidaka, K. Fujishima, Y. Matsuda, M. Asakura, and T. Yoshihara, "Twisted bit-line architectures for multi-megabit DRAMs," *IEEE J. Solid-State Circuits*, vol. 24, no. 1, pp. 21–27, Feb. 1989.
- [11] D. S. Min and D. W. Langer, "Multiple twisted dataline techniques for multigigabit DRAMs," *IEEE J. Solid-State Circuits*, vol. 34, no. 6, pp. 856–865, Jun. 1999.
- [12] K. Noda et al., "An ultrahigh-density high-speed loadless four-transistor SRAM macro with twisted bitline architecture and triple-well shield," *IEEE J. Solid-State Circuits*, vol. 36, no. 3, pp. 510–515, Mar. 2001.
- [13] J. E. Muyshondt, "Printed circuit board with an integrated twisted pair conductor," U.S. Patent 5 646 368, Jul. 8, 1997.
- [14] D. G. Kam, H. Lee, S. Baek, B. Park, and J. Kim, "GHz twisted differential line structure on printed circuit board to minimize EMI and crosstalk noises," in *Proc. Electron. Components Technol. Conf.*, 2002, pp. 1058–1065.
- [15] G. Zhong, C. Koh, and K. Roy, "A twisted-bundle layout structure for minimizing inductive coupling noise," in *Proc. Int. Conf. Comput.-Aided Des. (ICCAD)*, 2000, pp. 406–411.
- [16] J. Kim et al., "A novel twisted differential line for high speed on-chip interconnections with reduced crosstalk," in *Proc. Electr. Packag. Tech. Conf.*, 2002, pp. 180–183.
- [17] I. Hatirnaz and Y. Leblebici, "Modelling and implementation of twisted differential on-chip interconnects for crosstalk noise reduction," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, 2004, pp. V-185–V-188.
- [18] L. Deng and M. D. F. Wong, "Optimal algorithm for minimizing the number of twists in an on-chip bus," in *Proc. Des., Autom. Test Eur. Conf. Exhibition*, 2004, pp. 1104–1109.
- [19] E. Mensink, D. Schinkel, E. Klumperink, E. van Tuijl, and B. Nauta, "Optimally-placed twists in global on-chip differential interconnects," in *Proc. 31st Eur. Solid-State Circ. Conf. (ESSCIRC)*, Sep. 2005, pp. 475–478.
- [20] P. A. Rizzi, *Microwave Engineering: Passive Circuits*. Englewood Cliffs, NJ: Prentice-Hall, 1988, pp. 168–176, 541–548.



**Eisse Mensink** (S'03) was born on January 10, 1979, in Almelo, The Netherlands. He received the M.Sc. degree in electrical engineering (*cum laude*) from the University of Twente, Enschede, The Netherlands, in 2003, where he is currently pursuing the Ph.D. degree in high-speed on-chip communication.

**Daniël Schinkel** (S'03) was born in Finsterwolde, The Netherlands, in 1978. He received the M.Sc. degree in electrical engineering (*cum laude*) from the University of Twente, Enschede, The Netherlands, in 2003, where he is currently pursuing the Ph.D. degree in high-speed on-chip communication.

During his studies, he worked on various occasions as a trainee at the Mixed-Signal Circuits and Systems Department, Philips Research, Eindhoven, The Netherlands. This work resulted in a number of publications and two patent filings. His research interests include analog and mixed-signal circuit design, sigma-delta data converters, class-D power amplifiers, and high-speed communication circuits.



**Eric A. M. Klumperink** (M'98) was born on April 4, 1960, in Lichtenvoorde, The Netherlands. He received the B.Sc. degree from HTS, Enschede, The Netherlands, in 1982.

After a short period in industry, he joined the Faculty of Electrical Engineering, the University of Twente, Enschede, The Netherlands, in 1984, where he was mainly engaged in analog CMOS circuit design and research. This resulted in several publications and a Ph.D. thesis, in 1997, on the subject of "Transconductance based CMOS circuits." He is currently an Associate Professor at the IC-Design Laboratory and is also involved in the CTIT Research Institute. He holds four patents and has authored and co-authored more than 50 journal and conference papers. His research interests include design issues of high-frequency CMOS circuits, especially for front-ends of integrated CMOS transceivers. Since 2006, he has served as an Associate Editor for IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS.

Prof. Klumperink was a co-recipient of the ISSCC 2002 "Van Vessem Outstanding Paper Award."



**Ed (A. J. M.) van Tuijl** (M'97) was born in Rotterdam, The Netherlands, on June 20, 1952.

He joined Philips Semiconductors, Eindhoven, The Netherlands, in 1980. As a Designer, he worked on many kinds of small-signal and power audio applications, including A/D and D/A converters. In 1991, he became Design Manager of the audio power and power-conversion product line. In 1992, he joined the University of Twente, Enschede, The Netherlands, as a part-time Professor. After many years at Philips Semiconductors, he joined Philips Research, Eindhoven, The Netherlands, in 1998 as a Principal Research Scientist. His current research interests include data conversion, high-speed communication, and low-noise oscillators. He is an author or co-author of many papers and holds many patents in the field of analog electronics and data conversion.



**Bram Nauta** (M'91–SM'03) was born in Hengelo, The Netherlands. He received the M.Sc. degree (*cum laude*) in electrical engineering and the Ph.D. degree in analog CMOS filters for very high frequencies from the University of Twente, Enschede, The Netherlands, in 1987 and 1991, respectively.

In 1991, he joined the Mixed-Signal Circuits and Systems Department, Philips Research, Eindhoven, The Netherlands, where he worked on high-speed AD converters and analog key modules. In 1998, he returned to the University of Twente as a full Professor heading the IC Design group, which is part of the CTIT Research Institute. His current research interests include high-speed analog CMOS circuits. He is also a part-time consultant in industry, and in 2001 he co-founded Chip Design Works.

Dr. Nauta's Ph.D. thesis was published as a book, *Analog CMOS Filters for Very High Frequencies* (Springer, 1993), and he was a recipient of the "Shell Study Tour Award" for his Ph.D. work. He was also a co-recipient of the ISSCC 2002 "Van Vessem Outstanding Paper Award." From 1997 to 1999, he served as an Associate Editor of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: ANALOG AND DIGITAL SIGNAL PROCESSING and, in 1998, he served as a Guest Editor for IEEE JOURNAL OF SOLID-STATE CIRCUITS. From 2001 to 2006, he was an Associate Editor for the IEEE JOURNAL OF SOLID-STATE CIRCUITS and he is also a member of the technical program committees of ISSCC, ESSCIRC, and the Symposium on VLSI Circuits.