# A high-frequency CMOS multi-modulus divider for PLL frequency synthesizers

**Ching-Yuan Yang** 

Received: 14 January 2007/Revised: 20 February 2008/Accepted: 25 February 2008/Published online: 15 March 2008 © Springer Science+Business Media, LLC 2008

**Abstract** A high-frequency divide-by-256–271 programmable divider is presented, with improved timing of the multi-modulus divider structure and high-speed embedded flip-flops. The proposed D flip-flop and logic flip-flop use a fast pipeline technique that combines single-phase, edge-triggered, ratioed, high-speed circuit techniques. The circuits achieve high speed by reducing the capacitive load and sharing the delay between the combinational logic blocks and the storage elements, which also makes them suitable for realizing high-speed synchronous counters. The programmable divider using the proposed flip-flops is measured in a 0.25-$\mu$m CMOS technology, with the operating clock frequency reaching as high as 4.7 GHz under a supply voltage of 3 V.

**Keywords** Frequency synthesizers · High-speed dynamic circuits · Logic flip-flops · Multi-modulus dividers · Phase-locked loops

### 1 Introduction

To constitute a complete transceiver for modern wireless communication systems, the frequency synthesizer that generates the local oscillator (LO) signal is an indispensable building block. Wherever frequencies are translated, frequency synthesis is crucial to provide a clean, stable, and programmable LO signal. The phase-locked loop (PLL) is used as the frequency synthesizer in almost all wireless communication chipsets on the market. The PLL frequency synthesizer has the potential to combine high-frequency operation and low power. Generally, a PLL as shown in Fig. 1 mainly consists of four basic components: a phase detector, a loop filter, a voltage-controlled oscillator (VCO), and a frequency divider (including a prescaler) [1–3]. The phase detector compares the phase of the input signal against the divided phase of the VCO. The output of the phase detector is a measure of the phase difference between the two inputs. The difference voltage is then filtered by the loop filter and applied to the VCO. The control voltage on the VCO changes the frequency in the direction that reduces the phase difference between the input signal and the divided VCO signal. In summary, the signal is converted from phase to voltage by the phase detector, processed by the loop filter, and converted back to phase by the VCO. In the lock condition, the input and the divided output frequencies are exactly equal. By varying the divide ratio of the divider, the PLL can synthesize a new frequency based upon the reference input while retaining the stability, accuracy, and spectral purity of the original reference. The fractional-N division technique is additionally needed to achieve narrow channel frequency spacing. As shown in Fig. 1, the digital $\Sigma\Delta$-modulator generates a stream of integers M(t) that interpolates a fractional number corresponding to the control input [4, 5].

Fig. 1 The basic phase-locked loop

C.-Y. Yang (🖂) Department of Electrical Engineering, National Chung Hsing University, Taichung 402, Taiwan ROC e-mail: ycy@nchu.edu.tw

The VCO is, however, more critical because of its phase-noise performance, and is therefore implemented externally in many applications. Together with the VCO, the frequency divider operates at the highest frequency in a frequency synthesizer. The frequency divider is usually implemented as a digital counter, whose speed is limited in CMOS technology. A high-frequency PLL in CMOS technology therefore requires a robust divider (prescaler). Today, advanced CMOS techniques offering higher integration density, low power consumption, and high-speed capability are being developed in pursuit of a high-speed divider.
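As an illustration of the lock condition and of fractional-N interpolation in the loop of Fig. 1, the following minimal Python sketch averages a stream of integer divide ratios M(t). It assumes a first-order accumulator as the simplest stand-in for the digital $\Sigma\Delta$-modulator (the modulator order and structure are not specified in this article); the function names and the 10-MHz reference frequency are hypothetical.

```python
from fractions import Fraction


def accumulator_sequence(base, numerator, modulus, length):
    """Emit `length` integer divide ratios from a first-order accumulator.

    Assumption: a plain overflow accumulator stands in for the
    sigma-delta modulator. Each overflow selects base + 1, otherwise base.
    """
    acc = 0
    ratios = []
    for _ in range(length):
        acc += numerator
        if acc >= modulus:
            acc -= modulus          # carry out: divide by base + 1 once
            ratios.append(base + 1)
        else:
            ratios.append(base)
    return ratios


def average_ratio(ratios):
    """Exact average divide ratio of the stream."""
    return Fraction(sum(ratios), len(ratios))


# Hypothetical numbers: interpolate 256 + 3/8 from ratios 256 and 257.
seq = accumulator_sequence(base=256, numerator=3, modulus=8, length=8)
avg = average_ratio(seq)

f_ref_hz = 10_000_000               # hypothetical reference frequency
f_out_hz = avg * f_ref_hz           # in lock: f_out / N_avg == f_ref
```

With 3 carries in every 8 reference cycles, the average ratio is 256 + 3/8, so the locked output would sit at 256.375 times the reference frequency.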

Dynamic circuit techniques have evolved in the last few years into several high-speed CMOS circuit techniques such as Domino, NORA, and true single-phase clocking (TSPC) circuits. TSPC circuits require just one clock signal [6]. This has resulted in many complex CMOS circuits operating at clock frequencies from several hundred megahertz to more than 1 GHz [7–10]. In this article, a high-speed multi-modulus divider using a fast ratioed logic technique for dynamic D flip-flops (DFFs) and logic flip-flops (LFFs) is presented. The proposed flip-flops reduce the effective propagation delay, enabling high-speed dividers with a large division range.

This article is organized as follows. Section 2 describes the architecture of the multi-modulus divider. Section 3 shows how high-speed circuit design techniques are applied to the divider. Finally, Sects. 4 and 5 give the measured results and conclusions, respectively.

### 2 Architecture

To achieve a high-speed and low-power design, it is desirable to minimize the amount of circuitry operating at high frequency. The dual-modulus approach achieves such a structure, and has been successfully used in many high-speed, low-power designs [7–12]. Conventionally, a high-speed programmable divider is implemented using an extension of a dual-modulus divider in which the divide-by-2 sections are replaced with divide-by-2/3 blocks [4, 13]. Based on the dual-modulus topology, in this work a multi-modulus divider structure is designed as shown in Fig. 2. It consists of a synchronous divide-by-4/5 counter as the first stage, an asynchronous divide-by-64 counter, and several gates forming the control stage. The control block, which is combinational logic circuitry, generates *MC*, the signal that controls the division ratio (4 or 5) of the synchronous counter. Depending on the logic value at MC, the first-stage division ratio is 4 (MC = 0) or 5 (MC = 1). Its operation can be described as follows:

*Divide-by-4*: If signal MC is low, the active circuit consists simply of a cyclic shift register; FF3 is left out (output always high), and the input to FF1 is inverted.

*Divide-by-5*: If the input *MC* is high, the NAND function formed of the outputs of FF2 and FF3 is fed into FF1, resulting in a cyclic shift register with three high periods and two low periods.
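The two modes can be checked with a small behavioral model. The Python sketch below makes one assumption that is not stated in the text: the gating at the input of FF3. The text fixes the NAND of the FF2 and FF3 outputs feeding FF1 and the shift from FF1 to FF2; here FF3 is assumed to load the output of FF2 when MC = 1 and to be forced high when MC = 0. Function names are illustrative only.

```python
def step(state, mc):
    """Advance the three flip-flops (Q1, Q2, Q3) by one input clock edge."""
    q1, q2, q3 = state
    d1 = 1 - (q2 & q3)      # NAND of FF2 and FF3 outputs feeds FF1
    d2 = q1                 # FF2 follows FF1 (shift register)
    d3 = q2 | (1 - mc)      # assumed gating: MC = 0 forces FF3 high
    return (d1, d2, d3)


def steady_state_cycle(mc, start=(0, 0, 0), warmup=8):
    """Return the repeating state sequence reached after a warm-up."""
    state = start
    for _ in range(warmup):
        state = step(state, mc)
    first = state
    cycle = [first]
    state = step(state, mc)
    while state != first:
        cycle.append(state)
        state = step(state, mc)
    return cycle


div4 = steady_state_cycle(mc=0)     # FF3 stuck high, FF1 input inverted
div5 = steady_state_cycle(mc=1)     # NAND(Q2, Q3) fed into FF1

q1_high_periods = sum(q1 for q1, _, _ in div5)
q1_low_periods = len(div5) - q1_high_periods
```

Under this assumed gating, the model repeats every four input clocks with MC = 0 and every five with MC = 1, and the FF1 output shows three high and two low periods in the divide-by-5 mode, in line with the description above.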

The operation of the control block is explained as follows. With the inputs $D_3$–$D_0$ at logic zero, MC is zero, the synchronous counter performs divide-by-4 only, and the divider functions as a divide-by-256 counter. The signals $X_i$ are activated according to the asynchronous counter outputs $F_8$, $F_{16}$, $F_{32}$, and $F_{64}$. In this way, each signal $X_i$ generates a pulse lasting $2^i$ clock cycles within a 16-clock cycle of $F_4$, and the pulses do not overlap one another. For each signal $X_i$ whose modulus-control input $D_i$ is at logic one, the pulse is passed on, so that $Y_0$ is the combined signal lasting $n$ clocks, where $n = D_3 \cdot 2^3 + D_2 \cdot 2^2 + D_1 \cdot 2 + D_0$. By combining this with the signal $Y_1$, which is one-fourth of the 64-clock cycle of $F_4$, MC generates signal pulses of $n$ clocks in this 64-clock cycle. Each extra cycle in MC (swallowing a single clock period of $f_{in}$) makes the synchronous counter act as a divide-by-5 divider, so over the total cycle the divider functions as a divide-by-$(256 + n)$ divider. In short, with the modulus-control signal MC at logic one, the synchronous counter performs divide-by-5 $n$ times in a 256-clock cycle; afterward, divide-by-4 is resumed. Therefore, the divider can be programmed for divide-by-256–271 operation through the 4-bit modulus-selection inputs, and the divide ratio is given as

$$N = 256 + D_3 \cdot 2^3 + D_2 \cdot 2^2 + D_1 \cdot 2 + D_0 \tag{1}$$

where  $D_n$  is the programmable bit.
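The pulse-swallowing arithmetic behind Eq. 1 can be sketched in a few lines of Python. This is a simplified behavioral count, not a gate-level model: it assumes only that each output period spans 64 periods of $F_4$ and that the first-stage counter divides by 5 in exactly $n$ of them and by 4 in the rest. The placement of the divide-by-5 periods (set by the $X_i$ pulses in the real circuit) does not affect the total and is simplified here; all names are illustrative.

```python
def modulus_bits_to_n(d3, d2, d1, d0):
    """Number of swallowed input clocks per output period."""
    for bit in (d3, d2, d1, d0):
        if bit not in (0, 1):
            raise ValueError("modulus-selection inputs must be 0 or 1")
    return d3 * 8 + d2 * 4 + d1 * 2 + d0


def input_clocks_per_output_cycle(d3, d2, d1, d0, f4_periods=64):
    """Count input clocks in one output period of the divider."""
    n = modulus_bits_to_n(d3, d2, d1, d0)
    total = 0
    for k in range(f4_periods):
        mc = 1 if k < n else 0      # simplified placement of the MC pulses
        total += 5 if mc else 4     # divide-by-5 swallows one extra clock
    return total


# All 16 settings of D3..D0, from 0000 to 1111.
ratios = [
    input_clocks_per_output_cycle((n >> 3) & 1, (n >> 2) & 1, (n >> 1) & 1, n & 1)
    for n in range(16)
]
```

The count is $4 \cdot (64 - n) + 5 \cdot n = 256 + n$, so the 16 settings cover the ratios 256 through 271.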

As described above, the multi-modulus divider consists of the synchronous counter, the asynchronous counter, and the control circuit. The control signal *MC*, which determines the divide ratio of the high-speed synchronous divide-by-4/5 counter, is generated by the asynchronous counter and the control circuit. In this divider, the insertion delay ($t_{d,MC}$) of the control signal is the sum of the path delay ($t_{d,CB}$) of the control block and the propagation delay ($t_{d,AC}$) of the asynchronous counter; it can be represented as

$$t_{\rm d,MC} = t_{\rm d,AC} + t_{\rm d,CB} \tag{2}$$

Figure 3 presents the timing waveforms of the asynchronous counter and the combinational-logic control circuit. In the asynchronous six-stage counter, the output of each divider (TFF) lags the output of the previous one by a delay $t_{d,TFF}$, and each logic gate is assumed to have the same delay $t_{d,L}$. The rising edges of nodes $X_0$–$X_3$ then lag the input signal $F_4$ by delays that are given, respectively, by

$$t_{\rm d,X_0} = t_{\rm d,TFF} + t_{\rm d,L} \tag{3}$$

$$t_{\rm d,X_1} = 2 \cdot t_{\rm d,TFF} + t_{\rm d,L} \tag{4}$$

$$t_{\rm d,X_2} = 3 \cdot t_{\rm d,TFF} + t_{\rm d,L} \tag{5}$$

and

$$t_{\rm d,X_3} = 4 \cdot t_{\rm d,TFF} + t_{\rm d,L} \tag{6}$$
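Equations 3–6 follow a single pattern, $t_{\rm d,X_i} = (i + 1) \cdot t_{\rm d,TFF} + t_{\rm d,L}$. The short Python sketch below evaluates it for illustration; the delay values of 100 ps per TFF and 50 ps per gate are hypothetical placeholders, not measured or simulated figures from this work.

```python
def x_node_delay_ps(i, t_tff_ps, t_gate_ps):
    """Delay of the rising edge of X_i relative to F_4, per Eqs. 3-6."""
    if not 0 <= i <= 3:
        raise ValueError("only X_0..X_3 are defined")
    return (i + 1) * t_tff_ps + t_gate_ps


T_TFF_PS = 100      # hypothetical TFF propagation delay
T_GATE_PS = 50      # hypothetical logic-gate delay

delays_ps = [x_node_delay_ps(i, T_TFF_PS, T_GATE_PS) for i in range(4)]
skew_ps = max(delays_ps) - min(delays_ps)   # spread between X_0 and X_3
```

The spread between the earliest and latest $X_i$ edges is $3 \cdot t_{\rm d,TFF}$ regardless of the gate delay, which is the skew that the control signal has to tolerate.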

Path delays can create unwanted spikes on the control signal. To remove these spikes, a DFF latch is inserted as shown in Fig. 4, which also improves the control behavior and accuracy.

### 3 Circuit description

The operating speed of the divider in Fig. 4 is mainly limited by the divide-by-4/5 counter, which is the only part operating at the maximum frequency. To maximize the speed of such a divider, the DFFs and NAND gates in the synchronous counter have to be optimized together [14]. In many high-speed applications, dynamic flip-flops are adopted in divider designs. Also, to operate at high speed, it is important to reduce the effective capacitance of internal and external nodes, which reduces the propagation delay as well as the power consumption. Since charge leakage can lead to errors, especially in dynamic circuits, it is often worthwhile to provide additional circuitry to combat the problem [15].

Fig. 3 Timing diagram of the multi-modulus divider with propagation delay consideration

### 3.1 High-speed D flip-flop design

As shown in Fig. 5(a), the positive-edge-triggered TSPC DFF of Yuan and Svensson has been widely used; it is constructed from a P-C<sup>2</sup>MOS stage, an N-precharge stage, and an N-C<sup>2</sup>MOS stage [6]. This TSPC approach can be extended to realize the ratioed DFF in Fig. 5(b). When CLK = 0, the circuit is in *Hold Mode*. Since MN3 is off in this mode, node *b* is precharged to $V_{DD}$ through MP3. Thus, since both MN4 and MP4 are off, the data at node $\overline{Q}$ is held. The P-C<sup>2</sup>MOS stage now functions as a pseudo-inverter, and the data *D* is transmitted to node *a*. When CLK = 1, the circuit is in *Evaluation Mode*. If node *a* is 1 at this instant, node *b* is pulled down to 0 because MN2 and MN3 are turned on and MP3 is turned off, and MP4 then turns on. At this time node *C* becomes 0 and MN6 is turned off, so $\overline{Q}$ becomes 1. If node *a* is 0, node *b* is held at 1 because MN2 is off. Since node *C* is then 1 and MN6 is turned on, $\overline{Q}$ becomes 0; i.e., the data at node *b* is transmitted to node $\overline{Q}$.

The P-C<sup>2</sup>MOS stage of Fig. 5(a) consists of an n-channel transistor and two series p-channel transistors. The propagation delay of this stage is limited by the two series p-channel transistors: when the output node *a* is pulled up, the effective resistance seen at node *a* is twice that when pulling down. This can be improved by removing MP1 from the path. The result is a similar circuit called the *clocking pseudo-NMOS inverter*, the input stage in Fig. 5(b). While CLK is low (at ground), MP2 operates in the active region as a pass transistor. The clocking pseudo-NMOS logic then forms a pseudo-NMOS inverter, which is similar to the P-C<sup>2</sup>MOS logic. The pseudo-NMOS inverter uses the NMOS transistor to form the logic function and a single PMOS as a load. Logically, this is identical to the standard CMOS approach. The main difference is that the gate of the PMOS transistor MP2 is grounded, which gives

$$V_{\rm SGp} = V_{\rm DD}(\text{at node } a) \tag{7}$$

so that it is always in a conducting mode. To understand the DC characteristics, suppose that a logic 0 input with $V_{\rm in} < V_{\rm Tn}$ is applied, which places MN1 in cutoff. Since MP2 is normally on,

$$V_{\rm OH} = V_{\rm DD} \tag{8}$$

is achieved. If the input voltage of *D* is $V_{\rm DD}$, MN1 is switched on and the output node has a conducting path to ground. Unlike in a standard CMOS inverter, however, MP2 is still conducting. This prevents the voltage of node *a* from ever reaching 0; instead, the relative values of (*W/L*) set $V_{\rm OL}$ [15, 16]. With the input voltage equal to $V_{\rm DD}$, MN1 is non-saturated while MP2 is saturated (for a reasonable design with a low $V_{\rm OL}$). Equating the two drain currents gives

$$\frac{\beta_{\rm n}}{2} \left[ 2(V_{\rm DD} - V_{\rm Tn}) V_{\rm OL} - V_{\rm OL}^2 \right] = \frac{\beta_{\rm p}}{2} \left( V_{\rm DD} - \left| V_{\rm Tp} \right| \right)^2 \tag{9}$$

and we can get





$$V_{\rm OL} = (V_{\rm DD} - V_{\rm Tn}) - \sqrt{(V_{\rm DD} - V_{\rm Tn})^2 - \frac{\beta_{\rm p}}{\beta_{\rm n}} (V_{\rm DD} - |V_{\rm Tp}|)^2} \tag{10}$$
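The step from Eq. 9 to Eq. 10 can be filled in as follows, using only the symbols already defined. Multiplying Eq. 9 by $2/\beta_{\rm n}$ and collecting terms gives a quadratic in $V_{\rm OL}$:

$$V_{\rm OL}^2 - 2(V_{\rm DD} - V_{\rm Tn}) V_{\rm OL} + \frac{\beta_{\rm p}}{\beta_{\rm n}} \left( V_{\rm DD} - |V_{\rm Tp}| \right)^2 = 0$$

whose two roots are

$$V_{\rm OL} = (V_{\rm DD} - V_{\rm Tn}) \pm \sqrt{(V_{\rm DD} - V_{\rm Tn})^2 - \frac{\beta_{\rm p}}{\beta_{\rm n}} (V_{\rm DD} - |V_{\rm Tp}|)^2}$$

Eq. 10 corresponds to the smaller root, since MN1 was taken to be non-saturated, which requires $V_{\rm OL} < V_{\rm DD} - V_{\rm Tn}$.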

The value of  $V_{OL}$  is thus set by the driver-to-load ratio

$$\frac{\beta_{\rm n}}{\beta_{\rm p}} = \frac{K_{\rm n}(W/L)_{\rm n}}{K_{\rm p}(W/L)_{\rm p}} \tag{11}$$

where the NMOS transistor (MN1) acts as the logic driver with $\beta_n$, while the PMOS transistor (MP2) is used as an active load with $\beta_p$. Ratioed logic places constraints on the layout and design. For a workable inverter, we must have

$$\frac{\beta_{\rm n}}{\beta_{\rm p}} > \left(\frac{V_{\rm DD} - |V_{\rm Tp}|}{V_{\rm DD} - V_{\rm Tn}}\right)^2 \tag{12}$$

to keep the square root term real. Moreover, a small $V_{OL}$ requires that $(\beta_n/\beta_p) \gg 1$. From Eq. 9, the design equation can be obtained as

$$\frac{\beta_{\rm n}}{\beta_{\rm p}} = \frac{\left(V_{\rm DD} - |V_{\rm Tp}|\right)^2}{2(V_{\rm DD} - V_{\rm Tn})V_{\rm OL} - V_{\rm OL}^2} \tag{13}$$

This provides the minimum driver-to-load ratio needed for a desired $V_{OL}$. To improve the level of $V_{OL}$, the transistor MNX controlled by node *C* is added. While CLK is high (at $V_{DD}$), MP2 is cut off and the clocking pseudo-NMOS inverter has the same function as the P-C<sup>2</sup>MOS logic. As a result, the clocking pseudo-NMOS inverter can replace the P-C<sup>2</sup>MOS inverter for implementing arbitrary logic functions.
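To make the ratioed-logic design equations concrete, the following Python sketch evaluates Eq. 10 and Eq. 13 and checks that they are inverses of each other. The supply and threshold voltages used (2.5 V and 0.5 V) are hypothetical round numbers chosen for illustration, not process parameters from this work, and the function names are illustrative.

```python
import math


def v_ol(beta_ratio, vdd, vtn, vtp_abs):
    """Output-low voltage from Eq. 10; beta_ratio is beta_n / beta_p."""
    drive = vdd - vtn
    load = vdd - vtp_abs
    discriminant = drive ** 2 - (load ** 2) / beta_ratio
    if discriminant < 0:
        # Eq. 12 is violated: the driver is too weak for a valid low level.
        raise ValueError("beta_n / beta_p too small for a workable inverter")
    return drive - math.sqrt(discriminant)


def required_beta_ratio(v_ol_target, vdd, vtn, vtp_abs):
    """Driver-to-load ratio needed for a target V_OL, from Eq. 13."""
    return (vdd - vtp_abs) ** 2 / (
        2 * (vdd - vtn) * v_ol_target - v_ol_target ** 2
    )


# Hypothetical operating point (not taken from the fabricated circuit).
VDD, VTN, VTP_ABS = 2.5, 0.5, 0.5

ratio = required_beta_ratio(0.2, VDD, VTN, VTP_ABS)   # target V_OL = 0.2 V
v_ol_check = v_ol(ratio, VDD, VTN, VTP_ABS)           # should return 0.2 V
```

For these assumed values the required ratio is about 5.3, which illustrates why a low $V_{OL}$ calls for a driver considerably stronger than the load.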

In the input stage of Fig. 5(b), the single clocked pull-up PMOS has much lower resistance and capacitance than the series of stacked PMOS devices in the P-C<sup>2</sup>MOS logic of Fig. 5(a). The clocking pseudo-NMOS logic is ratioed logic. If the $W_p/W_n$ ratio is too large, the strong PMOS always keeps the output close to $V_{DD}$ when the clock signal is low. Therefore, the $W_p/W_n$ ratio of the clocked PMOS and the NMOS logic should be designed with care. Compared with P-C<sup>2</sup>MOS circuits, the pull-up resistance and capacitance of the clocking pseudo-NMOS stage are much smaller in a standard CMOS process. Thus, the clocking pseudo-NMOS logic can be faster than the P-C<sup>2</sup>MOS logic if the N-logic network is not heavy.







Fig. 7 Preamplifier stage

Similarly, the output stage of Fig. 5(b) is a clocking pseudo-PMOS structure derived from N-C<sup>2</sup>MOS, but its driving ability is much poorer than that of the pseudo-NMOS one. To increase the output swing and reduce the capacitance effect caused by the pseudo-PMOS, an NMOS transistor MN6 controlled by node *C* is inserted in the output stage. MN6 is turned off by node *a* via an added inverter before the end of the precharge, so that the discharge path for $\overline{Q}$ is safely cut off before MN4 is turned on at evaluation. Therefore, as shown in Fig. 5(b), a single-phase, edge-triggered, ratioed D flip-flop is adopted for the first-stage high-speed counter.

Since the D flip-flop proposed in [10] may malfunction as the clock frequency decreases, as described in [17], the output stage in this work introduces a feedback loop to improve the pull-up ability, which also provides a wide operating frequency range. The improved design uses the feedback loop to control the conduction of the PMOS in the output stage. In this circuit, MPX is biased by the output voltage of node *Q*. When $V_Q$ is low, the gate of MPX is near zero voltage, and current flows to help maintain the charge on $C_{out}$. If a discharge occurs in the output stage of the DFF, $V_{\overline{Q}}$ falls toward zero, and the gate voltage on MPX eventually charges to $V_Q = V_{DD}$ through the inverter. This drives MPX into cutoff and allows the rest of the discharge to proceed without hindrance.

### 3.2 High-speed logic flip-flop design and asynchronous counter design

To reduce the logic delay further, the logic flip-flop pipeline stage, in which the NAND gate and DFF are merged together, is adopted as shown in Fig. 6 [10]. The asynchronous counter consists of a chain of six TFFs. Its maximum operating frequency is lowered to one-fourth or one-fifth of the input frequency of the divide-by-4/5 counter, so the dynamic DFF in Fig. 5(a) is used there to reduce power consumption.

### 3.3 Preamplifier circuit

The preamplifier schematic is shown in Fig. 7; it includes an on-chip decoupling metal-insulator-metal (MIM) capacitor. The MIM capacitor has a small capacitance per unit area while providing a high quality factor in silicon. For this work, we fabricated the MIM capacitor using the metal-3 and metal-4 levels.

Fig. 8 Layout of the fabricated multi-modulus divider

Fig. 9 Measured maximum and minimum operating frequencies and current consumption of the multi-modulus divider versus supply voltage

Fig. 10 Minimum required input power versus supply voltage

### 4 Circuit implementation and experimental results

To verify the performance of the high-speed multi-modulus divider described above, the proposed circuit has been implemented in a 0.25-$\mu$m double-poly quaternary-metal (DPQM) N-well CMOS process. The resulting transistor sizes are shown in Table 1. Figure 8 shows the layout of the multi-modulus divider. The fabricated divider was tested to determine its maximum operating frequency, power consumption, and input sensitivity. Figure 9 shows the measured maximum and minimum clock frequencies of the multi-modulus divider with 0-dBm input power versus the supply voltage, as well as the corresponding current consumption. The minimum input signal power required by the divider, measured at operating frequencies of 1–3 GHz, is depicted in Fig. 10. Figure 11 shows the output waveforms for all the divide ratios at an operating frequency of 3 GHz and a supply voltage of 2 V. Table 1 summarizes the overall specifications of the proposed multi-modulus divider.

**Fig. 11** Measured output waveforms of the multi-modulus divider divided by 256–271 ($f_{in} = 3$ GHz, $V_{DD} = 2$ V). (**a**) The horizontal scale is 10 ns/div and the vertical scale is 1 V/div, and (**b**) the horizontal scale is 1 ns/div and the vertical scale is 1 V/div for output signals

### 5 Conclusion

In this article, a high-speed multi-modulus divider is discussed and implemented. A fast pipeline technique using high-speed, single-phase, edge-triggered, ratioed LFFs and DFFs is presented. The edge-triggered LFF and DFF, as well as their interfaces to the surrounding circuits, are simple. For synchronous counters, the circuits achieve high-speed capability by sharing the delay between the combinational logic blocks and the DFFs, which significantly reduces the cycle time of the pipeline stages. The high-speed multi-modulus divider is developed with the proposed DFFs and LFFs, and fabricated in a standard CMOS technology.

Acknowledgments The author would like to thank the SHARP Technology Company, Japan, for the fabrication of the chip, and Prof. S.-I. Liu, Department of Electronic Engineering, National Taiwan University, Taiwan, for discussion about this paper. This work was supported by the National Science Council, Taiwan, under Research Grant NSC-95-2220-E-005-003.

### References

- 1. Gardner, F. M. (1979). *Phaselock techniques* (2nd ed.). New York: Wiley.
- 2. Wolaver, D. H. (1991). *Phase-locked loop circuit design*. New Jersey: Prentice-Hall, Inc.
- 3. Razavi, B. (1996). *Monolithic phase-locked loops and clock recovery*. New Jersey: IEEE Press.
- 4. Perrott, M. H., Tewksbury III, T. L., & Sodini, C. G. (1997). A 27-mW CMOS fractional-N synthesizer using digital compensation for 2.5-Mb/s GFSK modulation. *IEEE Journal of Solid-State Circuits*, 32(12), 2048–2060.
- 5. Riley, T. A., Copeland, M. A., & Kwasniewski, T. A. (1993). Delta-sigma modulation in fractional-N frequency synthesis. *IEEE Journal of Solid-State Circuits*, 28(5), 553–559.
- 6. Yuan, J., & Svensson, C. (1989). High-speed CMOS circuit technique. *IEEE Journal of Solid-State Circuits*, 24(1), 62–70.
- 7. Huang, Q., & Rogenmoser, R. (1996). Speed optimization of edge-triggered CMOS circuits for gigahertz single-phase clocks. *IEEE Journal of Solid-State Circuits*, 31(3), 456–465.
- 8. Chang, B., Park, J., & Kim, W. (1996). A 1.2 GHz CMOS dual-modulus prescaler using new dynamic D-type flip-flop. *IEEE Journal of Solid-State Circuits*, 31(5), 749–752.
- 9. Larsson, P. (1996). High-speed architecture for a programmable frequency divider and a dual-modulus prescaler. *IEEE Journal of Solid-State Circuits*, 31(5), 744–748.
- 10. Yang, C.-Y., Dehng, G.-K., Hsu, J.-M., & Liu, S.-I. (1998). New dynamic flip-flops for high-speed dual-modulus prescaler. *IEEE Journal of Solid-State Circuits*, 33(10), 1568–1571.
- 11. Chi, B., & Shi, B. (2003). New implementation of phase-switching technique and its applications to GHz dual-modulus prescalers. *IEE Proceedings of the Circuits Devices and Systems*, 150(5), 429–433.
- 12. Yang, D.-J., & O, K. K. (2004). A 14-GHz 256/257 dual-modulus prescaler with secondary feedback and its application to a monolithic CMOS 10.4-GHz phase-locked loop. *IEEE Transactions on Microwave Theory and Techniques*, 52(2), 461–468.
- 13. Zarei, H., Shoei, O., Fakhrai, S. M., & Zakeri, M. M. (2000). A 1.4 GHz/2.7 V programmable frequency divider for DRRS standard in 0.6-µm CMOS process. *IEEE International Conference on Electronics, Circuits and Systems.*
- 14. Rogenmoser, R., & Huang, Q. (1996). An 800-MHz 1-µm CMOS pipelined 8-b adder using true single-phase clock logic-flip-flops. *IEEE Journal of Solid-State Circuits*, 31(3), 401–409.
- 15. Gu, R. X., Sharaf, K. M., & Elmasry, M. I. (1996). *High-performance digital VLSI circuit design*. Dordrecht: Kluwer Academic Publishers.
- 16. Uyemura, J. P. (1999). *CMOS logic circuit design*. Dordrecht: Kluwer Academic Publishers.
- 17. Sung, K., & Kim, L. (2000). Comments on "New dynamic flip-flops for high-speed dual-modulus prescaler". *IEEE Journal of Solid-State Circuits*, 35(6), 919–920.



Ching-Yuan Yang received the B.S. degree in Electrical Engineering from the Tatung Institute of Technology, Taipei, Taiwan, in 1990, and the M.S. and Ph.D. degrees in Electrical Engineering from National Taiwan University, Taipei, in 1996 and 2000, respectively. During 2000–2002, he was on the faculty of Huafan University, Taipei, Taiwan. Since 2002, he has been on the faculty of National Chung Hsing University, Taichung, Taiwan, where he is currently an Associate Professor with the Department of Electrical Engineering. His research interests are in the area of mixed-signal integrated circuits and systems for wireline and wireless communications.