# An Ultra Low Power, Compact UWB Receiver with Automatic Threshold Recovery in 65 nm CMOS

Baradwaj Vigraham, Peter R. Kinget

Dept. of Electrical Engineering, Columbia University, New York (NY, USA)

Abstract—A compact UWB receiver operating at 4.85 GHz is presented in a 65 nm CMOS technology. This aggressively duty cycled, non-coherent OOK receiver occupies an active area of only 0.4 mm<sup>2</sup>, thanks to the use of few inductors and RF  $G_m$ -C filters. It achieves a sensitivity of -88 dBm at a data rate of 1 Mbps (for a BER of  $10^{-3}$ ) while consuming energy at a gradient of 450 pJ/bit from a 1.35 V supply. The receiver incorporates a single bit comparator based demodulator, with automatic threshold recovery for digitization.

*Index Terms*—CMOS integrated circuits, low-power electronics, receivers, RF, Active Filters, ultra-wideband (UWB), IR-UWB.

## I. INTRODUCTION

With the FCC opening a broad spectrum of 3.1-10.6 GHz for unlicensed UWB communication, low power radios employing IR-UWB (impulse radio UWB) have become ubiquitous in sensor networks, medical and other applications requiring low fidelity data transfer. These applications often run on scavenged energy, target communication over distances of 1-10 m, and focus on minimizing power consumption by trading off complexity [1]–[4].

UWB communication can broadly be classified into two types: FM-UWB, a two-step FM based approach [3] that operates similarly to narrowband radio, and IR-UWB, where the information is encoded in short pulses (< 4 ns duration) spaced apart at the data rate. The latter is the method of choice in the literature, mainly owing to its high sensitivity for a given energy consumption. The appeal of IR-UWB stems from its ability to shut down during radio silence and wake up when a pulse is expected.

The requirement that UWB transceivers power down in the absence of transmission, coupled with the short duration of the pulses, excludes the use of coherent downconversion in these receivers. Additionally, to reduce design complexity, a popular design choice for receiver frontends [1], [2], [4] is to use self-mixing (non-coherent) to downconvert the RF signal. This, in turn, dictates a multi-stage RF gain > 50 dB for a reasonable conversion gain in the self-mixer.
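The dependence of the conversion gain on the RF signal level can be illustrated with a minimal sketch, assuming an ideal square-law self-mixer followed by averaging as the low-pass filter. The Gaussian envelope, its width and the function name `self_mixer_baseband` are illustrative assumptions and are not taken from the receiver described here.

```python
import math

def self_mixer_baseband(amplitude, carrier_hz=4.85e9, pulse_ns=4.0, steps=4000):
    """Mean output of an ideal squarer driven by one Gaussian-enveloped RF pulse."""
    duration = pulse_ns * 1e-9
    sigma = duration / 6.0  # hypothetical envelope width, chosen for illustration
    dt = duration / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt - duration / 2.0
        envelope = math.exp(-0.5 * (t / sigma) ** 2)
        rf = amplitude * envelope * math.cos(2.0 * math.pi * carrier_hz * t)
        total += rf * rf  # self-mixing: the RF signal is multiplied by itself
    return total / steps  # averaging over the window stands in for the low-pass filter

weak_in, strong_in = 1e-3, 2e-3
weak = self_mixer_baseband(weak_in)
strong = self_mixer_baseband(strong_in)

# Conversion gain is defined here as baseband output divided by RF input amplitude.
gain_ratio = (strong / strong_in) / (weak / weak_in)
print(f"output ratio for 2x input amplitude: {strong / weak:.3f}")
print(f"conversion gain ratio for 2x input amplitude: {gain_ratio:.3f}")
```

In this idealized model the baseband output grows with the square of the input amplitude, so the conversion gain itself is proportional to the RF level presented to the mixer. This is one way to see why substantial RF gain is needed ahead of a self-mixer.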

The demodulation of the received pulses is usually performed with an ADC [2] or by slicing with an externally supplied threshold voltage [1]. While the former provides a reliable translation of the analog pulses into the digital domain for processing, it requires a high precision local frequency reference, making the solution proposed in [1] an attractive choice. However, the optimal threshold for slicing the received pulses changes with the signal power, and generating this threshold poses a significant challenge.

In this work, a 3-stage, low area RF frontend is proposed to achieve the desired gain. Two of the stages in the RF frontend are realized as  $G_m$ -C filters to conserve area. The signal is downconverted using a self-mixer, amplified and fed into a demodulator. In the demodulator, a novel threshold voltage recovery loop is used to slice the analog pulses to obtain an RZ digital representation of the channel.

The rest of the paper is organized as follows: The system architecture and the individual blocks are discussed in section II before presenting the measurement results in section III. Finally, this work is compared against other receivers in section IV.





Fig. 1. Symbol representation for OOK operation

The receiver architecture and the associated symbol representation used in this work are shown in figures 2 and 1, respectively. The RF front-end consists of 3 stages: LNA, STG1 and STG2. While the LNA uses inductors, STG1 and STG2 are realized as RF $G_m$-C filters. A Gilbert cell self-mixer is then used to downconvert the RF signal. Following this, the signal is amplified in the baseband using a VGA, and an analog demodulator is used to resolve the pulses into an RZ pulse stream.

By construction, the receiver operates asynchronously and does not rely on any local phase/frequency reference. As a result, during the start up of a packet (preamble), the receiver operates in a continuous mode and recovers a digital representation of the channel, to be processed by a digital backend.

### A. Source degenerated LNA exploiting mutual inductance

Fig. 2. A block diagram of the receiver and auxiliary testing circuits

The LNA is a source-degenerated (SD) LNA with an additional $\pi$-matching network, formed with the bondwire and pad capacitance, to operate across the lower UWB band (3.6-5.2 GHz). An SD-LNA relying on mutual inductance between the gate and source coils is proposed in [5]. In this work, a variant of this architecture is used, as shown in figure 3. The total gate and source inductance $(L_g + L_s)$ in a standard SD-LNA is realized with a single 5 turn, square spiral inductor. The outer 3 turns (the larger half) of the spiral are used to realize the gate inductance, while the inner 2 turns realize the source coil. Thus, the total inductance $(L_g + L_s)$ required to resonate the $C_{gs}$ of the device is realized using the equivalent 5 turn spiral inductor, thereby reducing the area of the LNA.
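As a rough illustration of the resonance condition mentioned above, the following sketch computes the total series inductance that resonates a given $C_{gs}$ at the operating frequency, using $\omega_0^2 (L_g + L_s) C_{gs} = 1$. The value of `C_GS` is a hypothetical placeholder, and the sketch ignores the $\pi$-matching network, the bondwire and the pad capacitance.

```python
import math

def total_inductance_for_resonance(c_gs_farads, f0_hz):
    """Total series inductance (L_g + L_s) that resonates C_gs at f0."""
    omega0 = 2.0 * math.pi * f0_hz
    return 1.0 / (omega0 ** 2 * c_gs_farads)

C_GS = 200e-15  # hypothetical gate-source capacitance, not a value from this work
F0 = 4.85e9     # operating frequency quoted in the abstract

l_total = total_inductance_for_resonance(C_GS, F0)
print(f"L_g + L_s = {l_total * 1e9:.2f} nH")
```

With these assumed numbers the required inductance is roughly 5.4 nH. In a tapped spiral the total includes the mutual coupling between the outer and inner turns, which is why a single coil can occupy less area than two separate inductors of the same combined value.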

Though the input of the LNA is broadband matched, the output of the LNA is designed to be narrowband with an LC tank load. The tank is made tunable using an accumulation/depletion-mode varactor bank rather than a switched capacitor bank, to improve its quality factor.



Fig. 3. A schematic representation of the mutual inductance based source degenerated LNA used in the receiver

### B. Compact, widely tunable RF $G_m - C$ filters

The self-mixing nature of the downconversion used in the receiver necessitates a large gain (> 50 dB) in the RF section to achieve a reasonable conversion gain in the mixer. The first stage of gain is provided by the LNA, discussed above. Following this, to constrain the area, two stages of tunable bandpass filtering and amplification are realized using two $G_m - C$ filters at the desired center frequency ($\approx$ 3-5 GHz). A schematic representation of these filters is shown in figure 4.

The first of the two active filter stages, labeled STG1, performs a single-ended to differential conversion in the current mode. This provides a twofold advantage: it improves the dynamic range of the filter and relaxes the stability constraints by forcing the parasitic coupling across stages to be common mode.<sup>1</sup> The second stage, labeled STG2, further amplifies the signal to achieve complete switching of the LO port of the self-mixer that follows.

The transconductors used in the filters are shown in figure 4. The transconductors that convert the input voltage signal into current (labeled $g_{m,in}$) are implemented as pseudo-differential cascoded (M3 and M2) common-source stages (M4 and M1) with current re-use. Transistor M0, operating at the edge of saturation, regulates the DC current in the transconductors via a replica circuit to reduce sensitivity to supply and process variations. An additional decoupling capacitor C<sub>decoup</sub> bypasses M0 at the signal frequencies. The gain of each stage is made tunable with a 3 bit control by placing multiple such transconductors in parallel. The gyrator consists of CMOS differential pair transconductors (labeled $g_{m,g}$) with a tunable negative resistance to adjust the quality factor of the filters. The parasitic capacitances of the transistors and the routing together form the capacitances for the gyrators. The frequency and the quality factor of the two stages can thus be individually varied through the bias current in the corresponding differential pairs using digital control bits.
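The tuning behavior described above can be sketched with the textbook gyrator-resonator relations, $f_0 = g_{m,g} / (2\pi\sqrt{C_1 C_2})$ and, for an equivalent parallel resonator, $Q = \omega_0 C / G_{net}$, where the negative resistance reduces the net loss conductance. All component values below are hypothetical, and lumping the losses at a single node is a simplification rather than a description of the actual circuit.

```python
import math

def gyrator_center_frequency(gm_siemens, c1_farads, c2_farads):
    """Center frequency of a two-transconductor gyrator loaded by C1 and C2."""
    return gm_siemens / (2.0 * math.pi * math.sqrt(c1_farads * c2_farads))

def resonator_q(f0_hz, c_farads, g_loss_siemens, g_neg_siemens):
    """Q of the equivalent parallel resonator after negative-resistance compensation."""
    g_net = g_loss_siemens - g_neg_siemens
    if g_net <= 0.0:
        raise ValueError("net conductance must stay positive, otherwise the filter oscillates")
    return 2.0 * math.pi * f0_hz * c_farads / g_net

GM_G = 1.5e-3    # hypothetical gyrator transconductance (S)
C_NODE = 50e-15  # hypothetical parasitic node capacitance (F)
G_LOSS = 0.5e-3  # hypothetical loss conductance at the node (S)

f0 = gyrator_center_frequency(GM_G, C_NODE, C_NODE)
q_uncompensated = resonator_q(f0, C_NODE, G_LOSS, 0.0)
q_compensated = resonator_q(f0, C_NODE, G_LOSS, 0.4e-3)
print(f"f0 = {f0 / 1e9:.2f} GHz, Q = {q_uncompensated:.1f} -> {q_compensated:.1f}")
```

Under these assumptions the center frequency scales with the gyrator transconductance, which is consistent with tuning the frequency through the differential pair bias current, while the negative resistance sets the quality factor independently.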

### C. Self-mixer & VGA

As shown in figure 5, the downconversion of the received signal is achieved by self-mixing and is implemented using a Gilbert cell mixer. In order to maximize the conversion gain, the output of STG2 is connected to the switching port of the mixer and that of STG1 to the linear port of the mixer. The output current of the mixer is converted to voltage by a TIA and a tunable OTA-R amplifier (VGA). The output of the VGA is then fed into the demodulator for threshold recovery and subsequent digitization.

<sup>1</sup>This is a serious concern in non-coherent receivers due to the cascade of multiple stages and large gain before the mixer.

Fig. 4. Schematic representation of the compact RF Gm-C filters used for STG1 and STG2



Fig. 5. A schematic representation of the self-mixer and baseband amplifiers

### D. Demodulator with automatic threshold recovery

A block level representation of the demodulator is shown in figure 6. The amplified baseband signal received from the VGA is first integrated using a continuous time integrator built using an OTA-RC architecture, to limit the noise bandwidth into the demodulator. Since the integration duration is precisely the duration of the pulse, a matched filtering of the received pulse is achieved, optimizing the signal and noise bandwidths. In order to compensate for offsets and the non-zero DC of an OOK waveform, an auto-zero loop is placed around the integrator. This suppresses the gain of the integrator at DC and at frequencies below 1 MHz, forming a bandpass filter and making the integrator insensitive to flicker noise.

In a parallel path, the signal from the VGA is compared with a coarse threshold voltage (adjustable digitally, labeled vth<sub>coarse</sub>) to get a digital view of the channel. This is achieved by using a continuous time comparator (a cascade of gain stages, labeled $S_1$) without the use of a sampling clock. Labeled as out<sub>aux</sub>, this signal represents the presence of a pulse in the channel when high. If desired, vth<sub>coarse</sub> can be adjusted to minimize the bit error rate of out<sub>aux</sub>. However, the BER achieved using this method is sensitive to the user-set threshold and the received signal power.
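The sensitivity of the BER to a fixed threshold can be illustrated with a simplified model. The sketch below assumes equiprobable OOK symbols and additive Gaussian noise at the slicer input, which is only an approximation for a non-coherent receiver, and the amplitudes and threshold are arbitrary values in units of the noise rms.

```python
import math

def q_function(x):
    """Tail probability of a standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ook_ber(amplitude, threshold, noise_rms):
    """BER of a threshold slicer for equiprobable OOK symbols in Gaussian noise."""
    miss = q_function((amplitude - threshold) / noise_rms)  # a '1' read as '0'
    false_alarm = q_function(threshold / noise_rms)         # a '0' read as '1'
    return 0.5 * (miss + false_alarm)

NOISE_RMS = 1.0
FIXED_THRESHOLD = 3.0  # hypothetical user-set threshold, tuned for amplitude 6

for amplitude in (6.0, 9.0, 12.0):
    fixed = ook_ber(amplitude, FIXED_THRESHOLD, NOISE_RMS)
    tracked = ook_ber(amplitude, amplitude / 2.0, NOISE_RMS)
    print(f"A = {amplitude:4.1f}: fixed threshold {fixed:.2e}, tracked threshold {tracked:.2e}")
```

In this model the fixed threshold is optimal only at the amplitude it was tuned for. As the signal grows, its BER is limited by false alarms on the '0' symbols, whereas a threshold that follows half of the symbol amplitude keeps improving.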

The threshold recovery is performed as follows. The output of the integrator is tracked on a capacitor $C_0$ using switches gated by out<sub>aux</sub>. Since out<sub>aux</sub> represents the duration of the pulse in the channel, the voltage sampled onto $C_0$ is the integral of the received baseband pulse and represents the symbol point in a constellation diagram of the receiver's output. The value sampled on $C_0$ is accumulated on another capacitor, $C_{inf}$, during the hold phase (out<sub>aux</sub> is low). The voltage accumulated on $C_{inf}$ over multiple symbols gives the average position of the symbol in the constellation diagram. The output of the integrator is then sliced at half of this recovered voltage (differentially, $v_{thp}$ and the common mode voltage $v_{cm}$ are used) using another comparator, S<sub>2</sub>. The demodulator outputs two digital RZ signals, out and out<sub>aux</sub>, which can be used to recover the transmitted clock and convert the data into an NRZ stream.
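The track-and-accumulate loop described above can be sketched as a behavioral model. The sketch assumes that the accumulation onto $C_{inf}$ happens by charge sharing with $C_0$ once per detected pulse, so that $C_{inf}$ holds a running average of the sampled symbol voltage. The capacitor ratio, the four-sample symbol waveform and the function names are hypothetical and chosen only for illustration.

```python
def recover_threshold(integrator_samples, out_aux, c0=1.0, c_inf=15.0):
    """Behavioral model of the loop; returns the slicing threshold after each sample."""
    share = c0 / (c0 + c_inf)  # charge-sharing ratio (capacitor values are hypothetical)
    v_c0 = 0.0
    v_cinf = 0.0
    previous_aux = False
    thresholds = []
    for v_int, aux in zip(integrator_samples, out_aux):
        if aux:
            v_c0 = v_int  # track phase: C0 follows the integrator output
        elif previous_aux:
            # hold phase begins: C0 shares its charge with C_inf once per pulse
            v_cinf += share * (v_c0 - v_cinf)
        previous_aux = aux
        thresholds.append(v_cinf / 2.0)  # slice at half of the averaged symbol voltage
    return thresholds

def symbol_waveform(bit, peak=1.0):
    """Integrator samples and out_aux flags for one symbol (toy 4-sample model)."""
    if bit:
        return [0.3 * peak, 0.6 * peak, peak, 0.0], [True, True, True, False]
    return [0.0] * 4, [False] * 4

samples, flags = [], []
for bit in [1, 0] * 100:
    s, f = symbol_waveform(bit)
    samples += s
    flags += f

thresholds = recover_threshold(samples, flags)
print(f"final slicing threshold: {thresholds[-1]:.3f}")
```

Because only intervals flagged by out<sub>aux</sub> update $C_0$, empty symbols do not pull the average down, and the threshold settles near half of the integrated pulse peak regardless of the absolute signal level.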



Fig. 6. A schematic representation of the demodulator with the automatic threshold recovery loop

## III. MEASUREMENT RESULTS

A die photograph of the receiver prototype with an active area of  $0.4 \text{ mm}^2$  including the testing circuits, is shown in figure 7. The die is  $1.1 \text{ mm}^2$  in area and includes a UWB transmitter for testing purposes.



Fig. 7. Die photo of the receiver front end

BER measurements performed on the UWB receiver are shown in figure 8. The solid curves represent the performance without interferers and the dashed curves the performance with a 2.4 GHz continuous wave interferer at -30 dBm. The sensitivity (at a BER of $10^{-3}$) measured on out<sub>aux</sub> with the user-set threshold voltage (BER<sub>out,aux</sub>) is -89 dBm, 1 dB better than the -88 dBm measured on out with the recovered threshold voltage (BER<sub>out</sub>). However, as expected, BER<sub>out</sub> is much less sensitive to the user settings than BER<sub>out,aux</sub>. This is evident in the measurements with interferers, where BER<sub>out,aux</sub> flattens out at large signal powers while BER<sub>out</sub> continues to improve with signal power.



Fig. 8. Measured BER under different conditions

## IV. COMPARISON TO STATE OF THE ART

The receiver designed in this work consumes 1.3 mW from a 1.35 V supply while operating at a data rate of 1 Mbps. A significant fraction of this power (0.85 mW) is consumed by the bias generation circuits. As shown in figure 2, additional circuitry is included to duty-cycle the bias generation circuits at a fraction of the data rate (duty-cycling ratio tunable from $\frac{5}{64}$ to $\frac{5}{512}$), with the bias voltages stored on capacitors as a form of analog memory. However, this could not be demonstrated on this receiver at the time of these measurements. A comparison of this receiver to other UWB receivers operating in the lower UWB band is shown in table I.

The measurements reported in section III were taken with the bias circuits in continuous operation, while duty-cycling the receiver at 3% on time. Since the backend clock recovery has not been implemented in this design, the pulse used to duty-cycle the receiver is provided externally. The commonly reported FoM of energy consumed per bit is the gradient of the power consumption with respect to the data rate and is measured to be 450 pJ/bit, similar to the calculation done in [2]. A comparison of the receivers including the idle power consumption is also included in table I.
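The relation between the reported power and energy figures can be checked with a short calculation. The sketch assumes a linear model, $P = P_{idle} + E_{bit} \cdot R$, and takes the idle power to be the 0.85 mW attributed to the bias generation circuits. Both the model and that identification are assumptions made here for illustration.

```python
import math

def receiver_power_w(data_rate_bps, idle_power_w, energy_per_bit_j):
    """Linear power model: a rate-independent idle term plus a per-bit energy term."""
    return idle_power_w + energy_per_bit_j * data_rate_bps

IDLE_POWER_W = 0.85e-3      # bias generation power, assumed to be the idle power
ENERGY_PER_BIT_J = 450e-12  # reported gradient of power versus data rate
DATA_RATE_BPS = 1e6
ON_TIME_FRACTION = 0.03     # receiver duty cycle used in the measurements

total_power_w = receiver_power_w(DATA_RATE_BPS, IDLE_POWER_W, ENERGY_PER_BIT_J)
energy_incl_idle_pj = total_power_w / DATA_RATE_BPS * 1e12
on_window_ns = ON_TIME_FRACTION / DATA_RATE_BPS * 1e9

print(f"total power: {total_power_w * 1e3:.2f} mW")
print(f"energy per bit including idle power: {energy_incl_idle_pj:.0f} pJ/bit")
print(f"receiver on-window per bit: {on_window_ns:.0f} ns")
```

Under this model the numbers are mutually consistent: 0.85 mW plus 450 pJ/bit at 1 Mbps gives the 1.3 mW total, which corresponds to the 1300 pJ/bit listed in table I when the idle power is included.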

## V. CONCLUSION

A compact UWB receiver with a small active area of $0.4 \text{ mm}^2$ has been demonstrated, thanks to active bandpass filters implemented using a $G_m - C$ topology. With the use of aggressive duty cycling and an automatic threshold recovery based demodulator, the receiver achieves a sensitivity of -88 dBm at 1 Mbps. From table I, it can be seen that this receiver achieves the lowest energy consumption among the receivers operating in the lower UWB band for the same sensitivity, even after including the idle power consumption.

TABLE I. PERFORMANCE SUMMARY

|                                                                 | [6]     | [2]                        | [4]  | This work                 |
|-----------------------------------------------------------------|---------|----------------------------|------|---------------------------|
| CMOS tech. (nm)                                                 | 90      | 90                         | 90   | 65                        |
| V<sub>dd</sub> (V)                                              | -       | 1.0                        | 0.65 | 1.35                      |
| Freq. band (GHz)                                                | 3.5-4.5 | 3-5                        | 3-5  | 4.85                      |
| Data rate (Mbps)                                                | 0.15    | 16                         | 0.1  | 1.0                       |
| Sensitivity (dBm) (at 10<sup>-3</sup> BER)                      | -86     | -76                        | -99  | -88                       |
| Receiver power (mW)                                             | 0.077   | 22.5                       | 0.25 | 0.45                      |
| Active area (mm<sup>2</sup>)                                    | 1.7     | 1.5<sup>2</sup>            | 2.2  | 0.4                       |
| **Normalized quantities**                                       |         |                            |      |                           |
| Energy consumed (pJ/bit)                                        | 500     | 1400 (2320)<sup>3</sup>    | 2500 | 450 (1300)<sup>3</sup>    |
| Sens. normalized to 1 Mbps (dBm) (at 10<sup>-3</sup> BER)       | -78     | -88                        | -89  | -88                       |
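The normalized sensitivities in table I can be reproduced with a simple scaling. The sketch assumes that the sensitivity scales by $10\log_{10}$ of the data rate ratio; this formula is not stated in the text, but it matches the tabulated values after rounding.

```python
import math

def normalize_sensitivity_dbm(sensitivity_dbm, data_rate_mbps, reference_mbps=1.0):
    """Scale a sensitivity to a reference data rate, assuming 10*log10 rate scaling."""
    return sensitivity_dbm + 10.0 * math.log10(reference_mbps / data_rate_mbps)

# (sensitivity in dBm, data rate in Mbps) as listed in table I
receivers = {
    "[6]": (-86.0, 0.15),
    "[2]": (-76.0, 16.0),
    "[4]": (-99.0, 0.1),
    "This work": (-88.0, 1.0),
}

normalized = {
    name: round(normalize_sensitivity_dbm(sens, rate))
    for name, (sens, rate) in receivers.items()
}
for name, value in normalized.items():
    print(f"{name}: {value} dBm at 1 Mbps")
```

With this scaling, receivers running slower than 1 Mbps are penalized and faster ones are credited, which puts the four designs on a common footing for comparison.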

## ACKNOWLEDGMENTS

This work was supported by Texas Instruments and NSF grant CCF-0964497. We would also like to thank ST Microelectronics for fabrication support, Integrand Software for licensing EMX, Berkeley Design Automation for licensing AFS, and Mr. Jayanth Kuppambatti for several helpful technical discussions.

## REFERENCES

- [1] M. Crepaldi *et al.*, "An Ultra-Low-Power interference-robust IR-UWB transceiver chipset using self-synchronizing OOK modulation," in *Proc. IEEE Int. Solid-State Circuits Conf. Digest of Technical Papers (ISSCC)*, 2010, pp. 226–227.
- [2] D. C. Daly *et al.*, "A Pulsed UWB Receiver SoC for Insect Motion Control," *IEEE J. Solid-State Circuits*, vol. 45, no. 1, pp. 153–166, 2010.
- [3] Y. Dong *et al.*, "A 9mW high band FM-UWB receiver front-end," in *Proc. 34th European Solid-State Circuits Conf. (ESSCIRC)*, 2008, pp. 302–305.
- [4] F. S. Lee and A. P. Chandrakasan, "A 2.5 nJ/bit 0.65 V Pulsed UWB Receiver in 90 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 42, no. 12, pp. 2851–2859, 2007.
- [5] M. H. Zarifi and J. Frounchi, "Design of an input matching network for RF CMOS LNAs using stack inductors," in *Proc. Int. Conf. Computer and Communication Engineering (ICCCE)*, 2008, pp. 672–676.
- [6] X. Y. Wang *et al.*, "A self-synchronized, crystal-less, 86μW, dual-band impulse radio for ad-hoc wireless networks," in *Proc. IEEE Radio Frequency Integrated Circuits Symp. (RFIC)*, 2011, pp. 1–4.

<sup>2</sup>Extrapolated from die photo and total reported area.

<sup>3</sup>Including the idle power consumption.