## UC San Diego Electronic Theses and Dissertations

## Title

Synchronization at low SNR in MIMO communications

## Permalink

https://escholarship.org/uc/item/0vg896gp

## Author

Amde, Manish

## Publication Date

2010

Peer reviewed|Thesis/dissertation

#### UNIVERSITY OF CALIFORNIA, SAN DIEGO

#### Synchronization at Low SNR in MIMO Communications

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy

in

Electrical Engineering (Electronic Circuits and Systems)

by

Manish Amde

Committee in charge:

Professor Kenneth Yun, Chair; Professor Rene Cruz, Co-Chair; Professor William Hodgkiss; Professor Ryan Kastner; Professor Laurence Milstein; Professor Curt Schurgers

2010

Copyright Manish Amde, 2010. All rights reserved.

The dissertation of Manish Amde is approved, and it is acceptable in quality and form for publication on microfilm and electronically:

Co-Chair

Chair

University of California, San Diego

2010

DEDICATION

To my parents, Gaurishankar and Sharan Amde.

## TABLE OF CONTENTS

| Chapter | Title and sections |
|---------|--------------------|
| Signature Page | |
| Dedication | |
| Table of Contents | |
| List of Figures | |
| List of Tables | |
| Acknowledgements | |
| Vita and Publications | |
| Abstract of the Dissertation | |
| Chapter 1 | Introduction |
| | 1.1 Pilot and Packet-based Communications |
| | 1.2 Low SNR Communications |
| | 1.3 MIMO Communications |
| | 1.4 Synchronization |
| | 1.4.1 Direct-Sequence Spread-Spectrum |
| | 1.4.2 Code Acquisition |
| | 1.5 Radio Prototyping and Experimentation |
| | 1.6 Contributions |
| | 1.6.1 Parallel Code Acquisition |
| | 1.6.2 Synchronization System |
| | 1.6.3 Radio Prototyping and Experimentation |
| | 1.7 Dissertation Overview |
| Chapter 2 | Parallel Code Acquisition: Performance Analysis |
| | 2.1 System Model |
| | 2.1.1 Transmitter |
| | 2.1.2 Channel Attenuation and Noise Addition |
| | 2.2 Parallel Code Acquisition |
| | 2.2.1 Architecture |
| | 2.2.2 De-spreading |
| | 2.2.3 Post-detection Integration |
| | 2.2.4 System Receiver Operating Characteristics |
| | 2.3 Optimal Transmission Strategy in Multiple Transmit Antenna Systems |
| | 2.4 SISO Performance |
| | 2.4.1 Pilot-Based |
| | 2.4.2 Packet-Based |
| | 2.5 Multiple Antenna Performance |
| | 2.6 Summary |
| Chapter 3 | Parallel Code Acquisition: Architecture and Implementation |
| | 3.1 Energy Detection |
| | 3.2 Parallel De-spreading |
| | 3.3 Post-detection Integration |
| | 3.4 Parallel Code Acquisition |
| | 3.5 Prototype Implementation |
| | 3.5.1 Hardware Prototype |
| | 3.5.2 Experiment Settings |
| | 3.5.3 Results |
| | 3.6 Results of Offline Processing in a Controlled Environment |
| | 3.7 Summary |
| Chapter 4 | A Low SNR Synchronization System |
| | 4.1 System Architecture |
| | 4.2 System Design |
| | 4.2.1 System Overview |
| | 4.2.2 Spatial Range and Receiver Sensitivity |
| | 4.2.3 Packet Format |
| | 4.2.4 Channel Model |
| | 4.3 System Performance |
| | 4.4 Radio Prototype |
| | 4.5 Verification Testbed |
| | 4.5.1 Digital Baseband Test Setup |
| | 4.5.2 RF Test Setup |
| | 4.5.3 Hardware Test Results |
| | 4.6 Experimental Results |
| | 4.6.1 Calibration |
| | 4.6.2 Outdoor Experiments |
| | 4.7 Summary |
| Chapter 5 | Discussion and Conclusion |
| | 5.1 Future Work |
| Bibliography | |

### LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| Figure 1.1 | Packet format: The preamble is used for performing the synchronization operations such as code acquisition, carrier recovery and frame synchronization. The payload contains the useful data for communication. |
| Figure 1.2 | Basic MIMO packet-based transmitter and receiver. |
| Figure 1.3 | Direct-Sequence Spread-Spectrum. |
| Figure 2.1 | MIMO-DSSS system. |
| Figure 2.2 | MIMO parallel code acquisition system architecture. |
| Figure 2.3 | Pilot-based SISO performance in Rayleigh fading environment. |
| Figure 2.4 | Sliding correlator output for packet-based system. |
| Figure 2.5 | SISO performance for pilot and packet-based systems. |
| Figure 2.6 | Parallel code acquisition performance for $1 \times 1$ SISO, $1 \times 2$ SIMO, $2 \times 1$ MISO and $2 \times 2$ MIMO pilot-based and packet-based systems. |
| Figure 3.1 | Energy detection block using digitized baseband samples. |
| Figure 3.2 | Pipelined parallel de-spreading from the signal processing perspective. |
| Figure 3.3 | Pipelined VLSI architecture of the parallel de-spreader. |
| Figure 3.4 | FPGA implementation of the parallel de-spreader. |
| Figure 3.5 | Post-detection integration block diagram. |
| Figure 3.6 | Parallel code acquisition block diagram. |
| Figure 3.7 | Lab setup. |
| Figure 3.8 | Modified parallel code acquisition algorithm output observed every sampling cycle using Chipscope Pro at $-110\,\mathrm{dBm}$. |
| Figure 3.9 | Modified parallel code acquisition algorithm output every sampling cycle [...] signal strength [...] 110 dBm. |
| Figure 3.10 | Modified parallel code acquisition algorithm output every sampling cycle with continuous transmission and received input signal [...] |
| Figure 4.1 | MIMO synchronization system transmitter. |
| Figure 4.2 | MIMO receiver synchronization order for staggered transmission strategy. $\operatorname{Sync}_{ij}$ refers to synchronization of the $i^{th}$ transmit antenna at the $j^{th}$ receive antenna. |
| Figure 4.3 | Receiver block diagram. |
| Figure 4.4 | Frame format. |
| Figure 4.5 | Multipath channel model. |
| Figure 4.6 | System-level synchronization under AWGN. |
| Figure 4.7 | System-level synchronization under multipath Rayleigh fading. |
| Figure 4.8 | Block diagram of the radio prototyping platform. |
| Figure 4.9 | Radio prototype setup in lab. |
| Figure 4.10 | Digital baseband test setup. |
| Figure 4.11 | RF test setup. |
| Figure 4.12 | Simulation and verification results under AWGN. |
| Figure 4.13 | Simulation and verification results for multipath environment. |
| Figure 4.14 | Sensitivity experiment results. |
| Figure 4.15 | Experimental setup at the rooftop. |
| Figure 4.16 | Rooftop layout. |
| Figure 4.17 | SB performance at the two outdoor test sites. |
| Figure 4.18 | Library walk photo. |
| Figure 4.19 | UCSD Library Walk layout. |

### LIST OF TABLES

| Table | Caption |
|-------|---------|
| Table 3.1 | FPGA implementation details for the parallel de-spreader block. |
| Table 3.2 | FPGA implementation details for the PDI block. |
| Table 3.3 | FPGA implementation details for the parallel code acquisition block. |
| Table 3.4 | [...] acquisition block. |
| Table 4.1 | FPGA implementation details for the synchronization system. |

#### ACKNOWLEDGEMENTS

This dissertation would not have been possible without the contributions of many people.

I owe my deepest gratitude to Kenneth Yun, my advisor, whose guidance and faith in my abilities led to the fulfillment of this research. Ken has spent an incredible amount of time on my research. He has provided great technical insights and carefully sorted out the good ideas during our long discussions, been extremely patient with my writings, and never curbed my enthusiasm while I explored multiple research directions. At the same time, he has provided me with the clarity of thought and the focus to get work done. He has taught me the importance of implementation, without which I would have never experienced the joy of seeing my research come to life.

I am hugely indebted to Rene Cruz, my co-advisor, who has been the motivating factor behind this research. Rene has always provided a fresh perspective to look at hard problems, and his suggestions have often led to simple, elegant solutions. His strong theoretical background and his ability to grasp hardware constraints were crucial in the development of robust wireless algorithms. He has also been a great source of interesting research problems, and my scientific curiosity would have been unfulfilled but for him.

I would like to thank Laurence Milstein for his tremendous patience and support in laying down a solid theoretical foundation for my thesis. I have learned a lot from his digital communications classes which provided me with a theoretical platform upon which to build. Prof. Milstein has provided deep theoretical insights over numerous discussions, and has painstakingly read through innumerable iterations of my theoretical analysis. "Don't rush research" is a lesson that will always stay with me.

I would like to thank Joel Marciano for introducing me to the analog and RF world. Without him, over-the-air experimentation would have remained a pipe dream. He has spent many sleepless nights along with me in the lab getting the radio prototypes to work. His strong expertise and vast experience in communication systems design, along with his constant motivation, kept me going long after I had abandoned all hope of seeing the prototypes in action.

I would like to thank Curt Schurgers, William Hodgkiss and Ryan Kastner for serving on my committee and providing useful contributions.

I would like to thank past and present colleagues for their help. Jaewook Shim deserves a special mention for his help and support. His incredible work ethic and digital design expertise helped in the fast implementation of the complex synchronization system prototype. He has been very supportive over the last three years, and a wonderful source for solving any implementation problems. I would also like to thank Sushil Singh and Yoav Nebat who, as senior graduate students, provided valuable advice during the early stages of my research.

I would like to thank Luciano Lavagno and Supratik Chakraborty for introducing me to the world of academic research. Without them, I never would have applied to graduate school.

I would like to thank the ECE department, especially Gennie Miranda and M'Lissa Michelson, for helping me with the administrative affairs. I am also thankful to the Calit2 department for providing state-of-the-art lab facilities.

Shruti has supported me through thick and thin. Her selfless love helped me survive the tough times. Shomita, back home in India, has been a true best friend, taking up my responsibility in my absence. Suchit, Anish, Kiran, Mayank, Ankit, Saumya, Diwaker, Himanshu, Saurabh, Amit and Srikanth have been great friends in San Diego, and their constant bantering provided much-needed relief at the end of long days. Andy, Kanishka, Rahul, Andrea and Nishanth have taken care of my health by making me exercise despite my busy schedule. My raucous group of friends at the UCSD Cricket Club provided a great source of enjoyment and an avenue of relaxation away from work. I would also like to thank Shruti, Mayank and Andy for proof-reading my dissertation.

I consider myself lucky to have received guidance from wonderful teachers since my childhood. I would like to thank them all. I would also like to thank my extended family, especially my grandparents, for their affection and care.

Finally, I would like to thank my parents. Without their love, encouragement and sacrifices, none of this would be possible.

Chapter 2 is partially reprinted from the following paper: M. Amde, L. Milstein, K. Yun and R. Cruz, *Parallel Code Acquisition in MIMO Direct-Sequence Spread-Spectrum Communications*, to be submitted to the IEEE Transactions on Communications. The dissertation author is the primary author of this paper.

Chapter 3 is partially reprinted from the paper: M. Amde, J. Marciano, R. Cruz, and K. Yun, *Code acquisition at low SINR in spread spectrum communications*, Invited Paper in ISSSTA'06: The 9th International Symposium on Spread Spectrum Techniques and Applications, Manaus, Brazil, Aug. 2006. The dissertation author is the primary author of this paper.

Chapter 4 is partially reprinted from the two papers: M. Amde, J. Shim, J. Marciano, K. Yun, and R. Cruz, *A Low SINR Synchronization System for Direct-Sequence Spread-Spectrum Communications: Radio Prototype, Verification Testbed, and Experimental Results*, in Tridentcom'08: The 4th International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities, Innsbruck, Austria, Mar. 2008, and J. Shim, M. Amde, K. Yun, and R. Cruz, *Synchronization at low SINR in asynchronous direct-sequence spread-spectrum communications*, Best Paper Award in ICSNC'07: The Second International Conference on Systems and Networks Communications, Cap Esterel, France, Aug. 2007. The work was carried out jointly with Jaewook Shim in the ECE dept., UCSD.

VITA

| Years     | Degree                                                                 |
|-----------|------------------------------------------------------------------------|
| 1999-2003 | B.Tech., Electrical Engineering,<br>Indian Institute of Technology, Bombay. |
| 2003-2006 | M.S., Electrical Engineering,<br>University of California, San Diego.  |
| 2006-2010 | Ph.D., Electrical Engineering,<br>University of California, San Diego. |

#### PUBLICATIONS

M. Amde, L. Milstein, K. Yun and R. Cruz, *Parallel Code Acquisition in MIMO Direct-Sequence Spread-Spectrum Communications*, to be submitted to the IEEE Transactions on Communications.

M. Amde, J. Shim, J. Marciano, K. Yun, and R. Cruz, *A Low SINR Synchronization System for Direct-Sequence Spread-Spectrum Communications: Radio Prototype, Verification Testbed, and Experimental Results*, in Tridentcom'08: The 4th International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities, Innsbruck, Austria, Mar. 2008.

J. Shim, M. Amde, K. Yun, and R. Cruz, *Synchronization at low SINR in asynchronous direct-sequence spread-spectrum communications*, Best Paper Award in ICSNC'07: The Second International Conference on Systems and Networks Communications, Cap Esterel, France, Aug. 2007.

M. Amde, J. Marciano, R. Cruz, and K. Yun, *Code acquisition at low SINR in spread spectrum communications*, Invited Paper in ISSSTA'06: The 9th International Symposium on Spread Spectrum Techniques and Applications, Manaus, Brazil, Aug. 2006.

M. Amde, J. Marciano, S. Singh, C. Akin, R. Cruz, and K. Yun, *Packet detection and acquisition at low SINR in spread-spectrum based wireless communications*, in WCNC'06: Wireless Communication and Networking, Las Vegas, USA, Apr. 2006.

M. Amde, S. Singh, R. Cruz, K. Yun, J. Marciano, and C. Akin, *Design of a wireless transceiver for ad-hoc wireless networks*, (demo) in Mobiquitous 2005, San Diego, USA, July 2005.

S. Singh, V. Do, K. James, K. Yun, R. Cruz, and M. Amde, *QoS enabled broadband access through optical rings*, in LCN'04: The 32nd IEEE Conference on Local Computer Networks, Tampa, USA, Nov. 2004.

M. Amde, T. Felicijan, A. Efthymiou, D. Edwards and L. Lavagno, *Asynchronous on-chip networks*, chapter in the book System On Chip: Next Generation Electronics, edited by B. Al-Hashimi, published by IET, ISBN: 0-86341-552-0.

M. Amde, I. Blunno and C. Sotiriou, *Automating the design of an asynchronous DLX microprocessor*, in DAC'03: The 40th Design Automation Conference, Anaheim, USA, Jun. 2003.

#### ABSTRACT OF THE DISSERTATION

#### Synchronization at Low SNR in MIMO Communications

by

Manish Amde

Doctor of Philosophy in Electrical Engineering (Electronic Circuits and Systems)

University of California, San Diego, 2010

Professor Kenneth Yun, Chair; Professor Rene Cruz, Co-Chair

A key requirement for increasing the reliability, range and throughput of wireless communications is the ability to synchronize in a low signal-to-noise ratio (SNR) environment. This is particularly important in multiple-input multiple-output (MIMO) communications, where a separate synchronization needs to be performed for each transmit-receive antenna pair. Moreover, the SNR for synchronization in MIMO communications is generally lower than in the single-input single-output (SISO) case, since a fixed total transmit power is distributed among the multiple transmit antennas. Thus, synchronization is a potential bottleneck for performance improvements in future wireless communications.

This dissertation presents a synchronization architecture for packet-based MIMO communications. Specifically, it describes a direct-sequence spread-spectrum (DSSS) based synchronization system for improving the synchronization performance at low SNR, utilizing a parallel code acquisition scheme.

This dissertation presents the performance analysis for the packet-based SISO communications as well as for the pilot and packet-based MIMO communications. It proposes a *staggered* transmission strategy for the parallel code acquisition in systems with multiple transmitter antennas, and also presents the proof for its optimality. Furthermore, it describes an architecture for the parallel code acquisition and presents the implementation of the SISO acquisition system (which is a basic building block for the MIMO acquisition system) on a radio prototype. Finally, it reports the experimental results that confirm the reliable operation at low SNR.

The parallel code acquisition forms the backbone of the proposed MIMO synchronization system. This dissertation presents the performance analysis for the SISO synchronization system (which is a basic building block for the MIMO synchronization system) and describes its implementation on a radio prototype. Digital and RF tests verify the accurate translation of the synchronization system into hardware. Calibrations in the lab and experiments conducted at outdoor test sites confirm the ability to synchronize at low SNR.

# Chapter 1

# Introduction

Phenomenal strides have been made in the field of wireless communications during the last couple of decades, primarily due to the advances in communication theory, very-large-scale integration (VLSI) technology and radio frequency (RF) circuit design. Groundbreaking research in communication theory, such as multiple antenna communications (MIMO) [1–3], improved the reliability and the throughput of wireless communications. At the same time, the improvement in the VLSI technology, in line with Moore's law<sup>1</sup> [4], made the hardware implementation of these algorithms feasible. Finally, the advances in RF circuit design have enabled communication over a variety of radio frequencies at low power and wide spectral bandwidths.

Wireless communication has the potential to enable ubiquitous connectivity. However, the fickle nature of the wireless channel leads to variation in the signal level at the receiver making fast and reliable communication a challenge. Also, simultaneous transmissions over a shared wireless medium create interferences during the signal reception.

Wireless communication has brought about myriad societal benefits. Its uses include personal communications, telemedicine, video streaming, and data transfer. However, the current state-of-the-art in wireless communication is incapable of meeting the demands of the next-generation applications. Potential benefits of wireless communications and the challenges associated with communicating over the wireless medium are the motivations behind this dissertation.

<sup>1</sup>Gordon Moore's prediction in 1965 that the number of transistors on integrated circuits would double roughly every 18 months.

This dissertation treats both the theory and the implementation as equally important components for developing wireless systems. Without performing actual implementations, simplifying assumptions made in the theoretical analysis may lead to performance analyses that are far removed from the real world scenarios. Without a theoretical analysis, ad hoc approaches to hardware design could lead to sub-optimal, and often erroneous, implementations.

## 1.1 Pilot and Packet-based Communications

From a synchronization perspective, wireless physical layer communications can be broadly classified into two categories: pilot-based and packet-based communications. Pilot-based communication schemes, such as cellular communications, rely upon an omnipresent pilot for synchronization. On the other hand, packet-based communication schemes, such as wireless LANs (WLANs) [5], communicate asynchronously using packets and perform synchronization on a per-packet basis.

In the pilot-based system, a large number of radios (e.g., mobile phone users) synchronize to a base station using a dedicated channel for synchronization. The cost of spectrum for the common synchronization channel is amortized over a large number of users. However, such systems require a network infrastructure that is expensive to set up and unsuitable for fast deployment. The infrastructure also creates a single point of failure in the network. Since the radios need to be synchronized even in the absence of any data communication, the pilot-based approach could lead to an inefficient use of the spectrum and an increase in the battery power consumption at the radios. Finally, the omnipresent pilot makes the network conspicuous and prone to malicious attacks.

Packet-based communication, as shown in Figure 1.1, is used in networks which do not require a central authority for coordination, such as wireless ad hoc networks [6]. The decentralized nature of the communication network mitigates the aforementioned disadvantages of pilot-based systems. However, due to the absence of a pilot signal, the synchronization needs to be performed *per packet* (e.g., in 802.11-based wireless networks [5]). As shown in Figure 1.1, a preamble is attached ahead of the data payload for this purpose. Clearly, a large preamble would lead to a significant reduction in goodput, the actual data throughput experienced by the users [7]. Therefore, a fast and accurate synchronization is essential to derive the benefits of packet-based networks.

Figure 1.1: Packet format: The preamble is used for performing the synchronization operations such as code acquisition, carrier recovery and frame synchronization. The payload contains the useful data for communication.

## **1.2 Low SNR Communications**

The signal-to-noise ratio (SNR) is an important yardstick used in the characterization of the physical layer performance. A high SNR at the receiver allows an accurate synchronization. Various modulation formats can be used to exploit the high SNR available at the receiver for decoding, thus enabling high data rates. However, the fluctuating nature of the wireless channel (due to changes in the environment even when the transmitter and receiver are fixed) leads to a variation in the SNR at the receiver. A reduction in SNR can be caused by a long distance between the transmitter and the receiver, the communication channel being *in a deep fade* [1], or interference from adjacent wireless transmissions over the same frequency spectrum.

Wireless communication in the low SNR environment is becoming increasingly important, although the low SNR makes communication difficult. The ability to operate at low SNR<sup>2</sup> increases the spatial range and reliability of ad hoc networks by successfully receiving packets at low signal strength or under harsh interference. Moreover, in some applications, it may be desirable to support an extended spatial range even at the cost of the data rate, which facilitates a gradual degradation in throughput rather than allowing an abrupt disconnect at low SNR. A deliberate low SNR operation, achieved by transmitting at low power and reducing the interference to other radios, may also be beneficial for improving the overall throughput in wireless networks [8].

<sup>2</sup>The interference is assumed to be Gaussian and added to the noise term to calculate the SNR.

Once synchronized, error correction techniques can be used for a reliable estimation of transmitted data symbols even if a significant percentage of raw data symbols is erroneous because of the low SNR. However, the synchronization is difficult to achieve at low SNR. It is particularly difficult in packet-based communications since the time to synchronize is limited. Therefore, the synchronization presents a significant hurdle towards enabling packet-based communications at low SNR.

## **1.3 MIMO Communications**

Multiple-input and multiple-output (MIMO) [1–3] schemes are becoming increasingly popular for improving the spectral efficiency and the link reliability of wireless networks. The spectral efficiency can be improved using spatial multiplexing techniques, which offer a linear increase in the capacity with the minimum of the number of transmit and receive antennas (min(M, N), where M is the number of transmit antennas and N is the number of receive antennas) for a fixed bandwidth and fixed total transmit power [2,3]. The increase in SNR due to the array gain and diversity gain [2,3] can improve the link reliability. In addition, cooperative communication schemes are being proposed to leverage the advantages of MIMO by creating a virtual MIMO link through cooperation amongst multiple SISO transmitters and receivers [9]. MIMO schemes, therefore, can be used not only for increasing throughput at high SNR but also for improving reliability at low SNR.

A basic packet-based MIMO system is presented in Figure 1.2. On the transmitter side, the data is split into multiple transmit streams using space-time coding and interleaving. The transmit streams are packetized, i.e., synchronization preambles are attached to the data before transmission, and transmitted over the air using multiple RF upconverters and antennas.

(b) Packet-based MIMO receiver

Figure 1.2: Basic MIMO packet-based transmitter and receiver.

On the receiver side, the received signal on each of the antennas is downconverted. The received signal on each receive antenna consists of a sum of the signals transmitted over all the transmit antennas. Hence, M synchronizations need to be performed on each of the N downconverted signals leading to a total of M \* N synchronizations. The synchronization parameters, including channel estimates, are passed on to the space-time decoding and deinterleaving block to recover the original data.

MIMO publications, in most cases, assume perfect synchronization and ignore the synchronization system during the analysis [2, 3, 10-12]. The benefits of MIMO are seen after synchronization, i.e., after the symbol synchronization has been attained for all the multiple transmitted streams at each of the receiver antennas. After the synchronization, the average SNR per symbol achieved through an appropriate MIMO combining is typically higher compared to that obtained in the SISO communication [2,3], which is the main advantage offered by the multiple antenna processing. However, the multiple antenna transmission leads to a smaller available average SNR at the receiver for synchronization since the total transmit power is divided amongst the multiple antennas for a fixed total power constraint<sup>3</sup>. Moreover, simultaneous transmissions over multiple antennas add self-interference during the synchronization process, further degrading the performance. Clearly, an inability to achieve proper synchronization is a severe yet often neglected bottleneck that effectively prevents the full benefits of MIMO communication from being realized.

## 1.4 Synchronization

Wireless communication involves transmission and reception of information at radio frequencies. At the transmitter, the binary information is converted to baseband symbols which are used to modulate a radio frequency carrier wave. Synchronization is the process of accurately recovering the arrival time, the symbol rate and the carrier wave characteristics in order to successfully demodulate the received signal. It is the first operation that is performed at the receiver. Once the synchronization is achieved, various demodulation techniques can be used to accurately recover the transmitted binary information [1, 2].

<sup>3</sup>In the United States, the maximum EIRP allowed (the peak transmit power plus the isotropic antenna gain) per radio in the 2.4 GHz ISM band is +36 dBm.

The primary functions of synchronization are:

- *Timing synchronization*, which recovers the rate at which the transmitted symbols were generated at baseband.
- *Carrier synchronization*, which predicts the correct phase and frequency of the carrier used for transmission.
- *Frame synchronization*, which finds the correct arrival time of the received data.

In packet-based communications, these functions need to be performed in a brief amount of time (during the reception of the preamble). Thus, the synchronization at low SNR in packet-based communications is more difficult to achieve.

The current state-of-the-art packet-based systems generally use the same modulation for transmitting the preamble and the data payload. For example, the IEEE 802.11a wireless standard [5] uses an OFDM modulation for both the preamble and the payload. This may reduce the cost of implementation but could be sub-optimal for synchronization and, hence, lead to a reduction in the system performance. The criterion used for choosing a modulation for the payload (for example, throughput) can be different from the criterion required for synchronization (for example, reliability). Therefore, a decoupling of the preamble and payload modulation formats could lead to an improvement in the system performance.

#### 1.4.1 Direct-Sequence Spread-Spectrum

The low SNR synchronization can be achieved with a higher energy per symbol in the direct-sequence spread-spectrum (DSSS) modulation, that is, by increasing the spreading gain at the cost of lowering the symbol rate for a fixed spectral bandwidth [13]. The DSSS modulation also has a low probability of intercept and is highly resistant to jamming and interference [13]. Moreover, it can be used to counter frequency-selective fading and is suitable for multiple access of the allocated spectrum due to its unique spreading properties [13]. For these advantages, the DSSS modulation was chosen for this dissertation.

Figure 1.3 shows the basic operation of DSSS communication. Each of the binary data symbols is multiplied by a pseudorandom (PN) sequence [13, 14] consisting of m elements, referred to as *chips* [13, 15]. This leads to an expansion in the bandwidth of the signal – a process known as *spreading* [13, 15]. The multiplication of the spread signal by the same PN sequence at the receiver, a process known as *de-spreading* [13, 15], leads to a recovery of data symbols at the receiver.

The benefit of the DSSS scheme is more noticeable in the presence of noise at the receiver, as shown in the frequency domain representation of the received data. The de-spreading operation allows a recovery of the received signal embedded in the noise by improving the symbol SNR by a factor of m, also known as the *spreading gain* of the DSSS system.

It is important to note that the spreading gain can be obtained only when the PN sequence present in the received DSSS signal is aligned perfectly with the locally generated replica for an accurate de-spreading operation. The process of aligning the two PN sequences at the receiver is known as *code acquisition* [13] or *chip acquisition*. Once code acquisition is achieved, the rest of the synchronization operations can be performed in a standard fashion [15]. The post-de-spreading SNR for these operations is improved by a factor of the spreading gain, which leads to a better synchronization performance. Thus, the code acquisition is a crucial component for enabling a low SNR synchronization in the DSSS-based synchronization system. In Chapter 4, it will be shown that the performance of the code acquisition largely determines the overall synchronization performance.
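The spreading and de-spreading operations, and the need for code alignment, can be illustrated with a short sketch. It is not taken from the dissertation's implementation: the 13-chip Barker code merely stands in for the PN sequence, the function names are hypothetical, and noise is omitted so that the correlator outputs are exact.

```python
# Illustrative sketch only: a 13-chip Barker code stands in for the PN sequence.
PN = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

def spread(symbols, pn):
    """Multiply every +/-1 data symbol by the whole PN sequence (1 symbol -> m chips)."""
    return [s * c for s in symbols for c in pn]

def despread(chips, pn, offset=0):
    """Correlate each block of m chips with the local replica, cyclically
    shifted by `offset` chips (offset=0 means perfect code alignment)."""
    m = len(pn)
    local = pn[offset:] + pn[:offset]
    return [sum(x * c for x, c in zip(chips[i:i + m], local))
            for i in range(0, len(chips), m)]

symbols = [1, -1, -1, 1]
chips = spread(symbols, PN)
aligned = despread(chips, PN, offset=0)     # correlation peak of magnitude m
misaligned = despread(chips, PN, offset=3)  # local replica off by three chips
print(aligned, misaligned)
```

With perfect alignment every correlator output has magnitude m = 13, whereas the misaligned replica produces outputs of magnitude 1 for this particular code, so the spreading gain is lost until the code phase has been acquired.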



Figure 1.3: Direct-Sequence Spread-Spectrum.

#### 1.4.2 Code Acquisition

The code acquisition requires testing all possible hypotheses for a correct alignment of the spreading code in the received signal with the local PN correlator used at the receiver for the de-spreading operation. This hypothesis testing was traditionally carried out in a serial manner [13,16,17], as the hardware complexity, cost and size prohibited the use of parallel correlators to reduce the acquisition time. It is interesting to note that the tradeoff between the acquisition time and the hardware complexity was mentioned a couple of decades ago [13], when the authors hinted at more parallel implementations in the future with an improved VLSI technology. Since a brief synchronization time is vital for the packet-based systems, parallel code acquisition will be focused upon in this dissertation.

Most of the prior work on code acquisition has focused on pilot-based SISO communication. An explanation of both serial and parallel code acquisition has been provided in [15]. An in-depth analysis of serial code acquisition has been performed in [18,19]. A survey of parallel code acquisition using estimation techniques is presented in [20]. Performance analysis of parallel acquisition over frequency-selective fading channels has been performed in [21,22]. Parallel acquisition in the presence of Doppler shift and data modulation has been investigated in [23].

The parallel acquisition analysis has been extended to include multiple antennas at the receiver in [24–29]. Hanzo's recent work includes multiple transmit antennas [30, 31]; however, it performs the analysis only for the serial acquisition. Based upon an extensive literature survey, the author believes that no prior work exists on the parallel code acquisition in packet-based SISO systems or in pilot-based and packet-based MIMO systems; the performance analysis for these systems is presented in this dissertation.

## **1.5** Radio Prototyping and Experimentation

After theoretical analyses and system simulations, the wireless systems are prototyped on reconfigurable hardware testbeds. These hardware testbeds provide a platform for experimentation and performance evaluation under real-world constraints. These constraints include issues such as circuit parasitics, fixed word length effects, dynamic range limitations, noise, and other physical impairments. These hardware platforms typically include radio frequency (RF) analog front end electronics, a fully configurable digital backend using Field Programmable Gate Arrays (FPGA) or Digital Signal Processors (DSP) along with relevant test and measurement equipment, such as logic analyzers, spectrum analyzers, and signal analyzers. Advances in analog-to-digital interface circuits have motivated Software-Defined Radio (SDR) techniques where a portion of the analog front end functionality is implemented in digital logic. The availability of these reconfigurable hardware testbeds and associated techniques has enabled further innovation in wireless communications research by bridging the gap between theory and practice.

Over-the-air experiments provide insights into the performance of the algorithms in real-world operating conditions. First and foremost, these experiments provide a proof-of-concept: an essential component of any engineering research. Experiments in different outdoor environments provide performance results over a variety of wireless channels. This can point out potential flaws in the assumptions made during theoretical analysis and system simulations, thus improving the quality of the wireless algorithms over multiple theory-to-practice iterations.

Keeping the aforementioned advantages in mind, the wireless systems proposed and analyzed in this dissertation have also been prototyped. The prototypes were calibrated in the lab to ensure that the implementation conforms to the design specifications. Finally, over-the-air experiments were carried out to understand the performance in real-world scenarios.

## **1.6** Contributions

This dissertation presents a low SNR synchronization system for MIMO communications. It covers all aspects of a communications system design: theoretical analysis, system simulations, prototype implementation and real-world experimentation; thus, going from "theory to practice." The main contributions of this dissertation are described below.

#### **1.6.1** Parallel Code Acquisition

Based on an extensive literature search, the author believes that the performance analysis for the parallel code acquisition for MIMO-DSSS systems does not exist in the literature. This dissertation presents the performance analysis for the pilot-based and the traditionally overlooked packet-based MIMO-DSSS systems. It also presents and proves the optimality of using a staggered transmission strategy for parallel code acquisition in DSSS systems with multiple transmit antennas.

On the implementation side, this dissertation presents the architecture for the parallel code acquisition and shows the feasibility of its implementation in current VLSI technology. It also reports the results of the experiments conducted to verify the reliable operation of the parallel code acquisition algorithm under low SNR.

#### 1.6.2 Synchronization System

This dissertation presents the design of a synchronization front-end for MIMO communications. The synchronization system uses staggered preambles modulated using DSSS for transmission and is designed for low SNR synchronization. The parallel code acquisition algorithm forms the backbone of the synchronization system. The performance of the synchronization system is verified using simulations for different channel conditions.

On the implementation side, this dissertation presents the architecture for the synchronization system and shows its implementation for a single transmit-receive antenna pair: the building block for the MIMO system. Digital and RF tests are performed to verify the correctness of the implementation on the reconfigurable hardware. Controlled experiments are performed in the lab to calibrate the performance of the synchronization system. Finally, experiments at outdoor test sites are conducted to confirm the reliable operation at low SNR.

#### **1.6.3** Radio Prototyping and Experimentation

This dissertation provides several examples for the prototyping of the physical layer of the wireless systems. It provides a detailed description of the various reconfigurable prototyping platforms used for prototyping the algorithms. It also provides the details about the test-and-measurement instruments used for the verification of the prototype implementations. It also presents the rationale behind the various experiments conducted to verify the performance of the parallel code acquisition as well as the synchronization system prototypes.

This dissertation will serve as a reference for researchers who plan to implement their algorithms on radio prototyping platforms in the future. It also hopes to motivate other researchers to develop further insight by testing their algorithms via over-the-air experiments.

## 1.7 Dissertation Overview

The rest of this dissertation is organized as follows:

Chapter 2 investigates the performance of the parallel code acquisition for single (SISO) and multiple (MIMO) antenna based direct-sequence spread-spectrum (DSSS) communication. The performance of the parallel code acquisition scheme is analyzed for two scenarios: a separate pilot channel used for acquisition, and a preamble used for acquisition in packet-based communication. Also presented is a staggered transmission strategy for the optimal parallel code acquisition in a DSSS system with multiple transmit antennas.

Chapter 3 presents the architecture of the MIMO parallel code acquisition system. It provides a detailed description of the micro-architectures for the hardware intensive aspects of the parallel code acquisition. It also describes the prototype implementation for one transmit-receive antenna pair which is the building block for the MIMO system. Finally, experiments are conducted to verify the operation at low SNR and the results are reported.

Chapter 4 presents a synchronization system based upon DSSS modulation for reliable synchronization at low SNR in MIMO systems. It is based upon the parallel code acquisition algorithm described in Chapters 2 and 3. It describes the implementation of the SISO synchronization system, the building block for the MIMO system, on a radio prototype. The implementation is verified on a hardware testbed capable of emulating wireless channels. After the verification, the system performance is calibrated using controlled experiments in the laboratory. Finally, it reports the results of the experiments conducted at two outdoor test sites to observe the performance of the system over real wireless channels.

Chapter 5 provides a discussion of the main ideas in the dissertation vis-à-vis the results presented in the earlier chapters. It also discusses the potential for future research work in the area of synchronization. Finally, it presents the concluding remarks.

# Chapter 2

# Parallel Code Acquisition: Performance Analysis

This chapter presents the performance analysis for parallel code acquisition in MIMO-DSSS systems. The acquisition performance is discussed not only for the pilot-based DSSS systems, but also for the traditionally overlooked packet-based DSSS systems.

Section 2.1 describes the MIMO-DSSS system model. Section 2.2 shows a theoretical analysis of the parallel code acquisition performance for the pilot-based system. Section 2.3 describes the staggered transmission strategy for the optimal parallel code acquisition in the DSSS system with multiple transmit antennas. Sections 2.4 and 2.5 show the parallel code acquisition performance results for pilot and packet-based systems with single and multiple transmit antennas, respectively. Finally, the conclusions for this chapter are presented in Section 2.6.

## 2.1 System Model

#### 2.1.1 Transmitter

Figure 2.1 represents a MIMO direct-sequence spread-spectrum (MIMO-DSSS) system with M transmit antennas and N receive antennas. All the transmit antennas are assumed to transmit over the same frequency spectrum. The preamble transmission from each transmit antenna is given by Equation (2.1):

$$s_i(t) = \sqrt{2P x_i(t)}\, c_i(t) \cos(\omega_c t) \quad \forall i \in \{1, \ldots, M\} \tag{2.1}$$

where P is the maximum average power that can be transmitted over the available spectrum, $c_i(t)$ is a spreading sequence, $x_i(t)$ is a power scaling factor at each transmit antenna, and $\omega_c$ is the carrier frequency.

Figure 2.1: MIMO-DSSS system.

In a pilot-based system, the pilot is a spreading sequence that is repeated continuously to aid code acquisition. However, the code acquisition can be performed only during the reception of the preamble in the packet-based system. Let the part of the preamble used for code acquisition in the packet-based system consist of L repetitions of a spreading sequence. For ease of analysis, we assume that L is divisible by M such that L = MR, where R is a positive integer.

The total transmit power constraint over all the transmit antennas is ensured by choosing the power scaling factors such that

$$\sum_{i=1}^{M} x_i(t) = 1 \quad \forall t \in (0, MR) \tag{2.2}$$

Each $x_i(t)$ is assumed to be constant over one symbol period $T_s$, but can vary over the length of the preamble. We introduce a power scaling sequence $X_i$, comprising MR elements $(x_i^1, x_i^2, \ldots, x_i^{MR})$, which denotes the value of $x_i(t)$ over each transmitted symbol of the preamble.
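A small sketch can make the power scaling sequences $X_i$ and the constraint of Equation (2.2) concrete. Both schedules below are illustrative assumptions: the equal-power split is the obvious simultaneous allocation, while the one-antenna-at-a-time schedule is only a hypothetical example of a time-varying allocation and should not be read as the definition of the staggered strategy given in Section 2.3. The function names are likewise hypothetical.

```python
from fractions import Fraction

def equal_power(M, R):
    """All M antennas transmit every one of the MR symbols with x_i^l = 1/M."""
    return [[Fraction(1, M)] * (M * R) for _ in range(M)]

def one_antenna_at_a_time(M, R):
    """Hypothetical schedule: antenna i transmits at full power during its own
    block of R consecutive symbols and is silent otherwise."""
    return [[Fraction(int(l // R == i)) for l in range(M * R)] for i in range(M)]

def meets_power_constraint(X):
    """Check Equation (2.2): the scaling factors sum to 1 in every symbol period."""
    return all(sum(column) == 1 for column in zip(*X))

M, R = 2, 3
for name, X in (("equal", equal_power(M, R)),
                ("one-at-a-time", one_antenna_at_a_time(M, R))):
    print(name, [[str(x) for x in row] for row in X], meets_power_constraint(X))
```

Exact fractions are used so that the sum in Equation (2.2) can be compared with 1 without floating-point tolerance.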

The spreading sequence, $c_i(t)$, consists of m chips $\{c_i^j\}_{j=1}^m$ where $c_i^j \in \{-1, +1\}$. The chipping period $T_c$ is given by $T_c = T_s/m$. We assume that $c_i(t)$ has the following properties $\forall i \in \{1, \ldots, M\}$ and integer k:

$$\int_{0}^{T_{s}} c_{i}(t)c_{i}(t-kT_{c})\, dt = mT_{c} \quad \forall\, (k \bmod m = 0) \tag{2.3}$$

$$\int_{0}^{T_{s}} c_{i}(t)c_{i}(t-kT_{c})\, dt = 0 \quad \forall\, (k \bmod m \neq 0) \tag{2.4}$$

Equations (2.3) and (2.4) describe the autocorrelation properties of a single sequence. Equation (2.4) is asymptotically true for long sequences but generally has a low value (a constant value of $-T_c$ in m-sequences [14]) for short sequences. The crosscorrelation sequence $\rho_{ijk}$ with an offset of k chips between spreading codes $c_i$ and $c_j$ is given by $\{c_i^1 c_j^{k+1}, c_i^2 c_j^{k+2}, \ldots, c_i^{m-k} c_j^{m}, c_i^{m-k+1} c_j^{1}, \ldots, c_i^m c_j^k\}$, and its $f^{\text{th}}$ element is denoted by $\rho_{ijk}^f$.
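The autocorrelation properties can be checked numerically for a short m-sequence. The sketch below is an illustration under stated assumptions: the generator polynomial $x^5 + x^2 + 1$, the initial register state and the function names are choices made here, not parameters taken from the dissertation. The correlation is computed at one sample per chip, so the values are in units of $T_c$.

```python
def m_sequence(degree=5, taps=(0, 2)):
    """Generate one period (2**degree - 1 chips) of an m-sequence from the
    recurrence a[n + degree] = XOR of a[n + t] for t in taps.
    The defaults correspond to the primitive polynomial x^5 + x^2 + 1."""
    bits = [1] + [0] * (degree - 1)           # any non-zero initial state works
    while len(bits) < 2 ** degree - 1:
        n = len(bits) - degree
        new_bit = 0
        for t in taps:
            new_bit ^= bits[n + t]
        bits.append(new_bit)
    return [1 - 2 * b for b in bits]           # map bit 0 -> +1, bit 1 -> -1

def periodic_autocorrelation(chips, k):
    """Discrete counterpart of Equations (2.3) and (2.4), in units of T_c."""
    m = len(chips)
    return sum(chips[j] * chips[(j + k) % m] for j in range(m))

code = m_sequence()
acf = [periodic_autocorrelation(code, k) for k in range(len(code))]
print(acf[0], set(acf[1:]))
```

The zero-shift value equals m = 31, matching Equation (2.3), while every other shift gives -1, which corresponds to the constant $-T_c$ mentioned above rather than the idealized zero of Equation (2.4).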

#### 2.1.2 Channel Attenuation and Noise Addition

The transmitted signal is attenuated over the channel and additive white Gaussian noise (AWGN) is added at the receive antenna. Since the carrier recovery is performed after the acquisition, there exists a frequency mismatch between the oscillators used at the transmitter and receiver, leading to imperfect downconversion. This can be represented as an SNR loss for small frequency offsets [15]. There is a graceful degradation in the parallel code acquisition performance with increasing frequency offset $\Delta \omega_c$ as long as the condition $\Delta \omega_c T_s \ll \pi$ is satisfied [15]. A zero frequency offset has been assumed for ease of analysis.
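To give a feel for why the condition $\Delta \omega_c T_s \ll \pi$ matters, the sketch below uses a common textbook model, assumed here and not necessarily the expression used in [15]: the correlator integrates coherently over one symbol period, so a residual offset scales the signal amplitude by $\sin(x)/x$ with $x = \Delta \omega_c T_s / 2$. The function name and the sample offsets are hypothetical.

```python
import math

def coherent_loss_db(delta_omega_ts):
    """SNR loss in dB after coherent integration over one symbol period,
    for a normalized offset delta_omega_ts = (frequency offset in rad/s) * T_s.
    Assumed model: amplitude scales by sin(x)/x with x = delta_omega_ts / 2."""
    x = delta_omega_ts / 2.0
    if x == 0.0:
        return 0.0
    return -20.0 * math.log10(abs(math.sin(x) / x))

for fraction_of_pi in (0.01, 0.1, 0.5, 1.0):
    loss = coherent_loss_db(fraction_of_pi * math.pi)
    print(f"offset * T_s = {fraction_of_pi:>4} * pi  ->  loss = {loss:.3f} dB")
```

Under this assumed model the loss stays well below 0.1 dB while the normalized offset is a tenth of $\pi$, and grows to roughly 3.9 dB when the offset reaches $\pi$, which is consistent with the graceful degradation described above.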

The received signal before downconversion at each receive antenna is given by

$$r_j(t) = \sum_{i=1}^{M} h_{ij}(t) s_i(t - \tau_{ij}(t)) + n_w(t) \tag{2.5}$$

where  $j \in \{1, 2, ..., N\}$ ,  $n_w(t)$  is AWGN with two-sided spectral density  $N_0/2$ ,  $h_{ij}(t) = |h_{ij}(t)|e^{j\theta_{chan}(t)}$  represents the impulse response (attenuation) of a flat fading wireless channel, and  $\tau_{ij}(t)$  is a random delay between the  $i^{\text{th}}$  transmit antenna and the  $j^{\text{th}}$  receive antenna. The coherence time [1, 2] of the channel is assumed to be longer than the reception of the preamble. Hence,  $h_{ij}(t)$  and  $\tau_{ij}(t)$  remain constant over the duration of the entire preamble. For notational convenience, the time dependence of these variables is not shown in the following equations.

In a frequency-selective fading channel [1,2], the use of the spread-spectrum modulation leads to multiple resolvable paths at the receiver. Each of these paths can be acquired using the same parallel code acquisition architecture employed in the flat fading [1,2] scenario. However, the performance analysis will also depend upon the multipath intensity profile [32] as well as the joint fading characteristics of the paths. We will concentrate only on the flat fading channel in the rest of the chapter.



(a) MIMO parallel code acquisition system.



(b) Parallel code acquisition block.



(c) Energy detection block.

Figure 2.2: MIMO parallel code acquisition system architecture

## 2.2 Parallel Code Acquisition

The analysis in this section is performed only for the pilot-based scenario where the spreading codes are always available for acquisition. Due to the unknown arrival time, the theoretical analysis of the acquisition during the reception of the MIMO-DSSS packet became intractable. Thus, the packet-based acquisition was analyzed via simulations. However, the pilot-based analysis will provide an accurate prediction of the packet-based performance that will be presented in the later sections.

The preamble, as described in Section 2.1.1 and shown in Figure 1.1, is a repeated spreading code used for code acquisition in the packet-based system. In the remainder of the chapter, the term *preamble* will also refer to the continuous spreading code that needs to be acquired in a pilot-based system. The intended meaning will be clear from the context.

### 2.2.1 Architecture

Figure 2.2 shows the hierarchical architecture for the parallel code acquisition system in a MIMO-DSSS system. Figure 2.2a shows the top-level architecture for the MIMO parallel code acquisition system. A parallel acquisition is performed for each transmit code at every receive antenna, leading to a total of MN parallel code acquisition blocks. Figure 2.2b shows the architecture of the parallel code acquisition block, which comprises m energy detectors. Figure 2.2c shows one of the m energy detectors used in the parallel code acquisition block. Each energy detector is used with a different shift k of the spreading sequence to search all possible m code phases for the correct phase. For ease of analysis, we assume a *chip-synchronous* system (i.e., the chips of the spreading sequence are sampled at the optimum SNR after downconversion and matched filtering). Only one sample per chip is assumed for the analysis.
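The bank of m energy detectors can be sketched in a few lines for a single transmit-receive pair. This is a minimal illustration under the chip-synchronous, one-sample-per-chip assumption stated above; the length-31 m-sequence, the single-symbol observation, the pick-the-largest decision rule and all function names are assumptions made here, and the actual detector of Figure 2.2c may differ (for example, by accumulating over several symbols or using thresholds).

```python
import math
import random

def m_sequence_31():
    """One period of a length-31 m-sequence (recurrence a[n+5] = a[n] XOR a[n+2])."""
    bits = [1, 0, 0, 0, 0]
    while len(bits) < 31:
        bits.append(bits[-5] ^ bits[-3])
    return [1 - 2 * b for b in bits]

def received_symbol(code, true_shift, theta, noise_std, rng):
    """Chip-rate I/Q samples of one symbol: delayed code, carrier phase theta, AWGN."""
    m = len(code)
    rx_i = [math.cos(theta) * code[(n - true_shift) % m] + rng.gauss(0.0, noise_std)
            for n in range(m)]
    rx_q = [math.sin(theta) * code[(n - true_shift) % m] + rng.gauss(0.0, noise_std)
            for n in range(m)]
    return rx_i, rx_q

def energy_detector(rx_i, rx_q, code, k):
    """De-spread both branches with the code shifted by k chips; return I^2 + Q^2."""
    m = len(code)
    corr_i = sum(rx_i[n] * code[(n - k) % m] for n in range(m))
    corr_q = sum(rx_q[n] * code[(n - k) % m] for n in range(m))
    return corr_i ** 2 + corr_q ** 2

def parallel_acquire(rx_i, rx_q, code):
    """Run all m detectors (one per code phase) and pick the largest energy."""
    energies = [energy_detector(rx_i, rx_q, code, k) for k in range(len(code))]
    best = max(range(len(code)), key=lambda k: energies[k])
    return best, energies

code = m_sequence_31()
rx_i, rx_q = received_symbol(code, true_shift=11, theta=0.7, noise_std=1.0,
                             rng=random.Random(1))
estimate, energies = parallel_acquire(rx_i, rx_q, code)
print("estimated code phase:", estimate)
```

Squaring and adding the I and Q correlations removes the dependence on the unknown carrier phase, which is why the detector operates on energy rather than on the raw correlation. In hardware the m detectors run concurrently; the list comprehension here only mimics that behaviour sequentially.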

### 2.2.2 De-spreading

After a perfect de-spreading with code  $c_i(t - \tau_{ij})$  at the correctly aligned energy detector, we get

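$$y_{ij}^{lI} = f_{ij}^{lI} + w_{ij\tau_{ij}}^{lI} + \underline{n}_{ij}^{lI} \tag{2.6}$$

$$y_{ij}^{lQ} = f_{ij}^{lQ} + w_{ij\tau_{ij}}^{lQ} + \underline{n}_{ij}^{lQ} \tag{2.7}$$

for the in-phase (I) and quadrature (Q) branches of the $l^{\text{th}}$ preamble symbol, where $f$ denotes the signal term, $w$ the self-interference term, and $\underline{n}$ the AWGN term. The variance of the AWGN term in each branch is

$$\sigma_N^2 = N_0 T_s = m N_0 T_c \tag{2.8}$$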

The signal terms are

$$f_{ij}^{lI} = \alpha_{ij}^{l} m T_c \cos(\theta_{ij}) \tag{2.9}$$

$$f_{ij}^{lQ} = \alpha_{ij}^{l} m T_c \sin(\theta_{ij}) \tag{2.10}$$

where  $\theta_{ij}$  is uniformly distributed over  $(0, 2\pi)$  and constant over the reception of the preamble, and  $\alpha_{ij}^l$  is given by

$$\alpha_{ij}^l = |\underline{h}_{ij}| \sqrt{2P x_i^l} \tag{2.11}$$

Let  $\tau_{ij} - \tau_{vj} = k_{ivj}T_c + \delta_{ivj}$ , where  $k_{ivj}$  is an integer and  $\delta_{ivj} \in (0, T_c)$ . The self-interference terms are given by

$$w_{ij\tau_{ij}}^{lI} = \sum_{v=1,v\neq i}^{M} \alpha_{vj}^{l} \cos(\theta_{vj}) \sum_{f=1}^{m} \left[\rho_{ijk_{ivj}}^{f}(T_{c} - \delta_{ivj}) + \rho_{ij(k_{ivj}-1)}^{f} \delta_{ivj}\right] \tag{2.12}$$

$$w_{ij\tau_{ij}}^{lQ} = \sum_{v=1,v\neq i}^{M} \alpha_{vj}^{l} \sin(\theta_{vj}) \sum_{f=1}^{m} \left[\rho_{ijk_{ivj}}^{f}(T_{c} - \delta_{ivj}) + \rho_{ij(k_{ivj}-1)}^{f} \delta_{ivj}\right] \tag{2.13}$$

where  $\rho$  is the crosscorrelation sequence described in Section 2.1.1.

The self-interference terms given in Equations (2.12) and (2.13) are random in nature due to the channel fading and the random multipath intensity profile. For a small M/m ratio, the self-interference due to the simultaneous transmissions over multiple antennas (given in Equations (2.12) and (2.13)) can be approximated as a zero-mean Gaussian variable with non-zero variance  $\sigma_w^2$ . This is analogous to the Gaussian approximations made for multiple-access interference [33–35] in the CDMA systems.

The Gaussian approximation has been adopted for the rest of the chapter to keep the theoretical analysis tractable. However, we will note later in Section 2.3 that the self-interference is absent when the staggered transmission strategy is used for optimal parallel code acquisition. We do not drop the self-interference terms until we have proved the optimality of the staggered transmission later in the chapter.

The variance of the zero-mean Gaussian noise (due to interference and AWGN) in each of the I and Q branches can be written as

$$\sigma^2 = \sigma_w^2 + \sigma_N^2 \tag{2.14}$$

The SNR  $\mu$  for each symbol p, conditioned on the channel and the power scaling factor, can be calculated from Equations (2.9), (2.10) and (2.14):

$$\mu^{p} = \frac{(\alpha^{p} m T_{c})^{2}}{2\sigma^{2}} = \frac{(\alpha^{p} T_{s})^{2}}{2\sigma^{2}} \tag{2.15}$$
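As a minimal sketch of how Equations (2.8), (2.11), (2.14) and (2.15) combine, the following Python function evaluates the per-symbol SNR for one transmit/receive antenna pair. The function name, argument names and numerical values are illustrative assumptions and are not part of the original analysis.

```python
import math

def symbol_snr(h_mag, P, x, m, Tc, N0, sigma_w2=0.0):
    """Per-symbol SNR of Equation (2.15) for one antenna pair (illustrative)."""
    alpha = h_mag * math.sqrt(2.0 * P * x)   # Equation (2.11)
    sigma_n2 = m * N0 * Tc                   # Equation (2.8)
    sigma2 = sigma_w2 + sigma_n2             # Equation (2.14)
    return (alpha * m * Tc) ** 2 / (2.0 * sigma2)

# Hypothetical values chosen only for illustration.
mu = symbol_snr(h_mag=1.0, P=1.0, x=1.0, m=63, Tc=1.0, N0=63.0)
print(mu)
```

With $\sigma_w^2 = 0$, the expression reduces to $|h|^2 P x\, m T_c / N_0$, so the SNR grows linearly with the spreading factor m.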

Using Equation (2.4), the signal term vanishes after de-spreading at the wrongly aligned energy detectors. Therefore,  $y_{ij}^{lI}$  and  $y_{ij}^{lQ}$  in Equations (2.6) and (2.7), respectively, can be treated as zero-mean Gaussian random variables with variance  $\sigma^2$  after imperfect de-spreading.

As shown in Figure 2.2a, the parallel code acquisition analysis will be the same for all MN acquisitions that need to be performed, and the systemwide acquisition performance can be calculated from the individual acquisition performances. We now analyze the performance of the acquisition of a spreading code at one of the receive antennas. We drop the subscripts *i* (transmit antenna) and *j* (receive antenna) on the variable  $\alpha_{ij}^l$  and sequence  $x_i^1, x_i^2, \dots, x_i^{MR}$  for ease of exposition.

### 2.2.3 Post-detection Integration

The energy detector output for  $l^{th}$  symbol, shown in Figure 2.2c, is given by  $z^l$ . We define Z as the test statistic for *post-detection integration* (PDI) [15] which uses multiple observations of the energy detector output,  $z^l$ , to improve the acquisition performance at low SNR:

$$Z = \sum_{l=1}^{L} z^l \tag{2.16}$$

Assuming perfect de-spreading, the test statistic Z is a noncentral chisquare random variable [1], and its density is given by

$$p_{1}(Z) = \begin{cases} \frac{1}{2\sigma^{2}} \left(\frac{Z}{s^{2}}\right)^{\frac{L-1}{2}} \exp\left[-\frac{1}{2\sigma^{2}} \left(Z + s^{2}\right)\right] I_{L-1}\left(\frac{\sqrt{Z}s}{\sigma^{2}}\right) & \text{if } Z \ge 0; \\ 0 & \text{if } Z < 0. \end{cases} \tag{2.17}$$

where

$$s^{2} = T_{s}^{2} \sum_{l=1}^{L} (\alpha^{l})^{2} = 2T_{s}^{2} |h_{ij}|^{2} P \sum_{l=1}^{L} x^{l} \tag{2.18}$$

After de-spreading at each of the wrongly aligned energy detectors, the test statistic Z is a central chi-square random variable [1], and its density is given by

$$p_0(Z) = \begin{cases} \frac{Z^{L-1}}{\sigma^{2L}2^L (L-1)!} \exp\left[-\frac{Z}{2\sigma^2}\right] & \text{if } Z \ge 0; \\ 0 & \text{if } Z < 0. \end{cases} \tag{2.19}$$

Let C be the PDI output of the energy detector matched to the correct code phase. Its density is given by Equation (2.17). Let $W_j$ be the PDI output of the energy detector matched to the wrong phase $j \in \{1, m - 1\}$. The $(m - 1)$ $W_j$'s are i.i.d. random variables with density given by Equation (2.19). Let w be the maximum amongst the $W_j$'s. By order statistics, we have

$$p_w(w) = (m-1)p_0(w)F_0^{m-2}(w) \tag{2.20}$$

where  $F_0(w)$  is the distribution of each of the  $W_j$ 's.
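The quantities entering Equation (2.20) can be evaluated numerically. The sketch below uses the standard central chi-square density with 2L degrees of freedom and its finite-sum distribution function for integer L; the function names are illustrative assumptions.

```python
import math

def p0(z, L, sigma2):
    """Central chi-square density with 2L degrees of freedom (wrong code phase)."""
    if z < 0:
        return 0.0
    return (z ** (L - 1) * math.exp(-z / (2.0 * sigma2))
            / (sigma2 ** L * 2 ** L * math.factorial(L - 1)))

def F0(w, L, sigma2):
    """Distribution function of p0, using the finite-sum form for integer L."""
    if w <= 0:
        return 0.0
    u = w / (2.0 * sigma2)
    partial = sum(u ** k / math.factorial(k) for k in range(L))
    return 1.0 - math.exp(-u) * partial

def p_w(w, L, sigma2, m):
    """Density of the largest of the (m - 1) wrong-phase outputs, Equation (2.20)."""
    return (m - 1) * p0(w, L, sigma2) * F0(w, L, sigma2) ** (m - 2)

# Hypothetical parameters: PDI over 4 symbols, unit noise variance, m = 63.
print(p_w(20.0, L=4, sigma2=1.0, m=63))
```

Integrating `p_w` from 0 to w gives `F0(w) ** (m - 1)`, which is a convenient consistency check on any implementation.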

As shown in Figure 2.2b, parallel code acquisition is performed by selecting the maximum amongst the correct test statistic and (m-1) wrong competing test statistics and comparing it to a threshold  $\gamma$ . If the maximum of the test statistics is greater than the threshold, the code is assumed to be *acquired*. The probability of correct acquisition and wrong acquisition of the code phase are given by

$$P_{Acq} = Prob(C > \gamma , \text{All } (m-1)\, W_j < C) = \int_{\gamma}^{\infty} p_1(x) \left\{ \int_0^x p_0(z)\, dz \right\}^{(m-1)} dx \tag{2.21}$$

$$\begin{aligned} P_{WA} &= Prob(w > \gamma , C < w) \\ &= Prob(w > \gamma, C < \gamma) + Prob(w > C, C > \gamma) \\ &= \int_{\gamma}^{\infty} p_{w}(w)\, dw \int_{0}^{\gamma} p_{1}(z)\, dz + \int_{\gamma}^{\infty} p_{1}(z) \left[ \int_{z}^{\infty} p_{w}(w)\, dw \right] dz , \end{aligned} \tag{2.22}$$

respectively.
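The decision rule behind Equations (2.21) and (2.22) can also be checked by Monte Carlo simulation. The following sketch assumes a fixed (non-fading) channel, places the whole de-spread signal amplitude in the I branch (the statistics depend only on $s^2$, so this loses no generality), and uses hypothetical parameter values.

```python
import random

def simulate_acquisition(amp, L, m, sigma, gamma, trials, seed=0):
    """Estimate (P_Acq, P_WA) for one parallel code acquisition block.

    amp is the de-spread signal amplitude per symbol, so s^2 = L * amp**2.
    """
    rng = random.Random(seed)
    n_acq = n_wa = 0
    for _ in range(trials):
        # Correct-phase detector: signal plus noise on I, noise only on Q.
        C = sum((amp + rng.gauss(0, sigma)) ** 2 + rng.gauss(0, sigma) ** 2
                for _ in range(L))
        # Largest of the (m - 1) wrong-phase detectors (noise only).
        w = max(sum(rng.gauss(0, sigma) ** 2 + rng.gauss(0, sigma) ** 2
                    for _ in range(L))
                for _ in range(m - 1))
        if C > gamma and C > w:
            n_acq += 1          # correct acquisition
        elif w > gamma and w > C:
            n_wa += 1           # wrong acquisition
    return n_acq / trials, n_wa / trials

# Hypothetical operating point, for illustration only.
print(simulate_acquisition(amp=5.0, L=4, m=8, sigma=1.0, gamma=30.0, trials=2000))
```

Sweeping `gamma` and plotting the two estimates against each other traces out an empirical ROC for a single acquisition block.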

### 2.2.4 System Receiver Operating Characteristics

Let the probabilities of correct acquisition and wrong acquisition of the  $i^{\text{th}}$  spreading code at the  $j^{\text{th}}$  receive antenna be denoted by  $P_{Acq}^{ij}$  and  $P_{WA}^{ij}$ , respectively. These probabilities can be calculated using Equations (2.21) and (2.22). The system-wide correct acquisition and wrong acquisition for a MIMO-DSSS system will be defined using these equations.

The system-wide correct acquisition occurs when all the spreading codes are acquired correctly at each of the receive antennas:

$$P_{Acq}^{system} = \prod_{\substack{i \in \{1, M\}\\ j \in \{1, N\}}} P_{Acq}^{ij} \tag{2.23}$$

A system-wide wrong acquisition occurs when there is a wrong code acquisition on at least one of the antennas:

$$P_{WA}^{system} = 1 - \prod_{\substack{i \in \{1, M\}\\ j \in \{1, N\}}} (1 - P_{WA}^{ij}) \tag{2.24}$$
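Equations (2.23) and (2.24) amount to simple products over the MN links. A short sketch, with hypothetical per-link probabilities, is given below; the function name and data layout are assumptions made for illustration.

```python
import math

def system_roc_point(p_acq, p_wa):
    """Combine per-link probabilities into system-wide values.

    p_acq[i][j] and p_wa[i][j] are the correct and wrong acquisition
    probabilities of code i at receive antenna j.
    """
    links_acq = [p for row in p_acq for p in row]
    links_wa = [p for row in p_wa for p in row]
    p_acq_sys = math.prod(links_acq)                       # Equation (2.23)
    p_wa_sys = 1.0 - math.prod(1.0 - p for p in links_wa)  # Equation (2.24)
    return p_acq_sys, p_wa_sys

# Hypothetical 2 x 2 system with identical links.
print(system_roc_point([[0.9, 0.9], [0.9, 0.9]], [[0.01, 0.01], [0.01, 0.01]]))
```

Because every factor in Equation (2.23) is at most one, the system-wide probability of correct acquisition can only decrease as the number of links MN grows.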

# 2.3 Optimal Transmission Strategy in Multiple Transmit Antenna Systems

In the previous section, we analyzed the performance of parallel acquisition in a MIMO-DSSS system. We observe from Equations (2.11), (2.17), (2.18), (2.21), and (2.22) that the performance of the code acquisition scheme in multiple transmit antenna systems depends upon the power scaling factors allocated to the spreading codes at each of the M transmit antennas over the MR preamble symbols. In this section, we derive the distribution of the scaling factors for optimal parallel code acquisition in a system with multiple transmit antennas.

Let us consider a power scaling matrix  $\overline{\psi}$  given by

$$\overline{\psi} = \begin{bmatrix} x_1^1 & x_1^2 & \dots & x_1^{MR} \\ x_2^1 & x_2^2 & \dots & x_2^{MR} \\ \dots & \dots & \dots & \dots \\ x_M^1 & x_M^2 & \dots & x_M^{MR} \end{bmatrix} \tag{2.25}$$

which describes the power scaling factors at all M antennas over all MR preamble symbols.

There are two constraints on the elements of the matrix:

1. As given in Equation (2.2), the total average transmitted power is constrained:

$$\sum_{k=1}^{M} x_k^q = 1 \quad \forall q \in \{1, MR\} \tag{2.26}$$

2. In the absence of the channel state information (CSI), the energy is equally allocated to the codes transmitted at each of the M antennas over the entire preamble:

$$\sum_{k=1}^{MR} x_s^k = R \quad \forall s \in \{1, M\} \tag{2.27}$$

Staggered Transmission Theorem. With known arrival time of the preamble, under row and column transmit power constraints, the  $\overline{\psi}_{staggered}$  power scaling matrix (or a matrix obtained by row and column permutation of  $\overline{\psi}_{staggered}$ ) used at the transmitter combined with PDI-based parallel acquisition performed for each code over R corresponding code-bearing symbols at each receive antenna provides the optimal system ROC for code acquisition in a MIMO system.

$$\overline{\psi}_{staggered} = \left. \overbrace{\begin{bmatrix} J_{1 \times R} & 0_{1 \times R} & \dots & 0_{1 \times R} \\ 0_{1 \times R} & J_{1 \times R} & \dots & 0_{1 \times R} \\ 0_{1 \times R} & 0_{1 \times R} & \dots & 0_{1 \times R} \\ \dots & \dots & \dots & 0_{1 \times R} \\ 0_{1 \times R} & 0_{1 \times R} & \dots & J_{1 \times R} \end{bmatrix}}^{M \text{ times}} \right\} M \text{ rows} \tag{2.28}$$

 $J_{1\times R}$  is a  $1 \times R$  unit matrix with all elements 1 and  $0_{1\times R}$  is a  $1 \times R$  zero matrix with all elements 0.

We observe that the elements of the matrix  $\psi_{staggered}$  satisfy the constraints given in Equations (2.26) and (2.27). The reasons for the choice of this transmission scheme are two-fold:

1. Less noise during the acquisition: the total power available for each transmitted spreading code is concentrated in fewer symbols for the PDI-based energy detection. The parallel acquisition is performed only over the appropriate R code-bearing symbols in the preamble, eliminating the noise from the other (M-1)R symbols in the preamble.
2. No self-interference: the self-interference caused by simultaneous transmissions from multiple antennas is absent in this scheme. All other power scaling matrices would lead to self-interference and hence degradation in performance.

Since each transmit code is received at a different time, the same N parallel code acquisition blocks (one at each receive antenna) can be used for the acquisition of the M spreading codes. This leads to the reduction of distinct parallel code acquisition blocks in the hardware implementation from MN to N.

The power scaling matrix obtained by the row or column reordering of  $\psi_{staggered}$  matrix would also lead to the same optimal system ROC performance, although it could lead to a more complex implementation since the reduction from MN to N distinct parallel code acquisition blocks might not be possible.
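To make the structure of the staggered power scaling matrix concrete, the sketch below builds it for given M and R, checks the constraints of Equations (2.26) and (2.27), and tests whether any preamble symbol carries more than one code. The helper names are illustrative assumptions.

```python
def staggered_matrix(M, R):
    """M x MR matrix whose row i is 1 on symbols i*R .. (i+1)*R - 1 and 0 elsewhere."""
    return [[1.0 if i * R <= q < (i + 1) * R else 0.0 for q in range(M * R)]
            for i in range(M)]

def satisfies_constraints(psi, R, tol=1e-12):
    """Check the column sums (Equation 2.26) and row sums (Equation 2.27)."""
    M, n_sym = len(psi), len(psi[0])
    column_ok = all(abs(sum(psi[k][q] for k in range(M)) - 1.0) <= tol
                    for q in range(n_sym))
    row_ok = all(abs(sum(row) - R) <= tol for row in psi)
    return column_ok and row_ok

def has_self_interference(psi):
    """True if any preamble symbol carries power from more than one antenna."""
    return any(sum(1 for row in psi if row[q] > 0) > 1
               for q in range(len(psi[0])))

M, R = 3, 2
psi = staggered_matrix(M, R)
uniform = [[1.0 / M] * (M * R) for _ in range(M)]   # equal power on every symbol
print(satisfies_constraints(psi, R), has_self_interference(psi))
print(satisfies_constraints(uniform, R, tol=1e-9), has_self_interference(uniform))
```

The uniform matrix in this example also satisfies both constraints, but every symbol then carries all M codes simultaneously, which is exactly the self-interference the staggered scheme avoids.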

# **Proof of Optimality**

This section begins with some formal definitions followed by a few lemmas and their proofs. The definitions and lemmas are then used to prove the theorem.

### Definitions

- $X_i$: An arbitrary power scaling sequence (introduced in Section 2.1.1) used at transmit antenna *i*. It also corresponds to the $i^{\text{th}}$ row of the matrix in Equation (2.25).
- $X_{iK}^{sparse}$: The power scaling sequence used at transmit antenna *i* with any *K* non-zero elements, where *K* is an arbitrary integer.
- $X_i^{sta}$: The power scaling sequence used at transmit antenna *i* when the staggered transmission is used. It has *R* non-zero elements which are all 1. It also corresponds to the $i^{\text{th}}$ row of the matrix in Equation (2.28).
- $ROC_{ij}(X_i, K)$ : The receiver operating characteristic (ROC) for acquiring code *i* at receive antenna *j* using power scaling sequence  $X_i$  and PDI-based parallel acquisition performed over *K* symbols.
- System ROC: The ROC for the entire MIMO code acquisition system as defined by Equations (2.23) and (2.24).
- Column constraint: The constraint given in Equation (2.26) on the total average transmitted power.
- Row constraint: The constraint given in Equation (2.27) for distributing the total power equally amongst all the spreading codes.
- Self-interference: The interference during the acquisition of a code transmitted via one antenna caused by simultaneous transmissions of one or more codes on the other antennas from the same transmitter. See Section 2.2.2 for details.

- $<,>,\leq,\geq$ : The inequality operators are used to show the comparison between two ROCs. For example,  $ROC_{ij}(A, B) > ROC_{kl}(C, D)$  implies that  $ROC_{ij}(A, B)$  is always superior to  $ROC_{kl}(C, D)$  for all values of SNR.
- =: The equality operator is used to show that two ROCs are identical for all values of SNR.

Lemma 1. For a known arrival time of the preamble, under no self-interference, and with row and column constraints on the power scaling matrix at the transmitter, the ROC for each transmitted spreading code at each receive antenna is functionally independent of the power scaling matrix when the PDI is performed over MR symbols.

In other words,

$$ROC_{ij}(X_i, MR) = ROC_{ij}(X_i^{sta}, MR) = ROC_{ij}(X_{iK}^{sparse}, MR) \tag{2.29}$$

 $\forall i, j \text{ and } \forall K \in (0, MR)$ 

Constraints: Row constraint, Column constraint

Assumptions: No self-interference, Known arrival time

**Proof of Lemma 1.** With no self-interference,  $\sigma^2 = \sigma_N^2$  and  $ROC_{ij}(X_i, MR)$  only depends on the power scaling sequence  $X_i$  and does not depend on the power scaling sequences  $(X_j, j \neq i)$  used for the other codes.

Equations (2.27) and (2.18) ensure that the probability density of the energy detector outputs under a correct de-spreading, given by Equation (2.17), is the same for all transmission schemes. The non-central chi-square distribution due to the addition of squared Gaussian random variables with the same variance is defined by two parameters: the number of degrees of freedom (twice the number of symbols in the PDI operation) and the sum of the squares of the means of the Gaussian variables as given in Equation (2.18). Similarly, the density under wrong de-spreading, given by Equation (2.19), remains the same as well.
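The argument can be illustrated numerically: for PDI over all MR symbols, the density in Equation (2.17) is fixed by the number of degrees of freedom and by $s^2$ from Equation (2.18), and $s^2$ depends on the power scaling sequence only through its sum. The sketch below compares a staggered row with a uniform row; the function name and values are illustrative assumptions.

```python
def pdi_parameters(x_seq, h_mag, P, Ts):
    """Return (degrees of freedom, s^2) for PDI over all symbols of x_seq."""
    dof = 2 * len(x_seq)                               # I and Q per symbol
    s2 = 2.0 * Ts ** 2 * h_mag ** 2 * P * sum(x_seq)   # Equation (2.18)
    return dof, s2

M, R = 2, 4
staggered_row = [1.0] * R + [0.0] * ((M - 1) * R)   # R ones, then zeros
uniform_row = [1.0 / M] * (M * R)                   # equal power on all symbols
print(pdi_parameters(staggered_row, 1.0, 1.0, 1.0))
print(pdi_parameters(uniform_row, 1.0, 1.0, 1.0))
```

Both calls return the same pair, since each row sums to R as required by the row constraint.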

**Lemma 2.** For a known arrival time of the preamble, under no self-interference and with row and column constraints on the power scaling matrix at the transmitter, distributing the total transmit power per code over fewer symbols, and performing PDI-based acquisition over the code-bearing symbols only, leads to an improvement in ROC for each transmitted spreading code at each receive antenna.

In other words,

$$ROC_{ij}(X_{i(K-1)}^{sparse}, K-1) > ROC_{ij}(X_{iK}^{sparse}, K) \quad \forall i, j \text{ and } \forall K \in (0, MR) \tag{2.30}$$

Constraints: Row constraint, Column constraint

Assumptions: No self-interference, Known arrival time

**Proof of Lemma 2.** With no self-interference, the same arguments as those given in the proof of Lemma 1 (using Equations (2.17), (2.19), (2.18) and (2.26)) can be used to show that:

$$ROC_{ij}(X_{i(K-1)}^{sparse}, K) = ROC_{ij}(X_{iK}^{sparse}, K) \tag{2.31}$$

Note that the PDI-based acquisition for obtaining the  $ROC_{ij}(X_{i(K-1)}^{sparse}, K)$  is performed over (K-1) code-bearing symbols and one noise-only symbol in the preamble. Removing the noise-only symbol leads to improvement in performance. Hence,

$$ROC_{ij}(X_{i(K-1)}^{sparse}, K-1) > ROC_{ij}(X_{i(K-1)}^{sparse}, K) \tag{2.32}$$

Combining Equations (2.31) and (2.32), we get (2.30).  $\Box$ 
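The effect of removing noise-only symbols can be illustrated with a simple heuristic that is not part of the proof: the deflection of the test statistic, i.e. the shift in its mean between the correct and wrong code phases divided by its standard deviation at a wrong phase. For PDI over K symbols, the mean shift is $s^2$ and the wrong-phase variance is $4K\sigma^4$, so for fixed $s^2$ the deflection falls as $1/\sqrt{K}$. The function name below is an illustrative assumption, and the deflection is only a proxy for the ROC.

```python
import math

def deflection(s2, K, sigma2):
    """Mean shift of Z divided by its wrong-phase standard deviation."""
    mean_shift = s2                            # (s^2 + 2K sigma^2) - 2K sigma^2
    std_wrong = 2.0 * sigma2 * math.sqrt(K)    # sqrt(4 K sigma^4)
    return mean_shift / std_wrong

# Same signal energy, integrated over progressively more symbols.
for K in (4, 8, 16):
    print(K, deflection(s2=16.0, K=K, sigma2=1.0))
```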

**Lemma 3.** As described in Section 2.1.1, the length of the preamble is L = MR. At least R symbols per spreading code are required to satisfy the row and column constraints.

**Proof of Lemma 3.** By the column constraint, the maximum value of any element in the power scaling matrix is 1. Therefore, at least R non-zero symbols are required to satisfy the row constraint. In this scenario, the column constraint can be satisfied by using a power scaling matrix which is identical to $\overline{\psi}_{staggered}$ (or obtained by row and column permutation of $\overline{\psi}_{staggered}$).

**Lemma 4.** There is no self-interference when the staggered transmission scheme is used with the  $\overline{\psi}_{staggered}$  power scaling matrix.

**Proof of Lemma 4.** This can be proved easily by inspection of the  $\overline{\psi}_{staggered}$  matrix given in Equation (2.28) and reproduced below

$$\overline{\psi}_{staggered} = \left. \overbrace{\begin{bmatrix} J_{1 \times R} & 0_{1 \times R} & \dots & 0_{1 \times R} \\ 0_{1 \times R} & J_{1 \times R} & \dots & 0_{1 \times R} \\ 0_{1 \times R} & 0_{1 \times R} & \dots & 0_{1 \times R} \\ \dots & \dots & \dots & 0_{1 \times R} \\ 0_{1 \times R} & 0_{1 \times R} & \dots & J_{1 \times R} \end{bmatrix}}^{M \text{ times}} \right\} M \text{ rows}$$

Only one code is transmitted at any given time.

**Proof of the Staggered Transmission Theorem**. Self-interference is detrimental and will always lead to a worse ROC performance by raising the noise floor (see Equation (2.14)).

From Lemma 1 and Lemma 2, we see that under no self-interference and a known arrival time, the ROC is optimal if we use the fewest symbols to transmit the spreading code and perform PDI only over the code-bearing symbols.

Lemma 3 shows that at least R symbols are needed for transmitting each code. This leads to optimal ROC for each code at every receive antenna under the assumption of no self-interference. To accomplish this, the  $\overline{\psi}_{staggered}$  power scaling matrix (or row and column permutation of  $\overline{\psi}_{staggered}$ ) should be used at the transmitter. Thus, using R symbols for transmitting each code and performing PDI over the R symbols at each antenna will lead to optimal ROC performance for each code at every receive antenna under no self-interference.

Lemma 4 shows that there is no self-interference when the staggered transmission is used. Therefore, using the staggered transmission leads to an optimal ROC for each code at every receive antenna, owing to the lowest possible noise during the acquisition process and the absence of self-interference. Moreover, the ROCs are independent because there is no self-interference and the PDI-based acquisition is performed only over the code-bearing symbols. This optimal ROC for each code at every antenna leads to the optimal system ROC (given by Equations (2.23) and (2.24)).

### Known arrival time of the preamble

The Staggered Transmission Theorem assumes that the arrival time is known. This assumption is not true for packet-based acquisitions. Even in the pilot-based scenario, the start of the repeating preamble cannot be known. The acquisition of the first of the M transmitted codes will be done asynchronously. However, we will observe via simulation results in the upcoming sections that the asynchronous packet-based performance is close to the synchronous performance with an omnipresent pilot (known arrival time) at low probabilities of wrong acquisition, which is a desirable region of operation in most systems.

# 2.4 SISO Performance

### 2.4.1 Pilot-Based

In Figure 2.3, we have plotted the parallel acquisition performance (using Equations (2.21) and (2.22)) of a pilot-based SISO system under different spreading factors (SF), chip SNRs (EcNo) and PDI values under a flat, Rayleigh fading environment. Figures 2.3a, 2.3b and 2.3c show the parallel acquisition performance for varying SF, EcNo and PDI, respectively. Figure 2.3d shows the performance with all the three parameters (SF, EcNo and PDI) varying. A slow fading environment is assumed where the fade remains constant over the reception of a single preamble, but changes value over multiple preambles. Equations (2.21) and (2.22) are averaged over the Rayleigh fade of the channel coefficient h introduced in Section 2.1.2.

We used an SF of 63, an EcNo of $-15 \,\text{dB}$ and a PDI of 4 as the baseline case. Figures 2.3a, 2.3b and 2.3c show that, as expected, there is an improvement in performance (a higher probability of correct acquisition for a given value of wrong acquisition) as we increase the SF, EcNo or PDI. Figure 2.3d reflects the relative sensitivity of the parallel code acquisition performance to variations in SF, EcNo and PDI. We observe that, compared to the baseline case, the performance improvement obtained by doubling the chip SNR (3 dB increase in EcNo) is greater than the improvement due to the doubling of SF or PDI. The SF performance improvement is lower due to the increase in the number of competing wrong code phases during the parallel code acquisition. The lower performance improvement due to the PDI can be attributed to the non-coherent combining loss [36]. To verify the theory, we have also simulated the SISO system using m-sequences [14] as spreading codes. The simulation results for the baseline case are overlaid in Figure 2.3d. We note that the simulation results are identical to the theoretical results.

Figure 2.3: Pilot-based SISO performance in Rayleigh fading environment.



Figure 2.4: Sliding correlator output for packet-based system.

### 2.4.2 Packet-Based

The parallel code acquisition block is implemented in a pipelined fashion as a sliding correlator [37]. As the preamble of a packet slides into the receiver one chip at a time, a new correlation (linear de-spreading followed by non-linear energy detection, as shown in Figure 2.2c) is performed and the output is compared to a threshold. Figure 2.4 shows the output of the sliding correlator for packet-based systems. For clarity, the noise is chosen to be zero in this example. The preamble is chosen to be 8 symbols, the chip amplitude 1/63 units, the SF 63, and the PDI length equal to the number of preamble symbols. We observe that the peaks corresponding to the correct code phase repeat every 63 chips and the peak amplitude increases steadily over the 8 symbols.
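A noise-free sliding correlator of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the simulator used for Figure 2.4: it uses a real-valued (I branch only) signal, a length-63 m-sequence generated from the primitive polynomial $x^6 + x + 1$ with an arbitrary seed, and zero padding before the start of the packet.

```python
def m_sequence_63():
    """Length-63 m-sequence in +/-1 form from a[n+6] = a[n+1] XOR a[n].

    The polynomial (x^6 + x + 1) and the seed are assumptions for illustration.
    """
    bits = [1, 0, 0, 0, 0, 0]
    while len(bits) < 63:
        bits.append(bits[-5] ^ bits[-6])
    return [1 - 2 * b for b in bits]

def sliding_correlator(rx, code, pdi_len):
    """PDI output after each received chip; zeros are assumed before the packet."""
    m = len(code)
    padded = [0.0] * (m - 1) + list(rx)
    energy = []
    for n in range(len(rx)):
        window = padded[n:n + m]                 # the m chips ending at chip n
        corr = sum(w * c for w, c in zip(window, code))
        energy.append(corr * corr)               # energy detection
    # Sum the energies of the same code phase over the last pdi_len symbols.
    return [sum(energy[n - l * m] for l in range(pdi_len) if n - l * m >= 0)
            for n in range(len(energy))]

code = m_sequence_63()
n_symbols, amp = 8, 1.0 / 63
rx = [amp * c for _ in range(n_symbols) for c in code]   # noise-free preamble
pdi = sliding_correlator(rx, code, pdi_len=n_symbols)
print([round(pdi[63 * q - 1], 6) for q in range(1, n_symbols + 1)])
```

In this sketch the correct code phase occurs at chip indices 62, 125, ..., 503, and the PDI output at those instants grows by one unit per received symbol, mirroring the steadily increasing peaks described above.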

The theoretical analysis of the packet-based system became intractable due to the unknown arrival time of the packet, and hence we resorted to simulations to analyze the performance. As shown in Figure 2.4, the simulation begins when the first chip of the preamble slides into the receiver. The code is assumed to be acquired when the output of the sliding correlator crosses the acquisition threshold. The simulation ends either when the threshold is crossed, or after the entire preamble has passed through the sliding correlator. Figure 2.5 shows the performance of the packet and pilot-based parallel code acquisition systems with an SF of 63, an EcNo of  $-12 \,\text{dB}$  and a PDI over 16 symbols (the length of preamble). The variance of the zero-mean Gaussian noise per symbol, described in Equation (2.8), is taken to be unity. A non-fading AWGN channel is assumed for the packet-based simulations.



Figure 2.5: SISO performance for pilot and packet-based systems.

We first analyze the performance of the pilot-based acquisition. We observe that, when the threshold is low, the wrong acquisition probability (not shown on the graph), calculated using Equation (2.22), is very low and the correct acquisition probability, calculated using Equation (2.21), is high. An increase in the threshold leads to a monotonic decrease in both the correct and the wrong acquisition probabilities in the pilot-based acquisition.

However, in the packet-based case, the wrong acquisition probability is very high even for low thresholds. This is due to the following deleterious effects:

- Late start: Since (m-1) wrong correlations take place before the correct one for every symbol sliding chip-wise into the receiver, there is a high probability of wrong acquisition when the threshold is too low.
- Edge effects: This is due to (m-1) partial cross-correlations at the start of the packet. The low periodic cross-correlation property for the spreading sequences (given in Equation (2.4)) cannot be guaranteed for these partial cross-correlations.
- Low PDI: The effects of PDI are only noticeable after the entire preamble has entered the acquisition block. If the threshold is too low, the correlations at the start of the preamble are strong enough to cross the threshold.

We observe that, unlike the pilot-based system, the threshold for the packet-based system has a double-sided constraint: if the threshold is too low, the probability of wrong acquisition is high, but if the threshold is very high, the probability of acquiring the packet is low. When the probability of wrong acquisition is low, the performance closely resembles that of the pilot case, and the threshold can be determined using Equations (2.21) and (2.22).

In order to theoretically determine the threshold value above which the pilot and packet-based performances coincide, we calculate the threshold value for an *acceptable* (low enough) probability of wrong acquisition during the absence of the signal in the pilot-based scenario. In Figure 2.5, for PDI = 16 and SF = 63, this value is around 80 units. The system designers can use this strategy to theoretically set the threshold and predict the performance of the packet-based system.
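One way to carry out such a threshold calculation is sketched below. It assumes that, in the absence of the signal, all m energy detector outputs are independent central chi-square variables with 2L degrees of freedom, and it searches by bisection for the threshold that meets a target probability of wrong acquisition. The function names, the target value and the search bounds are illustrative assumptions.

```python
import math

def noise_only_cdf(gamma, L, sigma2):
    """Distribution function of a noise-only PDI output (2L degrees of freedom)."""
    u = gamma / (2.0 * sigma2)
    return 1.0 - math.exp(-u) * sum(u ** k / math.factorial(k) for k in range(L))

def false_alarm_probability(gamma, L, m, sigma2):
    """Probability that the largest of m noise-only outputs exceeds gamma."""
    return 1.0 - noise_only_cdf(gamma, L, sigma2) ** m

def threshold_for_false_alarm(target, L, m, sigma2, hi=1e4, iters=200):
    """Bisection on gamma; the false-alarm probability decreases with gamma."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if false_alarm_probability(mid, L, m, sigma2) > target:
            lo = mid
        else:
            hi = mid
    return hi

# PDI = 16, SF = 63 and unit noise variance; the target is a hypothetical choice.
gamma = threshold_for_false_alarm(target=1e-3, L=16, m=63, sigma2=1.0)
print(gamma, false_alarm_probability(gamma, 16, 63, 1.0))
```

The target probability is a design choice; tightening it raises the threshold and therefore lowers the probability of acquiring the packet.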

## 2.5 Multiple Antenna Performance

We use Equations (2.21), (2.22), (2.23) and (2.24) to calculate the performance for a pilot-based system. We simulate the system for the packet-based scenario in the same manner as described in Section 2.4.2. For multiple transmit antennas, we use Kasami codes [38] instead of m-sequences because of their low cross-correlation properties. However, other spreading codes with good autocorrelation and cross-correlation properties can also be used. We also use the staggered transmission strategy described in Section 2.3.

In the packet simulation presented in this section, we allocated 16 symbols in the preamble for each transmit antenna; for example, in a two transmit antenna system, the length of the preamble is 32 symbols. The PDI over 16 symbols is carried out for acquiring each code. We choose a threshold of 80 units and above (noise variance per symbol is unity) to ensure a low probability of wrong acquisition, as described in Section 2.4.2. A non-fading AWGN channel is assumed for the pilot and packet-based system simulations.

Figure 2.6 shows the probability of system-wide correct acquisition for the pilot and packet-based parallel acquisition in $1 \times 1$ SISO, $1 \times 2$ SIMO, $2 \times 1$ MISO and $2 \times 2$ MIMO systems. We observe that the performances of the pilot and packet-based systems are identical for all cases. Only the MIMO case is plotted for the pilot-based system. The SISO performance is the best, as expected, since its performance is contingent on the correct acquisition of a single spreading code. The SIMO and MISO performances are identical, since two separate parallel acquisitions (on the two receive antennas for SIMO and for the two staggered spreading codes for MISO) need to be performed. The MIMO performance is the worst, because the system-wide correct acquisition requires four correct parallel acquisitions: two spreading codes on each of the two antennas.

### 2.6 Summary

The performance analysis for both pilot and packet-based parallel code acquisition schemes in single and multiple antenna DSSS systems is presented in this chapter. The results are presented for various SNRs, spreading factors, and preamble lengths, and for different transmit and receive antenna configurations. An optimal staggered transmission strategy is presented for parallel code acquisition in DSSS systems with multiple transmit antennas.



Figure 2.6: Parallel code acquisition performance for  $1 \times 1$  SISO,  $1 \times 2$  SIMO,  $2 \times 1$  MISO and  $2 \times 2$  MIMO pilot-based and packet-based systems.

# Acknowledgment

The work in this chapter is partially reprinted from the following paper: M. Amde, L. Milstein, K. Yun and R. Cruz, *Parallel Code Acquisition in MIMO Direct-Sequence Spread-Spectrum Communications*, to be submitted to the IEEE Transactions on Communications. The dissertation author is the primary author of this paper.

# Chapter 3

# Parallel Code Acquisition: Architecture and Implementation

The MIMO parallel code acquisition system was introduced in Section 2.2.1. The block diagram of the system is shown in Figure 2.2a. The MIMO parallel code acquisition system consists of M \* N distinct parallel code acquisition blocks, where M is the number of transmit antennas and N is the number of receive antennas. However, if a staggered transmission strategy is used (as pointed out in Section 2.3), only N parallel code acquisition blocks are needed. Each of the N blocks (one at each receive antenna) can be re-used to perform M acquisitions.

In a MIMO-DSSS system, the parallel code acquisition blocks at each of the receive antennas are used to *acquire* the spreading codes used by the transmit antennas. The block diagram for the parallel code acquisition block is shown in Figure 2.2b. The parallel acquisition block comprises m energy detection blocks. The block diagram for the energy detection block is shown in Figure 2.2c. The MIMO parallel code acquisition operation is clearly hardware intensive due to the multiple layers of parallelism in its structure.

This chapter presents the VLSI architecture, the FPGA prototype implementation and the testing results for the parallel code acquisition block: a building block for the MIMO parallel code acquisition system. Section 3.1 describes the discrete time operation of the energy detection block. Sections 3.2 and 3.3 present the VLSI architecture and implementation for the parallel de-spreader and post-detection integration operations, respectively. Section 3.4 describes the VLSI architecture of the parallel code acquisition block based on the parallel de-spreading, post-detection integration and other basic arithmetic operations. It also presents the FPGA implementation details. Section 3.5 describes the implementation of the simplified parallel code acquisition algorithm on a hardware prototype, along with the experimental results. Section 3.6 demonstrates the performance of the algorithm on stored data samples captured over the wireless medium. The chapter summary is presented in Section 3.7.

### 3.1 Energy Detection

As shown in Figure 2.2b, the parallel code acquisition block for a chip-synchronous<sup>1</sup> system consists of m parallel energy detection blocks that search all the possible code phases (code alignments). In practice, the correct time to sample a chip is not known before acquisition. Therefore, the chips are oversampled to maximize the SNR (for the correct code phase) during the de-spreading operation. An oversampling factor of r leads to r \* m possible code phases. Thus, r \* m energy detection operations per symbol need to be performed *in parallel*.

Based upon Figure 2.2c, the baseband section<sup>2</sup> of the energy detection block using digitized baseband samples is shown in Figure 3.1. For the parallel code acquisition, this operation has to be carried out for all possible shifts of the spreading code $k \in (0, r * m - 1)$, thus requiring parallel energy detection.
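In discrete time, the energy detection for one code phase reduces to de-spreading the I and Q samples with the local code and adding the squares of the two sums. The sketch below is a minimal illustration that assumes complex baseband samples (real part I, imaginary part Q), one sample per chip taken every r samples, and a hypothetical length-7 code; it is not the hardware implementation.

```python
import cmath

def energy_detect(samples, code, shift, r=1):
    """Energy at one code phase: de-spread I and Q, square each, and add."""
    acc = 0j
    for n, chip in enumerate(code):
        acc += samples[shift + r * n] * chip   # one sample per chip, r apart
    return acc.real ** 2 + acc.imag ** 2

code = [1, 1, 1, -1, 1, -1, -1]   # hypothetical +/-1 spreading code, m = 7
r = 2                             # oversampling factor
a, theta = 0.5, 1.0               # hypothetical amplitude and carrier phase
symbol = [a * cmath.exp(1j * theta) * c for c in code for _ in range(r)]
rx = symbol + symbol              # two noise-free symbols
energies = [energy_detect(rx, code, k, r) for k in range(r * len(code))]
print(energies[0], max(energies))
```

The list `energies` holds the r \* m values that the hardware has to produce in parallel for every symbol; the aligned phase yields $(a m)^2$ regardless of the carrier phase $\theta$.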

### 3.2 Parallel De-spreading

The de-spreading operation for the in-phase (I) and quadrature-phase (Q) branches is shown in Figure 3.1. The parallel de-spreading (required for the parallel energy detection) is the most complex operation in the parallel code acquisition.

<sup>1</sup>Assuming one sample per chip sampled at the maximum SNR.

<sup>2</sup>The RF down-conversion and analog-to-digital conversion blocks can be shared between all the energy detection blocks at each receive antenna.

Figure 3.1: Energy detection block using digitized baseband samples.

In theory, the r \* m de-spreaders operate simultaneously on one complete symbol. However, in practice, the symbol slides into the baseband section of the receiver in a serial fashion. Performing parallel energy detection after all the r \* m samples of the symbol have entered the acquisition block would lead to a high latency. Thus, the implementation is pipelined to reduce the complexity and the latency.

Figure 3.2 shows the block diagram of the pipelined parallel de-spreading operation from a signal processing point-of-view. The preamble is de-multiplexed and down-sampled for the parallel de-spreading of each of the r \* m possible code phases. Each de-spreader performs a correlation with a different code phase of the local spreading code. The signals are then serialized by up-sampling and interleaving to obtain the de-spreading output for a different code phase every sampling period.

Figure 3.2: Pipelined parallel de-spreading from the signal processing perspective.

The VLSI architecture of the pipelined parallel de-spreader is shown in Figure 3.3. The pipelined implementation consists of (m-1) FIFOs, each of depth r. FIFO<sub>i</sub> stores the following partial sum for a sample x[n]:

$$sum_i = \sum_{j=1}^{i} x[n - r * (i - j)]c^j \quad . \tag{3.1}$$

The output of the parallel de-spreader can be written as

$$sum_m = \sum_{j=1}^m x[n - r * (m - j)]c^j \quad , \tag{3.2}$$

which is the de-spread output for code phase k = r \* (m - 1).

We observe that a newly de-spread output is presented every sampling cycle for a different alignment of the received signal and the local spreading sequence. The correlation calculation starts as soon as one sample is presented to the de-spreader, without waiting for the entire symbol to slide into the de-spreader. This reduces the latency of operation. Also, the m multipliers and m adders are shared by all the parallel branches, which substantially reduces the hardware cost.
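The recursion implied by Equation (3.1) can be illustrated in software. The following is a minimal Python sketch, not the VLSI implementation itself: each stage adds the current sample times its chip to the partial sum that the preceding FIFO delayed by r samples. The function names, the zero initial FIFO contents, and the use of `collections.deque` as a FIFO model are assumptions made for illustration.

```python
from collections import deque


def pipelined_despreader(x, code, r):
    """Return sum_m[n] for every input sample, following Eqs. (3.1)-(3.2)."""
    m = len(code)
    # FIFO i (i = 1..m-1) delays the partial sum sum_i by r samples.
    fifos = [deque([0] * r, maxlen=r) for _ in range(m - 1)]
    out = []
    for sample in x:
        delayed = 0  # the first stage has no upstream partial sum
        for i in range(m):
            partial = delayed + sample * code[i]
            if i < m - 1:
                delayed = fifos[i][0]     # oldest entry, i.e. sum_i[n - r]
                fifos[i].append(partial)  # push sum_i[n]; maxlen drops the oldest
            else:
                out.append(partial)       # last stage: sum_m[n]
    return out


def direct_despreader(x, code, r):
    """Reference: evaluate Eq. (3.2) directly, treating x[k] = 0 for k < 0."""
    m = len(code)
    out = []
    for n in range(len(x)):
        acc = 0
        for j in range(m):
            k = n - r * (m - 1 - j)
            if k >= 0:
                acc += x[k] * code[j]
        out.append(acc)
    return out


if __name__ == "__main__":
    code = [1, -1, 1, 1, -1, -1, 1]                # hypothetical 7-chip code
    r = 4                                          # samples per chip
    symbol = [c for c in code for _ in range(r)]   # one noiseless symbol
    print(pipelined_despreader(symbol, code, r))
```

Feeding one noiseless symbol (each chip repeated r times) yields the full correlation value m once the first sample of the last chip has entered, which corresponds to the code phase k = r \* (m - 1) mentioned above. The sketch uses only m multiply-add stages and (m-1) FIFOs of depth r, mirroring the resource sharing described in the text.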

Figure 3.3: Pipelined VLSI architecture of the parallel de-spreader.

Finally, the VLSI implementation of the parallel de-spreader is shown in Figure 3.4. Each FIFO is implemented using a dual-ported RAM with one port used for writing to the FIFO and the other port used for reading. Even though the size of the address space r for each RAM is small, (m-1) unique dual-ported RAMs are required.

The details of the implementation of this block on a Xilinx Virtex-II Pro (XC2VP70) FPGA [39] are given in Table 3.1. Lookup tables (LUTs) are used to implement the combinational Boolean logic inside the FPGA [39]. Block RAMs (BRAMs) are on-chip dual-ported memories distributed inside the FPGA fabric for fast memory read/write operation [39]. The implementation parameters were r = 8 and m = 63. The logic was synthesized for a clock period of 12.5 ns. As expected, m - 1 = 62 distinct BRAMs are used by this implementation.

Table 3.1: FPGA implementation details for the parallel de-spreader block.

| Logic Distribution     | Used | Available | Utilized |
|------------------------|------|-----------|----------|
| Number of 4-Input LUTs | 3960 | 66176     | 5%       |
| Number of BRAMs        | 62   | 328       | 18%      |

# 3.3 Post-detection Integration

The post-detection integration (PDI) operation, as described in Section 2.2.3, sums the energy over L symbol periods to improve the parallel code acquisition performance at low SNR.

The PDI block can be implemented in a pipelined fashion using two FIFOs as shown in Figure 3.5. Using the parallel de-spreader architecture, described in Section 3.2, a new energy detection signal is presented to the PDI block every sampling period. The output of the PDI block can be written in mathematical terms:

$$out[n] = out[n - m * r] + in[n] - in[n - m * r * L] \quad , \tag{3.3}$$

where in is the input (from the energy detection block) to the PDI block and out is the output of the PDI block. Therefore, a new PDI output for a different code phase is calculated every sampling period.

FIFO 1 stores the energy detector output samples (in) over the entire PDI operation. It is of length r \* m \* L, where L is the number of symbols used for the PDI operation. FIFO 2 is of length r \* m, the number of possible code phases, and it stores the PDI outputs (out) for all the code phases. Hence, the PDI operation requires two large memory blocks for its implementation.
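The two-FIFO recursion of Equation (3.3) can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the FPGA implementation: both FIFOs are assumed to start filled with zeros, `deque` stands in for the dual-ported memories, and the function names are hypothetical. A direct sliding sum over L symbol periods is included as a reference.

```python
from collections import deque


def pdi(energy, r, m, L):
    """Post-detection integration via out[n] = out[n-mr] + in[n] - in[n-mrL]."""
    period = r * m                                         # samples per symbol
    fifo1 = deque([0] * (period * L), maxlen=period * L)   # past inputs
    fifo2 = deque([0] * period, maxlen=period)             # past outputs
    out = []
    for e in energy:
        y = fifo2[0] + e - fifo1[0]   # oldest entries are the delayed terms
        fifo1.append(e)
        fifo2.append(y)
        out.append(y)
    return out


def direct_pdi(energy, r, m, L):
    """Reference: sum the same code phase over the last L symbol periods."""
    period = r * m
    out = []
    for n in range(len(energy)):
        acc = 0
        for l in range(L):
            k = n - l * period
            if k >= 0:
                acc += energy[k]
        out.append(acc)
    return out


if __name__ == "__main__":
    samples = [(3 * k) % 7 for k in range(80)]   # arbitrary test input
    print(pdi(samples, r=2, m=3, L=4) == direct_pdi(samples, r=2, m=3, L=4))
```

The recursion needs one addition and one subtraction per sample regardless of L, at the cost of storing r \* m \* L input samples and r \* m output samples, which is consistent with the two large memory blocks noted above.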



Figure 3.5: Post-detection integration block diagram.

The details of the FPGA implementation of this block on a Xilinx Virtex-II Pro (XC2VP70) FPGA [40] are given in Table 3.2. The implementation parameters were r = 8, m = 63 and L = 32. The logic was synthesized for a clock period of 12.5 ns. The FIFOs were implemented using dual-ported memories.

Table 3.2: FPGA implementation details for the PDI block.

| Logic Distribution     | Used | Available | Utilized |
|------------------------|------|-----------|----------|
| Number of 4-Input LUTs | 462  | 66176     | 1%       |
| Number of BRAMs        | 59   | 328       | 18%      |

# 3.4 Parallel Code Acquisition



Figure 3.6: Parallel code acquisition block diagram.

The parallel code acquisition algorithm, as shown in Figure 3.6, consists of a combination of the parallel de-spreading, the post-detection integration and a "square-and-add" operation. The FPGA implementation details for a Xilinx Virtex-II Pro (XC2VP70) FPGA are given in Table 3.3. The logic was synthesized for a clock period of 12.5 ns. The total number of BRAMs used is equal to the sum of the BRAMs in the two parallel de-spreading operations (see Table 3.1) and the PDI operation (see Table 3.2). Additionally, two multipliers are used for the squaring operation.

| Logic Distribution     | Used | Avail. | Utilized |
|------------------------|------|--------|----------|
| Number of 4-Input LUTs | 6013 | 66176  | 9%       |
| Number of BRAMs        | 183  | 328    | 55%      |
| Number of DSP48s       | 2    | 328    | 1%       |

Table 3.3: FPGA implementation details for the parallel code acquisition block.
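As a quick consistency check of the BRAM count in Table 3.3, the two de-spreaders (I and Q branches, 62 BRAMs each from Table 3.1) and the PDI block (59 BRAMs from Table 3.2) add up as follows. The symbol $N_{BRAM}$ is introduced here only for illustration.

$$N_{BRAM} = 2 \times 62 + 59 = 183, \qquad \frac{183}{328} \approx 55.8\% \quad .$$

The utilization of 55% listed in the table is consistent with this value if the reported percentage is truncated rather than rounded, which is an assumption about the tool report.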

# 3.5 Prototype Implementation

### 3.5.1 Hardware Prototype

The parallel code acquisition algorithm was implemented on a digital prototyping board that hosts a Xilinx Virtex-II Pro (XC2VP20) FPGA. Due to the limited size of the XC2VP20 FPGA, only the parallel de-spreader for the *I*-branch could be implemented, thus utilizing only half the signal power<sup>3</sup>.

The PDI operation was carried out by summing up the *absolute* values of the de-spread outputs over multiple symbols. This compromise did not result in a significant loss in the detection performance. For non-coherent detection, the linear law (sum of absolute values) and square law (sum of squares) detectors differ very little in performance at all SNRs [36].

The symbol rate was chosen to be 40 Kbps and a preamble length of 5 bytes was used. The algorithm utilized the energy of the entire preamble (L = 40). A chipping rate of 2 MHz (m = 50) was used and the sampling rate of the ADC was set to 64 MHz (r = 32). A brief summary of the FPGA implementation is shown in Table 3.4. For a detailed understanding of these parameters, one can refer to the Virtex-II Pro documentation [39].
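The parameters m, r and L follow directly from the chosen rates, assuming one bit per symbol in the preamble:

$$m = \frac{2\ \text{Mchips/s}}{40\ \text{Ksymbols/s}} = 50, \qquad r = \frac{64\ \text{Msamples/s}}{2\ \text{Mchips/s}} = 32, \qquad L = 5 \times 8 = 40 \quad .$$

One symbol therefore spans $r \times m = 1600$ sampling cycles.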

An evaluation board for the AD9862 chip, a Mixed Signal Front-End (MxFE) Processor from Analog Devices, was used for converting a digital signal from the FPGA board to an analog intermediate frequency (IF) signal as well as for converting an analog IF signal to digital. On the transmit (Tx) path, the MxFE had two 14-bit 128 MSPS D/A converters and digital mixers for frequency upconversion. On the receive (Rx) path, the MxFE had two 12-bit A/D converters that could sample up to 64 MSPS. The output of the DAC and the input of the ADC were centered at 44 MHz.

The MxFE board was interfaced to a transceiver evaluation board from RFMagic, which operated at an RF frequency of 2.4 GHz (ISM band). On the RF side, this board was interfaced to commercially available antennas or an attenuator for controlled tests. The hardware setup in the lab is shown in Figure 3.7.

<sup>3</sup>The presence of a small frequency offset due to the lack of carrier recovery ensured that the energy of the signal was equally divided between the I and the Q branches over multiple symbols.

Table 3.4: FPGA implementation details for the modified parallel code acquisition block.

| Logic Distribution     | Used | Avail. | Utilized |
|------------------------|------|--------|----------|
| Number of 4-Input LUTs | 4054 | 18560  | 21%      |
| Number of BRAMs        | 85   | 88     | 96%      |



Figure 3.7: Lab setup.

### 3.5.2 Experiment Settings

The transmitter implemented on the prototyping board transmitted the preambles continuously, with zero time interval between packets. To demonstrate that the algorithm works at low signal strength, the test was conducted over a wireless channel with an input signal of $-110\,\mathrm{dBm}$ measured at the antenna output of the receiver.

### 3.5.3 Results

The output of the modified parallel code acquisition algorithm was examined using the Chipscope Pro software from Xilinx, which provides an on-chip debug and real-time logic analysis environment. The output of the algorithm was observed at random intervals of time for a duration of 4096 sampling cycles. The plot of the output is shown in Figure 3.8. The highest peak is at 181 units and repeats every 1600 sampling cycles. The second highest peak, which could be attributed to a multipath component, is at around 110 units and also repeats every 1600 sampling cycles. As expected, this repetition interval is equal to one symbol period ($r \times m = 32 \times 50 = 1600$ sampling cycles). A threshold between 110 and 180 can be used to correctly acquire the received signal.



Figure 3.8: Modified parallel code acquisition algorithm output observed every sampling cycle using Chipscope Pro at  $-110 \, \text{dBm}$ .

The performance of the modified parallel code acquisition algorithm was also evaluated through offline processing of received wireless baseband data. This data was obtained by looping back the RF transmissions to the RF receiver through attenuators. The attenuators were used to control the signal level at the receiver antenna. The attenuators were adjusted so that the received signal level measured at the receiver antenna output was -110 dBm. A packet transmitter and a digital IF receiver were implemented on a Virtex-4 FX FPGA [40] on the Memec Virtex-4 FX LC development kit. This FPGA was interfaced to a computer via Gigabit Ethernet for collecting large blocks of data at extremely high rates for offline processing. The data acquisition prototype was low-cost and mobile, and eliminated the requirement for expensive and bulky deep memory logic analyzers. The same AD9862 and RFMagic evaluation boards described in Section 3.5 were used to complete the transceiver.

The received samples from the ADC were sent to the computer where they were stored for analysis. The modified parallel code acquisition algorithm, described in Section 3.5, was implemented in software and its performance was analyzed using the stored samples. This setup provided a powerful tool to capture packets over the air or in a controlled environment using attenuators.

The FPGA used for the prototype implementation in Section 3.5 could only implement the I branch of the parallel energy detection algorithm due to the size limitations, thus working with only half the power of the incoming signal. The magnitude of the de-spread output ($y_{ij}^l$ in Figure 3.1) was summed over multiple symbols and used as a test statistic.

Figures 3.9 and 3.10 show the output of the modified parallel code acquisition algorithm. The output was computed off-line every sampling period for data obtained from actual preamble transmissions received by the RF front end hardware and data collection system. In this implementation, r = 16 samples/chip, m = 50 chips/symbol, and the preamble length, which is also the PDI length L, is equal to 40 bits.

Figure 3.9: Modified parallel code acquisition algorithm output every sampling cycle with alternating transmission and received input signal strength $-110\,\mathrm{dBm}$.

Figure 3.10: Modified parallel code acquisition algorithm output every sampling cycle with continuous transmission and received input signal strength $-110\,\mathrm{dBm}$.

For Figure 3.9, the received signal consisted of on-off preamble transmission at regular, equally-spaced intervals. The triangular envelope depicts the algorithm output as the preamble slides in and out of the PDI detector. The peak of each envelope coincides with the correct time for code acquisition and is obtained by examining the energy of all the bits in the packet.

Figure 3.10 shows the results for a continuous packet transmission with zero interval between consecutive transmissions, and provides a close-up detail of the peaks within the preamble transmission time interval. As expected, the peaks recur after every r \* m = 800 sampling periods, i.e., one symbol period. These results show that the algorithm is able to effectively process low-level signals obtained from the prototype data collection system.

## 3.7 Summary

The architecture for a parallel code acquisition operation is presented in this chapter. The micro-architectures for the important building blocks, namely, the parallel de-spreader and the post-detection integration, are described in detail. The FPGA implementation details are also presented.

A simplified parallel code acquisition algorithm is implemented on a radio prototype to show the feasibility of implementation in hardware and the ability of the algorithm to acquire the spreading code with low signal strength. Offline analysis of the captured preamble transmissions verifies the ability to acquire at low SNR.

## Acknowledgements

The work in this chapter is partially reprinted from the paper: M. Amde, J. Marciano, R. Cruz, and K. Yun, *Code acquisition at low SINR in spread spectrum communications*, Invited Paper in ISSSTA'06: The 9th International Symposium on Spread Spectrum Techniques and Applications, Manaus, Brazil, Aug. 2006 [41]. The dissertation author is the primary author of this paper.

# Chapter 4

# A Low SNR Synchronization System

This chapter presents the architecture of a MIMO synchronization system. It also describes a radio prototype implementation of the synchronization block for one transmit-receive antenna pair, which serves as the building block of the MIMO synchronization system. Finally, it concludes by describing the laboratory and outdoor over-the-air experiments and their results.

The chapter is organized as follows. Section 4.1 presents the architecture of the proposed DSSS-based MIMO synchronization system. Section 4.2 describes the design of the SISO synchronization system, a building block of the MIMO system, and Section 4.3 the simulation results. Section 4.4 describes the radio prototype and Section 4.5 the hardware verification testbed. Section 4.6 presents the experimental setup and results. Finally, we draw conclusions in Section 4.7.

# 4.1 System Architecture

The advantages presented in Section 1.4.1 led to the choice of DSSS modulation for synchronization at low SNR. Synchronization in a DSSS receiver involves code acquisition, code tracking, carrier recovery, multipath detection and frame synchronization [1,13,15], as shown in Figure 4.3. The code acquisition [13] ensures a correct de-spreading operation in the receiver by aligning the spreading code in the incoming packet with a local copy of the spreading code at the receiver. The parallel code acquisition algorithm, analyzed in Chapter 2 and implemented in Chapter 3, forms the backbone of the synchronization system.

Once the acquisition is achieved, the tracking [42, 43] maintains the alignment of the local spreading code with the received signal over the reception of an entire packet. The carrier recovery [44] is performed in the coherent demodulation schemes for correctly estimating the carrier frequency and phase from the incoming signal. The multipath detection estimates the multipath delay profile of the channel for the subsequent RAKE combining [1,13] operation. Finally, the frame synchronization [45] finds the start of the payload in the received packet. The synchronization system described in this chapter is designed for a non-coherent payload. Hence, the carrier recovery block is not implemented.

The basic MIMO transmitter is shown in Figure 1.2a. Figure 4.1 shows a transmission strategy for the MIMO synchronization system. The preambles used for synchronizing each transmit-receive antenna pair are modulated using DSSS for low SNR synchronization. It is assumed that a staggered transmission strategy is used for transmitting the preambles: that is, the synchronization time is divided equally for transmitting each preamble at the maximum transmit power. Each antenna transmits the preamble at the maximum transmit power in the allocated time and switches off when the other antennas transmit the preambles. This design choice is motivated by the optimality of the staggered transmission, shown in Section 2.3 for the MIMO parallel code acquisition as it leads to no self-interference and less noise during the acquisition process. The payload from each transmit antenna can be transmitted simultaneously to leverage the benefits of MIMO post-synchronization. Figure 1.2a shows an example transmission where the payload from each antenna is transmitted with the same power P/M.

Figure 4.1: MIMO synchronization system transmitter.

The basic MIMO receiver is shown in Figure 1.2b. The synchronization system presented in Figure 1.2b is used to synchronize each transmit-receive antenna pair, thus performing M \* N synchronizations. If all the preambles are transmitted simultaneously, M \* N synchronization blocks will be required to operate in parallel, resulting in a large hardware implementation. This implementation problem is alleviated when the staggered transmission strategy proposed in Figure 4.1 is adopted. Figure 4.2 presents the order in which the synchronization takes place at the receive antennas during the reception of the staggered preambles. Each receive antenna synchronizes to only one transmit antenna at a time. This reduces the number of parallel synchronization blocks from M \* N to N.
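The reduction from M \* N to N parallel blocks can be illustrated with a small scheduling sketch in Python. It assumes, based on the description above, that transmit antenna i sends its preamble in time slot i and that all N receive antennas synchronize to it during that slot; the function names are hypothetical.

```python
def staggered_sync_schedule(num_tx, num_rx):
    """Return one list of (tx, rx) pairs per preamble time slot."""
    return [[(tx, rx) for rx in range(1, num_rx + 1)]
            for tx in range(1, num_tx + 1)]


def parallel_blocks_needed(schedule):
    """Synchronization blocks that must run at the same time."""
    return max(len(slot) for slot in schedule)


if __name__ == "__main__":
    schedule = staggered_sync_schedule(num_tx=2, num_rx=3)
    for slot, pairs in enumerate(schedule, start=1):
        print("slot", slot, ["Sync_%d%d" % pair for pair in pairs])
    print("parallel blocks:", parallel_blocks_needed(schedule))
```

All M \* N synchronizations are still performed, but only N of them are active in any one slot, so the same N hardware blocks can be reused for each transmit antenna.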



Figure 4.2: MIMO receiver synchronization order for staggered transmission strategy. Sync<sub>ij</sub> refers to synchronization of the  $i^{th}$  transmit antenna at the  $j^{th}$  receive antenna.

To reduce the synchronization time, the RF upconverters (see Figure 1.2a) used at the transmit antennas, and the RF downconverters used at the receive antennas (see Figure 1.2b), can be synchronized in frequency as well as phase. This would reduce the synchronization time since the carrier synchronization has to be performed only for one transmit-receive antenna pair. Also, the knowledge of the multipath delay spread can be exploited to reduce the search interval for timing and frame synchronization after the first transmit-receive antenna pair synchronization. Such techniques cannot be applied in cooperative MIMO communications [9] since each user has its own separate oscillator for generating the clocks for the baseband and RF circuits. Moreover, the channel estimation is typically performed in a staggered fashion [5, 46] to avoid any self-interference due to simultaneous transmissions over the transmit antennas.

It is important to note that at least one complete preamble synchronization has to be performed for a transmit-receive antenna pair (for example, $Sync_{11}$ in Figure 4.2) before exploiting any a priori knowledge of the carrier synchronization at the transmit and receive antennas, or the characteristics of the channel. The first transmit-receive antenna pair synchronization is the same as the synchronization in a SISO system. The SISO synchronization system can be easily extended to a MIMO synchronization system by duplication. The SISO synchronization system will be called the "synchronization system", or simply "system", for convenience henceforth in this chapter. The SISO synchronization system design and implementation will be discussed in the rest of the chapter.

## 4.2 System Design

#### 4.2.1 System Overview

The system for synchronizing a DSSS modulated preamble is depicted in Fig. 4.3. The received RF signal is down-converted to baseband in-phase and quadrature-phase (I and Q) branches. It is then sampled by an analog-to-digital converter (ADC) to obtain digital baseband samples. The sampling frequency of the ADC defines the number of samples per chip available for synchronization at the receiver and is a key parameter in determining the performance of individual synchronization algorithms as well as the hardware implementation.

Figure 4.3: Receiver block diagram.

The baseband synchronization blocks are typically implemented in digital hardware. The *code acquisition* block aligns the incoming signal with the locally generated spreading sequence in the receiver for an accurate de-spreading operation. The parallel code acquisition scheme, described in Chapters 2 and 3, is used for fast acquisition at low SNR. The *multipath search* block resolves all the incoming multipaths<sup>1</sup> using a parallel search algorithm which reuses the parallel code acquisition block. The multipath information is passed to the RAKE [1,13] to improve the SNR for decoding by combining the energy present in the discrete incoming multipaths. The *frame synchronization* block finds the start of payload data in the incoming packet even in the presence of individual symbol errors. The *tracking* block maintains the symbol-level synchronization at the receiver by compensating for the timing drift during the reception of the entire packet.

<sup>1</sup>The term *multipaths* refers to signals arriving at the receiver via multiple paths.

The carrier recovery block is crucial for the coherent detection of the payload since it is responsible for correcting the frequency and the phase offsets at the baseband caused by an inaccurate RF down-conversion. For this particular implementation, the synchronization system was designed for a non-coherent payload which operates correctly for small frequency offsets $\Delta \omega_c T_s \ll \pi$, where $\Delta \omega_c$ is the residual frequency offset after down-conversion and $T_s$ is the symbol period. The de-spreading performance [15], and hence the synchronization performance, degrades gracefully with increasing frequency offset. Although a carrier recovery block was not implemented in the proposed system, the carrier recovery techniques described in [1, 44] can be easily adapted for this synchronization scheme if coherent operation is desired.

#### 4.2.2 Spatial Range and Receiver Sensitivity

The receiver sensitivity, i.e., the lowest strength of the signal that a receiver can detect accurately, is a practical figure of merit in the comparison of the reliability of wireless receivers. We illustrate a strategy used for extending the range of networks using the 802.11b [5] wireless LAN standard as a reference. Commercially available 802.11b receivers promise a receiver sensitivity of around -90 dBm. Suppose that an ad-hoc network requires nodes with a receiver sensitivity of -105 dBm to extend the range of the network by 15 dB, or, in terms of distance, an increase of free-space coverage by 5.6 times [1,47]. The spreading gain required to compensate for the low signal strength can be calculated using the equation below.

The total thermal noise power calculated for a 22 MHz RF bandwidth (i.e., a chipping rate of 11 Mchips/sec), similar to the 802.11b specification, is approximately -90.58 dBm [47], assuming that the noise figure of the RF front-end is 10 dB. The minimum SNR for reliable operation is assumed to be 3 dB *post de-spreading*, which will be justified in detail in Section 4.3. Hence, the spreading gain required for reliable operation is 17.42 dB, which corresponds to a spreading ratio of 55.26. Since the lengths of spreading codes with good autocorrelation and crosscorrelation properties are $2^n - 1$ [14], we chose a spreading code of length 63 in our design.
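The numbers in this subsection can be reproduced with a short link-budget sketch. The Python below is an illustration under stated assumptions: a room-temperature thermal noise density of -174 dBm/Hz (a standard value that is not stated explicitly in the text), free-space propagation with 20 dB of loss per decade of distance, and hypothetical function names.

```python
import math

KT_DBM_PER_HZ = -174.0  # assumed room-temperature thermal noise density


def thermal_noise_dbm(bandwidth_hz, noise_figure_db):
    """Noise power in dBm over the bandwidth, including the noise figure."""
    return KT_DBM_PER_HZ + 10 * math.log10(bandwidth_hz) + noise_figure_db


def required_spreading_gain_db(sensitivity_dbm, noise_dbm, min_snr_db):
    """Gain needed to raise the pre-de-spreading SNR to min_snr_db."""
    return min_snr_db - (sensitivity_dbm - noise_dbm)


def shortest_code_length(min_ratio):
    """Smallest length of the form 2**n - 1 that is at least min_ratio."""
    n = 1
    while 2 ** n - 1 < min_ratio:
        n += 1
    return 2 ** n - 1


noise = thermal_noise_dbm(22e6, 10.0)
gain_db = required_spreading_gain_db(-105.0, noise, 3.0)
ratio = 10 ** (gain_db / 10)
range_factor = 10 ** (15.0 / 20)  # 15 dB of extra link budget in free space

if __name__ == "__main__":
    print(round(noise, 2), round(gain_db, 2), round(ratio, 2))
    print(shortest_code_length(ratio), round(range_factor, 2))
```

Under these assumptions the sketch gives a noise power of about -90.58 dBm, a required spreading gain of about 17.42 dB (a ratio of about 55.26), a shortest suitable code length of 63, and a free-space range increase of roughly 5.6 times, in agreement with the values quoted above.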

It is important to note that the receiver sensitivity can be modified using different spreading gains for different applications. Also, the random interference due to simultaneous transmissions by other wireless devices will raise the noise floor, thus affecting the receiver sensitivity.

#### 4.2.3 Packet Format

| ········· |
|-----------|
|           |

Figure 4.4: Frame format.

Fig. 4.4 shows the frame format for the proposed system. A frame comprises a preamble, a start frame delimiter (SFD), and payload data. The preamble consists of M symbols (bits) of value *logic one*, each of which is spread by an identical pseudorandom (PN) sequence which has an impulse-like autocorrelation property [14]. The code acquisition is performed by aligning a locally generated PN sequence with the PN sequence in the incoming signal during the reception of a preamble. After the code acquisition is completed, a parallel search for multipaths is performed subsequently during the preamble reception.
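The impulse-like autocorrelation property can be checked numerically. The Python sketch below generates one period of a length-63 maximal-length sequence and computes its periodic autocorrelation. The generator polynomial $x^6 + x + 1$ and the seed are assumptions chosen for illustration; the specific PN sequence used in the design is not stated here.

```python
def m_sequence_63():
    """One period of a length-63 m-sequence as +1/-1 chips."""
    bits = [1, 0, 0, 0, 0, 0]             # any non-zero 6-bit seed works
    while len(bits) < 63:
        bits.append(bits[-5] ^ bits[-6])  # recurrence for x^6 + x + 1
    return [1 - 2 * b for b in bits]      # map bit 0 -> +1, bit 1 -> -1


def periodic_autocorrelation(chips):
    """Circular autocorrelation of the chip sequence for every lag."""
    n = len(chips)
    return [sum(chips[i] * chips[(i + lag) % n] for i in range(n))
            for lag in range(n)]


if __name__ == "__main__":
    acf = periodic_autocorrelation(m_sequence_63())
    print(acf[0], set(acf[1:]))
```

For a maximal-length sequence the periodic autocorrelation equals the sequence length at zero lag and -1 at every other lag, which is what makes the acquisition peak stand out when the local code is aligned with the repeated preamble symbols.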

The frame-level synchronization is achieved by the SFD detection after the multipath search. Although one PN sequence is used as a spreading sequence for the preamble because it consists of just one repeating symbol, the SFD detection requires two. The SFD and payload data are a combination of two symbols (denoting *logic one* and *logic minus one*) based on a binary orthogonal signaling.

In order to perform a non-coherent detection, two orthogonal *Kasami* sequences are used for spreading — one for each symbol. Kasami sequences have slightly sub-optimal autocorrelation properties as compared to the PN sequence, but they have very good crosscorrelation properties needed for a non-coherent detection of the payload [48].

The sequence of symbols that comprises the SFD is chosen to form a PN sequence at the symbol level because of its good autocorrelation property. Thus, there are two levels of spreading used for encoding the SFD: each symbol of the SFD is spread using one of the two Kasami sequences, and the symbols (comprising the SFD) themselves form a PN sequence.

Finally, the code tracking maintains the alignment for the de-spreading operation over the reception of the entire payload.

#### 4.2.4 Channel Model

A frequency-selective (multipath) Rayleigh fading model is used for simulations. Suppose there are  $L_p$  path components. Let  $\tau_l$  denote the relative delay of the  $l^{th}$  path from the first (reference) path, e.g.,  $\tau_1 = 0$ . The delay profile is given by  $\{\tau_l\}_{l=1}^{L_p}$ . The small-scale fading is modeled by a Rayleigh distribution and the path loss by an exponentially decaying multipath intensity profile (MIP) [32]:

$$\Omega_l = \Omega_1 \ e^{-\tau_l/\tau_{max}}, \quad l = 1, 2, 3, ..., L_p \tag{4.1}$$

where  $\Omega_l$  is the average power of the  $l^{th}$  path and  $\tau_{max}$  represents the maximum delay spread.

The specific delay profile used in the simulation, as shown in Fig. 4.5, is:

$$\{\tau_l\}_{l=1}^{L_p} = \{0, 3.88, 10.63, 15.75, 19.75, 22.13, 26.5\}$$

where the unit is a chip period  $T_c$ . Therefore, for a spreading ratio of 63, the maximum delay spread is chosen to be 0.42 times a bit period. In other words, the maximum delay spread is assumed to be less than a half symbol period.
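The average path powers for this delay profile can be tabulated with a few lines of Python. The sketch assumes that $\tau_{max}$ equals the largest delay in the profile (26.5 chips) and normalizes $\Omega_1$ to 1; both choices, and the function name, are assumptions for illustration.

```python
import math

DELAYS_CHIPS = [0, 3.88, 10.63, 15.75, 19.75, 22.13, 26.5]
SPREADING_RATIO = 63  # chips per bit


def multipath_intensity_profile(delays, tau_max, omega_1=1.0):
    """Average power of each path for an exponentially decaying MIP."""
    return [omega_1 * math.exp(-tau / tau_max) for tau in delays]


tau_max = max(DELAYS_CHIPS)
profile = multipath_intensity_profile(DELAYS_CHIPS, tau_max)
delay_spread_in_bits = tau_max / SPREADING_RATIO

if __name__ == "__main__":
    for tau, omega in zip(DELAYS_CHIPS, profile):
        print(tau, round(omega, 3))
    print(round(delay_spread_in_bits, 2))
```

With these assumptions the last path carries $e^{-1}$ (about 0.37) of the average power of the first path, and the maximum delay spread of 26.5 chips corresponds to about 0.42 of a bit period.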



Figure 4.5: Multipath channel model.

## 4.3 System Performance

The system performance was analyzed using simulations. A preamble length of 64 symbols was used for the packet shown in Figure 4.4. A 31-bit pseudorandom (PN) sequence was chosen as the SFD for improving the frame synchronization performance at low SNR by leveraging its impulse-like autocorrelation property. The parallel code acquisition algorithm executed a post-detection integration (PDI) over 32 symbols. The design details of the remainder of the synchronization algorithm after the code acquisition can be found in [49]. As mentioned in Section 4.2.2, the SNR is calculated after the de-spreading operation, which provides a spreading gain of approximately 18 dB (for a spreading ratio of 63).

Fig. 4.6 shows the system-level performance of the entire synchronization block (tracking excluded) under AWGN. The $P_{CODEACQ}$ curve describes the probability of correct acquisition for the code acquisition algorithm for various post de-spreading SNRs. The threshold for acquisition, as described in Section 2.2, is selected for a small wrong acquisition probability ($P_{WA}$) of 10<sup>-6</sup>. We see that for a fixed $P_{WA}$, $P_{CODEACQ}$ increases with increasing SNR and is very close to 1 at 3 dB. The $P_{SFD}$ curve shows the probability of frame synchronization given correct code acquisition. The $P_{SYNC}$ curve, which is obtained by multiplying the $P_{CODEACQ}$ and $P_{SFD}$ curves, is the probability of frame synchronization for the entire synchronization block. We observe that $P_{SYNC}$ is nearly 1 at 3 dB SNR.

Figure 4.6: System-level synchronization under AWGN.

Fig. 4.7 shows the system-level performance under a multipath Rayleigh fading environment. The SNR under the multipath Rayleigh fading conditions is calculated from the signal power $\Omega_1$, the strongest average multipath signal given in Equation (4.1). Note that it does not denote the symbol-level SNR after RAKE combining of the multipath signals.

The performance of the synchronization block is better at low ( $\leq 3 \text{ dB}$ ) SNR, compared to Fig. 4.6, due to a multipath diversity gain. We again observe that the probability of synchronization is higher than 0.95 for SNR of 3 dB. We observe that at 4 dB SNR the synchronization probability is close to 1, which implies that all the packets arriving at 4 dB post de-spreading SNR (-14 dB input SNR) will be received correctly.
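The relation between the post de-spreading SNR and the input SNR quoted above can be written out explicitly. The Python sketch below assumes an ideal processing gain of $10 \log_{10}$ of the spreading ratio; the function names are hypothetical.

```python
import math


def spreading_gain_db(spreading_ratio):
    """Ideal processing gain of the de-spreader in dB."""
    return 10 * math.log10(spreading_ratio)


def input_snr_db(post_despreading_snr_db, spreading_ratio):
    """SNR at the de-spreader input for a given post de-spreading SNR."""
    return post_despreading_snr_db - spreading_gain_db(spreading_ratio)


if __name__ == "__main__":
    print(round(spreading_gain_db(63), 2))
    print(round(input_snr_db(4.0, 63), 2))
```

For a spreading ratio of 63 the gain is just under 18 dB, so a post de-spreading SNR of 4 dB corresponds to an input SNR of approximately -14 dB.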



Figure 4.7: System-level synchronization under multipath Rayleigh fading.

It is important to note that, as shown in Figures 4.6 and 4.7, the code acquisition is the bottleneck for synchronization. Once the acquisition is achieved, the post-de-spreading SNR is sufficient for performing the rest of the synchronization operations.

## 4.4 Radio Prototype

The synchronization block (SB) is implemented on a radio prototyping platform to demonstrate the feasibility of its implementation.

Figure 4.8 shows a block diagram of the radio prototyping platform and Figure 4.9 the actual setup in the lab. The RF receiver front-end is implemented using a National Instruments preamplifier (PXI-5690) [50] and a downconverter (PXI-5600) [51], which are housed in an NI PXI chassis (PXI-1045) [52]. The digital baseband component of the SB is implemented on a Xilinx Virtex-4 (XC4VSX35) FPGA [40]. This FPGA is part of a Xilinx XtremeDSP Development Kit for Virtex-4 [53], which also hosts an Analog Devices AD6645 [54] 14-bit ADC with a maximum sampling rate of 105 MSPS. The FPGA is interfaced to the PC using a JTAG cable for conducting real-time measurements using the Chipscope [55] software logic analyzer provided by Xilinx.

Figure 4.8: Block diagram of the radio prototyping platform.

Figure 4.9: Radio prototype setup in lab.

The PXI-5690 RF preamplifier has gain and noise figure characteristics that optimize the dynamic range and sensitivity of the NI PXI-5600 RF downconverter. The typical noise figure of the preamplifier is 5 dB [56]. The National Instruments PXI-5600 downconverts an RF signal of up to 2.7 GHz to a low intermediate frequency (IF) of 15 MHz. The noise density of the PXI-5690 and PXI-5600 combination ranges from $-162\,\mathrm{dBm/Hz}$ to $-164\,\mathrm{dBm/Hz}$, depending on the frequency range [56]. The downconverted IF signal is first digitized at 80 MSPS by the ADC and then digitally downconverted to baseband before being processed by the baseband section of the SB.

The following parameters were used for designing a VLSI implementation of the digital baseband section:

- Chipping rate = 10 Mchips/sec
- Sampling rate = 80 Msamples/sec
- Oversampling ratio  $(N_s) = 8$
- Spreading ratio  $(N_c) = 63$
- Symbol rate = 158.73 Kbits/sec
- Length of PDI operation = 32 symbol periods
- Number of RAKE fingers  $(N_{FINGER}) = 4$
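The parameters listed above determine the number of code phases and the memory depths of the acquisition block. The following Python sketch derives them using the FIFO lengths given in Sections 3.2 and 3.3; the function and key names are hypothetical.

```python
def acquisition_sizing(chip_rate_hz=10e6, sample_rate_hz=80e6,
                       spreading_ratio=63, pdi_symbols=32):
    """Derive rates and memory depths from the design parameters."""
    oversampling = round(sample_rate_hz / chip_rate_hz)   # N_s
    code_phases = oversampling * spreading_ratio          # samples per symbol
    return {
        "oversampling": oversampling,
        "symbol_rate_hz": chip_rate_hz / spreading_ratio,
        "code_phases": code_phases,
        "despreader_fifos": spreading_ratio - 1,          # each of depth N_s
        "pdi_input_fifo_depth": code_phases * pdi_symbols,
        "pdi_output_fifo_depth": code_phases,
    }


if __name__ == "__main__":
    for name, value in acquisition_sizing().items():
        print(name, value)
```

With the listed values this gives 504 code phases per symbol, a symbol rate of about 158.73 Kbits/sec, and a PDI input FIFO of 16128 entries, which illustrates why the memory requirement grows with both the spreading ratio and the PDI length.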

The area utilized on the FPGA was *directly proportional* to the number of symbols used in the post-detection integration operation for code acquisition and to the spreading ratio. A brief summary of the FPGA implementation is shown in Table 4.1. Lookup tables (LUTs) are used to implement the combinational Boolean logic inside the FPGA [40]. Block RAMs (BRAMs) are on-chip dual-ported memories distributed inside the FPGA fabric for fast memory read/write operation [40]. For a better understanding of these parameters, one can refer to the Virtex-4 documentation [40].

| Logic Distribution     | Used  | Avail. | Utilized |
|------------------------|-------|--------|----------|
| Number of 4-Input LUTs | 25298 | 30720  | 82%      |
| Number of BRAMs        | 157   | 192    | 81%      |
| Number of DSP48s       | 18    | 192    | 9%       |

Table 4.1: FPGA implementation details for the synchronization system.
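
The utilization column of Table 4.1 can be cross-checked from the used and available counts. The sketch below assumes that the percentages are truncated to integers, which is an inference from the table values rather than something stated in the tool output; the names are hypothetical.

```python
def utilization_percent(used, available):
    """Integer percentage, truncated toward zero (assumed reporting convention)."""
    return (100 * used) // available


# (used, available) pairs copied from Table 4.1.
resources = {
    "4-input LUTs": (25298, 30720),
    "BRAMs": (157, 192),
    "DSP48s": (18, 192),
}

utilization = {
    name: utilization_percent(used, available)
    for name, (used, available) in resources.items()
}
```

With truncation, the computed values agree with the table; rounding to the nearest integer would instead give 82% for the BRAMs.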

## 4.5 Verification Testbed

The SB implementation is verified using the hardware test setups available on the testbed. The verification consists of two steps:

- *Digital Baseband Test*: Baseband test vectors used for simulations are fed to the SB implementation on the FPGA as digital inputs to verify the digital hardware. Note that the signals used for the simulations and the digital hardware verification are *identical*, including the noise characteristics.
- *RF Test*: The baseband test vectors used for simulations (without the noise) are upconverted to RF and fed as input to the SB. This tests the RF downconversion and analog-to-digital conversion components of the prototype. The thermal noise is generated by the analog and RF circuitry/test equipment.

#### 4.5.1 Digital Baseband Test Setup

Figure 4.10 shows a block diagram of the digital baseband test setup. Baseband test vectors are applied as a digital stimulus to test the digital subsystem of the SB. This is achieved by using a combination of the Agilent Baseband Studio N5110B [57] and N5102A [58] modules. The N5110B Baseband Studio for Waveform Capture and Playback allows playback of custom digital IQ test vectors of up to 512 million samples. The Agilent N5102A Baseband Studio Digital Signal Interface module provides an easily controllable digital interface to connect to the digital subsystem. An Agilent logic analyzer is used for monitoring and calibrating the performance of the digital logic inside the FPGA. The Agilent logic analyzer consists of a 16901A Logic Analysis System [59] and a 16950B measurement module [60]. The 16901A is a 2-slot logic analysis mainframe and the 16950B is a 68-channel 4 GHz timing, 667 MHz state logic analysis measurement module.



Figure 4.10: Digital baseband test setup

#### 4.5.2 RF Test Setup

Figure 4.11 shows a block diagram of the RF test setup. In this setup, baseband test vectors are up-converted to 2.4 GHz by an Agilent E4438C ESG [61] signal generator, which serves as the RF transmitter. The transmission bandwidth is set to 20 MHz, for a chipping rate of 10 Mchips/sec, similar to that used in WLANs [5]. The transmitter output is connected to the radio prototype input via a variable attenuator. This setup emulates a single line-of-sight link between a transmitter and a receiver. The received signal level can be adjusted using the variable attenuator. The logic analyzer in this setup is used for real-time monitoring of the SB performance. It is also used for calculating the received SNR for accurate calibration of test results.



Figure 4.11: RF test setup

#### 4.5.3 Hardware Test Results

Figure 4.12 shows the probability of correct synchronization for the SB at various SNRs in an additive white Gaussian noise (AWGN) environment. As mentioned earlier in Section 4.2.2, the SNR is calculated after the de-spreading operation, which provides a spreading gain of approximately 18 dB (for a spreading ratio of 63). In Figure 4.12, we observe that the digital verification results closely match the simulation results and thus confirm the correctness of the digital implementation. We also observe that the controlled RF verification tests, using the setup described in Section 4.5.2, accurately match the simulation and digital verification results.
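
The 18 dB figure quoted above is the usual processing gain $10 \log_{10} N_c$. The following Python sketch evaluates it for $N_c = 63$ and converts a post-despreading SNR back to the corresponding pre-despreading (chip-level) SNR; the function names and the 0 dB example input are hypothetical.

```python
import math


def spreading_gain_db(spreading_ratio):
    """Processing gain of de-spreading, in dB."""
    return 10 * math.log10(spreading_ratio)


def pre_despreading_snr_db(post_despreading_snr_db, spreading_ratio):
    """SNR before de-spreading implied by a given post-despreading SNR."""
    return post_despreading_snr_db - spreading_gain_db(spreading_ratio)


gain_db = spreading_gain_db(63)
# Hypothetical example: 0 dB after de-spreading corresponds to about
# -18 dB before de-spreading.
chip_level_snr_db = pre_despreading_snr_db(0.0, 63)
```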



Figure 4.12: Simulation and verification results under AWGN.

Figure 4.13 shows that the digital verification results for the multipath Rayleigh fading channel described in Section 4.2.4 accurately match the simulation results. Controlled RF verification tests, similar to the experiments conducted for the AWGN environment, could not be conducted due to the difficulty of emulating a specific multipath environment in the laboratory. From Figures 4.12 and 4.13, we can conclude that the implementation of the SB on the prototyping testbed faithfully translates the SB design into real hardware.



Figure 4.13: Simulation and verification results for multipath environment.

## 4.6 Experimental Results

The performance of the SB prototype is calibrated in a controlled lab setup. After the calibration, the SB prototype is evaluated at two outdoor test sites using over-the-air packet receptions in highly scattering multipath environments.

#### 4.6.1 Calibration

The ability of the SB to receive low SNR signals has already been established via simulations and hardware tests in Sections 4.2 and 4.5, respectively. Calibrations were also conducted to calculate the sensitivity of the SB prototype. The sensitivity is the minimum signal level required for reliable operation of the SB. A 95% synchronization rate was used as the yardstick for reliable operation of the SB. The RF test setup described in Section 4.5.2 is used for conducting the sensitivity calibration. The channel used for the calibration is equivalent to a non-fading channel with AWGN added at the receiver. The received signal level is controlled using an attenuator between the Tx RF output and the Rx RF input.



Figure 4.14: Sensitivity experiment results.

The receiver sensitivity is used as a figure of merit for calibrating the performance of wireless receivers. For example, according to the 802.11 standard [5], the receiver sensitivity is the minimum input signal level for which the frame error ratio is less than 8% with a payload length of 1024 bytes encoded using 2 Mbps DQPSK modulation. The sensitivity of commercial 802.11a/b/g wireless receivers varies from vendor to vendor and also depends on the data rates and modulation used. The lowest sensitivity level encountered in the literature is $-94 \,\mathrm{dBm}$ [62].

From Figure 4.14, we observe that the system synchronizes reliably at low signal levels at the RF input of the receiver. For example, we are able to synchronize 99.37% of the received packets when the signal level is -101.7 dBm. The sensitivity of the SB is observed to be at most -102 dBm since it correctly synchronizes more than 95% of the received packets at that signal level. However, we are unable to compare the SB sensitivity to the receiver sensitivity of the commercial 802.11a/b/g receivers because our receiver is designed only for the synchronization test, not for receiving the payload.

#### 4.6.2 Outdoor Experiments

The setup for outdoor experiments is similar to the RF test setup described in Section 4.5.2 with the attenuator replaced by commercial off-the-shelf omnidirectional antennas. The antennas are placed at a height of one meter while conducting the experiments. The received signal power is controlled by varying the transmit power. The average received signal power is measured using the spectrum analyzer placed at the receiver. The multipath delay profile and fading characteristics of the channel were not measured or characterized during the experiment.

The rooftop of the Atkinson Hall building at the University of California, San Diego was used as the first test site. Figure 4.15 shows the experimental setup at the rooftop. Figure 4.16 shows the layout of the test site. The transmitter (Tx) and the receiver (Rx) are separated by 54 meters. Although there is a line-of-sight path between Tx and Rx, the proximity of the Tx and Rx to metal walls, vents, and other structures causes multipath propagation. The multipath intensity profile can be assumed to be constant over time since the Tx, Rx, and scatterers are stationary.

Figure 4.17 shows the performance of the SB on the rooftop averaged over 50,000 packet transmissions. We observe that almost all the packets are correctly received at a signal level of -92.5 dBm and above. The received signal level of -92.5 dBm corresponds to -27.4 dBm of transmit power.



Figure 4.15: Experimental setup at the rooftop.



Figure 4.16: Rooftop layout.



Figure 4.17: SB performance at the two outdoor test sites.

The second test site is a long pathway in front of the Geisel Library at the University of California, San Diego, also known as the Library Walk. The transmitter and receiver were placed 137 meters apart for the experiment. In addition to the line-of-sight component, multipath components arrived at the receiver due to trees, surrounding buildings, and people walking across the experimental setup. Unlike the first outdoor experiment, the multipath intensity profile at this test site cannot be assumed to be constant over time, as the scatterers are mobile and change positions over time. The layout of the Library Walk is given in Figure 4.19 and a photograph of the experimental setup is given in Figure 4.18.

Figure 4.17 shows the performance of the SB on the library walk averaged over 50,000 packet transmissions. We observe that almost all the packets are correctly received at a signal level of -93 dBm and above. The -93 dBm received signal level corresponds to -13.4 dBm of transmit power.
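
The transmit and received power levels reported for the two sites imply a net link loss for each experiment. The Python sketch below computes it as the difference of the two levels in dB. This net loss lumps together path loss, antenna gains, and any cable losses, none of which are itemized here, and the free-space comparison at the end is an added reference point, not a model of either site. All names are hypothetical.

```python
import math


def net_link_loss_db(tx_power_dbm, rx_power_dbm):
    """Net loss between transmitter output and receiver input, in dB."""
    return tx_power_dbm - rx_power_dbm


# Power levels taken from the rooftop and library walk results.
rooftop_loss_db = net_link_loss_db(-27.4, -92.5)
library_walk_loss_db = net_link_loss_db(-13.4, -93.0)
measured_delta_db = library_walk_loss_db - rooftop_loss_db

# Loss increase that free-space propagation alone would predict when the
# separation grows from 54 m to 137 m (20 dB per decade of distance).
free_space_delta_db = 20 * math.log10(137 / 54)
```

The net losses are about 65.1 dB on the rooftop and 79.6 dB on the library walk, a difference of 14.5 dB compared with roughly 8.1 dB expected from distance alone in free space. Since the channels were not characterized, this comparison only suggests that the two sites differ in more than separation.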

Figure 4.18: Library walk photo.

Figure 4.19: UCSD Library Walk layout.

The results from the rooftop experiments show that the SB is able to synchronize reliably over 54 meters in a rich multipath environment for a transmit power of -27.4 dBm. The results from the library walk show that the SB is able to synchronize reliably over 137 meters in an urban line-of-sight setting for a transmit power of -13.4 dBm. Note that 30 dBm is the maximum transmit power limit with omnidirectional antennas over the 2.4 GHz unlicensed ISM band in the United States [63] (other countries have similar limits). Hence, the spatial range can be further improved by raising the transmit power above the levels used in the experimental setups. For example, a 6 dB increase in transmit power can double the spatial range in free space [47]. The author could not find a test site where the transmitter and receiver could be placed more than 137 meters apart to reliably conduct the experiments at higher transmit powers.
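
The 6 dB rule follows from the free-space path loss growing as $20 \log_{10} d$, so that an extra $\Delta P$ dB of transmit power scales the range by $10^{\Delta P / 20}$. The sketch below evaluates this factor; it is an idealized free-space calculation only, and real multipath environments generally show a steeper loss with distance. The function name is hypothetical.

```python
def free_space_range_factor(extra_tx_power_db):
    """Range multiplier in free space for a given increase in transmit power."""
    return 10 ** (extra_tx_power_db / 20)


# A 6 dB increase roughly doubles the range.
factor_6db = free_space_range_factor(6.0)

# Headroom between the library walk transmit power and the 30 dBm limit.
headroom_db = 30.0 - (-13.4)
factor_headroom = free_space_range_factor(headroom_db)
```

The library walk experiment leaves 43.4 dB of headroom below the 30 dBm limit, which in this idealized model corresponds to a range factor of roughly 148. This should be read as an upper bound under free-space assumptions, not as a predicted range.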

## 4.7 Summary

A DSSS-based synchronization system is proposed for synchronization at a low SNR in asynchronous packet-based MIMO communications. The synchronization system was designed for staggered preamble transmission to eliminate self-interference during synchronization and to reduce non-coherent combining loss in the parallel code acquisition. The MIMO system uses as building blocks the single transmit-receive antenna pair synchronization blocks employed in the SISO communications.

In order to demonstrate low SNR operation in asynchronous packet-based DSSS communications, the SISO synchronization system was designed and implemented on a reconfigurable radio prototyping platform. It uses the parallel code acquisition algorithm, analyzed in Chapter 2 and prototyped in Chapter 3, for fast acquisition at low SNR. By utilizing the post-acquisition spreading gain, the parallel code acquisition also improves the SNR for the rest of the synchronization functions, which can then be performed in a traditional fashion.

The prototype is verified on a hardware testbed, and controlled lab experiments are carried out to calibrate its performance. Finally, experiments are conducted at two outdoor test sites to evaluate its performance over real wireless channels. Experimental results show that the prototype synchronizes reliably at signal levels below the receiver sensitivity levels reported for commercial 802.11 receivers. It also functions reliably over long distances in outdoor environments.

### Acknowledgements

The work in this chapter is partially reprinted from two papers: M. Amde, J. Shim, J. Marciano, K. Yun, and R. Cruz, "A Low SINR Synchronization System for Direct-Sequence Spread-Spectrum Communications: Radio Prototype, Verification Testbed, and Experimental Results," in Tridentcom'08: The 4th International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities, Innsbruck, Austria, Mar. 2008 [37], and J. Shim, M. Amde, K. Yun, and R. Cruz, "Synchronization at low SINR in asynchronous direct-sequence spread-spectrum communications," Best Paper Award in ICSNC'07: The Second International Conference on Systems and Networks Communications, Cap Esterel, France, Aug. 2007 [49]. The work was carried out jointly with Jaewook Shim in the ECE department at UCSD.

## Chapter 5

# **Discussion and Conclusion**

MIMO communications, in its various forms, has the potential to counter the two major drawbacks of wireless communications: limited reliability and limited throughput. However, many practical problems need to be solved to obtain the benefits predicted in theory. One of the key problems is synchronization: a potential bottleneck that could handicap the performance of future MIMO systems. This problem is particularly acute in packet-based systems since the time to synchronize is limited.

This dissertation is focused on the topic of packet-based MIMO synchronization, a critical, yet often neglected, function for enabling packet-based MIMO communications. Specifically, it presents the theory, architecture, implementation, and experimental results for synchronization at low SNR in packet-based MIMO communications. By decoupling the synchronization and decoding functions, this system can be used as a generic synchronization front-end for any MIMO communication system.

This dissertation presents the following important results:

*Parallel Code Acquisition Analysis*: It describes the performance analysis for the parallel code acquisition in MIMO-DSSS systems. It shows the theoretical analysis for the pilot-based system. It also presents a scheme to theoretically predict the packet-based performance based upon the corresponding pilot-based performance.

*Staggered Transmission Strategy*: It proposes a staggered transmission strategy for optimal code acquisition in DSSS systems with multiple antennas. Formal proof of the optimality is provided.

*Parallel Code Acquisition Architecture and Implementation*: It presents the parallel code acquisition architecture and shows the feasibility of implementation in the current VLSI technology.

*Synchronization System Design*: It presents the design of a DSSS-based synchronization front-end for the low SNR synchronization in MIMO systems. The design is analyzed via simulations.

*Synchronization System Architecture and Implementation*: It describes the synchronization system architecture and implements the single antenna building block for the MIMO system on a reconfigurable radio prototyping platform.

*Experiments*: It presents the verification of the parallel code acquisition and synchronization system prototypes through controlled baseband and RF experiments in the lab. It also reports the results of the experiments conducted at the two outdoor test sites.

*Radio Prototyping and Experimentation*: It provides several examples for performing radio prototyping and outdoor experimentation. It will serve as a reference to future researchers.

## 5.1 Future Work

The work presented in this dissertation can be extended in multiple directions:

- Optimal transmission strategy: The optimality of the staggered transmission strategy is proved in the absence of any channel state information (CSI) at the transmitter. The optimal transmission strategy for known CSI at the transmitter needs to be investigated.
- MIMO prototype implementation: The prototype implementation of the SISO synchronization system (the building block for the MIMO synchronization system) has been performed. The MIMO prototype implementation could not be performed due to the limitations of the digital logic available on the reconfigurable prototyping platform, and can be implemented on a different platform in the future.

- Decoupling the preamble and the payload modulation format: The preamble and payload modulation formats can be decoupled using the synchronization front-end presented in this dissertation. For example, a system could use the DSSS modulation format in the preamble for reliable synchronization, and an OFDM [1,2] payload to meet the throughput requirements of the system. The potential benefits of such an approach need further investigation.
- Cooperative MIMO communications: The MIMO synchronization techniques presented in the dissertation are described for point-to-point communication, where all the transmit antennas reside at one transmitting node and all the receive antennas reside at another receiving node. Similar techniques need to be investigated for cooperative MIMO communications [9].
- Synchronization for high SNR systems: The DSSS-based synchronization front-end has been designed for a low SNR operation. Its performance at high SNR (with a shorter preamble) needs to be compared to the current state-of-the-art for possible improvements in the synchronization performance.
- Interference and jamming: The performance analyses of the parallel code acquisition as well as the synchronization schemes need to be performed in the presence of narrowband interference and jamming.

# Bibliography

- 1. J. G. Proakis, *Digital Communications Fourth Edition*. McGraw-Hill, 2006.
- 2. D. Tse and P. Viswanath, *Fundamentals of wireless communication*. Cambridge University Press, 2005.
- 3. A. Paulraj, R. Nabar, and D. Gore, *Introduction to space-time wireless communications*. Cambridge University Press, 2005.
- 4. R. Schaller, "Moore's law: past, present and future," *Spectrum, IEEE*, vol. 34, no. 6, pp. 52–59, Jun 1997.
- 5. IEEE standards association, IEEE 802.11, 1999 edition. [Online]. Available: http://standards.ieee.org/getieee802/802.11.html
- 6. P. Gupta and P. Kumar, "The capacity of wireless networks," *Information Theory, IEEE Transactions on*, vol. 46, no. 2, pp. 388–404, mar 2000.
- 7. M. Takai, J. Martin, and R. Bagrodia, "Effects of wireless physical layer modeling in mobile ad hoc networks," in *MobiHoc '01: Proceedings of the* 2nd ACM international symposium on Mobile ad hoc networking & computing, New York, NY, USA, 2001, pp. 87–94.
- 8. N. Ehsan and R. Cruz, "On the optimal SINR in random access networks with spatial reuse," in CISS '06: IEEE Conference on Information Sciences and Systems, Princeton University, USA, Mar. 2006.
- 9. A. Nosratinia, T. Hunter, and A. Hedayat, "Cooperative communication in wireless networks," *Communications Magazine, IEEE*, vol. 42, no. 10, pp. 74–80, Oct. 2004.
- 10. D. Gesbert, M. Shafi, D. shan Shiu, P. Smith, and A. Naguib, "From theory to practice: an overview of mimo space-time coded wireless systems," *Selected Areas in Communications, IEEE Journal on*, vol. 21, no. 3, pp. 281–302, apr 2003.

- 11. G. J. Foschini, Layered Space-Time Architecture for Wireless Communication in a Fading Environment When Using Multiple Antennas, vol. 1, no. 2, pp. 41–59, 1996.
- 12. P. Wolniansky, G. Foschini, G. Golden, and R. Valenzuela, "V-blast: an architecture for realizing very high data rates over the rich-scattering wireless channel," sep-2 oct 1998, pp. 295–300.
- 13. R. L. Pickholtz, D. L. Schilling, and L. B. Milstein, "Theory of spread-spectrum communications - a tutorial," *IEEE Transactions on Communications*, vol. COM-30, no. 5, pp. 855–884, May 1982.
- 14. S. W. Golomb and G. Gong, Signal Design for Good Correlation: For Wireless Communication, Cryptography and Radar. Cambridge University Press, 2005.
- 15. A. J. Viterbi, *Code Division Multiple Access: Principles of Spread Spectrum Communication*. Addison-Wesley, 1995.
- 16. R. B. Ward, "Acquisition of pseudonoise signals by sequential estimation," *IEEE Transactions on Communications Technology*, vol. COM-13, pp. 474–483, Dec. 1965.
- 17. R. B. Ward and K. P. Yiu, "Acquisition of pseudonoise signals by recursion-aided sequential estimation," *IEEE Transactions on Communications*, vol. COM-25, pp. 784–794, Aug. 1977.
- 18. A. Polydoros and C. Weber, "A unified approach to serial search spread-spectrum code acquisition, part I: General theory," *IEEE Transactions on Communications*, vol. COM-32, no. 5, pp. 542–549, May 1984.
- 19. A. Polydoros and C. Weber, "A unified approach to serial search spread-spectrum code acquisition, part II: A matched filter receiver," *IEEE Transactions on Communications*, vol. COM-32, no. 5, pp. 550–560, May 1984.
- 20. D. Sarwate. Acquisition of direct-sequence spread-spectrum. [Online]. Available: http://www.ifp.illinois.edu/~sarwate/
- 21. E. Sourour and S. C. Gupta, "Direct-sequence spread-spectrum parallel acquisition in nonselective and frequency-selective rician fading channels," *IEEE Journal on Selected Areas in Communications*, vol. 10, no. 3, pp. 535–544, Apr. 1992.
- 22. R. R. Rick and L. B. Milstein, "Optimal decision strategies for acquisition of spread-spectrum signals in frequency-selective fading channels," *IEEE Transactions on Communications*, vol. 46, no. 5, pp. 686–694, May 1998.

- 23. U. Cheng, W. Hurd, and J. Statman, "Spread-spectrum code acquisition in the presence of doppler shift and data modulation," *Communications, IEEE Transactions on*, vol. 38, no. 2, pp. 241–250, feb 1990.
- 24. W. H. Ryu, M. K. Park, and S. K. Oh, "Code acquisition schemes using antenna arrays for ds-ss systems and their performance in spatially correlated fading channels," *Communications, IEEE Transactions on*, vol. 50, no. 8, pp. 1337–1347, Aug 2002.
- 25. R. Rick and L. Milstein, "Parallel acquisition of spread-spectrum signals with antenna diversity," *Communications, IEEE Transactions on*, vol. 45, no. 8, pp. 903–905, Aug 1997.
- 26. O.-S. Shin and K. B. Lee, "Use of multiple antennas for ds/cdma code acquisition," vol. 1, 2002, pp. 621–625.
- 27. B. Wang and H. Kwon, "Pn code acquisition using smart antenna for spread-spectrum wireless communications. i," *Vehicular Technology, IEEE Transactions on*, vol. 52, no. 1, pp. 142–149, jan 2003.
- 28. B. Wang and H. Kwon, "Pn code acquisition for ds-cdma systems employing smart antennas. ii," *Wireless Communications, IEEE Transactions on*, vol. 2, no. 1, pp. 108–117, jan. 2003.
- 29. M. Katz, J. Iinatti, and S. Glisic, "Recent advances in multi-antenna based code acquisition," aug.-2 sept. 2004, pp. 199–206.
- 30. S. Won and L. Hanzo, "Analysis of serial-search-based code acquisition in the multiple-transmit/multiple-receive-antenna-aided ds-cdma downlink," *Vehicular Technology, IEEE Transactions on*, vol. 57, no. 2, pp. 1032–1039, March 2008.
- 31. S. Won and L. Hanzo, "Non-coherent code acquisition in the multiple transmit/multiple receive antenna aided single- and multi-carrier ds-cdma downlink," *Wireless Communications, IEEE Transactions on*, vol. 6, no. 11, pp. 3864–3869, november 2007.
- 32. M. K. Simon and M.-S. Alouini, *Digital Communication over Fading Channels*. Wiley, 2005.
- 33. J. Lehnert and M. Pursley, "Error probabilities for binary direct-sequence spread-spectrum communications with random signature sequences," Communications, IEEE Transactions on, vol. 35, no. 1, pp. 87–98, Jan 1987.
- 34. S. Yoon and Y. Bar-Ness, "Performance analysis of linear multiuser detectors for randomly spread cdma using gaussian approximation," *Selected Areas in Communications, IEEE Journal on*, vol. 20, no. 2, pp. 409–418, Feb 2002.

- 35. S. Verdu, *Multiuser Detection*. Cambridge University Press, 1998.
- 36. J. Marcum, "A statistical theory of target detection by pulsed radar," *Information Theory, IEEE Transactions on*, vol. 6, no. 2, pp. 59–267, Apr 1960.
- 37. M. Amde, J. Shim, J. Marciano, K. Yun, and R. Cruz, "A low SINR synchronization system for direct-sequence spread-spectrum communications: radio prototype, verification testbed and experimental results," in *TridentCom '08: Proceedings of the 4th International Conference on Testbeds and research infrastructures for the development of networks & communities.* ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2008, pp. 1–6.
- 38. E. Dinan and B. Jabbari, "Spreading codes for direct sequence cdma and wideband cdma cellular networks," *Communications Magazine, IEEE*, vol. 36, no. 9, pp. 48–54, Sep 1998.
- 39. Xilinx Virtex-II Pro. [Online]. Available: http://www.xilinx.com/virtex2pro
- 40. Xilinx Virtex-4. [Online]. Available: http://www.xilinx.com/virtex4
- 41. M. Amde, J. Marciano, R. Cruz, and K. Yun, "Code acquisition at low SINR in spread spectrum communications," in the 9th International Symposium on Spread Spectrum Techniques and Applications, Manaus, Brazil, Aug. 2006.
- 42. A. Polydoros and C. L. Weber, "Analysis and optimization of correlative codetracking loops in spread-spectrum systems," *IEEE Transactions on Communications*, vol. COM-33, no. 1, pp. 30–43, Jan. 1985.
- 43. G. Fock and J. B. et al, "Channel tracking for rake receivers in closely spaced multipath environment," *IEEE Journal on Selected Areas in Communications*, vol. 19, no. 12, pp. 2420–2431, Dec. 2001.
- 44. H. Meyr and G. Ascheid, Synchronization in Digital Communications. Volume 1 : Phase-, Frequency-Locked Loops, and Amplitude Control. Wiley-Interscience, 1989.
- 45. X. Chen, Y. Li, and S. Cheng, "Frame synchronization by de-correlation detection," in *International Conference on Communication Technology, ICCT'98*, Beijing, China, Oct. 1998.
- 46. A. Ling and L. Milstein, "The effects of spatial diversity and imperfect channel estimation on wideband mc-ds-cdma and mc-cdma," *Communications, IEEE Transactions on*, vol. 57, no. 10, pp. 2988 –3000, october 2009.
- 47. T. Rappaport, K. Blankenship, and H. Xu, *Tutorial: Propagation and Radio System Design Issues in Mobile Radio Systems for the GloMo Project*. Virginia Tech, 1997.

- 48. D. Sarwate and M. Pursley, "Crosscorrelation properties of pseudorandom and related sequences," *Proceedings of the IEEE*, vol. 68, no. 5, pp. 593–619, May 1980.
- 49. J. Shim, M. Amde, J. Marciano, K. Yun, and R. Cruz, "Synchronization at low SINR in asynchronous direct-sequence spread-spectrum communications," in *The Second International Conference on Systems and Networks Communications*, Cap Esterel, France, Aug. 2007.
- 50. NI PXI-5690 3 GHz RF Preamplifier. [Online]. Available: http://sine.ni.com/nips/cds/view/p/lang/en/nid/202388
- 51. NI PXI-5661 2.7 GHz RF Vector Signal Analyzer with Digital Downconversion. [Online]. Available: http://sine.ni.com/nips/cds/view/p/lang/en/nid/203038
- 52. General-Purpose 18-Slot Chassis for PXI. [Online]. Available: http://www.ni.com/pdf/products/us/pxi1045.pdf
- 53. XtremeDSP Development Kit for Virtex-4 (DO-DI-DSP-DK4). [Online]. Available: http://www.xilinx.com/bvdocs/userguides/ug\_xtremedsp\_devkitIV.pdf
- 54. AD6645 14-Bit, 80 MSPS/105 MSPS A/D Converter. [Online]. Available: http://www.analog.com/en/prod/0,2877,AD6645,00.html
- 55. Chipscope Pro. [Online]. Available: http://www.xilinx.com/chipscope
- 56. NI PXI-5690 with NI PXI-5660 Specifications. [Online]. Available: http://www.ni.com/pdf/manuals/371674b.pdf
- 57. Agilent N5110B Baseband Studio for Waveform Capture and Playback: Technical Overview. [Online]. Available: http://cp.literature.agilent.com/litweb/pdf/5989-2095EN.pdf
- 58. N5102A Baseband Studio Digital Signal Interface Module: Technical Overview. [Online]. Available: http://cp.literature.agilent.com/litweb/pdf/5988-9495EN.pdf
- 59. Agilent 16900 Series Logic Analysis System Mainframes. [Online]. Available: http://cp.literature.agilent.com/litweb/pdf/5989-0421EN.pdf
- 60. Agilent Technologies Measurement Modules for the 16900 Series. [Online]. Available: http://cp.literature.agilent.com/litweb/pdf/5989-0422EN.pdf
- 61. Agilent E4438C ESG Vector Signal Generator : Data Sheet. [Online]. Available: http://cp.literature.agilent.com/litweb/pdf/5988-4039EN.pdf

- 62. Cisco Aironet 802.11a/b/g Wireless CardBus Adapter. [Online]. Available: http://www.cisco.com/en/US/products/hw/wireless/ps4555/products\_data\_sheet09186a00801ebc29.html
- 63. Unlicensed devices general technical requirements detailed material. [Online]. Available: http://www.fcc.gov/oet/ea/presentations/files/oct05/Unlicensed\ \_Devices\\_JD.pd