## UC Irvine Electronic Theses and Dissertations

#### Title

Multi-Gigahertz Synchronous Sampling and Triggering (SST) Circuit with Picosecond Timing Resolution

Permalink https://escholarship.org/uc/item/1m29740h

Author Chiem, Edwin Yuel-wai

**Publication Date** 2017

Peer reviewed|Thesis/dissertation

# UNIVERSITY OF CALIFORNIA, IRVINE

Multi-Gigahertz Synchronous Sampling and Triggering (SST) Circuit with Picosecond Timing Resolution

#### DISSERTATION

submitted in partial satisfaction of the requirements for the degree of

#### DOCTOR OF PHILOSOPHY

in Electrical and Computer Engineering

by

Edwin Y. Chiem

Dissertation Committee:

Professor Stuart Kleinfelder, Chair

Professor Steve Barwick

Professor Michael Green

© 2017 Edwin Y. Chiem

## TABLE OF CONTENTS

- List of Figures
- List of Tables
- Acknowledgments
- Curriculum Vitae
- Abstract of the Dissertation
- 1 Introduction
  - 1.1 Background
  - 1.2 Data Acquisition Solution
  - 1.3 Second Generation ARIANNA Stations
  - 1.4 Fabrication Technology
- 2 SST Architecture and Overview
  - 2.1 Synchronous Sampling and Triggering (SST) Chip Architecture
  - 2.2 Sampling Clock Generation Comparison
  - 2.3 SST Data Acquisition and Readout Control Signals
  - 2.4 SST Triggering Settings and Controls
  - 2.5 Time-Interleaved Sampling
- 3 Analog and Mixed Signal Circuit Description and Design Analysis
  - 3.1 LVDS Receiver
  - 3.2 Sample and Hold Cell
  - 3.3 Analog Readout Circuitry (Multiplex…)
  - 3.4 High Speed Comparator
  - 3.5 Designing for High Acquisition Bandwidth
- 4 Timing Analysis
  - 4.1 Timing Resolution
  - 4.2 Intra Channel Timing Test
  - 4.3 Inter Channel Timing Test
  - 4.4 Sample Interval Errors
  - 4.5 Zero Crossing Method for Timing FPN Characterization
  - 4.6 Simulated Annealing Method for Timing FPN Characterization
  - 4.7 SST Timing Calibration
- 5 SST Test Results
  - 5.1 System Verification
  - 5.2 Voltage Fixed Pattern Noise
  - 5.3 Bandwidth Measurement
  - 5.4 DC Performance Measurements
  - 5.5 Sampled Noise Measurements
  - 5.6 Dynamic Signal Performance
  - 5.7 Event Triggering Testing
  - 5.8 Measured Fixed Pattern Sample Interval Error
  - 5.9 SST Timing Resolution Measurements
  - 5.10 Scaling the SST Design
- 6 Conclusions and Summary
- Bibliography

## LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| 1.1 | ARIANNA neutrino detector illustration |
| 1.2 | Neutrino template signal |
| 1.3 | 2011 Data acquisition core prototype using ATWD |
| 1.4 | Second generation data acquisition system based on the SST chip |
| 2.2 | PLL block diagram |
| 2.3–2.6, 2.8 | … |
| 3.1 | SST LVDS receiver schematic |
| 3.2–3.5, 3.7–3.11, 3.13–3.17 | … |
| 3.18 | … phase |
| 3.19 | … transistor width of the NMOS pass gate |
| 4.1 | Inter channel timing test illustration |
| 4.2, 4.3 | … |
| 4.4 | An example of the bipolar pulse inputs used in the inter channel timing test |
| 4.5 | Plot of the cross correlation function versus lag |
| 4.6 | Second order fit of the cross correlation function maximum |
| 4.7 | Illustration of timing uncertainty that causes sampling interval errors |
| 4.8 | Histograms of calibrated and uncalibrated period measurements of 3 ns period sine waves |
| 4.9 | Normalized histograms of calibrated and uncalibrated delays for 60 ns cable delay inter channel timing measurements |
| 4.10 | FFT of a recovered 100 MHz sine wave |
| 5.1 | SST readout of a recorded 100 MHz sinewave |
| 5.2 | SST readout of captured neutrino template signal |
| 5.3 | Voltage fixed pattern noise measurement (with DC offset) |
| 5.4 | Voltage fixed pattern noise distribution |
| 5.5 | Normalized output amplitude versus frequency plot |
| 5.6 | DC voltage transfer characteristic of the SST |
| 5.7 | Linear error on DC output voltage |
| 5.8 | Distribution of the SST output noise |
| 5.9 | FFT of an SST readout of a 100 MHz sinewave |
| 5.10 | Comparator sensitivity test waveforms |
| 5.11 | The fixed pattern sample intervals plotted versus sample cell position |
| 5.12 | Histogram of the fixed pattern sampling intervals |
| 5.13 | Fast shift register sampling clock generation |
| 5.14 | Scatter plot of intra channel timing test results for various node frequencies and timing calibrations |
| 5.15 | Inter-channel timing test |
| 5.16 | Fixed pattern sample intervals plotted against cell position of the $0.18\,\mu m$ SST |
| 5.17 | Histogram of fixed pattern sample interval error of the $0.18\,\mu m$ SST |

## LIST OF TABLES

| Table | Caption |
|-------|---------|
| 5.1 | Table of intra channel timing test period error for various node frequencies (in units of ps RMS) |
| 5.2 | Inter channel timing test delay uncertainty for various delays (in units of ps RMS) |
| 6.1 | Summary of the SST Specifications |

## ACKNOWLEDGMENTS

I thank my advisor Dr. Stuart Kleinfelder for his expertise and guidance. His invaluable advice and honest feedback helped me learn and grow in the field of engineering. I thank Dr. Steve Barwick for the guidance I received, and for giving me the opportunity to participate in the ARIANNA collaboration.

I extend a special thanks to Dr. Michael Green for participating in my dissertation defense committee.

I thank my colleague Tarun Prakash for his collaboration in the SST chip design and testing. Recognition goes to him for his design of the clock generation circuitry and triggering logic.

I thank Anirban Samanta who completed the system board design. His work was essential to implementing the data acquisition system using the SST chip.

I extend my thanks to the physics group: Dr. Joulien Tatar, Dr. Corey Reed, Dr. Anna Nelles, Chris Persichilli, and James Walker.

I express profound gratitude for the support and encouragement from my dear friend Maria del Rosario Cortes Luna.

Finally, I thank my parents Chai Chiem and Millie Chiem and my brother Calvin Chiem. Their unwavering love and support helped me persevere through many challenges, and for that I am deeply grateful.

## CURRICULUM VITAE

### Edwin Y. Chiem

#### **EDUCATION**

| Degree | Institution | Location | Year |
|--------|-------------|----------|------|
| Doctor of Philosophy in Electrical and Computer Engineering | University of California, Irvine | Irvine, California | 2017 |
| Master of Science in Electrical Engineering | San Jose State University | San Jose, California | 2008 |
| Bachelor of Science in Electrical Engineering | University of California, Davis | Davis, California | 2006 |

#### **RESEARCH EXPERIENCE**

**Graduate Research Assistant**, University of California, Irvine, *Irvine, California*, **2012–2017**

#### TEACHING EXPERIENCE

**Teaching Assistant**, University of California, Irvine, *Irvine, California*, **2010–2012**

#### PUBLICATIONS

1. S.A. Kleinfelder, **E. Chiem**, T. Prakash. "The SST Fully-Synchronous Multi-GHz Analog Waveform Recorder with Nyquist-Rate Bandwidth and Flexible Trigger Capabilities," 2014 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), Seattle, WA, 2014, pp. 1-3.

2. S.A. Kleinfelder, **the ARIANNA Collaboration**. "Design of the Second-Generation ARIANNA Ultra-High-Energy Neutrino Detector Systems," 2015 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), San Diego, CA, 2015, pp. 1-4.

3. S.A. Kleinfelder, **E. Chiem**, T. Prakash. "The SST Multi-G-Sample/s Switched Capacitor Array Waveform Recorder with Flexible Trigger and Picosecond-Level Timing Accuracy," arXiv:1508.02460 [physics.ins-det], 2015.

## ABSTRACT OF THE DISSERTATION

Multi-Gigahertz Synchronous Sampling and Triggering (SST) Circuit with Picosecond Timing Resolution

By

Edwin Y. Chiem

Doctor of Philosophy in Electrical and Computer Engineering

University of California, Irvine, 2017

Professor Stuart Kleinfelder, Chair

The Antarctic Ross Ice shelf ANtenna Neutrino Array (ARIANNA) particle physics experiment aims to detect ultra-high energy neutrinos originating outside our solar system. A second generation detector prototype for the experiment has been developed and successfully deployed in Antarctica. The second generation detector is based on the Synchronous Sampling and Triggering (SST) integrated circuit. This dissertation focuses on the design and performance of the SST chip.

Fabricated in a  $0.25 \,\mu m$  CMOS process, the SST is a low power data acquisition circuit that monitors for potential neutrino signals and preserves candidate signals. The waveform capture is performed with a 256-cell time-interleaved sampling array. Continuous sampling operation is achieved through circular cycling across the array. The synchronous sampling clock generation allows for sampling rates that span six orders of magnitude (i.e. ranging between 2.0 kHz and 2.0 GHz). The analog bandwidth (-3 dB frequency) of the SST reaches 1.5 GHz, allowing for the capture of frequency components up to the Nyquist frequency. The SST integrates four channels of waveform capture functionality into a single chip.

Each SST channel includes event triggering to initiate the signal capture of neutrino events, and to reject random noise signals. Events are triggered based on outputs from a pair of high speed comparators that monitor for bipolar threshold crossings. Multiple triggering options are available on the SST, including direct output of the comparator signals and triggering on dual threshold crossings occurring within a programmable time window.

The SST chip utilizes an external low jitter LVDS oscillator to synchronously generate an internal sampling clock with low timing jitter. The fixed pattern timing noise was characterized through two different approaches: a stochastic zero crossing method and a Monte Carlo based simulated annealing method. After calibrating for fixed pattern timing noise, the SST achieves inter channel timing resolutions between 1.15 ps (RMS) and 2.36 ps (RMS).

## Chapter 1

## Introduction

The Synchronous Sampling and Triggering (SST) circuit is designed as the data acquisition solution for the Antarctic Ross Ice-shelf ANtenna Neutrino Array (ARIANNA) neutrino detection experiment. This chapter presents background information on ARIANNA to provide context for the development of the SST system. A brief description of the neutrino is given. The physical principles used for neutrino detection are explained. The goals and scope of ARIANNA are presented. A rationale for the development of the SST chip and a description of how it is used in ARIANNA are discussed. The chapter concludes with information about the SST's fabrication technology.

### 1.1 Background

Information about the ARIANNA experiment is presented to provide context for the use of the SST. Neutrino astronomy is an active area of research with the potential to become a powerful tool for exploring the universe. Neutrino detectors can function as neutrino telescopes, providing a new way to search for energetic events in the universe. The detection of ultra-high energy (UHE) neutrinos can enable discoveries about the cosmos that cannot be made with conventional optical telescopes alone [1],[2].

Neutrinos are subatomic particles whose unique properties make their detection challenging. These low mass particles are electrically neutral and interact with matter only rarely, through the weak nuclear force. This property allows neutrinos to pass through the vast majority of matter, making them difficult to detect. Due to the rarity of neutrino interactions, large volumes of detection medium are needed, resulting in vast, kilometer-scale detectors [2],[3]. The same property that makes neutrinos difficult to detect also makes them excellent at transmitting information about the far reaches of the universe. The particle's weak interaction with matter gives physicists an unobscured view of the universe, and the neutrino's indifference to electromagnetic fields allows for accurate determination of its point of origin [4].

The ARIANNA experiment proposes to build a large scale neutrino detector array to study the flux of UHE neutrinos. The goal of the ARIANNA detector is to be highly sensitive to neutrinos in the highest energy range of  $10^{17}\,\mathrm{eV}$  to  $10^{20}\,\mathrm{eV}$  [2],[5]. The completed ARIANNA experiment will consist of nearly 1,300 independent stations installed over a 36 km × 36 km grid. Each station is built with a solar panel assembly that generates all the power for its operation. An illustration of ARIANNA is presented in figure 1.1.

The Ross Ice Shelf was chosen as the site of the experiment because of several crucial features. Minimal human activity provides a radio-quiet environment. The ice shelf acts as a vast, pre-existing medium for neutrino detection. The ice is transparent to the radio signals produced by neutrino interactions within the medium. The ice shelf's ice-water interface behaves as a mirror-like surface for the radio emissions, reflecting the radio signals towards the detector array on the surface of the ice.

Figure 1.1: ARIANNA neutrino detector illustration<sup>1</sup>

ARIANNA detects neutrinos by monitoring for distinct radio signals emitted in a process known as the Askaryan effect [6],[7]. The Askaryan effect is a phenomenon where radio waves are produced when a particle travels through a medium at a speed greater than light's phase velocity inside that medium. On occasion, a highly energetic neutrino collides with a nucleus in one of the water molecules of the ice. The collision creates a shower of secondary particles that travel faster than the phase velocity of light in the ice and emit a radio signal. The radio emission reflects off the water-ice interface of the Ross Ice Shelf and is sensed by antennas connected to an ARIANNA station at the surface of the ice.

<sup>&</sup>lt;sup>1</sup>Image created by Scott Brown, originally published in [8].

### 1.2 Data Acquisition Solution

The radio emissions from a neutrino interaction are detected by a log-periodic dipole array (LPDA) antenna. The signals are passed through an anti-aliasing filter and sent to a high gain RF amplifier. The resulting signal is a continuous-time analog voltage. Each ARIANNA monitoring station contains a data acquisition system to process the neutrino events for transmission back to the University of California, Irvine (UCI) for analysis.

The ARIANNA data acquisition system has four channels to monitor the signals from four independent antennas. When a particle event is detected, a trigger signal is created, prompting each channel to hold the sampled voltages of its analog input. This preserves a record of the input waveform in a temporary "analog memory". The sampling is performed with a custom designed, switched capacitor array (SCA) based waveform recording integrated circuit. An analog to digital converter (ADC) then converts the saved samples into binary bit strings, which are written to a non-volatile memory. Once stored in memory, the signal data is available for digital transmission through long distance wireless link and satellite communications.
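The capture sequence described above (sampling into an array, a trigger that freezes the array, then digitization of the held samples) can be sketched as a toy behavioral model. This is only an illustration under stated assumptions, not the SST circuit: the names `CircularSampleArray`, `hold`, and `digitize`, the 12-bit resolution, and the 2.5 V full-scale range are hypothetical, and only the 256-cell depth and the circular cycling are taken from the abstract.

```python
class CircularSampleArray:
    """Toy model of a switched capacitor array that is overwritten circularly."""

    def __init__(self, depth=256):
        self.depth = depth
        self.cells = [0.0] * depth
        self.write_index = 0      # next cell to be overwritten
        self.samples_written = 0
        self.holding = False      # set by the trigger to freeze the record

    def sample(self, voltage):
        """Store one sample unless the array has been frozen by a trigger."""
        if self.holding:
            return
        self.cells[self.write_index] = voltage
        self.write_index = (self.write_index + 1) % self.depth
        self.samples_written += 1

    def hold(self):
        """Freeze the array so the stored waveform is preserved for readout."""
        self.holding = True

    def read_out(self):
        """Return the held samples ordered from oldest to newest."""
        if self.samples_written < self.depth:
            return self.cells[:self.samples_written]
        return self.cells[self.write_index:] + self.cells[:self.write_index]


def digitize(samples, full_scale=2.5, bits=12):
    """Ideal ADC: map voltages in [0, full_scale] to integer codes.

    The resolution and full-scale range are assumptions for illustration.
    """
    levels = (1 << bits) - 1
    codes = []
    for v in samples:
        clipped = min(max(v, 0.0), full_scale)
        codes.append(round(clipped / full_scale * levels))
    return codes
```

In this sketch the caller feeds voltages through `sample`, calls `hold` when a trigger fires, and then passes `read_out()` to `digitize` at whatever slow rate is convenient, which mirrors the separation between fast sampling and slow conversion described in the text.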

Capturing neutrino signals presents several challenges for the data acquisition system, which are outlined in the following discussion. The neutrino interactions produce short transient bursts that last on the order of a hundred nanoseconds. Figure 1.2 presents a waveform generated from a neutrino signal template that is representative of the signals typically observed from a neutrino interaction.

Figure 1.2: Neutrino template signal<sup>2</sup>

Conventional analog data acquisition systems operate differently from the ARIANNA data acquisition system. The conventional systems use a real-time ADC, which continuously samples and digitally converts the input analog signal at a rate that must meet or exceed the Nyquist sampling rate. The converted data is then written to a storage medium for later recovery. Limitations of the traditional data acquisition system make it inadequate for use in the ARIANNA experiment. Nyquist rate sampling for high energy particle detection requires operation rates ranging from hundreds of MHz to multi-GHz. Achieving high bit resolution at these high conversion rates can be challenging and is generally achieved at the expense of large power consumption. A multi-gigahertz ADC can require over a watt of power. Another drawback of the conventional system is that the constant digitization generates an excessive amount of data that requires an impractically large hard drive for storage. The majority of this stored data is extraneous noise that must be discarded to find the signals from the neutrino interactions.

<sup>2</sup>Image of the neutrino template from [3].

The architecture of the ARIANNA data acquisition system is based on a transient waveform recorder, which overcomes the previously mentioned disadvantages of a conventional detector system. A major advantage of using a transient waveform recording chip is that it greatly relaxes the requirements of the ADC while maintaining multiple giga-samples per second rates and low power consumption. Under typical conditions, the ARIANNA stations detect potential neutrino events at rates of around a few millihertz [3], leaving large time intervals between events. With the neutrino waveform sampled and saved in the SCA, there is ample time for a low rate ADC to perform the digital conversion.
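The contrast between continuous digitization and triggered capture can be made concrete with a short calculation. The sketch below is a back-of-the-envelope estimate only: the 2.0 GS/s rate, four channels, and 256 samples per record come from the text, while the 12-bit sample width and the 5 mHz event rate (standing in for "a few millihertz") are assumptions chosen for illustration.

```python
SAMPLE_RATE_HZ = 2.0e9    # nominal sampling rate (from the text)
CHANNELS = 4              # channels per station (from the text)
CELLS_PER_CHANNEL = 256   # samples per captured record (from the text)
BITS_PER_SAMPLE = 12      # assumption: ADC resolution
EVENT_RATE_HZ = 5.0e-3    # assumption: "a few millihertz" trigger rate


def continuous_bytes_per_second():
    """Data rate if every sample on every channel were digitized in real time."""
    return SAMPLE_RATE_HZ * CHANNELS * BITS_PER_SAMPLE / 8


def triggered_bytes_per_second():
    """Average data rate when only triggered records are digitized."""
    samples_per_event = CHANNELS * CELLS_PER_CHANNEL
    return EVENT_RATE_HZ * samples_per_event * BITS_PER_SAMPLE / 8


def minimum_adc_rate_hz():
    """Average conversion rate needed to keep up with the triggered events."""
    return EVENT_RATE_HZ * CHANNELS * CELLS_PER_CHANNEL
```

Under these assumptions, continuous digitization produces 12 GB/s, whereas triggered capture averages under 8 bytes per second and needs only a few conversions per second on average. The exact figures depend on the assumed resolution and event rate, but the gap of many orders of magnitude is what relaxes the ADC requirements.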

The triggering mechanism is essential to the functionality of the transient waveform recording method of data acquisition. The trigger continuously monitors for a potential neutrino event and initiates the waveform capture once one occurs. The user-programmed triggering criterion performs noise suppression, which effectively rejects the majority of random noise events. This improves the quality of the data taken by the ARIANNA stations.
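One of the trigger options named in the abstract is a dual threshold crossing within a programmable time window. A minimal software sketch of that criterion is given below. It is an illustrative model only: the function name `find_trigger`, the use of sampled values in place of continuous-time comparator outputs, and the window expressed in sample counts are all assumptions, not the on-chip logic.

```python
def find_trigger(samples, high_threshold, low_threshold, window):
    """Return the index at which a bipolar coincidence is first satisfied.

    A coincidence means the signal has exceeded high_threshold and fallen
    below low_threshold at two instants no more than `window` samples
    apart. Returns None if the criterion is never met.
    """
    last_high = None  # most recent index above the high threshold
    last_low = None   # most recent index below the low threshold
    for i, v in enumerate(samples):
        if v > high_threshold:
            last_high = i
        if v < low_threshold:
            last_low = i
        if last_high is not None and last_low is not None:
            if abs(last_high - last_low) <= window:
                return i
    return None
```

A bipolar pulse such as `[0, 0.3, 0, -0.3, 0]` with thresholds of ±0.2 satisfies the criterion when the window is 3 samples, while a purely positive excursion never does. This is the sense in which requiring both polarities within a short interval rejects much of the random noise.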

The frequency components of the neutrino signals span a broad band of frequencies and include components in the high frequency spectrum (in the upper hundreds of MHz). The presence of high frequency components corresponds to rapid fluctuations in the time domain, resulting in signal features with large voltage changes within short time intervals. The data acquisition system must sample the signal at a sufficiently high rate to faithfully capture the signal information, as dictated by the Nyquist theorem. The broadband frequency components of the input signals impose another requirement on the data acquisition system. In addition to a high sample rate, the system needs a broad and flat frequency response, which includes a high analog tracking bandwidth, to accurately sample the input.

The record length of the data acquisition system is finite, typically spanning on the order of hundreds of nanoseconds. Increasing the record length of SCA waveform recorders becomes impractical beyond a point since the increase in depth adversely affects other performance specifications including bandwidth, frequency dependent distortion, power consumption and layout area. Discussions on these aspects are presented later in chapter 3. There is a risk of truncating the signal and corrupting the information if the record length is insufficient. Therefore, the SST's design balances the record length against the various other parameters.

It is important that the data acquisition process maintains high signal integrity. As the waveforms are captured and digitally converted, various noise components are added in the process. In addition to noise, nonlinearities in the circuits distort the original signal. The presence of either noise or distortion reduces the effective resolution of the system and can hamper the fidelity of the recorded signal. These concerns were addressed during the design of the SST chip and discussions about them are presented in chapter 3.

Figure 1.3: 2011 Data acquisition core prototype using ATWD

### 1.3 Second Generation ARIANNA Stations

This section discusses the motivation behind developing the SST chip. The first generation ARIANNA stations were prototyped using a data acquisition system based on the Advanced Transient Waveform Digitizer (ATWD) chip, designed by a previous design group under Professor Kleinfelder [9]. The data acquisition system using the ATWD is shown in figure 1.3.

The ATWD prototype stations were installed in Antarctica in 2011 [10] and successfully collected data from the Ross Ice Shelf and transmitted it back to UCI. The functioning ATWD system deployment is a proof of concept. It is a concrete demonstration that the waveform recording system works in the real world conditions of the extreme Antarctic environment. With a sound understanding of the data acquisition principles, focus turned to improving and optimizing the data acquisition and triggering systems. Efforts culminated in the development of a second generation waveform recorder chip named the Synchronous Sampling and Triggering (SST) circuit, and a new SST based data acquisition system.

The SST chip offers several improvements over the ATWD; the major differences are described here. The ATWD records only a single input channel, so a four channel system requires four ATWD chips, each mounted on a separate daughter card. The SST integrates four channels of waveform capture onto a single integrated circuit, reducing the hardware overhead and shrinking the system size. The SST is a low power design, requiring 32 mW per channel compared to the ATWD's 780 mW per channel. The transient record length of the SST is double that of the ATWD, resulting in a 128 ns waveform capture window (assuming a sample rate of 2.0 giga-samples per second). The SST has an improved analog bandwidth, reaching 1.5 GHz compared to the ATWD's 800 MHz -3 dB frequency. The ATWD triggering mechanism relies on pattern matching logic to identify neutrino events, which requires an involved process of programming and calibration. The SST simplifies the triggering criteria by checking for voltage threshold crossings that occur within a programmable time interval of each other. These SST performance results are discussed in further detail in chapter 5.
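The record window and system power figures quoted in this comparison follow from simple arithmetic, sketched below as a check. The per-channel power values, the four-channel count, and the 2.0 GS/s rate are taken from the text, and the 256-cell depth from the abstract. The helper names are hypothetical.

```python
CHANNELS = 4
SST_CELLS = 256                     # sampling cells per channel
SST_POWER_MW_PER_CHANNEL = 32.0
ATWD_POWER_MW_PER_CHANNEL = 780.0


def record_window_ns(cells, sample_rate_hz):
    """Length of the captured waveform window in nanoseconds."""
    sample_interval_ns = 1e9 / sample_rate_hz
    return cells * sample_interval_ns


def system_power_mw(power_per_channel_mw, channels=CHANNELS):
    """Total waveform recorder power for a multi-channel station."""
    return power_per_channel_mw * channels
```

At 2.0 GS/s the sample interval is 500 ps, so 256 cells give the 128 ns window stated above. For a four channel station the recorder power is 128 mW with the SST versus 3120 mW with the ATWD, a reduction of roughly a factor of 24.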

The SST chip design was completed in collaboration with my colleague in the engineering research group, Tarun Prakash. The second generation ARIANNA prototype station was completed using a motherboard design by Anirban Samanta, another engineering colleague of mine. A photo of the SST data acquisition system is presented in figure 1.4. To date, the SST system's functionality has been verified and several SST ARIANNA stations have been successfully deployed in Antarctica and are currently collecting data.

Figure 1.4: Second generation data acquisition system based on the SST chip

### 1.4 Fabrication Technology

This section presents basic information about the fabrication technology used in the ARIANNA chip designs. Detailed specifications about the technology are confidential and cannot be shared. The following discusses publicly available information about the process properties and fabrication options.

Both the SST and the ATWD chips were fabricated with the same TSMC  $0.25 \,\mu m$  RF CMOS design kit provided through MOSIS [11]. The power supply voltage for this technology is 2.5 V. Five metal layers and one poly layer are available for circuit routing. This process allows for resistor implementations using either poly or NWELL layers. The option of silicide block is available for the poly resistors, giving the designer more choices for the resistor sheet resistance. The ARIANNA chip designs take advantage of the process's capability of implementing metal-insulator-metal (MIM) capacitors with good linearity. These MIM devices show low capacitance variation with temperature and with the voltage across the device.

The SST layout follows the scalable CMOS (SCMOS) [12] design rules, where all the layout geometries are expressed in abstract units of lambda. The SCMOS rules facilitate scaling a circuit layout to different technology nodes. The TSMC  $0.25 \,\mu m$  process has a lambda equal to  $0.12 \,\mu m$  and its features are arranged on a half grid allowing for a  $0.06 \,\mu m$  resolution.

Being a CMOS process, both NMOS and PMOS transistors are available. The minimum gate length of the PMOS and NMOS transistors is  $0.24 \,\mu m$  drawn length. The NMOS and PMOS have nominal threshold voltages (without body effect) of about 0.51 V and -0.52 V respectively. In this process, the electron mobility is approximately 4.5 times greater than the hole mobility, thus NMOS transistors generally yield better performance than PMOS transistors. Under typical bias conditions, a  $5 \,\mu m$  gate width NMOS transistor has a unity current gain frequency (denoted as  $f_T$ ) of around 16 GHz, while a  $5 \,\mu m$  wide PMOS transistor has an  $f_T$  of 11 GHz. The transistor  $f_T$  provides an upper frequency limit for the devices; typical circuit designs operate at much lower frequencies. Based on the  $f_T$ , NMOS transistors also have a better frequency response than PMOS transistors, which is why NMOS designs are preferred when possible.

# Chapter 2

# SST Architecture and Overview

This chapter provides a hierarchical description of the SST and discusses its various operations from a block level perspective. The mechanisms behind the clock generation, data acquisition, waveform readout, and event triggering are broadly explained. Details about the functionality of the SST's inputs and outputs are provided. An explanation of the SST time-interleaved sampling is presented.

## 2.1 Synchronous Sampling and Triggering (SST) Chip Architecture

The SST chip operates as a transient waveform recorder capturing a snapshot of an input signal in the form of 256 sample voltages. The sampling rate is dictated by the SST's LVDS reference clock input, so it has the potential to be varied. Nominally, the SST operates with a sample rate of 2.0 GHz, taking sample voltages of the input signal at 500 ps intervals. At the core of the waveform capture function is an array of 256 sample and hold cells that sequentially track and store samples of the analog input signal. At the 2.0 GHz sampling rate, the entirety of the 256 sampled voltages results in a 128 ns record length. To achieve continuous sampling over long periods of time, the SST samples the input in a circular fashion. The circular sampling is accomplished by returning to the beginning position of the sampling cell array, and overwriting the previously saved samples, once all 256 sample cells have stored voltages. This allows the SST to record samples indefinitely. The SST will continue to sample the input until it is commanded to save the record. The resulting waveform stored on the SST is a record of the last 128 ns of the transient input signal prior to the SST receiving the stop signal.
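The circular overwrite described above can be illustrated with a short behavioral sketch. It is only a software analogy under stated assumptions: the class name `CircularRecorder`, its methods, and the use of the sample index as a stand-in for the input voltage are hypothetical and do not describe the actual circuit.

```python
N_CELLS = 256            # sample and hold cells per channel (from the text)
SAMPLE_RATE_GSPS = 2.0   # nominal sampling rate (from the text)


class CircularRecorder:
    """Behavioral model only: one stored value per cell and one write pointer."""

    def __init__(self):
        self.cells = [None] * N_CELLS
        self.pointer = 0       # index of the next cell to be overwritten
        self.stopped = False

    def sample(self, value):
        """Store one sample and advance the pointer, wrapping after the last cell."""
        if self.stopped:
            return             # after a stop, the stored samples are preserved
        self.cells[self.pointer] = value
        self.pointer = (self.pointer + 1) % N_CELLS

    def stop(self):
        """Freeze the array, as the stop signal does for the sampling clock."""
        self.stopped = True


recorder = CircularRecorder()
for n in range(1000):          # the sample index stands in for the input voltage
    recorder.sample(n)
recorder.stop()
recorder.sample(-1)            # ignored: the record is already frozen

# Record length in nanoseconds: 256 samples at 2.0 samples per nanosecond.
record_length_ns = N_CELLS / SAMPLE_RATE_GSPS
print(min(recorder.cells), max(recorder.cells), record_length_ns)
```

After the stop, the array holds only the most recent 256 values written, which mirrors the statement that the stored waveform covers the last 128 ns before the stop signal.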

The following discussion presents an outline of how the components of the SST operate and interact to perform a transient waveform capture. Figure 2.1 is a block diagram of the SST chip that displays the SST's component blocks and their connections.

A sampling clock coordinates the analog voltage sampling, and the sampling clock is synchronously generated with a series of high speed, dynamic logic shift registers (fast shift registers). An off chip oscillator provides the SST with an LVDS reference clock. The LVDS oscillator's signal is sent to the LVDS receiver block in the SST, which converts the reference clock into a full scale 2.5 V to 0 V internal SST clock. Dual non-overlapping clock signals based on the two phases of the LVDS receiver's outputs are distributed to each of the fast shift registers through a network of clock drivers. The non-overlapping clock phases are necessary to ensure proper function of the dynamic logic in the sample clock generation circuitry. The sample clock is created by shifting a sampling pointer on the clock edges of each phase of the SST clock, thereby implementing a sample rate that is twice the reference clock frequency. The SST chip is designed to typically operate using an LVDS oscillator with a nominal frequency of 1.0 GHz, thus realizing a 2.0 GHz sampling rate.

To realize the continuous sampling, the sampling process must restart at the beginning of the sampling array and overwrite the previous samples once all 256 of the sampling cells have been exhausted. This is accomplished by feeding back the sampling pointer to the initial position. Once the sampling pointer returns to the first sampling cell, the signal sampling process cycles through the array again. This loop can repeat indefinitely until a stop signal halts the sampling clock and preserves the existing samples stored in the array.

Figure 2.1: SST Block Diagram<sup>1</sup>

The analog input being sampled is connected to the array of 256 switch capacitor sampling cells through an analog signal bus consisting of several metal layers connected together with vias to make a low resistance connection. Each sample and hold cell in the array is connected to a different register position in the fast shift register. A tap off of the fast shift registers generates a sampling cell's sampling clock signal, which determines whether the cell is tracking the input or holding the sampled voltage.

<sup>1</sup>Figure originally published in reference [13].

The triggering capability of the SST is used to reject background noise and to discern when a signal of interest has occurred. The analog input signal is connected to the triggering circuitry through the same analog bus as the sampling array. The SST trigger circuit monitors for instances when the analog input exceeds certain voltage thresholds, and sends a trigger signal if the user defined triggering criterion is met. In practice, the trigger signal is sent to the microcontroller on-board the system board to communicate that a signal of interest has occurred. The electronics on the system board then issue a stop signal to preserve the samples and coordinate the SST chip readout, digital conversion, and writing to memory.

After a stop signal has been issued to the SST chip, the waveform samples are saved and can be held for a period of time before being read out. Each of the sample and hold cells also contains circuitry that provides voltage buffering for the sampled voltage and implements analog multiplexing. During the SST readout operation, all the saved analog sample voltages are serially driven onto the SST's analog output pin.

## 2.2 Sampling Clock Generation Comparison

Professor Kleinfelder originally developed the concept of a switch capacitor array (SCA) based waveform recorder for particle detectors in 1988 [14]. Since then, SCA recorders have been widely used in particle detectors; the SST, along with other high rate waveform recorder designs, relies on waveform sampling through an SCA. Recently, different research groups designed waveform recorders with sampling rates in the range of 2-15 giga-samples per second (GSPS) [15],[16],[17]. A notable distinction between the SST and the prior multi-GSPS SCA waveform recorders is the method by which the sampling clock is generated. The SCA recorders preceding the SST rely on either a Phase Lock Loop (PLL) or a Delay Lock Loop (DLL) feedback circuit to generate a stable, high frequency sampling clock. A research team under Professor Varner from the University of Hawaii developed an SCA recorder named PSEC4, which utilizes a DLL for timing generation [17]. Another research team with Professor Ritt from the Paul Scherrer Institute, Switzerland developed the DRS4 SCA recorder, which uses a PLL for timing generation [16]. The timing resolution of an SCA recorder is determined in part by its sample clock generation topology. The various topologies are affected by different noise mechanisms and these noise sources degrade the timing resolution by causing errors in the timing of the sampled instances. The time between samples is referred to as the sampling interval. This topic of timing noise is explored in greater detail in chapter 4. The following discussion presents a summary of the sample clock generation methods used in the previously mentioned multi-GSPS SCA waveform recorders.

A PLL is a versatile circuit with many applications, including clock generation and signal synchronization. Consequently, PLL circuits are well suited for sample clock generation in SCA recorders. There are many variations of the PLL circuit; prior SCA recorders typically use a frequency synthesizing type of PLL, which generates a high frequency clock from a low frequency reference signal. Figure 2.2 presents a simplified block diagram that illustrates the concept behind the frequency synthesizing PLL.

A low frequency reference clock serves as the input to the PLL. Through a negative feedback mechanism, the PLL produces an output oscillation that is phase locked (driven to be in phase) with the reference clock. A frequency divider circuit that divides by a factor of M is placed in the feedback path. This causes the PLL to lock to a frequency that is M times that of the reference signal and achieves a frequency multiplication. A ring oscillator based voltage controlled oscillator (VCO) produces an oscillation with a frequency that is determined by the control voltage  $V_{ctrl}$ . A phase frequency detector (PFD) outputs a signal that is a function of the phase error between the VCO output and the reference clock. A low pass filter is placed between the PFD and the VCO to condition the voltage signal.

Figure 2.2: PLL block diagram
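The frequency multiplication can be written out as a short worked relation. This is a sketch of the lock condition only; the symbols $f_{ref}$ and $f_{VCO}$ and the numerical values in the example are introduced here for illustration and are not taken from any of the cited designs. When the loop is locked, the divided VCO output matches the reference in frequency, so

$$ \frac{f_{VCO}}{M} = f_{ref} \quad \Rightarrow \quad f_{VCO} = M \, f_{ref} $$

For instance, a hypothetical 50 MHz reference with a divider of $M = 40$ would lock the VCO at 2.0 GHz.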

A number of prior SCA waveform recorders utilize a PLL circuit and generate their sample clock by tapping the signals off of the VCO's delay stages [16]. The PLL topology is capable of synthesizing multi-gigahertz sampling clocks from a low noise reference signal with a frequency in the megahertz range. An issue with the PLL topology is that it is susceptible to high frequency jitter stemming from the VCO. The PLL's corrective feedback loop is capable of suppressing high frequency jitter components from the reference clock; however, the feedback loop does not respond to high frequency jitter from the VCO. The PLL jitter manifests as sampling jitter in the SCA recorder and adds noise to the output. The PLL is also susceptible to noise coupled to the  $V_{ctrl}$  node, which contributes to timing jitter. The PLL feedback drives the phase error between the VCO and the reference clock to zero (assuming a Type II PLL). While the end nodes of the VCO are synchronized with the reference clock, the intermediate VCO delay stages are not guaranteed to have uniform delays. Process variations and circuit mismatches cause the VCO stage delays to differ from one another, ultimately causing timing errors.



Figure 2.3: DLL block diagram

A DLL is another type of circuit with various timing applications and has also been used in multiple prior SCA recorders for generating a sample clock [18],[17]. A simplified DLL block diagram that illustrates its fundamental operation is presented in figure 2.3. Using a voltage controlled delay line (VCDL), the DLL generates stable timing delays. The VCDL consists of a cascade of numerous delay stages where each delay stage has a propagation delay that is dependent on the control voltage  $V_{ctrl}$ . A phase detector followed by a low pass filter produces a voltage signal based on the phase error between the VCDL output and the reference clock. The feedback action drives the phase error to zero (presuming that the loop contains a charge pump circuit to provide a 1/s pole) and the VCDL is locked in phase with the reference clock. This results in a well-defined delay across the VCDL where the arithmetic mean of its stage delays equals the reference clock period divided by the number of VCDL stages. A sampling clock where the sampling edges are separated by a defined sampling interval is generated by tapping the signal from the delay stages in the DLL's VCDL.
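The lock condition fixes only the total delay of the VCDL, not the delay of each stage. The sketch below illustrates this under simplifying assumptions: stage mismatch is modeled as a Gaussian spread in relative delay, and the loop's action on $V_{ctrl}$ is modeled as one common scale factor applied to every stage. The function name, the 64 stage count, and the 1000 ps reference period are hypothetical values chosen for illustration.

```python
import random


def locked_stage_delays_ps(n_stages, t_ref_ps, mismatch_sigma=0.05, seed=0):
    """Return per-stage delays of a locked delay line (illustrative model).

    Each stage gets a random relative delay around 1.0 to represent mismatch.
    The lock condition is then imposed by scaling all stages together so that
    the total delay equals one reference clock period.
    """
    rng = random.Random(seed)
    relative = [max(1e-3, rng.gauss(1.0, mismatch_sigma)) for _ in range(n_stages)]
    scale = t_ref_ps / sum(relative)   # stand-in for the loop settling V_ctrl
    return [r * scale for r in relative]


delays = locked_stage_delays_ps(n_stages=64, t_ref_ps=1000.0)
mean_delay = sum(delays) / len(delays)
spread = max(delays) - min(delays)

print(f"total delay  = {sum(delays):.3f} ps")
print(f"mean stage   = {mean_delay:.3f} ps")
print(f"stage spread = {spread:.3f} ps")
```

In this model the mean stage delay always equals the reference period divided by the number of stages, while the individual sampling intervals still differ from stage to stage.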

The DLL sample clock generation topology has some advantages over using a PLL. The VCDL is less susceptible to noise compared to the VCO. This leads to an improved jitter performance for the DLL topology in instances where high frequency noise coupling is an issue. It is easier to achieve closed loop stability in a DLL than a PLL, because the DLL's open loop gain does not include a 1/s term from a VCO circuit block. However, similar to the PLL, circuit mismatches in the DLL introduce delay variations among the stages and create timing errors. These random variations in the spacing between the SCA's sampling edges appear as sampling jitter (timing noise).

The sample clock generation circuitry in the SST is notably different from the PLL and DLL topologies. The SST's sampling clock is generated synchronously in the sense that digital logic components are coordinated by the clock edges of a clock signal distributed across the chip. The SST's synchronous sampling clock circuitry was designed and implemented by my colleague Tarun Prakash. A detailed examination of the synchronous clocking design is presented in his dissertation [19]. The following discussion summarizes the operation of the SST's synchronous clock generation. A simplified block diagram meant to highlight the essentials of the SST's sample clock generation is presented in figure 2.4.



Figure 2.4: SST synchronous sampling clock generation block diagram

Prior to sampling the input, a 'sampling pointer' is instantiated in the SST's fast shift registers. The position of a sampling pointer dictates if a sampling cell is in the tracking phase or if it holds the sampled voltage. Sampling across the array is performed by shifting the sampling pointer across the fast shift registers. The sampling pointer is shifted to the subsequent register on each of the SST's internal clock edges, resulting in a sampling rate equal to twice that of the reference clock frequency. In effect, the voltage sampling operation is synchronized by a single clock signal common to each shift register. A clock distribution tree is used to prevent clock skew among the shift registers. The sample clock signals are plotted and further discussed in the time-interleaved sampling section (Section 2.5).

This synchronous timing generation topology is robust and straightforward to design. The SST's timing generation does not rely on negative feedback, thus there are no stability concerns, unlike with the PLL or DLL circuits. Since the sampling edges are governed by a fixed frequency clock signal instead of a voltage controlled delay element, this synchronous timing generation topology is more resistant to jitter caused by coupled voltage noise and is less susceptible to timing errors from process variations.

The decision to operate on both clock edges was made in order to increase the SST's sampling rate for a given input reference clock. Higher sampling rates allow for greater timing resolution on the waveform record; this is discussed in section 4.3. However, this design choice has the drawback of being susceptible to duty cycle variations. Any deviation from a 50% duty cycle results in periodic timing jitter on the sampling instances. The duty cycle variation can come from the LVDS oscillator reference clock or it may be introduced if there are asymmetrical current sinking/sourcing capabilities in the LVDS receiver.
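The effect of a duty cycle error on the sampling intervals can be seen in a small numerical sketch. It assumes an idealized clock in which rising edges occur once per reference period and falling edges occur a fraction `duty_cycle` of the period later; the function names and the 52% example value are hypothetical.

```python
def sampling_instants_ps(n_samples, f_ref_hz=1.0e9, duty_cycle=0.5):
    """Sampling instants (ps) when the pointer shifts on both clock edges.

    Even-numbered samples fall on rising edges, odd-numbered samples on
    falling edges, so two samples are taken per reference clock period.
    """
    period_ps = 1e12 / f_ref_hz
    instants = []
    for k in range(n_samples):
        cycle, on_falling_edge = divmod(k, 2)
        offset_ps = duty_cycle * period_ps if on_falling_edge else 0.0
        instants.append(cycle * period_ps + offset_ps)
    return instants


def intervals_ps(instants):
    """Spacing between consecutive sampling instants."""
    return [later - earlier for earlier, later in zip(instants, instants[1:])]


ideal = intervals_ps(sampling_instants_ps(6))                    # 50% duty cycle
skewed = intervals_ps(sampling_instants_ps(6, duty_cycle=0.52))  # 52% duty cycle

print("ideal intervals (ps): ", ideal)
print("skewed intervals (ps):", skewed)
```

With a 50% duty cycle every interval is 500 ps. With a 52% duty cycle the intervals alternate between roughly 520 ps and 480 ps, which is the periodic timing error described above; the average interval remains 500 ps.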

The DLL and PLL timing generation topologies are capable of achieving greater sampling rates than the synchronous timing generation. With the PLL and DLL methods, the sampling intervals are determined by the voltage controlled delay stages, which can be made significantly faster than clock generation with a shift register. A prior work that was fabricated with a more advanced  $0.13 \,\mu m$  process uses a DLL based timing generation, and achieved sampling rates as high as 15.0 GSPS [17]. The SST achieved a maximum tested rate of 2.0 GSPS. However, a higher sampling rate corresponds to a shorter record length for a given SCA cell depth, so operating the SCA recorder at extremely high sampling rates would be impractical for many applications. The SST's 2.0 GHz sampling rate and 256 sample cell depth achieve sufficient accuracy and record duration to satisfy the requirements of the ARIANNA experiment.

The jitter performance of the synchronous timing generation is heavily dependent on the performance of the external oscillator. The SST's timing signals are derived from the LVDS reference clock, so any timing jitter from the external clock is transferred to the SST's sampling clock. Conversely, if the external oscillator has low phase noise, then the clean reference clock generates a low jitter sampling clock since the sampling clock is re-timed on the clock edges of a low jitter reference clock.

The SST's synchronous design takes advantage of the availability of high quality, low cost, low noise LVDS oscillators. The external oscillators typically used with the SST are 1.0 GHz LC72 series oscillators made by FOX Electronics, but the SST is also compatible with a variety of other LVDS oscillators. The FOX oscillator is a fixed frequency crystal oscillator with a frequency stability of  $\pm 20$  ppm and jitter under 20.0 ps RMS. Using this low noise oscillator in conjunction with the synchronous timing generation allows the SST to achieve high timing accuracy, measuring timing events with a resolution in the picosecond range.

## 2.3 SST Data Acquisition and Readout Control Signals

This section explains the purposes of the SST chip control signals and how they are related to the analog signal acquisition and readout. The SST chip relies on a simple three input control scheme to operate the waveform sampling capability. These signals are the reset, stop, and analog inputs. The timing diagram in figure 2.5 illustrates the input sequence that sets up and executes the data acquisition operation. The scale shown on the top of figure 2.5 is the time axis expressed in nanoseconds.



Figure 2.5: SST chip signal acquisition control timing diagram

The purpose of the reset signal is to initialize the sampling pointer in the fast shift registers. When the reset signal is set to logic high, the dynamic latches in the fast shift registers are configured to prime the sampling pointer to be at the initial position of the sample and hold array. The reset input is briefly pulsed high prior to the SST entering its waveform sampling phase as shown in figure 2.5. For the reset signal to function properly, it is required that the stop signal be held high and that the reset signal is pulsed high for at least the duration of one period of the internal SST clock. With a 1.0 GHz LVDS oscillator, the minimum reset pulse width is 1 ns. In practice, the reset pulse is set for several nanoseconds.

The stop signal dictates whether or not the SST is sampling the input signal. When the stop signal transitions from low to high, the clock signals to the fast shift registers are disabled, causing the sampling pointer to be fixed in the position it held when the rising stop edge occurred. This effectively halts the sampling array from overwriting any previously sampled values and saves the input waveform. Setting the stop signal to low enables the clock signals to the fast shift registers, allowing the sampling pointer to cycle through. This puts the SST into the sampling mode by commanding the sampling array to sequentially track the input signal and store the sampled voltages.

After a stop has been issued, a waveform has been captured as sample voltages across the sample and hold cell array. These sample voltages can be held for some time, but will eventually be corrupted because of non-zero leakage currents. The average measured rate of voltage change in a held sample voltage is 150 mV per second. For the SST's 80 fF MIM capacitors, this voltage change corresponds to a leakage current of 0.012 pA. Based on the values for the 2.5 V full scale, 12-bit ADC used on the SST system board, the SST's sampled waveform can be held for 4.1 milliseconds before the voltage deviations on the samples reach one least significant bit (LSB).
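The leakage and hold time figures can be checked with a few lines of arithmetic. The sketch uses the relation $I = C \, dV/dt$ and assumes a sampling capacitance of 80 fF, which is the value consistent with the quoted 0.012 pA leakage current at a droop of 150 mV per second; the variable names are illustrative.

```python
DROOP_V_PER_S = 0.150       # measured droop of a held sample (from the text)
C_SAMPLE_F = 80e-15         # assumed sampling capacitance: 80 fF
ADC_FULL_SCALE_V = 2.5      # ADC full scale on the system board (from the text)
ADC_BITS = 12               # ADC resolution (from the text)

# Leakage current implied by the droop rate: I = C * dV/dt
leakage_a = C_SAMPLE_F * DROOP_V_PER_S

# One least significant bit of the ADC, in volts
lsb_v = ADC_FULL_SCALE_V / 2**ADC_BITS

# Time for the droop to accumulate to one LSB
hold_time_s = lsb_v / DROOP_V_PER_S

print(f"leakage current = {leakage_a * 1e12:.3f} pA")
print(f"1 LSB           = {lsb_v * 1e3:.3f} mV")
print(f"hold time       = {hold_time_s * 1e3:.2f} ms")
```

Under these assumptions the leakage evaluates to about 0.012 pA and the hold time to about 4.07 ms, in line with the 4.1 ms figure above.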

The readout operation recovers the transient waveform captured during the input sampling phase. A timing diagram of the SST analog readout is presented in figure 2.6. The scale on the top of figure 2.6 represents the time axis in units of nanoseconds. During readout, output voltages that correspond to the saved sampled voltages are serially driven onto the analog output pin until the entire waveform record has been recovered. The analog voltage on the output is a scaled and offset version of the original sampled voltage because of the voltage buffering circuitry used to drive the signal to the output pin.



Figure 2.6: SST waveform readout timing diagram

The SST analog output voltages are read out at a rate determined by the frequency of the read clock. The nominal frequency of the read clock used on the SST board is 1.0 MHz. At this readout frequency,  $256 \,\mu s$  are required to recover the entire waveform record. The 1.0 MHz readout frequency is selected to be compatible with the ADCs used on the system board. The SST chip itself can support up to a 5.0 MHz analog readout rate when loaded with a 15 pF capacitive load on the analog output pin.
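The readout time follows from one sample being driven out per read clock period, which is assumed here. As a worked check, with $N = 256$ samples and a read clock frequency $f_{read}$ (a symbol introduced for illustration):

$$ t_{readout} = \frac{N}{f_{read}} = \frac{256}{1.0\ \mathrm{MHz}} = 256\ \mu s $$

At the chip's maximum supported readout rate of 5.0 MHz, the same expression gives $256 / 5.0\ \mathrm{MHz} = 51.2\ \mu s$.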

The SST chip has a stop-out output that allows for a direct readout of the sampling pointer's position. As the analog samples are being shifted out onto the analog output pin, the stop-out pin will be driven high if the sample being read out corresponds to the position of the sampling pointer and remains low otherwise. Due to the circular sampling, the beginning and end of the sampled waveforms are not fixed to a specific cell position. Being able to directly observe the internal pointer via the stop-out allows for the chronological order of the samples to be determined without any ambiguity. The analog readout voltage prior to the stop-out marks the latest sample in the waveform record, while the readout sample after the stop-out signifies the earliest sample in the sample record.
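The reordering implied by the stop-out flag can be sketched in software as follows. This is a hypothetical post-processing helper, not part of the SST system software: the function name and the data layout (one voltage and one stop-out flag per readout position) are assumptions. How the flagged cell itself should be treated is not specified in this section, so the sketch returns it separately rather than placing it in the record.

```python
def chronological_order(voltages, stop_out):
    """Reorder a readout into chronological order using the stop-out flag.

    voltages : sample voltages in readout (cell) order
    stop_out : flags in readout order, True where the stop-out pin was high

    Returns (ordered, flagged_value). `ordered` starts with the sample read
    out just after the flagged position (earliest) and ends with the sample
    read out just before it (latest).
    """
    if len(voltages) != len(stop_out):
        raise ValueError("voltages and stop_out must have the same length")
    flagged = [i for i, flag in enumerate(stop_out) if flag]
    if len(flagged) != 1:
        raise ValueError("expected exactly one stop-out position")
    p = flagged[0]
    ordered = voltages[p + 1:] + voltages[:p]
    return ordered, voltages[p]


# Toy example with 8 cells instead of 256; the pointer stopped at cell 3.
readout = [0.10, 0.11, 0.12, 0.99, 0.04, 0.05, 0.06, 0.07]
flags = [False, False, False, True, False, False, False, False]
ordered, flagged_value = chronological_order(readout, flags)
print(ordered)
```

In the toy example the reordered record begins with the value from cell 4 and ends with the value from cell 2, matching the earliest and latest positions described above.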

## 2.4 SST Triggering Settings and Controls

Each of the four channels of the SST has a set of trigger inputs and outputs; this includes a high threshold input, low threshold input, and two trigger outputs. The SST chip configures the triggering criterion and the trigger output mode with the following inputs: L1 bias, L2 bias, AND/OR select, and Diff/Single. A block diagram representation of the SST triggering circuitry is presented in figure 2.7.

The purpose of the triggering is to detect events of interest and single them out for waveform capture. For the purposes of the ARIANNA experiment, the triggering is only useful during the signal sampling phase and serves no purpose when the SST is holding the waveform or performing the readout operation. Therefore, the triggering circuitry is disabled whenever the SST stop input is high, so the triggering outputs are only valid during signal acquisition.

Figure 2.7: Block diagram of the SST triggering logic<sup>2</sup>

The high and low threshold inputs to the SST chip receive DC analog voltage inputs. These DC threshold inputs determine which voltages the comparators are monitoring for on the sampled signal. The comparators that are connected to the high threshold drive their output voltage to  $V_{dd}$  when the analog input voltage rises above the  $Vth_{high}$  voltage. Similarly, the comparators that are connected to the low threshold output  $V_{dd}$  when the analog input falls below the  $Vth_{low}$  voltage. These threshold values are provided to the SST chip by 12-bit digital to analog converters (DAC) that are on the SST system board. The DACs have a full scale voltage of 2.5 V and they are programmed through the MBED microcontroller script. By triggering on both a high and a low threshold, the SST searches for the bipolar nature of a neutrino signal; this greatly increases the ability to discern between a particle event and random noise.

Two output modes exist for the trigger output: a single ended output mode and a differential output mode. These modes are set with the Diff/Single input where a logic high puts the SST into the differential mode and a logic low sets the SST to the single mode.

<sup>2</sup>SST trigger logic block diagram was originally published in reference [20].

In the single mode, pulse stretched versions of the comparators' outputs are driven to the trigger output pins, bypassing the on chip trigger logic. While the SST will most likely be operating in its differential mode to make use of the triggering logic, the single mode operation can be useful for system testing or to test against an alternative triggering logic using the SST board electronics.

In the differential mode, the trigger outputs are complementary and the on chip triggering logic is applied to the comparator outputs. The AND trigger logic is applied by setting the AND/OR input pin high, while setting it low implements the OR logic. In the AND logic mode, a trigger is issued when both the high and low voltage thresholds are crossed within a specified time of each other. The AND trigger mode effectively monitors for bipolar pulses where a high and a low threshold crossing occur within a user definable coincidence window. The OR logic issues triggers when either the high or low threshold is crossed. The OR mode is less stringent since unipolar signals will also cause the trigger to signal an event.

The level 1 delay control (L1) is an analog input that establishes the time delay in the feedback path that stretches the comparator's output pulses. The relationship between the L1 voltage and the pulse stretching is such that a lower L1 voltage results in shorter pulse stretching while a higher L1 voltage results in longer stretched pulses. It is worth noting that the relationship between the L1 voltage and the stretched pulse width is exponential rather than linear. The range of the L1 pulse stretching extends from 3.5 ns to  $1.0 \,\mu s$ . The triggering logic was designed by my colleague Tarun Prakash and more details about the triggering logic are presented in his dissertation [19].

The effects of the L1 voltage differ slightly depending on the trigger logic setting and the output configuration setting. When the SST is put into the differential output setting and the trigger logic is set to the AND mode, the L1 voltage dictates the length of the coincidence window, where the amount of pulse stretching equals the coincidence window duration. In the single ended output setting, the stretched comparator outputs are driven directly to the trigger output. Therefore, in the single output mode the L1 voltage determines the time durations of the high and low output trigger pulses.
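The AND and OR trigger criteria can be summarized with a behavioral sketch that operates on lists of threshold crossing times. It is a simplified model under stated assumptions: comparator pulses are treated as instantaneous events, the L1 pulse stretching is represented only by a coincidence window in nanoseconds, and the function name and example crossing times are hypothetical.

```python
def trigger_times_ns(high_crossings_ns, low_crossings_ns, window_ns, mode="AND"):
    """Return the times at which the modeled trigger logic would fire.

    AND mode: fire when a high and a low threshold crossing occur within
              `window_ns` of each other (at the later of the two crossings).
    OR mode:  fire on every crossing of either threshold.
    """
    if mode == "OR":
        return sorted(high_crossings_ns + low_crossings_ns)
    if mode != "AND":
        raise ValueError("mode must be 'AND' or 'OR'")
    fired = set()
    for t_high in high_crossings_ns:
        for t_low in low_crossings_ns:
            if abs(t_high - t_low) <= window_ns:
                fired.add(max(t_high, t_low))
    return sorted(fired)


high = [10.0, 200.0]   # hypothetical high threshold crossing times (ns)
low = [14.0, 400.0]    # hypothetical low threshold crossing times (ns)

and_wide = trigger_times_ns(high, low, window_ns=5.0, mode="AND")
and_narrow = trigger_times_ns(high, low, window_ns=2.0, mode="AND")
or_any = trigger_times_ns(high, low, window_ns=5.0, mode="OR")
print(and_wide, and_narrow, or_any)
```

With a 5 ns window only the bipolar pair at 10 ns and 14 ns satisfies the AND criterion. Narrowing the window to 2 ns rejects it, while the OR mode fires on all four crossings.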

The level 2 control (L2) sets a second round of output pulse stretching, which is applied to the triggering logic outputs. This L2 voltage determines the duration of the output pulses that signify an event when the SST is set to the differential mode. The L2 delay has no bearing on the trigger output when the circuit is configured for single ended trigger outputs.

## 2.5 Time-Interleaved Sampling

The fundamental principle behind the SST's ability to achieve high sampling rates is its use of time-interleaved sampling. The analog input signal is connected in parallel to an array of N=256 identical sample and hold cells, which track and sample the input signal at staggered intervals to acquire the waveform record. The effective sample rate of the time-interleaved sampling is  $F_s = 2.0$  GHz (with a sampling period  $T_s = 500$  ps). However, each individual sample and hold cell samples the input at a much lower rate of  $F_s/N$ . The sampling control signals generated by the fast shift registers coordinate the sampling instances so that they are staggered by time intervals equal to  $T_s$ . An idealized illustration of the time-interleaved sampling and the sampling clock signals is shown in figure 2.8; the circuit delays are not pictured in the figure. The scale on the top of figure 2.8 is the time axis in nanoseconds and the signals  $S_0$  through  $S_4$  signify the sampling clock signals to the sample and hold cells  $SH_{cell_0}$  through  $SH_{cell_4}$ . Each sample and hold cell tracks the analog input signal when a logic high is applied to its corresponding sample clock signal, and holds its sampled voltage when a logic low is applied. The falling edges of the sampling clock signals are indicated by the dashed lines in figure 2.8 and they signify the instances where the cells store their sampled voltages. As illustrated in figure 2.8, each sampling instance is separated from the prior sample instance by a time interval of  $T_s$ .

As shown in figure 2.8, each sampling cell tracks the input for 1.0 ns (equal to two times  $T_s$ ), leading to an overlap in the tracking phases of two adjacent cells. Each cell tracks the input for longer than  $T_s = 500$  ps to allow enough time to completely discharge the previously sampled voltage saved on the cell. If the sample cell's tracking phase is too short, components of previously sampled voltages remain on the capacitor, leading to the presence of ghost pulses. The 1.0 ns track time is sufficient to completely overwrite any existing voltages in the sampling cells.
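The idealized timing of the interleaved sampling clocks can be expressed numerically. The sketch below ignores circuit delays, as figure 2.8 does, and places the hold instant of cell 0 on the first lap at time zero, which is an arbitrary choice of origin. The function names are hypothetical.

```python
N_CELLS = 256                      # sample and hold cells (from the text)
FS_HZ = 2.0e9                      # effective sample rate (from the text)
TS_PS = 1e12 / FS_HZ               # sampling period: 500 ps
TRACK_PS = 2 * TS_PS               # each cell tracks for two sampling periods


def hold_instant_ps(cell, lap=0):
    """Falling edge of the cell's sampling clock, where the sample is stored."""
    return (lap * N_CELLS + cell) * TS_PS


def track_window_ps(cell, lap=0):
    """(start, end) of the interval during which the cell tracks the input."""
    end = hold_instant_ps(cell, lap)
    return (end - TRACK_PS, end)


per_cell_rate_hz = FS_HZ / N_CELLS   # rate at which one cell samples

start0, end0 = track_window_ps(0)
start1, end1 = track_window_ps(1)
overlap_ps = end0 - start1           # time during which cells 0 and 1 both track

print(f"per-cell rate  = {per_cell_rate_hz / 1e6:.4f} MHz")
print(f"hold spacing   = {hold_instant_ps(1) - hold_instant_ps(0):.0f} ps")
print(f"track overlap  = {overlap_ps:.0f} ps")
print(f"revisit period = {hold_instant_ps(0, lap=1) - hold_instant_ps(0):.0f} ps")
```

Under these idealizations adjacent cells store their samples 500 ps apart, their tracking phases overlap by 500 ps, and each cell is revisited once every 128 ns.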



Figure 2.8: Internal SST sampling clock timing diagram

# Chapter 3

# Analog and Mixed Signal Circuit Description and Design Analysis

This chapter discusses the analog and mixed signal circuit blocks used in the SST. Each circuit block is analyzed on a transistor level and the component's functionality, design, and challenges are examined. The circuit blocks discussed in this chapter are the LVDS receiver, sample and hold cell, readout circuit, high speed comparators, and the sample and hold array.

## 3.1 LVDS Receiver

Low voltage differential signaling (LVDS) is an established standard for high speed input/output signaling in digital point to point data connections. The high speed and low power capabilities of LVDS make it a suitable solution for the SST's high speed clock generation. The LVDS protocol represents binary bits with a voltage difference between two terminals; each terminal is centered on the same common mode (CM) voltage. A digital '1' is signified with a positive voltage difference and a digital '0' is signified with a negative voltage difference. The LVDS voltage typically ranges between  $\pm 250$  mV and  $\pm 400$  mV [21].

LVDS rejects CM noise because it detects the voltage difference between two terminals, thus giving it good noise immunity. Noise sources that couple to both terminals include noise on the power and ground lines, which often becomes especially problematic in the presence of high speed digital signals. The low voltage swings of the LVDS signals result in lower power dissipation during high bit rate operations when compared to other logic families such as single ended CMOS logic. Another advantage of the low voltage swings is that they allow for very high speed operation. Typical LVDS signal operation exceeds 1 gigabit per second, and greater speeds are achievable [22].

The SST relies on an off chip LVDS oscillator to generate the clocking signals. Since LVDS is a widely used industry standard, there is a wide selection of compatible off the shelf components. This allows for great system flexibility where the SST's sample rate can be selected simply by changing the oscillator frequency. The SST has been verified to have an extremely wide range of sampling rates, functioning as low as 2 kilo-samples per second and as high as 2 giga-samples per second. With this LVDS clocking configuration, the change of a single component can alter the system's sample rate to suit a wide range of applications without making any other hardware modifications.

The SST relies on the LVDS receiver circuit to provide the on chip clock signals. The essential function of the LVDS receiver is to take the signal from the LVDS oscillator and convert it into complementary, rail to rail clock signals for the CMOS circuitry on the SST. To ensure proper functionality for the SST, the LVDS receiver was designed to satisfy several criteria. The LVDS oscillator's CM output voltage must fall within the receiver's CM input range. The LVDS receiver outputs must settle to within 10% of the supply rails. A sufficiently large voltage gain is required to amplify the LVDS signal to full scale CMOS logic levels with sharp transition edges. Finally, the LVDS receiver must meet a speed requirement and maintain proper operation at frequencies high enough to implement Nyquist frequency sampling rates. The circuit presented in figure 3.1 is the LVDS receiver circuit design used in the SST. The following section discusses the operation and design considerations of the LVDS receiver.



Figure 3.1: SST LVDS receiver schematic

The circuit in figure 3.1 acts as a comparator stage that performs a high gain amplification of the LVDS signals to convert them into full scale CMOS logic levels. The symmetry of the circuit's fully balanced design allows it to generate complementary phases of the CMOS logic output.

Examining the low frequency, large signal behavior explains how the LVDS receiver operates. The NMOS source coupled transistor pair (M1 and M2) converts the differential input voltage into a current difference between the drain currents of M1 and M2. The bias current of transistor M3, which is referred to as  $I_{tail}$ , is steered between M1 and M2 as a function of the differential input voltage. Once the differential input voltage becomes large enough, full current switching occurs and all of the  $I_{tail}$  current flows through the transistor with the greater gate voltage, while no current flows through the other transistor in the differential pair. Full current switching occurs once the differential input voltage exceeds  $|V_{id}|$  expressed in equation 3.1.

$$V_{id} = \sqrt{\frac{2I_{tail}}{\mu_n C_{ox}(W/L)}} \tag{3.1}$$

Where $\mu_n$ is the NMOS mobility, $C_{ox}$ is the oxide capacitance per unit gate area, and W and L are the width and length respectively of M1 and M2. The $I_{tail}$ current is chosen to be large enough to meet the speed requirement of the circuit. The aspect ratios of M1 and M2 are designed so that the LVDS input voltage is larger than $V_{id}$, causing full current switching between M1 and M2.
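As a rough illustration of equation 3.1, the following Python sketch evaluates the full switching voltage. The tail current, process transconductance, and aspect ratio are assumed values for illustration only; the actual SST device parameters are not given here.

```python
import math


def full_switching_voltage(i_tail, mu_n_cox, w_over_l):
    """Equation 3.1: differential input needed for full current switching."""
    return math.sqrt(2.0 * i_tail / (mu_n_cox * w_over_l))


# Assumed values, not taken from the SST design
I_TAIL = 2e-3      # tail current (A)
MU_N_COX = 200e-6  # process transconductance mu_n * C_ox (A/V^2)
W_OVER_L = 400.0   # aspect ratio of M1 and M2

v_id = full_switching_voltage(I_TAIL, MU_N_COX, W_OVER_L)
print(f"V_id = {v_id * 1e3:.0f} mV")
```

With these assumed numbers $V_{id}$ falls below the 250 mV lower end of the typical LVDS swing, which is the condition the sizing of M1 and M2 is meant to guarantee. Because of the square root, quadrupling the aspect ratio only halves $V_{id}$.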

The drain currents of M1 and M2 travel through the diode connected PMOS transistors M6 and M7 and are current mirrored to the output transistors M12, M13, M16 and M17. The NMOS transistors M12 and M16 implement the pull-down network for the output nodes $V_o+$ and $V_o-$ respectively, while the PMOS transistors M13 and M17 implement the pull-up network for the output nodes $V_o+$ and $V_o-$ respectively. The LVDS receiver behaves as a class AB amplifier. When the differential input voltage is positive and larger than $V_{id}$, the $V_o+$ output node is driven high: the pull-up network is fully conducting while the pull-down network is cut off. Conversely, when the $V_o+$ output node is driven low by a large negative differential input, the pull-down network fully conducts while the pull-up network is cut off. This push-pull action allows for rail to rail operation at the output nodes, reaching 0 V and $V_{dd}$. With zero differential input, both the pull-up and pull-down networks conduct and both the $V_o+$ and $V_o-$ output nodes are driven to an output common mode voltage. The output common mode voltage is determined by the device sizing and the $I_{tail}$ bias current, and the LVDS receiver is designed to have an output common mode voltage near $V_{dd}/2$. While the precise value of the common mode voltage of the LVDS receiver will vary due to process variation, the current mirrors establish a common mode voltage that keeps the output transistors M12, M13, M16 and M17 biased in the active region of operation.

The PMOS transistors M4 and M5 form a cross coupled pair which creates a large voltage amplification of a small voltage difference between nodes 1 and 2. The cross coupled pair forms a positive feedback loop that acts like a bistable latch. Consider the case where the gate voltages of M1 and M2 are initially the same, and then the gate voltage of M1 is raised above the gate voltage of M2. The drain current of M1 increases while the drain current of M2 decreases, causing the voltage at node 1 to decrease and the voltage at node 2 to increase. Node 1 is also the gate voltage of M5, so as the node 1 voltage decreases, the PMOS transistor M5 conducts more, causing the node 2 voltage to be pulled toward $V_{dd}$. Node 2 is in turn the gate voltage of M4, so the rising node 2 voltage reduces M4's overdrive voltage, and the node 1 voltage falls further. This regenerative loop realizes high voltage gains, which allows the LVDS receiver's output voltages to resolve to CMOS logic levels for small, millivolt level differential input voltages.

The SST's LVDS receiver was designed without hysteresis by sizing transistors M4, M5, M6 and M7 to have the same size. However, hysteresis can be introduced to the design by sizing the cross coupled pair (M4 and M5) to have a larger aspect ratio than transistors M6 and M7 [23]. Hysteresis can be useful for preventing high frequency voltage noise on the input from causing erroneous transitions in the output signal. However, the input to the LVDS receiver has such a low voltage noise power that hysteresis is not necessary. Another reason hysteresis was omitted was the concern that it would add timing noise to the SST, causing periodic variations in the clock period.

Performing a small signal analysis on the LVDS receiver gives insight into the factors limiting its speed. The circuit poles are estimated by calculating the equivalent capacitance and resistance at each node. This analysis shows that the frequency response is typically dominated by nodes 1 and 2 in figure 3.1. In the LVDS receiver design, the transistors M4, M5, M6 and M7 are sized equally so that their transconductances are the same. The diode connected transistors M6 and M7 each present a small signal resistance of $1/gm_{6,7}$. The cross coupled transistor pair presents a negative small signal resistance of $-1/gm_{4,5}$ looking into nodes 1 and 2. These two resistances appear in parallel and, because the transconductances are equal, cancel each other out. Therefore the equivalent resistance seen at nodes 1 and 2 is given by equation:

$$R_{node_{1,2}} = r_{o_{1,2}} \,||\, r_{o_{4,5}} \,||\, r_{o_{6,7}} \tag{3.2}$$

Where the $r_o$ resistances are the corresponding transistors' output resistances caused by channel length modulation. The equivalent capacitance seen at these nodes is given by equation:

$$C_{node_{1,2}} = C_{DB1,2} + C_{DB4,5} + C_{DB6,7} + C_{GS4,5} + C_{GS6,7} + C_{GS10,14} + C_{GS13,17} \tag{3.3}$$

Where the  $C_{DB}$  capacitances are the drain bulk capacitances and  $C_{GS}$  are the gate source capacitances for their respective transistor. The transistors connected to nodes 1 and 2 are sized to draw several milliamps of current, so their capacitances can get quite large. The estimated time constants associated with these nodes are given by equation:

$$\tau_{node_{1,2}} = R_{node_{1,2}}C_{node_{1,2}} \tag{3.4}$$

The relatively large capacitances and resistances appearing at nodes 1 and 2 make these nodes the dominant factor in limiting the high speed performance in most practical cases. The output nodes $V_o+$ and $V_o-$ also affect the overall speed of the LVDS receiver, but in general, nodes 1 and 2 dominate the speed performance. The LVDS receiver drives a tapered clock driver so that the load capacitance seen at each of the $V_o+$ and $V_o-$ nodes is equivalent to two or three times the capacitance of a minimum sized inverter. With the clock drivers buffering the LVDS receiver, the LVDS receiver's output nodes have lower equivalent capacitance than that seen on nodes 1 and 2.
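The node analysis above can be sketched numerically in Python. All component values below are assumed for illustration and are not taken from the SST design. The time constant is taken here as the product of the node resistance and the node capacitance, with the associated pole at $1/(2\pi RC)$.

```python
import math


def parallel(*resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Assumed small signal values, for illustration only
R_O_1 = 5e3      # output resistance of M1 or M2 (ohm)
R_O_4 = 6e3      # output resistance of M4 or M5 (ohm)
R_O_6 = 6e3      # output resistance of M6 or M7 (ohm)
C_NODE = 100e-15 # total node capacitance in the sense of equation 3.3 (F)

r_node = parallel(R_O_1, R_O_4, R_O_6)  # equation 3.2
tau = r_node * C_NODE                   # node time constant (s)
f_pole = 1.0 / (2.0 * math.pi * tau)    # associated pole frequency (Hz)

print(f"R_node = {r_node:.0f} ohm, tau = {tau * 1e12:.1f} ps, pole = {f_pole / 1e6:.0f} MHz")
```

The sketch makes the design trade-off visible: larger devices lower the output resistances but raise the node capacitance, so the RC product does not shrink automatically with increased bias current.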

In data communication systems, LVDS receivers are typically designed for a rail to rail input CM voltage range because the input signal's CM voltage may vary considerably before reaching the receiver. To achieve rail to rail operation, LVDS receivers include a preamplifier stage with a pair of NMOS and PMOS input stages [22]. However, the SST's LVDS receiver does not need a CM input range that spans the entire power supply. The SST's CM input voltage is determined by the external oscillator, and it is stable and well defined; the specific oscillators used on the SST system boards have a CM output voltage between 1.25 V and 1.3 V. Instead of the near 0 V to $V_{dd}$ CM input range, the LVDS receiver is designed to operate with a CM input range equal to the oscillator's CM output plus a margin of several hundred millivolts. Without the need for a full range input CM, the preamplifier can be omitted to realize speed improvements and power savings.



Figure 3.2: SST LVDS receiver transient waveform simulation



Figure 3.3: SST LVDS receiver eye diagram

The functionality of the LVDS receiver design was verified with the Cadence Virtuoso simulation tool. The simulations include a model approximating the behavior of the transmission line and IC packaging connecting the off chip 1.0 GHz LVDS oscillator to the on chip LVDS receiver. A pair of $50\Omega$ termination resistors is used to match the impedance. A transient plot of the LVDS receiver's input and output waveforms is shown in figure 3.2. The input signal in the plot is a 1.0 GHz LVDS input signal with 375 mVpp amplitude and a common mode input of 1.25 V. The simulated output confirms that the receiver achieves rail to rail CMOS logic level outputs at gigahertz speeds.

Figure 3.3 presents the LVDS receiver's eye diagram of the differential input and output waveforms. The input used to generate the eye diagram is a pseudo-random binary sequence (PRBS) with a rate of 2.0 gigabits per second and an amplitude of 375 mVpp. The rate of 2.0 gigabits per second is chosen because it results in an eye diagram period that corresponds to the period of the 1.0 GHz LVDS oscillator signal. The eye diagram shows a wide open eye pattern for the output signal with no significant signs of inter-symbol interference (ISI). This simulation indicates that the LVDS receiver does not significantly contribute to the timing jitter of the SST.

The LVDS receiver is fabricated in a TSMC $0.25 \,\mu m$ CMOS process and the layout is shown in figure 3.4. The differential pair input NMOS transistors utilize an interdigitated layout to mitigate the effects of process variation.



Figure 3.4: Layout of the LVDS receiver

## 3.2 Sample and Hold Cell

The SST's sampling capability relies on an array of sample and hold cells. Each sample and hold cell tracks the input signal, samples the voltage, and stores the voltage sample until readout. A basic switched capacitor track and hold circuit is used in the SST design. Figure 3.5a shows the block diagram of an ideal sample and hold cell. While the clock phase $\phi_1$ is high, the switch is closed and the input voltage appears across the sampling capacitor $C_{sh}$. When $\phi_1$ transitions to low, the switch opens and a voltage sample of the input signal at the moment of the $\phi_1$ falling edge is preserved across the sampling capacitor.



Figure 3.5: Sample and hold cells

Practical implementations of the sampling switch suffer from several non-idealities, including varying transmission resistance, parasitic capacitances, and charge injection. These non-idealities limit the performance of the sample and hold cell and are discussed in the following paragraphs. Three options for implementing the basic sampling switch are an NMOS switch, a PMOS switch, or a CMOS transmission gate. The SST uses the CMOS transmission gate implementation shown in figure 3.5b. The SST's sampling cells implement 80 fF sample capacitors using metal insulator metal (MIM) capacitors. Both the NMOS and the PMOS transistors in the SST's sampling cells have the minimum gate length of 0.24 $\mu m$ and a gate width of 2.52 $\mu m$.

There exist more elaborate sample and hold cell implementations than the basic circuits shown in figure 3.5, but the more elaborate implementations have practical drawbacks that make them less suitable for the SST. The following section explains the reasons behind choosing the basic, open loop sample and hold cell implementation. By including opamps in a closed loop, negative feedback configuration, several parameters of the sample and hold cell can be improved. One such example of a closed loop sample and hold circuit is shown in figure 3.6. A major advantage of this configuration is that it uses negative feedback to create a virtual ground node at the negative terminal of opamp A2. The virtual ground absorbs the charge injection from switch S1 as it closes, thus neutralizing the majority of the charge injection error and enhancing the circuit's sampling accuracy [24],[25].



Figure 3.6: Sample and hold circuit with enhanced accuracy with feedback <sup>1</sup>

However, the closed loop configuration in figure 3.6 has significant drawbacks; using it as the SST's sample and hold cell would dramatically increase the overall power consumption and compromise the tracking bandwidth. The opamp circuit blocks in figure 3.6 must have sufficient frequency compensation so that the closed loop system is stable and achieves an adequate phase margin. The phase margin affects the output settling time, and thus restricts the sampling rate and affects the sampling accuracy. Typically a phase margin of at least 45° is needed for reasonable flatness in the closed loop frequency response [26]. A smaller phase margin will result in large overshoots and underdamped oscillations in the output. For a phase margin of at least 45°, the open loop gain of the circuit must exhibit a single pole roll-off for all frequencies up to the unity gain frequency. Circuits that have two amplifiers in the signal path, such as the one in figure 3.6, generally have more than two poles present in their loop gain and require greater frequency compensation. When greater frequency compensation is applied, less of the potential circuit bandwidth is utilized, thus potentially limiting the overall operation speed.

<sup>1</sup>Enhanced sample and hold circuit from reference [25].

Having large compensation capacitors in the circuit requires a large biasing current in order to meet the slew rate requirement. The slew rate is a large signal parameter indicating the largest rate of change in the output voltage $\left(\frac{\partial V}{\partial t}\right)$. When the input signal exceeds the slew rate capability, the output will have errors due to nonlinear distortion. To prevent slew rate limitation, the following criterion must be satisfied:

$$\frac{I_b}{C_{cc}} \ge \omega_0 * A_{max} \tag{3.5}$$

Where $I_b$ is the maximum current available from the opamp's first stage, $C_{cc}$ is the compensation capacitor, $A_{max}$ is the peak amplitude of the waveform, and $\omega_0$ is the signal's operating frequency [24], [26]. From equation 3.5, for a given large signal amplitude and frequency, the large compensation capacitances needed for closed loop stability increase the current required to prevent slew rate limitation. Therefore, satisfying the slew rate condition comes at the cost of high power dissipation due to the large bias current.
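To give a feel for the magnitude implied by equation 3.5, the following Python sketch computes the smallest bias current that avoids slew rate limitation. The compensation capacitance, signal frequency, and amplitude are assumed values, and $\omega_0$ is assumed to be the angular frequency $2\pi f$ of a sinusoidal input.

```python
import math


def min_bias_current(c_cc, freq_hz, a_max):
    """Smallest I_b that satisfies equation 3.5 for a sine of peak amplitude a_max."""
    omega_0 = 2.0 * math.pi * freq_hz  # assumed: omega_0 is the angular frequency
    return omega_0 * a_max * c_cc


# Assumed values, for illustration only
C_CC = 500e-15  # compensation capacitor (F)
FREQ = 500e6    # signal frequency (Hz)
A_MAX = 0.8     # peak signal amplitude (V)

i_b = min_bias_current(C_CC, FREQ, A_MAX)
print(f"I_b >= {i_b * 1e3:.2f} mA per opamp")
```

Under these assumptions each opamp needs more than a milliamp in its first stage alone, and the requirement scales linearly with the compensation capacitance.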

An additional drawback to the closed loop configuration in figure 3.6 is that it substantially increases the circuit complexity compared to the open loop sample and hold circuit in figure 3.5b. The closed loop configuration in figure 3.6 requires several times as many transistors and additional capacitors, which requires much more die area per cell in the layout. The compensation capacitors needed in the feedback configurations can range from several hundred femtofarads to picofarads. These compensation capacitors have an equal or greater capacitance value than the sampling capacitor, and would drastically increase the layout area of each sample and hold cell. The open loop sample and hold cells have much greater layout area efficiency. Since they are inherently stable, they do not need large die area for compensation capacitors. The SST utilizes an array of 256 identical sample and hold cells, so any savings in the layout area is multiplied by the number of array elements. The simplicity of the basic sample and hold cell allows it to be laid out in a very compact fashion, resulting in an SST array with high integration density.

As previously discussed, closed loop sample and hold cells such as in figure 3.6 improve accuracy at the cost of operation speed and power consumption. The opamps used in the closed loop sample and hold cells substantially increase the power consumption since they are arrayed 256 times. As a result, even a modest increase in power consumption in a single sampling cell would result in significant increases in the SST's overall power consumption. An objective for the SST chip is a low power design; therefore, the open loop sample and hold cell is preferable for the SST implementation.

The open loop configuration is a simple and robust circuit that can operate with low power consumption at high speeds. The worst case power consumption of the open loop sample and hold cell (shown in figure 3.5b) is calculated as the power dissipated when each sampling cycle charges the sample capacitor to the maximum input voltage; the maximum power dissipation of the figure 3.5b circuit is given by equation:

$$P_{worstcase} = \frac{C_{samp} * F_{samp} * V_{inmax}^2}{2} \tag{3.6}$$

Using the SST specifications of a 1.6 V $V_{inmax}$, a $C_{samp}$ of 80 fF, and a sample rate of 2.0 GSPS, the worst case power consumption is 0.205 mW for the entire array. This is far lower than the several hundred milliwatts that would be required for an array of closed loop sample and hold cells. The open loop sampling cell configuration (shown in figure 3.5b) does not need frequency compensation, which translates to greater tracking speed. An additional advantage of the figure 3.5b circuit is that its lower component count results in a much more compact layout. While the open loop sample and hold cell does suffer from lower accuracy compared to the closed loop sample and hold cell, the open loop sampling cell has been shown to achieve 10 to 11 bits of accuracy. This level of accuracy is acceptable for the ARIANNA experiment. Considering these factors, the open loop sample and hold cell is the more suitable circuit for the SST.
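The worst case power figure can be checked with a short Python calculation of equation 3.6, using the specification values quoted in the text. The function name is a hypothetical choice for this sketch.

```python
def worst_case_power(c_samp, f_samp, v_in_max):
    """Equation 3.6: worst case power of the open loop sampling array (W)."""
    return c_samp * f_samp * v_in_max ** 2 / 2.0


C_SAMP = 80e-15  # sample capacitor (F), from the text
F_SAMP = 2.0e9   # sample rate (samples per second), from the text
V_IN_MAX = 1.6   # maximum input voltage (V), from the text

p_worst = worst_case_power(C_SAMP, F_SAMP, V_IN_MAX)
print(f"{p_worst * 1e3:.3f} mW")  # prints "0.205 mW"
```

The result applies to the array as a whole because only one sample capacitor is charged per sample period, so the sample rate already accounts for all of the cells.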

During the tracking operation of the circuit in figure 3.5b, the drain source voltages across the pass transistors are much smaller than $V_{DS,sat}$, making the triode resistance equations a reliable first order approximation. The $R_{on}$ resistances for the NMOS and PMOS transistors are approximated by the triode resistance equation 3.7 and equation 3.8 respectively.

$$R_{onN} \simeq \frac{1}{\mu_n C_{ox}\left(\frac{W_n}{L}\right)(V_{dd} - V_{in} - V_{tn})} \tag{3.7}$$

$$R_{onP} \simeq \frac{1}{\mu_p C_{ox}\left(\frac{W_p}{L}\right)(V_{in} - |V_{tp}|)} \tag{3.8}$$

The CMOS transmission gate's $R_{on}$ resistance equals the parallel combination of the NMOS and PMOS switch resistances:

$$R_{onCMOS} = R_{onP} || R_{onN} \tag{3.9}$$

The CMOS transmission gate was chosen because it allows sampling of input signals ranging from 0 V through $V_{dd}$. The NMOS switch approaches cutoff as the voltage across the capacitor increases, and when $V_{Csh} = V_{dd} - V_{thn}$, the gate source voltage equals $V_{thn}$ and the overdrive voltage equals 0 V, so the NMOS turns off. Therefore an NMOS sample and hold cell cannot track voltages above $V_{dd} - V_{thn}$. Similarly a PMOS sample and hold cell cannot track voltages below $|V_{thp}|$. Choosing a CMOS transmission gate overcomes the voltage restrictions of either an NMOS switch or PMOS switch alone. However, the CMOS transmission gate significantly increases the parasitic capacitance contribution of each cell. Each sample and hold cell contributes capacitance to the analog bus line, which lowers the SST's bandwidth. This issue is discussed in further detail in the section on designing the sample and hold array for high bandwidth (section 3.5).

As shown in equations 3.7, 3.8 and 3.9, the $R_{on}$ resistances are functions of the input voltage. For large signal operation where the input signal varies by more than several hundred millivolts, the $R_{on}$ resistance of the pass gates will also show large variations. A simulation of the operating point $R_{on}$ resistance versus the DC input voltage for the NMOS, PMOS and CMOS pass gates is shown in figure 3.7. The resistance plots illustrate why the CMOS switch is necessary for sampling voltages across the entire supply voltage range. With the NMOS switch, the $R_{on}$ resistance rapidly increases above 3 k$\Omega$ as the voltage exceeds 1.0 V. The PMOS switch $R_{on}$ resistance dramatically increases above 3.5 k$\Omega$ as the voltage drops below 1.1 V. When the resistance significantly increases, so does the circuit's time constant, and the circuit loses the ability to track high frequency signals. The CMOS switch's $R_{on}$ resistance does not exceed 2.8 k$\Omega$ for input voltages across the entire voltage range; this keeps the CMOS switch conducting even for input voltages near 0 V or $V_{dd}$.
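The qualitative behavior of the three switch types can be reproduced with a first order Python sketch of equations 3.7 through 3.9. The supply voltage, threshold voltages, and process transconductances are assumed values, and the body effect is ignored, so the numbers are not expected to match the simulated curves in figure 3.7. Only the 2.52 $\mu m$ width and 0.24 $\mu m$ length are taken from the text.

```python
def r_on_nmos(v_in, vdd, k_n, v_tn):
    """Equation 3.7; k_n stands for mu_n * C_ox * (W_n / L)."""
    v_ov = vdd - v_in - v_tn
    return 1.0 / (k_n * v_ov) if v_ov > 0 else float("inf")


def r_on_pmos(v_in, k_p, v_tp_abs):
    """Equation 3.8; k_p stands for mu_p * C_ox * (W_p / L)."""
    v_ov = v_in - v_tp_abs
    return 1.0 / (k_p * v_ov) if v_ov > 0 else float("inf")


def r_on_cmos(v_in, vdd, k_n, v_tn, k_p, v_tp_abs):
    """Equation 3.9, evaluated through the sum of the two conductances."""
    g_total = (1.0 / r_on_nmos(v_in, vdd, k_n, v_tn)
               + 1.0 / r_on_pmos(v_in, k_p, v_tp_abs))
    return 1.0 / g_total if g_total > 0 else float("inf")


# Assumed values, not taken from the SST design or its process
VDD = 2.5                   # supply voltage (V)
V_TN = 0.5                  # NMOS threshold (V), body effect ignored
V_TP_ABS = 0.5              # PMOS threshold magnitude (V)
K_N = 200e-6 * 2.52 / 0.24  # assumed mu_n * C_ox times the W/L from the text
K_P = 70e-6 * 2.52 / 0.24   # assumed mu_p * C_ox times the W/L from the text

for v_in in (0.0, 0.5, 1.25, 2.0, 2.5):
    r_n = r_on_nmos(v_in, VDD, K_N, V_TN)
    r_p = r_on_pmos(v_in, K_P, V_TP_ABS)
    r_c = r_on_cmos(v_in, VDD, K_N, V_TN, K_P, V_TP_ABS)
    print(f"{v_in:.2f} V  NMOS {r_n:.0f}  PMOS {r_p:.0f}  CMOS {r_c:.0f} ohm")
```

In this simplified model each single transistor switch reaches infinite resistance at one end of the input range, while the parallel combination stays finite everywhere because at least one device always has positive overdrive.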



(c) CMOS pass transistor on resistance

Figure 3.7: Pass transistor on resistance plots versus input voltage

While the CMOS switch prevents the $R_{on}$ resistance from approaching a virtual open circuit, the resistance still varies as a function of the input voltage and this can be problematic. This variation in resistance causes nonlinear distortion in the sampled signal, which generally is not corrected for. This leads to harmonic spurs in the frequency domain and has a detrimental effect on the accuracy of the sampled signals. This is discussed further when examining the SST's ENOB and dynamic performance (in section 5.6).

A discussion of the bandwidth of a single sample and hold cell (figure 3.5b) is presented in the following paragraph. An analysis of how the overall array bandwidth is affected by the individual sample and hold cells is presented later in section 3.5. The upper limit on the overall SST -3 dB frequency is set by the sample and hold cell's bandwidth. Since the input signal path goes through the sample and hold cell, the SST's sampling array bandwidth cannot exceed the cell's -3 dB frequency. The bandwidth of the sample and hold cell is given by equation:

$$\omega_{3dB} = \frac{1}{\left(R_{onCMOS} * C_{sh}\right)} \tag{3.10}$$

Where  $R_{onCMOS}$  is the CMOS switch's on resistance from equation 3.9 and  $C_{sh}$  is the sample capacitance. A large bandwidth is required to prevent high speed features in the input signal from being filtered out. This is accomplished by minimizing the  $C_{sh}$  and  $R_{onCMOS}$  but doing so will compromise other aspects of the circuit. The following discussion demonstrates that the analog bandwidth of the sample and hold cell trades off with signal accuracy and noise performance.
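As a rough estimate of equation 3.10, the following Python sketch uses the 80 fF sample capacitor and the 2.8 k$\Omega$ upper bound on the simulated CMOS switch resistance quoted earlier. Parasitic capacitances and the loading of the analog bus are ignored, which is an assumption of this sketch, so the result is only an idealized single cell figure.

```python
import math


def f_3db_hz(r_on, c_sh):
    """Equation 3.10 converted from rad/s to Hz."""
    return 1.0 / (2.0 * math.pi * r_on * c_sh)


R_ON_MAX = 2.8e3  # upper bound on the CMOS switch resistance (ohm), from the text
C_SH = 80e-15     # sample capacitor (F), from the text

f_cell = f_3db_hz(R_ON_MAX, C_SH)
print(f"{f_cell / 1e6:.0f} MHz")
```

This evaluates to roughly 710 MHz at the highest switch resistance. Halving either the resistance or the capacitance doubles the bandwidth, which is the lever that trades off against accuracy and noise in the discussion that follows.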

Charge injection is a problem that introduces errors in sampling circuits. It occurs when the sampling switch opens and charge from the inversion channel under the gate is injected into the sampling capacitor, resulting in errors on the sampled voltage. As the FET transitions from conducting to cutoff, the charge that makes up the inverted channel gets injected into the source and the drain. Approximately half of the charge gets injected into the sample capacitor, resulting in a change in the sampled voltage [27]. The voltage error is a function of the fabrication process and the circuit parameters of the sampling cells. The voltage error is given by the equation:

$$\Delta V = \frac{WLC_{ox} * (V_{dd} - V_{in} - V_{th})}{2 * C_{samp}} \tag{3.11}$$

The threshold voltage in equation 3.11 is modulated by $V_{in}$ due to the body effect, causing the charge injection error to be a nonlinear function of the input. This ultimately causes nonlinear distortion in the sampled signal.

Based on equation 3.11, the design of the sample and hold cell can mitigate the effects of charge injection by minimizing the transistor width and length, while maximizing the sampling capacitor. However, these design choices also act to reduce the circuit's bandwidth. The pass transistor widths cannot be made arbitrarily small and the sampling capacitor cannot be made arbitrarily large without severely degrading the bandwidth of the sample and hold cell. The designer needs to carefully consider this trade-off to ensure that both the speed and accuracy specifications are met.
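The size of the charge injection error in equation 3.11 can be sketched in Python. The transistor dimensions and sample capacitor come from the text, while the oxide capacitance, supply voltage, and threshold voltage are assumed values, and the body effect is ignored.

```python
def charge_injection_error(w, l, c_ox, vdd, v_in, v_th, c_samp):
    """Equation 3.11: voltage error from half of the NMOS channel charge (V)."""
    q_channel = w * l * c_ox * (vdd - v_in - v_th)
    return q_channel / (2.0 * c_samp)


W = 2.52e-6      # gate width (m), from the text
L = 0.24e-6      # gate length (m), from the text
C_SAMP = 80e-15  # sample capacitor (F), from the text
C_OX = 6e-3      # assumed oxide capacitance (F/m^2), about 6 fF per square micrometer
VDD = 2.5        # assumed supply voltage (V)
V_TH = 0.5       # assumed threshold voltage (V), body effect ignored

dv = charge_injection_error(W, L, C_OX, VDD, 1.0, V_TH, C_SAMP)
print(f"{dv * 1e3:.1f} mV at a 1.0 V input")
```

With these assumptions the error is a few tens of millivolts for a single NMOS device and shrinks linearly as the input rises toward $V_{dd} - V_{th}$, which is why the error is signal dependent.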

Another advantage of using the complementary pass switch (shown in figure 3.5b) is that it can be designed to reduce the effects of charge injection compared to an NMOS only switch [28]. The charge injection error caused by the NMOS channel is due to an injection of electrons into the sample capacitor, causing the sampled voltage to have a negative voltage step. On the other hand, the PMOS channel injects holes and causes a positive voltage step. With a CMOS switch, the charge injections from the NMOS and the PMOS counteract each other, reducing the overall charge injection error. This technique does not completely cancel the charge injection error across the entire input voltage range because the NMOS and PMOS charge injections vary differently with the input voltage, but it can mitigate the overall charge injection error.

The sampled voltage taken by the sample and hold cell has a random noise component caused by the thermal noise in the conducting channel of the sampling transistor switches. The total noise power integrated across the entire frequency spectrum is given by equation:

$$Power_{n,out} = \frac{kT}{C_{samp}} \tag{3.12}$$

Where k is the Boltzmann constant ($k = 1.38 \times 10^{-23}$ J/K) and T is the temperature in Kelvin. If the sampled voltage noise were measured, it would have an RMS value of:

$$V_{n,RMS} = \sqrt{\frac{kT}{C_{samp}}} \tag{3.13}$$

For a constant temperature, the RMS noise voltage is only a function of the sampling capacitor, and it is independent of the sampling switch resistance. It may appear counter intuitive that the total RMS noise is not a function of the resistance, since the channel thermal noise spectral density is proportional to the resistance and is given by:

$$\frac{V_{n,therm}^2}{\Delta f} = 4kTR\tag{3.14}$$

This is because the sampling cell's low pass characteristic shapes the thermal noise spectrum at the output, and any changes in the power spectral density due to changes in resistance are counteracted by opposing changes in the circuit bandwidth.
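The cancellation can be checked by integrating the spectral density of equation 3.14 through the low pass response of the sampling cell. The following sketch assumes a single pole RC response and writes the switch resistance simply as $R$:

$$\overline{V_{n,out}^2} = \int_0^{\infty} \frac{4kTR}{1 + (2\pi f R C_{samp})^2}\, df = 4kTR \cdot \frac{1}{4RC_{samp}} = \frac{kT}{C_{samp}}$$

The noise density scales with $R$ while the equivalent noise bandwidth $1/(4RC_{samp})$ scales with $1/R$, so the resistance drops out of the product.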

The dominant noise source in the SST is the kT/C noise. While other circuit blocks such as the analog readout/multiplexing circuit and the analog output driver contribute noise to the SST output, these noise sources are at least an order of magnitude lower than the kT/C noise component. From equation 3.13 it is clear that the sampling noise can be reduced by maximizing the size of the sampling capacitor. However, increasing the size of the sample capacitor increases the equivalent time constant of the sampling cell, degrading its bandwidth. The sample capacitor size must be carefully selected to achieve sufficiently low noise while meeting the bandwidth specification.
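Equation 3.13 can be evaluated for the 80 fF sample capacitor with a short Python sketch. The temperature of 300 K is an assumption for illustration.

```python
import math

K_BOLTZMANN = 1.38e-23  # Boltzmann constant (J/K)


def ktc_noise_rms(c_samp, temp_k=300.0):
    """Equation 3.13: RMS sampled noise voltage (V)."""
    return math.sqrt(K_BOLTZMANN * temp_k / c_samp)


v_n = ktc_noise_rms(80e-15)
print(f"{v_n * 1e6:.1f} uV RMS")
```

At the assumed temperature this gives a little under 230 $\mu V$ RMS. Because of the square root, the capacitor must be quadrupled to halve the noise, which makes the bandwidth penalty of a lower noise floor steep.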

## 3.3 Analog Readout Circuitry (Multiplexer)

Once the SST has stopped its sampling phase, each SST channel has a waveform record that has been preserved as a series of voltages stored across the sample and hold cell array. The sampled waveforms are recovered when the SST performs the readout operation. The SST waveform readout relies on the analog readout circuitry to act as a voltage buffer and to select a sample and hold cell from the array for readout. A schematic of the readout circuitry is shown in figure 3.8.



Figure 3.8: Analog voltage buffer and readout multiplexer circuit schematic

The design of the analog readout circuit is based in part on a one dimensional, three transistor (3T) CMOS active pixel sensor array (APS). The APS is widely used in modern digital cameras and has been proven to be low power and to allow for a high degree of integration [29],[30]. Adapting the APS concept for the SST results in an efficient, low power sample selection scheme.

The readout circuit connects to the positive plate of the sampling capacitor in each of the 256 sample and hold cells, and multiplexes the sampled voltages. Based on the $\overline{V_{readout}}$ signals, one of the sampled voltages is driven onto the $V_{out,sel}$ node. The $V_{out,sel}$ node acts as an analog bus line that connects the current source load (PMOS transistor $M_{cs}$) to each of the selection transistors ($M_{sel}$). The gate voltage of $M_{cs}$ is biased with an external current mirror to draw an $I_{bias}$ current of 100 $\mu A$. The $M_{sf}$ PMOS transistors implement source followers that buffer the analog voltage samples saved on the sample and hold cells.

To perform the multiplexing operation, the PMOS selection transistors  $(M_{sel_i})$  operate as pass switches. When the gate voltage is high, the PMOS selection transistor is in cutoff mode and the transistor has high impedance. This results in practically zero current flow through the PMOS selection transistor, thus isolating the corresponding sample cell from the output node. There is a small amount of leakage current flowing through the transistor when it is in cutoff, but it is on the order of picoamps and has a negligible effect in the SST's analog readout. When the gate voltage of the PMOS selection transistors is set to a low voltage, it conducts and allows the signal to pass to the output node.

The gate of each PMOS selection transistor is connected to a $\overline{V_{readout}}$ control signal for a total of 256 signals. These signals determine which sample voltage appears on the analog output node. Only one of the 256 $\overline{V_{readout}}$ signals is low at any given time; the remaining 255 control signals are kept high, thus selecting only one sample voltage for the output. These $\overline{V_{readout}}$ control signals are generated by a 'low speed' shift register that advances the read pointer at a rate determined by an externally supplied readout clock.

The circuit in figure 3.9 is an equivalent representation of the readout circuitry during the readout of a selected cell. This circuit functions as a modified PMOS source follower. The PMOS selection transistor $M_{sel_n}$ conducts in deep triode mode and can be approximated as a linear resistor. The on resistance is approximately given by equation:

$$R_{sel} \simeq \frac{1}{\mu_p C_{ox}\left(\frac{W_{p,sel}}{L}\right)(V_{biastee} + V_{sg_{sf}} - V_{tp})} \tag{3.15}$$

Where $W_{p,sel}$ is the width of the PMOS select transistor, $V_{biastee}$ is the DC bias voltage of the input signal, $V_{sg_{sf}}$ is the source-gate voltage of $M_{sf}$, and $V_{tp}$ is the PMOS threshold voltage.
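The one-hot, active low selection scheme described above, in which a read pointer advances through the array, can be modeled behaviorally in Python. This is a minimal sketch: the function names are hypothetical, and the model ignores the analog gain and offset of the source follower.

```python
def readout_select_lines(pointer, n_cells=256):
    """Active low one-hot select vector: only the addressed cell is driven to 0."""
    if not 0 <= pointer < n_cells:
        raise ValueError("read pointer outside the array")
    return [0 if i == pointer else 1 for i in range(n_cells)]


def read_out(samples):
    """Advance the read pointer once per readout clock and collect each selected sample."""
    waveform = []
    for pointer in range(len(samples)):
        lines = readout_select_lines(pointer, len(samples))
        # Exactly one select line is low, so exactly one cell drives the bus.
        selected = [s for s, line in zip(samples, lines) if line == 0]
        waveform.append(selected[0])
    return waveform


lines = readout_select_lines(3)
print(lines.count(0), lines.index(0))
print(read_out([0.10, 0.25, 0.40]))
```

The first call shows a single low line at the pointer position, and the second shows the stored samples emerging in order as the pointer advances.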



Figure 3.9: Circuit equivalent for analysis of the analog readout operation

Analyzing the circuit in figure 3.9 from a large signal perspective, the output voltage at $V_{out,sel}$ is a DC shifted version of the input. The $V_{out,sel}$ node follows the input signal with an offset of $V_{sg,sf} + V_{ds,sel}$, where $V_{sg,sf}$ is the source-gate voltage of $M_{sf}$ and $V_{ds,sel}$ is the voltage across the source and drain of $M_{sel}$. In practice, the output signal does not follow the input exactly due to channel length modulation and threshold voltage modulation through the body effect. Due to these effects, the source-gate voltage varies, so the output signal is not a one to one replica of the input signal. The small signal model is used to account for these effects and predict the circuit behavior under small signal conditions. The readout circuit's voltage gain, calculated from the small signal analysis, is:

$$Av_{readout} = \frac{gm_{sf}[(R_{sel} + r_{o,cs})||r_{o,sf}||1/(gmb_{sf})]}{1 + gm_{sf}[(R_{sel} + r_{o,cs})||r_{o,sf}||1/(gmb_{sf})]} * \frac{r_{o,cs}}{r_{o,cs} + R_{sel}} \tag{3.16}$$

Where $gm_{sf}$ is the transconductance of the transistor $M_{sf}$, $gmb_{sf}$ is the body effect transconductance of transistor $M_{sf}$, $r_{o,sf}$ is the channel length modulation resistance of transistor $M_{sf}$, and $r_{o,cs}$ is the channel length modulation resistance of transistor $M_{cs}$. The voltage gain of the readout circuit is very similar to that of a typical source follower circuit, but a significant difference is that the readout gain contains the voltage divider term $r_{o,cs}/(R_{sel} + r_{o,cs})$. The readout circuit's voltage gain will always be less than unity, so the recovered signal suffers from attenuation. Equation 3.16 shows that increasing the impedances $r_{o,cs}$ and $r_{o,sf}$ will help increase the voltage gain. If $(R_{sel} + r_{o,cs})||r_{o,sf}$ approaches infinite resistance, then the small signal gain approaches the maximum value of:

$$Av \simeq \frac{gm_{sf}}{gm_{sf} + gmb_{sf}} * \frac{r_{o,cs}}{r_{o,cs} + R_{sel}} \tag{3.17}$$
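The two gain expressions can be compared numerically. The following is a minimal sketch rather than the author's design flow; the function names and every component value are illustrative assumptions and are not taken from the SST design.

```python
def parallel(*resistances):
    """Equivalent resistance (ohms) of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def readout_gain(gm_sf, gmb_sf, r_o_sf, r_o_cs, r_sel):
    """Small-signal readout gain following equation 3.16."""
    r_load = parallel(r_sel + r_o_cs, r_o_sf, 1.0 / gmb_sf)
    follower = gm_sf * r_load / (1.0 + gm_sf * r_load)
    divider = r_o_cs / (r_o_cs + r_sel)
    return follower * divider


def readout_gain_limit(gm_sf, gmb_sf, r_o_cs, r_sel):
    """Limiting gain following equation 3.17."""
    return gm_sf / (gm_sf + gmb_sf) * r_o_cs / (r_o_cs + r_sel)


# Assumed, illustrative values only.
GM_SF = 1.0e-3    # transconductance of M_sf, siemens
GMB_SF = 0.2e-3   # body-effect transconductance of M_sf, siemens
R_O_SF = 100e3    # output resistance of M_sf, ohms
R_O_CS = 200e3    # output resistance of M_cs, ohms
R_SEL = 2e3       # on resistance of M_sel, ohms

gain = readout_gain(GM_SF, GMB_SF, R_O_SF, R_O_CS, R_SEL)
limit = readout_gain_limit(GM_SF, GMB_SF, R_O_CS, R_SEL)
print(f"equation 3.16 gain:  {gain:.3f} V/V")
print(f"equation 3.17 limit: {limit:.3f} V/V")
```

With these assumed values the gain stays below the limit of equation 3.17, and both stay below unity, as the text states.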

Equation 3.15 and equation 3.16 show that the nonzero on resistance of the selection PMOS ($R_{sel}$) contributes to lowering the voltage gain. The $M_{sel}$ PMOS transistor should be made wide to minimize the on resistance, thus reducing the signal attenuating effect of $R_{sel}$. A drawback to having a wide $M_{sel}$ PMOS is that it adds more capacitance to the signal path and slows down the readout rate. Incremental increases to the width of the selection PMOS transistors have a compounded effect since the analog bus is connected to all 256 $M_{sel}$ PMOS transistors and each PMOS contributes to the overall capacitance on the output node $V_{out,sel}$. Fortunately, the readout rate for the SST data acquisition system is a modest 1.0 mega-samples per second. In general, the $M_{sel}$ PMOS can be made several micrometers wide and still achieve the readout specifications without difficulty. Equations 3.16 and 3.17 also reveal that the voltage gain benefits from an increase in $r_{o,cs}$, the small signal output resistance of the current source load (PMOS $M_{cs}$). Consider the case where the resistance $r_{o,cs}$ is increased to effectively be an open circuit. Then the $\frac{r_{o,cs}}{R_{sel}+r_{o,cs}}$ term in equation 3.16 and equation 3.17 approaches unity, so both equations simplify to the voltage gain equation of a typical source follower circuit. The channel length modulation parameter $\lambda$ is inversely proportional to the gate length, and the transistor's output resistance is approximately given by:

$$r_o \approx \frac{1}{\lambda * I_{bias}} \tag{3.18}$$

Therefore, $r_{o,cs}$ can effectively be increased by lengthening the gate of the current source load PMOS. A gate length of 1 $\mu m$ is used, which is four times the minimum gate length. Increasing the gate length of the current source load transistor requires that the PMOS width also be increased by a commensurate amount to maintain the same aspect ratio and hence preserve the same bias current. A drawback of the increased gate length is that it also increases the capacitance at the output node, slowing down the readout speed. However, only a single PMOS current source load is connected to the analog output bus, so there is no multiplicative increase in the capacitance, unlike the case of the selection PMOS. The PMOS load transistor can be made fairly large before becoming problematic.

An issue with the readout circuit is that the selection PMOS causes distortion to the output signal. While modeling the conducting selection PMOS as a triode resistor is useful for design insights, the resistance varies as a function of the input. Over a wide range of input voltages, the selection PMOS transistor behaves as a nonlinear circuit element instead of a linear resistor. This causes the voltage across the selection PMOS to vary based on the input voltage despite a relatively constant $I_{bias}$ current flowing through it. Referring back to the circuit in figure 3.9, recall that the large signal analysis of the readout circuit gave the output voltage relation as:

$$V_{out,sel} = V_{in} + V_{sg,sf} + V_{ds,sel} \tag{3.19}$$

The drain source voltage of the selection PMOS using the deep triode approximation is given by equation:

$$V_{ds,sel} \simeq \frac{I_{bias}}{k'(\frac{W}{L})(V_{in} + V_{sg,sf} - |V_{tp}|)} \tag{3.20}$$

If the body effect and channel length modulation effects are ignored,  $V_{tp}$  and  $V_{sg,sf}$  can remain constant, but  $V_{ds,sel}$  still varies with the input voltage. Since the  $V_{out,sel}$  signal equals  $V_{in}$  plus a non-constant offset, this is equivalent to a nonlinear distortion in the output. This distortion effect can be mitigated by reducing the range of variation of the  $V_{ds,sel}$ voltage. For a given input voltage range, increasing the aspect ratio of the selection PMOS transistor reduces the amount of variation in the  $V_{ds,sel}$  voltage and therefore reduces the overall nonlinear distortion caused by the  $M_{sel}$  transistor's nonlinear resistance.
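The effect of the aspect ratio on this input-dependent offset can be illustrated with equation 3.20. The sketch below is hedged: the bias current, process transconductance, source-gate voltage and threshold voltage are assumed round numbers chosen only for illustration, and only the 0 V to 1.6 V input range comes from the text.

```python
def v_ds_sel(v_in, aspect_ratio, i_bias=10e-6, k_prime=30e-6,
             v_sg_sf=0.9, v_tp_abs=0.6):
    """Deep-triode voltage drop across M_sel following equation 3.20 (volts).

    All default parameter values are assumptions for illustration.
    """
    return i_bias / (k_prime * aspect_ratio * (v_in + v_sg_sf - v_tp_abs))


def offset_spread(aspect_ratio, v_in_lo=0.0, v_in_hi=1.6):
    """Variation of V_ds,sel over the input range for a given W/L."""
    return v_ds_sel(v_in_lo, aspect_ratio) - v_ds_sel(v_in_hi, aspect_ratio)


for wl in (10, 20, 40):
    print(f"W/L = {wl:2d}: V_ds,sel varies by {offset_spread(wl) * 1e3:.1f} mV")
```

In this model the variation of $V_{ds,sel}$ scales inversely with the aspect ratio, which is the mitigation described above.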

The analog multiplexing circuitry uses PMOS transistors so that its input voltage range is compatible with the ARIANNA signal voltage range. The ARIANNA input signals range from 0 V to 1.6 V. If NMOS transistors were used instead of PMOS, the analog multiplexing circuit would have an absolute minimum input voltage equal to the NMOS threshold voltage $V_{thn}$. Any input voltage below $V_{thn}$ would cut off the NMOS transistor, and the circuit would not operate. In practice, an NMOS analog multiplexer will have a minimum input voltage significantly higher than $V_{thn}$ because the input voltage also needs to be high enough to keep the current source in the active region of operation.

In order to satisfy the 0 V to 1.6 V input voltage range requirement, the SST analog multiplexer uses the PMOS transistor implementation seen in figure 3.8. With this implementation, the SST will function properly for inputs near the lower rail of 0 V since all transistors can still be maintained in the saturation region of operation. The upper limit for the input voltage for the PMOS multiplexer is calculated using the deep triode approximation for the voltage drop across the conducting selection transistor, and is given by:

$$V_{in,max} = V_{bias} + |V_{thp}| - V_{sd,sel} - V_{sg,sf} \tag{3.21}$$

Where  $V_{bias}$  is the gate bias voltage of current source PMOS  $M_{cs}$ , and  $V_{sg,sf}$  is the source gate voltage of the  $M_{sf}$  PMOS.  $V_{sg,sf}$  is given by the equation:

$$V_{sg,sf} = \sqrt{\frac{2I_{bias}}{K'\left(\frac{W_{sf}}{L_{sf}}\right)}} + |V_{thp}| \tag{3.22}$$

The SST will be able to properly read out recorded input voltages that range between 0 V and $V_{in,max}$. If the SST input signal exceeds $V_{in,max}$, the $M_{cs}$ transistor will drop out of saturation and the readout circuit will experience a dramatic drop in voltage gain. This translates to nonlinear distortion in the form of signal compression in the SST analog readout.

If the width of  $M_{sel}$  is sized sufficiently wide, and the input voltage  $V_{in}$  is near  $V_{in,max}$ , the  $V_{sd,sel}$  is relatively small so  $V_{in,max}$  is approximately given by equation:

$$V_{in,max} \simeq V_{bias} - V_{sg,sf} + |V_{thp}| \tag{3.23}$$

The  $V_{in,max}$  voltage is directly influenced by the gate bias voltage of the PMOS current source load,  $V_{bias}$ .  $V_{bias}$  is given by the equation:

$$V_{bias} = V_{dd} - \sqrt{\frac{2I_{bias}}{K'(\frac{W_{cs}}{L_{cs}})}} - |V_{thp}| \tag{3.24}$$

While a low $I_{bias}$ current results in a wide input voltage range, it cannot be made arbitrarily small because the current must be high enough to meet the slew rate, noise, and transconductance requirements. For a given $I_{bias}$ current, $V_{bias}$ is raised by using a wide gate for the $M_{cs}$ current source PMOS. Similarly, widening the $M_{sf}$ PMOS will also improve $V_{in,max}$.
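The dependence of $V_{in,max}$ on transistor sizing can be sketched by chaining equations 3.22 through 3.24. This is only an illustration under the square-law model, with $V_{sg,sf}$ taken as the overdrive voltage plus $|V_{thp}|$; the threshold voltage, $K'$, bias current and aspect ratios below are assumed values, and only the 2.5 V supply appears elsewhere in this chapter.

```python
import math


def overdrive(i_bias, k_prime, aspect_ratio):
    """Square-law overdrive voltage sqrt(2*I / (K' * W/L)) in volts."""
    return math.sqrt(2.0 * i_bias / (k_prime * aspect_ratio))


def v_in_max(v_dd, v_thp_abs, i_bias, k_prime, wl_cs, wl_sf):
    """Approximate upper input limit following equations 3.22 to 3.24."""
    v_bias = v_dd - overdrive(i_bias, k_prime, wl_cs) - v_thp_abs  # eq. 3.24
    v_sg_sf = overdrive(i_bias, k_prime, wl_sf) + v_thp_abs        # eq. 3.22
    return v_bias - v_sg_sf + v_thp_abs                            # eq. 3.23


# Assumed, illustrative values only.
V_DD = 2.5        # supply voltage, volts
V_THP = 0.6       # |V_thp|, volts
I_BIAS = 10e-6    # bias current, amperes
K_PRIME = 30e-6   # PMOS process transconductance, A/V^2

narrow = v_in_max(V_DD, V_THP, I_BIAS, K_PRIME, wl_cs=20, wl_sf=20)
wide = v_in_max(V_DD, V_THP, I_BIAS, K_PRIME, wl_cs=40, wl_sf=40)
print(f"W/L = 20: V_in,max = {narrow:.3f} V")
print(f"W/L = 40: V_in,max = {wide:.3f} V")
```

Under these assumptions, widening both $M_{cs}$ and $M_{sf}$ raises $V_{in,max}$, in line with the sizing argument above.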

In the case of the SST, the analog output voltage is restricted by the output range of the analog multiplexing stage. The SST's analog output pin driver circuitry is a source follower stage with less than unity voltage gain so the voltage range of the analog multiplexing circuit dictates the upper limit of the SST output voltage range. The output voltage range of the analog multiplexing circuit is given by the following equation:

$$V_{o_{Range}} = V_{o_{sel,Max}} - V_{o_{sel,Min}} \tag{3.25}$$

$$= V_{dd} - |V_{tp}| - \sqrt{\frac{2I_d}{K'\left(\frac{W}{L}\right)_{cs}}} - \sqrt{\frac{2I_d}{K'\left(\frac{W}{L}\right)_{sf}}} - \frac{I_d}{K'\left(\frac{W}{L}\right)_{sel}\sqrt{\frac{2I_d}{K'\left(\frac{W}{L}\right)_{sf}}}} \tag{3.26}$$

Equation 3.26 shows that the transistor aspect ratios and the bias currents have an impact on the output voltage range. However, the output voltage range is primarily reliant on two important parameters that a circuit designer typically has no control over: the power supply voltage $V_{dd}$ and the threshold voltage $V_{th}$. Both of these parameters are determined by the fabrication technology and they limit the overall SST operational voltage range.
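As a cross-check, the terms of equation 3.26 can be traced back to the earlier results. The following is a sketch under assumptions that the text does not state explicitly: the maximum output occurs at $V_{in} = V_{in,max}$, the minimum occurs at $V_{in} = 0$, $V_{sd,sel}$ is treated as a positive magnitude in equation 3.19, and $I_d = I_{bias}$.

$$
\begin{aligned}
V_{o_{sel,Max}} &= V_{in,max} + V_{sg,sf} + V_{sd,sel} = V_{bias} + |V_{tp}| = V_{dd} - \sqrt{\frac{2I_d}{K'\left(\frac{W}{L}\right)_{cs}}} \\
V_{o_{sel,Min}} &= V_{sg,sf} + V_{sd,sel}\Big|_{V_{in}=0} = |V_{tp}| + \sqrt{\frac{2I_d}{K'\left(\frac{W}{L}\right)_{sf}}} + \frac{I_d}{K'\left(\frac{W}{L}\right)_{sel}\sqrt{\frac{2I_d}{K'\left(\frac{W}{L}\right)_{sf}}}}
\end{aligned}
$$

The first line uses equations 3.21 and 3.24. The second uses the square-law relation $V_{sg,sf} - |V_{tp}| = \sqrt{2I_d/(K'(W/L)_{sf})}$ together with equation 3.20 evaluated at $V_{in} = 0$. Subtracting the second line from the first gives the five terms of the output range; the last term carries the bias current in its numerator, so every term has units of volts.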

Figure 3.10: Simplified readout circuit for analyzing the power supply noise transfer function

The amount of power supply noise that is coupled to the output signal is an important aspect of the readout circuit's analog performance. The following analysis reveals how susceptible the analog readout circuit is to power supply noise. The presence of digital components in an integrated circuit can potentially cause switching noise to appear on the power supply lines. That power supply noise can manifest itself in the analog circuitry and degrade the noise performance of the analog output. The signal gain from the $V_{dd}$ node to the analog output node is used to study how the power supply noise corrupts the readout output. The biasing circuitry for the readout circuit needs to be considered for this calculation. The biasing current is set by an external resistor $R_{cs,bias}$, and that current is mirrored to the $M_{cs}$ PMOS transistor. The simplified circuit used to calculate the gain from $V_{dd}$ to the output is shown in figure 3.10. The low frequency voltage gain from the $V_{dd}$ node to the output is given by the following equation:

$$\frac{\Delta v_o}{\Delta v_{dd}} \approx \frac{gm_{cs,mirror}(\frac{1}{gm_{sel}})}{gm_{cs}R_{cs,bias} + 1} \tag{3.27}$$

The PMOS $M_{cs,mirror}$ and PMOS $M_{cs}$ are sized equally to give the current mirror a unity current gain. Since both transistors have the same aspect ratio and bias current, $gm_{cs,mirror}$ and $gm_{cs}$ share the same value. In general, $gm_{cs}R_{cs,bias}$ is several times greater than 1, so the $V_{dd}$ to output voltage gain can be simplified to:

$$\frac{\Delta v_o}{\Delta v_{dd}} \approx \frac{1}{gm_{sel}R_{cs,bias}} \tag{3.28}$$

In general, practical component sizes and typical bias values call for an $R_{cs,bias}$ bias resistor on the order of several k$\Omega$. Since the $M_{sel}$ PMOS transistor operates in the deep triode region, it has a low transconductance, which worsens the power supply noise rejection. According to circuit simulation, the readout circuit has a $v_{dd}$-to-$v_o$ small signal gain of 0.2 V/V. This corresponds to a power supply rejection ratio of -14 dB. This reveals the potentially serious drawback that large noise components on the power supply may not be sufficiently attenuated at the output node. To prevent digital switching noise from coupling to the analog circuitry, the digital power supply is isolated from the analog power supply in the layout. It is cautioned that noise coupling from $V_{dd}$ was witnessed in a few situations during testing, when voltage spikes in the power supply appeared on the SST's analog output node.
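The conversion from the simulated gain to decibels, and the simplified expression of equation 3.28, can be checked with a few lines. This is a minimal sketch; the $gm_{sel}$ and $R_{cs,bias}$ values are assumptions picked only so that equation 3.28 reproduces the reported 0.2 V/V, and they are not design values from the SST.

```python
import math


def gain_to_db(voltage_gain):
    """Convert a voltage gain in V/V to decibels."""
    return 20.0 * math.log10(voltage_gain)


def supply_to_output_gain(gm_sel, r_cs_bias):
    """Simplified V_dd-to-output gain following equation 3.28."""
    return 1.0 / (gm_sel * r_cs_bias)


# Reported simulation result from the text.
simulated_gain = 0.2
print(f"{simulated_gain} V/V = {gain_to_db(simulated_gain):.1f} dB")

# Assumed values: a 5 kOhm bias resistor and a 1 mS transconductance.
estimate = supply_to_output_gain(gm_sel=1.0e-3, r_cs_bias=5.0e3)
print(f"equation 3.28 estimate: {estimate:.2f} V/V")
```

A larger $gm_{sel}$ or a larger $R_{cs,bias}$ lowers this gain and therefore improves the supply rejection in this model.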

The presence of voltage fixed pattern noise (FPN) is another issue that arises from the readout circuit. Voltage FPN refers to the phenomenon where the analog readout has random offset voltages that add a noise component to the SST output. The voltage FPN is caused primarily by the threshold voltage variation of each of the $M_{sf}$ PMOS and $M_{sel}$ PMOS transistors due to process variation [31]. Although the offset voltages vary randomly from cell to cell, they are fixed for each cell position. Therefore the voltage FPN can be measured and compensated for during the waveform readout process. The voltage FPN characteristics and its mitigation through calibration are covered in more detail in chapter 5.

#### 3.4 High Speed Comparator

To implement sensitive and precise triggering capabilities, the SST relies on high speed comparators to discern when the input signal exceeds a programmed threshold value. The SST has two separate threshold voltages per channel: a high threshold and a low threshold. The high speed comparators output a high pulse to indicate that the input signal rose above the high threshold voltage or fell below the low threshold voltage. The target performance specification for the high speed comparator is that its outputs resolve to full logic levels with an input of a narrow Gaussian pulse. This pulse has a minimum full width at half maximum (FWHM) duration of 500 ps and a minimum voltage difference (difference between the input and threshold voltage) of 15 mV. The plots in figure 3.11 illustrate the signals with the minimal pulse duration and minimal amplitude that the comparators must reliably detect. These specifications were determined based on neutrino signal templates developed by the physicists performing the ARIANNA experiment.

Figure 3.11: Example of inputs for comparator triggering

The ARIANNA experiment requires continuous time monitoring for threshold crossings. Therefore the comparators need to operate asynchronously and cause a trigger anytime the input signal exceeds the threshold, regardless of the phase of any reference clock. There are two prevailing styles of comparators for use in high speed applications: an open loop comparator and a latch comparator. The open loop comparator is used to implement the SST's asynchronous comparator because it is more suitable for implementing the uninterrupted monitoring for threshold crossings.

The latch comparator (also referred to as the regenerative comparator) is widely used in high speed applications and is capable of high voltage gain, fast operation, and low power consumption [32][33]. However, latched comparators generally require dual clock phase operation, so there are intervals where the latched comparator cannot register threshold crossings. A basic implementation of a CMOS latch decision circuit is presented in figure 3.12.

The circuit in figure 3.12 requires two clock phases to operate. During the low clock phase, the circuit is in the sampling phase where the PMOS transistors S1 and S2 are on, passing input voltages $V_{in1}$ and $V_{in2}$ to nodes X and Y respectively. In this phase, the NMOS transistor M5 is in cutoff, disabling the latching transistors M1 through M4. The output of the comparator is unavailable during this time. When the clock phase switches from low to high, the comparator enters the evaluation phase. In this phase, the transistor M5 conducts in triode operation, allowing current to flow through the cross coupled transistor pairs M1-M2 and M3-M4. This enables a positive feedback loop in which a small voltage difference between the X and Y nodes is amplified until the circuit latches. Once the circuit settles, the output is available on nodes X and Y; the node with the higher voltage at the 'low-to-high' clock transition will be at logic level high, while the other will be at logic level low.

The latched comparator operates as a discrete time comparator, where the circuit makes the voltage comparison based on what the inputs are at the instant of the rising clock edge. This is effectively equivalent to performing comparator voltage discrimination on samples of the comparator inputs at a rate equal to the comparator clock rate. The output of the latch comparator is only valid for discrete time instances, and any changes between the sampled instances are not represented in the output of the latched comparator. Therefore, if a neutrino signal crosses the threshold voltage for an instant in between the comparator clock edges, then the SST would fail to trigger and the event would be lost.

If the input signal frequency spectrum were limited and well defined, and if the clock frequency of the latched comparator were sufficiently high, then it would be conceivable that the latched comparator could catch all threshold crossings. However, operating the latched comparator with a 1.0 GHz clock (assuming the nominal SST sampling rate of 2.0 GHz) will not make the voltage comparisons frequently enough to trigger on all pulses with a FWHM of 500 ps. A possible solution would be to synthesize a faster clock signal from the LVDS receiver clock using a clock multiplier circuit. Yet, doing so would significantly complicate the design. In addition to the added challenge of designing a high frequency clock multiplier circuit, the load on the LVDS receiver would increase and a clock distribution system would be needed to route the new high speed clock to the latched comparators.

Figure 3.12: CMOS latch decision stage schematic for high speed comparators.<sup>2</sup>
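The risk of missing a narrow pulse with a clocked comparator can be illustrated with a small numerical experiment. The sketch below is not from the thesis: it assumes an ideal Gaussian pulse, an ideal comparator that decides instantaneously on each clock edge, and a threshold placed at half the pulse height so that the pulse stays above the threshold for exactly one FWHM. Only the 500 ps FWHM, the 15 mV overdrive and the 1.0 GHz clock come from the text; the 4 GHz clock is an assumed comparison point.

```python
import math

FWHM_PS = 500.0  # minimum pulse width from the ARIANNA specification


def gaussian_pulse(t_ps, peak_v, fwhm_ps=FWHM_PS):
    """Gaussian pulse centered at t = 0 with the given peak and FWHM."""
    return peak_v * math.exp(-4.0 * math.log(2.0) * (t_ps / fwhm_ps) ** 2)


def detection_fraction(clock_period_ps, peak_v, threshold_v, phase_steps=1000):
    """Fraction of clock phases for which at least one clock edge
    samples the pulse while it is above the threshold."""
    edges_each_side = int(4 * FWHM_PS / clock_period_ps) + 1
    detected = 0
    for step in range(phase_steps):
        phase_ps = clock_period_ps * step / phase_steps
        samples = (
            gaussian_pulse(phase_ps + k * clock_period_ps, peak_v)
            for k in range(-edges_each_side, edges_each_side + 1)
        )
        if any(v > threshold_v for v in samples):
            detected += 1
    return detected / phase_steps


# Assumed pulse: 30 mV peak with a 15 mV threshold (15 mV overdrive).
PEAK_V = 0.030
THRESHOLD_V = 0.015

slow = detection_fraction(1000.0, PEAK_V, THRESHOLD_V)  # 1.0 GHz clock
fast = detection_fraction(250.0, PEAK_V, THRESHOLD_V)   # 4.0 GHz clock (assumed)
print(f"1.0 GHz clock: pulse detected for {slow:.0%} of clock phases")
print(f"4.0 GHz clock: pulse detected for {fast:.0%} of clock phases")
```

Under these assumptions, roughly half of the possible clock phases miss the pulse at 1.0 GHz, while a clock period shorter than the time above threshold catches it for every phase. A continuous time comparator avoids the dependence on clock phase altogether.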

The SST implements the much simpler solution of using open loop comparators instead of the latched comparators. With the open loop comparator, there is no clocking required, and both the input threshold crossing and comparator output are continuous time operations. Therefore, the open loop comparator makes threshold discriminations without any periods of interruption. The open loop comparator used in the SST is shown in figure 3.13.

<sup>2</sup>This CMOS comparator latch is presented in reference [32].



Figure 3.13: SST Open loop comparator schematic

The open loop comparator is essentially a high speed, high gain amplifier connected in open loop. The principle behind this circuit is that it amplifies the input with sufficient gain that the output voltage saturates to the supply rails. If the voltage at the positive input terminal is higher than the voltage at the negative terminal, the SST comparator amplifies the positive voltage difference, which causes the output voltage to reach $V_{dd}$. Similarly, if the positive input terminal has a lower voltage than the negative terminal, then the input voltage difference is negative and the comparator output voltage is driven to 0 V.

To achieve the speed and sensitivity goals illustrated in figure 3.11, it is necessary that the SST comparator has a high bandwidth while achieving the required gain. As shown in figure 3.13, each comparator circuit block consists of a series of five high speed, differential pair gain stages followed by a differential to single-ended conversion stage. The cascade of the high speed gain stages is a design technique that allows for extending the bandwidth at the expense of power consumption and layout area. The resistive load differential pair is an excellent candidate for the high speed gain stages for several reasons. It is one of the most basic amplifier configurations that operate with differential inputs, a characteristic that is required for a two input terminal comparator. Differential pair gain stages can be directly cascaded together without the need for capacitive interconnects to bias the circuit.
For differential signal operation, the resistive loaded differential pair behaves as a single pole stage, generally resulting in high speed operation.

The following method is used to calculate the minimum comparator gain required. The gain of the comparator determines the smallest input signals that the comparator can resolve to a full scale logic output; this is referred to as the comparator resolution. The minimum voltage gain is given by equation:

$$A_{vmin} = \frac{V_{dd}}{V_{in_{min}}} \tag{3.29}$$

Calculating the minimum comparator gain for the 2.5 V supply voltage and the 15 mV resolution from the ARIANNA specification, the comparator must achieve a voltage gain greater than 167 V/V. In practice, the comparator's low frequency gain is designed for 500 V/V to ensure that its output reaches the maximum/minimum voltages for all valid inputs, and to build safety margins into the circuit.
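The arithmetic behind these numbers follows directly from equation 3.29. A minimal sketch, using only the values quoted in the text (the function name is a hypothetical helper):

```python
def minimum_gain(v_dd, v_in_min):
    """Minimum comparator gain following equation 3.29 (V/V)."""
    return v_dd / v_in_min


V_DD = 2.5            # supply voltage, volts
V_IN_MIN = 0.015      # required resolution, volts
DESIGN_GAIN = 500.0   # designed low frequency gain, V/V

a_min = minimum_gain(V_DD, V_IN_MIN)
margin = DESIGN_GAIN / a_min
print(f"minimum gain: {a_min:.1f} V/V")
print(f"design margin: {margin:.1f}x")
```

The designed gain of 500 V/V therefore carries a margin of about a factor of three over the minimum requirement.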

The speed versus gain trade off is an important aspect of the differential pair design, and it is analyzed in the following section. Each of the high speed gain stages has a small signal voltage gain given by the equation:

$$a_0 = gm * R_{load} \tag{3.30}$$

Where  $R_{load}$  is the load resistor value and gm is the transconductance of the source coupled pair transistor. The single pole of each stage determines the stage's bandwidth and is given by:

$$\omega_0 = \frac{1}{C_{tot} * R_{load}} \tag{3.31}$$

Where $C_{tot}$ is the total capacitance seen at the output of the gain stage, given by the drain capacitance of the MOS transistor, the resistor's parasitic capacitance, and the next stage's load capacitance, which is the gate capacitance of the subsequent differential pair stage. Multiplying the gain and the bandwidth terms together results in the gain bandwidth product (GBW) given by equation:

$$GBW = \frac{gm}{C_{tot}} \tag{3.32}$$

The gm of the differential pair transistor is given by the equation:

$$gm = \sqrt{K' \left(\frac{W}{L}\right) I_{tail}} \tag{3.33}$$

The GBW can be increased at the cost of power dissipation by driving a larger tail current to an extent, but it is important to note that there is an upper limit to this for a given fabrication technology. For a fixed power dissipation, the constant GBW reveals the proportional trade off between the voltage gain and the bandwidth. Increasing the resistor load effectively increases the gain of the stage while also increasing its time constant and reducing its bandwidth. Using the 0.25  $\mu m$  fabrication technology, it is apparent that no practical single stage differential pair can meet the bandwidth design objective while achieving a sufficiently large voltage gain. An adequately high bandwidth can be achieved using a smaller resistor value for a small time constant, but this will result in a gain far too low to operate as a comparator. Conversely, a large resistive load would increase the low frequency gain, but it will not satisfy the speed requirement.

The SST comparator employs a cascade of gain stages as a circuit solution for achieving both the gain and high bandwidth requirements. Cascading a series of high speed, low gain stages is a technique that extends the gain and bandwidth well beyond what a single gain stage can achieve. For a cascade of N gain stages, where each stage has a voltage gain of $a_0$ (from equation 3.30) and a bandwidth of $\omega_0$ (from equation 3.31), the total gain and bandwidth are given by equations:

$$Av_{tot} = a_0^N \tag{3.34}$$

$$\omega_{3db_{tot}} = \omega_0 \sqrt{\sqrt[N]{2} - 1} \tag{3.35}$$

For cascaded stages of N > 2, the circuit bandwidth can be approximated by equation:

$$\omega_{3db_{tot}} \approx \omega_0 \frac{0.9}{\sqrt{N}} \tag{3.36}$$

Cascading multiple low gain stages (each with a gain of $a_0$) allows for a sufficient overall gain in the comparator. Lowering $a_0$ by reducing the $R_{load}$ resistance increases each stage's bandwidth ($\omega_0$). Equation 3.35 and equation 3.36 show that increasing the number of cascaded stages will (at some point) lower the comparator bandwidth; however, there is a range of N where the increase in $\omega_0$ is more pronounced, resulting in an overall increase in bandwidth. Using equations 3.30 - 3.36, the effect of cascading stages on the bandwidth (for a fixed overall gain) is illustrated in figure 3.14; in this plot, the overall voltage gain is fixed at 500 V/V, and the circuit bandwidth is plotted for different numbers of stages.

The most dramatic improvement in bandwidth comes from the first few increases in the number of cascaded stages. As illustrated in figure 3.14, there is a diminishing return with N, so the bandwidth improvements beyond six stages are modest. An important factor to consider is that the power consumption and layout area increase practically in proportion to N, resulting in a power and area versus bandwidth trade off. The comparator is designed with five stages since it achieves a sufficiently high speed operation and the marginal performance improvements for more stages are not worth the additional power and area costs. With five gain stages, each stage only needs to contribute a voltage gain of 3.47 V/V to meet the overall gain specification.

Figure 3.14: Comparator bandwidth vs. number of stages (for a fixed Av=500V/V)
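The trend in figure 3.14 can be approximated from the cascade relations. The sketch below assumes identical single-pole stages with a constant gain bandwidth product, uses the standard result $\omega_0\sqrt{2^{1/N} - 1}$ for the bandwidth of N identical cascaded poles, and normalizes the bandwidth to the per-stage GBW. It is an idealized model, not a reproduction of the simulated data behind the figure.

```python
import math


def cascade(n_stages, total_gain=500.0, gbw=1.0):
    """Per-stage gain and overall -3 dB bandwidth of N identical stages.

    gbw is the per-stage gain bandwidth product; the default of 1.0
    returns the bandwidth normalized to GBW.
    """
    a0 = total_gain ** (1.0 / n_stages)                    # from eq. 3.34
    w0 = gbw / a0                                          # from eqs. 3.30 to 3.32
    shrink = math.sqrt(2.0 ** (1.0 / n_stages) - 1.0)      # from eq. 3.35
    return a0, w0 * shrink


for n in range(1, 9):
    a0, bw = cascade(n)
    print(f"N = {n}: gain per stage = {a0:6.2f} V/V, bandwidth = {bw:.4f} x GBW")
```

In this model the step from one to two stages improves the bandwidth by more than an order of magnitude, while the step from five to six stages adds only about 12 percent, which matches the diminishing return described above. The per-stage gain for five stages evaluates to about 3.47 V/V.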

As shown in the comparator schematic in figure 3.13, the initial gain stage is implemented with a PMOS differential pair. This was done to accommodate the ARIANNA input voltage range and allow the comparators to operate for inputs that are a couple of hundred millivolts above the ground voltage. NMOS differential pair stages would not work for these low voltages because the NMOS gate-source voltages would drop below the NMOS threshold voltage and the transistors would cut off.

The NMOS differential pairs were preferred for the four subsequent gain stages after the initial PMOS gain stage because NMOS stages allow for higher frequency operation. For the TSMC 0.25  $\mu m$  fabrication process, the NMOS transistor's process transconductance parameter, denoted by Kn', is over four times greater than the PMOS parameter Kp'. Therefore, under the same bias currents and resistive loads, the PMOS differential pairs must be at least four times wider than the NMOS circuit to achieve an equivalent gain. The greater transistor sizes hamper the circuit's speed because they are accompanied by considerably larger parasitic capacitances.

When cascading the comparator gain stages, it is necessary to design each gain stage so that its common mode voltages are compatible with those of its adjacent stages. The output common mode voltage of one stage must be within the input common mode range of the next stage it connects to. If this condition is not met, then the subsequent stages will be driven out of saturation, and enter into the triode region or even cutoff. If any of the FETs are driven into cutoff, then the comparator will cease to function. If the FETs enter the triode region, then the overall comparator gain will suffer. Both cases would severely compromise the functionality of the comparator and are to be avoided.

The most critical cascaded junction in the comparator is the connection between the PMOS stage and the NMOS stage. The output voltage of a single side of the PMOS stage output terminal (referring to the  $V_o$ + and  $V_o$ - nodes of the PMOS stage) ranges between 0V and  $I_{biasp} * R_{loadp}$  volts; its common mode output voltage is given by equation:

$$V_{CM_{out}\ PMOS} = I_{biasp} * R_{loadp}/2 \tag{3.37}$$

The input common mode voltage for the NMOS stage must lie between a maximum and a minimum value. The maximum common mode voltage is given by equation:

$$V_{CM_{in,max}\ NMOS} = V_{dd} - I_{biasn} * R_{loadn}/2 + V_{tn} \tag{3.38}$$

The minimum common mode voltage is given by:

$$V_{CM_{in,min}\ NMOS} = V_{biasN} + V_{gs,4} - V_{tn} \tag{3.39}$$

Unless there is a concerted effort to make the common mode voltages of the NMOS and PMOS gain stages compatible, there is a high likelihood that the PMOS common mode output falls below the NMOS input common mode range, causing the performance problems previously mentioned. In the realized design, the PMOS stage's output common mode voltage is designed to be near the center of the NMOS stage's input common mode range to build in a wide margin for error. Errors in the absolute values of resistances and bias currents are expected because there are inevitable variances in the fabrication process. With margins for common mode voltage errors designed into the comparator, the circuit can tolerate moderate voltage deviations due to process variations. The PMOS stage's output common mode voltage was designed by selecting a large enough resistive load and tail bias current to raise the PMOS common mode output voltage to the desired level.

The comparator gain stages use P+ poly resistors without silicide to implement the resistive loads seen in figure 3.13. The gain stage design calls for 3.75 $k\Omega$ resistors for the PMOS stage and 3.25 $k\Omega$ for the NMOS stage. P+ poly layers have a suitable sheet resistance, allowing for reasonable resistor lengths for the realization of the required resistances. The P+ poly resistors have several characteristics that make them an attractive option for the resistor implementation. Their resistance value remains relatively constant regardless of the voltage across them. The device matching among polysilicon resistors is fairly good, with a typical matching accuracy of around 0.5% [34]. Importantly, the parasitic capacitance of P+ poly resistors is low.

Designs of the gain stage using diode connected active loads were also considered, but after comparing their performance with the resistor load designs, the latter was chosen. Higher operation speed was the primary factor for using the resistive loads. To implement a diode connected load with a small signal resistance equivalent to the poly resistors, the required FETs must be several micrometers in width. These wide diode connected transistors add considerable parasitic capacitances to the output node. Not only are there additional source/drain diffusion capacitances from the diode connected loads, but there are also gate-source capacitances, due to the gate to drain coupling from the diode connected configuration. The P+ poly resistive loads offer better high speed performance because they contribute less parasitic capacitance to the output node.

The final stage of the high speed comparator converts the differential signal from the prior high speed stages to a rail to rail, single ended output. The design of this stage is based on an amplifier with a push-pull output. Referring back to the schematic in figure 3.13, the currents through the differential pair transistors M7 and M8 are mirrored to the output transistors M13 and M15 respectively. By the time the signal reaches the differential to single-ended stage, it has been amplified enough to cause complete current switching between M7 and M8. Due to the current mirroring action, one of the output transistors (either M13 or M15) sources/sinks current while the other one shuts off. This creates the push-pull behavior that drives the capacitive load at the output node to the supply rails.

The speed of the differential to single-ended stage is essentially determined by its slew rate rather than its small signal -3dB frequency. This stage experiences wide voltage swings and current switching, which are large signal behaviors, so the -3dB frequency derived from the small signal circuit model is a poor indication of its speed. The slew rate is a large signal parameter that specifies the output voltage's maximum rate of change, and it is given by $I_{max}/C_{load}$, where $I_{max}$ is the maximum available current at the output node and $C_{load}$ is the total load capacitance. As the comparator output voltage transitions between high and low, this stage spends most of the transition time slewing. This is especially true if the load capacitance is large. A large load capacitance reduces the output slew rate and thus increases the time needed to charge or discharge the load. To reduce the effect of large loads, a pair of inverter buffers is used to drive large capacitances and present a smaller load at the comparator output.

It is worth noting that the circuit has no need for any type of phase compensation since it is used in an open loop fashion. Phase compensation ensures a circuit's stability when it is used in negative feedback, but it is not needed in open loop operation. If phase compensation were included, it would only reduce the bandwidth and limit the speed without any benefit for the open loop operation. A high number of gain stages can be used in the open loop comparator because it is unrestrained by a phase requirement. Most common amplifier designs include no more than two or three stages because they are designed for use in negative feedback. Each stage contributes additional phase shift, and an amplifier with more than three stages would likely reach a phase shift of 180° at lower frequencies. That would be a major problem if the circuit were to be used in negative feedback, since it would require the introduction of a low frequency dominant pole to ensure stability, which would greatly limit the circuit speed. However, the SST comparators are used in an open loop manner, so the phase response is irrelevant and a large number of stages is not prohibited.
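As a rough illustration of the slew rate relation $I_{max}/C_{load}$ discussed above, the following sketch estimates how long the output takes to slew across a given voltage swing for two load capacitances. The current, swing, and capacitance values are assumed for illustration only and are not taken from the SST design.

```python
def slew_rate(i_max, c_load):
    """Slew rate in V/s, given by I_max / C_load."""
    return i_max / c_load


def slew_time(v_swing, i_max, c_load):
    """Time in seconds to slew across v_swing at the maximum output current."""
    return v_swing / slew_rate(i_max, c_load)


# Hypothetical values, chosen only to show the scaling with the load.
I_MAX = 200e-6   # maximum output current in amperes
V_SWING = 2.5    # rail to rail swing in volts

for c_load in (20e-15, 200e-15):
    t = slew_time(V_SWING, I_MAX, c_load)
    print(f"C_load = {c_load * 1e15:.0f} fF: slew time = {t * 1e12:.0f} ps")
```

The slew time scales linearly with the load, which is why buffering a large capacitance behind inverters shortens the transition at the comparator output.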

A problem with the comparators is that they experience input offset voltages, which are non-idealities where a zero differential input produces a non-zero output voltage. The input offset voltage can be modeled as a random shift in the triggering threshold voltage. Random variations during the fabrication process result in mismatches among the devices and cause the input offsets. Resistor mismatch, transistor aspect ratio mismatch, and threshold voltage mismatch are the factors that determine the offset voltage for each of the high speed gain stages [26]. The offset voltage is reasonably approximated by the equation:

$$V_{os} = \Delta V_{th} + \frac{v_{ov}}{2} \left[ -\frac{\Delta R_{load}}{R_{load}} - \frac{\Delta (W/L)}{(W/L)} \right] \tag{3.40}$$

Where $R_{load}$ and $(W/L)$ are the mean load resistance and aspect ratio respectively. The $\Delta V_{th}$, $\Delta R_{load}$ and $\Delta (W/L)$ denote the mismatch between the threshold voltages, resistor values and transistor aspect ratios. While the component mismatch is dependent on the process, designing the circuit for low overdrive voltage, choosing larger resistor values, and using larger aspect ratios will help to reduce the offset voltages. Without any additional circuit techniques to address them, the offset voltages are inevitable and typically range around several tens of millivolts. In practice, the comparators are manually offset with DAC generated voltages to calibrate out the offset errors and get a precise control of the event triggering.
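To make the dependence on the overdrive voltage concrete, the sketch below evaluates equation 3.40 and a corresponding standard deviation. The combination rule treats the three mismatch terms as independent zero-mean random variables, which is an assumption made here for illustration. All numeric values are hypothetical.

```python
import math


def input_offset(d_vth, v_ov, dr_over_r, dwl_over_wl):
    """Equation 3.40 for one set of mismatch values (volts)."""
    return d_vth + 0.5 * v_ov * (-dr_over_r - dwl_over_wl)


def offset_sigma(sigma_vth, v_ov, sigma_r, sigma_wl):
    """Standard deviation of the offset, assuming independent mismatch terms."""
    return math.sqrt(sigma_vth ** 2 + (0.5 * v_ov) ** 2 * (sigma_r ** 2 + sigma_wl ** 2))


# Hypothetical mismatch statistics: 5 mV threshold mismatch, 1% resistor
# mismatch, and 2% aspect ratio mismatch.
for v_ov in (0.2, 0.6):
    sigma = offset_sigma(5e-3, v_ov, 0.01, 0.02)
    print(f"V_ov = {v_ov:.1f} V: offset sigma = {sigma * 1e3:.2f} mV")
```

Under these assumptions a lower overdrive voltage shrinks the contribution of the resistor and aspect ratio mismatch, leaving the threshold mismatch as the floor.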

The layout of the circuit can have a dramatic effect on its circuit performance, so it is important that good layout practices are employed, especially for high performance analog circuits. The layout of the high speed comparator is shown in figure 3.15. A common centroid layout pattern is used in the high speed comparator. Both the source coupled transistor pairs and the resistors were laid out with common centroids. If a process gradient were present in the fabrication, then a common centroid design would distribute the device variations evenly, lessening the impact of the process variation [23]. A common centroid layout is one where the components are arranged around a device center, and the arrangement pattern mirrors itself along every axis of symmetry. For example, the resistors in the PMOS high gain stages have a one dimensional common centroid pattern of ABBA|ABBA.



Figure 3.15: High speed comparator layout

Dummy components are also added for the purpose of improving component matching. The devices at the outer edges of the layout may experience component differences due to the edge devices having different diffusion profiles compared to the inner devices [23]. The dummy components are added to the edges so that all of the active elements used in the circuit are in the interior of the layout. These dummy components are tied off to ground (or  $V_{dd}$ ) and do not affect the circuit electrically; their only purpose is to eliminate the fabrication edge effects from the device layout.

Guard rings are added throughout the comparator to isolate it from noise coupled through the substrate. Sensitive analog components can be disrupted by digital components, which cause switching noise. The comparators are susceptible to noise due to their sensitive analog nature. The comparators also generate noise since their digital outputs create large switching currents and voltages that can disrupt other components. To mitigate noise coupling through the substrate, guard rings are placed around the NMOS transistors, PMOS transistors, and resistors. The guard rings around the various transistors and resistors are visible in the layout shown in figure 3.15. The guard rings are implemented by surrounding the components with a diffusion layer that is tied to either ground or  $V_{dd}$ .

#### 3.5 Designing for High Acquisition Bandwidth

A design goal of the SST is to achieve an analog -3dB bandwidth that exceeds 1.0 GHz to prevent the attenuation of the high frequency components in the sampled signal. An analog bandwidth greater than 1.0 GHz extends the SST's recordable frequency range to the entire Nyquist band (assuming a 2.0 GHz sampling rate). Section 3.2 broached the topic of bandwidth and established that the bandwidth is affected by the design of the sample and hold cell. While the bandwidth of the SST relies in part on the individual sample and hold cell (equation 3.10), determining the overall bandwidth of the arrayed cells is more complicated. To adequately estimate the overall SST analog bandwidth, the design must take into account the various parasitic elements affecting the circuit. The following section discusses how the sample and hold array's component parameters and parasitic elements were modeled, and how the model is used to derive an optimized bandwidth solution.

Figure 3.16: Diagram of an N-element sampling array

The signal source being sampled is connected to the SST chip through an input pin on the integrated circuit (IC) package. Each channel on the SST consists of an array of 256 sample and hold cells connected to an analog bus line, as shown in the simplified diagram in figure 3.16. The various packaging elements, interconnects, and metal lines in the signal path between the sampling cell and the signal's source influence the SST's frequency performance.

If the bandwidth of the sample and hold array were given by the RC time constant of a single sampling cell (equation 3.10), then the array's analog bandwidth could be made arbitrarily high simply by widening the switch transistors to reduce the  $R_{on}$  resistance, as shown in equations 3.7 through 3.9. In practice this is not the case. When the switch transistors exceed several tens of micrometers in width, the realized bandwidth of the sample cell array falls far below the prediction given by the single cell RC time constant. The sample cell bandwidth equation (equation 3.10) is a poor indicator of the bandwidth of the entire array because it does not factor in a number of parasitic resistances and capacitances that significantly influence the bandwidth. As the switch transistors increase in size, so do their source/drain diffusion areas and perimeter lengths, causing unintended increases in the associated parasitic capacitances. To a first order approximation, the capacitances of the switch transistors are given by the following equations:

$$C_{DB_{PMOS}}(W_p) \approx W_p E_p C_{jp} + 2(W_p + E_p)C_{jswp} \tag{3.41}$$

$$C_{DB_{NMOS}}(W_n) \approx W_n E_n C_{jn} + 2(W_n + E_n)C_{jswn} \tag{3.42}$$

Where  $W_p$  and  $W_n$  are the PMOS and NMOS transistor widths respectively,  $E_p$  is the source/drain diffusion width defined by the process,  $C_{jp}$  is the junction capacitance per unit area, and  $C_{jswp}$ is the sidewall capacitance per unit length of perimeter [26];  $E_n$ ,  $C_{jn}$ , and  $C_{jswn}$  are the corresponding NMOS quantities. Equations 3.41 and 3.42 clearly show that increasing the switch transistor widths increases their capacitance.
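The first order capacitance model can be sketched as a short function. The area term is weighted by the per-area capacitance and the perimeter term by the sidewall capacitance, following the definitions given in the text. The process values are hypothetical placeholders rather than TSMC 0.25 $\mu m$ parameters, and the units (micrometers and femtofarads) are chosen for readability.

```python
def drain_capacitance(width_um, e_um, c_j, c_jsw):
    """First order source/drain junction capacitance in fF.

    width_um : transistor width in micrometers
    e_um     : source/drain diffusion width in micrometers
    c_j      : junction capacitance per area in fF/um^2
    c_jsw    : sidewall capacitance per perimeter length in fF/um
    """
    area = width_um * e_um
    perimeter = 2.0 * (width_um + e_um)
    return area * c_j + perimeter * c_jsw


# Hypothetical process values for illustration.
E_UM, C_J, C_JSW = 0.6, 1.8, 0.4

for w in (2.0, 10.0, 50.0):
    print(f"W = {w:5.1f} um: C_DB = {drain_capacitance(w, E_UM, C_J, C_JSW):6.2f} fF")
```

Each additional micrometer of width adds a fixed increment of `E_UM * C_J + 2 * C_JSW` to the capacitance, so the capacitance grows linearly with width while the switch resistance falls only as the inverse of the width.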

The circuit model in figure 3.17 is used to analyze the bandwidth of the SST's sample and hold cell array. This model factors in the significant parasitic components which lie in the signal path. While each individual parasitic element may have a minor effect, the overall impact of hundreds of these elements in the array can be severe. Figure 3.17 represents the scenario with the worst frequency response, where the Nth sampling cell is sampling and the remaining N-1 sampling cells are open circuited; N being the number of sampling cells in the array. When the Nth sample cell is tracking, it encounters the resistance from the entire analog bus and contains the most capacitances in its signal path. The bandwidth of the array is designed by maximizing the worst case bandwidth using the model in figure 3.17 since this represents the factor limiting the bandwidth of the sampling and hold cell array.

Figure 3.17: The circuit model for the array while the Nth sampling cell is in its sampling phase.

The values of the elements in the figure 3.17 model are calculated with their first order approximations. While more precise expressions are available, the first order equations are reasonably predictive of the component values without being so complicated that design insights are obscured. The analog bus that delivers the input signal across the entire sampling array is more than a thousand micrometers long. Such a length of metal line has a large enough resistance to have an impact on high speed performance. The equivalent parasitic resistance across the entire analog bus is given by:

$$R_{wire} = \frac{L_{wire}}{W_{wire}}\rho\tag{3.43}$$

Where  $L_{wire}$  is the wire length,  $W_{wire}$  is the wire width, and  $\rho$  is the sheet resistance of the wire material. The distributed resistance of the wire implementing the analog bus is approximated with lumped resistances between adjacent sampling cells, each with a value of  $R_{wire}/N$ .

The  $R_{on}$  is the resistance across the CMOS sampling switch in the tracking phase, and it is given by equations 3.7 - 3.9. In general,  $R_{on}$  is by far the largest single resistance in figure 3.17, and it typically ranges from several hundred ohms to several kilo-ohms. Reasonably, much of the effort to maximize the bandwidth goes into reducing the  $R_{on}$  resistance. While this can be accomplished by widening the CMOS switch transistors, increasing the width also increases the parasitic capacitances of all N sampling cells connected to the analog bus, which opposes the bandwidth gains from lowering the  $R_{on}$  resistance.

The  $C_{d/s}$  capacitors represent the source/drain capacitances of the sample and hold cell while in the hold state.  $C_{d/s}$  is given by the equation:

$$C_{ds} = C_{DB_{nmos}} + C_{DB_{pmos}} \tag{3.44}$$

The  $C_{DB_{nmos}}$  and  $C_{DB_{pmos}}$  are the capacitances given in equations 3.41 and 3.42.  $C_{pin}$ ,  $R_{bond}$ , and  $C_{pad}$  are the input pin capacitance, resistance through the bond wire, and the capacitance of the input pad respectively. These elements were included to account for the effects of the IC packaging and the connections to the SST. The values of  $C_{pin}$ ,  $R_{bond}$  and  $C_{pad}$  are based on the type of IC packaging and the fabrication process, and a circuit designer may not have control over these parameters. The  $V_{sig}$  and  $R_{sig}$  components make up the Thevenin equivalent of the off chip amplifier.

The zero value time constant analysis [26],[35] is applied to the circuit in figure 3.17 to estimate each capacitor's contribution to the -3dB cutoff frequency. In this cutoff frequency estimation technique, each capacitor is associated with a driving point resistance. The driving point resistance is found by selecting a capacitor, replacing all other capacitors with an open circuit, and calculating the equivalent resistance seen across the selected capacitor. The driving point resistance is calculated for every capacitor in the circuit. A time constant  $\tau$  is associated with each capacitor and is given by the product of the capacitance and its driving point resistance. The -3dB frequency is estimated by the equation:

$$\omega_{3dB} = \frac{1}{\sum_{i}^{n} \tau_i} \tag{3.45}$$

The denominator in equation 3.45 is the sum of all of the time constants. The zero value time constant analysis may be an imprecise method to calculate the actual cutoff frequency, especially when many poles are located at nearby frequencies. However, this method is still useful in determining how significantly each pole contributes to the -3dB frequency.
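The accuracy of the zero value time constant estimate can be checked on a circuit small enough to solve exactly. The sketch below uses a two-section RC ladder, which is a stand-in chosen for illustration and not the figure 3.17 model. It compares the estimate of equation 3.45 (converted to hertz) with the -3dB frequency found numerically from the exact transfer function. The component values are hypothetical.

```python
import math


def zvtc_bandwidth_hz(time_constants):
    """Equation 3.45 expressed in hertz: 1 / (2*pi*sum of time constants)."""
    return 1.0 / (2.0 * math.pi * sum(time_constants))


def ladder_gain(f, r1, c1, r2, c2):
    """|Vout/Vin| of source -> r1 -> (c1 to ground) -> r2 -> (c2 to ground)."""
    s = 2j * math.pi * f
    b1 = r1 * c1 + (r1 + r2) * c2
    b2 = r1 * r2 * c1 * c2
    return abs(1.0 / (1.0 + b1 * s + b2 * s * s))


def exact_bandwidth_hz(r1, c1, r2, c2, f_lo=1.0, f_hi=1e12):
    """Bisection in log frequency for the point where the gain is 1/sqrt(2)."""
    target = 1.0 / math.sqrt(2.0)
    for _ in range(200):
        f_mid = math.sqrt(f_lo * f_hi)
        if ladder_gain(f_mid, r1, c1, r2, c2) > target:
            f_lo = f_mid
        else:
            f_hi = f_mid
    return math.sqrt(f_lo * f_hi)


# Hypothetical values: 50 ohm source with 1 pF, then 500 ohm switch with 80 fF.
R1, C1, R2, C2 = 50.0, 1e-12, 500.0, 80e-15

# Driving point resistances: c1 sees r1; c2 sees r1 + r2.
taus = [R1 * C1, (R1 + R2) * C2]
f_est = zvtc_bandwidth_hz(taus)
f_exact = exact_bandwidth_hz(R1, C1, R2, C2)
print(f"ZVTC estimate: {f_est / 1e9:.2f} GHz, exact: {f_exact / 1e9:.2f} GHz")
```

With two poles at nearby frequencies the estimate falls below the exact value, which matches the caution above: the method is conservative but still ranks the time constants by how much each one limits the bandwidth.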

The time constants for all N of the sampling cell source/drain capacitances in the array can be combined into a single term given by:

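$$\tau_{sdN} = R_{db_{tot}} C_{ds} \tag{3.46}$$

Where  $C_{ds}$  is the per-cell source/drain capacitance from equation 3.44 and  $R_{db_{tot}}$  is the sum of the driving point resistances seen by the N capacitances, given by the following:

$$R_{db_{tot}} = \frac{R_{wire}}{N} \left[ \sum_{i=1}^{N} i \right] + N(R_{sig} + R_{bond}) \tag{3.47}$$

$$= \frac{R_{wire}(N+1)}{2} + N(R_{sig} + R_{bond}) \tag{3.48}$$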

The calculations for the remaining time constants are the following equations:

$$\tau_{pin} = R_{sig} * C_{pin} \tag{3.49}$$

$$\tau_{pad} = (R_{sig} + R_{bond})C_{pad} \tag{3.50}$$

$$\tau_{sh} = (R_{on} + R_{sig} + R_{wire} + R_{bond})C_{sh} \tag{3.51}$$

The bandwidth of the array is given by:

$$\omega_{3dB} = \frac{1}{\tau_{sdN} + \tau_{pin} + \tau_{pad} + \tau_{sh}} \tag{3.52}$$

Each of the sample and hold cells contributes a capacitance which impacts the bandwidth of the overall array. For large arrays, the cumulative effects of the  $C_{ds}$  capacitances significantly degrade the bandwidth. Based on equation 3.46, increasing the array size N while holding all other parameters fixed will increase the  $\tau_{sdN}$  time constant, bringing down the cutoff frequency. Therefore, when designing for the largest possible bandwidth, the sample and hold array size should be the smallest size that satisfies the system requirements.

The sampling cell array achieves the absolute highest bandwidth with only an NMOS for the sample and hold pass gate, and that is why the bandwidth optimization process is described for this case. The poor bandwidth performance of the PMOS is due to its large parasitic capacitance and the low hole mobility. However, the drawback of the NMOS only pass gate is that it would limit the upper voltage range of the SST. To find a satisfactory compromise between bandwidth and operational voltage range, the SST sampling array was designed by first sizing an NMOS only switch in the cells to achieve the maximum bandwidth. Then, a PMOS switch was added in parallel to increase the voltage range. The PMOS was iteratively sized until an adequate input range was found.

Once the array size is established, the design variable that most greatly affects the array bandwidth is the width of the switch transistors in the sample and hold cells. The relationship between the transistor's resistance and its parasitic capacitance results in the existence of an optimal transistor width that maximizes the array bandwidth. Increasing the transistor width lowers the  $R_{on}$  resistance, but it also increases the  $C_{ds}$  capacitance. On the other hand, lowering the width reduces the  $C_{ds}$  capacitances, but this drives up the  $R_{on}$  resistance. Sizing the transistor width at either extreme causes either the capacitance or the switch resistance to dominate the frequency performance. There exists a pass transistor width that balances the counteracting effects on the bandwidth and results in a maximum -3dB frequency. Maximizing the bandwidth as a function of the transistor width is done by solving the equation:

$$\frac{\partial}{\partial W_n} \left( \tau_{sdN} + \tau_{pin} + \tau_{pad} + \tau_{sh} \right) = 0 \tag{3.53}$$

Solving equation 3.53 for  $W_n$  finds the minimum of the sum of the time constants, which corresponds to the maximum of the -3dB bandwidth. Substituting equation 3.44 and equations 3.46 - 3.51 into equation 3.53, and solving for  $W_n$  results in:

$$W_{n} = \sqrt{\frac{C_{sh}}{\frac{\mu_{n}C_{ox}}{L}(V_{dd} - V_{bias} - V_{tn})\left[\frac{R_{wire}(N+1)}{2} + N(R_{sig} + R_{bond})\right](E_n C_{jn} + 2 C_{jswn})}} \tag{3.54}$$

Sizing the NMOS pass transistors in the sampling and hold cells to the width given by equation 3.54 theoretically maximizes the bandwidth of the sampling and hold cell array.
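The optimization can be sketched numerically by evaluating the sum of time constants for an NMOS only switch and comparing the closed form width with a brute force sweep. The switch model $R_{on} = 1/[(\mu_n C_{ox}/L) W_n (V_{dd} - V_{bias} - V_{tn})]$ is an assumption made here to match the factors appearing in equation 3.54, since equations 3.7 - 3.9 are not repeated in this section. Apart from $C_{sh} = 80$ fF and N = 256, every parameter value below is a hypothetical placeholder, so the resulting width is not expected to match the thesis values.

```python
import math

# Hypothetical parameters in SI units (only c_sh and n come from the text).
PARAMS = {
    "k_per_len": 200e-6 / 0.25e-6,  # mu_n * C_ox / L
    "v_ov": 2.5 - 1.0 - 0.5,        # V_dd - V_bias - V_tn
    "c_sh": 80e-15,
    "n": 256,
    "r_wire": 100.0,
    "r_sig": 50.0,
    "r_bond": 0.5,
    "c_pin": 1e-12,
    "c_pad": 0.2e-12,
    "e": 0.6e-6,                    # diffusion width in meters
    "c_j": 1.8e-3,                  # F/m^2
    "c_jsw": 4e-10,                 # F/m
}


def r_total(p):
    """Equation 3.48: summed driving point resistance for the N cell capacitances."""
    return p["r_wire"] * (p["n"] + 1) / 2.0 + p["n"] * (p["r_sig"] + p["r_bond"])


def tau_sum(w, p):
    """Sum of the time constants of equation 3.52 for an NMOS only switch."""
    c_ds = w * p["e"] * p["c_j"] + 2.0 * (w + p["e"]) * p["c_jsw"]
    r_on = 1.0 / (p["k_per_len"] * w * p["v_ov"])  # assumed switch model
    tau_sd = r_total(p) * c_ds
    tau_pin = p["r_sig"] * p["c_pin"]
    tau_pad = (p["r_sig"] + p["r_bond"]) * p["c_pad"]
    tau_sh = (r_on + p["r_sig"] + p["r_wire"] + p["r_bond"]) * p["c_sh"]
    return tau_sd + tau_pin + tau_pad + tau_sh


def w_opt_closed_form(p):
    """Equation 3.54."""
    dc_dw = p["e"] * p["c_j"] + 2.0 * p["c_jsw"]
    return math.sqrt(p["c_sh"] / (p["k_per_len"] * p["v_ov"] * r_total(p) * dc_dw))


def w_opt_sweep(p, w_min=0.2e-6, w_max=20e-6, steps=20000):
    """Brute force search for the width that minimizes the time constant sum."""
    widths = [w_min + i * (w_max - w_min) / steps for i in range(steps + 1)]
    return min(widths, key=lambda w: tau_sum(w, p))


w_cf = w_opt_closed_form(PARAMS)
w_sw = w_opt_sweep(PARAMS)
f_3db = 1.0 / (2.0 * math.pi * tau_sum(w_cf, PARAMS))
print(f"closed form W_n = {w_cf * 1e6:.3f} um, sweep W_n = {w_sw * 1e6:.3f} um")
print(f"estimated bandwidth at the optimum = {f_3db / 1e6:.0f} MHz")
```

The sweep serves only as a cross-check that the closed form lands on the minimum of the time constant sum. The terms that do not depend on the width shift the achievable bandwidth but not the optimal width.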

Cadence Virtuoso circuit simulations were used to verify the discussed bandwidth behavior. Circuit simulations were run for the sample and hold cell arrays using the TSMC 0.25  $\mu m$  fabrication process parameters and  $C_{sh}=80$  fF. For an array size of N=256, the pass transistor width  $W_n$  was swept and plotted versus the bandwidth of the sampling array; the results are shown in figure 3.18. This figure contains two traces. The solid trace (plotted in blue) shows the simulated behavior of the array's -3dB frequency as  $W_n$  is varied. The dashed trace (plotted in red) is a shifted version of the Virtuoso bandwidth simulation, adjusted to fit the bandwidth measured from the fabricated 0.25  $\mu m$  SST. The measured bandwidth of the fabricated SST (with 256 sample depth) is 1.5 GHz and is represented by the triangular data point that lies on the dashed trace; the realized bandwidth is 390 MHz higher than what the simulation predicted. The shift was applied so that the curve reflects the measured performance of the realized circuit. A likely explanation for the discrepancy between the simulation and the realized bandwidth is that overly conservative estimates were used in the simulation for the parasitic resistances and capacitances of the analog bus. The simulation in figure 3.18 indicates that the maximum bandwidth for the array occurs when the NMOS pass gates are sized to  $2.579 \,\mu m$ . Using equation 3.54, the theoretical bandwidth maximizing transistor size is  $W_n = 2.39 \,\mu m$ . The theoretical  $W_n$  value is reasonably close to the simulated  $W_n$ , with an error of 7.32%.

Figure 3.18: Circuit simulation of the sample and hold array -3dB frequency versus the transistor width of the NMOS pass gate.

Simulations comparing the bandwidths of sampling arrays with varying sample depths are shown in figure 3.19. Figure 3.19 contains two traces. The solid trace (plotted in blue) represents the simulated bandwidth values. The dashed trace (plotted in red) is a shifted version of the simulated data curve that has been adjusted to reflect the SST's realized bandwidth. The triangular data point on the dashed trace represents the realized 1.5 GHz bandwidth of the 256 sample wide SST chip. As in the previous plot, the dashed trace fits the simulation data to the realized circuit's bandwidth measurement. The results of the simulations confirm the theorized circuit behavior, where increasing the number of sampling cells in the array lowers the bandwidth of the system. The number of sample cells in the SST could therefore set an upper limit on the achievable bandwidth of the device.



Figure 3.19: Circuit simulation of the optimal array -3dB bandwidth versus the number of cells N.

# Chapter 4

# **Timing Analysis**

This chapter examines the SST's ability to measure timing phenomena, and explains the metrics used to characterize the timing measurements. Theory is presented to estimate the timing performance based on several factors including sampling noise, timing jitter, input signal parameters, and the SST's sampling rate. The SST's fixed pattern timing noise is a significant contributor to error in the SST timing measurements. Methods to characterize the fixed pattern timing noise are discussed. A timing calibration procedure that mitigates the effects of the fixed pattern timing noise is also explained in this chapter.

## 4.1 Timing Resolution

The ability to precisely measure timing information is important for high energy particle experiments. In these experiments, particles travel at relativistic speeds and events unfold over extremely short periods of time. A detector's ability to accurately measure timing differences among different antennas affects how accurately the particle events can be reconstructed. The flight path of a particle is calculated using the particle's arrival times, so high timing resolution is necessary to precisely determine the origin of a detected neutrino event.

The timing resolution of the SST characterizes its ability to precisely capture timing information in its waveform records. The SST timing resolution is measured by repeatedly recording waveforms with constant timing features, and calculating the variation in the recovered timing data. The timing resolution is defined as the standard deviation of these timing measurements. The SST data acquisition system is capable of measuring timing phenomena with a resolution on the order of picoseconds. Timing jitter in the sampling clock and voltage noise on the analog output degrade the SST's timing resolution.

The SST's timing resolution was characterized using two types of timing tests. One timing test uses linear interpolation to measure time intervals between zero crossings captured on a single channel. The other timing test relies on cross correlation techniques to test the SST's ability to measure timing phenomena across multiple channels. These timing tests are described in further detail in the following sections of this chapter.

## 4.2 Intra Channel Timing Test

The precision of the timing information captured on a single channel of the SST is assessed by recording pure sinusoidal inputs, and measuring the amount of variation in the periods of the recovered signals. This timing test is referred to as the intra channel timing test because it measures the relative time between instances within the same captured waveform. The procedure of the intra channel timing test is as follows. A pure sinewave generated from an RF signal generator serves as the input for the intra channel tests, and is recorded by the SST. The waveform is read out from the SST, and MATLAB is then used to process the recovered signal. The voltage fixed pattern noise and DC bias voltage are removed via pedestal subtraction. The fixed pattern sampling interval error is also calibrated out before performing the intra channel timing test; this timing fixed pattern calibration is discussed later in this chapter (in Section 4.7). The rising edge zero crossings are calculated using a simple linear interpolation given by equation 4.1, where  $X_0$  and  $Y_0$  are the time and voltage of the sample point before the zero crossing, and  $X_1$  and  $Y_1$  are the time and voltage of the sample point after the zero crossing.

$$t_{cross} = X_0 - Y_0 \left(\frac{X_1 - X_0}{Y_1 - Y_0}\right)$$
(4.1)

A period measurement is calculated as the time difference between consecutive zero crossings. An illustration of the intra channel timing test is presented in figure 4.1. While the rising edges were used for the period calculations, the same technique could be applied to the falling edges.
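The zero crossing interpolation of equation 4.1 and the resulting period measurements can be sketched in a few lines. The thesis processing was done in MATLAB; the Python version below is only an illustrative equivalent. It is applied to an ideal, noiseless sinewave whose frequency (37 MHz) and phase are arbitrary choices, sampled at the 2.0 GHz rate discussed in the text with a 256 sample record.

```python
import math


def rising_zero_crossings(times, volts):
    """Interpolated times of all rising edge zero crossings (equation 4.1)."""
    crossings = []
    for i in range(len(volts) - 1):
        x0, y0 = times[i], volts[i]
        x1, y1 = times[i + 1], volts[i + 1]
        if y0 < 0.0 <= y1:  # sample pair that straddles a rising zero crossing
            crossings.append(x0 - y0 * (x1 - x0) / (y1 - y0))
    return crossings


def period_measurements(crossings):
    """Time differences between consecutive zero crossings."""
    return [later - earlier for earlier, later in zip(crossings, crossings[1:])]


F_SAMP = 2.0e9   # sampling rate from the text
F_IN = 37.0e6    # hypothetical input frequency
PHASE = 0.3      # hypothetical phase offset in radians

times = [n / F_SAMP for n in range(256)]
volts = [math.sin(2.0 * math.pi * F_IN * t + PHASE) for t in times]

periods = period_measurements(rising_zero_crossings(times, volts))
for p in periods:
    print(f"measured period = {p * 1e9:.4f} ns (ideal {1e9 / F_IN:.4f} ns)")
```

With a pedestal-subtracted record from real hardware, `times` would hold the calibrated sample times and `volts` the recovered voltages; the standard deviation of `periods` would then give the intra channel timing resolution.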



Figure 4.1: Intra channel timing test illustration

A full scale sinewave improves the accuracy of the period measurements in the intra channel timing tests. The slope of the input sinusoid is at its maximum at the zero crossings, and it is given by equation:

$$\max\left\{\frac{dV_{in}}{dt}\right\} = A * 2\pi * F_{in} \tag{4.2}$$

Where A is the amplitude and  $F_{in}$  is the frequency of the input sinewave. Equation 4.2 shows that larger input amplitudes result in sharper transitions at the zero crossings. This yields greater voltage differences between the two samples around the zero crossing point, and reduces the effect of sampled voltage noise on the period measurements. An increase in the input frequency also increases the transition rate; however, higher input frequencies also introduce greater linear interpolation error.

The frequency of the input sinusoids used for the intra channel timing test can affect the outcome of the test and needs to be chosen carefully. The lower the input frequency, the better the linear interpolation can predict the zero crossing. As the frequency increases, the sampled points are taken on less linear portions of the sinusoid, leading to larger errors in the linear interpolation of the zero crossings. The plot shown in figure 4.2 is a Monte Carlo simulation of the RMS zero crossing error versus input frequency for a sinewave sampled at 2.0 GHz. As shown in figure 4.2, the interpolation error is practically zero for low frequencies, but quickly rises as the input frequency increases.

While the interpolation error increases for high frequency inputs, the intra channel test can still be performed at frequencies with significant interpolation error due to the presence of node frequencies. At certain input frequencies, the zero crossing errors in the linear interpolations do not affect the period measurements; these frequencies are referred to as node frequencies in this paper. The appearance of the node frequencies is demonstrated with a Monte Carlo simulation. To simulate the node frequencies, ideal, noiseless sinewaves with random phase offsets are generated for a number of different input frequencies. The intra channel timing test is applied to this simulated data, and the RMS error of the period measurements is plotted versus the input frequency in figure 4.3.

Figure 4.2: Linear interpolation error versus input frequency

The node frequencies occur when the input sinusoid has a period that is equal to an integer multiple of the sampling interval. Even in the presence of interpolation errors, the intra channel timing test reports the input periods without any error when measuring ideal, noiseless data. For every integer K greater than 2, the node frequencies are given by the equation:

$$F_{node} = \frac{F_{samp}}{K} \tag{4.3}$$

Where  $F_{node}$  is the node frequency and  $F_{samp}$  is the sampling frequency.

Figure 4.3: Simulated node frequencies with ideal sinusoidal inputs

Equation 4.3 can be equivalently expressed in terms of periods with the equation:

$$T_{node} = K * T_{samp} \tag{4.4}$$

Where  $T_{node}$  is the period of the node frequency input, and  $T_{samp}$  is the sampling interval.

At these node frequencies, the exact periods can be recovered from the intra channel measurements because the errors on both of the interpolated zero crossings are completely correlated. The input signal becomes synchronized with the sampling instances, so the exact same series of voltages within an input period are repeated for the entirety of the record length. This will mean that the sampled voltage pairs used to calculate each zero crossing will be the same.

To demonstrate the reoccurring sampled voltage pairs mathematically, consider an input sinusoid at a node frequency described by the equation:

$$y(t) = A * \sin(2\pi * F_{in} * t + \varphi) \tag{4.5}$$

Let  $t_0$  be the time of the sample taken just before the first zero crossing, and let  $y_0$  be the voltage sampled at  $t_0$ . This is expressed in the equation:

$$y_0 = y(t_0) = A * \sin(2\pi * F_{in} * t_0 + \varphi).$$
(4.6)

Where  $\varphi$  is an arbitrary, constant phase of the sinusoid. The next sample is taken at time  $t_1$ , which occurs  $T_{samp}$  seconds after  $t_0$ ;  $t_1$  is the time of the very next sample after the first rising edge zero crossing. This is expressed in the equations:

$$t_1 = t_0 + T_{samp} \tag{4.7}$$

$$y_1 = y(t_1) = A * \sin(2\pi * F_{in} * t_1 + \varphi)$$
(4.8)

$$= A * \sin(2\pi * F_{in} * (t_0 + T_{samp}) + \varphi) \tag{4.9}$$

The subsequent rising edge zero crossing occurs one input period after the first one. The sample taken right before the second zero crossing occurs at time  $t_0$ ' and has a sampled voltage of  $y_0$ '. Time  $t_0$ ' is separated from  $t_0$  by K sampling intervals. This is expressed in the following equations:

$$y'_0 = y(t'_0) = A\sin\left(2\pi * F_{in} * t'_0 + \varphi\right)$$
(4.10)

$$t'_0 = t_0 + K * T_{samp} \tag{4.11}$$

The input is at a node frequency (from the initially stated condition), so  $K * T_{samp} = T_{node}$ . Therefore,  $t_0$ ' can be expressed with equations:

$$t'_{0} = t_{0} + T_{node} = t_{0} + \frac{1}{F_{in}}$$
(4.12)

Substituting  $t_0$ ' from equation 4.12 into equation 4.10 yields the following:

$$y'_{0} = A\sin\left(2\pi * F_{in} * (t_{0} + \frac{1}{F_{in}}) + \varphi\right) \tag{4.13}$$

$$= A\sin(2\pi * F_{in} * t_0 + 2\pi + \varphi)$$
(4.14)

$$=A\sin\left(2\pi * F_{in} * t_0 + \varphi\right) \tag{4.15}$$

Comparing equation 4.6 with equation 4.15, it is clear that sample voltages  $y_0$  and  $y_0$ ' are equal. Similarly it can be shown that  $y_1$  equals  $y_1$ ' and that all of the samples around the rising edge zero crossings behave in the same manner. Therefore, all of the zero crossing linear interpolations are performed with the same voltages, and the samples are separated by the same time interval of  $T_{samp}$ . This results in a linear interpolation error that is equal for each of the zero crossing calculations. Since the period measurement is taken as the time difference between the zero crossings, the interpolation error is subtracted out, resulting in an error free period measurement (assuming that the input and sampling times are ideal).
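The cancellation argument above can be checked with a small Monte Carlo sketch in the spirit of the simulation behind figure 4.3. Ideal, noiseless sinewaves with random phases are sampled at 2.0 GHz, and the RMS period error is computed at one node frequency (K = 8, or 250 MHz) and at one nearby frequency that is not a node (230 MHz). The number of trials, the seed, and the two test frequencies are arbitrary choices for illustration.

```python
import math
import random

F_SAMP = 2.0e9
N_SAMPLES = 256


def rising_zero_crossings(times, volts):
    """Interpolated rising edge zero crossing times (equation 4.1)."""
    crossings = []
    for i in range(len(volts) - 1):
        y0, y1 = volts[i], volts[i + 1]
        if y0 < 0.0 <= y1:
            crossings.append(times[i] - y0 * (times[i + 1] - times[i]) / (y1 - y0))
    return crossings


def rms_period_error(f_in, trials=200, seed=0):
    """RMS error of the measured periods for ideal sinewaves with random phase."""
    rng = random.Random(seed)
    times = [n / F_SAMP for n in range(N_SAMPLES)]
    squared_errors = []
    for _ in range(trials):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        volts = [math.sin(2.0 * math.pi * f_in * t + phase) for t in times]
        zc = rising_zero_crossings(times, volts)
        for earlier, later in zip(zc, zc[1:]):
            squared_errors.append((later - earlier - 1.0 / f_in) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


node_error = rms_period_error(F_SAMP / 8)   # node frequency, K = 8
off_node_error = rms_period_error(230.0e6)  # not a node frequency
print(f"RMS period error at 250 MHz (node): {node_error:.3e} s")
print(f"RMS period error at 230 MHz:        {off_node_error:.3e} s")
```

At the node frequency the only remaining error is floating point rounding, while the off-node input shows a period error at the picosecond scale even though no noise was added.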

Performing the intra channel test on a real world system provides a measure of the limits of the system's timing measurements. The previous discussion demonstrated that the intra channel timing test will measure the input sinusoid's period with zero error if the system is noiseless. However, the realized SST circuit is affected by noise on the sampled voltage, and by noise that causes timing jitter on the sampling clock. These noise sources are reflected in the intra channel timing test. The sampled voltage noise and the timing jitter arise from different, unrelated random processes, and are therefore independent of each other. The random error on the intra channel test's period measurements, due to random noise, can be calculated using the zero crossing linear interpolation equation 4.1. It is assumed that the timing jitter is described by a Gaussian random variable with a variance equal to  $\sigma_t^2$  and that the voltage noise is also normally distributed with a variance of  $\sigma_n^2$ . In equation 4.1, the difference between the sampled voltages  $(Y_1 - Y_0)$  is closely approximated by the equation:

$$(Y_1 - Y_0) \approx A * \omega(X_1 - X_0) \tag{4.16}$$

Where A is the input amplitude and  $\omega$  is the angular frequency of the input sinewave. This approximation holds because the input's instantaneous rate of change at the zero crossing is equal to  $A * \omega$ , and the samples used for the intra channel timing are close to the zero crossing point. Using the scaling and summing properties of independent Gaussian random variables, the standard deviation of the period measurements from the intra channel timing test is modeled by equation 4.17. The constant multiplying  $\sigma_t$  in equation 4.17 is the product of a  $\sqrt{2}$  term, which accounts for the independent timing jitter of the two sample points, and an empirically determined weighting term of (3/5), which accounts for how the positions of the sampled voltages affect the zero crossing error caused by the random noise.

$$\sigma_{period} = \sqrt{2\left[\left(\frac{\sigma_n}{A*\omega}\right)^2 + \left(\frac{3\sqrt{2}}{5}\sigma_t\right)^2\right]} \tag{4.17}$$

In addition to assessing the minimum resolution of the timing information captured on a single SST channel, the intra channel test can provide an indirect measure of the timing noise inherent in the system. The sampled voltage noise can be directly measured, and using equation 4.17, the jitter on the sampling clock can be solved for in order to estimate its value.
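The indirect jitter estimate amounts to inverting equation 4.17 for $\sigma_t$. A minimal sketch of the forward model and its inverse follows; the noise, amplitude, and frequency values in the example are hypothetical and are not SST measurements.

```python
import math

# Weight applied to the jitter term in equation 4.17: (3/5) * sqrt(2).
JITTER_WEIGHT = 3.0 * math.sqrt(2.0) / 5.0


def sigma_period(sigma_n, sigma_t, amplitude, f_in):
    """Equation 4.17: standard deviation of the period measurements (seconds)."""
    omega = 2.0 * math.pi * f_in
    voltage_term = sigma_n / (amplitude * omega)
    jitter_term = JITTER_WEIGHT * sigma_t
    return math.sqrt(2.0 * (voltage_term ** 2 + jitter_term ** 2))


def sigma_jitter(sigma_p, sigma_n, amplitude, f_in):
    """Equation 4.17 solved for the sampling clock jitter sigma_t (seconds)."""
    omega = 2.0 * math.pi * f_in
    voltage_term = sigma_n / (amplitude * omega)
    remainder = sigma_p ** 2 / 2.0 - voltage_term ** 2
    if remainder < 0.0:
        raise ValueError("voltage noise alone exceeds the measured period spread")
    return math.sqrt(remainder) / JITTER_WEIGHT


# Hypothetical example: 1 mV RMS voltage noise, 0.5 V amplitude, 100 MHz input.
measured = sigma_period(sigma_n=1e-3, sigma_t=1e-12, amplitude=0.5, f_in=100e6)
recovered = sigma_jitter(measured, sigma_n=1e-3, amplitude=0.5, f_in=100e6)
print(f"sigma_period = {measured * 1e12:.2f} ps, recovered sigma_t = {recovered * 1e12:.2f} ps")
```

The guard in `sigma_jitter` reflects a practical limit of the method: when the voltage noise term dominates the measured spread, the subtraction leaves little or nothing from which to estimate the jitter.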

## 4.3 Inter Channel Timing Test

The SST's multi-channel timing resolution is determined with a timing test that uses cross correlation to measure delays between two channels. This timing test is referred to as the inter channel timing test since it relies on measurements taken between different channels. The inter channel timing test is performed by repeatedly measuring a fixed timing delay with the SST. The variation in the delay measurements is calculated to reveal the accuracy of the SST's multi-channel timing capabilities. The standard deviation of the delay measurements is used to determine the inter channel timing resolution.

A time-tested method for calculating delays between two signals involves the use of the cross correlation function. The cross correlation function measures the similarity between two signals across a range of time shifts; the time shifts are referred to as lags. When the cross correlation function is applied to a pair of similar, but time shifted signals, the maximum of the cross correlation function corresponds to the time delay between the signals. The discrete time cross correlation function for the two sampled signals is given by the equation:

$$Xcorr(f,g)[k] = \sum_{i=1}^{N} f[i]g[i+k] \quad \text{for } k \in [-N,N] \tag{4.18}$$

Where f[i] and g[i] are the sampled input signals. In equation 4.18, the k values are the lags, which take on all positions from -N to N. Cross correlating two signals as a function of k can be described as follows. One signal is shifted with respect to the other by a lag of k samples. Next, the aligned samples are multiplied and the resulting vector is summed. Each summation corresponds to the cross correlation function value at the  $k^{th}$  lag position. When k equals the index corresponding to the time delay between the two signals, the signals are aligned and the cross correlation function is at its maximum. The  $k_{max}$  lag position corresponds to the maximum of the cross correlation function and is expressed mathematically as:

$$Max\{Xcorr(f,g)[k]\} = Xcorr(f,g)[k_{max}] \tag{4.19}$$

The time delay between the two signals (in units of seconds) is given by the equation:

$$time \ delay = k_{max} * T_{samp} \tag{4.20}$$

Where  $T_{samp}$  is the time interval between consecutive samples.
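Equations 4.18 through 4.20 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the test record are hypothetical, and samples shifted outside the record are treated as zero, which is an assumption of this example.

```python
import math

def xcorr(f, g, k):
    """Equation 4.18 for one lag k; samples shifted outside the record count as zero."""
    return sum(f[i] * g[i + k] for i in range(len(f)) if 0 <= i + k < len(g))

def delay_by_xcorr(f, g, t_samp):
    """Equations 4.19 and 4.20: delay of g relative to f, in steps of t_samp."""
    n = len(f)
    k_max = max(range(-n, n + 1), key=lambda k: xcorr(f, g, k))
    return k_max * t_samp

# Hypothetical record: a one period bipolar pulse and a copy delayed by 3 samples.
pulse = [math.sin(2.0 * math.pi * i / 8.0) for i in range(8)]
f = pulse + [0.0] * 12
g = [0.0] * 3 + pulse + [0.0] * 9
delay = delay_by_xcorr(f, g, 0.5e-9)  # 3 samples at a 0.5 ns sampling interval
```

Without interpolation, the result is quantized to whole multiples of the sampling interval.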

The inter channel test is a versatile test that can be performed with a variety of input waveforms, ranging from Gaussian pulses to neutrino template signals. Bipolar pulse inputs are predominantly used to characterize the SST's inter channel timing resolution. Bipolar pulses are preferred for several reasons. These signals test the SST's capability to record signals with both positive and negative polarities. The event signals detected by the ARIANNA experiments are bipolar in nature, so it is important that the SST's performance in measuring bipolar signals is accounted for in the timing tests. The bipolar pulses have several distinct features, including rapidly rising and falling segments, which allow for a high degree of timing precision. In addition, the pulse width and amplitude of the bipolar pulses can be adjusted easily, repeatably, and reliably.

The bipolar pulses used in the inter channel timing tests consist of a one period segment of a sinewave. These input signals are created using the gated function on the RF function generator. A signal splitter is used to create two identical pulse signals. A length of SMA cable is used to create a fixed and very stable time delay between the two pulses. The delay created by the cable remains virtually constant, so any variation in the signal delays is due to noise and non-idealities inherent in the SST system. The input pulses experience signal attenuation when traveling through the cables. Therefore, longer delays, which require greater cable lengths, exacerbate the signal loss. Although this cable loss reduces the signal power, the shape of the pulse remains intact. Since the cross correlation measures the signal similarity, the inter channel test results in reliable delay measurements even with notable signal attenuation. The timing resolution did not show any significant degradation across inter channel tests with delays ranging from 0 ns to 60 ns. However, it stands to reason that at some point, significant amounts of attenuation will compromise the timing resolution.

To perform the inter channel test, the input pulse and its delayed counterpart are recorded onto two separate SST channels. Before applying the timing test, the DC voltage and the voltage FPN are removed with pedestal subtraction. The SST's cell dependent gain variation and the fixed pattern timing noise are also corrected for. An example of the calibrated bipolar pulses used in the inter channel tests is plotted in figure 4.4.



Figure 4.4: An example of the bipolar pulse inputs used in the inter channel timing test

Interpolation is used to get a finer time resolution on the delay measurements. Without interpolation, it is only possible to measure the time delay in increments of the sampling interval  $T_{samp}$ . Using interpolation techniques, the values of the signal in between the samples are estimated. There are several methods of interpolation, and the choice of interpolation method can affect the accuracy of the delay measurements. The piecewise cubic interpolation function (PCHIP) in MATLAB is used in the inter channel timing tests. The PCHIP interpolation is chosen because it interpolates the data smoothly and it retains the overall shape of inputs without overshoots.



Figure 4.5: Plot of the cross correlation function versus lag

A plot of the cross correlation function versus the lag time for two pulses (the pulses plotted in figure 4.4) is shown in figure 4.5. The plot in figure 4.5 shows that the cross correlation function rises sharply as the lag time approaches the time delay between the two pulses. The pulse delay is found by determining the lag time associated with the maximum value of the cross correlation function. However, simply taking the maximum value introduces delay errors due to the finite interpolation step and irregularities from the interpolation. A more precise determination of the cross correlation function maximum is achieved by mathematically fitting the peak of the cross correlation function. The points near the peak of the cross correlation function are closely approximated by an inverted parabola, so the peak can be accurately modeled with a second order polynomial fit. The mathematical fit is performed with the polyfit function in MATLAB, and it models the function with the following equation:

$$R_{fit}(t) = a * t^{2} + b * t + c \tag{4.21}$$

Where a, b, and c are the fitting coefficients resulting from the polynomial fit. The peak of the cross correlation function and the polynomial fit are plotted in figure 4.6. The location of the maximum is determined by setting the derivative of the fitting function,  $2at + b$ , to zero and solving for the time. This results in the time delay, which is given by the equation:

$$time \ delay = -\frac{b}{2a} \tag{4.22}$$

Where a and b are the polynomial coefficients from the fit in equation 4.21. By using this fitting technique, the accuracy of the SST's delay measurements is improved, reducing the variation of the delay measurements by several hundred femtoseconds.
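The peak refinement can be sketched as follows. The text performs a least squares fit with MATLAB's polyfit over the points near the peak; this Python sketch swaps in the simpler exact parabola through three equally spaced points, and the function name and test values are hypothetical. Setting the derivative of equation 4.21 to zero places the vertex at t = -b/(2a).

```python
def parabola_vertex(t0, h, y_minus, y0, y_plus):
    """Vertex time of the parabola through (t0-h, y_minus), (t0, y0), (t0+h, y_plus)."""
    # Coefficients of R(t) = a*u^2 + b*u + c in the shifted coordinate u = t - t0.
    a = (y_minus - 2.0 * y0 + y_plus) / (2.0 * h * h)
    b = (y_plus - y_minus) / (2.0 * h)
    if a >= 0.0:
        raise ValueError("the three points do not bracket a maximum")
    return t0 - b / (2.0 * a)

# Hypothetical check: samples of R(t) = -(t - 0.3)^2 + 5 taken at t = -1, 0, 1.
def r(t):
    return -(t - 0.3) ** 2 + 5.0

peak = parabola_vertex(0.0, 1.0, r(-1.0), r(0.0), r(1.0))
```

In practice, t0 would be the lag of the largest cross correlation sample and h the interpolation step.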



Figure 4.6: Second order fit of the cross correlation function maximum

The time delays created by the cables are reliably fixed, so any variation in the delay measurements is due to noise sources inherent in the SST system. Similar to the intra channel timing tests, the dominant noise sources in the inter channel test are random noise in the sampled voltages and random noise causing fluctuation in the sampling time. The relationships between the noise and the timing measurements differ between the intra channel test and the inter channel test since the tests rely on different fundamental principles. The impact of random noise on the delay measurements can be explained qualitatively as follows. The random voltage and timing noise on each sample creates minute differences in the shape of the two input signals. Each cross correlation delay measurement depends on the similarity between the shapes of the two signals. Therefore, the signal variations caused by the noise shift the maxima of the cross correlation functions, causing errors in the inter channel timing measurements.

In the SST system, the kT/C sampled voltage noise is the dominant source of voltage noise in the cross correlation delay measurements. However, mismatches between channels, sample cell mismatches, voltage fixed pattern noise, and other sources also add to the noise in quadrature, and they contribute to the uncertainty in the delay measurements. As proposed in the reference [36], the effect of voltage noise on the delay measurement uncertainty (using the cross correlation delay method) is estimated by the following equation:

$$\sigma_{v,n} = \sqrt{\frac{n * T_{samp}^2 * \sigma_n^2}{4A^2}} \tag{4.23}$$

Where n is the number of sample points in the input pulse,  $T_{samp}$  is the sampling interval,  $\sigma_n^2$ is the variance of the voltage noise, and A is the amplitude of the input pulse. Equation 4.23 was verified using the nominal SST parameters in Monte Carlo simulations. The predictions from equation 4.23 were in close agreement to the simulations when  $\sigma_n \ll A$ . Inputs with larger amplitudes improve the SNR, reducing the impact of the voltage noise on the delay measurements. When only considering the effect of the voltage noise, it is noted that increasing the number of sampling points increases the delay uncertainty. This occurs because each sample point has an independent random noise component, so including more sample points leads to a larger accumulation of voltage noise, which degrades timing measurements.

Timing jitter causes non-uniformity among the sampling edges, which contributes to the uncertainty in the delay measurements. As also proposed in reference [36], the timing jitter's effect on the uncertainty of the cross correlation delay measurements can be estimated with the following equation:

$$\sigma_{jitter} = \sqrt{\frac{\sigma_t^2}{n}} \tag{4.24}$$

The n term in equation 4.24 is the number of sample points on the input pulse and  $\sigma_t^2$  is the variance of the sample interval error (timing jitter). For  $\sigma_t \ll T_{samp}$ , equation 4.24 was shown to be a good estimator of the delay uncertainty through Monte Carlo simulation. Unlike the timing uncertainty caused by the voltage noise, the uncertainty in the delay measurements due to timing noise is reduced by increasing the number of sample points on the input pulse (denoted by n). With more sampling points, the delay variation (created by the timing jitter) is reduced in a similar fashion to how averaging a measurement over n observations reduces the variance from random, uncorrelated noise by a factor of n.

The timing jitter and voltage noise are caused by separate mechanisms, and are statistically independent from each other. Therefore, the timing noises add in quadrature and the total error on the inter channel timing measurements is expressed as:

$$\sigma_{tot} = \sqrt{\frac{n * T_{samp}^2 * \sigma_n^2}{4A^2} + \frac{\sigma_t^2}{n}} \tag{4.25}$$

From analysis of the equation 4.25, several observations are made about the inter channel timing resolution. Firstly, the error on the inter channel timing is dependent on the input signal amplitude, and maximizing the input signal power serves to minimize the timing error. The maximum power of the input signal is reliant (in part) on the SST's dynamic range specification, thus the dynamic range relates to the timing resolution. A large dynamic range allows for a greater SNR, which would mitigate the impact of sampled voltage noise on the SST's timing resolution.

The second observation made from equation 4.25 is that the inter channel timing error can be improved by reducing the power of the sampling voltage noise and the timing noise. The timing resolution degradation from the voltage noise is usually dominated by the kT/C noise, which is dictated by the size of the sampling capacitor (discussed in section 3.2). This indicates that a high SST timing resolution requires sufficient voltage noise suppression, achieved by adequately sizing the sampling capacitor. The source of the timing noise is the sampling clock generation circuitry, and the timing noise performance varies based on which circuit topology is used. The most significant timing noise sources in the SST's synchronously generated clock are the jitter from the external LVDS oscillator and the fixed pattern sample interval errors caused by circuit mismatches in the SST clock paths. Using a high performance, low jitter oscillator and applying timing calibrations (discussed in section 4.7) results in low timing noise, which allows for inter channel timing resolutions on the order of picoseconds.

For a given inter channel timing test input pulse (specified amplitude, shape and time duration), equation 4.25 shows that a higher sampling rate reduces the inter channel timing error. Changing the SST's sampling rate alters both the number of sample points in the input pulse (denoted by the variable n) and the sampling interval (represented by the variable  $T_{samp}$ ). Assume that the inter channel timing test input is a pulse with a fixed duration of  $T_{pulse}$  (where  $T_{pulse}$  is the time duration over which the input is non-zero). Then the number of sample points is given by  $n = T_{pulse}/T_{samp}$ . Substituting for n in equation 4.25 yields the following expression for the inter channel timing error:

$$\sigma_{tot} = \sqrt{\frac{T_{pulse} * T_{samp} * \sigma_n^2}{4A^2} + \frac{\sigma_t^2 * T_{samp}}{T_{pulse}}} \tag{4.26}$$

$$=\sqrt{T_{samp}\left(\frac{T_{pulse} * \sigma_n^2}{4A^2} + \frac{\sigma_t^2}{T_{pulse}}\right)} \tag{4.27}$$

The sample interval equals the reciprocal of the sampling rate (expressed as  $f_{samp} = \frac{1}{T_{samp}}$ ), so increasing the sample rate reduces  $T_{samp}$ . Equation 4.27 shows that the overall effect of decreasing  $T_{samp}$  is a reduction in the inter channel timing uncertainty caused by both the voltage noise and the timing noise, with the total error scaling as  $\sqrt{T_{samp}}$ . An increase in sampling rate is a practical method to improve the timing resolution of an SCA waveform recorder.
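The dependence on the sampling rate can be checked numerically. The sketch below evaluates equation 4.25 with  $n = T_{pulse}/T_{samp}$  treated as a continuous quantity; the function name and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def sigma_tot(t_pulse, t_samp, sigma_n, amplitude, sigma_t):
    """Equation 4.25 with n = t_pulse / t_samp, i.e. equation 4.27."""
    n = t_pulse / t_samp
    voltage_term = n * t_samp ** 2 * sigma_n ** 2 / (4.0 * amplitude ** 2)
    jitter_term = sigma_t ** 2 / n
    return math.sqrt(voltage_term + jitter_term)

# Hypothetical pulse and noise values, compared at 1 GSPS and 2 GSPS.
args = dict(t_pulse=5e-9, sigma_n=1e-3, amplitude=0.5, sigma_t=2e-12)
slow = sigma_tot(t_samp=1.0e-9, **args)
fast = sigma_tot(t_samp=0.5e-9, **args)
ratio = slow / fast  # equation 4.27 predicts sqrt(2) when t_samp is halved
```

Doubling the sampling rate therefore lowers the modeled error by a factor of  $\sqrt{2}$ , independent of how the error is split between voltage noise and jitter.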

#### 4.4 Sample Interval Errors

The use of time-interleaved sampling has been shown to be an effective way to achieve high sampling rates. However, interleaving techniques are susceptible to sampling time errors, which degrade the performance of the circuit. Ideally, each subsequent voltage is sampled after a sampling interval of  $T_{samp}$ . In the realized circuit, various noise sources and device mismatches cause timing errors in the sample clock edges. This causes the sample intervals to vary from  $T_{samp}$  and creates sample interval errors. This is illustrated in figure 4.7. The sampling instants are supposed to align with the edges of the sampling clocks, but non-idealities create timing uncertainties that shift the actual sampling instants and appear as sample interval errors.

The timing uncertainties responsible for the sampling interval errors can be separated into two categories: the random jitter and the deterministic jitter. The random jitter in the SST is caused by the various device noise sources, including transistor thermal noise and flicker noise. The oscillator is subjected to random noise, which contributes to the random timing jitter in the SST. The distribution of the random jitter is described by a Gaussian random variable with a zero mean. The sampling interval errors caused by the random jitter can only be described statistically; the exact values of the random errors are unpredictable for any given instant and they change from moment to moment.

The deterministic jitter in the sampling clock creates sample interval errors that behave differently from the sample interval errors caused by random jitter. The deterministic jitter is defined as timing variations that are recurring and predictable; therefore, the sample interval errors due to deterministic jitter are also predictable. The deterministic sample interval error is also referred to as the timing FPN (fixed pattern noise) because of its fixed and repeating behavior. The deterministic sample interval errors are bounded within finite values and they are measurable. Mismatches among the sampling clock signal paths are major contributors to the SST's deterministic sample interval errors. As the sampling clock signals are distributed throughout the sampling array, different sampling cells have different delays because of differences in their clock paths. The device mismatches among the components, which are caused by process variations, are another source of the deterministic sample interval errors. The SST relies on dual clock edges to generate its sampling clock, so variations in the duty cycle of the SST's internal clock also contribute to the deterministic sample interval error.

Figure 4.7: Illustration of timing uncertainty that causes sampling interval errors

The deterministic sample interval errors in an SST chip are quantifiable and predictable because the deterministic jitter is fixed for each device. If the sample intervals can be measured, the deterministic sample interval errors are given by the average of the errors. Averaging a sufficiently large set of sample intervals suppresses the errors from the random jitter components, leaving the deterministic sample interval errors. Once the deterministic sample interval errors are known, their predictable nature allows them to be removed. By adjusting the SST's readout voltages to account for the fixed sample interval errors, the deterministic sample interval errors can be calibrated out. This timing calibration is presented in section 4.7.

The total sample interval errors (the combined random and deterministic errors) degrade the timing performance and corrupt the recovered signal. The variance of the total sample interval errors is the timing noise denoted by  $\sigma_t^2$  in section 4.2 and in section 4.3, which degrades the SST's timing resolution. The sample interval errors also cause errors in the SST's readout voltages, which appear as a noise signal added to the output. The noise power of the sample interval errors can be estimated for sinusoidal inputs with the following equation:

$$E_p = \frac{(A\omega\sigma_t)^2}{2} \tag{4.28}$$

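Where A is the input amplitude,  $\omega$  is the angular frequency of the input and  $\sigma_t$  is the RMS value of the sample interval error [37]. Since the power of the input sinusoid is  $A^2/2$ , the sample interval errors limit the SST's SNR (in dB) to the value given in the following equation, where  $f = \omega/2\pi$  is the input frequency:

$$SNR = -20\log_{10}(\omega\sigma_t) = -20\log_{10}(2\pi f\sigma_t) \tag{4.29}$$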
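The relationship between equation 4.28 and the SNR limit can be verified with a short calculation. Dividing the sinusoid's power  $A^2/2$  by the noise power  $E_p$  gives  $1/(\omega\sigma_t)^2$ , which is  $-20\log_{10}(2\pi f\sigma_t)$  in decibels. In the Python sketch below, the function names and the example values (a 100 MHz input with 10 ps RMS error) are hypothetical.

```python
import math

def jitter_noise_power(amplitude, freq_hz, sigma_t):
    """Equation 4.28 with omega = 2*pi*f."""
    omega = 2.0 * math.pi * freq_hz
    return (amplitude * omega * sigma_t) ** 2 / 2.0

def jitter_limited_snr_db(freq_hz, sigma_t):
    """Ratio of the sine power A^2/2 to the noise power of equation 4.28, in dB."""
    return -20.0 * math.log10(2.0 * math.pi * freq_hz * sigma_t)

# Hypothetical values: 0.5 V amplitude, 100 MHz input, 10 ps RMS interval error.
amplitude, freq_hz, sigma_t = 0.5, 100e6, 10e-12
snr_db = jitter_limited_snr_db(freq_hz, sigma_t)
snr_from_powers = 10.0 * math.log10(
    (amplitude ** 2 / 2.0) / jitter_noise_power(amplitude, freq_hz, sigma_t)
)
```

The two values agree, and the result does not depend on the amplitude because both the signal power and the noise power scale with  $A^2$ .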

Another adverse effect of the sample interval errors is that any regularity in the timing errors can cause spurious frequency components to appear in the spectrum of the recovered signal. For example, the SST experiences a deterministic sample interval error where odd sample positions are shifted by a mean of about +30ps above  $T_{samp}$  while the even samples are shifted by a mean of about -30ps below  $T_{samp}$ . This results in an alternating pattern in the sample interval errors that modulates the sampled signal. This modulation creates a spurious component at the frequency  $\frac{F_{samp}}{2} - F_{in}$ .

#### 4.5 Zero Crossing Method for Timing FPN Characterization

Characterization of the fixed pattern sample interval errors presents several challenges because there are no practical means to directly measure each individual sample interval. One way to indirectly determine the SST's fixed pattern timing noise is what this text refers to as the zero crossing method. With the zero crossing method, the mean sample interval associated with each specific SST sample cell position is calculated based on the probability that it would receive a randomly assigned zero crossing.

When capturing signals containing random zero crossings, the probability that a zero crossing occurs between a pair of samples is proportional to the time interval between the samples. Therefore, the fixed pattern sample intervals can be found by experimentally determining the zero crossing probability mass function. These probabilities are empirically calculated by recording randomly offset waveforms and counting the zero crossings. The zero crossing method relies on the statistical law of large numbers, which states that the ratio of events approaches the actual probability as the number of trials approaches infinity. By sampling a sufficiently large set of data containing random zero crossings, the SST's sampling intervals can be closely estimated.

The implementation of the zero crossing method is described in the following. The inputs for the zero crossing method are full scale sine waves with randomly assigned phase offsets. The input frequency must be below the Nyquist frequency of  $F_{samp}/2$  to properly implement the zero crossing detection. To prevent the sampled voltages from being correlated with each other, the periods of the input sinusoids must not be integer multiples of the sampling period. Pedestal subtraction is performed on each of the captured inputs to remove the voltage FPN. For each captured input waveform, the rising edge zero crossings are determined and each zero crossing is counted according to the sampling position it occurred in. After a sufficiently large number of input waveforms have been recorded, the zero crossing probability mass function is calculated as:

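$$P_d[i] = \frac{Z_c[i]}{\sum_{k=1}^{N} Z_c[k]} \tag{4.30}$$

Where  $Z_c[i]$  is the number of zero crossings counted at sample position i and N is the number of sample positions. The sampling intervals are calculated from the probabilities with the following equation:

$$T_{samp}[i] = N * T_{nom} * P_d[i] = N * T_{nom} \frac{Z_c[i]}{\sum_{k=1}^N Z_c[k]} \tag{4.31}$$

Where  $T_{nom}$  is the nominal sample interval; for a 2.0 GHz sampling rate, the nominal sampling interval  $T_{nom}$  is 500 ps. Since the probabilities  $P_d[i]$  sum to one over the N sample positions, the factor of N ensures that the calculated intervals average to  $T_{nom}$ .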
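The counting and scaling steps can be sketched in Python as shown below. This is a minimal sketch under stated assumptions: the function names and the example counts are hypothetical, a rising zero crossing is attributed to the interval between the two samples that bracket it, and the probabilities are scaled by the number of intervals so that the estimated intervals average to the nominal value.

```python
def count_rising_zero_crossings(waveform, counts):
    """Add one count to interval i when the waveform crosses zero upward between i and i+1."""
    for i in range(len(waveform) - 1):
        if waveform[i] < 0.0 <= waveform[i + 1]:
            counts[i] += 1

def intervals_from_counts(counts, t_nom):
    """Convert zero crossing counts to sample intervals (probability mass scaled by N * t_nom)."""
    total = sum(counts)
    n = len(counts)
    return [n * t_nom * c / total for c in counts]

# Hypothetical pedestal-subtracted record with two rising zero crossings.
demo_counts = [0, 0, 0]
count_rising_zero_crossings([-1.0, 1.0, -1.0, 0.5], demo_counts)

# Hypothetical accumulated counts with an odd/even pattern; t_nom = 500 ps.
intervals = intervals_from_counts([110, 90, 110, 90], 500.0)
```

In an actual measurement, the counts would be accumulated over a large number of randomly phased input waveforms before the intervals are computed.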

The practical realization of the zero crossing method is subject to statistical error since it relies on a finite set of data. The statistical error on the calculated sample intervals can be reduced by taking larger sets of zero crossing measurements. The following discusses the relationship between the number of waveforms in the data set and the error on the calculated sample intervals. The number of zero crossings counted in an interval is closely approximated by a Poisson distribution. From the Poisson distribution, the variance of the number of observed zero crossings in a given cell position is denoted by  $\sigma_{counts}^2$  and is given by equation 4.32.

$$\sigma_{counts}^2 = M * T_{samp} * F_{in} \tag{4.32}$$

Where M is the number of input waveforms,  $F_{in}$  is the frequency of the input sinusoid and  $T_{samp}$  is the sample interval of a sample cell position. The exact time duration of the sampling interval varies from the nominal value and it is not initially known. However, using the nominal sample interval  $T_{nom}$  to estimate the variance results in a close approximation. Equation 4.33 is derived by scaling  $\sigma_{counts}^2$  to get the variance of the sample interval, and applying that to the formula for the standard error. The resulting equation (equation 4.33) is the error on the sampling interval in units of seconds.

$$\text{Standard Error} = \frac{T_{samp}}{N} \sqrt{\frac{1}{M * T_{samp} * F_{in}}} \tag{4.33}$$

Where N refers to the number of sample cells in the SST and M is the number of input waveforms in the zero crossing data set. The sample intervals were calculated with 2 million events where the inputs were 427 MHz sinusoids. This yielded the sample intervals of the SST with a standard error within 0.73 ps.

#### 4.6 Simulated Annealing Method for Timing FPN Characterization

The simulated annealing timing characterization method is a heuristic method where brute force numerical simulation is used to characterize fixed pattern timing noise. The general concept behind the simulated annealing timing characterization method is that the inherent fixed pattern timing noise is found through optimizing a correction vector to minimize measured timing errors in a fixed set of data measured by the SST.

A timing calibration function was made to adjust the SST's readout with a correction vector that offsets the cell dependent fixed pattern timing noise. This timing calibration is used in the optimization function of the simulated annealing algorithm. The timing correction calibrations are discussed in further detail in section 4.7. The simulated annealing algorithm is applied in order to develop an optimized timing correction vector that minimizes the error in a set of timing measurements.

The principle behind the simulated annealing timing characterization is described in the following. If the applied timing calibration vector does not reflect the actual timing fixed pattern noise, then the fixed pattern timing noise increases the measured timing error. As the timing calibration approaches the system's actual fixed pattern timing noise, the magnitude of the residual fixed pattern timing noise is reduced, resulting in a reduction in the measured timing error. When the correction vector converges to the SST's true fixed pattern timing noise, the fixed pattern timing noise is effectively removed and the measured timing error reaches a minimum. Therefore, once the timing error is minimized, the corresponding correction vector is taken to be the true fixed pattern timing noise.

The simulated annealing algorithm is a probabilistic approach to finding the global minimum of a function, even if the function contains several local minima. It relies on systematically introducing varying amounts of noise to the input of a function to search for an optimal solution. Simulated annealing differs from other search algorithms because it is designed to occasionally accept less optimal solutions, allowing the algorithm to avoid being 'trapped' in a local minimum. The function that is being minimized is referred to as the cost function. The cost function is used to characterize the fixed pattern timing noise; it involves applying the timing correction calibrations, followed by a calculation of the timing error using the intra channel timing test. The amount of added noise and the probability of accepting a worse solution are determined by a function referred to as the temperature. The temperature decreases with each iteration and it approaches 0 as the number of runs approaches infinity.

The simulated annealing method used to characterize the fixed pattern timing noise in the SST is described with the following pseudocode:

- 1.  $T_{corr,new} = T_{corr,saved} + RMS_{noise} * temperature$
  - $T_{corr,new}$  is a new timing correction vector created by adding random noise to the saved correction vector (or to the initial state for the first iteration)
  - $RMS_{noise}$  is a constant noise parameter, so the power of the added noise drops as the temperature decreases.
- 2.  $F_{cost}(S, T_{corr,new}) = STD(test_{intra channel}(S, T_{corr,new}))$
  - Define the cost function (the function that is being optimized)
  - The intra channel timing test is performed on a data set S (which is being calibrated with the vector  $T_{corr,new}$ )
  - The cost function returns the standard deviation of the period measurements from the intra channel timing test.
- 3. IF  $F_{cost}(S, T_{corr,new}) < F_{cost,saved}$ , perform {  $F_{cost,saved} = F_{cost}(S, T_{corr,new})$ ;  $T_{corr,saved} = T_{corr,new}$  }
  - This accepts the new timing correction vector if it reduces the cost function
- 4. If the new timing vector is worse, still accept it with probability  $P_e * temperature$
  - There is a low probability of accepting a worse result, so that the algorithm is not stuck in a local minimum
  - $P_e$  is a fixed parameter, so the chance of accepting a worse result decreases with more runs
- 5. temperature =  $\alpha * temperature$
  - $\alpha$  is a constant that sets the temperature reduction rate ( $\alpha < 1$ )
  - This reduces the temperature after each iteration
- Loop the process until the rate of improvement in the cost function drops to a desired low level
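The steps above can be sketched in Python as follows. This is an illustrative sketch rather than the implementation used for the SST: the function names are hypothetical, the loop runs for a fixed number of iterations instead of monitoring the rate of improvement, and a toy cost function (the RMS difference from a known vector) stands in for the standard deviation returned by the intra channel timing test.

```python
import math
import random

def anneal(cost, t_corr_init, rms_noise, p_e, alpha, n_iter, seed=0):
    """Simulated annealing following steps 1 to 5, with a fixed iteration count."""
    rng = random.Random(seed)
    t_saved = list(t_corr_init)
    f_saved = cost(t_saved)
    temperature = 1.0
    for _ in range(n_iter):
        # Step 1: perturb the saved vector with noise scaled by the temperature.
        t_new = [t + rng.gauss(0.0, rms_noise * temperature) for t in t_saved]
        # Step 2: evaluate the cost function on the candidate vector.
        f_new = cost(t_new)
        # Steps 3 and 4: accept if better, or with probability p_e * temperature.
        if f_new < f_saved or rng.random() < p_e * temperature:
            t_saved, f_saved = t_new, f_new
        # Step 5: reduce the temperature.
        temperature *= alpha
    return t_saved, f_saved

# Toy stand-in for the cost: RMS distance to a hypothetical timing FPN (in ps).
TRUE_FPN = [30.0, -30.0, 30.0, -30.0]

def toy_cost(t_corr):
    return math.sqrt(sum((c - f) ** 2 for c, f in zip(t_corr, TRUE_FPN)) / len(TRUE_FPN))

initial_cost = toy_cost([0.0] * 4)
best_vector, final_cost = anneal(toy_cost, [0.0] * 4, 10.0, 0.0, 0.995, 2000)
```

With  $P_e$  set to 0, as in the usage lines, a candidate is only ever accepted when it lowers the cost, so the saved cost never increases.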

The input data set S contains sinewaves with several different frequencies to ensure that the simulated annealing algorithm does not optimize the performance for only one frequency, but reflects the performance over a range of frequencies. The intra channel timing test is selected as the cost function because it is responsive to changes in the timing calibration vectors, and it is less computationally intensive than the inter channel timing test. When performing the simulated annealing algorithm on the SST, the cost function improves monotonically, so the  $P_e$  coefficient is set to 0 to reduce simulation time. However, this is not the case in general; depending on the system,  $P_e$  may need to be nonzero to avoid converging to a local minimum instead of the global minimum.

#### 4.7 SST Timing Calibration

The deterministic sample interval error is fixed for a given SST chip and the error is quantifiable; this was discussed in more detail in a prior section (section 4.4). Since the error is stable and fixed by cell position, the deterministic sampling error can be considered a timing fixed pattern noise (timing FPN). A portion of the sample interval errors are centered on predictable, reproducible mean values, and these errors constitute the timing FPN. With knowledge of these predictable errors, calibrations can be performed to nullify the effect of the timing FPN on the recovered data. Removing the timing FPN lowers the signal noise power caused by timing errors, and improves the SST's performance.

The timing FPN calibration utilizes interpolation to adjust for the fixed timing errors. This timing calibration is applied to the SST readout after pedestal subtraction has been performed. Either the zero crossing method (section 4.5) or the simulated annealing method (section 4.6) is used to determine the SST's timing FPN. A timing calibration vector, denoted as  $TC_{corr}$ , is a time vector that reflects the cell dependent interval errors, and it is created by taking the cumulative sum of the fixed pattern sample intervals. This is expressed in the equation:

$$TC_{corr}[i] = \sum_{k=1}^{i} Timing \, FPN[k] \quad \text{for } i = 1 \text{ through } N-1 \tag{4.34}$$

A piecewise cubic interpolation is used to adjust the sampled voltage, effectively removing the timing FPN from the data. Sample voltages corresponding to a uniformly sampled input signal are generated using the interpolation function in MATLAB, with the  $TC_{corr}$  vector and the SST readout vector as function arguments. The resulting output is a waveform record that is uniformly timed and free from the timing FPN.
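The calibration can be sketched as a cumulative sum followed by resampling onto a uniform time grid. The Python sketch below is illustrative only: the function names and values are hypothetical, and plain linear interpolation is swapped in for the piecewise cubic interpolation used in the text so that the example needs no external libraries.

```python
from itertools import accumulate

def sample_times(intervals):
    """Cumulative sum of the per-cell sample intervals; the first sample is at t = 0."""
    return [0.0] + list(accumulate(intervals))

def resample_uniform(times, volts, t_nom):
    """Linearly interpolate the record onto the uniform grid k * t_nom."""
    out = []
    j = 0
    k = 0
    while k * t_nom <= times[-1]:
        t = k * t_nom
        while times[j + 1] < t:  # advance to the segment that contains t
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append(volts[j] + frac * (volts[j + 1] - volts[j]))
        k += 1
    return out

# Hypothetical odd/even interval pattern (in ps) around a 500 ps nominal interval.
times = sample_times([530.0, 470.0, 530.0, 470.0])
volts = [2.0 * t for t in times]  # a ramp, which linear interpolation recovers exactly
uniform = resample_uniform(times, volts, 500.0)
```

For signals with curvature, a shape-preserving cubic interpolant such as the one named in the text would be expected to track the waveform more closely than this linear stand-in.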



Figure 4.8: Histograms of calibrated and uncalibrated period measurements of 3 ns period sine waves.

The timing FPN calibration reduces the RMS value of the total timing error. This is confirmed by comparing the intra channel timing test results of the calibrated data with the uncalibrated data. When the intra channel timing test is performed on sinewaves with 3 ns periods, the uncalibrated data measured periods with a standard deviation of 8.66 ps (RMS). After performing the timing FPN calibration on the same data set, the intra channel test period error was reduced to 2.21 ps (RMS). This improvement in the intra channel timing is illustrated in figure 4.8. The histograms of the period measurements for the calibrated and uncalibrated data are superimposed over each other. The period distribution of the calibrated data has a lower variance, signifying an improvement in the timing resolution.

Similarly, the inter channel timing tests performed on bipolar pulses with 60 ns delays show that the calibrated data has less timing uncertainty than the uncalibrated data. Before the timing FPN calibration, the standard deviation of the delays was measured to be 4.29 ps (RMS). Calibrating for the timing FPN improves the delay error to 2.28 ps (RMS). The normalized histograms of the delay measurements are shown in figure 4.9.



Figure 4.9: Normalized histograms of calibrated and uncalibrated delays for 60ns cable delay inter channel timing measurements

The timing FPN calibrations also improve the SNR of the recovered signal. As mentioned in the section on sample interval error (section 4.4), the timing errors contribute to the noise power present on the output signal. Since the timing FPN calibration reduces the RMS sample interval error, the noise power is also reduced, thus improving the SNR. This is demonstrated by comparing the SNR of the calibrated data with that of the uncalibrated data. Figure 4.10 shows the FFT of a recovered 100 MHz sine wave before and after performing the timing FPN calibration. The timing FPN in the SST has a significant odd/even effect, which causes a spurious frequency component at 900 MHz. After calibrating for the timing FPN, the spurious frequency component is reduced by nearly 10 dB. The timing FPN calibration yielded an SNR improvement of 6.57 dB, equivalent to an increase of more than one bit in the output voltage resolution.
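The two figures quoted above can be checked with short arithmetic. The sketch below assumes the standard result that an odd/even (two-way alternating) timing error places an image of the input at half the sample rate minus the input frequency, and uses the usual 6.02 dB per bit conversion. The function names are illustrative only.

```python
def odd_even_spur_hz(sample_rate_hz, input_freq_hz):
    """Image frequency produced by an alternating odd/even timing error."""
    return sample_rate_hz / 2.0 - input_freq_hz


def snr_gain_to_bits(delta_snr_db):
    """Convert an SNR improvement in dB to an equivalent number of bits."""
    return delta_snr_db / 6.02


spur_hz = odd_even_spur_hz(2.0e9, 100e6)   # 2.0 GSPS sampling, 100 MHz input
bits_gained = snr_gain_to_bits(6.57)       # about 1.09 bits
```

With 2.0 GSPS sampling and a 100 MHz input, the image falls at 900 MHz, consistent with the spur location reported for figure 4.10.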



(b) With Timing FPN Calibrations

Figure 4.10: FFT of a recovered 100MHz sine wave

## Chapter 5

## SST Test Results

This chapter presents the SST's system verification results and its performance specifications. A description of the testing methodology and the measurement results are included in this chapter. The following specifications are discussed: sampling rates, voltage fixed pattern noise, analog bandwidth, sample voltage noise, dynamic performance, triggering sensitivity, and timing resolution. The final section of this chapter presents the results of scaling the SST design to a  $0.18\mu m$  technology and discusses the merits and demerits of fabricating the SST with more advanced fabrication processes.

## 5.1 System Verification

The waveform capture and readout functionality of the SST system is tested by recording a pure sinusoidal tone and recovering the signal. An RF signal generator is used to provide a 100 MHz sinusoidal input. A plot of the recovered signal is shown in figure 5.1. The SST sampled the input at a rate of 2.0 GSPS and the analog output was read out at a rate of 1.0 MHz. To reconstruct the signal in figure 5.1, post processing was performed with MATLAB; this includes removing the voltage fixed pattern noise, removing the DC bias voltage, converting the ADC output codes to a voltage scale, and performing the timing fixed pattern noise calibrations.



Figure 5.1: SST readout of a recorded 100 MHz sinewave

The SST chip consumes 128 mW of power when it is sampling at 2.0 GSPS. The SST's sampling phase is its most power intensive operation. Holding the record or performing the waveform readout requires much less power since these operations do not require the high speed clock circuitry. With four SST channels per device, the SST realizes a low power operation of 32 mW per channel.

The low frequency signal gain of the SST chip is measured to be 0.70 V/V. This measurement was performed by dividing the amplitude of the SST's analog output by the amplitude of the input signal. The low frequency gain was also verified by taking the derivative of the SST's voltage transfer characteristic (DC sweep).

The SST system is also verified using a neutrino template signal, which is generated from a programmable waveform generator. The recovered neutrino template is shown in figure 5.2. The neutrino template signal was also sampled at 2.0 GSPS and post processed with MATLAB.



Figure 5.2: SST readout of captured neutrino template signal

A notable feature of the SST chip is its extremely wide range of operational sampling rates, which are set simply by changing the frequency of the LVDS input clock. The SST has been verified to operate across six orders of magnitude in sampling rate, demonstrating waveform captures at rates from as low as 2.0 kHz to as high as 2.0 GHz.

## 5.2 Voltage Fixed Pattern Noise

The SST chip experiences voltage fixed pattern noise (VFPN), where the analog readout voltage from each cell has a fixed, random offset voltage. The cause of the VFPN is process variation in the analog readout/multiplexing circuitry. Variations in the transistor dimensions and variations in the threshold voltages create differing  $V_{gs}$  voltages for each cell's readout circuitry (the modified source follower circuit). The differences among the  $V_{gs}$ voltages appear on the SST readout voltages as VFPN. The VFPN vector is measured by recording and digitizing a DC bias voltage repeatedly, and averaging each sample cell's offset voltage. The VFPN measurement (which includes a DC offset) is plotted in figure 5.3.



Figure 5.3: Voltage fixed pattern noise measurement (with DC offset)

The distribution of the VFPN is presented in figure 5.4. The standard deviation of the VFPN across all four channels of an SST chip is measured to be 6.3 mV (RMS). The presence of VFPN contributes additive voltage noise to the SST's analog output. This is problematic because it degrades the SST signal to noise ratio and lowers its effective bit resolution.

Figure 5.4: Voltage fixed pattern noise distribution

Fortunately, the VFPN can be calibrated out of the recovered waveform with post processing of the readout voltages. The VFPN is fixed for each channel in the SST chip. Therefore, once the VFPN is characterized, its effect on the signal is known and the VFPN can be removed. To characterize the VFPN vector, about 10<sup>4</sup> baseline voltage measurements are taken and averaged to filter out stochastic noise. The resulting vector is the VFPN correction vector, and it represents the fixed, cell dependent offset voltages present in every analog readout. To correct for the VFPN, a pedestal subtraction is performed where the VFPN correction vector is subtracted from the SST readout vector. This effectively removes the DC bias voltage and the VFPN from the SST's analog output.
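A minimal sketch of the pedestal subtraction is given below. It is written in Python for illustration and is not the processing code used for the measurements; the function names and the small three-record, four-cell data set are hypothetical, whereas the actual characterization averages on the order of 10<sup>4</sup> baseline records.

```python
def pedestal_vector(baseline_records):
    """Average many baseline records cell by cell to form the correction vector."""
    n_records = len(baseline_records)
    n_cells = len(baseline_records[0])
    return [sum(record[cell] for record in baseline_records) / n_records
            for cell in range(n_cells)]


def subtract_pedestal(readout, pedestal):
    """Remove the per-cell offset (DC bias plus VFPN) from one readout."""
    return [v - p for v, p in zip(readout, pedestal)]


# Hypothetical baseline records (volts): three records of four cells.
baselines = [
    [0.900, 0.906, 0.894, 0.903],
    [0.902, 0.904, 0.896, 0.901],
    [0.901, 0.905, 0.895, 0.902],
]
pedestal = pedestal_vector(baselines)

# Hypothetical readout: the same offsets plus a 0.1 V signal on every cell.
readout = [1.001, 1.005, 0.995, 1.002]
corrected = subtract_pedestal(readout, pedestal)
```

In this example every corrected sample is approximately 0.1 V, showing that the cell dependent offsets and the common DC bias are removed together.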

## 5.3 Bandwidth Measurement

The analog -3 dB bandwidth of the SST chip is measured to be 1.52 GHz. A plot of the normalized output amplitude versus frequency is presented in figure 5.5. The y-axis in figure 5.5 is expressed in decibels. The plot is normalized to the amplitude of the 100 MHz input (the lowest frequency sinewave in this data set) so that the plot begins at 0 dB. The frequency response of the SST is reasonably flat up to 1.3 GHz. Although there is minor frequency peaking at 1.1 GHz, the frequency response does not deviate by more than  $\pm 0.5$  dB across the Nyquist frequency band (assuming operation at the nominal 2.0 GSPS rate).

Figure 5.5: Normalized output amplitude versus frequency plot

The data plotted in figure 5.5 was generated by sweeping the frequency of a sinewave produced with an RF signal generator, then recording the waveform, reading out the captured waveform, and measuring the peak to peak amplitude of the readout signal. The input sinewaves were large signal waveforms that span the SST's linear input voltage range. The input sinewaves were centered on 0.9 V, set by the bias tee voltage, and had maximum and minimum voltages of 1.6 V and 0.2 V, respectively.

## 5.4 DC Performance Measurements

The voltage transfer characteristic (VTC) of the SST was measured by manually sweeping the SST input voltage with a DC voltage generator and measuring the analog readout voltage. Each DC output voltage was taken as the average of hundreds of trials, where the voltages across all the sample cells were averaged; the averaging was necessary to remove the VFPN while preserving the output DC voltage. The result of the DC sweep is shown in figure 5.6.



Figure 5.6: DC voltage transfer characteristic of the SST

The output voltage has a very linear relationship with the DC input voltages when the inputs are below 1.6 V. The SST's VTC is modeled using a linear regression line; this line models the SST's input versus output relationship for DC signals and it is given by the equation:

$$V_{out} = 0.7036 * V_{in} + 0.36 \tag{5.1}$$

The regression line in figure 5.6 is the plot of the modeled VTC using equation 5.1. The  $r^2$  value of the data is 0.9995, indicating that equation 5.1 is an excellent fit for the data, and that the DC input and the DC output share a highly linear relationship. The derivative of equation 5.1 gives the DC voltage gain of the SST chip, which is 0.7036 V/V. The SST's DC output nonlinearity is examined by plotting the error relative to the linear fit. Figure 5.7 plots the voltage error between the SST DC output and the value predicted by the regression equation 5.1 at each DC input voltage; the error is plotted in units of millivolts.
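The regression and the linear error calculation can be sketched as follows. This is an ordinary least squares fit written in plain Python for illustration; the input voltages are hypothetical and the output voltages are generated from equation 5.1 itself, so the example demonstrates the procedure rather than reproducing the measured data.

```python
def linear_fit(x, y):
    """Ordinary least squares line fit; returns slope, intercept and r^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical DC sweep points (volts), generated from equation 5.1.
v_in = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]
v_out = [0.7036 * v + 0.36 for v in v_in]

gain, offset, r_squared = linear_fit(v_in, v_out)

# Linear error in millivolts at each input voltage (as plotted in figure 5.7).
error_mv = [1000.0 * (vo - (gain * vi + offset)) for vi, vo in zip(v_in, v_out)]
```

Applied to measured sweep data, the slope is the DC gain and the residual vector is the nonlinearity plotted against input voltage.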

Figure 5.7: Linear error on DC output voltage

The practical input voltage range of the SST is determined to be between 0.062 V and 1.6 V. Low frequency input signals that stay within this voltage range are recorded by the SST with very little nonlinear distortion. Driving the input voltage above 1.6 V will noticeably compress the signal, resulting in nonlinear distortions that clip the peak voltages of the waveform. The SST can process DC inputs up to 1.9 V, but there is greater than 1% nonlinearity for inputs above 1.7 V. The analog output voltage ranges between 0.43 V and 1.49 V when the input voltages are between 0.062 V and 1.6 V; this results in a 1.06 V output voltage range.

## 5.5 Sampled Noise Measurements

The output referred noise of the SST chip is measured directly using histogram techniques. Several thousand DC baseline voltages were recorded across all four input channels. The VFPN and the DC bias voltages were removed using pedestal subtractions; this results in a signal that is the output voltage noise. The normalized distribution of the voltage noise is plotted in figure 5.8. The dashed line indicates an ideal Gaussian distribution fitted to the noise data. It is worth noting that the bins in figure 5.8 do not strictly correspond to the quantized values of a 12-bit ADC. This is because the post processing calibrations adjust the output voltages using correction vectors that contain averaged ADC output voltages. Therefore, the corrected SST output voltages do not strictly correspond to the 12-bit quantization values.



Figure 5.8: Distribution of the SST output noise

The measured output noise is 0.34 mV (RMS), which corresponds to 0.56 LSB. These noise measurements are unaffected by sampling jitter because the SST is sampling a fixed DC voltage. The sources of the measured output noise are the capacitor thermal noise (kT/C), the ADC quantization noise, and the noise inherent in the analog readout circuitry. Dividing the output noise by the measured low frequency SST voltage gain of 0.7 V/V, the noise referred to the input of the SST is 0.4857 mV (RMS).

## 5.6 Dynamic Signal Performance

In general, measuring a circuit's performance with transient signals reveals circuit properties which are not measured in the DC characterization. To characterize the dynamic performance of a system, its noise contributions and its distortions are measured using fast Fourier transform (FFT) techniques. The effective number of bits (ENOB) can be derived from the FFT. The ENOB signifies the number of bits to which the output can be converted while remaining unaffected by the system's noise or distortion. The ENOB is a widely used metric that conveys the accuracy and fidelity of a system's output. To test the SST's ENOB, a 100 MHz, 1.0 Vpp sinewave (produced by an RF signal generator) is recorded, read out, and digitized by the SST data acquisition system. Using MATLAB, the voltage FPN corrections and timing FPN corrections are applied, and an FFT is applied to the corrected signal. The resulting plot of the waveform's frequency components is shown in figure 5.9.



Figure 5.9: FFT of an SST readout of a 100 MHz sinewave

The SINAD is calculated from the FFT, and the ENOB of the SST is then calculated from the SINAD. The SINAD is the measured ratio of the root mean square (RMS) value of the input signal to the RMS value of the spectral noise and distortion components (excluding the DC component). The ENOB is given by the equation:

$$ENOB = \frac{SINAD - 1.76 + 20 \log\left(\frac{V_{fs}}{V_{in}}\right)}{6.02} \tag{5.2}$$

In equation 5.2, the SINAD is expressed in dBc,  $V_{fs}$  is the full scale voltage of the ADC, and  $V_{in}$  is the amplitude of the input sinewave. The  $20 \log\left(\frac{V_{fs}}{V_{in}}\right)$  term in the numerator of equation 5.2 is a correction factor that scales the ENOB value when the input amplitude is not full scale; if the input is at full scale, this term equals zero.
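Equation 5.2 translates directly into a short function. The sketch below assumes a base-10 logarithm, as is conventional for decibel quantities; the SINAD and voltage values in the example are hypothetical and are chosen only to show the effect of the amplitude correction term.

```python
import math


def enob_bits(sinad_db, v_fs, v_in):
    """Effective number of bits per equation 5.2 (base-10 logarithm assumed)."""
    return (sinad_db - 1.76 + 20.0 * math.log10(v_fs / v_in)) / 6.02


# Hypothetical values: a 61.96 dB SINAD with a full scale input gives 10 bits.
enob_full_scale = enob_bits(61.96, v_fs=2.0, v_in=2.0)

# The same SINAD measured with a half scale input is credited about one more bit.
enob_half_scale = enob_bits(61.96, v_fs=2.0, v_in=1.0)
```

The correction term is zero when the two voltages are equal and adds roughly 6.02 dB, or one bit, for each halving of the input amplitude relative to full scale.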

The measured ENOB for the SST system is 6.52 bits. As seen in figure 5.9, there are significant harmonic spectral components; the largest spur occurs at the second harmonic. The large second harmonic component dominates the ENOB performance. This reveals that the SST's dynamic bit resolution is limited by the system's inherent nonlinear distortion and not by the system noise. Several factors contribute to the nonlinearity, including cell dependent gain variation, timing jitter, and the nonlinearity of the ADC on the system board. However, the dominant contributor to the nonlinearity is the varying switch resistance of the sample and hold transmission gates. This conclusion is reached by eliminating the other candidates. The cell dependent gain variation is ruled out as a significant source of nonlinear distortion because calibrations were made to negate it, but they did not result in any appreciable improvement in the linearity. The magnitude of the timing jitter, which was measured with the previously mentioned timing analysis methods, is too small to account for the amount of distortion seen in the FFT. Finally, the total harmonic distortion (THD) and SINAD specifications for the ADC on the system board are far better than the measured distortion. This indicates that the ADC is not responsible for the large distortion terms seen in figure 5.9. That leaves the nonlinearity inherent in the sample and hold switch as the greatest source of distortion in the SST data acquisition system.

As discussed in section 3.2, input signals with large amplitudes cause the small signal switch resistance in the sampling cells to shift throughout the tracking phase. The switch resistance depends on the input voltage, so the switch does not behave as a linear resistor. As the input signal spans the entire input voltage range, the small signal operating point resistance experiences wide variations. The operating point resistance deviates by as much as 107% from its resistance at the nominal bias point. The switch resistance nonlinearity is most pronounced with full scale inputs, and it is most evident when the input signal is near the upper limit of the voltage range. This nonlinearity results in the harmonic spurs which dominate the ENOB performance of the SST.

Another important measure of the SST's dynamic performance is its signal to noise ratio (SNR). The SNR is defined as the ratio of the RMS value of the fundamental signal to the RMS noise, and it is calculated from the FFT. The SNR calculation excludes the harmonic terms and only reports on the noise terms; therefore, the SNR will always be better than the SINAD. The SNR measurement includes noise from the entire Nyquist band. After accounting for the FFT process gain, the SST system has an SNR of 64.96 dBc. This noise measurement accounts for all the noise inherent in the SST system, including the kT/C noise (the largest SST noise source) and the ADC quantization noise. Based only on the noise performance, the system has an equivalent bit resolution of 10.5 bits.
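The equivalent bit resolution follows from applying the noise-only form of equation 5.2 to the measured SNR. The sketch below also includes the standard expression for FFT process gain, 10 log<sub>10</sub>(M/2) for an M point FFT; the record length of 256 points used in the example is an assumption for illustration and is not stated in this section.

```python
import math


def snr_to_bits(snr_db):
    """Equivalent resolution from SNR alone (equation 5.2 without distortion)."""
    return (snr_db - 1.76) / 6.02


def fft_process_gain_db(n_points):
    """Amount by which an M point FFT lowers the per-bin noise floor."""
    return 10.0 * math.log10(n_points / 2.0)


noise_limited_bits = snr_to_bits(64.96)          # about 10.5 bits
process_gain_db = fft_process_gain_db(256)       # hypothetical 256 point record
```

The process gain must be removed from the apparent signal to noise floor distance in the FFT plot before the SNR is quoted, since the per-bin floor sits below the total integrated noise.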

## 5.7 Event Triggering Testing

The real time event triggering capability of the SST chip was tested by examining the functionality of the high speed comparators and the logic circuitry. The SST's capability to trigger on an event was tested using every trigger logic setting in conjunction with various combinations of high and low threshold voltages. The SST threshold values are set by programming the MBED micro-controller on the SST system board. Digital to analog converters (DACs) provide the appropriate voltages to the SST threshold voltage pins. Each of the four SST channels is equipped with both a high and a low threshold input, resulting in a total of eight threshold inputs per chip. Each channel has its own set of comparator outputs. The SST event triggering has been verified for threshold voltages ranging between 0.1 V and 1.6 V.

The SST triggering has two modes (an OR logic and an AND logic) and both of these modes were verified to be functional. The OR logic triggered any time either a high or a low threshold crossing occurred. The functionality of the OR logic was tested using unipolar pulses that crossed only the high or only the low threshold. When set to the AND logic mode, triggers successfully occurred if and only if both the high and low threshold crossings occurred within the coincidence window. By adjusting the external DC bias voltage, it was verified that the coincidence window can be set to any duration between 10.0 ns and  $1.0 \,\mu s$ . The AND logic testing inputs consisted of positive and negative pulses separated by variable delays; these signals were generated with an arbitrary waveform generator.
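The two trigger modes can be summarized with a small behavioral model. This sketch is an assumption-laden illustration and not a description of the on-chip circuit: it represents threshold crossings as lists of time stamps and treats a coincidence as a high and a low crossing whose time difference does not exceed the window, in either order. All names are hypothetical.

```python
def trigger_fires(high_crossings_ns, low_crossings_ns, mode, window_ns=0.0):
    """Behavioral model of the OR and AND trigger modes.

    Assumes a coincidence is any high/low crossing pair separated by no more
    than window_ns; the actual circuit behavior may differ in detail.
    """
    if mode == "OR":
        return bool(high_crossings_ns) or bool(low_crossings_ns)
    if mode == "AND":
        return any(abs(h - l) <= window_ns
                   for h in high_crossings_ns
                   for l in low_crossings_ns)
    raise ValueError("mode must be 'OR' or 'AND'")


# Unipolar pulse: only the high threshold is crossed.
or_unipolar = trigger_fires([100.0], [], "OR")
and_unipolar = trigger_fires([100.0], [], "AND", window_ns=10.0)

# Bipolar pulses: crossings 8 ns apart and 50 ns apart, with a 10 ns window.
and_close = trigger_fires([100.0], [108.0], "AND", window_ns=10.0)
and_far = trigger_fires([100.0], [150.0], "AND", window_ns=10.0)
```

The example mirrors the tests described above: a unipolar pulse fires only the OR logic, while the AND logic fires for a bipolar pulse only when both crossings fall within the coincidence window.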

The sensitivity of the comparators was tested with transient pulses produced by a high speed pulser. An oscilloscope screen shot of the sensitivity test is included in figure 5.10. The waveform on the top of figure 5.10 is the input signal, which was probed at the SST motherboard input pin (before the bias tee). The waveform on the bottom of figure 5.10 is the trigger output, probed at the trigger output pin of the SST chip.



Figure 5.10: Comparator sensitivity test waveforms

The SST triggering successfully detected input pulses with a height as low as 9 mV and a pulse width of 500 ps FWHM. The bias tee on the SST system board provides AC coupling for the pulser signal. For the sensitivity test, the pulser signal was connected to the SST system board through an attenuator to reduce the amplitude of the input pulse. The attenuation was gradually increased until the SST failed to trigger consistently. The SST triggering detected events with a near 100 percent accuracy rate for inputs with a duration of at least 500 ps FWHM and a minimum height of 9 mVpp.

## 5.8 Measured Fixed Pattern Sample Interval Error

The mean sample interval for each cell was measured using the zero crossing method (discussed in section 4.5) and the simulated annealing method (discussed in section 4.6), with the SST set to its nominal 2.0 GHz sample rate. The zero crossing method and the simulated annealing method produce timing measurements that are in reasonably close agreement. The interval measurement using the zero crossing method is plotted in figure 5.11. Figure 5.11 reveals that each sampling interval deviates from the nominal 500 ps sample interval by a nonzero error. These recurring errors account for the timing fixed pattern noise (timing FPN) and are effectively removed with the timing calibrations discussed in section 4.7. The fixed pattern sample interval error is 30.95 ps (RMS).

The timing FPN has two distinct features. The first is a clearly alternating pattern among the sample intervals; this will be referred to as the odd/even timing FPN in the remainder of the text. Separating the cell intervals into odd and even positions, the mean sample interval of the odd positions is 529.37 ps, while the mean sample interval of the even positions is 470.62 ps. This reveals a bimodal distribution in the SST's timing FPN. The bimodal distribution is further confirmed by the histogram of the sample intervals presented in figure 5.12.
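The odd/even statistics can be computed as sketched below. The example uses an idealized interval vector that simply alternates between the two reported means, and it assumes that cell positions are numbered from one so that the first interval is an odd position; both are assumptions for illustration.

```python
import math


def odd_even_means(intervals_ps):
    """Mean interval of the odd and even cell positions (1-based numbering)."""
    odd = intervals_ps[0::2]
    even = intervals_ps[1::2]
    return sum(odd) / len(odd), sum(even) / len(even)


def rms_interval_error(intervals_ps, nominal_ps):
    """RMS deviation of the sample intervals from the nominal interval."""
    return math.sqrt(sum((x - nominal_ps) ** 2 for x in intervals_ps)
                     / len(intervals_ps))


# Idealized record: intervals alternate between the two reported means.
intervals = [529.37, 470.62] * 4
odd_mean, even_mean = odd_even_means(intervals)
rms_error = rms_interval_error(intervals, 500.0)   # about 29.4 ps
```

A purely alternating pattern with these means gives an RMS error of about 29.4 ps, which indicates that the odd/even effect accounts for most of the measured 30.95 ps (RMS) fixed pattern error.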

The second notable feature of the timing FPN (figure 5.12) is a pronounced spike in the timing error on sample positions 251 and 252. The large timing fluctuations occur in these positions because they are connected to the feedback node in the fast shift register. The fast shift register implements continuous signal sampling by feeding the sampling pointer back to the beginning position of the sampling array. The circuitry that feeds back the sampling pointer introduces additional capacitive loading on the nodes it is connected to. This uneven loading introduces additional timing delays at these sampling positions. The timing spike can be effectively eliminated by equalizing the capacitive load on all stages of the fast shift register. Therefore, in future SST iterations, careful equalization of the capacitive loads will prevent the occurrence of the outlying spike in the timing FPN.

Figure 5.11: The fixed pattern sample intervals plotted versus sample cell position

Figure 5.12: Histogram of the fixed pattern sampling intervals

A possible cause of the odd/even timing FPN is the dual edge, synchronous sampling clock generation. One grouping of the sampling intervals closely tracks the high clock phase (the duration for which the clock is high) while the other grouping tracks the low clock phase (the duration for which the clock is low), making the sample clock susceptible to errors from deviations in the clock duty cycle. Any deviation from a 50% high, 50% low clock signal would appear as a difference between the odd and even sample intervals. There are a few likely causes of an uneven duty cycle in the SST's internal clock. One culprit is the oscillator itself; if the oscillator produces uneven duty cycles, that would be translated into periodic sampling jitter. Another possible contributor to the odd/even timing FPN is mismatch in the LVDS receiver. Uneven rise and fall times in the LVDS output would translate into variations between the high and low durations, resulting in a skew in the duty cycle.

While the theory states that the presence of duty cycle distortion will cause odd/even timing FPN, simulations and testing results indicate that the duty cycle is not the source of the odd/even effect measured in the SST. The LVDS reference clock input to the SST was observed directly using a high frequency oscilloscope with a high speed differential probe. The measured duty cycle of the LVDS reference clock was within 0.5% of 50.0%. Simulations of the LVDS receiver also showed that, under proper bias conditions, the LVDS receiver output had virtually no duty cycle distortion. From these observations, the duty cycle variation in the SST system cannot account for the amount of measured odd/even timing FPN.
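A rough estimate shows why the measured duty cycle is too close to ideal to explain the effect. The sketch below rests on two assumptions that are not stated explicitly in this section: that with dual edge sampling the clock period equals two sample intervals (1000 ps at 2.0 GSPS), and that "within 0.5% of 50.0%" corresponds to a worst case duty cycle of 50.5%.

```python
def intervals_from_duty_cycle(clock_period_ps, duty_cycle):
    """High phase and low phase durations, taken as the two sample intervals."""
    high_ps = clock_period_ps * duty_cycle
    low_ps = clock_period_ps - high_ps
    return high_ps, low_ps


# Assumed worst case: 50.5% duty cycle on a 1000 ps clock period.
high_ps, low_ps = intervals_from_duty_cycle(1000.0, 0.505)
duty_error_ps = high_ps - 500.0            # about 5 ps

# Measured odd/even deviation from the nominal 500 ps interval.
measured_error_ps = 529.37 - 500.0         # about 29.4 ps
```

Under these assumptions the duty cycle contributes at most about 5 ps of odd/even deviation, several times smaller than the measured deviation of roughly 29 ps.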

The best explanation for the odd/even timing FPN is that differences in the clock paths create timing variations between the odd and the even sample intervals. The dynamic D flip-flop shown in figure 5.13 is a segment of the fast shift register, which synchronously shifts the sampling pointer to generate the sampling clock signals. The  $\phi_N$  and  $\phi_{N+1}$  signals are the sampling clock signals that go to an even and an odd sampling cell, respectively. The  $\phi_N$  signal passes through one less delay stage than the  $\phi_{N+1}$  signal. Therefore, the odd sample edges experience an additional fixed delay compared to the even sample edges, causing the odd/even timing FPN. In practice, the inverter stages in figure 5.13 consist of several tapered inverters in a chain in order to drive medium to large capacitive loads. Based on circuit simulations, the delays from these inverter chains are on the order of several tens of picoseconds. When the simulated clock path delay discrepancies are used to account for the timing FPN, the simulation matches the measured timing FPN to within 20%. The clock path mismatch is therefore believed to be the actual cause of the odd/even timing FPN.



Figure 5.13: Fast shift register sampling clock generation

## 5.9 SST Timing Resolution Measurements

The limits of the SST's timing accuracy were measured using both the intra channel and the inter channel timing tests. The details of the timing tests were explained in chapter 4. This section presents the results of these timing tests, discusses their significance, and briefly compares the timing performance of the SST with that of prior SCA recorders.

The intra channel test, which measures the period variations among recorded sinusoids, depends on the input frequency and amplitude. The intra channel timing test is performed for several different node frequencies. The inputs to the intra channel timing tests are sinusoids with an amplitude of 1.6 Vpp. This is the largest input amplitude that does not cause significant distortion in the SST waveform record. The standard deviations of the intra channel timing test's period measurements are shown in table 5.1. Table 5.1 contains the intra channel test results for the uncalibrated measurements, the timing results with the simulated annealing timing calibrations, and the timing results with the zero crossing method timing calibrations. A scatter plot of the table 5.1 data is presented in figure 5.14.

| Frequency   | No Timing Corrections | Simulated Annealing | Zero Crossing Method |
|-------------|-----------------------|---------------------|----------------------|
| 100 MHz     | 9.64 ps               | 5.826 ps            | 5.768 ps             |
| 125 MHz     | 9.238 ps              | 4.028 ps            | 3.955 ps             |
| 200 MHz     | 8.997 ps              | 3.061 ps            | 3.034 ps             |
| 333.333 MHz | 8.69 ps               | 2.3 ps              | 2.21 ps              |

Table 5.1: Intra channel timing test period error for various node frequencies (in units of ps RMS)



Figure 5.14: Scatter plot of intra channel timing test results for various node frequencies and timing calibrations

The intra channel timing test reveals that the SST accurately captures timing features on a single channel with an error of 2.21 ps (RMS). It should be noted that this resolution is achieved under favorable conditions, where the input amplitude is chosen to maximize the SNR and the input is set to the relatively high frequency of 333.333 MHz. If the intra channel input were at a lower node frequency or had an amplitude lower than 1.6 Vpp, the period measurement resolution would not be as high. As expected, the calibrated data shows an improvement in the timing variations at each of the tested node frequencies. The improvement in the intra channel timing resolution is evidence of the timing calibration's effectiveness at reducing the timing FPN. Comparing the outcomes of the zero crossing method with those of the simulated annealing method shows that the timing results are in very close agreement. This result is also to be expected; although the two calibrations were derived using different methods, they both address the same inherent SST timing FPN. Figure 5.14 shows that the intra channel timing test has a dependence on the input frequency. With all other factors fixed, a higher frequency input reduces the impact of the voltage noise on the linear interpolations used in the intra channel period measurements, resulting in an improved intra channel timing resolution.
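The frequency dependence can be illustrated with the first order relation between voltage noise and the timing uncertainty of a zero crossing: the time error is the voltage noise divided by the signal slope, which for a sinusoid of amplitude A and frequency f is 2πfA at the crossing. The sketch below is an illustration of that scaling only and is not a model of the values in table 5.1; it uses the input referred noise of 0.4857 mV (RMS) and a 0.8 V amplitude (1.6 Vpp), and it ignores the timing FPN, the sampling jitter, and the interpolation between samples.

```python
import math


def crossing_time_noise_ps(noise_v_rms, amplitude_v, freq_hz):
    """First order zero crossing time noise: voltage noise over signal slope."""
    slope_v_per_s = 2.0 * math.pi * freq_hz * amplitude_v
    return noise_v_rms / slope_v_per_s * 1e12


# Input referred noise and a 1.6 Vpp (0.8 V amplitude) sinusoid.
noise_100mhz_ps = crossing_time_noise_ps(0.4857e-3, 0.8, 100e6)
noise_333mhz_ps = crossing_time_noise_ps(0.4857e-3, 0.8, 333.333e6)
ratio = noise_100mhz_ps / noise_333mhz_ps
```

The voltage noise contribution falls in inverse proportion to the input frequency, roughly 1 ps at 100 MHz against 0.3 ps at 333.333 MHz in this simplified estimate, which is consistent with the trend seen in figure 5.14.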

To provide a reference for the SST's timing performance, the timing resolutions of prior SCA recorders are discussed. The DRS4 chip developed by Professor Ritt reports a period measurement timing resolution of 3.1 ps (RMS) using 100 MHz, 1.1 Vpp sinusoids. Ritt's team used a different method of timing corrections, which they developed and describe in reference [16]. Caution must be taken when making comparisons between devices since the two circuits were tested under different conditions and parameters. The SST was tested with a more favorable configuration, using larger amplitude sinusoids and higher frequency inputs than the DRS4.

The inter channel timing test measures the resolution of the SCA recorder's time delay measurements made between multiple channels. The accuracy of the reconstruction of a detected particle's path is dependent on the accuracy of the measured signal arrival times; this makes the inter channel timing resolution an important specification for using the SST in particle physics experiments. Generally, the inter channel timing test is of more interest to physicists than the intra channel test. The timing measurements made in the inter channel timing test depend on the input's amplitude, its pulse width, and the device's sample rate. The SST's inter channel measurements are tested using bipolar pulses with an amplitude of 1.2 Vpp and pulse widths that are 20 samples wide (n=20 corresponds to  $T_{pulse}$ = 10 ns when the SST is sampling at 2.0 GSPS). The amplitude of the signal coming out of the RF generator is 1.6 Vpp, but the signal splitter attenuates the signal and lowers its amplitude at the SST input to 1.2 Vpp. The inter channel tests are repeated for several different pulse delays. The resulting standard deviations of the delay measurements are shown in table 5.2. Table 5.2 contains the inter channel test results of the uncalibrated measurements, the timing results with the zero crossing method timing calibrations, and the timing results with the simulated annealing timing calibrations. The data in table 5.2 is plotted in figure 5.15.

| Delay   | No Timing Corrections | Zero Crossing Method | Simulated Annealing |
|---------|-----------------------|----------------------|---------------------|
| 0ns     | 1.15ps                | 1.15ps               | 1.15ps              |
| 3.79ns  | 2.68ps                | 2.35ps               | 2.26ps              |
| 12.33ns | 3.62ps                | 2.32ps               | 2.29ps              |
| 24.42ns | 3.59ps                | 2.33ps               | 2.29ps              |
| 60.74ns | 4.29ps                | 2.36ps               | 2.36ps              |

Table 5.2: Inter channel timing test delay uncertainty for various delays (in units of ps RMS)

Figure 5.15: Inter-channel timing test

An observation made from the delay resolution plot in figure 5.15 is that the 0 ns delay measurements have the lowest delay error. This minimum is explained as follows. With the 0 ns delay applied, the input pulses are aligned on both channels, so both channels capture the pulses on the same sample positions. The four SST channels share the same sampling clock signals, so each sampling cell position experiences the same sample interval error on all the channels. For example, the sample interval error on cell 1 of channel 0 is the same as the sample interval error on cell 1 of channel 1. Therefore, the 0 ns delay pulses have identical timing errors, since the input pulses are captured with the same cell positions. Since the timing errors match for the 0 ns delays, each channel's timing error relative to the other is effectively zero even though there are nonzero errors with respect to the ideal interval. With no relative timing errors, the cross correlation delay measurements for the 0 ns delays have essentially no timing jitter and thus exhibit the lowest error. On the other hand, the nonzero delayed pulses are sampled across different cell positions in the two channels, resulting in different sample interval errors across the pulses in the two channels. In this nonzero delay scenario, the differences among the sample interval errors appear as timing noise when the signals are cross correlated to measure the delay. This increases the amount of noise in the nonzero delay measurements and causes larger errors in the delay measurements.
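The cancellation at 0 ns delay can be illustrated with a toy model. The sketch below assumes that both channels share one vector of fixed interval errors, as described above, and compares the accumulated timing error at the cell where a feature lands on each channel. The alternating error values and the function names are hypothetical.

```python
from itertools import accumulate


def cumulative_time_errors(interval_errors_ps):
    """Accumulated timing error at each cell, with the first cell at zero."""
    return [0.0] + list(accumulate(interval_errors_ps))


def relative_timing_error(time_errors_ps, cell_a, cell_b):
    """Timing error of channel B's cell relative to channel A's cell."""
    return time_errors_ps[cell_b] - time_errors_ps[cell_a]


# Hypothetical shared interval errors with an odd/even pattern (ps).
interval_errors = [29.4, -29.4, 29.4, -29.4, 29.4, -29.4]
time_errors = cumulative_time_errors(interval_errors)

# 0 ns delay: the feature lands on the same cell in both channels.
error_same_cell = relative_timing_error(time_errors, 2, 2)

# Nonzero delay: the feature lands one cell later on the second channel.
error_shifted = relative_timing_error(time_errors, 2, 3)
```

When the feature is captured on the same cell in both channels the shared error cancels exactly, whereas a shift of even one cell exposes the full interval error of about 29 ps before calibration.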

For the 0 ns delay, the SST achieves its best resolution of 1.15 ps (RMS). With timing corrections, the inter channel resolution remains practically flat at 2.3 ps (RMS) across all nonzero delays up to 60 ns. The flat delay variation indicates that there is no significant drift in the sample interval errors across the sampling array, and that the interval errors remain centered about a zero mean. Even without the timing corrections, the SST still displayed reasonably high timing accuracy. The uncalibrated measurements maintained a variation of 4.29 ps (RMS) or below across the measured range of delays.

The timing delay resolutions achieved by prior SCA recorders are presented here to provide reference points for the SST's timing performance. Professor Varner's PSEC4 recorder achieves a 10.9 ps resolution for 0 ns delays without any calibration, improving to 4.3 ps after timing calibration. Tested with 16 ns delays, the PSEC4 has a resolution larger than 100 ps, which improves to 8.8 ps after timing corrections [38]. These resolution measurements were performed with a sampling rate of 10.4 GSPS.

Professor Ritt's DRS4 chip realizes timing resolutions between 0.75 ps and 1.65 ps for delay measurements between 0 ns and 50 ns [16]. These resolutions are achieved using timing calibrations and with a sampling rate of 5.0 GSPS.

Caution must be taken when comparing the delay resolution (inter channel timing test) of these devices. The SST, the PSEC4 and the DRS4 recorders were all tested under different conditions and with different parameters. The reported timing resolutions were measured with different pulse shapes, pulse durations and pulse amplitudes, all of which affect the resolution measurements. Additionally, the other research teams developed their own timing calibrations for use on their devices, and those calibrations differ from the method used on the SST.

### 5.10 Scaling the SST Design

A 0.18 $\mu m$ RF CMOS fabrication technology was made available to the ARIANNA engineering team. The SST chip design was adapted for the 0.18 $\mu m$ process to explore the capabilities of the more advanced technology and to experiment with scaling down the SST design. The core of the 0.18 $\mu m$ SST prototype uses the same circuit topologies as the 0.25 $\mu m$ SST device. To develop the 0.18 $\mu m$ SST prototype, modifications were made to the bias currents, bias voltages, and transistor aspect ratios so that the circuit operation would be compatible with the 0.18 $\mu m$ technology. The 0.18 $\mu m$ fabrication technology allows for shorter minimum transistor gate lengths and feature sizes. The 0.18 $\mu m$ technology has a slightly higher transconductance parameter K' compared to the 0.25 $\mu m$ process. The power supply voltage of the 0.18 $\mu m$ technology is reduced to 1.8 V compared to 2.5 V for 0.25 $\mu m$. The nominal threshold voltage of the 0.18 $\mu m$ technology is around 0.35 V as compared to about 0.43 V for 0.25 $\mu m$.

A major advantage of the 0.18 $\mu m$ fabrication process is the reduction in parasitic capacitances, which improves the SST performance in areas related to speed and timing. Lower parasitic capacitances on high speed circuit nodes allow for greater sampling rates and higher bandwidth (assuming comparable array size and power consumption). The 0.18 $\mu m$ SST achieves a sampling rate of 3.0 GHz, reducing the sampling interval to 333.33 ps. Taking advantage of the lower parasitic capacitances, the array depth was increased to 512 sampling cells while maintaining a bandwidth greater than a gigahertz. The 0.18 $\mu m$ SST achieved a measured analog bandwidth of 1.2 GHz.

The design of the 0.18 $\mu m$ SST offered the opportunity to investigate techniques to reduce the fixed pattern sample interval errors. The 0.25 $\mu m$ SST displayed a pronounced odd/even error pattern caused by fixed mismatches among clock paths. Due to the work of my colleague Tarun Prakash, the fixed pattern timing error in the 0.18 $\mu m$ SST design was significantly reduced; the details of the odd/even fixed pattern interval error reduction are discussed in his dissertation [19]. A second set of fast shift registers was added to the 0.18 $\mu m$ SST design so that one independent set of shift registers controls the odd samples and another set controls the even samples. The dual fast shift register design allows for capacitive load and signal path equalization that largely reduces the sample interval offsets between the odd and the even cells, reducing the overall fixed pattern sample interval error.

The fixed pattern sample interval error for the 0.18 $\mu m$ SST sampling at 3.0 GHz was measured using the zero crossing method and is plotted in figure 5.16. The fixed pattern sampling interval plot is centered on the nominal interval of 333.33 ps and has a standard deviation of 8.28 ps (RMS). While the 0.18 $\mu m$ SST is still subject to errors from random process variations, the alternating odd/even component of the fixed pattern sampling interval error is greatly diminished compared to the 0.25 $\mu m$ SST's timing FPN (shown in figure 5.11). A histogram of the 0.18 $\mu m$ SST fixed pattern sample interval is shown in figure 5.17. The distribution of the fixed pattern timing noise is centered about a single center of mass, as opposed to the bimodal distribution of the 0.25 $\mu m$ SST shown in figure 5.12.

Normalizing the sample interval error against the nominal sample interval allows for comparisons of sample interval uniformity between the 0.18 $\mu m$ SST and the 0.25 $\mu m$ SST despite the different sampling rates. The normalized sample interval error of the 0.18 $\mu m$ SST is 2.5% of the nominal interval, whereas the normalized sample interval error of the 0.25 $\mu m$ SST is 6.2%. The 0.18 $\mu m$ SST's marked improvement in the fixed pattern sample interval error is predominantly due to the changes in the sample clock generation that remove the odd/even fixed pattern sampling error.
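The normalization can be checked with a short calculation. This is an illustrative sketch using only the 0.18 $\mu m$ figures quoted above (3.0 GHz sampling rate, 8.28 ps RMS interval error); the function names are hypothetical.

```python
def nominal_interval_ps(sample_rate_ghz):
    """Nominal sample interval in picoseconds for a rate given in GHz."""
    return 1000.0 / sample_rate_ghz


def normalized_interval_error_pct(sigma_ps, sample_rate_ghz):
    """RMS sample interval error as a percentage of the nominal interval."""
    return 100.0 * sigma_ps / nominal_interval_ps(sample_rate_ghz)


if __name__ == "__main__":
    interval = nominal_interval_ps(3.0)
    pct = normalized_interval_error_pct(8.28, 3.0)
    print(f"nominal interval: {interval:.2f} ps, normalized error: {pct:.1f}%")
```

With these inputs the nominal interval evaluates to 333.33 ps and the normalized error rounds to 2.5%, matching the values quoted in the text.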

A major drawback of the 0.18 $\mu m$ SST is that it suffers from a reduction in both the input and output operating voltage range. The input of the 0.18 $\mu m$ SST maintains linear operation for inputs between 178 mV and 1.09 V. The resulting input voltage range equals 913 mV. The output of the SST (which experiences a DC voltage shift due to a PMOS source follower stage) ranges from 1.0 V to 1.545 V, for an output voltage range of 545 mV. The SST output voltage range is limited by the output voltage range of the analog readout/multiplexing circuit discussed in section 3.3. The upper limit of the output voltage range is expressed in equation 3.26. Equation 3.26 shows that some control can be exerted over the operating voltage range through bias and aspect ratio choices. However,



Figure 5.16: Fixed pattern sample intervals plotted against cell position of the  $0.18 \mu m$  SST



Figure 5.17: Histogram of fixed pattern sample interval error of the  $0.18 \mu m$  SST

the analog readout circuit is bounded by the upper limit of  $V_{dd} - |V_{th0}|$ . The fabrication technology determines both the power supply voltage and the nominal threshold voltage.

Therefore, the fabrication technology is a major factor in determining the usable voltage range of the SST. The 0.18 $\mu m$ fabrication process has a $V_{dd}$ and PMOS threshold voltage (without body effect) of 1.8 V and -0.41 V respectively, compared to 2.5 V and -0.52 V for the 0.25 $\mu m$ process. While the threshold voltage magnitude of the 0.18 $\mu m$ technology is reduced, the overall $V_{dd} - |V_{th0}|$ term is still lower, which lowers the upper limit of the operating voltage range. The reduced operating voltage range of the 0.18 $\mu m$ SST limits its utility in the ARIANNA experiment, since it decreases the dynamic range of the SST and potentially decreases the SNR.
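The effect of the process parameters on the $V_{dd} - |V_{th0}|$ bound can be made concrete with the values quoted above. This is a minimal sketch that evaluates only that bound and ignores body effect and the bias-dependent terms of equation 3.26; the function name is hypothetical.

```python
def readout_upper_bound_v(vdd, vth0_pmos):
    """Upper bound Vdd - |Vth0| on the analog readout output, in volts."""
    return vdd - abs(vth0_pmos)


if __name__ == "__main__":
    # (Vdd, PMOS Vth0 without body effect) as quoted in the text.
    processes = {"0.18 um": (1.8, -0.41), "0.25 um": (2.5, -0.52)}
    for name, (vdd, vth0) in processes.items():
        print(f"{name}: Vdd - |Vth0| = {readout_upper_bound_v(vdd, vth0):.2f} V")
```

The bound evaluates to about 1.39 V for the 0.18 $\mu m$ process and about 1.98 V for the 0.25 $\mu m$ process, so the smaller threshold voltage recovers only a small part of the headroom lost to the lower supply.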

Using more advanced fabrication technologies to fabricate the current SST design would result in a trend of worsening dynamic range. As the fabrication feature sizes shrink, the power supply voltage scales down as well. While the transistor threshold voltages also decrease with scaling, the power supply voltage decreases more rapidly than the threshold voltage. Further scaling of the SST is impractical, as it would result in an excessively narrow operational voltage range.

Greater leakage current presents another drawback to scaling the SST. As the gate length of the transistors decreases, short channel effects become more pronounced, leading to greater subthreshold leakage current. This leakage alters the sampled voltages stored across the sampling array and creates nonlinear, time dependent signal distortion.

The voltage gain of the SST potentially suffers when it is scaled down. The voltage gain of the 0.18 $\mu m$ SST was measured at 0.714 V/V. The 0.18 $\mu m$ SST reads out sample voltages through the analog readout circuitry (which behaves in part as a source follower stage) followed by a unity gain feedback buffer to drive the analog output pin. The 0.714 V/V SST voltage gain is principally determined by the voltage gain of the analog readout circuit. The voltage gain of the 0.18 $\mu m$ SST is higher than the 0.70 V/V voltage gain of the 0.25 $\mu m$ SST because the 0.25 $\mu m$ SST uses a source follower stage to drive the analog output pin. The 0.25 $\mu m$ SST's sample voltage experiences attenuation from two source follower stages

while the 0.18 $\mu m$ SST's sample voltage experiences attenuation from only a single source follower. If using only a single source follower stage, the 0.25 $\mu m$ SST could achieve a voltage gain of around 0.80 V/V. A similar circuit in the 0.18 $\mu m$ process reaches a voltage gain closer to 0.7 V/V. While the difference may seem minor, if there is a cascade of multiple stages, the increase in attenuation dramatically impacts the overall gain. The 0.18 $\mu m$ SST uses a unity gain analog output driver to raise the overall voltage gain. If a source follower style output driver were used instead, the 0.18 $\mu m$ SST voltage gain would drop to 0.56 V/V and suffer a large reduction in dynamic range. In future iterations, the voltage gain of the 0.25 $\mu m$ SST could be increased by using a unity gain voltage buffer as the analog pin driver instead of the source follower stage. The 0.18 $\mu m$ SST experiences the lower source follower gain because of the lower output resistance ($r_0$) and the increased body effect transconductance ($g_{mb}$) of the transistors in the 0.18 $\mu m$ process. In general, the SST voltage gain will continue to drop as the design is scaled down further. More advanced technologies will have more pronounced short channel effects, which make it difficult to achieve high analog voltage gains.

## Chapter 6

## **Conclusions and Summary**

The use of time-interleaved sampling through an SCA is a highly successful technique for capturing brief, high speed analog waveforms. The sampling operation is fairly low power, so the SST realizes a very power efficient circuit that achieves multi-gigahertz sample rates. For comparison, a multi-gigahertz ADC solution to data acquisition would require nearly a watt of power. The SST, on the other hand, consumes only 32 mW of power per channel, realizing low power data acquisition. The time-interleaved sampling technique is sensitive to circuit mismatches, which appear as noise on the SST output. However, much of this noise can be calibrated out, resulting in accurate, low noise waveform readouts. After calibration, the SST output achieves a high SNR equivalent to over 10 bits.

The SST event triggering system uses trigger logic with dual high speed comparators per channel; the resulting triggering system is efficient and effective. An important feature of this triggering system is its ability to monitor for bipolar thresholds using a coincident window. This allows for a high degree of noise rejection and provides precise control over the event triggering rate. It also simplifies the triggering setup compared to the ATWD's pattern matching triggering, which requires additional programming and calibrations. The SST trigger system's high sensitivity is a product of the high speed open loop comparator design. The cascaded, open loop comparators achieve high speed signal amplification that allows for accurate threshold discrimination of small and brief threshold crossings.

The synchronous timing generation is proven to be capable of multi-gigahertz operation. While other timing generation methods, such as the PLL and DLL timing generation methods, can achieve higher sampling rates, the synchronous timing generation still demonstrates high rate sample clock generation with low timing jitter. The use of a stable, low noise external oscillator results in robust, low noise timing generation that functions across an extremely wide range of sampling rates. The SST achieves an inter channel timing resolution of 4.3 ps (RMS) even without any timing calibrations; this is largely due to the synchronous timing generation's low timing noise performance.

Several factors play a prominent role in the SST's timing resolution. The inter channel timing resolution depends on the sampling rate, and increasing the sampling rate generally improves the inter channel timing resolution. The timing resolution can also be improved by increasing the SST's output SNR. The SNR can be degraded if the SST's input voltage range is too restrictive; therefore, a large SST dynamic range is vital to achieving high timing resolution. Careful balancing of the capacitive loads and signal paths in the timing generation circuitry (for both the schematic design and the layout) has been shown to reduce the fixed pattern sample interval errors. Despite best efforts, some fixed pattern sample interval error will still be present due to process variation. However, it has been shown that fixed pattern sample interval error can be characterized and effectively removed through calibration to improve the timing accuracy of the SST.

In summary, the SST chip achieved several primary design objectives. Four channel waveform capture functionality has been integrated into a single IC. The SST chip's power consumption is significantly lower on a per channel basis compared to the previous ARIANNA data acquisition chip. Sampling at a 2.0 GHz rate, the SST chip's 256 sampling cell

| Parameter                                   | Value                                 |
|---------------------------------------------|---------------------------------------|
| Technology                                  | 0.25 $\mu m$ CMOS ($V_{dd} = 2.5$ V)  |
| Number of channels                          | 4                                     |
| Samples per channel                         | 256                                   |
| Chip size                                   | 2.5 mm by 2.5 mm                      |
| Package size                                | 8 mm by 8 mm, 56 pins                 |
| Input clock (typical)                       | 1.0 GHz LVDS                          |
| Sample rate (typical)                       | 2.0 GHz                               |
| Minimum sample rate                         | 2.0 kHz                               |
| Analog bandwidth                            | 1.5 GHz, -3 dB                        |
| Max power per chip with trigger             | 128 mW at 2.0 GSPS                    |
| Maximum analog input range                  | 0.05-1.95 V                           |
| Input-referred noise power                  | 0.42 mV, RMS                          |
| Dynamic range                               | 10.5 bits, RMS                        |
| Fixed pattern noise                         | 6.5 mV, RMS                           |
| Leakage rate                                | $\approx 0.15$ V/s                    |
| Uncorrected intra channel timing resolution | 9.3 ps sigma                          |
| Corrected intra channel timing resolution   | 2.1 ps sigma                          |
| Uncorrected inter channel timing resolution | 1.15 - 4.3 ps sigma                   |
| Corrected inter channel timing resolution   | 1.12 - 2.37 ps sigma                  |

Table 6.1: Summary of the SST Specifications

array records a 128 ns transient record of the input waveform. The SST has an analog bandwidth of 1.5 GHz, allowing for the capture of frequency components across the entire Nyquist band. A high speed comparator based trigger logic system implements effective event triggering, and it is capable of discriminating on 500 ps FWHM pulses with overdrive voltages as low as 9 mV. The internal SST sample clock is synchronously generated based on an external LVDS oscillator. By interchanging oscillators, the system could sample at any rate within the extremely wide range of 2.0 kHz to 2.0 GHz. The SST's synchronously generated sample clock demonstrated timing FPN that mostly stays within $\pm 30$ ps. A method of timing FPN calibration was presented. With the timing calibrations, the SST achieves inter channel timing resolutions between 1.12 ps (RMS) and 2.37 ps (RMS). A table of the SST specifications is provided in table 6.1.
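The record length and Nyquist figures quoted above follow directly from the array depth and sampling rate. The sketch below is illustrative only; the function names are hypothetical, and the record length at the 2.0 kHz minimum rate is a derived value that is not stated in the text.

```python
def record_length_ns(n_cells, sample_rate_ghz):
    """Duration of one full-array record in nanoseconds."""
    return n_cells / sample_rate_ghz


def nyquist_ghz(sample_rate_ghz):
    """Nyquist frequency in GHz for a given sampling rate in GHz."""
    return sample_rate_ghz / 2.0


if __name__ == "__main__":
    print(record_length_ns(256, 2.0))         # 256 cells at 2.0 GHz
    print(nyquist_ghz(2.0))                   # compare with 1.5 GHz bandwidth
    print(record_length_ns(256, 2.0e-6) / 1e6)  # record length in ms at 2.0 kHz
```

At 2.0 GHz the 256 cell array spans 128 ns and the Nyquist frequency is 1.0 GHz, which lies below the 1.5 GHz analog bandwidth.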

Several neutrino stations equipped with the SST data acquisition system were successfully deployed on Antarctica's Ross Ice Shelf in the 2014 season. Currently, the stations are actively monitoring for neutrino signals and are routinely transmitting data from the field back to UCI.

## Bibliography

- [1] Francis Halzen. "Astroparticle physics with high energy neutrinos: from AMANDA to IceCube". Eur. Phys. J., C46:669–687, 2006.
- [2] S. W. Barwick. "ARIANNA: A New Concept for UHE Neutrino Detection". J. Phys. Conf. Ser., (60):276–283, 2006.
- [3] S. W. Barwick and S.A. Kleinfelder. "Development of Hexagonal Radio Array for the ARIANNA Ultra-High Energy Neutrino Detector". arXiv:1410.7369 [physics.ins-det], 2014.
- [4] Jordan C. Hanson. "Ross Ice Shelf Thickness, Radio-frequency Attenuation and Reflectivity: Implications for the ARIANNA UHE Neutrino Detector". In Proceedings, 32nd International Cosmic Ray Conference (ICRC 2011): Beijing, China, August 11-18, 2011, volume 4, pages 169–173, 2011.
- [5] S. W. Barwick. "Development of Telescopes for Extremely Energetic Neutrinos: AMANDA, ANITA, and ARIANNA". *Nucl. Inst. Meth. A*, 602(1):279–284, 2009.
- [6] D. Saltzberg, P. Gorham, D. Walz, and et al. "Observation of the Askaryan Effect: Coherent Microwave Cherenkov Emission from Charge Asymmetry in High Energy Particle Cascades". *Phys. Rev. Lett.*, 86:2802–2805, 2001.
- [7] P W Gorham, S. W. Barwick, J. J. Beatty, F. Wu, and et al. "Observations of the Askaryan Effect in Ice". *Phys. Rev. Lett.*, 99:171101, 2007.
- [8] P. Brennan. "On Ice Shelf, a Hunt for Ghostly Particles". OC Register, Dec 11, 2011.
- [9] S.A. Kleinfelder. "A multi-GHz, multi-channel transient waveform digitization integrated circuit". *IEEE Nuclear Science Symposium Conference Record*, 1(1):544–548, 2002.
- [10] S. A. Kleinfelder and et al. "Design and Performance of the Autonomous Data Acquisition System for the ARIANNA High Energy Neutrino Detector". Nuclear Science, IEEE Transactions, 60(2):612–618, 2013.
- [11] "MOSIS Integrated Circuit Fabrication Service". http://mosis.com/.
- [12] "MOSIS Scalable CMOS (SCMOS)". https://www.mosis.com/files/scmos/scmos.pdf, May 11, 2009. Online; accessed September 2013.

- [13] S. A. Kleinfelder, E. Chiem, and T. Prakash. "The SST Multi-G-Sample/s Switched Capacitor Array Waveform Recorder with Flexible Trigger and Picosecond-Level Timing Accuracy". arXiv:1505.07085 [physics.ins-det], 2015.
- [14] S. A. Kleinfelder, W. C. Carithers, R. P. Ely, C. Haber, F. Kirsten, and H. G. Spieler. "A flexible 128 channel silicon strip detector instrumentation integrated circuit with sparse data readout". *Nuclear Science, IEEE Transactions*, 35(1):171–175, 1988.
- [15] S. A. Kleinfelder, Shiuh hua Wood Chiang, and Wei Huang. "Multi-GHz Synchronous Waveform Acquisition With Real-Time Pattern-Matching Trigger Generation". Nuclear Science, IEEE Transactions, 60(5):3785–3792, 2013.
- [16] D. A. Stricker-Shaver, S. Ritt, and B. J. Pichler. "Novel Calibration Method for Switched Capacitor Arrays Enables Time Measurements with Sub-Picosecond Resolution". *IEEE Trans. Nucl. Sci.*, 61(6):3607–3617, 2014.
- [17] J-F. Genat, G. Varner, F. Tang, and H.J. Frisch. "A 15 GSa/s, 1.5 GHz Bandwidth Waveform Digitizing ASIC". Nuclear Instruments and Methods, A735:452–461, 2014.
- [18] E. Oberla, H. Grabas, M. Bogdan, H. Frisch, J.-F. Genat, K. Nishimura, G. Varner, and A. Wong. "A 4-Channel Waveform Sampling ASIC in 0.13 um CMOS for Front-End Readout of Large-Area Micro-Channel Plate Detectors". *Physics Procedia*, 37:1690–1698, 2012.
- [19] Tarun Prakash. In Preparation. PhD Dissertation, University of California at Irvine, 2017.
- [20] S. A. Kleinfelder and et al. "Design of the Second-Generation ARIANNA Ultra-High-Energy Neutrino Detector Systems". In Proceedings, 2015 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC 2015): San Diego, California, USA, pages 1–4, 2015.
- [21] "IEEE standard for Low-Voltage Differential Signals (LVDS) for Scalable Coherent Interface (SCI)". IEEE Std 1596.3-1996, pages i–, 1996.
- [22] Jaeseo Lee, Jae-Won Lim, Sung-Jun Song, Sung-Sik Song, Wang joo Lee, and Hoi-Jun Yoo. "Design and Implementation of CMOS LVDS 2.5 Gb/s Transmitter and 1.3 Gb/s Receiver for Optical Interconnections". In *ISCAS 2001, The 2001 IEEE International Symposium on Circuits and Systems (Cat. No.01CH37196)*, volume 4, pages 702–705, 2001.
- [23] R. Jacob Baker. "CMOS Circuit Design, Layout and Simulation". IEEE Press, John Wiley and Sons, 2nd edition, 2005.
- [24] M. M. Liu. "Demystifying switched capacitor circuits". Newnes, 2006.
- [25] M. Nayeby and B.A. Wooley. "A 10-bit Video BiCMOS Track-and-Hold Amplifier". IEEE J. of solid state circuits, 24(6):1507–1516, 1989.

- [26] Paul R. Gray and Robert G. Meyer. "Analysis and Design of Analog Integrated Circuits". Fourth Edition. John Wiley and Sons, 2001.
- [27] G. Wegmann, E. A. Vittoz, and F. Rahali. "Charge injection in analog MOS switches". *IEEE J. of solid state circuits*, 22(6):1091–1097, 1987.
- [28] B. Razavi. "Design of analog CMOS integrated circuits". McGraw Hill, 2001.
- [29] R. H. Nixon, S. E. Kemeny, B. Pain, C. O. Staller, and E. R. Fossum. "256 x 256 CMOS Active Pixel Sensor Camera-on-a-Chip". *IEEE J. of solid state circuits*, 31(12):2046–2050, 1996.
- [30] S. K. Mendis, S. E. Kemeny, R. C. Gee, B. Pain, C. O. Staller, Quiesup Kim, and E. R. Fossum. "CMOS Active Pixel Image Sensors for Highly Integrated Imaging Systems". *IEEE J. of solid state circuits*, 30(2):187–197, 1997.
- [31] P. W. Fry, P. J. W. Noble, and R. J. Rycroft. "Fixed-pattern noise in photomatrices". *IEEE J. of solid state circuits*, 5(5):250–254, 1970.
- [32] B. Razavi and B. A. Wooley. "Design techniques for high-speed, high-resolution comparators". *IEEE J. of solid state circuits*, 27(12):1916–1926, 1992.
- [33] Yun-Ti Wang and B. Razavi. "An 8-bit 150-MHz CMOS A/D converter". IEEE 1999 Custom Integrated Circuits Conference, 35(3):308-317, 2000.
- [34] P. E. Allen. "CMOS Analog Circuit Design". Oxford University Press. Inc, 2012.
- [35] B. L. Cochrun and A. Grabel. "Method for the determination of the transfer function of electronic circuits". *IEEE Trans. Circuit Theory*, 20(1):16–20, 1973.
- [36] H. Grabas. "Development of a picosecond time-of-flight system in the ATLAS experiment". Dissertation, Universite Paris Sud, Paris, France, 2013.
- [37] W. Black and D. Hodges. "Time Interleaved Converter Arrays". IEEE J. Solid-State Circuits, 15(6):1022–1029, 1980.
- [38] J-F. Genat, G. Varner, F. Tang, and H.J. Frisch. "Signal Processing for Pico-second Resolution Timing Measurements". *Nuclear Instruments and Methods*, 607(2):387–393, 2009.
- [39] S. W. Barwick, Eric C. Berg, D. Z. Besson, and et al. "Time Domain Response of the ARIANNA Detector". Astropart. Phys., 62:139–151, 2015.
- [40] S. W. Barwick, Eric C. Berg, D. Z. Besson, and et al. "Design and Performance of the ARIANNA HRA-3 Neutrino Detector Systems". *IEEE Transactions on Nuclear Science*, 62(5):2202–2215, 2015.
- [41] S.A. Kleinfelder. "Development of a switched capacitor based multichannel transient waveform recording integrated circuit". *IEEE Trans. Nucl. Sci.*, 35(1):151–154, 1988.

- [42] S.A. Kleinfelder. "A Multi-Gigahertz Analog Transient Waveform Recorder Integrated Circuit". Thesis, University of California, Berkeley, USA, 1992.
- [43] S.A. Kleinfelder. "Gigahertz waveform sampling and digitization circuit design and implementation". Nuclear Science, IEEE Transactions, 50(4):955–962, 2003.
- [44] W. Huang, S. H. Wood, and S. A. Kleinfelder. "Waveform digitization with programmable windowed real-time trigger capability". *IEEE Nuclear Science Symposium Conference Record (NSS/MIC), Orlando, FL*, pages 422–427, 2009.
- [45] W. Huang. "Novel high-speed and multi-function CMOS signal processing circuit". Dissertation, University of California, Irvine, USA, 2011.
- [46] A. Boni, A. Pierazzi, and D. Vecchi. "LVDS I/O Interface for Gb/s-per-pin Operation in 0.35 $\mu m$ CMOS". *IEEE Journal of Solid-State Circuits*, 36(4):706–711, 2001.
- [47] B. Razavi. "Design of Integrated Circuits for Optical Communications". McGraw Hill, 2003.
- [48] Alberto Leon-Garcia. "Probability and Random Process for Electrical Engineering". Pearson Education, 2008.
- [49] Jinhong Wang, Lei Zhao, Changqing Feng, Shubin Liu, and Qi An. "Evaluation of a Fast Pulse Sampling Module With Switched-Capacitor Arrays". *IEEE Transactions on Nuclear Science*, 59(5):2435–2443, 2012.
- [50] M. Shinagawa, Y. Akazawa, and T. Wakimoto. "Jitter Analysis of High-speed Sampling Systems". IEEE J. Solid-State Circuits, 25(1):220–224, 1990.
- [51] Stefan Ritt, Roberto Dinapoli, and Ueli Hartmann. "Application of the DRS chip for fast waveform digitizing". Nucl. Instrum. Meth., A623:486–488, 2010.
- [52] S. W. Barwick, S.A. Kleinfelder, and et al. "Design and Performance of the ARIANNA Hexagonal Radio Array Systems". 62(5):2202–2215, 2015.
- [53] S. A. Kleinfelder, E. Chiem, and T. Prakash. "The SST Fully-Synchronous Multi-GHz Analog Waveform Recorder with Nyquist-Rate Bandwidth and Flexible Trigger Capabilities". In 2014 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), Seattle, WA, 2014, pages 1–3, 2014.
- [54] G.S. Varner, L.L. Ruckman, J.W. Nam, R.J. Nichol, J. Cao, P.W. Gorham, and M. Wilcox. "The Large Analog Bandwidth Recorder and Digitizer with Ordered Readout (LABRADOR) ASIC". Nucl. Instr. and Meth., A583:447–460, 2007.