eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Making the On-Chip World Smaller with Low-Latency On-Chip Networks

Abstract

Multi-core processors have rapidly grown in core count since the first commercial dual-core processor in 2001. Today, general-purpose multi-cores with 32 cores and embedded multi-cores with over 100 cores are available, and core counts are expected to keep rising. To enable multi-cores to run many different applications, the solution of choice has been to connect the cores by a shared Network-on-Chip (NoC) so that any communication pattern can be supported. However, previous NoC designs do not scale in terms of network latency when the communicating cores are far apart.

Unfortunately, high network latencies create performance bottlenecks and limit the flexible usage of on-chip resources. Computer architects have sought to avoid interactions between far-away cores, but the effectiveness of such locality optimizations is diminishing.

In this thesis, we propose to make the on-chip world appear smaller by providing extremely low-latency networks that make far-away resources appear much closer. This is achieved by leveraging specially engineered electrical wires that can transport data across the chip at both high data rates and low latencies. We first investigate the use of asynchronous repeated wires that run across a shared hop-by-hop 2D mesh network. Using these asynchronous repeated wires, we can configure routers to bypass their pipelines to create single-cycle paths across multiple routers. To allocate these single-cycle multi-hop paths, we present a novel arbitration scheme that has low implementation complexity, guarantees correctness, and avoids throughput loss. We also investigate the use of on-chip transmission lines that conduct signals at the speed of light at extremely high data rates. We present a shared-medium architecture for global on-chip communication using these transmission lines, along with several fully distributed arbitration schemes for controlling access to this shared medium. In addition, we present a fast and accurate NoC simulation methodology that accounts for complex interactions between the NoC and the application, memory subsystem, and processing cores. This simulation approach can be used to effectively evaluate NoC designs, including those described in this thesis.
