eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations

Reliability Enhancement of Many-core Processors

Abstract

Many-core systems are of great importance for building the exascale computing machines targeted for 2020. The Last-Level Cache (LLC), as the largest on-chip shared memory in many-core systems, plays a crucial role in power, area, and, more importantly, reliability. LLC reliability depends on both the distributed cache banks and the communication fabric, the Network-on-Chip (NoC) interconnect. To achieve high reliability, both need to be protected against errors. Existing error coding methods protect the cache and the communication fabric, but in isolation from each other. Based on the observations in this thesis, considering the cache and the NoC interconnect together reduces the delay overhead of LLC protection. The main contribution of this thesis is NARC, an integrated method that minimizes the delay overhead of error protection in many-core architectures by integrating the error coding of the cache and the interconnection network. This new approach sets up a linked error coding scheme that guarantees end-to-end protection of shared cache data blocks throughout the on-chip network against both hard and soft errors. NARC partitions each shared cache block into multiple equally sized segments. It extends each segment with a low-cost ECC and transmits each extended segment as a flit in the NoC. NARC eliminates the large ECC encoder/decoder blocks from the critical path of remote shared cache accesses through the network. Using this technique, NARC minimizes latency in the common case of accessing a shared LLC bank over the network, and potentially when accessing a local LLC bank, while its segmented per-flit ECC provides almost the same error protection as a strong multi-bit ECC applied to whole cache blocks. Evaluated on a 6-by-6 platform with a mesh NoC, NARC improves the performance of the many-core system by about 9.6% on average, and by up to 22%, compared to a baseline approach.
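The segment-and-protect idea described in the abstract can be illustrated with a small sketch. The abstract does not name the specific ECC, the cache block size, or the segment size, so everything concrete here is an assumption made for illustration: a 64-byte block, 8-byte segments, and a plain single-error-correcting Hamming code standing in for the "low-cost ECC". The function names (`hamming_encode`, `hamming_decode`, `block_to_flits`, `flits_to_block`) are hypothetical and are not taken from the thesis.

```python
from typing import List

# Assumed sizes; the abstract does not specify them.
BLOCK_BYTES = 64    # hypothetical cache block size
SEGMENT_BYTES = 8   # hypothetical segment (flit payload) size


def hamming_encode(data: bytes) -> List[int]:
    """Encode data as a Hamming single-error-correcting codeword (list of bits)."""
    bits = [(byte >> i) & 1 for byte in data for i in range(8)]
    k = len(bits)
    r = 0
    while (1 << r) < k + r + 1:   # smallest r with 2**r >= k + r + 1
        r += 1
    n = k + r
    code = [0] * (n + 1)          # positions are 1-indexed; index 0 is unused
    data_iter = iter(bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):       # not a power of two, so a data position
            code[pos] = next(data_iter)
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos
    for j in range(r):            # set parity bits so the total syndrome is zero
        code[1 << j] = (syndrome >> j) & 1
    return code[1:]


def hamming_decode(codeword: List[int]) -> bytes:
    """Correct at most one flipped bit and return the original data bytes."""
    code = [0] + list(codeword)
    syndrome = 0
    for pos in range(1, len(code)):
        if code[pos]:
            syndrome ^= pos
    if 0 < syndrome < len(code):  # nonzero syndrome points at the flipped bit
        code[syndrome] ^= 1
    bits = [code[pos] for pos in range(1, len(code)) if pos & (pos - 1)]
    out = bytearray(len(bits) // 8)
    for i, bit in enumerate(bits):
        out[i // 8] |= bit << (i % 8)
    return bytes(out)


def block_to_flits(block: bytes) -> List[List[int]]:
    """Split a block into equal segments and extend each one with its own ECC."""
    segments = [block[i:i + SEGMENT_BYTES]
                for i in range(0, len(block), SEGMENT_BYTES)]
    return [hamming_encode(segment) for segment in segments]


def flits_to_block(flits: List[List[int]]) -> bytes:
    """Decode each flit independently and reassemble the cache block."""
    return b"".join(hamming_decode(flit) for flit in flits)


if __name__ == "__main__":
    block = bytes(range(BLOCK_BYTES))
    flits = block_to_flits(block)
    for idx, flit in enumerate(flits):
        flit[(idx * 5) % len(flit)] ^= 1   # inject one bit error per flit
    print(flits_to_block(flits) == block)
```

In this sketch each flit is checked and corrected on its own, so a block can tolerate one flipped bit in every segment even though each per-segment code is weak. That is one way to read the abstract's claim that a segmented per-flit ECC approaches the protection of a stronger block-level code; the actual NARC encoding may differ.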
