eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Improving Hardware Multithreading in General Purpose Graphics Processing Units

Abstract

The general-purpose graphics processing unit (GPGPU) is one of the most popular many-core accelerators, delivering massive computing power for parallel applications. GPGPUs rely mainly on hardware multithreading to hide both short pipeline stalls and long memory latencies. Thus, GPGPU performance can be significantly affected by how hardware multithreading is applied. However, finding the optimal way to apply hardware multithreading is a complex problem because many aspects must be considered. This work studies mechanisms for improving the effectiveness of hardware multithreading. First, it studies various scheduling policies and proposes an adaptive scheduling policy that chooses the best scheduling policy at runtime. In addition, it proposes a simple but effective warp throttling mechanism that can increase cache locality. Furthermore, it proposes a hardware prefetching mechanism to extend the degree of memory latency hiding that hardware multithreading provides. Finally, it shows how the limited scalability of the conventional cache miss handling architecture constrains the degree of hardware multithreading, and it proposes a highly scalable cache miss handling architecture.
