Rezaei, Siavash

Field Programmable Gate Array (FPGA) Accelerator Sharing

2020

Advisor(s): Bozorgzadeh, Eli

Creative Commons 'BY' version 4.0 license

Abstract

The processing power demanded by today's big-data and compute-intensive applications, together with the cost of powerful processing units, has led end devices to move most of their data processing to clouds, edges, and datacenters. There, hardware accelerators are exploited to drastically augment the processing power of multi-core processors. Among hardware accelerators, Field Programmable Gate Arrays (FPGAs) have attracted significant attention because of their performance and power efficiency. There are two main challenges in exploiting FPGAs as hardware platforms for acceleration: how to design efficient accelerators that achieve high performance, and how to efficiently enable software applications to access FPGA accelerators. In this dissertation, my focus is on the second challenge, where I show the importance of FPGA-based acceleration management to the overall performance gained.
Unlike Graphics Processing Units (GPUs), FPGAs offer a heterogeneous environment for different types of accelerators. This unique feature allows FPGAs to serve various applications concurrently and makes them suitable for meeting the demands of diverse applications in clouds, edges, and datacenters. However, because FPGA resources are limited and reconfiguration is time-consuming, sharing FPGA accelerators among different applications becomes important. Current state-of-the-art accelerator management schemes do not yet provide concurrent accelerator sharing among multiple applications. Achieving this goal requires system software support as well as a hardware controller that enables seamless accelerator sharing. In this dissertation, I propose a scalable framework, called UltraShare Express, for the ultimate concurrent sharing of different accelerators among various applications. UltraShare Express provides software-like function calls for seamless virtualization of FPGA acceleration.
Unlike previous works that rely on static accelerator allocation at design time, UltraShare Express offers dynamic run-time accelerator allocation that enables maximum utilization of the available resources and accelerators on FPGAs. In Chapter 1, I discuss the essential reasons behind the strong interest in FPGA acceleration in clouds, edges, and datacenters. I also address the challenges of using FPGAs as hardware acceleration platforms for general applications. In Chapter 2, I present the background and history of using FPGAs as acceleration platforms. I also discuss the advantages and disadvantages of previous works on deploying FPGAs for acceleration. In Chapter 3, I focus on the system software and address the conflicts that arise when multiple applications request FPGA accelerators simultaneously. In this chapter, I present our proposed single-command-based framework, called MQMAI, which uses a multi-queue architecture in the software stack to minimize conflicts among different applications accessing different FPGA accelerators. In Chapter 4, I focus on the hardware accelerator controller and specifically address the interleaved/concurrent sharing of multiple accelerators among multiple user applications. In this chapter, I propose our hardware controller, called UltraShare, which enables the ultimate sharing of FPGA accelerators among multiple applications. The proposed hardware controller introduces a dynamic accelerator sharing scheme through an accelerator grouping mechanism. In Chapter 5, I propose UltraShare Express, our full-fledged framework for the ultimate sharing of several FPGA accelerators among multiple applications. UltraShare Express combines MQMAI and UltraShare and inherits the advantages of both. In Chapter 5, I further investigate the opportunities for addressing and eliminating FPGA accelerators' stall times in different scenarios.
Focusing on the software stack, I propose a novel multi-queue architecture that avoids command-blocking scenarios, which can significantly degrade the performance of FPGA acceleration when accelerators are shared. I also propose a mechanism to fairly distribute data-link bandwidth among different accelerators with respect to the accelerator grouping mechanism proposed in Chapter 4. This mechanism prevents accelerators with higher throughput from being sacrificed in accessing data-link bandwidth. Our experimental results on various accelerator IP cores show significant improvements when the ultimate sharing of FPGA accelerators is enabled. I believe that UltraShare Express provides an important step toward the efficient and easy deployment of FPGAs in heterogeneous architectures. UltraShare Express is an open-source framework available on GitHub and can be used by other research groups.


UC Irvine
