eScholarship
Open Access Publications from the University of California

UCLA Electronic Theses and Dissertations

A Multi-Accelerator Architecture for Photon Mapping

Abstract

Real-time rendering of photorealistic images has long been an important goal in computer graphics. The most computationally expensive part of this process is computing the effects of global illumination. Photon mapping is a well-known technique for computing realistic global illumination, and it has characteristics that we believe make it well suited to dedicated hardware acceleration.

Online arithmetic is a digit-serial form of arithmetic in which the inputs are processed from the most significant digit to the least significant, and the result is likewise produced one digit per step. Pipelined online arithmetic circuits are extremely regular and require only simple calculations between registers, which allows high clock speeds and low power dissipation, with considerable potential for parallel execution.
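
As an illustration of the most-significant-digit-first flow described above, the following is a minimal sketch of online addition in software. It assumes a radix-2 signed-digit representation (digits in {-1, 0, 1}) and uses the standard carry-free signed-digit addition rule; neither the representation nor the rule is stated in this abstract, and the names `sd_value`, `_split`, and `online_add` are hypothetical. It is not the MAPM circuit. Each output digit is emitted two input digits after the corresponding position has been read, so the result streams out while the inputs are still arriving.

```python
from fractions import Fraction
from itertools import chain, product


def sd_value(digits):
    """Value of a signed-digit fraction; digits[j] has weight 2^-(j+1)."""
    return sum(Fraction(d, 2 ** (j + 1)) for j, d in enumerate(digits))


def _split(p, p_lower):
    """Split a position sum p (in -2..2) into (transfer, interim) with
    p == 2*transfer + interim. The choice looks at the next less
    significant position sum so that interim plus the incoming transfer
    always stays within {-1, 0, 1}."""
    if p == 2:
        return 1, 0
    if p == -2:
        return -1, 0
    if p == 1:
        return (1, -1) if p_lower >= 0 else (0, 1)
    if p == -1:
        return (0, -1) if p_lower >= 0 else (-1, 1)
    return 0, 0


def online_add(xs, ys):
    """Consume digit pairs most significant first and yield sum digits
    most significant first. The first digit yielded has weight 2^0,
    the following ones have weights 2^-1, 2^-2, ..."""
    prev_p = None   # position sum still waiting for its lower neighbour
    pending_w = 0   # interim digit still waiting for its incoming transfer
    # One padding zero flushes the last real position through the window.
    for p in chain((x + y for x, y in zip(xs, ys)), [0]):
        if prev_p is not None:
            t, w = _split(prev_p, p)
            yield pending_w + t
            pending_w = w
        prev_p = p
    yield pending_w


if __name__ == "__main__":
    # Exhaustive check over all 3-digit signed-digit operands.
    for xs in product((-1, 0, 1), repeat=3):
        for ys in product((-1, 0, 1), repeat=3):
            out = list(online_add(xs, ys))
            # The result has one extra leading digit, hence the factor 2.
            assert 2 * sd_value(out) == sd_value(xs) + sd_value(ys)
    print(list(online_add([1, 0, -1], [0, 1, 1])))
```

The redundant digit set is what makes this possible: because each value has several encodings, a leading digit can be committed before the less significant digits are known, and later digits correct it without any carry propagating back.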

Combining these two concepts, we design and evaluate MAPM (Multi-Accelerator for Photon Mapping), a multi-accelerator architecture that employs pipelined online arithmetic to accelerate the two most time-consuming operations in photon mapping: the tree search and the shader operation. Using a VHDL implementation, we perform behavioral verification with ModelSim, examine hardware cost with Synopsys tools, and evaluate the throughput gain and scalability of the architecture with a custom-built cycle-accurate simulator based on the Intel Pin tool.

Using two MAPMs in a configuration of 16 Tree Search Modules, 16 Shader Operation Modules, and 2 Shader Operation Accelerators per Shader Operation Module, we observed a throughput increase of 1384x over an optimized software setup and of 4.78x over a recent MPSoC implementation. This is achieved at an acceptable hardware cost: 28.8% of the bandwidth, 22.2% of the area, and 5.6% of the power consumption of the low-end Intel Celeron G1820T.

The MAPM also shows a significant reduction in power dissipation. Compared to a conventional parallel circuit with equivalent functionality, the MAPM achieves about 3.5x the synthesizable clock speed and 0.104x the dynamic power consumption, at 1.799x the area cost.
