
UC San Diego Electronic Theses and Dissertations

Physically-Motivated Learning For Photorealistic Scene Reconstruction and Editing in the Wild

Abstract

Rapid advances in imaging have made high-quality devices such as mobile phone cameras easily accessible, opening the door to new applications in image editing and augmentation. Such applications may allow an interior designer to visualize how a kitchen counter will appear after remodeling, a consumer to see whether a fabric or a leather sofa looks better in a living room with color bleeding from walls of various shades, or a real estate agent to demonstrate how a room photographed under fluorescent lights at night will appear in the glow of a sunrise when a window is opened.

Achieving a high degree of photorealism in such applications remains extremely challenging in computer vision and graphics. It requires a comprehensive understanding of all the constituent factors of image formation (shape, material, and lighting), which exhibit a wide spectrum of variations and interact in complex ways to create effects such as highlights, shadows, and interreflections. Reconstructing or editing these intrinsic scene components is consequently a severely ill-posed problem, especially when only a single image or a few images are available. Classical measurement-based methods need expensive, carefully calibrated capture setups. Prior model-based methods assume simplified physical models that break down in the face of diverse real-world appearances. A learning paradigm therefore merits consideration, but even powerful deep learning methods generalize poorly because complex light transport involves great diversity, long-range interactions, and a paucity of ground truth data.
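As general background (standard in graphics, not specific to this dissertation), the image formation that such methods must invert is described by the rendering equation: the radiance leaving a surface point combines emission with incident illumination modulated by the material's reflectance.

```latex
% Rendering equation (Kajiya, 1986): outgoing radiance L_o at surface point x
% in direction \omega_o, with emission L_e, incident radiance L_i, BRDF f_r,
% surface normal n, and hemisphere \Omega around n.
L_o(\mathbf{x}, \omega_o) = L_e(\mathbf{x}, \omega_o)
  + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\,
    L_i(\mathbf{x}, \omega_i)\,(\mathbf{n} \cdot \omega_i)\,\mathrm{d}\omega_i
```

Shape enters through the normal n, material through the BRDF f_r, and lighting through L_i, so a single observed pixel entangles all three; moreover, the incident radiance L_i at one point is itself the outgoing radiance L_o of other points, which is what makes shadows and interreflections non-local.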

The key insight of this thesis is to develop physically-motivated learning, which incorporates the inductive bias of image formation to enable deep neural networks to reason about shape, material, and lighting in complex scenes. The success of our approach rests on three advances. First, we develop neural differentiable rendering modules that model the full physics of image formation, including non-local light transport effects such as shadows, interreflections, and refraction. Second, we devise physically-valid representations of materials and light sources that are compact enough to make learning tractable, yet expressive enough to model realistic appearance, such as spatially-varying reflectance, high-frequency specular highlights, and light shafts through an open window. Third, we exploit domain knowledge to create large-scale photorealistic synthetic datasets that circumvent the difficulty of obtaining ground truth for spatially-varying materials and complex light paths, enabling physically-motivated learning to generalize well to real scenes. We demonstrate the success of our approach through results that surpass the state of the art or solve longstanding open challenges in the reconstruction and editing of shape, material, and lighting in unconstrained scenes with complex light transport, using just a single image as input.
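To make the idea of a differentiable rendering module concrete, the sketch below is a minimal, hypothetical illustration in PyTorch, not the dissertation's actual architecture or code: a small network predicts per-pixel albedo and normals plus a single directional light, and a fixed Lambertian shading layer re-renders the image so that a reconstruction loss can supervise the intrinsic components end-to-end. The non-local effects the thesis models (shadows, interreflections, refraction) are deliberately omitted here.

```python
# Hypothetical illustration of physically-motivated learning: a differentiable
# Lambertian shading layer used as an inductive bias. All names and shapes are
# assumptions for this sketch, not the dissertation's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class IntrinsicNet(nn.Module):
    """Tiny encoder predicting per-pixel albedo and normals plus one
    directional light (a stand-in for the far richer representations
    the dissertation proposes)."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.albedo_head = nn.Conv2d(32, 3, 3, padding=1)  # per-pixel RGB albedo
        self.normal_head = nn.Conv2d(32, 3, 3, padding=1)  # per-pixel normal
        self.light_head = nn.Linear(32, 4)                 # direction (3) + intensity (1)

    def forward(self, img):
        feat = self.backbone(img)
        albedo = torch.sigmoid(self.albedo_head(feat))        # [B,3,H,W] in [0,1]
        normal = F.normalize(self.normal_head(feat), dim=1)   # [B,3,H,W], unit length
        pooled = feat.mean(dim=(2, 3))                        # [B,32] global features
        light = self.light_head(pooled)
        light_dir = F.normalize(light[:, :3], dim=1)          # [B,3] unit direction
        intensity = F.softplus(light[:, 3:])                  # [B,1], non-negative
        return albedo, normal, light_dir, intensity


def lambertian_render(albedo, normal, light_dir, intensity):
    """Differentiable Lambertian shading: I = albedo * max(n . l, 0) * intensity.
    Gradients flow through the physics back into the network's predictions."""
    n_dot_l = (normal * light_dir[:, :, None, None]).sum(dim=1, keepdim=True)
    shading = n_dot_l.clamp(min=0.0) * intensity[:, :, None, None]
    return albedo * shading


# Self-supervised re-rendering loss on a random image (no intrinsic labels).
net = IntrinsicNet()
img = torch.rand(2, 3, 64, 64)
albedo, normal, light_dir, intensity = net(img)
recon = lambertian_render(albedo, normal, light_dir, intensity)
loss = F.mse_loss(recon, img)
loss.backward()  # supervision reaches albedo, normals, and light via the renderer
```

The point of the sketch is the gradient path: because the shading layer encodes (a crude version of) the physics, the reconstruction loss alone pushes the predicted albedo, normals, and lighting toward a physically consistent explanation of the image, which is the inductive bias the abstract refers to.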

This dissertation also democratizes research in vision and graphics through open frameworks that allow the creation of high-quality virtual environments. Indeed, a key practical impact is to allow users to create realistic visual effects with only a few images captured by a mobile phone camera. Our high-quality predicted geometry, spatially-varying lighting, and materials enable several augmented reality (AR) applications at an unprecedented level of photorealism, including virtual object insertion and material replacement with realistic shadows and color bleeding, transparent shape reconstruction, and light source editing (such as turning off lamps or opening windows) with consistent non-local shadows, interreflections, and highlights.
