DiffRF: Rendering-guided 3D Radiance Field Diffusion

CVPR 2023 Highlight

1Technical University of Munich, 2Meta Reality Labs
(Work was done during Norman’s and Yawar’s internships at Meta Reality Labs Zurich as well as at TUM.)
(Version update: the initial version contained a data mapping error that caused all methods to be trained and evaluated on a subset of the data.)

DiffRF is a denoising diffusion probabilistic model directly operating on 3D radiance fields and trained with an additional volumetric rendering loss. This enables learning strong radiance priors with high rendering quality and accurate geometry.

Abstract

We introduce DiffRF, a novel approach for 3D radiance field synthesis based on denoising diffusion probabilistic models. While existing diffusion-based methods operate on images, latent codes, or point cloud data, we are the first to directly generate volumetric radiance fields. To this end, we propose a 3D denoising model which directly operates on an explicit voxel grid representation. However, as radiance fields generated from a set of posed images can be ambiguous and contain artifacts, obtaining ground truth radiance field samples is non-trivial. We address this challenge by pairing the denoising formulation with a rendering loss, enabling our model to learn a deviated prior that favours good image quality instead of trying to replicate fitting errors like floating artifacts. In contrast to 2D diffusion models, our model learns multi-view consistent priors, enabling free-view synthesis and accurate shape generation. Compared to 3D GANs, our diffusion-based approach naturally enables conditional generation like masked completion or single-view 3D synthesis at inference time.
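To make the training objective concrete, the following minimal PyTorch-style sketch shows one training step that pairs the standard noise-prediction loss with a volumetric rendering loss on the predicted clean field. The names `denoiser` and `render_fn`, the loss weight, and the tensor shapes are illustrative assumptions, not the released implementation.

```python
# Minimal sketch of a DiffRF-style training step on explicit voxel radiance
# fields: a standard DDPM noise-prediction loss plus a volumetric rendering
# loss on the predicted clean field. `denoiser` and `render_fn` are assumed
# callables standing in for the actual model and renderer.
import torch
import torch.nn.functional as F


def training_step(denoiser, render_fn, voxel_field, images, cameras,
                  alphas_cumprod, w_render=0.1):
    """voxel_field: (B, C, D, H, W) radiance field fitted from posed images."""
    B = voxel_field.shape[0]
    T = alphas_cumprod.shape[0]

    # Sample a diffusion timestep and corrupt the field with Gaussian noise.
    t = torch.randint(0, T, (B,), device=voxel_field.device)
    noise = torch.randn_like(voxel_field)
    a_bar = alphas_cumprod[t].view(B, 1, 1, 1, 1)
    noisy_field = a_bar.sqrt() * voxel_field + (1 - a_bar).sqrt() * noise

    # Predict the noise and recover an estimate of the clean field x0.
    pred_noise = denoiser(noisy_field, t)
    x0_pred = (noisy_field - (1 - a_bar).sqrt() * pred_noise) / a_bar.sqrt()

    # Denoising loss plus a rendering loss on the x0 estimate, which favours
    # good image quality over replicating fitting artifacts in the targets.
    loss_ddpm = F.mse_loss(pred_noise, noise)
    loss_render = F.mse_loss(render_fn(x0_pred, cameras), images)
    return loss_ddpm + w_render * loss_render
```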

Video

Radiance Field Synthesis

Our 3D denoising diffusion probabilistic model learns to synthesize diverse radiance fields that enable high-quality rendering with accurate geometry.

Unconditional synthesis results on PhotoShape Chairs

Unconditional synthesis results on ABO Tables
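The unconditional samples above are produced by ancestral DDPM sampling over the voxel grid. Below is a minimal sketch of such a sampling loop, assuming an ε-prediction denoiser and a linear noise schedule; the grid resolution, feature channels, and schedule values are illustrative assumptions rather than the paper's exact settings.

```python
# Sketch of unconditional ancestral DDPM sampling over a voxel radiance field
# (epsilon-prediction parameterization, linear beta schedule). Grid shape and
# schedule values are illustrative, not the paper's exact settings.
import torch


@torch.no_grad()
def sample_field(denoiser, shape=(1, 4, 32, 32, 32), T=1000, device="cpu"):
    betas = torch.linspace(1e-4, 0.02, T, device=device)
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape, device=device)  # start from pure noise
    for t in reversed(range(T)):
        a_t, a_bar = alphas[t], alphas_cumprod[t]
        eps = denoiser(x, torch.full((shape[0],), t, device=device))
        # Posterior mean of x_{t-1} given the predicted noise.
        mean = (x - (1 - a_t) / (1 - a_bar).sqrt() * eps) / a_t.sqrt()
        x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean
    return x  # denoised voxel grid (density + color features), ready to render
```

Given a trained denoiser, `sample_field(denoiser)` yields a voxel radiance field that can then be rendered from arbitrary viewpoints.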

3D Masked Completion

DiffRF naturally enables 3D masked completion: Given a 3D mask (of arbitrary shape), the goal is to synthesize a completion of the masked region that harmonizes with the non-masked area. We observe that our model produces completions that are diverse yet consistent with the unmasked region.
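One common way to realize such masked completion with a diffusion model is to re-inject a correspondingly noised copy of the known field into the unmasked region at every reverse step, so that only the masked voxels are actually synthesized. The sketch below follows this image-inpainting-style recipe under those assumptions; the exact conditioning scheme used by DiffRF may differ in detail.

```python
# Sketch of 3D masked completion via constrained sampling: at every reverse
# step the unmasked voxels are overwritten with a correspondingly noised copy
# of the known field, so only the masked region is synthesized.
import torch


@torch.no_grad()
def masked_completion(denoiser, known_field, mask, T=1000):
    """mask: 1 where voxels should be synthesized, 0 where they are kept."""
    device = known_field.device
    betas = torch.linspace(1e-4, 0.02, T, device=device)
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)

    x = torch.randn_like(known_field)
    for t in reversed(range(T)):
        a_t, a_bar = alphas[t], alphas_cumprod[t]
        eps = denoiser(x, torch.full((x.shape[0],), t, device=device))
        mean = (x - (1 - a_t) / (1 - a_bar).sqrt() * eps) / a_t.sqrt()
        x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean

        # Re-inject the known (unmasked) content at the matching noise level.
        if t > 0:
            a_bar_prev = alphas_cumprod[t - 1]
            known_noisy = (a_bar_prev.sqrt() * known_field
                           + (1 - a_bar_prev).sqrt() * torch.randn_like(known_field))
        else:
            known_noisy = known_field
        x = mask * x + (1 - mask) * known_noisy
    return x
```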

Image-to-Volume Synthesis

Given a foreground-segmented image posed relative to the 3D bounding box, we can guide the sampling process by simultaneously minimizing the photometric rendering error. This leads to plausible radiance field proposals.
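A simple way to implement such guidance, sketched below, is to render the current clean-field estimate from the given camera at every reverse step and shift the sample against the gradient of the photometric error. The renderer interface (`render_fn`) and the guidance weight are illustrative assumptions.

```python
# Sketch of one rendering-guided reverse step for image-to-volume synthesis:
# the current clean-field estimate is rendered from the given camera and the
# sample is shifted against the gradient of the photometric error.
# `render_fn` and `guidance_scale` are illustrative assumptions.
import torch


def guided_step(denoiser, render_fn, x, t, alphas, alphas_cumprod, betas,
                target_image, camera, guidance_scale=1.0):
    a_t, a_bar = alphas[t], alphas_cumprod[t]
    t_batch = torch.full((x.shape[0],), t, device=x.device)

    with torch.enable_grad():
        x_in = x.detach().requires_grad_(True)
        eps = denoiser(x_in, t_batch)
        x0_pred = (x_in - (1 - a_bar).sqrt() * eps) / a_bar.sqrt()
        photo_err = ((render_fn(x0_pred, camera) - target_image) ** 2).mean()
        grad = torch.autograd.grad(photo_err, x_in)[0]

    # Standard ancestral step, shifted against the photometric gradient.
    mean = (x - (1 - a_t) / (1 - a_bar).sqrt() * eps.detach()) / a_t.sqrt()
    mean = mean - guidance_scale * grad
    if t > 0:
        return mean + betas[t].sqrt() * torch.randn_like(x)
    return mean
```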

Asset Generation for Scenes

We see future applications of our radiance field diffusion method in the generation of scene assets where the accurately synthesized geometry can enable physics-based interaction.

Related Links

For some more 3D diffusion-based work, please also check out

DreamFusion: Text-to-3D using 2D Diffusion performs text-guided NeRF generation via 2D diffusion. It proposes Score Distillation Sampling to optimize samples through a diffusion prior, an idea that could potentially also be applied to modalities other than text.

LION: Latent Point Diffusion Models for 3D Shape Generation introduces a hierarchical approach to learning high-quality point cloud synthesis that can be augmented with modern surface reconstruction techniques to generate smooth 3D meshes.

Video Diffusion Models treats the third dimension as time and proposes a natural extension of image architectures to the task of video diffusion. It introduces a novel conditioning technique for long and high-resolution videos and achieves state-of-the-art results on unconditional video generation.

BibTeX

@inproceedings{muller2023diffrf,
  title={Diffrf: Rendering-guided 3d radiance field diffusion},
  author={M{\"u}ller, Norman and Siddiqui, Yawar and Porzi, Lorenzo and Bulo, Samuel Rota and Kontschieder, Peter and Nie{\ss}ner, Matthias},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4328--4338},
  year={2023}
}