A neural radiance field (NeRF) is a fully-connected neural network that generates novel views of a complex 3D scene from a partial set of 2D images. NeRF takes a set of input images representing a scene and interpolates between them to render the complete scene.
Rendering, or image synthesis, refers to generating an image (a render) from a 2D or 3D model using a computer program. It is typically the final step in the graphics pipeline, and it gives models and animations their final appearance. The scene description includes features such as:
- texture
- shading
- shadows
- lighting
- viewpoints
Rendering algorithms loosely fall into one of three categories:
- Rasterization: geometrically projects objects in the scene onto an image plane, without advanced optical effects
- Ray Casting: calculates an image by considering the scene from a specific point of view, intersecting rays with the scene geometry and applying only basic optical laws of reflection intensity (a minimal sketch follows this list)
- Ray Tracing: similar to ray casting, but with more advanced optical simulation; it usually incorporates Monte Carlo techniques to obtain more realistic results, at a speed that is often orders of magnitude slower
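To make the ray-casting idea concrete, here is a minimal sketch in Python: it fires one ray per pixel from a pinhole camera, intersects each ray with a single hard-coded sphere, and shades hits with Lambert's law. Every name here (`cast_ray`, `SPHERE_CENTER`, and so on) is illustrative rather than taken from any particular library.

```python
import numpy as np

# Hypothetical scene: one sphere and one directional light.
SPHERE_CENTER = np.array([0.0, 0.0, -3.0])
SPHERE_RADIUS = 1.0
LIGHT_DIR = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)  # unit vector toward the light

def cast_ray(origin, direction):
    """Grayscale intensity for one camera ray: sphere intersection + Lambert shading."""
    # Solve |origin + t * direction - center|^2 = radius^2 for t (a = 1, unit direction).
    oc = origin - SPHERE_CENTER
    b = 2.0 * (oc @ direction)
    c = oc @ oc - SPHERE_RADIUS ** 2
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return 0.0                        # ray misses the sphere: background
    t = (-b - np.sqrt(disc)) / 2.0        # nearest intersection along the ray
    if t < 0.0:
        return 0.0                        # intersection is behind the camera
    normal = (origin + t * direction - SPHERE_CENTER) / SPHERE_RADIUS
    return max(float(normal @ LIGHT_DIR), 0.0)  # basic reflection-intensity law

def render(width=64, height=64):
    """Cast one ray per pixel from a pinhole camera at the origin."""
    image = np.zeros((height, width))
    origin = np.zeros(3)
    for y in range(height):
        for x in range(width):
            u = (x + 0.5) / width * 2.0 - 1.0    # pixel -> image plane at z = -1
            v = 1.0 - (y + 0.5) / height * 2.0
            d = np.array([u, v, -1.0])
            image[y, x] = cast_ray(origin, d / np.linalg.norm(d))
    return image

if __name__ == "__main__":
    print(render(8, 8).round(2))  # tiny render printed as a grid of intensities
```

Ray tracing extends exactly this loop by spawning further rays at each hit point (reflections, refractions, shadows), which is where its extra cost comes from.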
Volume rendering enables users to create a 2D projection of a 3D discretely sampled dataset. For a given camera position, volume rendering algorithms cast rays through the volume and obtain the RGBα value (red, green, blue, and alpha channel) of every voxel the rays pass through, compositing these samples into the final pixels.
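To show how those per-voxel RGBα samples turn into a pixel, the sketch below implements the standard emission-absorption compositing model along a single ray, assuming the per-sample colors and densities have already been read from the volume. The function name `composite_ray` and the example values are illustrative.

```python
import numpy as np

def composite_ray(colors, densities, deltas):
    """Alpha-composite N samples along one ray into a single RGB value.

    colors:    (N, 3) emitted RGB per sample
    densities: (N,)   volume density per sample
    deltas:    (N,)   spacing between adjacent samples
    """
    alphas = 1.0 - np.exp(-densities * deltas)   # per-sample opacity
    # Transmittance: fraction of light that reaches sample i unabsorbed.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = trans * alphas                     # contribution of each sample
    return (weights[:, None] * colors).sum(axis=0)

# Three samples along one ray; the middle one is dense and red.
colors = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
densities = np.array([0.1, 5.0, 0.1])
deltas = np.full(3, 0.5)
print(composite_ray(colors, densities, deltas))  # ~[0.87, 0.00, 0.05]: mostly red
```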
View synthesis is roughly the inverse of volume rendering: it involves creating a 3D view from a series of 2D images. This can be done with a set of photos showing an object from multiple angles, creating a hemispheric plan around the object and placing each image in its appropriate position. View synthesis attempts to predict depth from a series of images that capture an object from different perspectives.
NeRF generates novel views of complex scenes by optimizing the underlying continuous volumetric scene function using a set of input views.
It models a scene as a continuous 5D function: the input is a spatial location (x, y, z) and a viewing direction (θ, φ), and the output is the color (r, g, b) emitted at that point in that direction together with the volume density (σ). This yields a continuous scene representation for every point and direction. The NeRF is generated by the following steps (sketched in code after the list):
- Tracing rays through the scene to generate a sampled set of 3D points
- Inputting these points and their corresponding 2D viewing directions into a neural network to produce colors and densities
- Using classical volume rendering to transform these colors and densities into a 2D image
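The sketch below walks through those three steps for a single camera ray, assuming PyTorch and a deliberately tiny, untrained MLP. The published NeRF architecture also applies positional encoding to the inputs and uses hierarchical sampling, both omitted here for brevity.

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    """Illustrative stand-in: maps a 5D input (x, y, z, theta, phi) to (r, g, b, sigma)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(5, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),
        )

    def forward(self, points, dirs):
        out = self.mlp(torch.cat([points, dirs], dim=-1))
        rgb = torch.sigmoid(out[..., :3])   # colors constrained to [0, 1]
        sigma = torch.relu(out[..., 3])     # non-negative volume density
        return rgb, sigma

def render_ray(model, origin, direction, near=1.0, far=4.0, n_samples=32):
    # Step 1: trace the ray and sample 3D points along it.
    t = torch.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction                  # (N, 3)
    theta = torch.atan2(direction[1], direction[0])           # viewing direction
    phi = torch.acos(direction[2])                            # expressed as (theta, phi)
    dirs = torch.stack([theta, phi]).expand(n_samples, 2)
    # Step 2: query the network for colors and densities.
    rgb, sigma = model(points, dirs)
    # Step 3: classical volume rendering (alpha compositing, as above).
    delta = (far - near) / n_samples
    alpha = 1.0 - torch.exp(-sigma * delta)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(dim=0)                # predicted pixel RGB

model = TinyNeRF()
pixel = render_ray(model, torch.zeros(3), torch.tensor([0.0, 0.0, -1.0]))
print(pixel)  # random until trained; training minimizes the error against real pixels
```

Because every operation here is differentiable, the network can be optimized end to end by comparing rendered pixels against the input photographs.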