We assume rendering happens on a device where we can't directly manipulate the memory or call functions, so all communication needs to go through the Device interface.

We've got a few device backends:

  • CPU: this device will render on the same CPU, with multithreading.
  • CUDA: render on a CUDA device, which in practice is an NVIDIA GPU.
  • OpenCL: render on an OpenCL device, which may be a CPU, GPU or other (only partially working).
  • Network: render on a server running somewhere in the local area network (prototype code only).
  • Multi: balance rendering on multiple devices (only Multi GPU at the moment).

These devices have functions to allocate, copy and free memory and textures, and to render a pass for part of the image. Initially the intention was to abstract just CPU and GPU devices, but the Network and Multi devices seem to fit in too, though both are still only prototypes.
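To make the shape of this abstraction a bit more concrete, here is a minimal sketch of what such an interface could look like. All names here (Device, DevicePtr, Tile, mem_alloc, render_pass and so on) are hypothetical and chosen for illustration; they are not the actual interface. The toy CPU backend just adds 1.0 to each pixel in the tile as a stand-in for a real rendering kernel. The point is only that the host never touches device memory directly: it holds an opaque handle and goes through the interface for everything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <vector>

// Hypothetical names throughout; an illustration, not the real interface.
using DevicePtr = std::uint64_t;  // opaque handle, never dereferenced by the host

struct Tile { int x, y, w, h; };  // the part of the image rendered in one pass

class Device {
public:
    virtual ~Device() = default;
    virtual DevicePtr mem_alloc(std::size_t bytes) = 0;
    virtual void mem_copy_to(DevicePtr dst, const void *src, std::size_t bytes) = 0;
    virtual void mem_copy_from(void *dst, DevicePtr src, std::size_t bytes) = 0;
    virtual void mem_free(DevicePtr ptr) = 0;
    // Render one pass for `tile` into a buffer of floats, one per pixel,
    // laid out row by row with `image_width` pixels per row.
    virtual void render_pass(DevicePtr buffer, int image_width, const Tile &tile) = 0;
};

// Toy CPU backend: "device memory" is host memory looked up by handle.
class CPUDevice : public Device {
    std::unordered_map<DevicePtr, std::vector<unsigned char>> allocations_;
    DevicePtr next_handle_ = 1;

public:
    DevicePtr mem_alloc(std::size_t bytes) override {
        DevicePtr handle = next_handle_++;
        allocations_[handle].resize(bytes);  // zero-initialized
        return handle;
    }
    void mem_copy_to(DevicePtr dst, const void *src, std::size_t bytes) override {
        auto &mem = allocations_.at(dst);
        assert(bytes <= mem.size());
        std::memcpy(mem.data(), src, bytes);
    }
    void mem_copy_from(void *dst, DevicePtr src, std::size_t bytes) override {
        const auto &mem = allocations_.at(src);
        assert(bytes <= mem.size());
        std::memcpy(dst, mem.data(), bytes);
    }
    void mem_free(DevicePtr ptr) override { allocations_.erase(ptr); }

    void render_pass(DevicePtr buffer, int image_width, const Tile &tile) override {
        auto &mem = allocations_.at(buffer);
        for (int y = tile.y; y < tile.y + tile.h; y++) {
            for (int x = tile.x; x < tile.x + tile.w; x++) {
                std::size_t offset =
                    (static_cast<std::size_t>(y) * image_width + x) * sizeof(float);
                assert(offset + sizeof(float) <= mem.size());
                float value;
                std::memcpy(&value, mem.data() + offset, sizeof(float));
                value += 1.0f;  // placeholder for the real rendering kernel
                std::memcpy(mem.data() + offset, &value, sizeof(float));
            }
        }
    }
};
```

A caller would allocate a buffer with mem_alloc, call render_pass for each tile, and read the pixels back with mem_copy_from. A GPU or network backend would implement the same calls differently without the caller changing.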

The CPU, CUDA and OpenCL rendering kernels are all compiled from the same source code; how this works will be explained in another post. In principle it would be possible to use only an OpenCL device, but it's useful to have a CUDA device to compare performance, and to have a CPU device for easier development and debugging.

Additionally, the CUDA and OpenCL devices are more limited in functionality, since GPU limitations prevent us from using existing software libraries like OSL and OpenImageIO there.

The intention of the Multi device is to balance the work over multiple devices. Using the abstraction, it should then become possible to combine CPUs and GPUs, local and remote. How best to do the load balancing still needs to be worked out; this is quite a challenging problem in practice.
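As an illustration of the simplest possible strategy, and not the one that will necessarily be used, the sketch below splits the image's scanlines among devices in proportion to a per-device weight. The names RowRange and split_rows are hypothetical, and the weights are assumed to be some relative speed estimate supplied by the caller. A static split like this ignores that some parts of an image are much more expensive to render than others, which is part of what makes the real problem hard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper types for illustration only.
struct RowRange { int y_begin, y_end; };  // half-open range [y_begin, y_end)

// Split `height` scanlines among devices in proportion to `weights`
// (assumed to be relative speed estimates, one per device).
std::vector<RowRange> split_rows(int height, const std::vector<double> &weights) {
    assert(height >= 0 && !weights.empty());
    double total = 0.0;
    for (double w : weights) {
        assert(w >= 0.0);
        total += w;
    }
    assert(total > 0.0);

    std::vector<RowRange> ranges;
    double cumulative = 0.0;
    int y = 0;
    for (std::size_t i = 0; i < weights.size(); i++) {
        cumulative += weights[i];
        // The last device takes whatever remains, so rounding never loses a row.
        int y_end = (i + 1 == weights.size())
                        ? height
                        : static_cast<int>(height * (cumulative / total));
        ranges.push_back({y, y_end});
        y = y_end;
    }
    return ranges;
}
```

Each device would then render only its own range of rows. A more adaptive scheme could hand out small tiles on demand, so that faster devices naturally end up doing more of the work.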

Network devices translate the function calls into remote calls over a network connection, transmitting memory back and forth. A server can be started anywhere in a LAN, and clients that need to render can discover it automatically. The network connection needs to be fast enough to transmit complex scenes and send back pixels at interactive rates. How well this works in practice we'll have to see; again, this is more of a test than actual working code at this point.
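To give an idea of what translating a function call into a remote call involves, here is a minimal sketch of packing a call into a byte message on the client and unpacking it on the server. The wire format (a one-byte opcode, a 64-bit handle, a 64-bit payload size, then the payload) and the names Op, Message, encode and decode are all assumptions made for this example, not the actual protocol. The socket code and server discovery are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical wire format for illustration only:
// [1 byte opcode][8 bytes handle][8 bytes payload size][payload bytes]
enum class Op : std::uint8_t { MemAlloc = 1, MemCopyTo = 2, MemFree = 3 };

struct Message {
    Op op;
    std::uint64_t handle;                // which device allocation the call refers to
    std::vector<unsigned char> payload;  // e.g. the memory contents for MemCopyTo
};

// Fixed-width little-endian integers, so both ends agree regardless of CPU.
void put_u64(std::vector<unsigned char> &out, std::uint64_t v) {
    for (int i = 0; i < 8; i++)
        out.push_back(static_cast<unsigned char>((v >> (8 * i)) & 0xff));
}

std::uint64_t get_u64(const std::vector<unsigned char> &in, std::size_t offset) {
    assert(offset + 8 <= in.size());
    std::uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= static_cast<std::uint64_t>(in[offset + i]) << (8 * i);
    return v;
}

// Client side: turn a call into bytes that can be sent over a connection.
std::vector<unsigned char> encode(const Message &m) {
    std::vector<unsigned char> out;
    out.push_back(static_cast<unsigned char>(m.op));
    put_u64(out, m.handle);
    put_u64(out, m.payload.size());
    out.insert(out.end(), m.payload.begin(), m.payload.end());
    return out;
}

// Server side: recover the call so it can be forwarded to a local device.
Message decode(const std::vector<unsigned char> &in) {
    const std::size_t header_size = 17;  // 1 + 8 + 8
    assert(in.size() >= header_size);
    Message m;
    m.op = static_cast<Op>(in[0]);
    m.handle = get_u64(in, 1);
    std::uint64_t size = get_u64(in, 9);
    assert(in.size() == header_size + size);
    m.payload.assign(in.begin() + header_size, in.end());
    return m;
}
```

On the server, the decoded message would be dispatched to the matching function of a local device, for example a MemCopyTo message to that device's memory copy function. Since the payload of such a message is the memory being copied, the message size makes it easy to see why scene size and network speed matter so much here.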

The intention is not to replace render farms with job systems and such, only to use multiple devices in a local network to interactively render a single image. Rendering a single image over a network has major implications for the choice of algorithms too, so it's useful to think about it in advance.