=Network Render=

Scope
The intention is not to replace render farms with job systems and the like. Rather, network render is designed to interactively render a single image over a network, with immediate feedback in Blender. This requires quite a different design from a render farm, although in certain simple cases it could serve as a replacement.

Status
This code is still quite rough and experimental. Some things that need to be done:


 * Better error handling
 * Viewport render
 * Rethink multi device handling
 * Configuration
 * CMake Windows build support
 * SCons build support
 * Boost libraries update

Setups
There are a few different types of setups that would be good to support.


 * Fast local network where you can directly control and send data to all render slaves.
 * Internet / slow network connection to a render farm.
 * Cloud render farm, similar to the above case.
 * Huge render farm or mix of multiple farms, where you have multiple levels.

Connections
In the simple local network case, Blender could manage everything itself. Perhaps multicast could be used to reduce overhead when sending shared data.

If you are sending data over the internet to a render farm or cloud, you want to send heavy data only once. So on the other end there needs to be one computer that receives the incoming data and distributes it to all the other machines in the render farm (to which it presumably has a fast connection).

If you have a huge render farm, even a single computer may not be able to manage all the connections (that's speculation, I'm not quite sure about this). So you could have a multi-level scheme where Blender sends data to some server, which then passes it on to other servers, which then pass it on to the final slaves. It's not clear if this is necessary, but designing the device abstraction with this kind of chaining in mind might be good.
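
To make the chaining idea concrete, here is a minimal sketch of a multi-level fan-out. It is not how Cycles works today; the `Node` class, `build_chain`, `leaves` and the `fanout` limit are all hypothetical names invented for illustration, and the "network" is just a method call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A machine in the hypothetical chain: Blender, a relay server, or a slave."""
    name: str
    children: List["Node"] = field(default_factory=list)
    received: List[bytes] = field(default_factory=list)

    def send(self, payload: bytes) -> None:
        """Keep the payload locally, then forward it to each direct child."""
        self.received.append(payload)
        for child in self.children:
            child.send(payload)


def build_chain(root_name: str, slave_names: List[str], fanout: int) -> Node:
    """Build a tree in which no node manages more than `fanout` connections."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    level = [Node(name) for name in slave_names]
    relay_id = 0
    # Group nodes under relay servers until the top level fits under the root.
    while len(level) > fanout:
        parents = []
        for i in range(0, len(level), fanout):
            parents.append(Node(f"relay-{relay_id}", level[i:i + fanout]))
            relay_id += 1
        level = parents
    return Node(root_name, level)


def leaves(node: Node) -> List[Node]:
    """Return the nodes without children, i.e. the slaves that render."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]
```

With 10 slaves and a fan-out of 3, `build_chain("blender", names, 3)` puts two levels of relays between Blender and the slaves, and a single `send()` on the root reaches every slave exactly once.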

Caching
When re-rendering the same scene, whether right after making some small changes or hours later, it could be useful if the render slaves remembered data. Changing only camera positions should be quick automatically. However, something like moving an object will resend the entire BVH data, even if only a small part of it changed.

We could improve this in various ways. On the Blender side, we could detect which parts of the data changed and only resend those, or send deltas in some way. LZO or some other fast compression scheme could be used. To avoid resending data hours later, data could be cached to disk.
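
As a rough illustration of "only resend what changed", the sketch below splits a buffer into fixed-size blocks, hashes each block, and sends only the blocks whose hash differs from the previous version. Everything here is an assumption for the sake of the example: the function names, the 64 KB block size, and the use of `zlib` at its fastest level as a stand-in for LZO (which is not in the Python standard library).

```python
import hashlib
import zlib
from typing import Dict, List

BLOCK_SIZE = 64 * 1024  # hypothetical block size, not a Cycles setting


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Hash every fixed-size block of `data`."""
    return [hashlib.sha1(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def make_delta(old_hashes: List[bytes], new_data: bytes,
               block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Return {block index: compressed block} for blocks that changed."""
    delta = {}
    for index, offset in enumerate(range(0, len(new_data), block_size)):
        block = new_data[offset:offset + block_size]
        digest = hashlib.sha1(block).digest()
        if index >= len(old_hashes) or old_hashes[index] != digest:
            # zlib level 1 stands in for a fast scheme such as LZO.
            delta[index] = zlib.compress(block, 1)
    return delta


def apply_delta(old_data: bytes, delta: Dict[int, bytes], new_length: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new buffer on the slave from its old copy plus the delta."""
    blocks = []
    for index, offset in enumerate(range(0, new_length, block_size)):
        if index in delta:
            blocks.append(zlib.decompress(delta[index]))
        else:
            blocks.append(old_data[offset:offset + block_size])
    return b"".join(blocks)
```

Note that fixed block boundaries only help when data changes in place. If an edit inserts or removes bytes, every later block shifts and is resent, which is one reason a smarter scheme may be needed.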

A general solution might be to implement an rsync-like algorithm, along with a least-recently-used memory/disk cache on the servers to remember blocks of data and their hashes.
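
The server side of such a scheme could look roughly like the sketch below: a least-recently-used cache of blocks keyed by hash, which the sender first asks which hashes are missing. This is a simplified, fixed-block variant and not the actual rsync algorithm (rsync uses rolling checksums to cope with shifted data). `BlockCache`, `sync` and the in-memory storage are hypothetical; a real cache would also spill to disk.

```python
import hashlib
from collections import OrderedDict
from typing import List


class BlockCache:
    """Least-recently-used cache of data blocks, keyed by SHA-1 digest."""

    def __init__(self, max_blocks: int) -> None:
        self.max_blocks = max_blocks
        self._blocks: "OrderedDict[bytes, bytes]" = OrderedDict()

    def missing(self, digests: List[bytes]) -> List[bytes]:
        """Return digests that must be uploaded; mark hits as recently used."""
        needed = []
        for digest in digests:
            if digest in self._blocks:
                self._blocks.move_to_end(digest)
            else:
                needed.append(digest)
        return needed

    def store(self, block: bytes) -> None:
        """Add a block and evict the least recently used ones if over capacity."""
        digest = hashlib.sha1(block).digest()
        self._blocks[digest] = block
        self._blocks.move_to_end(digest)
        while len(self._blocks) > self.max_blocks:
            self._blocks.popitem(last=False)

    def assemble(self, digests: List[bytes]) -> bytes:
        """Rebuild a buffer from cached blocks, in the given order."""
        return b"".join(self._blocks[d] for d in digests)


def sync(data: bytes, cache: BlockCache, block_size: int) -> int:
    """Upload `data` to `cache`; return how many blocks actually had to be sent.

    Assumes the cache is large enough to hold all blocks of `data` at once.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    by_digest = {hashlib.sha1(b).digest(): b for b in blocks}
    needed = cache.missing(list(by_digest))
    for digest in needed:
        cache.store(by_digest[digest])
    return len(needed)
```

Syncing a buffer a second time after changing one block sends only that block, and blocks that have not been asked for recently are the first to be dropped when the cache fills up.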

Configuration
Currently network render automatically discovers render slaves on the local network. Having this kind of simple case work automatically is nice, but at some point we need more advanced configuration to specify hostnames or IP addresses, and to handle more complicated setups like rendering in the cloud.

It seems to me that it would be good to have a separate program to manage this configuration, which can stay alive across Blender restarts. It could be launched automatically for the simple auto-discover case, but for the more complex cases I think something persistent makes sense. This process could gather the list of IPs, along with descriptions or other information on them. As this process would be running on the same machine as Blender, this information could then be quickly retrieved and displayed in the user interface when choosing network render, to show the status of the render farm.

The reason to have it separate is that you might want to customize or wrap it in various ways. If you have a local render farm you could have a wrapper script to launch render slaves, check if the machines are already busy doing something else, etc. For a cloud render farm you need to spin up the machines in the cloud, which takes a few minutes, and billing happens for some minimum amount of time like 10 minutes or an hour. So this is not something you can do right before starting a render; you'd need to manage that separately.

So for this reason I think we can basically keep configuration of network rendering out of Blender. You would either have it work completely automatically, or launch some separate application to configure exactly which slaves will render. Blender can then just show a short description message, like "Local network farm, 3 slaves", "Blender Institute farm, 12 render slaves" or "Google Compute Engine farm, 28 slaves".

Tiles and Progressive Render
For final tiled rendering, the current system is reasonably OK. Slaves request tiles as needed and finish each one entirely before moving on to the next tile.
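
The pull-based behaviour described above can be illustrated with a small simulation. This is only a sketch, not the Cycles tile manager: `render_tiles`, the slave names and the per-tile costs are made up, and time is simulated with an event queue rather than real rendering.

```python
import heapq
from collections import deque
from typing import Dict, List


def render_tiles(num_tiles: int, tile_cost: Dict[str, float]) -> Dict[str, List[int]]:
    """Simulate slaves pulling tiles from one shared queue.

    `tile_cost` maps a hypothetical slave name to the time it needs per tile.
    Returns the list of tile indices each slave ended up rendering.
    """
    pending = deque(range(num_tiles))
    rendered: Dict[str, List[int]] = {name: [] for name in tile_cost}
    # Priority queue of (time at which the slave becomes idle, slave name).
    idle_at = [(0.0, name) for name in sorted(tile_cost)]
    heapq.heapify(idle_at)
    while pending:
        # The next slave to become idle requests the next tile...
        now, name = heapq.heappop(idle_at)
        tile = pending.popleft()
        rendered[name].append(tile)
        # ...and finishes it entirely before it can ask for another one.
        heapq.heappush(idle_at, (now + tile_cost[name], name))
    return rendered
```

Because tiles are handed out on request, a slave that is three times faster simply comes back for work three times as often, so load balancing falls out of the scheme without any up-front assignment.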

For progressive render, as used in the viewport or optionally for final render, there is a more complex problem. Currently each tile is statically assigned to some device. This avoids copying (somewhat heavy) float buffer data between devices. Such static assignment isn't ideal though: it already gives suboptimal load balancing for GPU devices, and it's a poor fit for network render as well.
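
To get a feel for the cost of static assignment, the following sketch compares the finish time of an even up-front split with the lower bound for a perfectly balanced split. The functions and the device speeds are hypothetical and only serve as arithmetic illustration; buffer copying cost is ignored.

```python
from typing import List


def static_finish_time(num_tiles: int, speeds: List[float]) -> float:
    """Time until the slowest device finishes an even, up-front split.

    `speeds` holds hypothetical per-device throughputs in tiles per second.
    """
    devices = len(speeds)
    # Hand out tiles evenly, so per-device counts differ by at most one.
    counts = [num_tiles // devices + (1 if i < num_tiles % devices else 0)
              for i in range(devices)]
    return max(count / speed for count, speed in zip(counts, speeds))


def balanced_finish_time(num_tiles: int, speeds: List[float]) -> float:
    """Lower bound if work could be split in proportion to device speed."""
    return num_tiles / sum(speeds)
```

For 16 tiles on two devices with speeds 4 and 1, the static split takes 8 seconds because the slow device holds everything up, while a proportional split could finish in 3.2 seconds.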

Some generic solution that works for both multi-GPU and network render would be ideal, since the two problems are similar in a way. You don't want to be copying that float render buffer around for each individual sample, but at some point you do need to be able to hand things over to another device if there is an imbalance. I don't have a good design for this yet.