From BlenderWiki

Note: This is an archived version of the Blender Developer Wiki. The current and active wiki is available on wiki.blender.org.

Data Handling in the Dependency Graph

Current Implementation Overview

The depsgraph is primarily a system for managing operations, i.e. function calls. These functions ultimately update data in the DNA, which exists before and after the depsgraph evaluation, independent of the lifetime of the graph representation itself. For instance:

  • loc/rot/scale fields in the Object struct and the obmat transform matrix are used by the local and parent transform hierarchy, constraints and rigid bodies
  • Mesh data is the input for modifier stack evaluation; derivedDeform/derivedFinal are input and output for modifier stacks
  • bone matrices are calculated from the pose hierarchy and the rest pose for IK and other rigging constraints

For complex operations, however, the DNA can only meaningfully store the final input/output data. Beyond this general input/output, the individual parts of an operation may produce "transient" data, which is needed only as input for the next step and doesn't have to be (and shouldn't be) stored beyond that. Examples are:

  1. DerivedMesh instances being passed from one modifier to the next. Eventually these may end up in the DNA as the "final" versions, but generally a DM produced as the result of modifier N can be discarded after modifier N+1 has executed.
  2. The result of a hierarchical transform is calculated successively from the base matrix by applying parent transforms or constraints.

In the current implementation of the depsgraph, such transient data is easily handled using local function variables. In the two examples above:

  1. locmat is the base local transform, passed on downstream for parenting and constraints: solve_parenting
  2. dm is a single DerivedMesh instance, holding the return value of one modifier for passing it on to the next: mesh_calc_modifiers
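
The local-variable pattern can be sketched in a few lines; DerivedMesh, Modifier and apply_modifier_stack below are simplified stand-ins for illustration, not Blender's actual types:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-ins for Blender's types, for illustration only.
struct DerivedMesh { int verts; };

using Modifier = std::function<DerivedMesh (const DerivedMesh &)>;

// Monolithic evaluation in the style of mesh_calc_modifiers: the
// intermediate result lives in a local variable and is implicitly
// discarded as soon as the next modifier has consumed it --
// only the final DerivedMesh survives.
DerivedMesh apply_modifier_stack(DerivedMesh dm, const std::vector<Modifier> &stack)
{
    for (const Modifier &md : stack)
        dm = md(dm); // transient result replaces the previous one
    return dm;
}
```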


The Challenge

The problem appears once we start breaking the large operational blocks into smaller pieces (for reasons beyond the scope of this document, see #Recommended Reading).

  1. Local Variables:
    While in a monolithic function using a local variable is a natural thing to do, there is no direct common equivalent for such simple stack-stored variables when issuing independent function calls wrapped as depsgraph operation nodes.
  2. Multiple Concurrent Instances:
    In the future, systems like modifiers and constraints may become node-based, which means more potential cases of multiple operations running in parallel that are associated with the same DNA data. Storing such transient data in the DNA then becomes impossible (it is already frowned upon, but sometimes used as a simple workaround).
  3. Dynamic Scheduling:
    The random nature of multithreaded dynamic scheduling makes it impossible to define start/end points of the lifetime of transient data in advance. It has to be created dynamically when needed and released again as soon as possible.
  4. Evaluation Contexts:
    Depsgraph can be evaluated for different purposes, such as viewport/render/baking. It would be prudent for the design to allow passing data to places other than DNA structs, e.g. render engine databases.


Requirements

Taking the above issues into account, let's define a set of requirements for the depsgraph:

  • Each operation can produce output data.
    For the sake of simplicity this must be defined as a single type (multiple outputs can be combined in a struct).
  • Each operation can have "data dependencies"
    These define which results of other operations (and possibly DNA input data) an operation requires access to.
    • Before an operation can be scheduled, all its data dependencies must be ready.
    • Once all data dependents have finished, a node's result data can be freed.
    NOTE: unlike regular dependencies, data dependencies are not transitive. This means that a data dependency cannot simply be ignored if a dependency to a downstream node exists (see Transitive Reduction Algorithm).
  • Additional data dependencies can be defined to the outside world.
    • In the input case data is expected to exist right from the start (not as result of an operation).
    • In the output case data is retained even after operations have finished using it (e.g. derivedFinal modifier output).
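
A minimal sketch of the per-node bookkeeping these requirements imply (all names hypothetical; none of these exist in Blender):

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of an operation node with data dependencies.
struct OperationNode {
    void *result = nullptr;                      // single output value
    int result_users = 0;                        // remaining data dependents
    std::vector<OperationNode *> data_inlinks;   // results this node reads
    std::vector<OperationNode *> data_outlinks;  // nodes reading this result
    bool retained = false;                       // external output: keep after eval
};

// Requirement: a node may only be scheduled once every one of its data
// dependencies has produced its result.
bool ready_to_schedule(const OperationNode &node)
{
    for (const OperationNode *dep : node.data_inlinks)
        if (dep->result == nullptr)
            return false;
    return true;
}
```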

Also note what is not defined in these requirements:

  • The depsgraph is not responsible for actual data storage. This should be handled by other subsystems and can depend very much on the type of data (e.g. matrices vs. DerivedMesh, using standard heap alloc vs. mempools etc.).


Implementation Details

Delayed Argument Passing

Executing a function via operation nodes and a scheduler means separating the definition of the function's arguments and return value from its actual execution. All arguments to the function must be wrapped ("bound") in one way or another, so the scheduler can call the function at a later time without compile-time knowledge of all the argument types and values.

Classic C-style solution: define an argument struct type with all the arguments in it. Pass it to the function as a generic void pointer and cast explicitly for accessing the arguments.

GSOC 2013 branch
The original implementation in the GSOC branch used a generic PointerRNA for passing operation arguments, in combination with an (unfinished) "context type" system. This can work in some cases, but it's not sufficient for multithreading, so shouldn't be used as a guideline.


typedef struct ModifierEvalArgs {
    struct Object *ob;
    struct ModifierData *md;
    struct DerivedMesh *dm;
    [...]
} ModifierEvalArgs;
 
// eval function for some modifier
// note the generic function signature, so it can be used as a callback
void *eval_modifier(void *vargs)
{
    ModifierEvalArgs *args = vargs;
 
    return do_stuff(args->ob, args->md, args->dm); // returns a DerivedMesh instance
}
 
// when building the depsgraph:
ModifierEvalArgs *args = MEM_callocN(sizeof(ModifierEvalArgs), "eval_modifier args");
args->ob = ob;
args->md = md;
args->dm = NULL; // XXX this needs to be defined at scheduling time, see below
 
add_operation(eval_modifier, args, [...]);

This simple code lacks some important parts:

  1. When executing the node it has to create an actual DerivedMesh instance to which the result is written.
    DerivedMesh data itself is allocated inside modifier functions already, because only the modifiers themselves know the exact vertex/edge/face counts to specify the DerivedMesh dimensions. What the depsgraph has to handle though is passing DerivedMesh pointers to the next modifier.
  2. After the next modifier in the stack (the only dependent) is finished, the result dm should be released and freed again.

Constant arguments can be defined at build time, as happens with the modifier args above, right after allocating them. Variable arguments, such as the dm field, cannot be set at that point! This input data is only available after all the dependencies have been executed and their output data is ready (for modifiers this is typically the previous modifier in the stack, and possibly other objects' geometry). Only then can the arguments for the modifier be fully specified:

// this function copies input data pointers to arguments
// it is used as another callback in addition to the main eval function
// XXX the "index" identifier is used to keep track of which data relation defines which argument,
// this can get really complicated and is the reason C++ function binding is an attractive alternative (see below)
void modifier_copy_input_data(void *vargs, void *data, int index)
{
    ModifierEvalArgs *args = vargs;
    if (index == 0) {
        args->dm = data;
    }
    else {
        [...]
    }
}
 
// called by the scheduler when a node is ready for eval
void schedule_node(DepsNode *node)
{
    // acquire actual data for each data relation
    foreach (DataRelation *drel, node->data_inlinks) {
        // this is where the results are stored
        void *data = drel->from->result;
        // index identifies the data relation, in case a node has multiple input dependencies
        int index = drel->index;
 
        // copy to the input arguments for the operation
        // uses modifier_copy_input_data callback in the example modifier
        node->copy_input_data(node->args, data, index);
    }
}

This is roughly what happens during node evaluation:

// worker thread function
void thread_run([...])
{
    [...]
 
    // args has been defined in 2 steps for constants/variables at build/schedule time
    void *result = node->evaluate(node->args);
 
    // store the result
    node->result = result;
    // initial user count
    node->result_users = node->data_outlinks.size();
 
    // now release all the inputs
    foreach (DataRelation *drel, node->data_inlinks) {
        --drel->from->result_users;
        // this may free the data, but that's up to a modifier callback again
        if (drel->from->result_users == 0)
            drel->from->release_result(drel->from->result);
    }
 
    [...]
}
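
The user-counting part of this loop can be modelled in runnable form; Node and release_inputs below are hypothetical minimal versions of the structures sketched above:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical minimal node type for demonstrating result lifetime.
struct Node {
    void *result = nullptr;
    int result_users = 0;
    std::vector<Node *> data_inlinks;
};

// After a node has evaluated, drop one user from each input's result and
// free it once the last data dependent is done (the release_result() step).
void release_inputs(Node *node)
{
    for (Node *from : node->data_inlinks) {
        if (--from->result_users == 0) {
            std::free(from->result);
            from->result = nullptr;
        }
    }
}
```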


Using C++ function binding

The shortcomings of using C code for binding depsgraph operation arguments should be quite clear from the above pseudo-code:

  • Arguments have to be trivially copyable types (pointers mostly), otherwise we have to add more levels of indirection or extra callbacks for creating, copying and freeing an operation's argument struct.
  • void pointers only, which requires explicit casting in callbacks all the time.
  • Many bookkeeping callbacks whose expected logic has to be implemented by every future operation type (and there will be a lot of them).

A well-established solution to the problem of deferred function calls is function argument binding. This was introduced by the Boost libraries and incorporated into the C++11 standard as std::bind.

Argument binding can be used to turn an arbitrary function into a closure with a minimal signature, which stores all the argument values and calls the actual function using these arguments. This makes closures ideally suited for scheduled threaded task execution.
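
As a minimal illustration of the idea, independent of any depsgraph specifics, binding reduces an arbitrary call to a nullary closure:

```cpp
#include <cassert>
#include <functional>

// Any function with arguments...
int add(int a, int b) { return a + b; }

// ...can be wrapped into a zero-argument closure. The bound argument
// values are stored inside the std::function object, so a scheduler can
// invoke the call later without knowing the original signature.
std::function<int ()> make_task()
{
    return std::bind(add, 2, 3);
}
```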

Using std::bind, the example would look something like this:

DerivedMesh *do_stuff(Object *ob, Modifier *md, DerivedMesh *input_dm) { ... }
 
// std::bind returns a simple void() closure here
// note: input_dm is not actually valid here yet, see below
add_operation(std::bind(do_stuff, ob, md, input_dm), [...]);

There is again the same issue with delayed input variable evaluation here: "input_dm" is not a valid argument (NULL) until scheduling time, when the appropriate input operation has been executed. Luckily the C++ template syntax has some neat tricks in store to make such input variables as painless as possible.

There would be a specialized "bind_operation" function, which passes the actual arguments to the evaluation function:

// type of the bound evaluation function
typedef std::function<void()> EvalFunc;
 
// only build-time constants specified here,
// operation-type arguments will use the operation's result automatically
EvalFunc evalfunc = bind_operation(do_stuff, ob, md, prev_mod_operation);
add_operation(evalfunc, [...]);

"evalfunc" would then look like this (not explicitly defined, just generated using function templates):

void evalfunc()
{
    // acquire the actual input_dm value from other nodes
    DerivedMesh *input_dm = input_dm_node->result; // XXX how to do sensible type casting here?
    // call actual eval function
    DerivedMesh *dm = do_stuff(ob, md, input_dm);
    // store result
    node->result = dm;
}
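
One possible way to generate such a function with templates is sketched below. OpNode, FromNode, from_node and the argument-resolution scheme are invented for illustration, and the sketch glosses over the type-casting question noted above by pushing the cast into one place:

```cpp
#include <cassert>
#include <functional>

// Hypothetical minimal node type holding an operation's result.
struct OpNode { void *result = nullptr; };

// Marker wrapping a dependency whose result becomes an argument.
template <typename T> struct FromNode { OpNode *node; };
template <typename T> FromNode<T> from_node(OpNode *n) { return FromNode<T>{n}; }

// Constants resolve to themselves; node arguments are read at call time,
// i.e. after the dependency has been evaluated.
template <typename T> T resolve(T value) { return value; }
template <typename T> T resolve(FromNode<T> arg) { return static_cast<T>(arg.node->result); }

// bind_operation: store constants now, look up node results later, and
// write the return value into the destination node's result slot.
template <typename R, typename... Params, typename... Args>
std::function<void ()> bind_operation(OpNode *dst, R (*fn)(Params...), Args... args)
{
    return [=]() { dst->result = fn(resolve(args)...); };
}

// Example eval function (demonstration only): doubles the vertex count.
struct DerivedMesh { int verts; };
DerivedMesh *double_verts(DerivedMesh *in)
{
    static DerivedMesh out;
    out.verts = in->verts * 2;
    return &out;
}
```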

Scheduling the eval function then needs no special code; the variable arguments are copied when the eval function is called:

void thread_run([...])
{
    [...]
 
    // is an anonymous closure now, all argument and return values are handled internally
    node->evaluate();
    // initial user count (could also be handled internally)
    node->result_users = node->data_outlinks.size();
 
    // now release all the inputs
    foreach (DataRelation *drel, node->data_inlinks) {
        --drel->from->result_users;
        // release function is also a closure now, makes type handling easier
        if (drel->from->result_users == 0)
            drel->from->release_result();
    }
 
    [...]
}

In summary, the main advantages of C++ function binding:

  • Less bloated code without noisy argument structs for every single operation (sensible use of templates).
  • Moves responsibility for bookkeeping, user counting, data management out of individual operations and into a single piece of common code (bind_operation).
    This is especially important considering the expected large number of different operations. Keeping these as simple as possible helps in the long run.
  • Cleaner type handling, since types are known when binding the function, so there is no need for explicit void pointer casting.


Recommended Reading

Original design docs by Joshua Leung (Aligorith) for his GSOC 2013 project: