Note: This is an archived version of the Blender Developer Wiki. The current and active wiki is available on wiki.blender.org.

Threaded Dependency Graph: Weekly progress

Week 1: 17th-23rd June

On the code side only one thing got done: removed static and global variables from the metaball and curve code, which is essential for any way of dealing with a threaded depsgraph.
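To illustrate why statics get in the way, here is a minimal sketch (not the actual Blender code; `ItemIter` and its functions are hypothetical names) of an iterator whose cursor lives in a caller-owned struct instead of a static variable. Each thread can then own its own state and nothing is shared by accident.

```c
#include <stddef.h>

/* Hypothetical example, not Blender code: iteration state is kept in a
 * caller-owned struct rather than in static variables, so two threads
 * (or two nested loops) never share a cursor. */
typedef struct ItemIter {
    const int *items; /* array being walked */
    size_t count;     /* number of items */
    size_t pos;       /* next index to return */
} ItemIter;

void item_iter_init(ItemIter *it, const int *items, size_t count)
{
    it->items = items;
    it->count = count;
    it->pos = 0;
}

/* Writes the next item to *r_item and returns 1, or returns 0 when done. */
int item_iter_next(ItemIter *it, int *r_item)
{
    if (it->pos >= it->count) {
        return 0;
    }
    *r_item = it->items[it->pos++];
    return 1;
}
```

Two iterators over the same array advance independently, which is exactly what a function-level static cursor cannot offer.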

Then I ran into lots of design limitations which weren't that difficult to solve for the current "feature set", but which were expected to be solved in a way that keeps the ultimate feature set possible.

This ended up in lots of discussions on IRC, which just left me completely unsure what to actually do.

Here's a wiki page with the current proposal from Ton, with pros/cons and possible ways to solve the cons: link.

Still not sure "copy everything" is a nice idea; it needs more technical specifics about the approach, not just generalish discussions of how easy it is.

Next week. Some things came up in the design discussions which would be nice to implement anyway (namely locked rendering and a threaded modifier stack).

And I hope we'll finally find an ultimate way to go, which wouldn't smell like a dead end from any angle.

Questions. Basically, only one question: which way to go :) Will figure this out with Campbell and Ton next week.

Week 2: 24th-30th June

On the code side, this week I did:

  • Experimental feature to lock the interface while rendering, which locks the whole interface, forbids changing any data and prevents the viewport from running into conflict with the render thread.
Also implemented a small feature which allows panning/zooming in the image editor while the interface is locked.
  • Worked on running object_handle_update from multiple threads.
Basis code is here; it needs to be cleaned up a bit, but it works in general. However, this change exposed some more areas which are not safe for threading; read about this in the plans for next week.

From the documentation point of view, not so much was done, actually. Had some discussions with Campbell about how to keep memory usage sane when copying the whole scene graph, but there are still some black holes in the design, I'd say.

Next week. Will work on making these areas thread-safe:

  • Curves, which are likely still using some pointers stored in the Curve datablock, which makes it unsafe to create a displist from multiple threads when a curve is used by multiple objects.
  • Virtual modifier list, which seems to be the smallest change for next week.
  • Look into the armature modifier and hope to stop storing runtime data in the armature itself, so the armature modifier becomes safe for threading.
  • And hope to finish the design document for safe and nice local graphs.

Questions. No new questions, actually.

Week 3: 1st-7th July

On the code side, this week I did:

  • Virtual modifier list is now nicely thread-safe.
  • Committed the task scheduler ported to C, which was originally written by Brecht. It required some bug fixing, which was time-consuming (threading issues are never easy to troubleshoot).
  • Replaced stupid static balancing with a task-based one.
  • Made curves almost safe for threading.
  • Enabled threaded update by default.
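The difference between static and task-based balancing can be sketched roughly as follows. This is only an illustration under assumptions, not the scheduler ported from Brecht's code: `TaskQueue` and its functions are hypothetical names, and a real scheduler also manages worker threads. Instead of giving each thread a fixed slice of objects up front, every worker claims the next unclaimed task when it becomes free, so one slow object does not stall a whole precomputed slice.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared queue of task indices 0..total-1. */
typedef struct TaskQueue {
    atomic_size_t next; /* index of the next unclaimed task */
    size_t total;       /* number of tasks in the queue */
} TaskQueue;

void task_queue_init(TaskQueue *q, size_t total)
{
    atomic_init(&q->next, 0);
    q->total = total;
}

/* Called by any worker thread when it is idle.
 * Returns 1 and stores a task index that no other caller received,
 * or returns 0 when all tasks have been handed out. */
int task_queue_claim(TaskQueue *q, size_t *r_index)
{
    size_t index = atomic_fetch_add(&q->next, 1);
    if (index >= q->total) {
        return 0;
    }
    *r_index = index;
    return 1;
}
```

Each worker would loop on `task_queue_claim()` until it returns 0; fast workers simply end up claiming more tasks than slow ones.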

Smaller changes:

  • Always use ob->bb when drawing the curve types, which solved the nasty ob->bb ? ob->bb : cu->bb stuff.
  • Got rid of the display list stored in the Curve datablock (needed for thread safety).
  • Moved bevel list and path from the Curve to the Object datablock.
  • Avoid deformation of the actual curve's splines when applying the modifier stack on a curve.

From the documentation point of view, was refining http://wiki.blender.org/index.php/User:Nazg-gul/GSoC-2013#Design

Next week.

  • Continue making objects safe for threaded update (the object-data level bb drives me crazy)
  • Would need to figure out how to deal with dupligroups (currently there's some nasty code, which doesn't seem to be needed), Joshua's feedback would be needed here. We'll talk on IRC.
  • Look into granular update (maybe not actually do the code, but at least check how much we'll need to change to support this).
  • If I still have time, will look into adding a local graph for the renderer.

Questions. Nothing to be mentioned in this report.

Week 4: 8th-14th July

Most of the time was spent on solving the crappiness around the object-data level bounding box. The issue is caused by the fact that object update modifies object data, which is bad. Also, for curves the bounding box was calculated in a way that is not acceptable for threaded update.

  • Workaround for a crash caused by threaded dupligroup update (needed for now, the real fix will come within a few days)
  • Removed unused bounding box from MetaBall
  • Tag object-data level boundbox as invalid rather than freeing it
  • Solved crash with threaded update of font objects
  • Moved curve's boundbox and texspace calculation out of modifier stack
  • Added an operator to match texture space to the object's bounding box (needed at least for now, we'll figure out which tools are the best to have after the changes made to the curve's bounding box)
  • Did some tests with granular updates. Namely tested stuff like bounding box calculation for curves and meshes (used this task just as an example). Such things fit the scheduler pretty well and I'm happy with this.
  • Worked on making VBOs safe for threading. The patch is not in svn yet, it was being reviewed and I need to finish some things first.
  • Also worked on making metaballs even more thread-safe, namely was trying to drop the static variables from BKE_scene_base_iter_next. I did this, but it turned out ThreadVariables (TLS) are not available on OSX 10.6. So reverted all these changes (trunk is safe, and for the branch we'll figure out a better way to solve the issue).

Also worked on trunk a bit:

  • Fix #36042: Subdividing a cyclic spline shifts start/end points
  • Fix #36076: Metaballs as particles with particle texture (size influence) crashes Blender
  • Got rid of the global originmat matrix from object.c (smells like it'll be helpful for threaded update as well)

Next week.

  • Would need to figure out how to deal with dupligroups (currently there's some nasty code, which doesn't seem to be needed), Joshua's feedback would be needed here. We'll talk on IRC. (Yes, this one is left over from last week.)
  • If I don't finish the VBO work over the weekend, will do it next week.
  • Will look into local graphs for renderer and viewport.

Questions. Nothing to be mentioned in this report.

Week 5: 15th-21st July

This was a rather short week.

Quite some time was spent on finishing release logs, looking into some last-minute reports and so on.

As for the project, I was working on solving a weird slowdown on multi-core CPUs, which ends up something like 1.5-2x slower than it ideally could be.

The root of the issue is the guarded allocator adding some overhead (MemHead+MemTail) around each block, which leads to much lower cache efficiency when handling DeformVerts and DeformWeights.
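A rough way to see the effect: when every small allocation is wrapped in a header and a tail, only a small part of each cache line carries useful data. The sketch below uses made-up guard structs (`GuardHead` and `GuardTail` are assumptions for illustration; the real MemHead/MemTail layouts differ) to estimate which share of a guarded block is payload.

```c
#include <stddef.h>

/* Hypothetical guard structs, only for illustration; the real
 * MemHead/MemTail in the guarded allocator have different layouts. */
typedef struct GuardHead {
    size_t len;                    /* payload size */
    const char *name;              /* allocation name for leak reports */
    struct GuardHead *next, *prev; /* list of live blocks */
    int tag;                       /* marker to detect corruption */
} GuardHead;

typedef struct GuardTail {
    int tag; /* marker to detect buffer overruns */
    int pad;
} GuardTail;

/* Total bytes taken by one guarded block with the given payload. */
size_t guarded_block_size(size_t payload)
{
    return sizeof(GuardHead) + payload + sizeof(GuardTail);
}

/* Share of the block (0..1) which is actual payload. */
double guarded_payload_fraction(size_t payload)
{
    return (double)payload / (double)guarded_block_size(payload);
}
```

For tiny payloads, such as a handful of deform weights, the fraction is low, so walking many such blocks drags mostly bookkeeping bytes through the cache. For large blocks the overhead becomes negligible.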

Ideally we'll need to get rid of a bunch of small allocations. Currently looking into a patch which originally came from Joedh and Campbell, which switches DeformWeights allocations to a memory pool. The original patch doesn't give any speedup, but after some tweaks it seems it might get close enough to the performance of bare mallocs.

Didn't fully finish the patch yet, will perhaps work on it over the weekend.

Other things I did:

  • Added some timing measurements to the threaded update, which help to see what each thread was doing and for how long.
  • VBOs seem to be working fine now, and no workarounds are needed anymore.

Next week.

  • Would need to figure out how to deal with dupligroups (currently there's some nasty code, which doesn't seem to be needed), Joshua's feedback would be needed here. We'll talk in IRC.
  • Will look into local graphs for renderer and viewport.
  • And yes, this is all from previous week :(
  • Finish DeformWeights patch

Questions. We're solving issues with Campbell and Brecht on IRC and over personal mail, but there's nothing to be raised here.

Week 6: 22nd-28th July

This was an even shorter week, which was spent at SIGGRAPH.

But did some code work as well. Namely, was looking into that nasty slowdown issue. For this issue the way to go would be to keep weights cached in keyblocks. There's no big memory benefit in repeatedly allocating and freeing weight arrays for keyblocks when using threaded update (you'll hit the memory limit eventually anyway).

Also did some non-depsgraph patches, which are still useful. They still need to be cleaned up a bit, but they're not highest priority to finish.

Next week.

  • Same as was planned for the previous week (didn't do much of those plans because of the conference).

Questions. None yet.

Week 7: 29th July - 4th August

This week:

  • Added check for address being freed by mempool free (this helped troubleshooting some issues, and think this commit better be merged to trunk)
  • Added a check for whether the thread lock is being removed while a thread is using guardedalloc. This allows detecting cases when the malloc lock is removed while guardedalloc is used by the thread. Disabled by default, and better be merged to trunk as well.
  • Optimization and threading fix for shapekey weights calculation. Fixes a crash when the same mesh with shapekeys is shared between different threads. This also gives a ~1.7x speedup on test chinchilla files.
  • Use one global task scheduler for all kinds of tasks. This removes the overhead caused by launching threads, and also allows using tasks for such things as subdivision surface (as an alternative to the current OpenMP, which conflicts with threaded object update).
  • Fix typo which led to a crash when applying the lattice modifier.
  • Hack to workaround dead-lock caused by GIL when scene update is invoked from Python.

Next week.

  • This week discovered unexpectedly high CPU usage by the spin lock which is currently needed to safely update DAG children when their parent was updated. Ideally the code needs to be changed to use atomic subtract. There's a rather small implementation of this in jemalloc which we could easily re-use (it's cross-platform, and the file we need is a couple of kilobytes of code).
  • Would work on finishing the current part of the project (there are still some issues with metaballs, defweights could be optimized a bit more, and there are some other remaining issues in the code). After discussion with Campbell we decided to completely finish the current stuff before going on to the next challenges.
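The atomic subtract idea from the first item above can be sketched like this. It is a minimal illustration using C11 atomics rather than the jemalloc helpers mentioned, and `GraphNode` with its functions are hypothetical names: each child counts the parents which still need updating, and the thread which finishes the last parent learns this from the return value without taking any lock.

```c
#include <stdatomic.h>

/* Hypothetical graph node, only for illustration. */
typedef struct GraphNode {
    atomic_int pending_parents; /* parents which are not updated yet */
} GraphNode;

void graph_node_init(GraphNode *node, int num_parents)
{
    atomic_init(&node->pending_parents, num_parents);
}

/* Called by the thread which has just finished updating one parent.
 * atomic_fetch_sub() returns the value before the subtraction, so
 * exactly one caller sees 1: the one which finished the last parent.
 * That caller is the one which should schedule the child. */
int graph_node_parent_done(GraphNode *child)
{
    return atomic_fetch_sub(&child->pending_parents, 1) == 1;
}
```

Since the decrement and the "was I the last one" check happen in a single atomic operation, no thread needs to spin waiting for another one to leave a critical section.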

Questions. None yet.

Week 8: 5th - 11th August

This week I spent in San Francisco working together with Keir on the new plane tracker. We consider the project trunk-ready and it'll be merged within a few days. Some further improvements will happen for sure.

And now switching back to GSoC project!

Week 9: 11th - 18th August

This week I was mainly benchmarking Blender with files from the Tube project and optimizing the bottlenecks discovered:

  • Use atomic operations instead of a spin lock for threaded update (not an actual speedup, but keeps CPU usage low while the mutex in the task scheduler is locked)
  • Speedup for the guarded allocator (using a spin lock instead of a mutex, reshuffled code so no lock is taken until it's actually needed, use atomic operations).

Also some smaller fixes/improvements:

  • Fix crash happening in particle code caused by non-reentrant qsort()
  • Added detailed timing information output, so now it's possible to visualize the time spent on each object's update (example image, parser of the output).

This is it for depsgraph project, but also did:

  • Some bug-tracker fixes
  • Fixes for some unreported bugs
  • Merged plane track into trunk
  • Optimized compositor Image Input node
  • Tweaked MapUV and PlaneTrack nodes to make results less doggy

Next week.

  • Do some further benchmarking, there're still some cases where it seems speedup could be a bit higher.
  • Still need to solve some TODOs in existing code
  • We're starting to review commits to be included in this release (some of the changes give some speedup, others would make the final code review much easier).

Questions. None yet.

Week 10: 19th - 25th August

This week I've merged lots of stuff from the branch to trunk. The changes mainly make code safe for threading and solve some issues where object-dependent data is stored in the object-data datablock. From the user's perspective there are no functional changes, but they make the code easier to maintain.

This led to some time being spent on bug fixing. Apparently there was quite a reasonable amount of bugs which weren't noticed in the branch.

In the branch:

  • Got rid of the static variable needed for mballs traversal. It doesn't seem to be needed now, and this also makes it possible to support mballs in dupligroups (the depsgraph needs to be tweaked for this, though).
  • Dependencies for hair with weight groups weren't calculated right, which was pretty safe for trunk but led to crashes in the branch.
  • Removed some workarounds from the code. After lots of testing with Tube and Mango files they're not needed.
  • There are some occasional crashes and memory corruptions which I didn't succeed in solving. They happen with really huge files, so isolating the issue takes time.

Next week.

  • Try to solve crashes and memory corruptions mentioned above
  • Glue API with Joshua
  • Would need to work on EvaluationContext for objects update (solves some nasty parts of code, and needed to solve some bugs it seems)

Questions. None yet.

Week 11: 26th August - 1st September

Small improvements and fixes in the trunk.

For the depsgraph_mt branch:

  • Solved bugs caused by DM creation for array/boolean operands from the modifier stack (took quite a while to nail down the cause of the bug I've experienced)
  • Did some smaller fixes as well
  • Rest of the time was spent on EvaluationContext and derivedRender things being discussed with Brecht (initially it's needed to solve report #36474, but the way we're solving it is flexible enough to support dupli-group local time and so on).

Next week. Mainly focus on solving TODOs from the EvaluationContext patch (which are a lot).

Questions. None yet.

Week 12: 2nd - 8th September

Some fixes in the trunk, such as:

  • Fix crash when adjusting plane track after re-tracking point tracks
  • Color managed color didn't work properly for float sequencer frames.
  • Fix T36124: VSE - Input Color option does not work for video files
  • Fix T36587: Tracking markers fail to track near the left and right edge of a movie clip.

Depsgraph branch changes:

  • Use special flags in DagNode for tags and scheduled instead of trying to use color for this.
This way it's possible to have both tag and scheduler working at the same time.
Maybe not too much needed long-term wise, but needed this to unlock some upcoming changes.
  • Simplified DAG threaded traversal code, now it's easy to be re-used in other places as well.
This is helpful for derivedRender calculation in convertblender.
  • Set scenes shall be handled properly in convertblender.
  • Creating new task pool now ensures malloc is switched to thread-safe one.
Easier to re-use pools in lots of places.
  • Create derivedRender in threads in the same way as it's done for object update.
In really quick tests gave around 2x speedup of database_init_objects when having 130 suzannes with subsurf level of 3.
  • Added check for last datamask used for derivedRender calculation.
  • Objects will now restore their derived caches nicely after rendering in locked UI (was a small regression in the branch since last week).
  • Shrinkwrap modifier will use the proper derived mesh in render mode.
  • Initial support of render mode for constraints.
Some constraints depend on derivedMesh, which failed for objects which had different viewport/render settings. Constraints can now deal with this, but more global changes are needed to make it fully working. See below.
  • Smoke modifier is now expected to use proper render flags
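The first item in the list above (separate flags instead of re-using the node color) can be sketched as follows. This is a minimal illustration under assumptions: `SketchNode`, the flag names and the helper functions are hypothetical and not the actual DagNode definition. With one bit per state, tagging and scheduling no longer overwrite each other the way two meanings squeezed into a single color value would.

```c
/* Hypothetical flag bits, only for illustration. */
enum {
    SKETCH_NODE_TAGGED = (1 << 0),    /* node needs an update */
    SKETCH_NODE_SCHEDULED = (1 << 1), /* node was handed to the scheduler */
};

/* Hypothetical node: color stays reserved for graph traversal,
 * while update state lives in its own bitfield. */
typedef struct SketchNode {
    int color;
    unsigned int flag;
} SketchNode;

void sketch_node_set(SketchNode *node, unsigned int bits)
{
    node->flag |= bits;
}

void sketch_node_clear(SketchNode *node, unsigned int bits)
{
    node->flag &= ~bits;
}

/* Returns 1 when all of the given bits are set. */
int sketch_node_test(const SketchNode *node, unsigned int bits)
{
    return (node->flag & bits) == bits;
}
```

A node can be tagged and scheduled at the same time, and clearing one state leaves the other and the traversal color untouched.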

The issue with constraints is caused by the fact that viewport and render are sharing the same ob->obmat, which leads to conflicts: once render thread re-evaluated constraints for render purposes, viewport location is not correct anymore.

Next week.

Next week i'm planning to focus on solving viewport/render conflict mentioned above.

Questions. None yet.

Week 13: 9th - 15th September

From code side there were changes all over the place:

  • Bug fixes in motion tracking area.
  • Film response curves implemented as looks.
  • Enable vertex snapping to bundle positions.
  • Create/delete keyframe for motion tracks in clip editor.
  • Tweaks to plane track after getting artists' feedback.
  • Lock-free memory overhead-less guarded allocator is finally in the svn (in my branch, not in trunk yet).
  • Fix T36701: Mask pivioting doesnt honor parenting

Some of the changes were needed to make current tracking ready for the release, some of them were just general improvements.

As for the threaded dependency graph, spent some time writing documentation, see here. Did some smaller speed improvements to subsurf code.

Most of the time was spent on the local graphs and data/state separation. For now the conclusion is: copy-on-write is the way to go. The documentation is not finished; will finish it and send it to all related developers for final review.
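For readers not familiar with the term, the general copy-on-write technique looks roughly like the sketch below. It only illustrates the generic idea under assumptions, not the proposal itself: `SharedData` and its functions are hypothetical names, and the plain integer user count is not thread-safe (a real implementation would need atomics or locking). Data is shared between users for as long as nobody modifies it, and a private copy is made only at the moment someone wants to write.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared block of floats with a user count. */
typedef struct SharedData {
    int users;     /* how many owners reference this block */
    size_t len;    /* number of values */
    float *values; /* the actual data */
} SharedData;

SharedData *shared_new(size_t len)
{
    SharedData *data = malloc(sizeof(SharedData));
    if (data == NULL) {
        return NULL;
    }
    data->values = calloc(len ? len : 1, sizeof(float));
    if (data->values == NULL) {
        free(data);
        return NULL;
    }
    data->users = 1;
    data->len = len;
    return data;
}

/* Share the block with one more user; nothing is copied. */
SharedData *shared_ref(SharedData *data)
{
    data->users++;
    return data;
}

void shared_unref(SharedData *data)
{
    if (--data->users == 0) {
        free(data->values);
        free(data);
    }
}

/* Returns a block which is safe to modify: the same block when the
 * caller is the only user, otherwise a private copy (the caller's
 * reference to the shared block is dropped). NULL on out of memory. */
SharedData *shared_make_writable(SharedData *data)
{
    SharedData *copy;
    if (data->users == 1) {
        return data;
    }
    copy = shared_new(data->len);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy->values, data->values, data->len * sizeof(float));
    data->users--;
    return copy;
}
```

Readers keep sharing a single block, so memory only grows for the data which actually gets changed.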

Next week.

Next week is for finishing the COW proposal, wrapping up the code and finishing the documentation.

Questions.

None at this point, had loads of help from Brecht brainstorming local graph ideas already :)

Week 14: 15th - 22nd September

Was only a bit of code cleanup in the branch. Only made sure http://wiki.blender.org/index.php/User:Nazg-gul/GSoC-2013 is up-to-date, and was working on Copy-on-Write proposal: http://wiki.blender.org/index.php/User:Nazg-gul/GSoC-2013/LocalGraphs

Unfortunately, didn't manage to cover all cases yet.

General development:

  • Fix muted footage in MCE still reading the frames from disk
  • Re-track the plane after clearing the keyframe
  • Fix T36747: Curve bevel and extrude issue
  • Fix T36718: Wrong lighting on text objects
  • Fix for displacement bake buffer which might be allocated twice
  • Fix for margin which didn't work properly with normalized displacement baking
  • Clear color to gray when baking displacement map
  • Images didn't get cleared when using the multires baker from a python script

Next week.

Guess only wrapping up and preparing for code submission and evaluation?

Questions.

Nothing.