User:Jaguarandi/SummerOfCode2009/WeeklyReports


Note: This is an archived version of the Blender Developer Wiki. The current and active wiki is available on wiki.blender.org.

Week 1 report

This week

Black dots - bug fix.

The simplest reproducible case was: http://andresp.no-ip.org/soc2009/blackdots.png <= you can see black dots on the edges connecting quad faces

The bug fixed: http://andresp.no-ip.org/soc2009/blackdots_fix.png <= this scene previously gave many black dots.

I still have doubts about the fix, as the old code seemed more coherent with its comments and with the rest of the old code.

It took quite a while to fix the bug, as the code reported no self-intersection, yet it was detecting intersections on the inner edge of a quad face. So I ruled out many parts of the code as the cause. After many, many hours I found that makeraytree already receives triangle faces, so the bug was probably about edges between faces.

Link bvhkdop with raytrace

Done. It gives the same results as the octree on simple cases (image diff == 0), but something looks wrong on more complex scenes.

Next week

See what's wrong with the BVH tree
Do instance support (that means adding a raytrace object that "transforms" between world space and local space).

Schedule

Still ahead of schedule, considering the head start I had before SoC began

Week 2 report

This week

tried to find why the new code was giving nice results:

http://www.zanqdo.com/tmp/Jaguarandi.png http://www.zanqdo.com/tmp/Trunk.png but couldn't find why. It's quite hard to trace/debug rays.

Mirror rays work now, on both BVH and octree

(After a few bug fixes they render nicely; thanks ZanQdo for the fish scene.) http://andresp.no-ip.org/soc2009/fish3.png http://andresp.no-ip.org/soc2009/shit3.png

instance code done

It encapsulates a rayobject and performs a space transformation to convert between local and global coordinates.
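A minimal sketch of the idea in Python, not the branch's C code. The names `Ray`, `UnitSphere` and `Instance` are hypothetical, and the world-to-local transform is assumed to be supplied as a precomputed 3x4 affine matrix. The wrapper moves the ray into the local space of the wrapped object and forwards the query, so one object can be shared by many instances.

```python
from dataclasses import dataclass


@dataclass
class Ray:
    origin: tuple      # (x, y, z)
    direction: tuple   # (x, y, z), not required to be normalized


class UnitSphere:
    """Stand-in for a wrapped rayobject: unit sphere at the local origin."""

    def intersect(self, ray):
        # Solve |o + t*d|^2 = 1 and accept any hit with t >= 0.
        ox, oy, oz = ray.origin
        dx, dy, dz = ray.direction
        a = dx * dx + dy * dy + dz * dz
        b = 2.0 * (ox * dx + oy * dy + oz * dz)
        c = ox * ox + oy * oy + oz * oz - 1.0
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return False
        t_far = (-b + disc ** 0.5) / (2.0 * a)
        return t_far >= 0.0


class Instance:
    """Wraps another ray object and converts rays from world to local space."""

    def __init__(self, target, world_to_local):
        self.target = target
        self.world_to_local = world_to_local  # 3 rows of 4 floats (affine)

    def _apply(self, v, w):
        # w = 1.0 transforms a point, w = 0.0 transforms a direction.
        return tuple(
            r[0] * v[0] + r[1] * v[1] + r[2] * v[2] + r[3] * w
            for r in self.world_to_local
        )

    def intersect(self, ray):
        local_ray = Ray(self._apply(ray.origin, 1.0),
                        self._apply(ray.direction, 0.0))
        return self.target.intersect(local_ray)


# A sphere instanced at world position (5, 0, 0): world-to-local is a
# translation by (-5, 0, 0).
moved = Instance(UnitSphere(), [(1, 0, 0, -5), (0, 1, 0, 0), (0, 0, 1, 0)])
hit = moved.intersect(Ray((0, 0, 0), (1, 0, 0)))    # ray along +X
miss = moved.intersect(Ray((0, 0, 0), (0, 1, 0)))   # ray along +Y
```

Leaving the direction un-normalized after the transform keeps the ray parameter t identical in both spaces, so a hit distance found in local space would not need to be converted back.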

Next week

Plug instance rayobjects into dupli(verts/faces)

I was thinking that instance support could be done at Mesh level (alt+d), although since each mesh instance can be different because of modifiers, it seems it will only be supported for dupliverts. (I hope this doesn't have negative effects.)

Try hierarchic raytrees (needed for instances). I guess some code still needs to be fixed for this to work nicely.

I will probably create a BVH of objects in which each node is an octree/instance.

Test other types of rays, meaning AO and transparent/soft shadows (currently they are disabled)

Schedule

Still quite busy with school. It's taking more time than I expected, but I am still on schedule.

This initial part of adapting the ray structure is taking more time than expected, mostly because of the large amount of time spent finding bugs.

I am still gaining experience in how to trace render bugs.
I hope to catch most bugs in this initial part before moving on, since for now the code still "looks like" the original code, so it's useful to debug by comparison.

Week 3 report

This week

Hierarchic tree objects are done

  the octree was adapted to behave well in this setup (the old code was "destructive")
  current branch (r20823) is using a BVH of BVHs

Instance support is done. It currently works at the level of "obi->flag & R_TRANSFORMED", which means it is supported for things like dupliverts and duplifaces.

http://andresp.no-ip.org/soc2009/dupliverts3.png

Implemented a faster ray-BB (ray / bounding box) test. That alone was enough to give the BVH better render times than the octree on the few scenes I tested.
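The report does not say which ray-BB method was used. As an illustration only, here is the common "slab" test with a precomputed inverse direction, which replaces the per-box divisions with multiplications. The function names and argument layout are assumptions made for this sketch.

```python
import math


def inverse_direction(direction):
    """Precompute 1/d per axis once per ray; a zero component maps to +inf."""
    return tuple(1.0 / d if d != 0.0 else math.inf for d in direction)


def ray_hits_box(origin, inv_dir, box_min, box_max, t_max=math.inf):
    """Slab test: intersect the ray with the three pairs of axis planes.

    Edge case not handled here: an origin lying exactly on a slab plane
    while the direction component on that axis is zero (0 * inf is NaN).
    """
    t_near, t_far = 0.0, t_max
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t0 = (lo - o) * inv
        t1 = (hi - o) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False  # the slab intervals no longer overlap
    return True


inv = inverse_direction((1.0, 0.0, 0.0))
in_front = ray_hits_box((0, 0, 0), inv, (1, -1, -1), (2, 1, 1))
off_axis = ray_hits_box((0, 0, 0), inv, (1, 2, -1), (2, 3, 1))
too_far = ray_hits_box((0, 0, 0), inv, (1, -1, -1), (2, 1, 1), t_max=0.5)
```

Since the inverse direction is computed once per ray and reused for every node visited, the saving grows with the number of boxes tested per ray.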

Some results obtained by community (blender community is really nice ^__^) can be seen at http://blenderartists.org/forum/showthread.php?t=157463

I have been reading raytrace documentation, papers, forums etc..

Next week

test the remaining asserts of disabled features...

   ray_trace_shadow_tra
   ray_trace_shadow_rad
   ray_ao_qmc
   ray_ao_spheresamp
   ray_shadow_jitter
   ray_shadow
   ray_translucent

It would be nice to have some small scenes to test those features...

Start thinking about how to implement the next step of the SoC project: getting a "test-framework"

Questions/Issues

I was planning to add support at Mesh level (alt+d), but that is outside the scope of this SoC. Currently the reason for not supporting it is that modifiers are object-level entities. Implementing a modifier-stack sanity check would be just a quick hack, as the real solution would be to implement something like "Mesh-level modifiers". This was discussed with ZanQdo and broken.

Schedule

I finished my school assignments; now only 3 exams are left. I hope to be able to increase the time available for SoC during the next weeks.

Week 4 report

This week

Enabled the remaining render code and got no bug reports on it (except a phantom one that only showed up with strange compiler optimization flags, so it was ignored on the assumption that it was a compiler problem).

Discussed the render test system with some members of the community and, considering the objectives of such a system, I started coding it. Objectives:

 * make it easier to run render tests
 * other people can run tests and send results, which can be used for comparison
 * easy to add test cases
 * easy to add builds
 * test runs must be aware of the machine they run on
 * be as flexible as possible for future usage and extension

Decisions made were:

 * Render statistics such as number of rays or other counters, time spent on each part, and memory usage should be exported in image metadata (with a possible extension for stamping them on the image)
 * the test-system should be in python
 * the test-system should be directory oriented
 * the test-system should have reporting features (like html comparison tables)
 * use hashes to make sure the build and scene are still the same
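A minimal sketch of the hashing and machine-awareness points, using only the Python standard library. It is not the actual test-system code: `file_sha1`, `describe_run`, `same_inputs` and the layout of the returned dictionary are hypothetical, and SHA-1 is an arbitrary choice of hash.

```python
import hashlib
import platform
from pathlib import Path


def file_sha1(path, chunk_size=1 << 20):
    """Hash a file in chunks so large builds and scenes need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_run(build_path, scene_path):
    """Record what was rendered, with which build, and on which machine."""
    return {
        "build": {"name": Path(build_path).name, "sha1": file_sha1(build_path)},
        "scene": {"name": Path(scene_path).name, "sha1": file_sha1(scene_path)},
        "machine": {
            "node": platform.node(),
            "system": platform.system(),
            "arch": platform.machine(),
        },
    }


def same_inputs(run_a, run_b):
    """Two results are comparable only if build and scene are byte-identical."""
    return (run_a["build"]["sha1"] == run_b["build"]["sha1"]
            and run_a["scene"]["sha1"] == run_b["scene"]["sha1"])
```

Storing such a record next to each rendered image would let a report tool refuse to compare timings whose build or scene changed, and group results by machine.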

So I started developing a python app for running those. A sample run of the code developed so far can be seen at: http://andresp.no-ip.org/soc2009/btest/html/all.html

Next week

Keep working on the test system:

  * make the code better; it's very dirty for now
  * start selecting scenes to use in render tests

Start adding counters/statistics to the render code and look into metadata/stamp export

Questions

As you can see, some renders differ quite a bit between 2.5 and 2.49. I guess those differences may be related to animation? Just look at the images and you will see.

Where should I put the test-system code? Keep it in my own local repository, or somewhere on my branch?

About documentation: no, I haven't touched my wiki so far. I will try to put some updated info there.

Schedule

I am on it.

For those who may think I am wasting time on the test-system and other stuff (like instances) instead of making the raytracer faster: let me remind you that those are deliverables of my SoC proposal, and they also help create a good environment in which to improve the raytracer efficiently.

Week 5 report

This week

Gathered some scenes (very few; I was expecting more community feedback) and got another machine ready for testing. The test stuff is done.

Earlier in the week I also looked at how to generate false-color images. I used UncleZeiv's code as reference, but I get a crash when trying to enable a newly added SCE_PASS_*. Since I wasn't able to debug it, I gave up on that addition, though it could be useful. Anyway, the optimization phase starts now.

Coded a raytree-builder helper "class" to make it easier to build trees (currently supports only generic implicit tree building).

Coded a BVH tree for raycasting (based on BLI_bvh), but more generic in that it allows trying different build methods and data organizations.

Profiled a few renders: most time is spent on ray-BB intersection.

Tested the latest build with the scenes:

http://andresp.no-ip.org/soc2009/btest/html/test_real http://andresp.no-ip.org/soc2009/btest/html/test_concept

Fixed some bugs and found a big regression (it appeared after merging the changes from 2009-06-20 to 2009-07-02) on the "remembering past" scene. It seems to be related to some faces that are created and that may be "destroying" the tree, but I am not sure about it.

Next week

Find the reason for the "remembering the past" slowdown
Implement a BIH
Try different build methods (SAH, on both BVH and BIH)

Questions

None

Schedule

First 2 phases concluded.

Week 6 report

This week

I got quite a few scenes through ba.org:

http://blenderartists.org/forum/showthread.php?t=159928

You can see the latest runs at: http://andresp.no-ip.org/soc2009/btest/html/test_real http://andresp.no-ip.org/soc2009/btest/html/test_concept

The "text_rocks" and "Geko_TVC_ToolBoard_05" scenes render incorrectly or crash, but that already happens on trunk 2.5 (broken is taking care of it).

"Remembering the past" slowdown was caused by a particles bug
Coded an Object Level BIH
Implemented Object Level SAH during build
Tested a few heuristics (hint, last hint) and made some other improvements (to tree building, although it is still not N log N, and to raycasting)

Reverted the SUN & HEMI lights to have the same behaviour as trunk
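For readers unfamiliar with SAH (the surface area heuristic), here is a sketch of its textbook form for choosing where to split a set of bounding boxes along one axis. This is not the branch's build code: the function names, the cost constants and the quadratic sweep are assumptions made to keep the example short, and the boxes are assumed to have a non-zero combined surface area.

```python
def surface_area(box):
    (x0, y0, z0), (x1, y1, z1) = box
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def merge_all(boxes):
    """Bounding box of a non-empty list of (min, max) boxes."""
    lo, hi = boxes[0]
    for b_lo, b_hi in boxes[1:]:
        lo = tuple(map(min, lo, b_lo))
        hi = tuple(map(max, hi, b_hi))
    return lo, hi


def best_sah_split(boxes, axis, traversal_cost=1.0, intersect_cost=1.0):
    """Return (cost, split_index, ordered_boxes) minimizing the SAH cost.

    cost = C_trav + C_isect * (SA(L) * |L| + SA(R) * |R|) / SA(parent)
    """
    if len(boxes) < 2:
        raise ValueError("need at least two boxes to split")
    # Order by centroid along the axis (sum of min and max is 2 * centroid).
    ordered = sorted(boxes, key=lambda b: b[0][axis] + b[1][axis])
    parent_area = surface_area(merge_all(ordered))
    best = None
    for i in range(1, len(ordered)):
        left, right = ordered[:i], ordered[i:]
        cost = traversal_cost + intersect_cost * (
            surface_area(merge_all(left)) * len(left)
            + surface_area(merge_all(right)) * len(right)
        ) / parent_area
        if best is None or cost < best[0]:
            best = (cost, i)
    return best[0], best[1], ordered


def cube(x):
    return ((x, 0.0, 0.0), (x + 1.0, 1.0, 1.0))


# Two clusters of unit cubes far apart on X: the cheapest split separates them.
cost, index, ordered = best_sah_split([cube(0), cube(1), cube(10), cube(11)], axis=0)
```

Recomputing the merged boxes for every candidate makes this sweep quadratic; builders that aim for N log N usually accumulate prefix and suffix bounds instead.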

Next week

Keep working on data structures
Look into SSE
Fix bugs

Questions

I was told gcc auto-vectorization isn't that great and that I would probably need to write explicit SSE (SIMD) code. Any pointers on this and on how to integrate SSE into blender are welcome.

For example: would it be possible to switch between code paths at runtime? (That would mean compiling part of the code with SSE enabled and deciding at runtime which version to execute.) Or would it be possible to rely on auto-vectorization and distribute SSE binaries?

Schedule

I would say a nice level of optimization has already been reached. All that's left is more optimization and bug fixing!

As I said before, I will have a summer break from 20 July to 31 July, so this is the last full week before that.

Week 7 report

This week

This week I mainly worked on data structures, read papers and explored some experimental stuff.

 * added submodule bf_render_raytrace (C++), to be able to write C++ for the raytracer. The code, although C++, is very C style (C++ is mainly used for templates and std algorithms).

 * coded a variable-way BVH (which I called vbvh), meaning each node can have any number of children

 * coded LCTS (longest common traversal sequence) hint ability in the vbvh. For now only BB hints were coded and they proved not to be very useful (read as: there's no clear speedup, nor clear slowdown). Later this can be expanded to a cone hint, which would mainly speed up localized and spot lamps. (This type of stuff is usually used on primary rays, but blender uses the Zbuffer for that.)

 * coded some experimental stuff (read as: never saw it in any paper) to reduce the expected number of BB tests:

   1) Especially in binary trees it happens that a node is actually "useless", and so it can be removed from the tree. A node is considered useless when it probabilistically increases the expected number of BB tests. Any node with N children should have a hit probability below (N-1)/N; otherwise it's better to discard that node and "push up" its children. This pass is O(N) and should always reduce the number of BB tests per ray (if the rays are coherent with the hit probability model). E.g.: Lone ballon went from 76 BB tests / 45 BB hits to 67 BB tests / 27 BB hits.

   2) If you are not allowed to remove any nodes from a BVH tree and/or change their BBs, but you are allowed to move children around, what's the best BVH tree you can make?

   Given a possible set of parents for a node, and using the surface area model, a node should be the child of the parent with the smallest hit probability, which is the one with the smallest area (the number of parents up to the root is irrelevant).

   I guess this might be doable in linear time; I believe the current code is O(N log N) (worst case N^2).

   This also reduces the expected number of BB tests, but not by as much as the "pushups" do.

   ^^ "best tree" refers to a tree where casting full-space-search rays would yield an equal or smaller number of BB tests.

 * tested different raycast stack sizes (didn't seem to have any influence on speed :S)

 * played with a single tree (vs hierarchic trees). Single trees are, as expected, faster (atm they render the "ballon scene" 20 times faster, showing that the BVH after the build optimizations can deal well with mixed types of geometry). Build time is still N log^2 N, and that's the main reason for not rendering faster in all cases.

 * did some local experiments with SSE, but I did not have time to start implementing it.
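One way to read the (N-1)/N rule from the "useless node" experiment: if a node with N children is kept, its parent pays 1 BB test for the node plus, with probability p, N more tests for its children, so 1 + p*N expected tests; if the node is removed, the parent tests the N children directly. Keeping the node only pays off when 1 + p*N < N, that is p < (N-1)/N. The sketch below applies that rule bottom-up. It is an interpretation rather than the branch's code: the `Node` class is hypothetical, areas are assumed positive, and p is taken to be the area ratio between a node and its parent (the surface area model).

```python
class Node:
    def __init__(self, area, children=()):
        self.area = area                  # surface area of the node's BB
        self.children = list(children)    # empty list for a leaf


def pushup(node):
    """Remove 'useless' inner children of node, bottom-up, in one O(N) pass."""
    kept = []
    for child in node.children:
        pushup(child)
        n = len(child.children)
        p = child.area / node.area        # P(child hit | node hit)
        if n > 0 and p >= (n - 1) / n:
            kept.extend(child.children)   # discard child, adopt its children
        else:
            kept.append(child)
    node.children = kept
    return node


def expected_tests(node, hit_probability=1.0):
    """Expected number of BB tests below node under the surface area model."""
    total = 0.0
    for child in node.children:
        total += hit_probability          # child BB is tested when node is hit
        total += expected_tests(child, hit_probability * child.area / node.area)
    return total


# A root whose first child is almost as large as the root itself.
root = Node(100.0, [Node(90.0, [Node(10.0), Node(10.0)]), Node(10.0)])
before = expected_tests(root)   # 2 tests at the root + 0.9 * 2 below the child
pushup(root)
after = expected_tests(root)    # the large child is gone: 3 direct tests
```

Adopted children never need re-checking: their new parent is at least as large as the old one, so their hit probability can only go down, which is why a single pass is enough in this model.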

Next week

As said before, I will be camping until 31 July. When I get back I plan to start looking into things like SSE and memory organization.

Questions

None. I hope people don't mind big reports.

Schedule

The idea was to start applying SSE this week, but I was quite busy thinking, reading and drawing (algorithms, papers and graphs).

I will be camping until 31 July (in Iceland).

Week 8 and 9 report

As said, I have been camping in Iceland (at a scout activity named Roverway). I greatly recommend hiking in Iceland to all nature lovers.

Questions

Sorry for not sending a report on Friday, but since I hadn't done anything last week I thought it was useless (but cwant asked for it, so here it is).

This email serves more as a "Hi everybody! I actually survived Iceland!" I cannot say I am totally OK, because I have this cough, but that's probably because of the many weather changes and some long travel times (>24 hours without sleep).

Next week

Look into tree building
Better single-tree build code (that is, only one tree with all the primitives). The expected runtime difference can be seen in the "*-single" versions, though due to the N log^2 N build it's not that easy to see ( http://andresp.no-ip.org/soc2009/btest/html/test_real )

Schedule

I have lost whatever lead on the schedule I had.

The important part has all been coded, though, and it's faster and very close to the expected 2-10 times speedup (yes, it varies a lot between scenes: some are ~20 times faster, others only 1.3).

I guess the important thing now is to do some more speed tweaks and then make it stable enough to be merged.

Week 10 report

This week

Build time is now N log N
Single tree is the default
Tested some memory organization and started some SIMD stuff.

(SIMD recursion: 4 nodes are popped from the stack and their BBs are tested at the same time. This doesn't seem to scale that well, probably due to memory reorganization time and somewhat bad assembly code.) So I have tried some compiler optimization flags to try to make it worth it.

I found that compile flags have a great impact on runtime (especially on C++ and SSE code), although I still haven't found which flags I need and in which files.

Next week

Keep working on SIMD and memory stuff
Make it out-of-memory-proof and fix some other "broken" stuff
Update documentation

Questions

Is there any problem with using -O3 on the render code?

Schedule

Firm 'pencils down' date is August 18. By now I have been able to achieve the "expected" speedup (2-10x), support for generic primitives, a data structure that can deal well with large environments, and the ability to easily add different data structures.

My plans after SoC-2009:

I would still like to spend some more time, until the end of August, polishing some details and trying some things out.

Then in September: work on having it merged into the blender 2.5 trunk.
Maintaining the render acceleration structures
