User:Jbakker/projects/CyclesOpenCL2019/Cycles

Cycles OpenCL kernels

 * 1) for each tile
 * 2) for each sample
 * 3) `data_init`
 * 4) `path_init`
 * 5) while active rays
 * 6) `scene_intersect`; This kernel performs the scene intersection (the `scene_intersect` function). It changes the ray state of RAY_REGENERATED rays to RAY_ACTIVE, processes rays in state RAY_ACTIVE, and marks rays that hit the background by changing their ray state to RAY_HIT_BACKGROUND.
 * 7) `lamp_emission`; This kernel operates on QUEUE_ACTIVE_AND_REGENERATED_RAYS. It processes rays in state RAY_ACTIVE and RAY_HIT_BACKGROUND, and it empties the QUEUE_ACTIVE_AND_REGENERATED_RAYS queue.
 * 8) `do_volume`
 * 9) `queue_enqueue`; This kernel enqueues each ray into the queue that matches its ray state.
 * 10) `indirect_background`
 * 11) `shader_setup`; This kernel sets up the ShaderData structure from the values computed by the previous kernels.
 * 12) `shader_sort`
 * 13) `shader_eval`; This kernel evaluates the ShaderData structure, using the values computed by the previous kernels.
 * 14) `holdout_emission_blurring_pathtermination_ao`; This kernel handles holdout materials, indirect primitive emission, BSDF blurring, probabilistic path termination and AO.
 * 15) `subsurface_scatter`
 * 16) `queue_enqueue`; This kernel enqueues each ray into the queue that matches its ray state.
 * 17) `direct_lighting`; This kernel takes care of the direct lighting logic. The shadow ray cast that belongs to direct lighting is handled later, in the `shadow_blocked_dl` kernel.
 * 18) `shadow_blocked_ao`; Shadow ray cast for AO
 * 19) `shadow_blocked_dl`; Shadow ray cast for direct visible light
 * 20) `enqueue_inactive`
 * 21) `next_iteration_setup`; This kernel sets up the ray for the next path iteration and accumulates the radiance contributed by AO and direct lighting.
 * 22) `indirect_subsurface`
 * 23) `queue_enqueue`; This kernel enqueues each ray into the queue that matches its ray state.
 * 24) `buffer_update`; This kernel handles rays that hit the background (as determined by the `scene_intersect` kernel). For rays in state RAY_UPDATE_BUFFER it writes the ray's accumulated radiance to the output buffer. It also handles rays that have been marked to be regenerated.
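
The list above reads as a set of nested loops around a fixed kernel sequence. The following is only a sketch of that ordering, assuming the flat list nests in the order given: the kernel names come from the list, while `kernel_schedule` and the `iterations_per_sample` parameter (a stand-in for "while active rays") are hypothetical and not part of Cycles.

```python
# Kernel names are taken from the list above. The function name and the fixed
# iteration count are hypothetical stand-ins used only for illustration.
PATH_ITERATION_KERNELS = [
    "scene_intersect",
    "lamp_emission",
    "do_volume",
    "queue_enqueue",
    "indirect_background",
    "shader_setup",
    "shader_sort",
    "shader_eval",
    "holdout_emission_blurring_pathtermination_ao",
    "subsurface_scatter",
    "queue_enqueue",
    "direct_lighting",
    "shadow_blocked_ao",
    "shadow_blocked_dl",
    "enqueue_inactive",
    "next_iteration_setup",
    "indirect_subsurface",
    "queue_enqueue",
    "buffer_update",
]


def kernel_schedule(num_tiles, num_samples, iterations_per_sample):
    """Yield (tile, sample, kernel_name) tuples in launch order."""
    for tile in range(num_tiles):
        for sample in range(num_samples):
            yield tile, sample, "data_init"
            yield tile, sample, "path_init"
            # Stand-in for "while active rays": a real host would inspect the
            # ray states instead of running a fixed number of iterations.
            for _ in range(iterations_per_sample):
                for kernel in PATH_ITERATION_KERNELS:
                    yield tile, sample, kernel


if __name__ == "__main__":
    for tile, sample, kernel in kernel_schedule(1, 1, 1):
        print(tile, sample, kernel)
```

Note that `queue_enqueue` is launched three times per path iteration, once after each group of kernels that can change ray states.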

Split program compile times
We created a script that measures the compilation time of each split kernel program with a random set of compile directives. The table below shows the results per program; all times are in seconds.

program_name                                           | samples | min_compile_time | max_compile_time |    avg_compile_time
-------------------------------------------------------+---------+------------------+------------------+------------------------
kernel_state_buffer_size.cl                            |    9843 |             0.68 |             0.99 | 0.84133699075485116326
kernel_enqueue_inactive.cl                             |   10015 |             0.68 |             1.00 | 0.85272391412880678982
kernel_queue_enqueue.cl                                |   10084 |             0.69 |             1.04 | 0.86120587068623562079
kernel_data_init.cl                                    |    9975 |             0.71 |             1.04 | 0.87913684210526315789
kernel_indirect_subsurface.cl                          |    9948 |             0.70 |             1.06 | 0.86709187776437474869
kernel_shader_setup.cl                                 |    9965 |             0.84 |             1.41 |     1.1189764174611139
kernel_shader_sort.cl                                  |   10029 |             0.89 |             1.89 | 1.06694087147272908565
kernel_path_init.cl                                    |    9927 |             1.22 |             2.68 |     1.7768157550115846
kernel_scene_intersect.cl                              |    9934 |             0.78 |             2.80 |     1.5094664787598148
kernel_buffer_update.cl                                |    9917 |             1.42 |             2.95 |     2.0160169406070384
kernel_next_iteration_setup.cl                         |    9893 |             3.32 |             6.29 |     4.7226523804710401
kernel_shader_eval.cl                                  |    9972 |             3.29 |            32.86 |    10.9066185318892900
kernel_indirect_background.cl                          |   10096 |             3.27 |            33.44 |    11.0069482963549921
kernel_direct_lighting.cl                              |    9859 |             5.39 |            36.80 |    14.2002201034587686
kernel_lamp_emission.cl                                |    9890 |             3.31 |            36.82 |    12.2152942366026289
kernel_holdout_emission_blurring_pathtermination_ao.cl |    9951 |             1.85 |            38.32 |     8.5834157371118481
kernel_shadow_blocked_ao.cl                            |    9912 |             0.89 |            42.59 |    12.3766010895883777
kernel_shadow_blocked_dl.cl                            |   10029 |             0.80 |            48.80 |    17.1998235118157344
kernel_do_volume.cl                                    |   10067 |             0.69 |            51.67 |    13.4433267110360584
kernel_subsurface_scatter.cl                           |    9915 |             0.69 |            59.42 |    12.6498608169440242
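
The measurement script itself is not shown on this page. A minimal sketch of the idea could look like the following. The directive names are the ones that appear in the tables on this page (node feature flags are left out for brevity); `random_directives`, `measure`, `summarize` and the `compile_program` callback are hypothetical names, and the callback stands in for whatever actually builds an OpenCL program.

```python
import random
import time
from collections import defaultdict

# Boolean directives as they are named in the tables on this page.
BOOLEAN_DIRECTIVES = [
    "__kernel_cl_khr_fp16__",
    "__kernel_opencl_debug__",
    "__kernel_debug__",
    "__kernel_experimental__",
    "__no_hair__",
    "__no_object_motion__",
    "__no_camera_motion__",
    "__no_shader_raytrace__",
]


def random_directives(rng):
    """Pick a random value for every directive."""
    config = {name: rng.choice([False, True]) for name in BOOLEAN_DIRECTIVES}
    config["__nodes_max_group__"] = rng.randrange(4)  # 0..3, as in the tables
    return config


def measure(programs, compile_program, runs, rng):
    """Time compile_program(program, config) for randomly chosen configs.

    compile_program is a hypothetical callback supplied by the caller.
    """
    samples = []
    for _ in range(runs):
        program = rng.choice(programs)
        config = random_directives(rng)
        start = time.perf_counter()
        compile_program(program, config)
        elapsed = time.perf_counter() - start
        samples.append((program, config, elapsed))
    return samples


def summarize(samples):
    """Return {program: (samples, min, max, avg)}, matching the table columns."""
    by_program = defaultdict(list)
    for program, _config, elapsed in samples:
        by_program[program].append(elapsed)
    return {
        program: (len(times), min(times), max(times), sum(times) / len(times))
        for program, times in by_program.items()
    }
```

Storing the full configuration with every sample is what makes the per-directive breakdowns further down possible: the same samples can be regrouped by any directive afterwards.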

Let's go deeper into the programs that can take longer than 10 seconds to compile by looking at the individual compile directives.

program_name                                           |  avg_compile_time   | __kernel_cl_khr_fp16__
-------------------------------------------------------+---------------------+------------------------
kernel_do_volume.cl                                    | 13.4762423282518313 | f
kernel_do_volume.cl                                    | 13.4101814194577352 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4882726180944756 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6793461150353179 | t
kernel_indirect_background.cl                          | 10.9867633526705341 | f
kernel_indirect_background.cl                          | 11.0267451442024720 | t
kernel_lamp_emission.cl                                | 12.1135504032258065 | f
kernel_lamp_emission.cl                                | 12.3176572008113590 | t
kernel_shader_eval.cl                                  | 10.8741181149333864 | f
kernel_shader_eval.cl                                  | 10.9396844021849080 | t
kernel_shadow_blocked_ao.cl                            | 12.6275120675784393 | f
kernel_shadow_blocked_ao.cl                            | 12.1240647773279352 | t
kernel_shadow_blocked_dl.cl                            | 17.1187026383654037 | f
kernel_shadow_blocked_dl.cl                            | 17.2818063352044908 | t

program_name                                           |  avg_compile_time   | __kernel_opencl_debug__
-------------------------------------------------------+---------------------+-------------------------
kernel_do_volume.cl                                    | 13.5959765940274415 | f
kernel_do_volume.cl                                    | 13.2953062023087458 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4345416666666667 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.7362003665241295 | t
kernel_indirect_background.cl                          | 11.0819661354581673 | f
kernel_indirect_background.cl                          | 10.9327580772261623 | t
kernel_lamp_emission.cl                                | 12.2619979838709677 | f
kernel_lamp_emission.cl                                | 12.1683062880324544 | t
kernel_shader_eval.cl                                  | 10.8915838383838384 | f
kernel_shader_eval.cl                                  | 10.9214376742333732 | t
kernel_shadow_blocked_ao.cl                            | 12.4973224043715847 | f
kernel_shadow_blocked_ao.cl                            | 12.2566083283041642 | t
kernel_shadow_blocked_dl.cl                            | 17.1251963506545022 | f
kernel_shadow_blocked_dl.cl                            | 17.2752737116502908 | t

program_name                                           |  avg_compile_time   | __kernel_debug__
-------------------------------------------------------+---------------------+------------------
kernel_do_volume.cl                                    | 13.6354379059240307 | f
kernel_do_volume.cl                                    | 13.2475551544324108 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.5703446885708527 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.5964108216432866 | t
kernel_indirect_background.cl                          | 11.0365537730243613 | f
kernel_indirect_background.cl                          | 10.9773310877749158 | t
kernel_lamp_emission.cl                                | 12.2277207392197125 | f
kernel_lamp_emission.cl                                | 12.2032390438247012 | t
kernel_shader_eval.cl                                  | 10.9384738874212558 | f
kernel_shader_eval.cl                                  | 10.8755830528608196 | t
kernel_shadow_blocked_ao.cl                            | 12.4887765957446809 | f
kernel_shadow_blocked_ao.cl                            | 12.2674621815286624 | t
kernel_shadow_blocked_dl.cl                            | 17.1620127795527157 | f
kernel_shadow_blocked_dl.cl                            | 17.2375363473411671 | t

program_name                                           |  avg_compile_time   | __kernel_experimental__
-------------------------------------------------------+---------------------+-------------------------
kernel_do_volume.cl                                    | 13.3263311437403400 | f
kernel_do_volume.cl                                    | 13.5671396442445308 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6782696177062374 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4887713310580205 | t
kernel_indirect_background.cl                          | 11.0460938115884903 | f
kernel_indirect_background.cl                          | 10.9673974512146555 | t
kernel_lamp_emission.cl                                | 12.1728550814265100 | f
kernel_lamp_emission.cl                                | 12.2561500297678111 | t
kernel_shader_eval.cl                                  | 10.7713534447615165 | f
kernel_shader_eval.cl                                  | 11.0376115278326096 | t
kernel_shadow_blocked_ao.cl                            | 12.3875850202429150 | f
kernel_shadow_blocked_ao.cl                            | 12.3656878519710378 | t
kernel_shadow_blocked_dl.cl                            | 17.1164304047384008 | f
kernel_shadow_blocked_dl.cl                            | 17.2849133763094279 | t

program_name                                           |  avg_compile_time   | __no_hair__
-------------------------------------------------------+---------------------+-------------
kernel_do_volume.cl                                    | 13.5722628320351508 | f
kernel_do_volume.cl                                    | 13.3157411067193676 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.7404943729903537 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4263055276381910 | t
kernel_indirect_background.cl                          | 11.0271168316831683 | f
kernel_indirect_background.cl                          | 10.9867637732857709 | t
kernel_lamp_emission.cl                                | 12.1087017684887460 | f
kernel_lamp_emission.cl                                | 12.3232315832315832 | t
kernel_shader_eval.cl                                  | 11.0372387305437639 | f
kernel_shader_eval.cl                                  | 10.7780258706467662 | t
kernel_shadow_blocked_ao.cl                            | 13.0288574963910085 | f
kernel_shadow_blocked_ao.cl                            | 11.7519138850483903 | t
kernel_shadow_blocked_dl.cl                            | 17.9479142070051161 | f
kernel_shadow_blocked_dl.cl                            | 16.4313179704871639 | t

program_name                                           |  avg_compile_time   | __no_object_motion__
-------------------------------------------------------+---------------------+----------------------
kernel_do_volume.cl                                    | 13.6522115384615385 | f
kernel_do_volume.cl                                    | 13.2291892979279823 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6292167832167832 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.5370683380509503 | t
kernel_indirect_background.cl                          | 11.1669826034793041 | f
kernel_indirect_background.cl                          | 10.8498665358194308 | t
kernel_lamp_emission.cl                                | 12.3591212121212121 | f
kernel_lamp_emission.cl                                | 12.0711761133603239 | t
kernel_shader_eval.cl                                  | 10.9368144597563411 | f
kernel_shader_eval.cl                                  | 10.8761671701913394 | t
kernel_shadow_blocked_ao.cl                            | 12.7996805993690852 | f
kernel_shadow_blocked_ao.cl                            | 11.9332417355371901 | t
kernel_shadow_blocked_dl.cl                            | 17.7198317794892582 | f
kernel_shadow_blocked_dl.cl                            | 16.6962473012757605 | t

program_name                                           |  avg_compile_time   | __no_camera_motion__
-------------------------------------------------------+---------------------+----------------------
kernel_do_volume.cl                                    | 13.4769023034154091 | f
kernel_do_volume.cl                                    | 13.4097177499503081 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4937155155546860 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6781384297520661 | t
kernel_indirect_background.cl                          | 11.0142412451361868 | f
kernel_indirect_background.cl                          | 10.9993845843422115 | t
kernel_lamp_emission.cl                                | 12.2041389977314910 | f
kernel_lamp_emission.cl                                | 12.2260245982939893 | t
kernel_shader_eval.cl                                  | 10.8949710173895663 | f
kernel_shader_eval.cl                                  | 10.9183457436103844 | t
kernel_shadow_blocked_ao.cl                            | 12.3722251736820597 | f
kernel_shadow_blocked_ao.cl                            | 12.3808688720605819 | t
kernel_shadow_blocked_dl.cl                            | 17.0923717059639390 | f
kernel_shadow_blocked_dl.cl                            | 17.3086772380570052 | t

program_name                                           |  avg_compile_time   | __no_shader_raytrace__
-------------------------------------------------------+---------------------+------------------------
kernel_do_volume.cl                                    | 13.6956241480038948 | f
kernel_do_volume.cl                                    | 13.1806447688564477 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6079684338324565 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.5591914553803154 | t
kernel_indirect_background.cl                          | 11.1771351134310379 | f
kernel_indirect_background.cl                          | 10.8412199413489736 | t
kernel_lamp_emission.cl                                | 12.4153029997986712 | f
kernel_lamp_emission.cl                                | 12.0134978671541743 | t
kernel_shader_eval.cl                                  | 11.1381384676775738 | f
kernel_shader_eval.cl                                  | 10.6726713709677419 | t
kernel_shadow_blocked_ao.cl                            | 12.4737031758957655 | f
kernel_shadow_blocked_ao.cl                            | 12.2812080000000000 | t
kernel_shadow_blocked_dl.cl                            | 17.2491691631715598 | f
kernel_shadow_blocked_dl.cl                            | 17.1506252489048188 | t

So far, none of these directives shows a really interesting effect on the average compile time.
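
To put a number on "no interesting effect", the difference between the `t` and `f` averages can be computed per program. The sketch below does that for a few rows of the `__no_hair__` table (values rounded to 4 decimals); `flag_deltas` is a hypothetical helper, not part of the measurement script.

```python
# Averages copied from the __no_hair__ table above, rounded to 4 decimals.
NO_HAIR_ROWS = [
    ("kernel_do_volume.cl", 13.5723, "f"),
    ("kernel_do_volume.cl", 13.3157, "t"),
    ("kernel_shader_eval.cl", 11.0372, "f"),
    ("kernel_shader_eval.cl", 10.7780, "t"),
    ("kernel_shadow_blocked_ao.cl", 13.0289, "f"),
    ("kernel_shadow_blocked_ao.cl", 11.7519, "t"),
    ("kernel_shadow_blocked_dl.cl", 17.9479, "f"),
    ("kernel_shadow_blocked_dl.cl", 16.4313, "t"),
]


def flag_deltas(rows):
    """Map program -> avg(t) - avg(f) for rows of (program, avg, flag)."""
    by_program = {}
    for program, avg, flag in rows:
        by_program.setdefault(program, {})[flag] = avg
    # Only report programs for which both flag values were measured.
    return {
        program: values["t"] - values["f"]
        for program, values in by_program.items()
        if "t" in values and "f" in values
    }


deltas = flag_deltas(NO_HAIR_ROWS)
for program, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{program:30s} {delta:+.2f} s")
```

In that table the largest difference is about 1.5 seconds (kernel_shadow_blocked_dl.cl), against an average of roughly 17 seconds.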

program_name                                           | samples |  avg_compile_time   | __nodes_max_group__
-------------------------------------------------------+---------+---------------------+---------------------
kernel_do_volume.cl                                    |    2441 | 11.6218967636214666 |                   0
kernel_do_volume.cl                                    |    2513 | 12.6581894150417827 |                   1
kernel_do_volume.cl                                    |    2568 | 13.5505957943925234 |                   2
kernel_do_volume.cl                                    |    2545 | 15.8573516699410609 |                   3
kernel_holdout_emission_blurring_pathtermination_ao.cl |    2484 |  7.3680716586151369 |                   0
kernel_holdout_emission_blurring_pathtermination_ao.cl |    2445 |  7.5575869120654397 |                   1
kernel_holdout_emission_blurring_pathtermination_ao.cl |    2558 |  8.7847458952306489 |                   2
kernel_holdout_emission_blurring_pathtermination_ao.cl |    2464 | 10.6175324675324675 |                   3
kernel_indirect_background.cl                          |    2513 |  8.6922244329486669 |                   0
kernel_indirect_background.cl                          |    2476 |  9.2696688206785137 |                   1
kernel_indirect_background.cl                          |    2588 | 11.0195401854714065 |                   2
kernel_indirect_background.cl                          |    2519 | 15.0108455736403335 |                   3
kernel_lamp_emission.cl                                |    2475 |  9.8700767676767677 |                   0
kernel_lamp_emission.cl                                |    2501 | 10.6073970411835266 |                   1
kernel_lamp_emission.cl                                |    2408 | 12.6640531561461794 |                   2
kernel_lamp_emission.cl                                |    2506 | 15.7049800478850758 |                   3
kernel_shader_eval.cl                                  |    2524 |  8.7771315372424723 |                   0
kernel_shader_eval.cl                                  |    2495 |  9.1851262525050100 |                   1
kernel_shader_eval.cl                                  |    2567 | 11.1343241137514608 |                   2
kernel_shader_eval.cl                                  |    2386 | 14.7144258172673931 |                   3
kernel_shadow_blocked_ao.cl                            |    2484 | 10.3099476650563607 |                   0
kernel_shadow_blocked_ao.cl                            |    2492 | 10.9489245585874799 |                   1
kernel_shadow_blocked_ao.cl                            |    2514 | 12.6312291169451074 |                   2
kernel_shadow_blocked_ao.cl                            |    2422 | 15.7007968620974401 |                   3
kernel_shadow_blocked_dl.cl                            |    2506 | 14.4531604150039904 |                   0
kernel_shadow_blocked_dl.cl                            |    2531 | 15.7420624259186092 |                   1
kernel_shadow_blocked_dl.cl                            |    2499 | 17.6736014405762305 |                   2
kernel_shadow_blocked_dl.cl                            |    2493 | 20.9658724428399519 |                   3
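
Unlike the boolean directives, `__nodes_max_group__` shows a clear trend: the average compile time grows with every step. A small sketch to quantify this, using the averages from the table above rounded to 4 decimals; the `slowdown` helper is hypothetical.

```python
# Average compile time in seconds per __nodes_max_group__ value (0..3),
# copied from the table above and rounded to 4 decimals.
AVG_BY_GROUP = {
    "kernel_do_volume.cl": [11.6219, 12.6582, 13.5506, 15.8574],
    "kernel_holdout_emission_blurring_pathtermination_ao.cl": [7.3681, 7.5576, 8.7847, 10.6175],
    "kernel_indirect_background.cl": [8.6922, 9.2697, 11.0195, 15.0108],
    "kernel_lamp_emission.cl": [9.8701, 10.6074, 12.6641, 15.7050],
    "kernel_shader_eval.cl": [8.7771, 9.1851, 11.1343, 14.7144],
    "kernel_shadow_blocked_ao.cl": [10.3099, 10.9489, 12.6312, 15.7008],
    "kernel_shadow_blocked_dl.cl": [14.4532, 15.7421, 17.6736, 20.9659],
}


def slowdown(averages):
    """Ratio of the average at the highest group to the average at group 0."""
    return averages[-1] / averages[0]


ratios = {program: slowdown(avgs) for program, avgs in AVG_BY_GROUP.items()}
for program, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{program:55s} x{ratio:.2f}")
```

Going from group 0 to group 3 makes the average compile time roughly 1.4 to 1.7 times longer, with kernel_indirect_background.cl at the upper end.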

program_name                                           | samples |  avg_compile_time   | node_feature_volume | node_feature_hair | node_feature_bump | node_feature_bump_state
-------------------------------------------------------+---------+---------------------+---------------------+-------------------+-------------------+-------------------------
kernel_do_volume.cl                                    |       6 |  8.0150000000000000 |                     |                   |                   |
kernel_do_volume.cl                                    |      13 | 12.9523076923076923 | VOLUME              |                   |                   |
kernel_do_volume.cl                                    |       9 |  6.2988888888888889 |                     | HAIR              |                   |
kernel_do_volume.cl                                    |       6 |  7.6700000000000000 | VOLUME              | HAIR              |                   |
kernel_do_volume.cl                                    |       7 | 16.5728571428571429 |                     |                   | BUMP              |
kernel_do_volume.cl                                    |       5 | 13.1840000000000000 | VOLUME              |                   | BUMP              |
kernel_do_volume.cl                                    |       9 | 17.4444444444444444 |                     | HAIR              | BUMP              |
kernel_do_volume.cl                                    |       3 | 15.1533333333333333 | VOLUME              | HAIR              | BUMP              |
kernel_do_volume.cl                                    |       8 | 16.7250000000000000 |                     |                   |                   | BUMP_STATE
kernel_do_volume.cl                                    |       8 |  8.9537500000000000 | VOLUME              |                   |                   | BUMP_STATE
kernel_do_volume.cl                                    |       6 | 13.8983333333333333 |                     | HAIR              |                   | BUMP_STATE
kernel_do_volume.cl                                    |       8 | 14.3225000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_do_volume.cl                                    |       7 | 18.1728571428571429 |                     |                   | BUMP              | BUMP_STATE
kernel_do_volume.cl                                    |      12 | 26.4591666666666667 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_do_volume.cl                                    |       7 | 13.2357142857142857 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_do_volume.cl                                    |      10 | 14.1830000000000000 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       7 |  7.2357142857142857 |                     |                   |                   |
kernel_holdout_emission_blurring_pathtermination_ao.cl |      10 |  5.0550000000000000 | VOLUME              |                   |                   |
kernel_holdout_emission_blurring_pathtermination_ao.cl |       9 |  6.1988888888888889 |                     | HAIR              |                   |
kernel_holdout_emission_blurring_pathtermination_ao.cl |      10 |  8.3040000000000000 | VOLUME              | HAIR              |                   |
kernel_holdout_emission_blurring_pathtermination_ao.cl |       5 | 11.4940000000000000 |                     |                   | BUMP              |
kernel_holdout_emission_blurring_pathtermination_ao.cl |       7 | 11.1171428571428571 | VOLUME              |                   | BUMP              |
kernel_holdout_emission_blurring_pathtermination_ao.cl |       9 |  6.9644444444444444 |                     | HAIR              | BUMP              |
kernel_holdout_emission_blurring_pathtermination_ao.cl |       3 |  9.6200000000000000 | VOLUME              | HAIR              | BUMP              |
kernel_holdout_emission_blurring_pathtermination_ao.cl |       8 |  6.4950000000000000 |                     |                   |                   | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       6 |  4.2383333333333333 | VOLUME              |                   |                   | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       6 |  5.1000000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       7 |  8.2600000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       4 | 10.7500000000000000 |                     |                   | BUMP              | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       5 | 12.3420000000000000 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |      10 | 11.0060000000000000 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       8 |  9.3137500000000000 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_indirect_background.cl                          |       7 |  6.5614285714285714 |                     |                   |                   |
kernel_indirect_background.cl                          |       4 |  9.9050000000000000 | VOLUME              |                   |                   |
kernel_indirect_background.cl                          |       6 |  8.4233333333333333 |                     | HAIR              |                   |
kernel_indirect_background.cl                          |       9 |  6.0400000000000000 | VOLUME              | HAIR              |                   |
kernel_indirect_background.cl                          |      10 | 14.1550000000000000 |                     |                   | BUMP              |
kernel_indirect_background.cl                          |       8 | 17.8650000000000000 | VOLUME              |                   | BUMP              |
kernel_indirect_background.cl                          |       7 | 15.9328571428571429 |                     | HAIR              | BUMP              |
kernel_indirect_background.cl                          |       8 | 12.7250000000000000 | VOLUME              | HAIR              | BUMP              |
kernel_indirect_background.cl                          |       5 |  8.1700000000000000 |                     |                   |                   | BUMP_STATE
kernel_indirect_background.cl                          |      11 |  7.6436363636363636 | VOLUME              |                   |                   | BUMP_STATE
kernel_indirect_background.cl                          |       5 |  5.9620000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_indirect_background.cl                          |       4 |  7.0000000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_indirect_background.cl                          |       6 | 14.7033333333333333 |                     |                   | BUMP              | BUMP_STATE
kernel_indirect_background.cl                          |       8 | 14.1800000000000000 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_indirect_background.cl                          |       3 | 16.4200000000000000 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_indirect_background.cl                          |       6 | 10.8100000000000000 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_lamp_emission.cl                                |      10 |  8.4230000000000000 |                     |                   |                   |
kernel_lamp_emission.cl                                |       4 | 12.1075000000000000 | VOLUME              |                   |                   |
kernel_lamp_emission.cl                                |       7 |  9.0471428571428571 |                     | HAIR              |                   |
kernel_lamp_emission.cl                                |       9 | 12.6433333333333333 | VOLUME              | HAIR              |                   |
kernel_lamp_emission.cl                                |      10 | 14.5930000000000000 |                     |                   | BUMP              |
kernel_lamp_emission.cl                                |       7 | 12.1228571428571429 | VOLUME              |                   | BUMP              |
kernel_lamp_emission.cl                                |      12 | 16.5375000000000000 |                     | HAIR              | BUMP              |
kernel_lamp_emission.cl                                |       4 | 13.6375000000000000 | VOLUME              | HAIR              | BUMP              |
kernel_lamp_emission.cl                                |       8 |  8.0575000000000000 |                     |                   |                   | BUMP_STATE
kernel_lamp_emission.cl                                |       9 |  9.8155555555555556 | VOLUME              |                   |                   | BUMP_STATE
kernel_lamp_emission.cl                                |       6 |  9.7766666666666667 |                     | HAIR              |                   | BUMP_STATE
kernel_lamp_emission.cl                                |       9 | 11.4177777777777778 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_lamp_emission.cl                                |       6 | 23.8083333333333333 |                     |                   | BUMP              | BUMP_STATE
kernel_lamp_emission.cl                                |       4 | 16.0425000000000000 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_lamp_emission.cl                                |       8 | 13.4762500000000000 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_lamp_emission.cl                                |       6 | 15.7100000000000000 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_shader_eval.cl                                  |       6 |  8.6333333333333333 |                     |                   |                   |
kernel_shader_eval.cl                                  |       6 |  8.0450000000000000 | VOLUME              |                   |                   |
kernel_shader_eval.cl                                  |       9 |  7.5122222222222222 |                     | HAIR              |                   |
kernel_shader_eval.cl                                  |       7 | 11.5071428571428571 | VOLUME              | HAIR              |                   |
kernel_shader_eval.cl                                  |       4 | 12.5775000000000000 |                     |                   | BUMP              |
kernel_shader_eval.cl                                  |       6 | 20.2983333333333333 | VOLUME              |                   | BUMP              |
kernel_shader_eval.cl                                  |      10 | 13.6550000000000000 |                     | HAIR              | BUMP              |
kernel_shader_eval.cl                                  |       8 | 17.1725000000000000 | VOLUME              | HAIR              | BUMP              |
kernel_shader_eval.cl                                  |       5 |  6.2040000000000000 |                     |                   |                   | BUMP_STATE
kernel_shader_eval.cl                                  |       6 |  8.2150000000000000 | VOLUME              |                   |                   | BUMP_STATE
kernel_shader_eval.cl                                  |       4 |  7.0275000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_shader_eval.cl                                  |       6 |  5.8000000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_shader_eval.cl                                  |       5 | 19.5220000000000000 |                     |                   | BUMP              | BUMP_STATE
kernel_shader_eval.cl                                  |       8 | 15.0650000000000000 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_shader_eval.cl                                  |       8 | 16.6200000000000000 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_shader_eval.cl                                  |       7 | 13.2771428571428571 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |      10 |  8.9390000000000000 |                     |                   |                   |
kernel_shadow_blocked_ao.cl                            |       3 | 12.0033333333333333 | VOLUME              |                   |                   |
kernel_shadow_blocked_ao.cl                            |      14 | 11.4742857142857143 |                     | HAIR              |                   |
kernel_shadow_blocked_ao.cl                            |       7 |  4.6714285714285714 | VOLUME              | HAIR              |                   |
kernel_shadow_blocked_ao.cl                            |       3 | 25.3566666666666667 |                     |                   | BUMP              |
kernel_shadow_blocked_ao.cl                            |       5 | 21.1040000000000000 | VOLUME              |                   | BUMP              |
kernel_shadow_blocked_ao.cl                            |       5 | 11.6740000000000000 |                     | HAIR              | BUMP              |
kernel_shadow_blocked_ao.cl                            |       7 | 10.0642857142857143 | VOLUME              | HAIR              | BUMP              |
kernel_shadow_blocked_ao.cl                            |       7 |  8.2000000000000000 |                     |                   |                   | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       5 | 10.1140000000000000 | VOLUME              |                   |                   | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       6 | 12.2850000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       7 |  4.5857142857142857 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       8 | 15.8650000000000000 |                     |                   | BUMP              | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |      14 | 17.3207142857142857 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       3 |  9.9066666666666667 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       7 | 11.7128571428571429 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |       5 | 12.4680000000000000 |                     |                   |                   |
kernel_shadow_blocked_dl.cl                            |       9 | 12.5255555555555556 | VOLUME              |                   |                   |
kernel_shadow_blocked_dl.cl                            |       4 | 16.1900000000000000 |                     | HAIR              |                   |
kernel_shadow_blocked_dl.cl                            |       8 | 13.5450000000000000 | VOLUME              | HAIR              |                   |
kernel_shadow_blocked_dl.cl                            |       7 | 16.3928571428571429 |                     |                   | BUMP              |
kernel_shadow_blocked_dl.cl                            |      11 | 23.4472727272727273 | VOLUME              |              
     | BUMP              | kernel_shadow_blocked_dl.cl                           |       9 | 19.7722222222222222 |                     | HAIR              | BUMP              | kernel_shadow_blocked_dl.cl                           |      12 | 22.1775000000000000 | VOLUME              | HAIR              | BUMP              | kernel_shadow_blocked_dl.cl                           |      10 | 11.2430000000000000 |                     |                   |                   | BUMP_STATE kernel_shadow_blocked_dl.cl                           |       8 | 13.4325000000000000 | VOLUME              |                   |                   | BUMP_STATE kernel_shadow_blocked_dl.cl                           |       9 | 12.8100000000000000 |                     | HAIR              |                   | BUMP_STATE kernel_shadow_blocked_dl.cl                           |       8 | 11.0175000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE kernel_shadow_blocked_dl.cl                           |       8 | 24.0175000000000000 |                     |                   | BUMP              | BUMP_STATE kernel_shadow_blocked_dl.cl                           |       7 | 28.0685714285714286 | VOLUME              |                   | BUMP              | BUMP_STATE kernel_shadow_blocked_dl.cl                           |      15 | 20.4206666666666667 |                     | HAIR              | BUMP              | BUMP_STATE kernel_shadow_blocked_dl.cl                           |       6 | 13.1983333333333333 | VOLUME              | HAIR              | BUMP              | BUMP_STATE

The sample count is currently too low to show real differences between the configurations. The benchmarks will need to run for the whole week to find out more.

Conclusion
Conclusion so far:


 * Kernels that do shader evaluation take a long time to compile.
 * Other renderers compile shaders as source code; Cycles uses a stack-based approach. This reduces the number of recompilations that are needed, but each compilation is heavier.
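
To illustrate the trade-off in the second point, here is a minimal sketch of a stack-based shader interpreter. It is not Cycles' actual `svm_eval_nodes`; the opcode names and the `eval_nodes` function are hypothetical and only show the general idea: the shader is data, and one interpreter containing the code for every node type evaluates it. The interpreter never needs recompiling when the shader changes, but it has to include every node type, which is what makes it heavy to compile.

```python
# Hypothetical opcodes, for illustration only; a real node set is much larger.
NODE_END, NODE_VALUE, NODE_ADD, NODE_MUL = range(4)


def eval_nodes(program):
    """Interpret a shader encoded as a flat list of (opcode, argument) pairs."""
    stack = []
    for opcode, arg in program:
        if opcode == NODE_VALUE:
            # Push a constant onto the stack.
            stack.append(arg)
        elif opcode == NODE_ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == NODE_MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif opcode == NODE_END:
            break
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return stack.pop()


# A shader is just data: (0.5 + 0.25) * 2.0
shader = [
    (NODE_VALUE, 0.5),
    (NODE_VALUE, 0.25),
    (NODE_ADD, None),
    (NODE_VALUE, 2.0),
    (NODE_MUL, None),
    (NODE_END, None),
]
result = eval_nodes(shader)
```

Changing the shader only changes the `shader` list, not `eval_nodes`. A source-based renderer would instead generate and compile a small function per shader.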

Experiment 1: empty svm_eval_nodes
Removing the whole contents of `svm_eval_nodes` reduces the compilation times a lot, as expected. Even the base program compiles a lot faster, because the base program also contains the kernel for baking.

Compilation times for the BMW scene with `svm_eval_nodes`:

Kernel compilation of split_path_init finished in 1.26s.
Kernel compilation of split_scene_intersect finished in 0.73s.
Kernel compilation of split_lamp_emission finished in 8.06s.
Kernel compilation of split_do_volume finished in 0.64s.
Kernel compilation of split_queue_enqueue finished in 0.65s.
Kernel compilation of split_indirect_background finished in 7.93s.
Kernel compilation of split_shader_setup finished in 0.81s.
Kernel compilation of split_shader_sort finished in 0.84s.
Kernel compilation of split_shader_eval finished in 8.01s.
Kernel compilation of split_holdout_emission_blurring_pathtermination_ao finished in 1.33s.
Kernel compilation of split_subsurface_scatter finished in 0.64s.
Kernel compilation of split_direct_lighting finished in 10.45s.
Kernel compilation of split_shadow_blocked_ao finished in 9.05s.
Kernel compilation of split_shadow_blocked_dl finished in 8.99s.
Kernel compilation of split_enqueue_inactive finished in 0.63s.
Kernel compilation of split_next_iteration_setup finished in 3.37s.
Kernel compilation of split_indirect_subsurface finished in 0.64s.
Kernel compilation of split_buffer_update finished in 1.41s.
Kernel compilation of base finished in 13.50s.
Kernel compilation of split_data_init finished in 0.66s.
Kernel compilation of split_state_buffer_size finished in 0.64s.

Compilation times for the BMW scene without `svm_eval_nodes`:

Kernel compilation of split_path_init finished in 1.27s.
Kernel compilation of split_scene_intersect finished in 0.72s.
Kernel compilation of split_lamp_emission finished in 0.77s.
Kernel compilation of split_do_volume finished in 0.63s.
Kernel compilation of split_queue_enqueue finished in 0.64s.
Kernel compilation of split_indirect_background finished in 0.73s.
Kernel compilation of split_shader_setup finished in 0.80s.
Kernel compilation of split_shader_sort finished in 0.84s.
Kernel compilation of split_shader_eval finished in 0.65s.
Kernel compilation of split_holdout_emission_blurring_pathtermination_ao finished in 1.33s.
Kernel compilation of split_subsurface_scatter finished in 0.64s.
Kernel compilation of split_direct_lighting finished in 2.69s.
Kernel compilation of split_shadow_blocked_ao finished in 1.62s.
Kernel compilation of split_shadow_blocked_dl finished in 1.58s.
Kernel compilation of split_enqueue_inactive finished in 0.63s.
Kernel compilation of split_next_iteration_setup finished in 3.36s.
Kernel compilation of split_indirect_subsurface finished in 0.64s.
Kernel compilation of split_buffer_update finished in 1.41s.
Kernel compilation of split_data_init finished in 0.66s.
Kernel compilation of split_state_buffer_size finished in 0.63s.

Compilation got noticeably faster for the following kernels:


 * split_lamp_emission
 * split_indirect_background
 * split_shader_eval
 * split_direct_lighting
 * split_shadow_blocked_ao
 * split_shadow_blocked_dl

Note that this scene has no subsurface scattering or volumetrics; otherwise those kernels would also be listed here.
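
The comparison between the two logs can be automated. The following is a minimal sketch that parses log lines of the form shown above and reports the kernels whose compilation time dropped by more than a threshold. The helper names (`parse_times`, `time_saved`) and the 0.5 second threshold are assumptions for illustration, and only an excerpt of the two logs is embedded.

```python
import re

# Matches lines such as "Kernel compilation of split_shader_eval finished in 8.01s."
LOG_LINE = re.compile(r"Kernel compilation of (\w+) finished in (\d+\.\d+)s\.")

# Excerpt of the BMW scene logs with and without svm_eval_nodes.
WITH_SVM = """
Kernel compilation of split_path_init finished in 1.26s.
Kernel compilation of split_lamp_emission finished in 8.06s.
Kernel compilation of split_indirect_background finished in 7.93s.
Kernel compilation of split_shader_eval finished in 8.01s.
Kernel compilation of split_direct_lighting finished in 10.45s.
Kernel compilation of split_shadow_blocked_ao finished in 9.05s.
Kernel compilation of split_shadow_blocked_dl finished in 8.99s.
Kernel compilation of base finished in 13.50s.
"""

WITHOUT_SVM = """
Kernel compilation of split_path_init finished in 1.27s.
Kernel compilation of split_lamp_emission finished in 0.77s.
Kernel compilation of split_indirect_background finished in 0.73s.
Kernel compilation of split_shader_eval finished in 0.65s.
Kernel compilation of split_direct_lighting finished in 2.69s.
Kernel compilation of split_shadow_blocked_ao finished in 1.62s.
Kernel compilation of split_shadow_blocked_dl finished in 1.58s.
"""


def parse_times(log):
    """Return {kernel name: compile time in seconds} for every matching line."""
    return {name: float(seconds) for name, seconds in LOG_LINE.findall(log)}


def time_saved(before, after, min_saving=0.5):
    """Return {kernel: seconds saved} for kernels present in both logs
    whose compile time dropped by at least min_saving seconds."""
    saved = {}
    for name, t_before in before.items():
        if name not in after:
            continue  # e.g. "base" has no entry in the second log
        diff = round(t_before - after[name], 2)
        if diff >= min_saving:
            saved[name] = diff
    return saved


saved = time_saved(parse_times(WITH_SVM), parse_times(WITHOUT_SVM))
for name, seconds in sorted(saved.items(), key=lambda item: -item[1]):
    print(f"{name}: {seconds:.2f}s saved")
```

On this data the six kernels listed above each save roughly 7 to 8 seconds, while `split_path_init` stays effectively unchanged.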

This experiment shows that we need to focus on `svm_eval_nodes`. Possible options:


 * merge kernels to use the same `svm_eval_nodes`

Next steps

 * Comment out shader evaluation and see if compilation times change a lot.
 * Find kernels that can be merged into a single kernel so the compilation times are shared.
 * Find out if some kernels can be optimized by making the compilation directives more precise.