Note: This is an archived version of the Blender Developer Wiki (archived 2024). The current developer documentation is available on developer.blender.org/docs.

User:Jbakker/projects/CyclesOpenCL2019/Cycles

Cycles OpenCL kernels

  1. for each tile
    1. for each sample
      1. data_init
      2. path_init
      3. while active rays
        1. scene_intersect; This kernel runs the scene_intersect function. It changes the ray_state of RAY_REGENERATED rays to RAY_ACTIVE, processes rays of state RAY_ACTIVE, and changes the ray state of rays that hit the background to RAY_HIT_BACKGROUND.
        2. lamp_emission; This kernel operates on the QUEUE_ACTIVE_AND_REGENERATED_RAYS queue and processes rays of state RAY_ACTIVE and RAY_HIT_BACKGROUND. The queue is emptied in this kernel.
        3. do_volume
        4. queue_enqueue; This kernel enqueues rays into the queue that matches their ray state.
        5. indirect_background
        6. shader_setup; This kernel sets up the ShaderData structure from the values computed by the previous kernels.
        7. shader_sort
        8. shader_eval; This kernel evaluates the ShaderData structure using the values computed by the previous kernels.
        9. holdout_emission_blurring_pathtermination_ao; This kernel handles holdout materials, indirect primitive emission, BSDF blurring, probabilistic path termination, and AO.
        10. subsurface_scatter
        11. queue_enqueue; This kernel enqueues rays into the queue that matches their ray state.
        12. direct_lighting; This kernel takes care of the direct lighting logic. The shadow ray cast for direct lighting is handled separately, in the shadow_blocked_dl kernel.
        13. shadow_blocked_ao; Shadow ray cast for AO.
        14. shadow_blocked_dl; Shadow ray cast for direct lighting.
        15. enqueue_inactive
        16. next_iteration_setup; This kernel sets up the ray for the next path iteration and accumulates the radiance from AO and direct lighting.
        17. indirect_subsurface
        18. queue_enqueue; This kernel enqueues rays into the queue that matches their ray state.
        19. buffer_update; This kernel takes care of rays that hit the background (found by the scene_intersect kernel). For rays of state RAY_UPDATE_BUFFER, it updates the ray's accumulated radiance in the output buffer. It also takes care of rays that have been marked to be regenerated.
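
The schedule above can be read as a host-side loop. The following is a minimal Python sketch of that loop, not the actual Cycles host code: the kernel names are copied from the list, while `render`, `enqueue` and `has_active_rays` are hypothetical names introduced only for illustration.

```python
# Kernels launched once per path iteration, in the order listed above.
PATH_ITERATION_KERNELS = [
    "scene_intersect",
    "lamp_emission",
    "do_volume",
    "queue_enqueue",
    "indirect_background",
    "shader_setup",
    "shader_sort",
    "shader_eval",
    "holdout_emission_blurring_pathtermination_ao",
    "subsurface_scatter",
    "queue_enqueue",
    "direct_lighting",
    "shadow_blocked_ao",
    "shadow_blocked_dl",
    "enqueue_inactive",
    "next_iteration_setup",
    "indirect_subsurface",
    "queue_enqueue",
    "buffer_update",
]


def render(tiles, samples, enqueue, has_active_rays):
    """Walk the kernel schedule.

    enqueue(name, tile, sample) and has_active_rays(tile) are hypothetical
    callbacks standing in for the real device interface.
    """
    for tile in tiles:
        for sample in range(samples):
            enqueue("data_init", tile, sample)
            enqueue("path_init", tile, sample)
            while has_active_rays(tile):
                for name in PATH_ITERATION_KERNELS:
                    enqueue(name, tile, sample)
```

With callbacks that only record the launches, one tile, one sample and two path iterations give 2 + 2 * 19 = 40 kernel launches.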

Split program compile times

We created a script that measures the compilation time of each split kernel program with a random set of compile directives. The table below shows the results; all compile times are in seconds.

                     program_name                      | samples | min_compile_time | max_compile_time |    avg_compile_time    
-------------------------------------------------------+---------+------------------+------------------+------------------------
kernel_state_buffer_size.cl                            |    9843 |             0.68 |             0.99 | 0.84133699075485116326
kernel_enqueue_inactive.cl                             |   10015 |             0.68 |             1.00 | 0.85272391412880678982
kernel_queue_enqueue.cl                                |   10084 |             0.69 |             1.04 | 0.86120587068623562079
kernel_data_init.cl                                    |    9975 |             0.71 |             1.04 | 0.87913684210526315789
kernel_indirect_subsurface.cl                          |    9948 |             0.70 |             1.06 | 0.86709187776437474869
kernel_shader_setup.cl                                 |    9965 |             0.84 |             1.41 |     1.1189764174611139
kernel_shader_sort.cl                                  |   10029 |             0.89 |             1.89 | 1.06694087147272908565
kernel_path_init.cl                                    |    9927 |             1.22 |             2.68 |     1.7768157550115846
kernel_scene_intersect.cl                              |    9934 |             0.78 |             2.80 |     1.5094664787598148
kernel_buffer_update.cl                                |    9917 |             1.42 |             2.95 |     2.0160169406070384
kernel_next_iteration_setup.cl                         |    9893 |             3.32 |             6.29 |     4.7226523804710401
kernel_shader_eval.cl                                  |    9972 |             3.29 |            32.86 |    10.9066185318892900
kernel_indirect_background.cl                          |   10096 |             3.27 |            33.44 |    11.0069482963549921
kernel_direct_lighting.cl                              |    9859 |             5.39 |            36.80 |    14.2002201034587686
kernel_lamp_emission.cl                                |    9890 |             3.31 |            36.82 |    12.2152942366026289
kernel_holdout_emission_blurring_pathtermination_ao.cl |    9951 |             1.85 |            38.32 |     8.5834157371118481
kernel_shadow_blocked_ao.cl                            |    9912 |             0.89 |            42.59 |    12.3766010895883777
kernel_shadow_blocked_dl.cl                            |   10029 |             0.80 |            48.80 |    17.1998235118157344
kernel_do_volume.cl                                    |   10067 |             0.69 |            51.67 |    13.4433267110360584
kernel_subsurface_scatter.cl                           |    9915 |             0.69 |            59.42 |    12.6498608169440242
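
The measurement script itself is not shown on this page. A minimal sketch of the idea, under stated assumptions, could look as follows. The directive names are the column names used in the tables of this page; `random_configuration`, `measure`, `summarize` and the `compile_program` callback are hypothetical, and the uniform random choice of each directive is an assumption.

```python
import random
import time

# Column names of the boolean directives in the tables of this page.
BOOLEAN_DIRECTIVES = [
    "__kernel_cl_khr_fp16__",
    "__kernel_opencl_debug__",
    "__kernel_debug__",
    "__kernel_experimental__",
    "__no_hair__",
    "__no_object_motion__",
    "__no_camera_motion__",
    "__no_shader_raytrace__",
]


def random_configuration(rng):
    """Pick a random value for every directive (assumed: independent, uniform)."""
    config = {name: rng.random() < 0.5 for name in BOOLEAN_DIRECTIVES}
    config["__nodes_max_group__"] = rng.randrange(4)  # 0..3, as in the tables
    return config


def measure(program_name, config, compile_program):
    """Return the wall-clock seconds one compile takes.

    compile_program is a hypothetical callback that builds the program.
    """
    start = time.perf_counter()
    compile_program(program_name, config)
    return time.perf_counter() - start


def summarize(samples):
    """Turn (program_name, seconds) pairs into {name: (count, min, max, avg)}."""
    grouped = {}
    for name, seconds in samples:
        grouped.setdefault(name, []).append(seconds)
    return {
        name: (len(times), min(times), max(times), sum(times) / len(times))
        for name, times in grouped.items()
    }
```

Passing the compile step in as a callback keeps the sketch independent of any particular OpenCL binding; `summarize` produces the same columns as the table above.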

Let's go deeper into the programs that can take longer than 10 seconds to compile by looking at the different compilation directives.

                     program_name                      |  avg_compile_time   | __kernel_cl_khr_fp16__ 
-------------------------------------------------------+---------------------+------------------------
kernel_do_volume.cl                                    | 13.4762423282518313 | f
kernel_do_volume.cl                                    | 13.4101814194577352 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4882726180944756 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6793461150353179 | t
kernel_indirect_background.cl                          | 10.9867633526705341 | f
kernel_indirect_background.cl                          | 11.0267451442024720 | t
kernel_lamp_emission.cl                                | 12.1135504032258065 | f
kernel_lamp_emission.cl                                | 12.3176572008113590 | t
kernel_shader_eval.cl                                  | 10.8741181149333864 | f
kernel_shader_eval.cl                                  | 10.9396844021849080 | t
kernel_shadow_blocked_ao.cl                            | 12.6275120675784393 | f
kernel_shadow_blocked_ao.cl                            | 12.1240647773279352 | t
kernel_shadow_blocked_dl.cl                            | 17.1187026383654037 | f
kernel_shadow_blocked_dl.cl                            | 17.2818063352044908 | t


                     program_name                      |  avg_compile_time   | __kernel_opencl_debug__ 
-------------------------------------------------------+---------------------+-------------------------
kernel_do_volume.cl                                    | 13.5959765940274415 | f
kernel_do_volume.cl                                    | 13.2953062023087458 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4345416666666667 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.7362003665241295 | t
kernel_indirect_background.cl                          | 11.0819661354581673 | f
kernel_indirect_background.cl                          | 10.9327580772261623 | t
kernel_lamp_emission.cl                                | 12.2619979838709677 | f
kernel_lamp_emission.cl                                | 12.1683062880324544 | t
kernel_shader_eval.cl                                  | 10.8915838383838384 | f
kernel_shader_eval.cl                                  | 10.9214376742333732 | t
kernel_shadow_blocked_ao.cl                            | 12.4973224043715847 | f
kernel_shadow_blocked_ao.cl                            | 12.2566083283041642 | t
kernel_shadow_blocked_dl.cl                            | 17.1251963506545022 | f
kernel_shadow_blocked_dl.cl                            | 17.2752737116502908 | t


                     program_name                      |  avg_compile_time   | __kernel_debug__ 
-------------------------------------------------------+---------------------+------------------
kernel_do_volume.cl                                    | 13.6354379059240307 | f
kernel_do_volume.cl                                    | 13.2475551544324108 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.5703446885708527 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.5964108216432866 | t
kernel_indirect_background.cl                          | 11.0365537730243613 | f
kernel_indirect_background.cl                          | 10.9773310877749158 | t
kernel_lamp_emission.cl                                | 12.2277207392197125 | f
kernel_lamp_emission.cl                                | 12.2032390438247012 | t
kernel_shader_eval.cl                                  | 10.9384738874212558 | f
kernel_shader_eval.cl                                  | 10.8755830528608196 | t
kernel_shadow_blocked_ao.cl                            | 12.4887765957446809 | f
kernel_shadow_blocked_ao.cl                            | 12.2674621815286624 | t
kernel_shadow_blocked_dl.cl                            | 17.1620127795527157 | f
kernel_shadow_blocked_dl.cl                            | 17.2375363473411671 | t


                     program_name                      |  avg_compile_time   | __kernel_experimental__ 
-------------------------------------------------------+---------------------+-------------------------
kernel_do_volume.cl                                    | 13.3263311437403400 | f
kernel_do_volume.cl                                    | 13.5671396442445308 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6782696177062374 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4887713310580205 | t
kernel_indirect_background.cl                          | 11.0460938115884903 | f
kernel_indirect_background.cl                          | 10.9673974512146555 | t
kernel_lamp_emission.cl                                | 12.1728550814265100 | f
kernel_lamp_emission.cl                                | 12.2561500297678111 | t
kernel_shader_eval.cl                                  | 10.7713534447615165 | f
kernel_shader_eval.cl                                  | 11.0376115278326096 | t
kernel_shadow_blocked_ao.cl                            | 12.3875850202429150 | f
kernel_shadow_blocked_ao.cl                            | 12.3656878519710378 | t
kernel_shadow_blocked_dl.cl                            | 17.1164304047384008 | f
kernel_shadow_blocked_dl.cl                            | 17.2849133763094279 | t


                     program_name                      |  avg_compile_time   | __no_hair__ 
-------------------------------------------------------+---------------------+-------------
kernel_do_volume.cl                                    | 13.5722628320351508 | f
kernel_do_volume.cl                                    | 13.3157411067193676 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.7404943729903537 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4263055276381910 | t
kernel_indirect_background.cl                          | 11.0271168316831683 | f
kernel_indirect_background.cl                          | 10.9867637732857709 | t
kernel_lamp_emission.cl                                | 12.1087017684887460 | f
kernel_lamp_emission.cl                                | 12.3232315832315832 | t
kernel_shader_eval.cl                                  | 11.0372387305437639 | f
kernel_shader_eval.cl                                  | 10.7780258706467662 | t
kernel_shadow_blocked_ao.cl                            | 13.0288574963910085 | f
kernel_shadow_blocked_ao.cl                            | 11.7519138850483903 | t
kernel_shadow_blocked_dl.cl                            | 17.9479142070051161 | f
kernel_shadow_blocked_dl.cl                            | 16.4313179704871639 | t


                     program_name                      |  avg_compile_time   | __no_object_motion__ 
-------------------------------------------------------+---------------------+----------------------
kernel_do_volume.cl                                    | 13.6522115384615385 | f
kernel_do_volume.cl                                    | 13.2291892979279823 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6292167832167832 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.5370683380509503 | t
kernel_indirect_background.cl                          | 11.1669826034793041 | f
kernel_indirect_background.cl                          | 10.8498665358194308 | t
kernel_lamp_emission.cl                                | 12.3591212121212121 | f
kernel_lamp_emission.cl                                | 12.0711761133603239 | t
kernel_shader_eval.cl                                  | 10.9368144597563411 | f
kernel_shader_eval.cl                                  | 10.8761671701913394 | t
kernel_shadow_blocked_ao.cl                            | 12.7996805993690852 | f
kernel_shadow_blocked_ao.cl                            | 11.9332417355371901 | t
kernel_shadow_blocked_dl.cl                            | 17.7198317794892582 | f
kernel_shadow_blocked_dl.cl                            | 16.6962473012757605 | t


                     program_name                      |  avg_compile_time   | __no_camera_motion__ 
-------------------------------------------------------+---------------------+----------------------
kernel_do_volume.cl                                    | 13.4769023034154091 | f
kernel_do_volume.cl                                    | 13.4097177499503081 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.4937155155546860 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6781384297520661 | t
kernel_indirect_background.cl                          | 11.0142412451361868 | f
kernel_indirect_background.cl                          | 10.9993845843422115 | t
kernel_lamp_emission.cl                                | 12.2041389977314910 | f
kernel_lamp_emission.cl                                | 12.2260245982939893 | t
kernel_shader_eval.cl                                  | 10.8949710173895663 | f
kernel_shader_eval.cl                                  | 10.9183457436103844 | t
kernel_shadow_blocked_ao.cl                            | 12.3722251736820597 | f
kernel_shadow_blocked_ao.cl                            | 12.3808688720605819 | t
kernel_shadow_blocked_dl.cl                            | 17.0923717059639390 | f
kernel_shadow_blocked_dl.cl                            | 17.3086772380570052 | t


                     program_name                      |  avg_compile_time   | __no_shader_raytrace__ 
-------------------------------------------------------+---------------------+------------------------
kernel_do_volume.cl                                    | 13.6956241480038948 | f
kernel_do_volume.cl                                    | 13.1806447688564477 | t
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.6079684338324565 | f
kernel_holdout_emission_blurring_pathtermination_ao.cl |  8.5591914553803154 | t
kernel_indirect_background.cl                          | 11.1771351134310379 | f
kernel_indirect_background.cl                          | 10.8412199413489736 | t
kernel_lamp_emission.cl                                | 12.4153029997986712 | f
kernel_lamp_emission.cl                                | 12.0134978671541743 | t
kernel_shader_eval.cl                                  | 11.1381384676775738 | f
kernel_shader_eval.cl                                  | 10.6726713709677419 | t
kernel_shadow_blocked_ao.cl                            | 12.4737031758957655 | f
kernel_shadow_blocked_ao.cl                            | 12.2812080000000000 | t
kernel_shadow_blocked_dl.cl                            | 17.2491691631715598 | f
kernel_shadow_blocked_dl.cl                            | 17.1506252489048188 | t


So far, none of these directives has a really interesting effect on the compile time.
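
One way to put numbers on the directive tables above is to subtract the average with the directive off (f) from the average with it on (t). The sketch below does this for the __no_hair__ table, with the values rounded to four decimals; `directive_effect` is a hypothetical helper, and the calculation says nothing about statistical significance.

```python
# (average with directive off, average with directive on) in seconds,
# copied from the __no_hair__ table and rounded to four decimals.
NO_HAIR = {
    "kernel_do_volume.cl": (13.5723, 13.3157),
    "kernel_holdout_emission_blurring_pathtermination_ao.cl": (8.7405, 8.4263),
    "kernel_indirect_background.cl": (11.0271, 10.9868),
    "kernel_lamp_emission.cl": (12.1087, 12.3232),
    "kernel_shader_eval.cl": (11.0372, 10.7780),
    "kernel_shadow_blocked_ao.cl": (13.0289, 11.7519),
    "kernel_shadow_blocked_dl.cl": (17.9479, 16.4313),
}


def directive_effect(table):
    """Return {program: avg_on - avg_off}; negative means faster with the directive on."""
    return {name: on - off for name, (off, on) in table.items()}


effects = directive_effect(NO_HAIR)
for name, delta in sorted(effects.items(), key=lambda item: item[1]):
    print(f"{name:55s} {delta:+.4f} s")
```

For __no_hair__ the largest difference is about 1.5 seconds, for kernel_shadow_blocked_dl.cl, on an average of roughly 17 seconds.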

                     program_name                      | samples |  avg_compile_time   | __nodes_max_group__ 
-------------------------------------------------------+---------+---------------------+---------------------
kernel_do_volume.cl                                    |    2441 | 11.6218967636214666 |                   0
kernel_do_volume.cl                                    |    2513 | 12.6581894150417827 |                   1
kernel_do_volume.cl                                    |    2568 | 13.5505957943925234 |                   2
kernel_do_volume.cl                                    |    2545 | 15.8573516699410609 |                   3
kernel_holdout_emission_blurring_pathtermination_ao.cl |    2484 |  7.3680716586151369 |                   0
kernel_holdout_emission_blurring_pathtermination_ao.cl |    2445 |  7.5575869120654397 |                   1
kernel_holdout_emission_blurring_pathtermination_ao.cl |    2558 |  8.7847458952306489 |                   2
kernel_holdout_emission_blurring_pathtermination_ao.cl |    2464 | 10.6175324675324675 |                   3
kernel_indirect_background.cl                          |    2513 |  8.6922244329486669 |                   0
kernel_indirect_background.cl                          |    2476 |  9.2696688206785137 |                   1
kernel_indirect_background.cl                          |    2588 | 11.0195401854714065 |                   2
kernel_indirect_background.cl                          |    2519 | 15.0108455736403335 |                   3
kernel_lamp_emission.cl                                |    2475 |  9.8700767676767677 |                   0
kernel_lamp_emission.cl                                |    2501 | 10.6073970411835266 |                   1
kernel_lamp_emission.cl                                |    2408 | 12.6640531561461794 |                   2
kernel_lamp_emission.cl                                |    2506 | 15.7049800478850758 |                   3
kernel_shader_eval.cl                                  |    2524 |  8.7771315372424723 |                   0
kernel_shader_eval.cl                                  |    2495 |  9.1851262525050100 |                   1
kernel_shader_eval.cl                                  |    2567 | 11.1343241137514608 |                   2
kernel_shader_eval.cl                                  |    2386 | 14.7144258172673931 |                   3
kernel_shadow_blocked_ao.cl                            |    2484 | 10.3099476650563607 |                   0
kernel_shadow_blocked_ao.cl                            |    2492 | 10.9489245585874799 |                   1
kernel_shadow_blocked_ao.cl                            |    2514 | 12.6312291169451074 |                   2
kernel_shadow_blocked_ao.cl                            |    2422 | 15.7007968620974401 |                   3
kernel_shadow_blocked_dl.cl                            |    2506 | 14.4531604150039904 |                   0
kernel_shadow_blocked_dl.cl                            |    2531 | 15.7420624259186092 |                   1
kernel_shadow_blocked_dl.cl                            |    2499 | 17.6736014405762305 |                   2
kernel_shadow_blocked_dl.cl                            |    2493 | 20.9658724428399519 |                   3
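
In the __nodes_max_group__ table the average compile time of every listed program rises with each group. The sketch below, with two programs copied from the table and rounded to four decimals, computes the ratio between group 3 and group 0. It also checks that the sample-weighted average over the four groups reproduces the overall average from the first table. `weighted_average` and `slowdown` are hypothetical helper names.

```python
# (samples, avg_compile_time in seconds) for __nodes_max_group__ 0, 1, 2, 3,
# copied from the table above and rounded to four decimals.
NODES_MAX_GROUP = {
    "kernel_do_volume.cl": [
        (2441, 11.6219), (2513, 12.6582), (2568, 13.5506), (2545, 15.8574),
    ],
    "kernel_shadow_blocked_dl.cl": [
        (2506, 14.4532), (2531, 15.7421), (2499, 17.6736), (2493, 20.9659),
    ],
}


def weighted_average(rows):
    """Average compile time over all groups, weighted by sample count."""
    total_samples = sum(samples for samples, _ in rows)
    return sum(samples * avg for samples, avg in rows) / total_samples


def slowdown(rows):
    """Ratio of the group 3 average to the group 0 average."""
    return rows[-1][1] / rows[0][1]


for name, rows in NODES_MAX_GROUP.items():
    print(f"{name}: overall {weighted_average(rows):.4f} s, "
          f"group 3 / group 0 = {slowdown(rows):.2f}")
```

For both programs the group 3 average is roughly 1.4 times the group 0 average.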
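
The next table splits the results by combination of node features. Each combination has only a few samples.

                     program_name                      | samples |  avg_compile_time   | node_feature_volume | node_feature_hair | node_feature_bump | node_feature_bump_state 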
-------------------------------------------------------+---------+---------------------+---------------------+-------------------+-------------------+-------------------------
kernel_do_volume.cl                                    |       6 |  8.0150000000000000 |                     |                   |                   | 
kernel_do_volume.cl                                    |      13 | 12.9523076923076923 | VOLUME              |                   |                   | 
kernel_do_volume.cl                                    |       9 |  6.2988888888888889 |                     | HAIR              |                   | 
kernel_do_volume.cl                                    |       6 |  7.6700000000000000 | VOLUME              | HAIR              |                   | 
kernel_do_volume.cl                                    |       7 | 16.5728571428571429 |                     |                   | BUMP              | 
kernel_do_volume.cl                                    |       5 | 13.1840000000000000 | VOLUME              |                   | BUMP              | 
kernel_do_volume.cl                                    |       9 | 17.4444444444444444 |                     | HAIR              | BUMP              | 
kernel_do_volume.cl                                    |       3 | 15.1533333333333333 | VOLUME              | HAIR              | BUMP              | 
kernel_do_volume.cl                                    |       8 | 16.7250000000000000 |                     |                   |                   | BUMP_STATE
kernel_do_volume.cl                                    |       8 |  8.9537500000000000 | VOLUME              |                   |                   | BUMP_STATE
kernel_do_volume.cl                                    |       6 | 13.8983333333333333 |                     | HAIR              |                   | BUMP_STATE
kernel_do_volume.cl                                    |       8 | 14.3225000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_do_volume.cl                                    |       7 | 18.1728571428571429 |                     |                   | BUMP              | BUMP_STATE
kernel_do_volume.cl                                    |      12 | 26.4591666666666667 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_do_volume.cl                                    |       7 | 13.2357142857142857 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_do_volume.cl                                    |      10 | 14.1830000000000000 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       7 |  7.2357142857142857 |                     |                   |                   | 
kernel_holdout_emission_blurring_pathtermination_ao.cl |      10 |  5.0550000000000000 | VOLUME              |                   |                   | 
kernel_holdout_emission_blurring_pathtermination_ao.cl |       9 |  6.1988888888888889 |                     | HAIR              |                   | 
kernel_holdout_emission_blurring_pathtermination_ao.cl |      10 |  8.3040000000000000 | VOLUME              | HAIR              |                   | 
kernel_holdout_emission_blurring_pathtermination_ao.cl |       5 | 11.4940000000000000 |                     |                   | BUMP              | 
kernel_holdout_emission_blurring_pathtermination_ao.cl |       7 | 11.1171428571428571 | VOLUME              |                   | BUMP              | 
kernel_holdout_emission_blurring_pathtermination_ao.cl |       9 |  6.9644444444444444 |                     | HAIR              | BUMP              | 
kernel_holdout_emission_blurring_pathtermination_ao.cl |       3 |  9.6200000000000000 | VOLUME              | HAIR              | BUMP              | 
kernel_holdout_emission_blurring_pathtermination_ao.cl |       8 |  6.4950000000000000 |                     |                   |                   | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       6 |  4.2383333333333333 | VOLUME              |                   |                   | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       6 |  5.1000000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       7 |  8.2600000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       4 | 10.7500000000000000 |                     |                   | BUMP              | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       5 | 12.3420000000000000 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |      10 | 11.0060000000000000 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_holdout_emission_blurring_pathtermination_ao.cl |       8 |  9.3137500000000000 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_indirect_background.cl                          |       7 |  6.5614285714285714 |                     |                   |                   | 
kernel_indirect_background.cl                          |       4 |  9.9050000000000000 | VOLUME              |                   |                   | 
kernel_indirect_background.cl                          |       6 |  8.4233333333333333 |                     | HAIR              |                   | 
kernel_indirect_background.cl                          |       9 |  6.0400000000000000 | VOLUME              | HAIR              |                   | 
kernel_indirect_background.cl                          |      10 | 14.1550000000000000 |                     |                   | BUMP              | 
kernel_indirect_background.cl                          |       8 | 17.8650000000000000 | VOLUME              |                   | BUMP              | 
kernel_indirect_background.cl                          |       7 | 15.9328571428571429 |                     | HAIR              | BUMP              | 
kernel_indirect_background.cl                          |       8 | 12.7250000000000000 | VOLUME              | HAIR              | BUMP              | 
kernel_indirect_background.cl                          |       5 |  8.1700000000000000 |                     |                   |                   | BUMP_STATE
kernel_indirect_background.cl                          |      11 |  7.6436363636363636 | VOLUME              |                   |                   | BUMP_STATE
kernel_indirect_background.cl                          |       5 |  5.9620000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_indirect_background.cl                          |       4 |  7.0000000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_indirect_background.cl                          |       6 | 14.7033333333333333 |                     |                   | BUMP              | BUMP_STATE
kernel_indirect_background.cl                          |       8 | 14.1800000000000000 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_indirect_background.cl                          |       3 | 16.4200000000000000 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_indirect_background.cl                          |       6 | 10.8100000000000000 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_lamp_emission.cl                                |      10 |  8.4230000000000000 |                     |                   |                   | 
kernel_lamp_emission.cl                                |       4 | 12.1075000000000000 | VOLUME              |                   |                   | 
kernel_lamp_emission.cl                                |       7 |  9.0471428571428571 |                     | HAIR              |                   | 
kernel_lamp_emission.cl                                |       9 | 12.6433333333333333 | VOLUME              | HAIR              |                   | 
kernel_lamp_emission.cl                                |      10 | 14.5930000000000000 |                     |                   | BUMP              | 
kernel_lamp_emission.cl                                |       7 | 12.1228571428571429 | VOLUME              |                   | BUMP              | 
kernel_lamp_emission.cl                                |      12 | 16.5375000000000000 |                     | HAIR              | BUMP              | 
kernel_lamp_emission.cl                                |       4 | 13.6375000000000000 | VOLUME              | HAIR              | BUMP              | 
kernel_lamp_emission.cl                                |       8 |  8.0575000000000000 |                     |                   |                   | BUMP_STATE
kernel_lamp_emission.cl                                |       9 |  9.8155555555555556 | VOLUME              |                   |                   | BUMP_STATE
kernel_lamp_emission.cl                                |       6 |  9.7766666666666667 |                     | HAIR              |                   | BUMP_STATE
kernel_lamp_emission.cl                                |       9 | 11.4177777777777778 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_lamp_emission.cl                                |       6 | 23.8083333333333333 |                     |                   | BUMP              | BUMP_STATE
kernel_lamp_emission.cl                                |       4 | 16.0425000000000000 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_lamp_emission.cl                                |       8 | 13.4762500000000000 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_lamp_emission.cl                                |       6 | 15.7100000000000000 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_shader_eval.cl                                  |       6 |  8.6333333333333333 |                     |                   |                   | 
kernel_shader_eval.cl                                  |       6 |  8.0450000000000000 | VOLUME              |                   |                   | 
kernel_shader_eval.cl                                  |       9 |  7.5122222222222222 |                     | HAIR              |                   | 
kernel_shader_eval.cl                                  |       7 | 11.5071428571428571 | VOLUME              | HAIR              |                   | 
kernel_shader_eval.cl                                  |       4 | 12.5775000000000000 |                     |                   | BUMP              | 
kernel_shader_eval.cl                                  |       6 | 20.2983333333333333 | VOLUME              |                   | BUMP              | 
kernel_shader_eval.cl                                  |      10 | 13.6550000000000000 |                     | HAIR              | BUMP              | 
kernel_shader_eval.cl                                  |       8 | 17.1725000000000000 | VOLUME              | HAIR              | BUMP              | 
kernel_shader_eval.cl                                  |       5 |  6.2040000000000000 |                     |                   |                   | BUMP_STATE
kernel_shader_eval.cl                                  |       6 |  8.2150000000000000 | VOLUME              |                   |                   | BUMP_STATE
kernel_shader_eval.cl                                  |       4 |  7.0275000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_shader_eval.cl                                  |       6 |  5.8000000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_shader_eval.cl                                  |       5 | 19.5220000000000000 |                     |                   | BUMP              | BUMP_STATE
kernel_shader_eval.cl                                  |       8 | 15.0650000000000000 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_shader_eval.cl                                  |       8 | 16.6200000000000000 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_shader_eval.cl                                  |       7 | 13.2771428571428571 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |      10 |  8.9390000000000000 |                     |                   |                   | 
kernel_shadow_blocked_ao.cl                            |       3 | 12.0033333333333333 | VOLUME              |                   |                   | 
kernel_shadow_blocked_ao.cl                            |      14 | 11.4742857142857143 |                     | HAIR              |                   | 
kernel_shadow_blocked_ao.cl                            |       7 |  4.6714285714285714 | VOLUME              | HAIR              |                   | 
kernel_shadow_blocked_ao.cl                            |       3 | 25.3566666666666667 |                     |                   | BUMP              | 
kernel_shadow_blocked_ao.cl                            |       5 | 21.1040000000000000 | VOLUME              |                   | BUMP              | 
kernel_shadow_blocked_ao.cl                            |       5 | 11.6740000000000000 |                     | HAIR              | BUMP              | 
kernel_shadow_blocked_ao.cl                            |       7 | 10.0642857142857143 | VOLUME              | HAIR              | BUMP              | 
kernel_shadow_blocked_ao.cl                            |       7 |  8.2000000000000000 |                     |                   |                   | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       5 | 10.1140000000000000 | VOLUME              |                   |                   | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       6 | 12.2850000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       7 |  4.5857142857142857 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       8 | 15.8650000000000000 |                     |                   | BUMP              | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |      14 | 17.3207142857142857 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       3 |  9.9066666666666667 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_shadow_blocked_ao.cl                            |       7 | 11.7128571428571429 | VOLUME              | HAIR              | BUMP              | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |       5 | 12.4680000000000000 |                     |                   |                   | 
kernel_shadow_blocked_dl.cl                            |       9 | 12.5255555555555556 | VOLUME              |                   |                   | 
kernel_shadow_blocked_dl.cl                            |       4 | 16.1900000000000000 |                     | HAIR              |                   | 
kernel_shadow_blocked_dl.cl                            |       8 | 13.5450000000000000 | VOLUME              | HAIR              |                   | 
kernel_shadow_blocked_dl.cl                            |       7 | 16.3928571428571429 |                     |                   | BUMP              | 
kernel_shadow_blocked_dl.cl                            |      11 | 23.4472727272727273 | VOLUME              |                   | BUMP              | 
kernel_shadow_blocked_dl.cl                            |       9 | 19.7722222222222222 |                     | HAIR              | BUMP              | 
kernel_shadow_blocked_dl.cl                            |      12 | 22.1775000000000000 | VOLUME              | HAIR              | BUMP              | 
kernel_shadow_blocked_dl.cl                            |      10 | 11.2430000000000000 |                     |                   |                   | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |       8 | 13.4325000000000000 | VOLUME              |                   |                   | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |       9 | 12.8100000000000000 |                     | HAIR              |                   | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |       8 | 11.0175000000000000 | VOLUME              | HAIR              |                   | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |       8 | 24.0175000000000000 |                     |                   | BUMP              | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |       7 | 28.0685714285714286 | VOLUME              |                   | BUMP              | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |      15 | 20.4206666666666667 |                     | HAIR              | BUMP              | BUMP_STATE
kernel_shadow_blocked_dl.cl                            |       6 | 13.1983333333333333 | VOLUME              | HAIR              | BUMP              | BUMP_STATE


The sample count is currently too low to show a real difference. The benchmarks will need to run for a whole week to find out more.
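Because each row is based on only a handful of samples, it can help to pool rows before comparing feature flags. The following is a minimal sketch, not part of the benchmark tooling. It assumes the second column is the sample count and the third column is the average compile time, and it copies four kernel_shader_eval.cl rows from the table above. The helper names `weighted_mean` and `split_by_flag` are hypothetical.

```python
# Rows copied from the kernel_shader_eval.cl part of the table above.
# Assumed layout: (sample count, average compile time, enabled feature flags).
rows = [
    (6, 8.6333333333333333, ()),
    (6, 8.0450000000000000, ("VOLUME",)),
    (4, 12.5775000000000000, ("BUMP",)),
    (6, 20.2983333333333333, ("VOLUME", "BUMP")),
]


def weighted_mean(rows):
    """Average of the per-row averages, weighted by each row's sample count."""
    total_samples = sum(count for count, _, _ in rows)
    total_time = sum(count * average for count, average, _ in rows)
    return total_time / total_samples


def split_by_flag(rows, flag):
    """Partition the rows into (rows with the flag, rows without the flag)."""
    with_flag = [row for row in rows if flag in row[2]]
    without_flag = [row for row in rows if flag not in row[2]]
    return with_flag, without_flag


with_bump, without_bump = split_by_flag(rows, "BUMP")
for label, group in (("with BUMP", with_bump), ("without BUMP", without_bump)):
    samples = sum(count for count, _, _ in group)
    print(f"{label}: {weighted_mean(group):.2f} over {samples} samples")
```

Pooling gives each group more samples than any single row has, but it does not replace collecting more measurements.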

Conclusion

Conclusion so far:

  • Kernels that do shader evaluation take a long time to compile.
  • Other renderers compile shaders as source code; Cycles uses a stack-based approach. This reduces the number of recompilations that are needed, but each compilation is heavier.
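To illustrate why a stack-based approach makes each compilation heavier, here is a toy interpreter. It is not Cycles code: the node types, the instruction encoding, and the name `eval_nodes` are all made up for this sketch. The point is that a single function contains a branch for every node type, so any kernel that includes it must compile all branches, whichever nodes a scene actually uses.

```python
# Toy stack-based shader interpreter (hypothetical; not the Cycles SVM).
NODE_VALUE, NODE_ADD, NODE_MUL, NODE_END = range(4)


def eval_nodes(program):
    """Run a list of (op, a, b, out) instructions over a fixed-size stack."""
    stack = [0.0] * 8
    pc = 0
    while True:
        op, a, b, out = program[pc]
        pc += 1
        # Every node type needs a branch here. A real node set is far larger,
        # and all of it is compiled into each kernel that calls this function.
        if op == NODE_VALUE:
            stack[out] = a  # 'a' holds a literal value
        elif op == NODE_ADD:
            stack[out] = stack[a] + stack[b]
        elif op == NODE_MUL:
            stack[out] = stack[a] * stack[b]
        elif op == NODE_END:
            return stack[a]
        else:
            raise ValueError(f"unknown node type {op}")


# (2.0 + 3.0) * 4.0, encoded as data rather than as generated source code.
program = [
    (NODE_VALUE, 2.0, 0, 0),
    (NODE_VALUE, 3.0, 0, 1),
    (NODE_ADD, 0, 1, 2),
    (NODE_VALUE, 4.0, 0, 3),
    (NODE_MUL, 2, 3, 4),
    (NODE_END, 4, 0, 0),
]
result = eval_nodes(program)
print(result)
```

Changing the shader only changes the `program` data, so nothing has to be recompiled. The cost is that the interpreter itself is large.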

Research experiments

Experiment 1: empty svm_eval_nodes

Removing the whole contents of svm_eval_nodes reduces the compilation times a lot, as expected. Even the base program compiles a lot faster, because the base kernel also contains the kernel for baking.

Compilation times for the BMW scene with svm_eval_nodes:

Kernel compilation of split_path_init finished in 1.26s.
Kernel compilation of split_scene_intersect finished in 0.73s.
Kernel compilation of split_lamp_emission finished in 8.06s.
Kernel compilation of split_do_volume finished in 0.64s.
Kernel compilation of split_queue_enqueue finished in 0.65s.
Kernel compilation of split_indirect_background finished in 7.93s.
Kernel compilation of split_shader_setup finished in 0.81s.
Kernel compilation of split_shader_sort finished in 0.84s.
Kernel compilation of split_shader_eval finished in 8.01s.
Kernel compilation of split_holdout_emission_blurring_pathtermination_ao finished in 1.33s.
Kernel compilation of split_subsurface_scatter finished in 0.64s.
Kernel compilation of split_direct_lighting finished in 10.45s.
Kernel compilation of split_shadow_blocked_ao finished in 9.05s.
Kernel compilation of split_shadow_blocked_dl finished in 8.99s.
Kernel compilation of split_enqueue_inactive finished in 0.63s.
Kernel compilation of split_next_iteration_setup finished in 3.37s.
Kernel compilation of split_indirect_subsurface finished in 0.64s.
Kernel compilation of split_buffer_update finished in 1.41s.
Kernel compilation of base finished in 13.50s.
Kernel compilation of split_data_init finished in 0.66s.
Kernel compilation of split_state_buffer_size finished in 0.64s.

Compilation times without svm_eval_nodes:

Kernel compilation of split_path_init finished in 1.27s.
Kernel compilation of split_scene_intersect finished in 0.72s.
Kernel compilation of split_lamp_emission finished in 0.77s.
Kernel compilation of split_do_volume finished in 0.63s.
Kernel compilation of split_queue_enqueue finished in 0.64s.
Kernel compilation of split_indirect_background finished in 0.73s.
Kernel compilation of split_shader_setup finished in 0.80s.
Kernel compilation of split_shader_sort finished in 0.84s.
Kernel compilation of split_shader_eval finished in 0.65s.
Kernel compilation of split_holdout_emission_blurring_pathtermination_ao finished in 1.33s.
Kernel compilation of split_subsurface_scatter finished in 0.64s.
Kernel compilation of split_direct_lighting finished in 2.69s.
Kernel compilation of split_shadow_blocked_ao finished in 1.62s.
Kernel compilation of split_shadow_blocked_dl finished in 1.58s.
Kernel compilation of split_enqueue_inactive finished in 0.63s.
Kernel compilation of split_next_iteration_setup finished in 3.36s.
Kernel compilation of split_indirect_subsurface finished in 0.64s.
Kernel compilation of split_buffer_update finished in 1.41s.
Kernel compilation of split_data_init finished in 0.66s.
Kernel compilation of split_state_buffer_size finished in 0.63s.

A speedup in compilation time was measured in the following kernels:

  • split_lamp_emission
  • split_indirect_background
  • split_shader_eval
  • split_direct_lighting
  • split_shadow_blocked_ao
  • split_shadow_blocked_dl

Note that this scene has no subsurface scattering or volumetrics; otherwise those kernels would also be listed here.
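The list of affected kernels can be derived from the two logs rather than by eye. The following is a minimal sketch that parses log lines in the format shown above and reports the time saved per kernel. Only a subset of the log lines is embedded here, and the helper names `parse_log` and `time_saved` as well as the 1.0 second threshold are assumptions for illustration.

```python
import re

# Matches lines such as "Kernel compilation of split_shader_eval finished in 8.01s."
LINE = re.compile(r"Kernel compilation of (\S+) finished in ([0-9.]+)s\.")


def parse_log(text):
    """Map kernel name to compile time in seconds."""
    return {m.group(1): float(m.group(2)) for m in LINE.finditer(text)}


def time_saved(before, after):
    """Seconds saved per kernel, for kernels present in both logs."""
    return {name: before[name] - after[name] for name in before if name in after}


with_svm = parse_log("""
Kernel compilation of split_lamp_emission finished in 8.06s.
Kernel compilation of split_indirect_background finished in 7.93s.
Kernel compilation of split_shader_setup finished in 0.81s.
Kernel compilation of split_shader_eval finished in 8.01s.
Kernel compilation of split_direct_lighting finished in 10.45s.
Kernel compilation of split_shadow_blocked_ao finished in 9.05s.
Kernel compilation of split_shadow_blocked_dl finished in 8.99s.
""")

without_svm = parse_log("""
Kernel compilation of split_lamp_emission finished in 0.77s.
Kernel compilation of split_indirect_background finished in 0.73s.
Kernel compilation of split_shader_setup finished in 0.80s.
Kernel compilation of split_shader_eval finished in 0.65s.
Kernel compilation of split_direct_lighting finished in 2.69s.
Kernel compilation of split_shadow_blocked_ao finished in 1.62s.
Kernel compilation of split_shadow_blocked_dl finished in 1.58s.
""")

saved = time_saved(with_svm, without_svm)
THRESHOLD = 1.0  # seconds; arbitrary cut-off to ignore measurement noise
affected = sorted(name for name, seconds in saved.items() if seconds > THRESHOLD)
for name in affected:
    print(f"{name}: {saved[name]:.2f}s saved")
```

Kernels whose time barely changes, such as split_shader_setup, fall below the threshold and are not reported.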

This experiment shows that we need to focus on svm_eval_nodes. Possible options:

  • Merge kernels so that they share the same svm_eval_nodes.

Next steps

  • Comment out shader evaluation and see if compilation times change a lot.
  • Find kernels that can be merged into a single kernel, so that the compilation time is shared.
  • Find out whether some kernels can be optimized by making the compilation directives more precise.