Cycles: Texture system improvements and better user feedback
This project is about improving Cycles' texture system on CPU and GPU by removing limitations and making it more memory efficient. In addition, user feedback should be improved so that users are informed more visibly about (V)RAM usage.
Textures are an essential part of computer graphics. Production scenes often contain hundreds of textures, so an efficient system is needed. Improving memory usage and removing limitations is only one part of the solution, though. Even with these improvements, large scenes can still run into problems because (V)RAM is limited. At the moment users often encounter cryptic messages such as CUDA error: Out of memory in cuMemAlloc(&device_pointer, size). These messages show neither the exact amount of memory that is required nor the parts of the scene that are responsible for the large memory requirement. Clear error messages and statistics will help users tweak their scenes to make them render.
- Add support for bindless textures on modern CUDA GPUs.
- Lower memory usage of single channel textures (e.g. Smoke density).
- Add support for Half Float textures, lowering memory usage for data like Vertex Normals.
- Improve memory statistics and make them available inside Blender's UI.
- Add support for Mipmaps, to improve Texture filtering and lower memory usage.
Additional Stretch Goals
- Add support for float4 (HDRs) on OpenCL.
- Support single channel textures on OpenCL.
Bindless Textures on GPU (CUDA)
On older CUDA GPUs we can only use a limited number of textures. The limits are 128 on Fermi cards (GeForce 4xx/5xx) and 256 on Kepler (GeForce 6xx and above). Not all of these slots can be used for actual image textures, as we also need some for data (BVH, mesh, attributes...). We cannot lift these limits on Fermi cards, but we can on Kepler and above. I will add support for bindless textures, making it possible to use as many textures as users want (as long as they fit into GPU memory, of course).
I will have to implement this carefully, keeping support for Fermi cards alive (using the current code) while also supporting the new way. Some code reshuffling and refactoring might be necessary here. This will also save some launch latency (0.5 μs per texture reference) on every kernel invocation. That is probably negligible, but it might be measurable in scenes with a lot of samples. More information about bindless textures can be found here.
Lower memory usage of single channel textures
Single channel textures (like smoke density or black-and-white image textures) currently use 3 channels (RGB), although they only need one because the data is black and white. I will add support for single channel textures, decreasing their memory usage. This will allow users to use more complex smoke simulations or Point Density nodes in their scenes, and more black-and-white textures (e.g. for bump mapping), without running out of memory.
This will be implemented for CPUs and modern CUDA GPUs (Kepler and above). I don't think adding support for this on Fermi cards is doable, as they don't support bindless textures and we should not lower the number of usable image textures there even further.
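To give a feeling for the savings, here is a minimal back-of-the-envelope sketch. The helper name texture_bytes and the 256^3 float grid are assumptions chosen for illustration only; they are not taken from the Cycles code.

```python
def texture_bytes(width, height, channels, bytes_per_channel, depth=1):
    """Return the uncompressed size in bytes of a 2D (depth=1) or 3D texture."""
    return width * height * depth * channels * bytes_per_channel


# Hypothetical 256^3 smoke density grid stored as 32-bit floats.
rgb = texture_bytes(256, 256, channels=3, bytes_per_channel=4, depth=256)
single = texture_bytes(256, 256, channels=1, bytes_per_channel=4, depth=256)

print(f"RGB:            {rgb / 2**20:.0f} MiB")     # 192 MiB
print(f"single channel: {single / 2**20:.0f} MiB")  # 64 MiB
```

Under these assumptions the same grid needs a third of the memory, so roughly three times as many grids of that size fit on the device.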
Add support for half float textures
Some data, like vertex normals or UV coordinates, can be stored in half float precision. I will add support for half float textures in Cycles and convert the vertex normal and UV coordinate textures to the new system. This will lower memory usage during rendering.
This will be implemented for CPUs and GPUs.
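The trade-off between size and precision can be sketched with Python's struct module, which supports IEEE 754 half floats through the "e" format code. The function names and the sample normal below are made up for illustration; the actual storage format in Cycles may look different.

```python
import struct


def pack_normal_float(normal):
    """Pack a 3-component normal as three 32-bit floats (12 bytes)."""
    return struct.pack("<3f", *normal)


def pack_normal_half(normal):
    """Pack a 3-component normal as three 16-bit half floats (6 bytes)."""
    return struct.pack("<3e", *normal)


def unpack_normal_half(data):
    """Read three half floats back into a tuple of Python floats."""
    return struct.unpack("<3e", data)


# Hypothetical, approximately unit-length vertex normal.
normal = (0.267261, 0.534522, 0.801784)

full = pack_normal_float(normal)
half = pack_normal_half(normal)
restored = unpack_normal_half(half)

# Largest per-component rounding error introduced by half precision.
max_error = max(abs(a - b) for a, b in zip(normal, restored))

print(len(full), len(half))  # 12 6
print(max_error)
```

Half floats halve the storage per component. For components in the range [-1, 1], as in normalized vectors, the rounding error stays well below 0.001, which is the reason this kind of data is a good candidate.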
Improve memory statistics and make them available in Blender's UI
No matter how optimized a system is, at some point a large scene will hit a hardware limit (running out of memory). At the moment Blender only displays cryptic messages that are not clear. Some memory statistics are already available when running Blender with --debug-cycles, but that option is hidden from regular users and the information is meant more for developers. I will improve user feedback here by exposing the relevant information in the UI. Two things need to be done here.
1) Improve Cycles' memory statistics internally, allowing us to collect more data and to know more precisely how much memory is actually used on a device. At the moment we only know the theoretical amount of memory that is used; in practice this amount can differ on the device due to memory fragmentation and other factors.
2) Display that data to users in an easy-to-understand format inside Blender's UI. The actual place for this needs to be investigated with Blender users and the UI team. We could, for example, use the Info Header, add a panel to the render settings, or use the space inside the Info editor.
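As a rough illustration of the kind of report this could lead to, the sketch below groups allocations by category and states the shortfall when the scene does not fit. Everything in it (the function summarize_memory, the category names, file names and sizes) is hypothetical, and it only sums the theoretical sizes, so it does not account for fragmentation.

```python
from collections import defaultdict

MIB = 2**20


def summarize_memory(allocations, device_total):
    """Group (category, name, size_bytes) records and build report lines.

    Returns the total number of bytes used and a list of text lines,
    with the largest category first.
    """
    per_category = defaultdict(int)
    for category, _name, size in allocations:
        per_category[category] += size

    used = sum(per_category.values())
    ordered = sorted(per_category.items(), key=lambda item: item[1], reverse=True)

    lines = [f"{category:<10} {size / MIB:8.1f} MiB" for category, size in ordered]
    lines.append(f"{'Total':<10} {used / MIB:8.1f} MiB of {device_total / MIB:.1f} MiB")
    if used > device_total:
        # Tell the user how much is missing instead of a bare "out of memory".
        lines.append(f"Scene needs {(used - device_total) / MIB:.1f} MiB more than the device provides")
    return used, lines


# Made-up scene data for a device with 1 GiB of memory.
allocations = [
    ("Textures", "wall_diffuse.png", 512 * MIB),
    ("Textures", "floor_bump.png", 128 * MIB),
    ("Geometry", "BVH", 300 * MIB),
    ("Geometry", "vertex normals", 100 * MIB),
]
used, lines = summarize_memory(allocations, device_total=1024 * MIB)
print("\n".join(lines))
```

A per-category breakdown like this would point users directly at the part of the scene that is worth reducing.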
Add support for Mipmaps
Mipmaps are important in order to improve texture filtering and lower memory usage. This is especially beneficial for objects that are far away from the camera, as we can use a lower resolution version of the texture there. We will make use of OpenImageIO for this feature. In order to support mipmaps in Cycles, several things need to be done:
- Add support for texture differentials in the SVM shading system. OSL already supports this.
- Automatically create .tx files (the mipmaps) using OIIO's maketx tool. Ideally this happens in the background, as part of Cycles scene preparation.
- Hook up OIIO's image cache system. This will make mipmaps a CPU-only feature for now, but it is the easiest approach to get this working. I can look into writing our own mipmap cache system, but that will likely be a stretch goal in case I have more time than expected.
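The role of the texture differentials in the list above can be sketched as follows. This is the conventional way of picking a mip level from a pixel footprint, written under my own assumptions; it is not the selection logic that OIIO uses internally, and the function names are hypothetical. The derivative names (dsdx, dtdx, dsdy, dtdy) describe how far the texture coordinates s and t move per pixel step in x and y.

```python
import math


def mip_level_count(width, height):
    """Number of levels in a full mip chain that ends at 1x1."""
    return max(width, height).bit_length()


def mip_chain_texels(width, height):
    """Total number of texels over all levels, halving each side per level."""
    return sum(
        max(1, width >> level) * max(1, height >> level)
        for level in range(mip_level_count(width, height))
    )


def select_mip_level(dsdx, dtdx, dsdy, dtdy, width, height):
    """Pick the mip level whose texels roughly match the pixel footprint."""
    # Size of one pixel step, measured in texels of the full-resolution image.
    footprint_x = math.hypot(dsdx * width, dtdx * height)
    footprint_y = math.hypot(dsdy * width, dtdy * height)
    footprint = max(footprint_x, footprint_y)
    if footprint <= 1.0:
        return 0  # magnification: the full-resolution level is needed
    level = int(math.log2(footprint))
    return min(level, mip_level_count(width, height) - 1)


# A pixel that covers about 4x4 texels of a 1024x1024 texture.
print(select_mip_level(4 / 1024, 0.0, 0.0, 4 / 1024, 1024, 1024))  # 2
print(mip_chain_texels(1024, 1024) / (1024 * 1024))
```

Note that a complete mip chain is about a third larger than the base image on its own. The memory savings only appear when a cache loads just the levels (and tiles) that are actually sampled, which is why the image cache is part of this plan.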
May 23 - June 5: Add support for bindless textures.
- Add support for bindless textures inside the CUDA device code in Cycles.
- Refactor the code to support both the new and the old method without duplicating too much code, keeping the codebase clean.
- Migrate textures to use the new system on Kepler GPUs. (Will be done for all texture types, e.g. byte, float, data...)
June 6 - June 12: Add support for single channel textures.
- Implement support for single channel textures for CPU and GPU (Kepler and above only).
- Migrate BW textures to the new system. This includes e.g. smoke density and BW image textures (bump maps...).
June 13 - June 19: Add support for half float textures.
- Implement basic support for half float textures (CPU and GPU).
- Migrate corresponding textures (Vertex Normals, UV coordinates..) to use them.
June 20 - July 3: Improve memory statistics, implement improved feedback in the UI.
- Improve Cycles' memory statistics.
- Expose the statistics in Blender's UI.
July 4 - July 17: Implement Mipmapping.
- Add texture differentials to SVM.
- Hook up maketx to create the .tx mipmaps.
- Hook up OIIO's image cache system.
July 18 - July 31: Need some time off, due to exam preparations and actual exams.
August 1 - August 15: Finish loose ends.
- Finish and polish the features, fix bugs.
- Look into our own Mipmap caching system if there is time.
Note: The times for each of these projects might seem a bit long, but I am just being realistic here. Adding the features themselves might not take that long, but these changes affect not only CPU rendering but also GPUs. On CUDA we have various architectures, and compiling the kernels always takes several minutes per architecture. Code needs to be checked on all platforms, performance needs to be tested, and potential GPU compiler bugs need to be worked around. This just takes time and careful checking on various system configurations.
My name is Thomas Dinges, I am 24 years old and I study computer science at the University of Tübingen. I started using Blender in 2007 (Blender 2.45) and became involved with Blender development during the 2.5x project. I helped with the new 2.5x interface code and RNA system. In 2011 I started to contribute to the Cycles render engine, making it my main code project since then. I enjoy writing rendering code, making Cycles faster and more efficient. In my free time I enjoy playing the piano or going for a walk.