Blender 2.76: OpenSubdiv
The OpenSubdiv project is about making viewport animation of high-resolution characters closer to realtime, helping animators see the final result at higher frame rates.
To reach this goal, the OpenSubdiv library is used.
Long Story Short
OpenSubdiv is a new option of the Subsurf modifier. When this option is enabled for a Subsurf modifier at the very top of the modifier stack (that is, the last modifier), evaluation happens on the compute device selected in the User Preferences. The best performance is achieved with GLSL evaluation.
OpenSubdiv requires a decent video card with the latest video drivers installed. Recommended cards are AMD and NVIDIA models that are at most a few years old.
This is a completely new technology in Blender, so it may still have bugs and glitches, which we'll be happy to address in further Blender releases!
Terms used on this page:
- Base mesh - the mesh before subdivision (the input of the Subsurf modifier)
- Coarse positions - the coordinates of the vertices of the base mesh
What is OpenSubdiv
OpenSubdiv is a library which implements Catmull-Clark subdivision surfaces, with the following strong points:
- Does the heavy computation once, and then refines the result for updated coarse positions much more efficiently.
- GPU-side tessellation of the high-resolution mesh
- Implements an evaluation API which can be used by renderers
- Implements good support of edge sharpness (at least better than Blender originally did)
- Supports multiple compute backends, such as CPU, GLSL, CUDA and OpenCL.
While OpenSubdiv has some really strong points, it is not a magic bullet and has its own limitations. The main one is that it is optimized for cases where the topology of the base mesh doesn't change. If the base mesh changes its topology, OpenSubdiv has no performance benefit over Blender's original implementation.
In less technical terms, OpenSubdiv is aimed at helping animators who need realtime viewport playback with final mesh subdivisions.
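The "compute once, refine many times" idea can be illustrated with a minimal sketch. It is not OpenSubdiv's API and not Catmull-Clark: it uses plain midpoint subdivision of a 2D polyline, and the names `build_stencils` and `refine` are hypothetical. The point is only that the weights depend on topology, so they can be reused while the coarse positions change every frame.

```python
def build_stencils(num_points):
    """Heavy step, done once per topology: precompute refinement weights.

    Each stencil is a list of (coarse_index, weight) pairs describing one
    refined point of an open polyline after one level of midpoint subdivision.
    """
    stencils = []
    for i in range(num_points - 1):
        stencils.append([(i, 1.0)])                 # keep the coarse point
        stencils.append([(i, 0.5), (i + 1, 0.5)])   # midpoint of the segment
    stencils.append([(num_points - 1, 1.0)])        # last coarse point
    return stencils


def refine(stencils, coarse_positions):
    """Cheap step, done every frame: apply the cached weights."""
    refined = []
    for stencil in stencils:
        x = sum(w * coarse_positions[i][0] for i, w in stencil)
        y = sum(w * coarse_positions[i][1] for i, w in stencil)
        refined.append((x, y))
    return refined


# Topology (3 points in a row) is fixed, only positions are animated.
stencils = build_stencils(3)
frame_1 = refine(stencils, [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)])
frame_2 = refine(stencils, [(0.0, 0.0), (2.0, 2.0), (4.0, 0.0)])
```

If the number of points changed between frames, `build_stencils` would have to run again, which is the case where the caching gives no benefit.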
Integration into Blender
OpenSubdiv has been integrated into Blender, and this section covers the crucial information for both artists and developers.
NOTE: Blender has to be compiled with OpenSubdiv support, otherwise the following sections are not relevant.
For artists, OpenSubdiv is really easy to use: it is just a couple of new options in the User Preferences and in the Subsurf modifier.
The first thing to do is to go to the System section of the User Preferences and find the OpenSubdiv Compute option (it is right below the Compute Device option used for Cycles).
The following compute devices are available:
- None - disables all OpenSubdiv compute devices and makes sure the legacy subsurf code of Blender is used. Use this option if OpenSubdiv causes bugs or regressions.
- CPU - single-threaded CPU implementation. It is mainly useful when GPU compute is not possible and the threaded CPU option causes artifacts (this is unlikely to happen, but still possible).
- OpenMP - multi-threaded CPU implementation. It is similar to the threading model of the old subsurf code. Use it for maximum performance when GPU compute is not available.
- GLSL Transform Feedback - uses the GPU to perform calculations and has minimal requirements for the video card and driver.
- GLSL Compute - uses the GPU to perform calculations. It is supposed to be more efficient than Transform Feedback, but has higher requirements for the video card and driver.
Even for CPU evaluation, your graphics card has to support at least geometry shaders and uniform buffers in order to display the mesh in the viewport in an optimal way. If the GPU or its driver doesn't support these features, none of the OpenSubdiv compute devices will be available.
After choosing the fastest option supported by the system, OpenSubdiv is ready to use!
To use it, simply go to the Subsurf modifier settings and enable the Use OpenSubdiv option.
Once this option is enabled, the modifier starts using OpenSubdiv for the computation, hopefully making playback of the scene much faster.
To get maximum performance from OpenSubdiv, the following is required:
- The Subsurf modifier must be the last one in the stack. Since the majority of the performance improvement comes from the GPU, OpenSubdiv cannot be used in the middle of the modifier stack. NOTE: It is a known TODO that OpenSubdiv is currently not used when there are disabled modifiers on top of the Subsurf. This will be solved in the future.
- There should be no modifiers before Subsurf which change the mesh topology over time (if the input topology changes, the Subsurf modifier has to go through all the heavy compute parts of the code, which ruins the performance).
- Other objects should not use the geometry of the OpenSubdiv mesh. Such cases require the mesh to be in CPU memory, which isn't easy or fast to do if the mesh was calculated on the GPU.
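The requirements above can be read as a simple checklist. The sketch below is an illustration only and does not use Blender's Python API: the modifier stack is modeled as a plain list of dictionaries, and the keys `type`, `changes_topology_over_time` and the function name `performance_issues` are hypothetical.

```python
def performance_issues(modifiers, geometry_used_by_other_objects=False):
    """Return a list of reasons why OpenSubdiv would not run at full speed.

    modifiers: list of dicts in stack order (first evaluated first).
    An empty result means all three requirements are met.
    """
    issues = []

    # Requirement 1: Subsurf has to be the last modifier in the stack.
    if not modifiers or modifiers[-1].get("type") != "SUBSURF":
        issues.append("Subsurf is not the last modifier in the stack")

    # Requirement 2: nothing before Subsurf may change topology over time.
    for modifier in modifiers[:-1]:
        if modifier.get("changes_topology_over_time", False):
            issues.append(
                "modifier %s changes topology over time" % modifier.get("type")
            )

    # Requirement 3: no other object may need the subdivided geometry.
    if geometry_used_by_other_objects:
        issues.append("another object uses the subdivided geometry")

    return issues


good_stack = [{"type": "ARMATURE"}, {"type": "SUBSURF"}]
bad_stack = [
    {"type": "BUILD", "changes_topology_over_time": True},
    {"type": "SUBSURF"},
    {"type": "SOLIDIFY"},
]
good_result = performance_issues(good_stack)
bad_result = performance_issues(bad_stack)
```

Returning the list of reasons instead of a single boolean makes it easier to see which requirement has to be fixed.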
There are a number of limitations which cannot be resolved with the current OpenSubdiv release and the current OpenGL profile of Blender itself:
- Smooth normals aren't currently correct. This comes from the limitations of the GLSL level we can currently use in Blender and of what OpenSubdiv can provide. It is being worked on with high priority.
- No OSX support. Historically, Blender has used a rather old OpenGL API, while OpenSubdiv uses OpenGL 4. This makes it impossible to get Blender working with OpenSubdiv on OSX, because Apple's policy forces applications to drop older OpenGL code if the newer API is used.
- No UV map support in the viewport. This limitation is caused by OpenSubdiv, which currently has no good support for UV mapping evaluation on the GPU and has no way to calculate UVs on the CPU either.
- No generated coordinates support. Generated coordinates in Blender require calculating the undeformed mesh on the CPU, which makes it rather complicated to use OpenSubdiv in such a configuration because all its data is on the GPU.
- Tools which require the mesh to be on the CPU (for example snap tools) will fail for an OpenSubdiv mesh. This also includes areas like the geometry primitive counter in the Info header.
- Shading is currently limited to a single material. This is easy to resolve on the Blender side and will happen sooner rather than later.
- Loose edges and vertices are not supported.
- Auto split is not working yet.
- Loop normals are not supported yet either.
The GLSL Compute evaluator is currently disabled for AMD hardware. This is because of an issue in OpenSubdiv itself which is being worked on, but could not be solved before the Blender release.
Intel cards are also currently disabled for OpenSubdiv. This is because of various reported issues with these cards. It is also unclear whether OpenSubdiv is any faster on such hardware. However, it is possible to force-enable Intel cards by setting the OPENSUBDIV_ALLOW_INTEL environment variable.
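As a small illustration, the environment variable can be prepared in a launcher script before Blender is started. This is a sketch under assumptions: the text above only says the variable has to be set, so the value "1" is a guess, and the helper name `environment_with_intel_enabled` is hypothetical.

```python
import os


def environment_with_intel_enabled(base_env=None):
    """Return a copy of the environment with OPENSUBDIV_ALLOW_INTEL set.

    The value "1" is an assumption; the release notes only state that the
    variable needs to be set.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OPENSUBDIV_ALLOW_INTEL"] = "1"
    return env


# Example with an explicit base environment, so the result is predictable.
example_env = environment_with_intel_enabled({"PATH": "/usr/bin"})
```

The resulting dictionary can be passed as the `env` argument of `subprocess.Popen` when launching the Blender executable.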
Reaching Best Performance
There are some tricks to get the best performance out of OpenSubdiv:
- Disable the selection outline. With the current approach to drawing outlines in Blender, drawing an OpenSubdiv mesh with an outline is nearly 2x slower than drawing the shaded mesh alone. This is because the geometry and tessellation shaders need to run for both the outline and the shaded model.
- Switch to the OpenGL Occlusion Queries selection method. Selecting OpenSubdiv meshes might be slower with the default selection method. This is because OpenGL Select mode emulates the geometry and tessellation shaders on the CPU, which is slower than using the real GPU for this.
This section covers the main topics new developers need in order to get into the code.
The main idea behind the design is to eventually switch completely from the current subsurf code to a new one based on OpenSubdiv. From this point of view, OpenSubdiv is seen as a transparent replacement for the CCGSubSurf.c code. In practice it will take some more time to reach this goal, but it will happen eventually.
That being said, the main concepts are:
- OpenSubdiv replaces code in CCGSubSurf.c and adds some extra code in CCGDerivedMesh needed for visualization.
- OpenSubdiv supports both a GPU pipeline (with evaluation on either the CPU or the GPU, and tessellation on the GPU) and a fully CPU pipeline (for compatibility reasons and for render engines).
- If subsurf is the last modifier in the stack, evaluation defaults to the GPU code path.
- If GPU compute is not supported, the CPU path is used instead.
- Both the CPU and GPU pipelines are kept as low-level in Blender as possible. This is to make OpenSubdiv really transparent for the whole of Blender.
- If the GPU code path is used, no CCG geometry (CCGVert, CCGEdge and CCGFace) is created. This is to keep memory usage as low as possible.
Evaluation on CPU
Evaluation on CPU is needed in the following cases:
- Subsurf modifier is somewhere in the middle of modifier stack
- Object is being evaluated for render engine
This part of the new code path mimics the legacy behavior of CCG really closely. CCG geometry is still created in exactly the same way as before, and OpenSubdiv simply replaces the surface evaluation code using the Evaluation API from OpenSubdiv. So the process is roughly:
- Synchronize CCGSubSurf and fill it with verts, edges and faces.
- Ensure the OpenSubdiv evaluator is up to date. Basically, the evaluator is cached so it can be re-used on different frames, but if the topology changed the evaluator needs to be reconstructed.
- Iterate over each face's grids and invoke evaluation of each grid. Evaluation happens at the "knots" of the grid. Since OpenSubdiv operates on faces rather than on grids, the grid coordinate is converted to a face coordinate first. OpenSubdiv's Evaluation API is used to evaluate both the coordinate and the normal, and the result is stored in the CCG grids.
CCGSubSurf is now fully ready for use by CCGDerivedMesh, and none of the areas in Blender even knew that the new code path was used.
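The "ensure the evaluator is up to date" step can be sketched as a small cache keyed by topology. This is a hedged illustration, not the code in CCGSubSurf: the class name `EvaluatorCache`, the `build_evaluator` callback and the use of face vertex indices as the topology key are all assumptions.

```python
class EvaluatorCache:
    """Keeps one evaluator alive for as long as the topology is unchanged."""

    def __init__(self, build_evaluator):
        self._build_evaluator = build_evaluator  # expensive constructor
        self._topology_key = None
        self._evaluator = None
        self.rebuild_count = 0

    def ensure(self, faces):
        """Return an evaluator matching `faces`, rebuilding only if needed.

        faces: iterable of vertex index sequences, one per face.
        """
        key = tuple(tuple(face) for face in faces)
        if key != self._topology_key:
            # Topology changed (or first use): reconstruct the evaluator.
            self._evaluator = self._build_evaluator(faces)
            self._topology_key = key
            self.rebuild_count += 1
        return self._evaluator


# A stand-in for the expensive evaluator construction.
cache = EvaluatorCache(lambda faces: {"num_faces": len(faces)})

quad = [[0, 1, 2, 3]]
first = cache.ensure(quad)        # builds the evaluator
second = cache.ensure(quad)       # same topology: cached evaluator is reused
third = cache.ensure(quad + [[1, 4, 5, 2]])  # topology changed: rebuilt
```

Only the first and the third call pay the construction cost. Frames that merely move vertices hit the cache.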
Evaluation on GPU
The GPU code path is a bit more tricky, mainly because it tries to keep memory usage as low as possible.
When the subsurf modifier is evaluated by the GPU code path, none of the CCG geometry is created, and the base mesh is exported directly to an OpenSubdiv mesh. This is done once and only happens again when the topology changes. This is actually the only thing ensured by stack evaluation; the rest of the magic happens during drawing.
Drawing happens in several steps:
- First of all, the updated coarse coordinates are uploaded to the compute device.
- Device synchronization and refinement are invoked. This makes the mesh ready for drawing.
- These two steps can't be done during stack evaluation because they operate on OpenGL buffers, which can't be done from threads.
- Drawing is based on patches, with as few glDrawElements calls as possible.
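The split between stack evaluation and drawing can be sketched as follows. No OpenGL is involved here: uploads and refinement are only recorded in a log, and the class `DeferredGpuMesh` with its methods is hypothetical. The sketch only shows the ordering, with the work postponed until draw time.

```python
class DeferredGpuMesh:
    """Stores new coarse positions cheaply, does device work only in draw()."""

    def __init__(self):
        self.pending_coords = None   # set by stack evaluation
        self.device_coords = None    # stand-in for a GPU buffer
        self.refined = False
        self.log = []

    def set_coarse_positions(self, coords):
        # Called from modifier stack evaluation (possibly from a thread):
        # only remember the data, touch no device buffers.
        self.pending_coords = list(coords)
        self.refined = False

    def draw(self):
        # Called where the graphics context is available.
        if self.pending_coords is not None:
            self.device_coords = self.pending_coords   # step 1: upload
            self.pending_coords = None
            self.log.append("upload")
        if not self.refined:
            self.refined = True                        # step 2: sync + refine
            self.log.append("refine")
        self.log.append("draw")                        # step 3: draw patches


mesh = DeferredGpuMesh()
mesh.set_coarse_positions([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
mesh.draw()   # uploads, refines, draws
mesh.draw()   # nothing changed: draws only
mesh.set_coarse_positions([(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)])
mesh.draw()   # new frame: uploads and refines again
```

A redraw without new coordinates skips the upload and refinement, so only animated frames pay for them.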
There are four main areas touched by the OpenSubdiv project:
- intern/opensubdiv contains the C-API bindings for the OpenSubdiv library together with some utilities. It also contains a fair amount of patch drawing code.
- source/blender/blenkernel/intern/CCGSubSurf* contains all the code which makes the old CCG (Catmull-Clark Grids) and the new OpenSubdiv subdivision code live together, sharing as much code as possible while keeping a clear separation of the old and new code paths.
- source/blender/blenkernel/intern/subsurf_ccg* contains the modifications to CCGDerivedMesh needed to support OpenSubdiv drawing.
- source/blender/gpu contains some modifications needed to support geometry shaders and face-varying attributes.
The remaining areas of the code are simple to find by searching for WITH_OPENSUBDIV in the sources.
Building with OpenSubdiv
On Windows we have precompiled libraries, and OpenSubdiv is enabled by default for both SCons and CMake. This means nothing special is needed to build Blender with OpenSubdiv other than making sure the lib folder is up to date.
On Linux, OpenSubdiv has to be compiled manually from the latest development branch of OpenSubdiv. Here's a URL to the branch:
After that, make sure all OpenSubdiv-related options in either SCons or CMake are set to proper values, enable OpenSubdiv and build Blender as usual.
The project is not really finished yet and there are some known things to do:
- Add support of generated coordinates
- Add support of UV maps
- Improve export of topology to OpenSubdiv (it might currently break due to possible lack of vert-edge and vert-face orientation)
- Performance improvements are also possible on OpenSubdiv mesh construction
- Performance improvements are also possible on topology changes detection
- Add support for tools like snap, AABB calculation and others which could be invoked after modifier stack evaluation
- Add support for loose edges and vertices
- Add support for CUDA and OpenCL evaluation backends
- Add support of OSX
- Add support for edge auto split
- Custom normals support