From BlenderWiki

Jump to: navigation, search
Note: This is an archived version of the Blender Developer Wiki. The current and active wiki is available on wiki.blender.org.

BMesh Branch Review

As of 36634.

Summery

First of all, BMesh has many TODO's - see:
http://wiki.blender.org/index.php/Dev:2.5/Source/Development/Todo/BMesh

Aside from those, this is my opinion regarding the status of BMesh and the main things that should be resolved before merging.

  • The BMesh API is generally fine however I don't think the BMO operator API should be so high level (details below), appreciate feedback from other developers here.
  • BMesh data structures need cleaning (~1.7x memory usage compared to trunk)
    looks like some members were added for single uses, would like to see this refined, remove unused members and remove members which were added for one off usage (where allocating an array for example can replace having these struct members), memory usage will still grow but best not be wasteful.

    Edit, Joe removed some members, usage is now ~1.5x.
  • BMesh is much slower at deforming subsurf, approx 2-3x overall slower. Should be resolved before merging.

--Ideasman42 09:25, 12 May 2011 (CEST)

Changes that should be reviewed separately

Where these features are not essential they could be removed from the BMesh branch and made into patches, or merged into trunk if the module owner is ok with it.

  • MEM_guardedalloc has its MemHead struct made public, adds function WMEM_freeN which is only used once in bmesh_tools.c. Otherwise trunk's MEM_ source works fine in BMesh, so need to see if this is worth merging.
  • GHOST C API adds GHOST_RecordEvents(), GHOST_StopRecording(), GHOST_PlaybackEvents(). Event recording seems completely separated from bmesh.
  • shape keys have an added Key->uidgen, Custom data layers & KeyBlock added a 'uid', seems unrelated to bmesh, what does this solve?

Style Suggestions

BM_TestHFlag

Macro BM_TestHFlag() includes a NULL check, IMHO this encourages defensive programming where there are NULL checks when really the developer should know when a variable is NULL and account for it explicitly, in a way others know the expected state, also means macros that don't have NULL check will crash anyway.

Selection checking

Selection and hidden states are both checked, in 2.4x there is a convention that if something is selected, then it cant be hidden, I think this is a good convention to keep, so unless theres a good reason to support this, could lines...

if(!BM_TestHFlag(eve, BM_HIDDEN) && BM_TestHFlag(eve, BM_SELECT))

Be replaced with this?

 if(BM_TestHFlag(eve, BM_SELECT))

BLI_array_*

BLI_array_* macros are IMHO too convoluted.

BLI_array_growone(verts);

expands into ...

((void *)(verts)==((void*)0) && (void *)(_verts_static) != ((void*)0) ? 
((verts=(void*)_verts_static), ++_verts_count) : ((signed int)(((void *)(verts) 
== (void *)_verts_static &&(void *)(verts) != ((void*)0)) ? (sizeof(_verts_static) 
/ sizeof(*verts)) : ((verts)==((void*)0) ? 0 : EM_allocN_len(verts) /
sizeof(*verts)))) > _verts_count ? ++_verts_count : ((_verts_tmp= 
MEM_callocN(sizeof(*verts)*(_verts_count*2 +2), "verts" " "
"/data/src/blender/bmesh/source/blender/bmesh/operators/removedoubles.c"
" ")), (verts && memcpy(_verts_tmp, verts, sizeof(*verts) * _verts_count)), 
(verts && ((void *)(verts) != (void*)_verts_static ? (_MEM_freeN(verts, 
"/data/src/blender/bmesh/source/blender/bmesh/operators/removedoubles.c",
 462), verts) : verts)), 
(verts = _verts_tmp), _verts_count++))

these can be made into functions.

bmesh_class.h struct defintions

bmesh_class.h splits up structs in a way which makes them hard to read & IMHO is too fine grained. Being able to open a header and know what the stricture is. is nice, but with this you need to follow each define.

If these defines are useful to keep then a commented, expanded version if the struct would still be useful for devs reading the code, these shouldnt change often so while its a kludge is still an improvement IMHO.

BM_ADJ_* macros are all only used once, so they could be removed and copied into the definitions - surrounded by comments.

#define BM_BASE_VHEAD\
	BMHeader head;\
	float co[3];\
	float no[3];
 
typedef struct BMBaseVert {
	BM_BASE_VHEAD
} BMBaseVert;
 
/* --- snip --- */
 
#define BM_ADJ_VHEAD(etype)\
	BM_BASE_VHEAD\
	struct etype *e;
 
typedef struct BMVert {
	BM_ADJ_VHEAD(BMEdge)
} BMVert;
  • I removed all that crap; it was part of a plan to make bmesh more usable for modifiers, but that ended up falling through. Joeedh 21:39, 16 May 2011 (CEST)

API

Operator API

BMO - bmeshes internal operator api (not to be confused with blenders operators), has some areas I would like to see changed.

Having a BMO I can see being useful, to handle mesh operators in a consistent way.

But In general it seems to me that BMO's are implemented too high level, (closer to wmOperator's), where this could be avoided since they will be called in C and IMHO there such abstractions can be avoided.

Calling

Firstly calling the operators with strings IMHO is not a good design choice since it allows errors to pass silently and doesn't fit well with a C api. unlike printf formatting the compiler cant find errors in it.

BMO_CallOpf(bm, "removedoubles verts=%fv dist=%f", VERT_MARK, 0.001f);

This could be made into an api call, or, if it must be wrapped by a general operator call it could be done differently.

// typical api call 
BMO_call_removedoubles(bm, VERT_MARK, 0.001f);
 
// if wrapping is needed an intermediate variable can be passed to the caller.
BMO_CallOp(bm, BMO_removedoubles, VERT_MARK, 0.001f));
Type System

BMO's have their own basic type system, a little like IDProperties.

Rather then having a concept of bmesh-operator-slots, why not just have a struct allocated for each bmo (like modifier settings), it can be memcpy'd and we avoid having to define an api to convert, copy, typecheck arguments.

  • That violates the operator design, where slots are meant to be piped to each other. Joeedh 21:39, 16 May 2011 (CEST)

note, slots are analogous to operator arguments.

BMOpSlot *BMO_GetSlot(struct BMOperator *op, const char *slotname);
void BMO_CopySlot(struct BMOperator *source_op, struct BMOperator *dest_op, const char *src, const char *dst);
 
void BMO_Set_Float(struct BMOperator *op, const char *slotname, float f);
float BMO_Get_Float(BMOperator *op, const char *slotname);
void BMO_Set_Int(struct BMOperator *op, const char *slotname, int i);
int BMO_Get_Int(BMOperator *op, const char *slotname);
 
void BMO_Set_Pnt(struct BMOperator *op, const char *slotname, void *p);
void *BMO_Get_Pnt(BMOperator *op, const char *slotname);
void BMO_Set_Vec(struct BMOperator *op, const char *slotname, float *vec);
void BMO_Get_Vec(BMOperator *op, const char *slotname, float *vec_out);
 
void BMO_Set_Mat(struct BMOperator *op, const char *slotname, float *mat, int size);
void BMO_Get_Mat4(struct BMOperator *op, const char *slotname, float mat[4][4]);
void BMO_Get_Mat3(struct BMOperator *op, const char *slotname, float mat[3][3]);
 
void BMO_HeaderFlag_To_Slot(struct BMesh *bm, struct BMOperator *op, const char *slotname, int flag, int type);
 
int BMO_CountSlotBuf(struct BMesh *bm, struct BMOperator *op, const char *slotname);
int BMO_CountSlotMap(struct BMesh *bm, struct BMOperator *op, const char *slotname);

Bugs/General Issues

This excludes the items already marked BMESH_TODO.

Memory Usage

A cube subdivided 9 times with BMesh uses 1.6gig of ram. the same test in trunk uses 0.93gig.

While higher memory usage is inevitable with adjacency info it looks like struct members have been added for convenience and not always used that often.

The 'BMHeader' struct used for verts, loops, edges, faces is quite heavy.

  • int eid; // can be removed, its not used currently.
  • int sysflag; // only checked in 1 place and not set anywhere.
  • short type; // could be a char
  • short flag; // could be made a char if some unused flags are removed.

The 'BMFlagLayer' has unused member 'int index'

Subsurf Slowdown

  • Subsurf is slower in the BMesh branch. simple 2 bone animating suzzane, subsurf level 4 is 5.5-FPS in trunk and 1.2-FPS on the bmesh branch. Also tested in editmode where its slower too.

Benchmark comparison of subsurf-4 Suzzane deformed & hidden, playing the animation, only calculating deform. results from sysprof, release-with-debug-info, without openmp

trunk 36590
[./bin/blender]                                           0.00%  85.96%
  In file /lib/libc-2.13.so                              10.61%  12.69%
  calloc                                                  5.94%   5.94%
  In file /lib/libm-2.13.so                               5.65%   5.65%
  free                                                    5.33%   5.33%
  accumulate_vertex_normals                               4.88%   4.88%
  ccgSubSurf__calcVertNormals.isra.19                     4.50%   4.50%
  ccgSubSurf__calcSubdivLevel                             4.25%   4.25%
  MEM_freeN                                               4.12%   4.12%
  getFaceIndex                                            4.02%   4.02%
  memset                                                  2.41%   3.96%
  In file [heap]                                          0.00%   3.60%
  armature_deform_verts                                   3.13%   3.13%
  memcpy                                                  1.80%   1.80%
  malloc                                                  1.76%   1.76%
  mul_m4_v3                                               1.75%   1.75%
  make_memhead_header                                     1.67%   1.67%
  CustomData_interp                                       1.36%   1.36%
  ccgSubSurf_getFaceUserData                              1.34%   1.34%
  normal_quad_v3                                          1.15%   1.15%
  acosf                                                   1.15%   1.15%
  mesh_calc_normals                                       1.14%   1.14%
bmesh 36589
[./blender.bin]                                           0.00%  94.60%
  ccgDM_getFinalVert                                     30.93%  30.93%
  memset                                                 13.18%  13.20%
  No map [./blender.bin]                                  0.00%  12.04%
  cgdm_loopIterStep                                       4.41%   4.41%
  getCCGDerivedMesh                                       3.37%   3.37%
  mesh_recalcTesselation                                  2.12%   2.33%
  In file [heap]                                          0.00%   1.86%
  In file /lib/libm-2.13.so                               1.67%   1.67%
  accumulate_vertex_normals                               1.65%   1.65%
  new_mem_element                                         1.54%   1.54%
  ccgDM_copyFinalFaceArray                                1.44%   1.44%

Since ccgDM_getFinalVert is identical in both, it seems likely that its being called a lot more though this needs to be further tested.

Derived Mesh CDDM Assumed

Derived mesh type CDDM is assumed in various places when it may not be a CDDM derived mesh type.

  • subsurf's cgdm_newFaceIter(), cgdm_getMinMax() only takes CCGDerivedMesh*, but the function argument is DerivedMesh*.
  • arrayModifier_doArray(), asserts 'All modifiers return CDDM' - is this true?


Update r39987

After looking over BMesh code some more I've come up with some more suggested changes.

Reduce BMO String Lookups

These functions all use string loop lookups (strcmp in a loop to find the named operator slot) ...

  • BMO_Insert_MapInt
  • BMO_Insert_MapFloat
  • BMO_Insert_MapPointer
  • BMO_Get_MapFloat
  • BMO_Get_MapInt
  • BMO_Get_MapPointer

For passing settings to bmesh operators this is ok, but when its called on mesh data - each vertex/edge/face; I think this is really not efficient, especially when its looking up the same slot every time.

Example from: source/blender/bmesh/operators/removedoubles.c

BM_ITER(v, &iter, bm, BM_VERTS_OF_MESH, NULL) {
  if (BMO_Get_MapPointer(bm, op, "targetmap", v)) /* string lookup on every vert! */
    BMO_SetFlag(bm, v, ELE_DEL);
}

Talked with Brigg's and apparently this used to use fixed slot values rather than string lookups, think going back to something like this would be better, or at very least cache the string lookups so they don't get called in loops.

Replace BMO "mesh" and "object" pointer use

Replace "mesh" and "object" Operator pointers.

Object *ob = BMO_Get_Pnt(op, "object");
Mesh *me = BMO_Get_Pnt(op, "mesh");

...with

Object *ob = bm->ob;
Mesh *me = bm->ob->data;

Since these are available from 'bm' I dont see the need to use operator pointers.

Don't inline BLI_smallhash.h

IMHO these functions are too big to inline, BLI_smallhash_insert for example is 45sloc, think these would be better suited to being normal functions.

This won't stop it from using stack memory.

CDDM In-efficiency

Note that these are not high priority compared to crash fixes but think they should be resolved still.

  • CDDM_calc_normals tessellates all polygons twice (calling both cdDM_recalcTesselation2 and cdDM_recalcTesselation).
  • CDDM_tessfaces_to_faces calls CDDM_calc_edges() every time, IMHO this istoo much defensive programming, were a fairly slow function is called just because it *might* be needed, probably callers should be responsible for ensuring there is correct edge data. (in some cases we know the edges are correct - MOD_screw.c is one case) - or this could be made an argument.
    Note, perhaps we can skip optimizing this if it will be replaced eventually by code that writes polygons and loops directly.


Update 43694

Nearing merge, I was still unsure as to what advantages cellalloc gave, so I tested release builds.

The test case was sintel with heavy use of deform groups.

  • all non mesh objects removed
  • all meshes without vertex groups removed.
  • full copy of the scene 8 times.
  • without cellalloc: 721mb
  • with cellalloc: 681mb
  • without cellalloc, with jemalloc: 568mb

Tested on 64bit linux

So to avoid complications from cellmalloc and since we use jemalloc on linux, I'm removing cellalloc from the bmesh branch, the issue of many small allocations can be solved separately.

--Ideasman42 21:15, 25 January 2012 (CET)