GSoC 2016 Proposal
Multi-view Camera Reconstruction for Blender
Hong Kong University of Science and Technology
<irc: hlzz001 on #blendercoders>
The estimation of camera motion plays an important role in the visual effects (VFX) industry because it enables interaction between virtual and real elements. Currently Blender only supports solving camera motion from a single view, which sometimes does not give satisfactory solutions. In this project, I will implement a general-purpose multi-view tracking system that incorporates witness cameras to strengthen the estimation result. The project is composed of a front-end UI integration, mainly for users to specify matched markers across cameras, and a back-end optimization engine that operates on the user input. This multi-view reconstruction system will increase the stability of the camera tracking solver and help artists create high-quality visual works.
Match moving is a commonly used technique in the visual effects industry that allows computer graphics to be inserted into live-action footage. To achieve this, the movement of the camera through a shot must be accurately recovered so that the inserted objects appear in correctly matched perspective. The reconstruction of camera motion from 2D feature points is also known as Structure-from-Motion (SfM). Blender currently supports camera motion estimation from a single shot, with an interactive front-end for users to specify tracks. However, this solution is not robust enough because the baselines between adjacent frames in a single shot are small. Witness cameras, which form a wide baseline with the primary camera, are sometimes brought in to overcome this instability in the solver.
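The effect of the baseline on stability can be illustrated with the standard rectified two-view relation z = f * b / d, where a disparity error dd produces a depth error of roughly z^2 / (f * b) * dd. The Python sketch below is only an illustration of that relation. The function name and all numbers (focal length in pixels, depth and baselines in metres) are assumptions chosen for the example and do not come from Blender or libmv.

```python
def depth_error(depth, focal_px, baseline, disparity_error_px=1.0):
    """First-order depth uncertainty for a rectified two-view setup.

    From z = f * b / d it follows that |dz| ~= z**2 / (f * b) * |dd|.
    All values passed in below are illustrative assumptions.
    """
    return depth ** 2 / (focal_px * baseline) * disparity_error_px


# Hypothetical scene: a point 10 m away, seen with a 1000 px focal length.
narrow = depth_error(depth=10.0, focal_px=1000.0, baseline=0.05)  # adjacent frames
wide = depth_error(depth=10.0, focal_px=1000.0, baseline=2.0)     # witness camera

print(f"narrow baseline: ~{narrow:.2f} m of depth error per pixel")
print(f"wide baseline:   ~{wide:.2f} m of depth error per pixel")
```

Under these assumed numbers the same one-pixel tracking error costs far less depth accuracy with the wide baseline, which is the motivation for adding witness cameras.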
In this project, I will implement an extension for Blender that supports camera motion estimation from multiple views. The project contains a front-end user interface and a back-end processing engine. The interactive user interface is responsible for assigning correspondences between different cameras, in much the same way as the space_clip editor. The back-end engine takes the user-defined markers and their correspondences across different cameras as inputs, and conducts track selection, camera resection, and bundle adjustment in succession. This complete pipeline would help artists produce high-quality visual effects.
Benefits to the Community
The movie industry needs a high-quality match-moving editor to deliver compelling visual work. An enhanced tracking system that supports multiple cameras would raise reconstruction accuracy and enable a more seamless fusion of virtual and real elements. Moreover, it opens up the potential for further enhancement of the whole motion tracking system, such as support for automatic tracking without user intervention.
Related Components in Blender
(1) intern/libmv: the Blender branch of libmv, which is the underlying SfM engine.
(2) source/blender/blenkernel/intern/tracking*: several lower-level internal source files related to tracking in Blender.
(3) source/blender/editors/space_clip*: the motion tracking editor in Blender, which also serves as the wrapper between the back-end engine and the user interface.
The multi-view reconstruction system will be largely consistent with the current system, without introducing any additional third-party library. The current libmv library in Blender partially supports auto-tracking across different cameras, but more functionality needs to be implemented. Specifically, the following aspects need to be addressed:
- Extend the AutoTrack class from a single-shot version to a multi-view version. For example, the tracker should now keep a vector of CameraIntrinsics instead of a single CameraIntrinsics, since the witness cameras may have different focal lengths and distortion parameters from the primary shot. libmv has already conveniently reserved interfaces such as TrackMarkerToFrame, but several classes and methods need to be re-implemented. Some ‘TODO’s marked by Keir should also be finished, such as handling varying focal lengths and matching frames across clips to detect loop closure.
- A TrackConsole management class to record matched frames across clips. This management class records marker correspondences across different clips and provides operational methods such as AddCamera() and GetCorrespondences().
- The reconstruction engine needs minor revisions to accommodate multiple views.
- A user interface to specify matches across clips. This UI will be largely consistent with the space_clip editor.
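To make the bookkeeping described above more concrete, here is a minimal Python sketch of what a TrackConsole-style class could look like. It is not libmv code, and the real implementation would be in C++. Only the names TrackConsole, CameraIntrinsics, AddCamera() and GetCorrespondences() come from this proposal (written in snake_case here); every field, the Marker type, and the add_marker and add_correspondence helpers are hypothetical and serve only as illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CameraIntrinsics:
    """Simplified per-clip intrinsics (hypothetical fields)."""
    focal_length: float
    principal_point: tuple = (0.0, 0.0)
    radial_distortion: tuple = (0.0, 0.0)


@dataclass(frozen=True)
class Marker:
    """A 2D observation of one track in one frame of one clip."""
    clip: int
    frame: int
    track: int
    x: float
    y: float


class TrackConsole:
    """Keeps one intrinsics entry per clip plus cross-clip track matches."""

    def __init__(self):
        self.intrinsics = []              # list index doubles as the clip id
        self.markers = defaultdict(list)  # (clip, track) -> list of Marker
        self.matches = set()              # {((clip, track), (clip, track))}

    def add_camera(self, intrinsics):
        """Register a clip's intrinsics and return its clip id."""
        self.intrinsics.append(intrinsics)
        return len(self.intrinsics) - 1

    def add_marker(self, marker):
        if not 0 <= marker.clip < len(self.intrinsics):
            raise ValueError("marker refers to an unknown clip")
        self.markers[(marker.clip, marker.track)].append(marker)

    def add_correspondence(self, clip_a, track_a, clip_b, track_b):
        """Declare that two tracks in different clips see the same point."""
        if clip_a == clip_b:
            raise ValueError("correspondences must link two different clips")
        # Store each pair in a canonical order so duplicates collapse.
        self.matches.add(tuple(sorted([(clip_a, track_a), (clip_b, track_b)])))

    def get_correspondences(self, clip_a, clip_b):
        """Return sorted (track_in_a, track_in_b) pairs for two clips."""
        result = []
        for (ca, ta), (cb, tb) in self.matches:
            if (ca, cb) == (clip_a, clip_b):
                result.append((ta, tb))
            elif (ca, cb) == (clip_b, clip_a):
                result.append((tb, ta))
        return sorted(result)


console = TrackConsole()
primary = console.add_camera(CameraIntrinsics(focal_length=35.0))
witness = console.add_camera(CameraIntrinsics(focal_length=24.0))
console.add_marker(Marker(clip=primary, frame=1, track=0, x=0.4, y=0.5))
console.add_correspondence(primary, 0, witness, 3)
console.add_correspondence(witness, 5, primary, 1)
pairs = console.get_correspondences(primary, witness)
print(pairs)
```

Keeping the intrinsics in a list indexed by clip id mirrors the idea of storing a vector of CameraIntrinsics, so each witness camera can carry its own focal length and distortion parameters.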
Deliverables and Tentative Schedule
Pre-stage (14 March – 22 May): Get familiar with the Blender codebase, with a focus on the camera tracking component.
Week 1 (22 May – 29 May): Sketch out the back-end pipeline, including organizing tracks in different clips and extending the bundle adjustment solver to multiple clips in the form of a wrapper. Keep it as an independent project for testing. Some of the work may be done in the pre-stage phase.
Week 2 (29 May – 5 June): Design the front-end UI and integrate it into the movie clip editor, or make it a new editor to keep each component separated.
Week 3 (5 June – 12 June): Finish the logic code of the UI.
Week 4~6 (12 June – 26 June): Connect the front-end UI with the back-end SfM engine. By the end of week 6, a complete pipeline should be ready to go.
Week 7 (3 July – 10 July): Performance tuning. We need to pick a set of clips containing a primary camera video and several witness camera videos as the benchmark dataset. Then the goal is to lower the re-projection error of the reconstructed scene as much as possible.
Week 8 (10 July – 17 July): Finish several TODOs marked by Keir and Sergey, such as loop closure detection, to improve the auto-tracking performance.
Week 9 (17 July – 24 July): Iterative development of the UI; revise the UI according to user feedback.
Week 10 (24 July – 31 July): A flexible week to finish incomplete modules.
Week 11~12 (31 July – 14 August): Performance tuning, debugging, and benchmarking.
Final Week (15 August – 23 August): Tidy the code, write test cases and documentation, and run cross-platform tests.
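The re-projection error targeted during the performance tuning weeks can be sketched as follows. This is a minimal Python illustration that assumes an undistorted pinhole camera and uses the root-mean-square pixel distance as the metric. It is not libmv's implementation, and the function names, data layout, and numbers are assumptions made for the example.

```python
import math


def project(point, rotation, translation, focal_length, principal_point):
    """Project a 3D point with an undistorted pinhole camera.

    rotation is a 3x3 nested list (row-major), translation a 3-vector.
    """
    # Transform into the camera frame: X_cam = R * X + t.
    cam = [sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
           for r in range(3)]
    # Perspective division, then apply focal length and principal point.
    u = focal_length * cam[0] / cam[2] + principal_point[0]
    v = focal_length * cam[1] / cam[2] + principal_point[1]
    return u, v


def rms_reprojection_error(observations, points, cameras):
    """RMS pixel distance between observed and re-projected markers.

    observations: list of (camera_index, point_index, u, v) tuples.
    """
    squared = []
    for cam_idx, pt_idx, u, v in observations:
        pu, pv = project(points[pt_idx], **cameras[cam_idx])
        squared.append((pu - u) ** 2 + (pv - v) ** 2)
    return math.sqrt(sum(squared) / len(squared))


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical primary and witness cameras with different intrinsics.
cameras = [
    {"rotation": identity, "translation": [0.0, 0.0, 0.0],
     "focal_length": 1000.0, "principal_point": (960.0, 540.0)},
    {"rotation": identity, "translation": [-2.0, 0.0, 0.0],
     "focal_length": 800.0, "principal_point": (640.0, 360.0)},
]
points = [[0.0, 0.0, 10.0]]
# The first marker is off by (3, 4) pixels; the second one is exact.
observations = [(0, 0, 963.0, 544.0), (1, 0, 480.0, 360.0)]
error = rms_reprojection_error(observations, points, cameras)
print(f"RMS re-projection error: {error:.3f} px")
```

Because every observation carries its own camera index, the same metric covers markers from the primary clip and from the witness clips, which makes it usable as a single benchmark score for the multi-view solver.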
Biographical Information and Related Experience
I am a second-year Ph.D. student at Hong Kong University of Science and Technology (HKUST), mainly working on 3D computer vision. Prior to that, I obtained my bachelor's degree from Peking University, with a double major in artificial intelligence and psychology. I was a visiting student researcher at UC Berkeley in the fall of 2013, working on CAPTCHA recognition. Please visit my homepage (www.tianweishen.com) for more details.
I am generally interested in techniques related to computer vision and graphics. I am familiar with geometric computation in SfM and have open-sourced a multi-threaded image retrieval library, libvot (https://github.com/hlzz/libvot), which can be used to accelerate the matching process in SfM.
Wikipedia: Match moving: https://www.wikiwand.com/en/Match_moving
Morris, Daniel D. Gauge freedoms and uncertainty modeling for 3D computer vision. Diss. Stanford University, 2001.
libmv: https://github.com/libmv/libmv