UPDATE: Recode in progress, see the schedule. Right now NOTHING WORKS!
The new project is now being developed as a separate library. Once it works well, an interface to it will be added to Blender for motion tracking. Separating the tracker into a library makes testing much easier, and also allows fancy Python integration via Boost.
Is the motion tracker ready for testing? - I know you guys are excited, but sadly not yet. Preliminary tracking of features in video was working, but it is broken at the moment, halfway through a refactor. Also, there is currently no reconstruction of the camera path (that's the hard part!).
When is this going to be ready? - Hopefully it will be ready for preliminary testing (where it won't crash, but may not give amazing track results) by mid June.
How can I help? - Later on I'll need sample videos people want to track for testing. I'm (Keir) not ready for testing at the moment, but there's no reason I can't collect test videos now. I'd like a collection of HD and low-res material, but preferably video people are actually interested in tracking. I'm on IRC most of the time in #blendercoders as 'keir', or available via email (lastname at gmail).
Wait, you're not an animator, you have no idea what we animators need! - Please add a use case below illustrating how you want to use the tracking software, so we can evaluate our implementation's critical path (sadly there isn't time to do absolutely everything).
I'm so excited about optical mocap! - Sorry, this is for recovering camera motion and scene structure. Mocap is an entirely different can of worms. Maybe someday, but not for a while.
Where can I hear what's happening on this project? - See my tracking diary and status page.
Motion Tracking and Reconstruction
One of the areas identified as missing from Blender (see Competitive Analysis) is motion tracking, i.e. replicating the motion of a camera in space given a video sequence. This project aims to solve that problem, and some others along the way. For an example of a commercial offering, see 2d3. While it is unlikely that Blender's reconstruction and tracking will be as sophisticated as 2d3's, it will still be useful in real situations.
The previous version of this page is available at User:Damiles/MotionTracking. It has more links to papers, a demo video, and a preliminary version of this patch.
The user has a video sequence of a static scene captured with a handheld, low quality, CCD camera. They wish to re-create the camera motion in Blender, so they can insert a simple text message. In order to accomplish this, they perform the following:
- They load Blender, and load a file.
- They select one of the scene's cameras.
- In the buttons window, they click the 'edit mode' button, revealing 'Link and Materials' and 'Camera' panels.
- In the 'Camera' panel, they toggle the (new) 'Tracked' button, which brings up another panel, 'Tracking'.
- In the tracking panel, they see the following items:
- Image sequence - A library selector which can also load from file.
- Display Markers - When toggled, shows markers (tracked features) when playing sequence in Blender.
- Calibration - Can be set to auto, or can select existing calibration information from the library.
- Solve - Runs the camera solve engine
- They click the 'Solve' button without changing any settings, and watch as features are automatically placed on the scene, and tracked through the video. Features that drop off the screen edge or cannot be tracked are replaced by newly found ones.
- Empties in the scene are placed at the estimated locations of tracked points, and lines (which are set not to render) are placed in the scene at their estimated locations.
- The solver plays through the video again, this time moving the camera in sync with the video. The motion is not perfectly matched, but is sufficient for the user.
- The user adds their CG elements based on the positions of tracked points, and renders the scene.
- The user notices that the solve process discovered the camera's internal parameters and radial distortion coefficients. They plan on using them in the future for quicker solves.
Currently, there are some issues with how to handle the automatic combination of tracking with camera background display in 3D mode. Right now, if you want the backbuffer to be a video sequence, you set it in the Render buttons. But this is unconnected to the 'Use Background Image' option, which is settable in a 3D view by clicking View -> Background Image and choosing a static image or movie. The frames used are tied not to the camera but to that specific view. This is unsuitable for tracking.
Camera tracking with constraints
Camera tracking with simple planar reconstruction
- Mention possibilities when used with the re-topo tool.
Using reconstructions as masks
This part is more speculative thinking about the far future.
Consider the case where the user is inserting computer generated elements into the scene (as usual), however they face difficulty when some part of the scene (captured in the video sequence) should appear as in front of the rendered object (i.e. the rendered object is partially occluded by the scene object). In cases where the occluding object is mostly piecewise planar, the user can simply reconstruct it and add it into the 3D scene. However, in other cases this is not possible; what is needed is a mask on the object. Conceivably the optical flow algorithm we use for tracking could be applied to smart mask building.
The interface will consist of the following:
- A new 'Tracks Video' toggle button on the Camera panel under edit buttons.
- A new edit panel, which is visible on buttons windows when in edit mode with a camera selected.
- A new constraints panel under edit mode which lets the user specify solver constraints (such as 'selected points are coplanar')
- A new node which outputs each image sequence frame in sync with the camera movements. The single node will have multiple image outputs, allowing flexibility in how the tracker is used:
- An image output for the frame
- An image output for tracked points (which may not be exactly the same as the recovered scene objects)
- An image output for tracked lines
- (Damiles, can you do this?) A new menu entry 'Display tracks' under the 3D view which is only active when the 3D view is viewing a camera which also has a motion track. This lets the user see the quality of the tracked features (not necessarily the solve) without rendering.
- (Damiles, can you do this?) When tracks are displayed, the user can select them and delete/modify them. This should perhaps be implemented as empties living in the image plane.
- (undecided how this will work) A panel which lets the user connect empties from a previous solve to arbitrary features in the video. This is necessary because it is unlikely the feature tracker will be able to re-connect tracked points from the start of the video with the same scene points if they reappear (i.e. an object is panned off screen, then panned back in).
Tracking feature correspondence
Any system for automatically solving simultaneous motion and structure must somehow track the movement of scene structure across each frame of the video sequence; either by tracking the positions of specific points (i.e. the corner of a table, or a dot on a cup) or lines (edge of a desk) or higher-order objects (planar surfaces, piece-wise planar objects).
The initial implementation will track corners (detected by OpenCV's cvGoodFeaturesToTrack) via optical flow; in the (hopefully near) future, it will also track lines.
The current implementation tracks points via the Kanade-Lucas-Tomasi Feature Tracker found in OpenCV.
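To make the idea concrete, here is a toy sketch of frame-to-frame feature tracking. The real tracker uses OpenCV's pyramidal Kanade-Lucas-Tomasi implementation; this stand-in replaces it with brute-force patch matching by sum of squared differences, which is far slower and less robust, but shows the underlying idea: find where a small patch around a feature reappears in the next frame. All names and values here are illustrative, not libmv/OpenCV API.

```python
def track_patch(prev_img, next_img, x, y, half=1, radius=3):
    """Return the (x, y) position in next_img best matching the patch
    centred at (x, y) in prev_img, searched within `radius` pixels."""
    def patch(img, cx, cy):
        return [img[cy + dy][cx + dx]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)]
    target = patch(prev_img, x, y)
    best, best_ssd = (x, y), float("inf")
    for ny in range(y - radius, y + radius + 1):
        for nx in range(x - radius, x + radius + 1):
            cand = patch(next_img, nx, ny)
            ssd = sum((a - b) ** 2 for a, b in zip(target, cand))
            if ssd < best_ssd:
                best, best_ssd = (nx, ny), ssd
    return best

# A bright 'corner' at (4, 4) in frame 1 moves to (6, 5) in frame 2.
f1 = [[0] * 12 for _ in range(12)]
f2 = [[0] * 12 for _ in range(12)]
f1[4][4] = 255
f2[5][6] = 255
print(track_patch(f1, f2, 4, 4))  # -> (6, 5)
```

KLT improves on this by using image gradients to solve for sub-pixel motion directly instead of searching exhaustively, and by running over an image pyramid to handle large motions.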
Lines are a far more stable feature than points. While the exact endpoints of a line are not usually known to high precision, the orientation of a line in the scene can be computed with high (sub-pixel) precision. By using lines, higher quality reconstructions are possible.
A compelling interest point detector and localizer is David Lowe's SIFT. It offers the possibility of re-establishing some correspondences in long camera tracks where features disappear and reappear from view; for example, if the camera starts at the front of a house and travels entirely around it back to its starting place, knowing correspondences from the first frame to the last frame is a significant constraint that improves the quality of camera solves.
Solving and Optimization Engine
Topics to cover: sparse solvers, constrained optimization, RANSAC, MLESAC, links to the Hartley & Zisserman book, the sparse bundle adjustment library, and factorization methods.
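Since RANSAC comes up throughout multiview reconstruction (for robustly fitting fundamental matrices, homographies, and trifocal tensors despite outlier correspondences), here is a minimal sketch of the scheme. The 'model' is just a 2D line fitted from minimal two-point samples, but the structure (sample minimally, score by inlier count, keep the best) is the same for the geometric models the solver needs. Parameter values are illustrative.

```python
import random

def ransac_line(points, iters=200, thresh=0.5, seed=0):
    """Fit y = m*x + b robustly: repeatedly fit from a minimal sample
    and keep the model that explains the most points."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # degenerate minimal sample; skip
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (m * x + b)) < thresh]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers

# Ten points on y = 2x + 1, plus two gross outliers.
pts = [(x, 2 * x + 1) for x in range(10)] + [(3, 40), (7, -20)]
(m, b), inliers = ransac_line(pts)
print(m, b, len(inliers))  # -> 2.0 1.0 10
```

MLESAC keeps the same sampling loop but scores hypotheses by likelihood rather than by a hard inlier count; in practice the winning model is then refined by a final least-squares (or bundle adjustment) step over its inliers.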
A good overview of how state-of-the-art tracking engines work is available. It is a very good description of the practical issues encountered while building industrial-strength camera tracking systems.
After consideration of the different methods of coming up with an initial projective estimate, I will implement a method based on hierarchical estimation of trifocal tensors, which has the nice property of distributing the error across the entire sequence (rather than recursively estimating structure based on consecutive frames).
An implementation of the following papers will need to happen for the tracker to work:
- A Six Point Solution for Structure and Motion. F. Schaffalitzky, A. Zisserman, R. I. Hartley, and P. H. S. Torr. ECCV 2000.
- More soon.
- Storing camera track information (i.e. results of computation and optimization)
- Storing camera calibration information from autocalibration (radial distortion, focal length)
The solver needs, or finds, many parameters; each one becomes part of a large-scale optimization procedure. Where they are stored is described below:
- Each camera has a matrix associated with it (in general a 3x4 homogeneous matrix, usually called P in the literature) stored as TrackCamera.P. Since the goal in this case is a Euclidean reconstruction, each TrackCamera also has an R (3x3) matrix and a t (3x1) vector stored as TrackCamera.R and TrackCamera.t. R may be changed to a quaternion representation in the future.
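How the stored quantities relate: for a Euclidean reconstruction the 3x4 projection matrix factors as P = K [R | t], where K holds the intrinsics and R, t the pose (the TrackCamera.R / TrackCamera.t above). A world point X projects to homogeneous image coordinates x = P X. A plain-Python sketch, with made-up intrinsics:

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def project(K, R, t, X):
    """Project 3D point X = (x, y, z) to pixel coordinates via P = K [R | t]."""
    Rt = [R[i] + [t[i]] for i in range(3)]     # 3x4 [R | t]
    P = mat_mul(K, Rt)                         # 3x4 projection matrix
    Xh = list(X) + [1.0]                       # homogeneous point
    xh = [sum(P[i][j] * Xh[j] for j in range(4)) for i in range(3)]
    return (xh[0] / xh[2], xh[1] / xh[2])      # divide out the scale

# Identity rotation, camera at origin, focal length 800 px, principal
# point (320, 240): a point 4 units straight ahead lands on the principal
# point; one offset 1 unit in x lands 200 px to the right.
K = [[800, 0, 320], [0, 800, 240], [0, 0, 1]]
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0, 0, 0]
print(project(K, R, t, (0, 0, 4)))  # -> (320.0, 240.0)
print(project(K, R, t, (1, 0, 4)))  # -> (520.0, 240.0)
```

The optimization engine adjusts R, t (and possibly K) for every camera, plus every 3D point, to minimize the reprojection error between points projected this way and the tracked 2D features.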
Real cameras generally have non-trivial radial and transverse distortion. See, for example, the camera calibration toolbox for more information. However, when rendering new objects into a scene recorded with a camera that has distortion, the objects will appear (slightly) wrong, as they are perfectly projected with no distortion. There are two solutions. The first is to invert the camera distortion in the source image sequence, which results in a non-rectangular image sequence (i.e. one which will require cropping) with normal rendering on top. Alternately, the rendered images to be composited with the original sequence can be distorted in the same manner as the original camera distorts the world; this would make the rendered objects appear as though they were captured by the same real camera.
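A sketch of the second option above, distorting rendered coordinates to match the real camera. This uses the common polynomial (Brown) radial model x_d = x * (1 + k1*r^2 + k2*r^4) on normalized image coordinates; the coefficients k1, k2 here are made-up values, standing in for the ones autocalibration would recover.

```python
def distort(x, y, k1, k2):
    """Apply polynomial radial distortion to normalized coordinates (x, y)."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor

# With barrel distortion (k1 < 0), points are pulled toward the centre,
# and the effect grows with distance from the optical axis.
print(distort(0.0, 0.0, -0.2, 0.05))  # centre unchanged -> (0.0, 0.0)
print(distort(0.5, 0.0, -0.2, 0.05))  # -> (0.4765625, 0.0)
```

Undistorting the footage instead (the first option) means numerically inverting this mapping per pixel, which is why the result is non-rectangular and needs cropping.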
Chromatic Aberration node
In the same way that real cameras distort the world, they also have chromatic aberration (usually purple fringing). This is also something which could be simulated and applied to rendered objects.
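Purple fringing arises because the red and blue channels are magnified slightly differently from green (lateral chromatic aberration). A minimal simulation is to scale each channel's sampling coordinates about the image centre by a per-channel factor; the factors below are illustrative, not calibrated to any real lens.

```python
def aberrate_coord(x, y, cx, cy, scale):
    """Scale (x, y) about the centre (cx, cy) by a per-channel factor."""
    return cx + (x - cx) * scale, cy + (y - cy) * scale

cx, cy = 320, 240
# Sample point far from the centre: red magnified, green exact, blue shrunk.
for channel, s in [("R", 1.002), ("G", 1.000), ("B", 0.998)]:
    print(channel, aberrate_coord(600, 240, cx, cy, s))
```

A compositing node would resample each channel of the rendered image through such a mapping, so the fringes on CG objects line up with those in the footage.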
Real CCD cameras generally have noticeable sensor noise. It is conceivable that Blender could estimate a generative noise model from the camera sequence, and apply it to the rendered objects. This would further enhance the realism of the final rendered sequence.
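One very simple generative model of this kind: estimate the noise level from a flat region of the captured footage (where all variation is noise), then add matched zero-mean Gaussian noise to the rendered pixels. Real sensor noise is intensity-dependent and spatially correlated, so this is only the crudest sketch of the idea.

```python
import random

def estimate_sigma(samples):
    """Standard deviation of a flat-patch sample (all variation = noise)."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return var ** 0.5

def add_noise(pixels, sigma, seed=0):
    """Add zero-mean Gaussian noise of the estimated level to pixels."""
    rng = random.Random(seed)
    return [p + rng.gauss(0.0, sigma) for p in pixels]

# Fake a flat grey patch from the 'footage' with noise sigma = 2.0 ...
rng = random.Random(1)
patch = [128 + rng.gauss(0.0, 2.0) for _ in range(5000)]
sigma = estimate_sigma(patch)  # recovered level, close to 2.0
# ... and apply the matched noise to a row of rendered pixels.
noisy = add_noise([200.0] * 8, sigma)
```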
Better optical flow code
It'd be really nice to have a fast implementation of a state-of-the-art optical flow technique. It doesn't look too difficult to implement, but it would take some time. Volunteers? This would be handy for all sorts of tracking beyond purely multiview reconstruction.
Getting the code
The code is maintained in Bazaar ('bzr'). It is a branch from the Launchpad bf-blender import, which can be found at: http://bazaar.launchpad.net/~vcs-imports/blender/bf-blender/. The suggested method for contributing to the motion tracking project is the following:
- Initialize a shared repository somewhere with lots of space. Using a shared repository saves disk space by storing the revisions shared between branches in a common location.
- mkdir blender-repository
- cd blender-repository
- bzr init-repo --trees .
- Branch bf-blender
- bzr branch http://bazaar.launchpad.net/~vcs-imports/blender/bf-blender/
- Wait patiently. This takes a long time. The progress bar will appear to hang for a long time on Fetching (1/4).
- Once that is complete, you are ready to pull from one of the motiontrack branches (do this from within blender-repository but not bf-blender):
- You are ready to contribute. Edit the code in motiontrack/, optionally committing as necessary. Please publish your shared repository somewhere via http; it is a simple matter of exposing the blender-repository directory via http and putting the URL below. Then others can merge from your version.
The reason to branch from Launchpad first is that when you later branch from my machine (geex.ath.cx), bzr will only pull the relatively few revisions not already present in the Launchpad branch. This saves bandwidth on my poor residential DSL, and saves you time.
Note that now you can easily create new branches. From within the blender-repository directory:
- bzr branch bf-blender my-new-branch-here
List of branches
- Keir's Branch: The main coder for camera solving is Keir, and his branch is found at:
- Please do not pull directly from this repository, because this is a residential DSL. Instead follow the above instructions.
- David's branch: David is doing the UI and node / compositing component.
- Add your branch here please!
- Add variable which toggles drawing features overtop of image viewer for tracked image sequences
- Add a menu item in the image viewer under View which toggles drawing tracked features
- Draw tracked features in drawimage.c's drawimagespace() function
- Clicking the 'Tracker Properties' menu item (which shows the tracker properties panel in the image viewer) should not cause a tracker object to be added. That should happen somewhere else.
- Analyze the competition!
- Keir Mierle is a masters student at the University of Toronto's Edward S. Rogers Department of Electrical and Computer Engineering. He is working on this project as part of his thesis.
- Damiles holds a technical engineering degree in computer science from the Polytechnic University of Valencia (UPV).
- _Po_: Emmanuel Pognant
- ibkanat: Ken Williamson
- User:Damiles/MotionTracking - Project Goals, Process, Progress, Papers, etc...