Documente Academic
Documente Profesional
Documente Cultură
new frame added marginalized new frame added marginalized new frame added
Fig. 3: Evolution of the factor graph during tracking. After adding a new frame and optimizing the estimates of the variables
in the graph, all variables except the keyframe pose and the current frame pose, velocity and biases are marginalized out.
This process is repeated with each new frame.
where is the Huber norm. This objective is minimized using frame. According to the IMU measurements of rotational
the iteratively re-weighted Levenberg-Marquardt algorithm. velocities z and linear accelerations az the pose of the
2) Depth Estimation: Scene geometry is estimated for IMU evolves as
pixels of the key frame with high image gradient, since they
provide stable disparity estimates. Fig. 1 shows an example p = v (6)
of such a semi-dense depth reconstruction. We estimate depth v = R (az + a ba ) + g (7)
both from static stereo (i.e., using images from different R = R [ z + b ] (8)
physical cameras, but taken at the same point in time) as
well as from temporal stereo (i.e., using images from the where [] is the skew-symmetric matrix such that for vectors
same physical camera, taken at different points in time). a, b, [a] b = a b. The process noise a , , b,a , and b,
We initialize the depth map with the propagated depth affect the measurements and their biases ba and b with
from the previous keyframe. The depth map is subsequently Gaussian white noise. Hence, for the biases ba = b,a and
updated with new observations in a pixel-wise depth-filtering b = b, . Note that we neglect the effect of Coriolis force
framework. We also regularize the depth maps spatially and in this model.
remove outliers [4]. IMU measurements typically arrive at a much higher
a) Static Stereo: We determine the static stereo dispar- frequency than camera frames. We do not add independent
ity at a pixel by a correspondence search along its epipolar residuals for each individual IMU measurement, but integrate
line in the other stereo image. In our case of stereo-rectified the measurements into a condensed IMU measurement be-
images, this search can be performed very efficiently along tween the image frames. In order to avoid frequent reintegra-
horizontal lines. We use the SSD photometric error over five tion if the pose or bias estimates change during optimization,
pixels along the scanline as a correspondence measure. If a we follow the pre-integration approach proposed in [22]
depth estimate with uncertainty is available, the search range and [14]. We integrate the IMU measurements between
along the epipolar line can be significantly reduced. Due to timestamps i and j in the IMU coordinate frame and obtain
the fixed baseline, we limit disparity estimation to pixels with pseudo-measurements pij , v ij , and Rij .
significant gradient along the epipolar line, making the depth We initialize pseudo-measurements with pii = 0,
reconstruction semi-dense. v ii = 0, Rii = I, and assuming the time between IMU
Static stereo is integrated in two ways. If a new stereo measurements is t we integrate the raw measurements:
keyframe is created, the static stereo in this keyframe stereo
pair is used to initialize the depth map. During tracking, static pik+1 = pik + v ik t (9)
stereo in the current frame is propagated to the reference v ik+1 = v ik + Rik (az ba ) t (10)
frame and fused with its depth map.
Rik+1 = Rik exp([ z b ] t). (11)
b) Temporal Stereo: For temporal stereo we estimate
disparity between the current frame and the reference Given the initial state and integrated measurements the state
keyframe using the pose estimate obtained through tracking. at the next time-step can be predicted:
These estimates are fused in the keyframe. Only pixels
are updated with temporal stereo, whose expected inverse 1
pj = pi + (tj ti )v i + (tj ti )2 g + Ri pij (12)
depth error is sufficiently small. This also constrains depth 2
estimates to pixels with high image gradient along the v j = v i + (tj ti )g + Ri v ij (13)
epipolar line, producing a semi-dense depth map. Rj = Ri Rij . (14)
B. IMU Integration For the previous state si1 and IMU measurements ai1 ,
Underlying our IMU error function terms is the following i1 between frames i and i 1, the method yields a
nonlinear dynamical model. Let the pose consist of the prediction
position p and rotation R of the IMU expressed in the world
frame. Note that the velocity estimate v also is in the world si := h i1 , v i1 , bi1 , ai1 , i1
b (15)
Orientation error [deg]
Fig. 4: Orientation error evaluated over different segment Fig. 5: Translational drift evaluated over different segment
lengths for the three EuRoC dataset sequences (from top lengths for the three EuRoC sequences (from top to bottom:
to bottom: stage 1 task 2.1, 2.2, 2.3). While both loosely- stage 1 task 2.1, 2.2, 2.3). As for rotation (Fig. 4), the tightly
coupled and tightly-coupled IMU integration significantly coupled approach clearly performs best.
decrease the error as global roll and pitch become observable,
the tightly coupled approach is clearly superior. In particular
in the last sequence which includes strong motion blur and the error function E(s) can be approximated around the
illumination changes direct tracking directly benefits from current state s with a quadratic function
tight IMU integration. See also Fig. 5. 1
E(s s) = Es + sT bs + sT Hs s (21)
2
bs = JTs Wr(s) (22)
of the pose, velocity, and biases in frame i with associated Hs = JTs WJs (23)
covariance estimate b s,i . Hence, the IMU error function
terms are where bs is the Jacobian and Hs is the Hessian approxi-
mation of E(s) and s is a right-multiplied increment to
T b 1
E IMU (si1 , si ) := (si b
si ) s,i (si b
si ) . (16) the current state. This function is minimized through s =
H1s bs , yielding the state update s s s. This update
and relinearization process is repeated until convergence.
C. Optimization D. Partial Marginalization
The error function in eq. (3) can be written as To constrain the size of the optimization problem, we
perform partial marginalization and keep the set of optimized
1 T
E(s) = r Wr (17) states at a small constant size. Specifically, we only optimize
2 for the current frame state si , the state of the previous frame
1 T WI 0 rI
= r I r TIMU . (18) si1 , and the state sref of the reference frame used for
2 0 WIMU r IMU
tracking. If we split our state space s into s and s , where
The weights either implement the Huber norm on the pho- s are the state variables we want to keep in the optimization,
tometric residuals r I using iteratively re-weighted least- and s are the parts of the state that we want to marginalize
squares, or correspond to the inverse covariances of the IMU out, we can rewrite the update step as follows
residuals r IMU (eq.(16)). We optimize this objective using
the Levenberg-Marquardt method. Linearizing the residual H H s b
= . (24)
around the current state H H s b
Applying the Schur complement to the upper part of the
r(s s) = r(s) + Js s (19) system we find
1
where s = (H ) b , (25)
H = H H H1
H , (26)
dr(s s)
Js = , (20)
ds
s=0
b = b H H1
b . (27)
which represents a system for E (s ) with states s 4
aslam-mono
marginalized out. Figure 3 shows the evolution of the graph 3 msckf-mono
aslam
with the marginalization procedure applied after adding every
2 our method
new frame to the graph. ground thruth
1
E. Changing the Linearization Point
y [m]
0
Partial marginalization fixes the linearization point of s
for the quantities involving both s and s in eq. (25). 1
Further optimization, however, changes the linearization 2
point such that a relinearization would be required. We
3
avoid the tedious explicit relinearization using a first-order
approximation. If we represent the new linearization point 4
4 3 2 1 0 1 2 3 4
s0 by the old linearization point s and an increment s , x [m]
2.5
s0 = s s1 s0 , (28) aslam-mono
| {z } 2.0 msckf-mono
aslam
1.5
z [m]
=:s our method
1.0 ground thruth
0.5
we can change the linearization point of the current quadratic 0.0
approximation of E through 0 100 200 300 400
t [s]
500 600 700 800 900
1.6
1.4 our method
1.2 aslam
1 1.0 aslam-mono
E (s0 s ) = E0 + sT b0 + sT H0 0 s , (31) 0.8 msckf-mono
2 0.6
1 0.4
0.2
E0 = E + s T H s + sT b , (32) 0.0
2 60 180 300 420 540 660 780 900 1020 1140
b0 = b + H s , (33) Path length [m]