Course:CPSC522/FastSLAM

Title

SLAM (simultaneous localization and mapping) is the problem of simultaneously localizing the position of agent or robot and mapping or modeling the surrounding area. FastSLAM is a probabilistic factored approach to the problem.

Principal Author: Hooman Hashemi
Secondary Author:

Abstract

In the first part of this page the method FastSLAM is discussed. FastSLAM first presents a graphical model for the problem of SLAM. At the core of the graphical model there is a Markov chain of hidden variables representing the position of the robot. A variation of particle filtering is used to find the high probability samples or paths from this chain which are compatible with the evidence (localization). The conditional probability of landmark positions given a path or sample from the previous section is computable in a feasible manner (mapping). Though both these processes are done together and use output of each other.^[1]

In the second part various problems and approaches to the task of SLAM is presented from a survey ^[2].

Builds on

This page presents a solution to a problem that uses Particle Filtering at the core. The method also relies on basic Probabilistic rules and Graphical Models.

Content

FastSLAM

Here we first present the notations and the graphical model presented for the task. Then explain the approach.

Model

We have a robot that takes control commands as input, moves in the environment and observers and measures the position of distinguishable landmarks (e.g. sift features in computer vision) using its sensors. The following sections try to provide a formal notation for this scenario. The problem is to calculate both the position of the robot and the observed landmarks.

Notation

The robot at time $t$ is at the position $s_{t}$ after receiving the control $u_{t}$ . We assume exactly one landmark $n_{t}$ is observed at time $t$ . The sensor measurement is denoted by $z_{t}$ and depends on position of the robot and the position of the observed landmark $\theta _{n_{t}}$ . For all mentioned variables, when index is at the superscript it points to the history up to time $t$ and when superscripted by $[i]$ it represents the $i$ th sample of particle sampling. For example $s^{t}=\{s_{1},...,s_{t}\}$ is history of one path or sample and $s^{t,[m]}=\{s_{1}^{[m]},...,s_{t}^{[m]}\}$ is the $m$ th path or sample up to time $t$ . The graphical model of the problem with the mentioned notation is presented in Figure 1.^[1]

A graphical model for the fastSLAM approach, the notation is explained in the text.

Factors

Here we assume the poses are determined by the motion model $p(s_{t}|u_{t},s_{t-1})$ , and the sensor measurements are governed by the measurement model $p(z_{t}|\theta ,s_{t},n_{t})$ . In the scenario that the correspondences $n_{t}$ are given, the problem of SLAM can be expressed as $p(s^{t},\theta |z^{t},u^{t},n^{t})$ . One can decompose the above model into an alternative factored representation of,

p(s^{t},\theta |z^{t},u^{t},n^{t})=p(s^{t}|z^{t},u^{t},n^{t})\prod _{k}p(\theta _{k}|s^{t},z^{t},u^{t},n^{t})

This representation breaks the problem into separate path estimation (localization) and landmark estimation (mapping) tasks and might be useful for intuition.

Approach

General Approach

Landmarks conditional probability is efficiently computable given the robot poses, but their value is used during the robot pose estimation. That is for path estimation, at each step samples are updated by the motion model, and then evidence is incorporated using the measurement model. To incorporate the evidence and calculate the importance weights (the amount a path supports the evidence and therefore has the right to stay in the next iteration, be dropped, or duplicated) the landmark positions conditioned on the previous parts of the path is used to calculate the ratio between proposal distribution and the target distribution as the importance weight. The proposal model uses only the motion model but the target distribution incorporates both the motion and the measurement information. It is useful to note here that since landmark positions are computed from the robot poses, each sampled path in the particle filtering has it's own landmark positions.

Path Estimation Using Particle Filtering

A prior knowledge of Particle Filtering is necessary for this part. Here we maintain a set of $m$ particles $S_{t}$ each particle denoted by $s^{t,[m]}=\{s_{1}^{[m]},...,s_{t}^{[m]}\}$ . Each particle in $s^{t-1,[m]}\in S_{t-1}$ generates a guess using the motion model,

s_{t}^{[m]}\sim p(s_{t}|u_{t},s_{t-1}^{[m]})

The set

S_{t}

is generated by re-sampling these guesses proportional to the importance factor

w_{t}^{[m]}

. The weight supports guesses that match the measurement evidence.

w_{t}^{[m]}={\frac {\text{target distribution}}{\text{proposal distribution}}}={\frac {p(s^{t,[m]}|z^{t},u^{t},n^{t})}{p(s^{t,[m]}|z^{t-1},u^{t},n^{t-1})}}

To calculate this ratio the probability is marginalized over all possible

\theta _{n_{t}}

based on previously available evidences. This will be shown in more details after knowing how to calculate position of landmarks in the next section. Although performing an integration is required, the integration or the dot product of two Gaussian has closed form and is easy to compute.

Landmark Locations

The landmarks are given using Gaussian conditional distributions for each path separately. The completed sample set is represented by $S_{t}=\{s^{t,[m]},\mu _{1}^{[m]},\Sigma _{1}^{[m]},...,\mu _{k}^{[m]},\Sigma _{k}^{[m]}\}_{m}$ .

For a single path, using Bayes rule the conditional probability can be computed or updated in an iterative as follows,

p(\theta _{k}|s^{t},z^{t},u^{t},n^{t})\propto _{(Bayes)}p(z_{t}|\theta _{k},s^{t},z^{t-1},u^{t},n^{t})p(\theta _{k}|s^{t},z^{t-1},u^{t},n^{t})

=_{(Independences)}p(z_{t}|\theta _{k},s_{t},n^{t})p(\theta _{k}|s^{t-1},z^{t-1},u^{t-1},n^{t-1})

Since

z_{t}

is not given, it results in the independences used in the second term. And for the landmarks not observed probability of

\theta _{k}

remains unchanged.

The integration

Calculating the importance weight in the previous sections now is possible after knowing how to calculate landmark location probabilities. The intuition was mentioned before, that is to incorporate the evidence $\theta _{n_{t}}$ is required to see how likely the measurement was. But we only have a distribution for $\theta _{n_{t}}$ , so probability of evidence is the marginalization of the joint probability with all possible $\theta _{n_{t}}$ . This can more formally be shown bellow,

w_{t}^{[m]}={\frac {p(s^{t,[m]}|z^{t},u^{t},n^{t})}{p(s^{t,[m]}|z^{t-1},u^{t},n^{t-1})}}\propto _{(Bayes)}{\frac {p(z_{t},n_{t}|s^{t,[m]},z^{t-1},u^{t},n^{t-1})}{p(z_{t},n_{t}|z^{t-1},u^{t},n^{t-1})}}

\propto _{(Bayes)}p(z_{t},n_{t}|s^{t,[m]},z^{t-1},u^{t},n^{t-1})=\int _{\theta }p(z_{t},n_{t}|\theta ,s^{t,[m]},z^{t-1},u^{t},n^{t-1})p(\theta |s^{t,[m]},z^{t-1},u^{t},n^{t-1})d\theta

=_{(independences)}\int _{\theta }p(z_{t},n_{t}|\theta ,s_{t}^{[m]})p(\theta |s^{t-1,[m]},z^{t-1},u^{t-1},n^{t-1})d\theta

=\int _{\theta }p(z_{t}|n_{t},\theta ,s_{t}^{[m]},)p(n_{t}|\theta ,s_{t}^{[m]})p(\theta |s^{t-1,[m]},z^{t-1},u^{t-1},n^{t-1})d\theta \propto _{assumption}

\propto _{(assumption)}\int _{\theta }p(z_{t}|n_{t},\theta ,s_{t}^{[m]})p(\theta |s^{t-1,[m]},z^{t-1},u^{t-1},n^{t-1})d\theta

\approx \int _{\theta _{n_{t}}^{[m]}}p(z_{t}|n_{t},\theta _{n_{t}}^{[m]},s_{t}^{[m]})p(\theta _{n_{t}}^{[m]})d\theta _{n_{t}}^{[m]}

Here we assume that

p(n_{t}|\theta ,s_{t}^{[m]})

is uniform. Assuming both probabilities are Gaussian, the integration has a closed form.^[1]

The Task of SLAM From a Survey

In this part various problems and approaches to the task of SLAM are listed, each one is accompanied with a really short description.^[2]

Various Problems

Short vs Long Baseline Matching

In visual SLAM in some scenarios the sensors are a pair of cameras and the landmarks are 3D coordinates resulted from finding corespondent pairs in the two views.

The baseline is the line connecting the origin of the two cameras. A short baseline results in inaccurate 3D coordinates while A long baseline results in matched pairs having different scales or positions which gives a harder correspondence problems. Each of these scenario are faced with different problems as mentioned.

Loop Closure Detection

One of the impedances faced during large scale SLAM is detecting that if the same place has been visited again after a cyclic path or a similar place is being visited,

This problem is paired with another one called perceptual aliasing when similar landmarks from different places are matched to the same landmark because of similarity or for example due to repetitive features of the environment. In general the detection methods for loop closure may be classified into three categories of map to map, image to map and image to image where each have advantages. But the ideal would be to build a system that combines the advantages of all three.

Kidnapped robot

Another problem is that in some scenarios there is a gap between current position and the previous position of robot and less prior information is available on the current location. The robot can get lost because of fast camera motion, movement with occlusion (blind movement), temporary sensor malfunction, or being transferred to an already mapped zone. One of the approaches to this problem uses various features of the environment and visible landmarks to find the current position and resume tracking when the condition is improved.

Various Approaches

Probabilistic Filters

Most SLAM solutions up to the date of the survey are based on probabilistic methods. From this category one can name Extended Kalman Filter(EKF), Factored solution to SLAM (FastSLAM), maximum likelihood and EM based approaches. The first two are mostly used because they provide the best result when they jointly minimize uncertainty of positions and the landmarks. These methods are more naturally successful in the small scale.

Structure From Motion

The SfM techniques allow to compute 3D structure of the scene and camera position from a set of images. The origin is in Computer Vision. The standard procedure is to extract and match features of incoming images, and perform a non-linear optimization. SfM gives high precision in location of the camera but it does not necessary give an accurate map. There are papers that use similar methods but in real time.

Bio-inspired Models

There are other methods that use biologically inspired models. For example RatSLAM is inspired by models of hippocampus of rodents and can generate simple and consistent representations of complex environments using a single camera and close loops as long as 5km in length.^[2]

Annotated Bibliography

Put your annotated bibliography here. Add links where appropriate.

To Add

Put links and content here to be added. This does not need to be organized, and will not be graded as part of the page. If you find something that might be useful for a page, feel free to put it here.

Permission is granted to copy, distribute and/or modify this document according to the terms in Creative Commons License, Attribution-NonCommercial-ShareAlike 3.0. The full text of this license may be found here: CC by-nc-sa 3.0

↑ ^1.0 ^1.1 ^1.2 M. Montemerlo and S. Thrun and D. Koller and B. Wegbreit, FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem. AAAI, 2002.
↑ ^2.0 ^2.1 ^2.2 J. Fuentes and J. Ruiz and J. Manuel, Visual simultaneous localization and mapping: a survey. AI review, 2015.

[FastSLAM-1] 1.0 ^1.1 ^1.2 M. Montemerlo and S. Thrun and D. Koller and B. Wegbreit, FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem. AAAI, 2002.

[SLAMSurvey-2] 2.0 ^2.1 ^2.2 J. Fuentes and J. Ruiz and J. Manuel, Visual simultaneous localization and mapping: a survey. AI review, 2015.

[1]

[2]