\section{Smoothing}
\label{sec:smoothing}
The main purpose of this work is to provide MC smoothing algorithms in the context of indoor localisation.
As mentioned before, these algorithms compute probability distributions of the form $p(\mStateVec_t \mid \mObsVec_{1:T})$ and are therefore able to make use of the observations recorded after time $t$ up to the final time step $T$.
%Especially fixed-lag smoothing is very promising in context of pedestrian localisation.
In the following we discuss the algorithmic details of the forward-backward smoother and the backward simulation.
Further, a novel approach for incorporating them into the localisation system is presented.
\subsection{Forward-Backward Smoother}
The forward-backward smoother (FBS) of \cite{Doucet00:OSM} is a well-established alternative to the simple filter-smoother. The foundation of this algorithm was again laid by Kitagawa in \cite{kitagawa1987non}.
An approximation is given by
\begin{equation}
p(\vec{q}_t \mid \vec{o}_{1:T}) \approx \sum^N_{i=1} W^i_{t \mid T} \delta_{\vec{X}^i_{t}}(\vec{q}_{t}) \enspace,
\label{eq:approxFBS}
\end{equation}
where $W^i_{t \mid T}$ denotes the weight of particle $i$ at time $t$ given all observations up to time $T$, and $\delta_{\vec{X}^i_{t}}(\cdot)$ is the Dirac delta mass located at the particle position $\vec{X}^i_{t}$.
The smoothing distribution $p(\vec{q}_t \mid \vec{o}_{1:T})$ has the same support as the filtering distribution $p(\vec{q}_t \mid \vec{o}_{1:t})$, but the weights are different.
This means that the FBS maintains the original particle locations and merely reweights the particles to obtain a smoothed density.
The complete FBS can be seen in algorithm \ref{alg:forward-backwardSmoother} in pseudo-algorithmic form.
At first, the algorithm obtains the filtered distribution (particles) by performing a complete forward filtering pass over all time steps $t = 1, \dots, T$.
Only after the forward pass has finished, the backward step for determining the smoothing distribution is carried out, starting at $t = T-1$ and proceeding back to $t = 1$.
The smoothed weights are obtained through the backward recursion in line 9.
\begin{algorithm}[t]
\caption{Forward-Backward Smoother}
\label{alg:forward-backwardSmoother}
\begin{algorithmic}[1] % The number tells where the line numbering should start
\For{$t = 1$ \textbf{to} $T$} \Comment{Filtering}
\State{Obtain the weighted trajectories $ \{ W^i_t, \vec{X}^i_t\}^N_{i=1}$}
\EndFor
\For{ $i = 1$ \textbf{to} $N$} \Comment{Initialization}
\State{Set $W^i_{T \mid T} = W^i_T$}
\EndFor
\For{$t = T-1$ \textbf{to} $1$} \Comment{Smoothing}
\For{$i = 1$ \textbf{to} $N$}
\vspace*{0.1cm}
\State{
$
W^i_{t \mid T} = W^i_t \left[ \sum^N_{j=1} W^j_{t+1 \mid T} \frac{p(\vec{X}^j_{t+1} \mid \vec{X}^i_t)}{\sum^N_{k=1} W^k_t ~ p(\vec{X}^j_{t+1} \mid \vec{X}^k_t)} \right]
$}
\EndFor
\EndFor
\end{algorithmic}
\end{algorithm}
By reweighting the filter particles, the FBS improves upon the simple filter-smoother by removing its dependence on the inherited (resampled) paths \cite{fearnhead2010sequential}. However, by looking at algorithm \ref{alg:forward-backwardSmoother} it can easily be seen that this approach computes in $\mathcal{O}(N^2)$ per time step, since the calculation of each particle's weight is an $\mathcal{O}(N)$ operation. To reduce this computational bottleneck, \cite{klaas2006fast} introduced a solution using algorithms from N-body simulation. By integrating dual-tree recursions and fast multipole techniques with the FBS, a run-time cost of $\mathcal{O}(N \log N)$ can be achieved.
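As an illustration, the backward recursion of algorithm \ref{alg:forward-backwardSmoother} can be sketched in Python as follows; the helper `transition_pdf` is a placeholder of our own for the transition model $p(\vec{X}^j_{t+1} \mid \vec{X}^i_t)$, not part of the cited works:

```python
import numpy as np

def fbs_reweight(particles, weights, transition_pdf):
    """Forward-backward smoother: reweight the stored filter particles.

    particles: list of T arrays, each (N, d) -- filter particles per time step
    weights:   list of T arrays, each (N,)   -- normalised filter weights
    transition_pdf(x_next, x_prev) -> (N, N) matrix with entry [i, j]
        giving p(x_next[j] | x_prev[i]).
    Returns the smoothed weights W_{t|T} for every time step.
    """
    T = len(particles)
    smoothed = [None] * T
    smoothed[T - 1] = weights[T - 1]                # W_{T|T} = W_T
    for t in range(T - 2, -1, -1):                  # backward recursion
        P = transition_pdf(particles[t + 1], particles[t])
        denom = weights[t] @ P                      # sum_k W^k_t p(X^j_{t+1} | X^k_t)
        smoothed[t] = weights[t] * (P @ (smoothed[t + 1] / denom))
    return smoothed
```

The $\mathcal{O}(N^2)$ cost per time step is visible in the $N \times N$ transition matrix that the recursion multiplies through; note that the smoothed weights remain normalised by construction.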
\subsection{Backward Simulation}
For smoothing applications with a high number of particles, it is often not necessary to use all particles for smoothing.
This decision can, for example, be made due to a high sample impoverishment and/or highly accurate sensors.
By choosing a good subset for representing the posterior distribution, it is theoretically possible to further improve the estimation.
To this end, \cite{Godsill04:MCS} presented the backward simulation (BS), in which a number of independent sample realisations drawn from the entire smoothing density are used to approximate the smoothing distribution.
%
\begin{algorithm}[t]
\caption{Backward Simulation Smoothing}
\label{alg:backwardSimulation}
\begin{algorithmic}[1] % The number tells where the line numbering should start
\For{$t = 1$ \textbf{to} $T$} \Comment{Filtering}
\State{Obtain the weighted trajectories $ \{ W^i_t, \vec{X}^i_t\}^N_{i=1}$}
\EndFor
\For{ $k = 1$ \textbf{to} $N_{\text{sample}}$}
\State{Choose $\tilde{\vec{q}}^k_T = \vec{X}^i_T$ with probability $W^i_T$} \Comment{Initialize}
\For{$t = T-1$ \textbf{to} $1$} \Comment{Smoothing}
\For{$j = 1$ \textbf{to} $N$}
\State{$W^j_{t \mid t+1} = W^j_t ~ p(\tilde{\vec{q}}^k_{t+1} \mid \vec{X}^j_{t})$}
\EndFor
\State{Choose $\tilde{\vec{q}}^k_t = \vec{X}^j_t$ with probability $W^j_{t\mid t+1}$}
\EndFor
\State{$\tilde{\vec{q}}^k_{1:T} = (\tilde{\vec{q}}^k_1, \tilde{\vec{q}}^k_2, ..., \tilde{\vec{q}}^k_T)$ is one approximate realisation from $p(\vec{q}_{1:T} \mid \vec{o}_{1:T})$}
\EndFor
\end{algorithmic}
\end{algorithm}
%
This method can be seen in algorithm \ref{alg:backwardSimulation} in pseudo-algorithmic form.
Again, a particle filter is performed at first; its weighted particle sets then serve as the support for the subsequent backward sampling pass.
Here, $\tilde{\vec{q}}_t$ is a random sample drawn approximately (approximately, because the continuous density is represented only by the finite particle set) from $p(\vec{q}_{t} \mid \tilde{\vec{q}}_{t+1}, \vec{o}_{1:T})$.
Therefore $\tilde{\vec{q}}_{1:T} = (\tilde{\vec{q}}_{1}, \tilde{\vec{q}}_{2}, ...,\tilde{\vec{q}}_{T})$ is one particular sample
realisation from $p(\vec{q}_{1:T} \mid \vec{o}_{1:T})$.
Further independent realisations are obtained by repeating the algorithm until the desired number $N_{\text{sample}}$ is reached.
The computational complexity for drawing one particular realisation is $\mathcal{O}(N)$ per time step.
However, the computations are then repeated for each realisation drawn \cite{Godsill04:MCS}.
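A minimal sketch of algorithm \ref{alg:backwardSimulation} follows; as before, `transition_pdf` is a placeholder of our own for the transition model, here evaluating $p(\tilde{\vec{q}}_{t+1} \mid \vec{X}^j_t)$ for all $N$ particles at once:

```python
import numpy as np

def backward_simulate(particles, weights, transition_pdf, n_sample, rng):
    """Backward simulation: draw n_sample trajectories from the smoothing density.

    particles: list of T arrays (N, d); weights: list of T arrays (N,), normalised.
    transition_pdf(q_next, x_prev) -> (N,) values p(q_next | X^j_t) for all j.
    """
    T = len(particles)
    trajectories = []
    for _ in range(n_sample):
        traj = [None] * T
        i = rng.choice(len(weights[T - 1]), p=weights[T - 1])  # initialise at t = T
        traj[T - 1] = particles[T - 1][i]
        for t in range(T - 2, -1, -1):                         # backward pass
            w = weights[t] * transition_pdf(traj[t + 1], particles[t])
            w = w / w.sum()                                    # normalise before sampling
            j = rng.choice(len(w), p=w)
            traj[t] = particles[t][j]
        trajectories.append(np.stack(traj))                    # one realisation q_{1:T}
    return trajectories
```

Each realisation only requires evaluating the transition model against the $N$ stored particles per time step, which is where the linear per-step cost comes from.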
\subsection{Transition for Smoothing}
As seen above, both algorithms reweight particles based on a state transition model.
Unlike the transition presented in section \ref{sec:transition}, it is not possible to just draw a set of new samples.
Here, $p(\vec{q}_{t+1} \mid \vec{q}_{t})$ needs to provide the probability of the \textit{known} future state $\vec{q}_{t+1}$ under the condition of the current state $\vec{q}_{t}$.
In case of indoor localisation using particle filtering, it is necessary to not only provide the probability of moving to a particle's position under the condition of its ancestor, but also of all other particles at time $t$.
The smoothing transition model therefore calculates the probability of being in a state $\vec{q}_{t+1}$ in regard to previous states and the pedestrian's walking behaviour.
This means that a state $\vec{q}_t$ is more likely if it is a proper ancestor (realistic previous position) of a future state $\vec{q}_{t+1}$.
In the following, a simple and inexpensive approach for obtaining this probability is described.
By writing
\begin{equation}
p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{step}} = \mathcal{N}(\Delta d_t \mid \mu_{\text{step}}, \sigma_{\gDist}^2)
\label{eq:smoothingTransDistance}
\end{equation}
we quantify how likely it is to cover a distance $\Delta d_t$ between two states $\vec{q}_{t+1}$ and $\vec{q}_{t}$.
In the easiest case, $\Delta d_t$ is the linear distance between two states.
Of course, based on the graph structure, one could calculate the shortest path between both states and sum up the respective edge lengths.
However, this requires tremendous calculation time for negligible improvements.
Therefore this is not further discussed within this work.
The average step length $\mu_{\text{step}}$ is based on the pedestrian's walking speed and $\sigma_{\gDist}^2$ denotes the step length's variance.
Both values are chosen depending on the activity $x$ recognized at time $t$.
For example, $\mu_{\text{step}}$ becomes smaller while a pedestrian is walking upstairs than while walking straight ahead.
This requires extending the smoothing transition by the current observation $\mObsVec_t$.
Since $\mStateVec$ is hidden and the Markov property is satisfied, we are able to do so.
The heading information is incorporated using
\begin{equation}
p(\mStateVec_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{turn}} = \mathcal{N}(\Delta\alpha_t \mid \mObsHeading, \sigma^2_{\text{turn}})\enspace ,
\label{eq:transHeadingSmoothing}
\end{equation}
where $\Delta\alpha_t$ is the absolute angle between $\vec{q}_{t+1}$ and $\vec{q}_{t}$ in the range of $[0, \pi]$.
The relative angular change $\mObsHeading$ is then used to assess how likely it is to walk in that particular direction.
Again the normal distribution of \refeq{eq:transHeadingSmoothing} does not integrate to $1.0$. Therefore the same assumption as in \refeq{eq:transHeading} has to be made.
To further improve the results, especially in 3D environments, the vertical (non-absolute) distance $\Delta z$ between two successive states is used as follows:
\begin{equation}
p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{baro}} = \mathcal{N}(\Delta z \mid \mu_z, \sigma^2_{z}) \enspace .
\label{eq:smoothingTransPressure}
\end{equation}
This assigns a low probability to falsely detected or implausible floor changes.
Similar to \refeq{eq:smoothingTransDistance} we set $\mu_z$ and $\sigma^2_{z}$ based on the activity recognised at time $t$.
Therefore, $\mu_z$ is the expected change in $z$-direction between two time steps.
This means, if the pedestrian is walking alongside a corridor, we set $\mu_z = 0$.
In contrast, $\mu_z$ is positive while walking downstairs or otherwise negative for moving upstairs.
The values of $\mu_z$ and also $\mu_{\text{step}}$ can either be predefined or set dynamically based on the measured vertical and linear acceleration.
Looking at \refeq{eq:smoothingTransDistance} to \refeq{eq:smoothingTransPressure}, obvious similarities to a sensor fusion process can be seen. By assuming statistical independence between those three, the probability density of the smoothing transition is given by
\begin{equation}
\arraycolsep=1.2pt
\begin{array}{ll}
p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t) =
&p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{step}}\\
&\cdot ~ p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{turn}}\\
&\cdot ~ p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{baro}}
\end{array}
\enspace .
\end{equation}
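The fused transition density above can be sketched as a product of the three Gaussian factors from \refeq{eq:smoothingTransDistance}, \refeq{eq:transHeadingSmoothing} and \refeq{eq:smoothingTransPressure}; all parameter values in the example are illustrative assumptions, not values from this work:

```python
import math

def gauss(x, mu, var):
    # Univariate normal density N(x | mu, var)
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def smoothing_transition(delta_d, delta_alpha, delta_z,
                         mu_step, var_step, obs_heading, var_turn, mu_z, var_z):
    """Smoothing transition density as the product of the step-length,
    heading and barometer factors, assuming statistical independence."""
    p_step = gauss(delta_d, mu_step, var_step)       # distance covered
    p_turn = gauss(delta_alpha, obs_heading, var_turn)  # angular change
    p_baro = gauss(delta_z, mu_z, var_z)             # vertical change
    return p_step * p_turn * p_baro
```

For instance, a transition matching the expected step length, heading change and level walking ($\mu_z = 0$) receives a far higher density than one implying an implausible jump between floors.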
%
It is important to note that the particles of every time step $t$ of the forward filtering pass need to be stored.
Therefore, the memory requirement increases proportionally with the processing time.