\section{Smoothing} \label{sec:smoothing}
The main purpose of this work is to provide MC smoothing methods in the context of indoor localisation. As mentioned before, these algorithms compute probability distributions of the form $p(\mStateVec_t \mid \mObsVec_{1:T})$ and can therefore make use of the future observations between $t$ and $T$, where $t \ll T$. In the following, we discuss the algorithmic details of the forward-backward smoother and of backward simulation. Further, a novel approach for incorporating them into the localisation system is presented.

\subsection{Forward-backward Smoother}
The forward-backward smoother (FBS) of \cite{Doucet00:OSM} is a well-established alternative to the simple filter-smoother. The foundation of this algorithm was again laid by Kitagawa in \cite{kitagawa1987non}. An approximation is given by
\begin{equation}
p(\vec{q}_t \mid \vec{o}_{1:T}) \approx \sum^N_{i=1} W^i_{t \mid T} \delta_{\vec{X}^i_{t}}(\vec{q}_{t}) \enspace,
\label{eq:approxFBS}
\end{equation}
where $p(\vec{q}_t \mid \vec{o}_{1:T})$ has the same support as the filtering distribution $p(\vec{q}_t \mid \vec{o}_{1:t})$, but with different weights. This means that the FBS maintains the original particle locations and merely reweights the particles to obtain a smoothed density. $\delta_{\vec{X}^i_{t}}$ denotes the Dirac delta function. The complete FBS is shown in pseudo-algorithmic form in algorithm \ref{alg:forward-backwardSmoother}. First, the algorithm obtains the filtered distribution (particles) by performing a forward filtering step at each time $t$. Then the backward step for determining the smoothing distribution is carried out, where the smoothed weights are obtained through the backward recursion in line 9 of algorithm \ref{alg:forward-backwardSmoother}.
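To make the backward recursion concrete, the following minimal Python/NumPy sketch reweights the filter particles according to line 9 of algorithm \ref{alg:forward-backwardSmoother}. The function and variable names (\texttt{fbs\_reweight}, \texttt{trans\_pdf}) as well as the array-based particle representation are illustrative assumptions and not part of the localisation system itself; the transition density is assumed to be available as a callable that can be evaluated for arbitrary particle pairs.
\begin{verbatim}
import numpy as np

def fbs_reweight(particles, weights, trans_pdf):
    """Backward pass of the forward-backward smoother, O(T * N^2).

    particles : list of (N, d) arrays, one per time step (filter output)
    weights   : list of (N,) arrays of normalised filter weights W_t
    trans_pdf : trans_pdf(x_next, x_prev) -> p(x_next | x_prev)
    Returns the smoothed weights W_{t|T} for every time step.
    """
    T = len(particles)
    smoothed = [None] * T
    smoothed[T - 1] = weights[T - 1]          # W_{T|T} = W_T
    for t in range(T - 2, -1, -1):            # backward recursion
        # P[j, i] = p(X^j_{t+1} | X^i_t) for every particle pair
        P = np.array([[trans_pdf(xj, xi) for xi in particles[t]]
                      for xj in particles[t + 1]])
        denom = P @ weights[t]                # sum_k W^k_t p(X^j_{t+1}|X^k_t)
        ratio = smoothed[t + 1] / denom       # W^j_{t+1|T} / denominator
        smoothed[t] = weights[t] * (P.T @ ratio)
        smoothed[t] /= smoothed[t].sum()      # guard against numerical drift
    return smoothed
\end{verbatim}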
\begin{algorithm}[t]
\caption{Forward-Backward Smoother}
\label{alg:forward-backwardSmoother}
\begin{algorithmic}[1]
\For{$t = 1$ \textbf{to} $T$} \Comment{Filtering}
\State{Obtain the weighted trajectories $\{ W^i_t, \vec{X}^i_t\}^N_{i=1}$}
\EndFor
\For{$i = 1$ \textbf{to} $N$} \Comment{Initialization}
\State{Set $W^i_{T \mid T} = W^i_T$}
\EndFor
\For{$t = T-1$ \textbf{to} $1$} \Comment{Smoothing}
\For{$i = 1$ \textbf{to} $N$}
\vspace*{0.1cm}
\State{$W^i_{t \mid T} = W^i_t \left[ \sum^N_{j=1} W^j_{t+1 \mid T} \frac{p(\vec{X}^j_{t+1} \mid \vec{X}^i_t)}{\sum^N_{k=1} W^k_t ~ p(\vec{X}^j_{t+1} \mid \vec{X}^k_t)} \right]$}
\EndFor
\EndFor
\end{algorithmic}
\end{algorithm}
By reweighting the filter particles, the FBS improves upon the simple filter-smoother by removing its dependence on the inherited (ancestral) particle paths \cite{fearnhead2010sequential}. However, looking at algorithm \ref{alg:forward-backwardSmoother}, it can easily be seen that this approach has a computational complexity of $\mathcal{O}(N^2)$ per time step, since the calculation of each particle's weight is an $\mathcal{O}(N)$ operation. To reduce this computational bottleneck, \cite{klaas2006fast} introduced a solution using algorithms from N-body simulation. By integrating dual-tree recursions and fast multipole techniques with the FBS, a run-time cost of $\mathcal{O}(N \log N)$ can be achieved.

\subsection{Backward Simulation}
For smoothing applications with a large number of particles, it is often not necessary to use all particles for smoothing. This may be the case, for example, when sample impoverishment is high and/or the sensors are highly accurate. By choosing a good subset to represent the posterior distribution, it is theoretically possible to further improve the estimate. For this purpose, \cite{Godsill04:MCS} presented backward simulation (BS), in which a number of independent sample realisations drawn from the entire smoothing density are used to approximate the smoothing distribution.
%
\begin{algorithm}[t]
\caption{Backward Simulation Smoothing}
\label{alg:backwardSimulation}
\begin{algorithmic}[1]
\For{$t = 1$ \textbf{to} $T$} \Comment{Filtering}
\State{Obtain the weighted trajectories $\{ W^i_t, \vec{X}^i_t\}^N_{i=1}$}
\EndFor
\For{$k = 1$ \textbf{to} $N_{\text{sample}}$}
\State{Choose $\tilde{\vec{q}}^k_T = \vec{X}^i_T$ with probability $W^i_T$} \Comment{Initialize}
\For{$t = T-1$ \textbf{to} $1$} \Comment{Smoothing}
\For{$j = 1$ \textbf{to} $N$}
\State{$W^j_{t \mid t+1} = W^j_t ~ p(\tilde{\vec{q}}^k_{t+1} \mid \vec{X}^j_{t})$}
\EndFor
\State{Choose $\tilde{\vec{q}}^k_t = \vec{X}^j_t$ with probability proportional to $W^j_{t\mid t+1}$}
\EndFor
\State{$\tilde{\vec{q}}^k_{1:T} = (\tilde{\vec{q}}^k_1, \tilde{\vec{q}}^k_2, \dots, \tilde{\vec{q}}^k_T)$ is one approximate realisation from $p(\vec{q}_{1:T} \mid \vec{o}_{1:T})$}
\EndFor
\end{algorithmic}
\end{algorithm}
%
This method is shown in pseudo-algorithmic form in algorithm \ref{alg:backwardSimulation}. Again, a particle filter is run first and the smoothing procedure is applied afterwards. Here, $\tilde{\vec{q}}_t$ is a random sample drawn approximately from $p(\vec{q}_{t} \mid \tilde{\vec{q}}_{t+1}, \vec{o}_{1:T})$; for example, $\tilde{\vec{q}}_t$ can be chosen by sampling from the cumulative distribution of the weights $W^j_{t \mid t+1}$. Hence, $\tilde{\vec{q}}_{1:T} = (\tilde{\vec{q}}_{1}, \tilde{\vec{q}}_{2}, \dots, \tilde{\vec{q}}_{T})$ is one particular sample realisation from $p(\vec{q}_{1:T} \mid \vec{o}_{1:T})$. Further independent realisations are obtained by repeating the procedure until the desired number $N_{\text{sample}}$ is reached. The computational complexity for one particular realisation is $\mathcal{O}(N)$ per time step; the computations are then repeated for each realisation drawn \cite{Godsill04:MCS}.
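Analogously, the backward simulation of algorithm \ref{alg:backwardSimulation} can be sketched as follows. Again, the names and the particle representation are illustrative assumptions; \texttt{trans\_pdf} denotes the same transition density as above.
\begin{verbatim}
import numpy as np

def backward_simulate(particles, weights, trans_pdf, n_sample, rng=None):
    """Backward simulation: draws n_sample independent trajectory
    realisations approximately distributed as p(q_{1:T} | o_{1:T}).

    particles : list of (N, d) arrays from the forward particle filter
    weights   : list of (N,) arrays of normalised filter weights W_t
    """
    if rng is None:
        rng = np.random.default_rng()
    T, N = len(particles), len(particles[0])
    trajectories = []
    for _ in range(n_sample):
        traj = [None] * T
        i = rng.choice(N, p=weights[T - 1])   # initialise at t = T
        traj[T - 1] = particles[T - 1][i]
        for t in range(T - 2, -1, -1):        # backward pass, O(N) per step
            # W^j_{t|t+1} is proportional to W^j_t * p(q~_{t+1} | X^j_t)
            w = weights[t] * np.array(
                [trans_pdf(traj[t + 1], xj) for xj in particles[t]])
            w /= w.sum()
            traj[t] = particles[t][rng.choice(N, p=w)]
        trajectories.append(np.stack(traj))   # one realisation q~_{1:T}
    return trajectories
\end{verbatim}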
\subsection{Transition for Smoothing}
As seen above, both algorithms reweight particles based on a state transition model. In contrast to the transition model presented in section \ref{sec:transition}, it is not possible to simply draw a set of new samples here. Instead, $p(\vec{q}_{t+1} \mid \vec{q}_{t})$ needs to provide the probability of the \textit{known} future state $\vec{q}_{t+1}$ conditioned on the current state $\vec{q}_{t}$. In the case of indoor localisation using particle filtering, it is necessary to provide the probability of moving to a particle's position not only conditioned on its ancestor, but on all other particles at time $t$ as well. The smoothing transition model therefore calculates the probability of being in a state $\vec{q}_{t+1}$ with regard to previous states and the pedestrian's walking behaviour. This means that a state $\vec{q}_t$ is more likely if it is a proper ancestor (i.e., a realistic previous position) of a future state $\vec{q}_{t+1}$. In the following, a simple and inexpensive approach for obtaining this information is described. By writing
\begin{equation}
p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{step}} = \mathcal{N}(\Delta d_t \mid \mu_{\text{step}}, \sigma_{\text{step}}^2)
\label{eq:smoothingTransDistance}
\end{equation}
we obtain a statement about how likely it is to cover the distance $\Delta d_t$ between the two states $\vec{q}_{t+1}$ and $\vec{q}_{t}$. In the simplest case, $\Delta d_t$ is the linear (straight-line) distance between the two states. Of course, based on the graph structure, one could calculate the shortest path between them and sum up the respective edge lengths. However, this requires tremendous calculation time for negligible improvements and is therefore not discussed further within this work. The average step length $\mu_{\text{step}}$ is based on the pedestrian's walking speed and $\sigma_{\text{step}}^2$ denotes the step length's variance. Both values are chosen depending on the activity $\mObsActivity$ recognised at time $t$. For example, $\mu_{\text{step}}$ is smaller while a pedestrian is walking upstairs than while walking on a level floor. This requires extending the smoothing transition by the current observation $\mObsVec_t$; since $\mStateVec$ is hidden and the Markov property is satisfied, we are able to do so.
The heading information is incorporated using
\begin{equation}
p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{turn}} = \mathcal{N}(\Delta\alpha_t \mid \mObsHeading, \sigma^2_{\text{turn}}) \enspace ,
\label{eq:transHeadingSmoothing}
\end{equation}
where $\Delta\alpha_t$ is the absolute angle between $\vec{q}_{t+1}$ and $\vec{q}_{t}$ in the range $[0, \pi]$. The relative angular change $\mObsHeading$ is then used to obtain a statement about how likely it is to walk in that particular direction. Again, the normal distribution in \refeq{eq:transHeadingSmoothing} does not integrate to $1.0$, so the same assumption as in \refeq{eq:transHeading} has to be made. To further improve the results, especially in 3D environments, the vertical (signed) distance $\Delta z$ between two successive states is used as follows:
\begin{equation}
p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{baro}} = \mathcal{N}(\Delta z \mid \mu_z, \sigma^2_{z}) \enspace .
\label{eq:smoothingTransPressure}
\end{equation}
This assigns a low probability to falsely detected or implausible floor changes. Similar to \refeq{eq:smoothingTransDistance}, we set $\mu_z$ and $\sigma^2_{z}$ based on the activity recognised at time $t$. Here, $\mu_z$ is the expected change in $z$-direction between two time steps. This means that if the pedestrian is walking along a corridor, we set $\mu_z = 0$. In contrast, $\mu_z$ is positive while walking downstairs and negative while moving upstairs. The values of $\mu_z$ and also $\mu_{\text{step}}$ can either be predefined or set dynamically based on the measured vertical and linear acceleration. Looking at \refeq{eq:smoothingTransDistance} to \refeq{eq:smoothingTransPressure}, obvious similarities to a sensor fusion process can be seen. By assuming statistical independence between these three components, the probability density of the smoothing transition is given by
\begin{equation}
\arraycolsep=1.2pt
\begin{array}{ll}
p(\vec{q}_{t+1} \mid \vec{q}_t) = & p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{step}} \\
& \cdot \, p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{turn}} \\
& \cdot \, p(\vec{q}_{t+1} \mid \vec{q}_t, \mObsVec_t)_{\text{baro}}
\end{array}
\enspace .
\end{equation}
It is important to note that all particles at each time step $t$ of the forward filtering need to be stored. The memory requirement therefore grows linearly with the number of processed time steps.
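To illustrate how \refeq{eq:smoothingTransDistance} to \refeq{eq:smoothingTransPressure} can be combined in an implementation, a minimal Python sketch of the fused smoothing transition is given below. The activity labels, all numeric parameter values, the default $\sigma_{\text{turn}}$ and the sign convention chosen for $\Delta z$ are illustrative placeholders and not values used in this work; $\Delta\alpha_t$ is assumed to be computed elsewhere and passed in.
\begin{verbatim}
import numpy as np

def gauss(x, mu, sigma):
    """Univariate normal density N(x | mu, sigma^2)."""
    return (np.exp(-0.5 * ((x - mu) / sigma) ** 2)
            / (sigma * np.sqrt(2.0 * np.pi)))

# Activity-dependent parameters; labels and numbers are placeholders.
# mu_z follows the convention of the text: positive while descending.
STEP_PARAMS = {"walk": (0.70, 0.20), "stairs_up": (0.40, 0.15),
               "stairs_down": (0.45, 0.15)}   # (mu_step, sigma_step) in m
Z_PARAMS = {"walk": (0.00, 0.05), "stairs_up": (-0.17, 0.10),
            "stairs_down": (0.17, 0.10)}      # (mu_z, sigma_z) in m

def smoothing_transition(q_next, q_prev, delta_alpha, heading_change,
                         activity, sigma_turn=0.5):
    """Smoothing transition density for one pair of states.

    q_next, q_prev : (x, y, z) positions of q_{t+1} and q_t
    delta_alpha    : absolute angle between the states, in [0, pi]
    heading_change : relative angular change from the observation o_t
    activity       : activity label recognised at time t
    """
    mu_step, sigma_step = STEP_PARAMS[activity]
    mu_z, sigma_z = Z_PARAMS[activity]
    q_next, q_prev = np.asarray(q_next), np.asarray(q_prev)

    d = np.linalg.norm(q_next - q_prev)       # linear distance between states
    dz = q_prev[2] - q_next[2]                # positive when descending,
                                              # matching mu_z above
    p_step = gauss(d, mu_step, sigma_step)
    p_turn = gauss(delta_alpha, heading_change, sigma_turn)
    p_baro = gauss(dz, mu_z, sigma_z)
    return p_step * p_turn * p_baro           # independence assumption
\end{verbatim}
In practice, such a function (with the observation-dependent arguments bound for the respective time step) would serve as the \texttt{trans\_pdf} callable used in the smoothing sketches above, since it can be evaluated for arbitrary particle pairs and not only for ancestor--descendant pairs.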