
\section{Experiments}

As explained at the beginning of this work, we want to explore the limits of the presented localization system.
By deploying it in a historic 13th-century building, we created a challenging scenario, not only because of the various architectural factors but also because of the building's function as a museum.
During all experiments, the museum was open to the public, with between \SI{10}{} and \SI{50}{} visitors present while recording.

The \SI{2500}{\square\meter} building consists of \SI{6}{} different levels, which are grouped into 4 floors (see fig. \ref{fig:apfingerprint}).
Thus, the ceiling height is not constant over one floor and varies between \SI{2.6}{\meter} and \SI{3.6}{\meter}.
In the middle of the building is an outdoor area, which is only accessible from one side.
While most of the exterior and ground-level walls are made of solid stone, the floors above are half-timbered constructions.
Due to objects such as exhibits, cabinets or signs, not all positions within the building were walkable.
For the sake of simplicity we did not incorporate such knowledge into the floorplan.
Thus, the floorplan consists only of walls, ceilings, doors, windows and stairs.
It was created using our 3D map editor software based on architectural drawings from the 1980s.

Sensor measurements are recorded using a simple mobile application that implements the standard Android sensor functionalities.
As smartphones we used either a Samsung Note 2, Google Pixel One or Motorola Nexus 6.
The state estimation as well as the \docWifi{} optimization are computed offline on an Intel Core i7-4702HQ CPU with a frequency of \SI{2.2}{GHz} running \SI{8}{cores} and \SI{16}{GB} main memory.
However, similar to our previous, award-winning system, the setup is able to run entirely on commercial smartphones and is likewise implemented in C++ \cite{torres2017smartphone}.
%Sensor measurements are recorded using a simple mobile application that implements the standard Android SensorManager.

The experiments are separated into four sections:
First, we discuss the performance of the novel transition model and compare it to a grid-based approach.
In section \ref{sec:exp:opti}, we examine the \docWIFI{} optimization and how the optimized \docAPshort{} positions differ from the real ones.
Next, we conduct several test walks throughout the building to examine the estimation accuracy (in meters) of the localization system and try to resolve sample impoverishment with the presented methods.
Finally, the different estimation methods are compared in section \ref{sec:exp:est}.


\subsection{Transition}
To assess the performance of our novel transition model presented in section \ref {}, we chose a simple scenario in which a tester walks up and down a staircase three times.

\todo{Our beloved stair climbing. Comparison of the old and the new movement model.}

\subsection{\docWIFI{} Optimization}
\label{sec:exp:opti}

%how many APs are there in total?
The \docAPshort{} positions as well as the fingerprints used for optimization can be seen in fig. \ref{fig:apfingerprint}.
As described in section \ref{sec:wifi}, we used \SI{42}{} WEMOS D1 mini boards to provide a \docWIFI{} infrastructure throughout the building.
The position of every installed beacon was measured using a laser scanner.
This allows a comparison with the optimized \docAPshort{} positions.
Within all Wi-Fi observations, we only consider these beacons, which are identified by their known MAC addresses.
Other transmitters like smart TVs or smartphone hotspots are ignored as they might cause estimation errors.
%how the fingerprints were recorded, how many ...

\begin{figure}[bt]
	\centering
  \includegraphics[width=0.9\textwidth]{gfx/floorplanDummy.png}
	\caption{Floorplan Dummy}
	\label{fig:apfingerprint}
\end{figure}


%short description of everything we want to test now.


%what is the outcome of the optimization. compare with ground truth. also contrast the errors.
%one should see that nothing works at all without optimization.

\subsection{Localization Error}

\begin{figure}[ht]
	\centering
  	\includegraphics[width=0.9\textwidth]{gfx/floorplanDummy.png}
	\caption{Floorplan Dummy}
	\label{fig:floorplan}
\end{figure}
%
The 4 chosen walking paths can be seen in fig. \ref{fig:floorplan}.
\todo{how long are the walks, in meters and in time?}
They were carried out by 4 different male testers using either a Samsung Note 2, Google Pixel One or Motorola Nexus 6 for recording the measurements.
All in all, we recorded \SI{28}{} distinct measurement series, \SI{7}{} for each walk.
The selected walks intentionally contain error-prone situations in which many of the problems discussed above occur.
Thus we are able to discuss everything in detail.
A walk is indicated by a set of numbered markers, fixed to the ground.
Small icons on those markers give the direction of the next marker and in some cases provide instructions to pause walking for a certain time.
The pause intervals vary between \SI{10}{\second} and \SI{60}{\second}.
The ground truth is then measured by recording a timestamp while passing a marker.
For this, the tester clicks a button on the smartphone application.
Between two consecutive points, a constant movement speed is assumed.
Thus, the ground truth might not be \SI{100}{\percent} accurate, but it is sufficiently accurate for error measurements.
The approximation error is then calculated by comparing the interpolated ground truth position with the current estimation \cite{Fetzer-16}.
An estimation on the wrong floor has a great impact on the location awareness of a pedestrian, but yields only a relatively small error.
Therefore, errors in $z$-direction are penalized by tripling the $z$-value.
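The ground truth interpolation and the error measure described above can be sketched as follows.
The symbols are introduced here for illustration only: the tester passes marker $i$ at position $\vec{m}_i$ and time $t_i$, $\vec{g}(t)$ denotes the interpolated ground truth, and $\hat{\vec{p}}_t$ the current estimation.
Assuming constant speed between two consecutive markers, for $t_i \le t \le t_{i+1}$ we obtain
\[
	\vec{g}(t) = \vec{m}_i + \frac{t - t_i}{t_{i+1} - t_i} \left( \vec{m}_{i+1} - \vec{m}_i \right) .
\]
One possible reading of the $z$-penalty, assuming that the difference in $z$-direction is tripled before the Euclidean distance is taken, is
\[
	e_t = \sqrt{ \left( \hat{p}_{t,x} - g_x(t) \right)^2 + \left( \hat{p}_{t,y} - g_y(t) \right)^2 + \left( 3 \left( \hat{p}_{t,z} - g_z(t) \right) \right)^2 } .
\]
Under this reading, a purely vertical offset of \SI{3}{\meter} would contribute an error of \SI{9}{\meter} instead of \SI{3}{\meter}.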

%computation und monte carlo runs
For each walk, we performed 100 runs using \SI{5000}{particles} and set $N_{\text{eff}} = 0.85$ for resampling.
Instead of an initial position and heading, all walks start with a uniform distribution (random position and heading) as prior.
The overall localization results can be seen in table \ref{table:overall}.
Here, we distinguish between the respective anti-impoverishment techniques presented in chapter \ref{sec:impo}.
The simple anti-impoverishment method is added to the resampling step and thus uses the transition method presented in chapter \ref{sec:transition}.
In contrast, the $D_\text{KL}$-based method extends the transition and thus uses a standard cumulative resampling step.
We set $l_\text{max} =$ \SI{-75}{dBm} and $l_\text{min} =$ \SI{-90}{dBm}.
For a better overview, we only used the KDE-based estimation, as the errors compared to the weighted average estimation differ by only a few centimetres.
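The resampling criterion mentioned above can be illustrated as follows.
A common definition of the effective sample size for $N$ particles with normalized weights $w_t^i$ (notation chosen here for illustration) is
\[
	\hat{N}_{\text{eff}} = \frac{1}{\sum_{i=1}^{N} \left( w_t^i \right)^2} , \qquad 1 \le \hat{N}_{\text{eff}} \le N .
\]
Assuming that the value $0.85$ is meant as a fraction of the particle count, which is our interpretation and not stated explicitly, resampling would be triggered whenever
\[
	\hat{N}_{\text{eff}} < 0.85 \cdot N = 0.85 \cdot 5000 = 4250 .
\]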

\newcommand{\STAB}[1]{\begin{tabular}{@{}c@{}}#1\end{tabular}}

\begin{table}[t]
	\centering
	\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
		\hline
		Method & \multicolumn{3}{c|}{none} & \multicolumn{3}{c|}{simple} & \multicolumn{3}{c|}{$D_\text{KL}$}\\
		\hline
		  & $\bar{x}$ & $\bar{\sigma}$ & $\tilde{x}_{75}$ & $\bar{x}$ & $\bar{\sigma}$ & $\tilde{x}_{75}$ & $\bar{x}$ & $\bar{\sigma}$ & $\tilde{x}_{75}$ \\
		\hline \hline
     	Walk 0 & \SI{1340}{\centi\meter} & \SI{1115}{\centi\meter} & \SI{2265}{\centi\meter} & \SI{715}{\centi\meter} & \SI{660}{\centi\meter} & \SI{939}{\centi\meter} & \SI{576}{\centi\meter} & \SI{494}{\centi\meter} & \SI{734}{\centi\meter} \\ \hline
		Walk 1 & \SI{320}{\centi\meter} & \SI{242}{\centi\meter} & \SI{406}{\centi\meter} & \SI{322}{\centi\meter} & \SI{258}{\centi\meter} & \SI{404}{\centi\meter} & \SI{379}{\centi\meter} & \SI{317}{\centi\meter} & \SI{463}{\centi\meter} \\ \hline
		Walk 2 & \SI{834}{\centi\meter} & \SI{412}{\centi\meter} & \SI{1092}{\centi\meter} & \SI{356}{\centi\meter} & \SI{232}{\centi\meter} & \SI{486}{\centi\meter} & \SI{362}{\centi\meter} & \SI{234}{\centi\meter} & \SI{484}{\centi\meter} \\ \hline
		Walk 3 & \SI{704}{\centi\meter} & \SI{589}{\centi\meter} & \SI{1350}{\centi\meter} & \SI{538}{\centi\meter} & \SI{469}{\centi\meter} & \SI{782}{\centi\meter} & \SI{476}{\centi\meter} & \SI{431}{\centi\meter} & \SI{648}{\centi\meter} \\
		\hline
	\end{tabular}
		\caption{Overall localization results using the different anti-impoverishment methods. The error is given by the \SI{75}{\percent} quantile. Only the KDE-based estimation was used, since the KDE-based and the weighted-average estimation differ very little: the difference is below \SI{10}{\centi\meter} on average, so the latter is omitted for the sake of clarity.}
	\label{table:overall}
\end{table}

All walks, except for walk 1, suffer in some way from sample impoverishment.
We discuss the individual results of table \ref{table:overall}, starting with walk 0.
Here, the pedestrians started at the topmost level and walked down to the lowest point of the building.
The first critical situation occurs immediately after the start.
While walking down the small staircase, many particles are dragged into the room to the right due to erroneous Wi-Fi readings.
At this point, the activity ``walking down'' is recognized, however only for a very short period.
This is caused by the short length of the stairs.
After this period, only a small number of particles changed the floor correctly, while a majority is stuck within the right-hand room.
The activity based evaluation $p(\vec{o}_t \mid \vec{q}_t)_\text{act}$ prevents particles from further walking down the stairs, while the resampling step mainly draws particles in already populated areas.
In \SI{10}{\percent} of the runs using none of the anti-impoverishment methods, the system is unable to recover and thus unable to finish the walk somewhere near the correct position or even on the same floor.
The other \SI{90}{\percent} of these runs still suffer from a very high error.
Only when using one of the presented methods to prevent impoverishment is the system able to recover in \SI{100}{\percent} of cases.
Fig. \ref{fig:errorOverTimeWalk0} compares the error over time between the different methods for an exemplary run.
The situation described above, which causes the system to get stuck after \SI{10}{\second}, is clearly visible.
Both the simple and the $D_\text{KL}$ method are able to recover early and thus decrease the overall error dramatically.
Between \SI{65}{\second} and \SI{74}{\second}, the simple method produces high errors due to some uncertain Wi-Fi measurements coming from an \docAP{} below, causing those particles that are randomly drawn near this \docAPshort{} to be rewarded with a very high weight.
This leads to newly sampled particles in this area and therefore a jump of the estimation.
The situation is resolved after entering another room, which is now shielded by stone walls instead of wooden ones.
Walking down the stairs at \SI{80}{\second} also allows the localization system to recover when none of the methods is used.
%
\begin{figure}
	\centering
	\input{gfx/errorOverTimeWalk0/errorOverTime.tex}
	\caption{Error development over time of a single Monte Carlo run of walk 0. Between \SI{10}{\second} and \SI{24}{\second} the Wi-Fi signal was highly attenuated, causing the system to get stuck and produce high errors. Both the simple and the $D_\text{KL}$ anti-impoverishment method are able to recover early. However, between \SI{65}{\second} and \SI{74}{\second} the simple method produces high errors due to the high random factor involved.}
	\label{fig:errorOverTimeWalk0}
\end{figure}

A behaviour similar to the above can be seen in walk 3.
Without a method to recover from impoverishment, the system lost track in \SI{100}{\percent} of the runs due to an undetected floor change in the last third of the walk.
By using the simple method, the overall error can be reduced and the impoverishment resolved. Nevertheless, unpredictable jumps of the estimation cause the system to be highly uncertain in some situations, even if those jumps do not last too long.
Only the $D_\text{KL}$ method is able to produce reasonable results.

As described in chapter \ref{}, we use a Wi-Fi model optimized for each floor instead of a single global one.
A good example of why we do this can be seen in fig. \ref{}, considering walk 3.
Here, the system using the global Wi-Fi model makes a big jump into the right-hand corridor and requires \SI{5}{\second} to recover.
This is caused by a combination of environmental factors, such as the many different materials and thus attenuation factors, and the limitation of the Wi-Fi model used here, which only considers ceilings and ignores walls.
Consequently, \docAPshort{}s on the same floor level, which are highly attenuated by \SI{2}{\meter} thick stone walls, are neglected, while \docAPshort{}s from the floor above, which are only separated by a thin wooden ceiling, have a greater influence within the state evaluation process.
Of course, we optimize the attenuation per floor, but in the end this is just an average value summarizing the materials surrounding the \docAPshort{}s.
Therefore, the calculated signal strength predictions do not fit the measurements received from above in an optimal way.
In contrast, the model optimized for each floor only considers the respective \docAPshort{}s on that floor, which allows better-fitting parameters to be calculated.
A major disadvantage of the method is the reduced number of visible \docAPshort{}s and thus measurements within an area.
This could lead to an underrepresentation of \docAPshort{}s for triangulation.
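To illustrate the limitation discussed above, assume a log-distance path-loss model extended by a ceiling term.
This is a common choice for such signal strength predictions, but the exact model of section \ref{sec:wifi} may differ, and all symbols below are introduced here for illustration only:
\[
	\hat{P}(d) = P_0 - 10 \, \gamma \, \log_{10}\!\left( \frac{d}{d_0} \right) - n_c \, \beta .
\]
Here, $P_0$ is the signal strength at the reference distance $d_0$, $\gamma$ the path-loss exponent, $n_c$ the number of ceilings between transmitter and receiver, and $\beta \ge 0$ the attenuation per ceiling in \si{dB}.
Since this sketch contains no term for walls, the loss caused by a stone wall between transmitter and receiver on the same floor cannot be represented explicitly and has to be absorbed by the fitted values of $P_0$, $\gamma$ and $\beta$.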


\todo{for one of them we still need the estimated path}

\todo{boxkde 0.2 point2(1,1);}

\todo{
FIGURE: A path that gets stuck, and the two other methods, with error over time.

FIGURE: Wi-Fi error down at the cellars.

FIGURE: Estimation error
}


%To analyse the drawbacks and benefits of the here presented method to resolve sample impoverishment,

%The benefits of the here presented solution to resolve sample impoverishment can be seen in the example shown in fig. \ref{}.


%show the problems with impoverishment, where does it help, what does it break, etc.

%%estimation
\subsection{Estimation Methods}
\label{sec:exp:est}

As discussed before, the individual estimation methods differ by only a few centimetres in the overall localization error.
That is, they differ mainly in the representation of the estimated locations.
Put simply, they differ in the way the estimated path is drawn and thus presented to the user.
Regarding the underlying particle set, different shapes of probability distributions need to be considered, especially those with multimodalities.

\begin{figure}
	\centering
	\input{gfx/walk.tex}
	\caption{Bimodal distribution caused by uncertain measurements in the first \SI{13.4}{\second} of the walk. After \SI{20.8}{\second}, the distribution becomes unimodal. The weighted-average estimation (blue) yields a high error compared to the ground truth (solid black), while the BoxKDE approach (orange) does not. }
	\label{fig:realWorldMulti}
\end{figure}

The main advantage of a KDE-based estimation is that it provides the ``correct'' mode of a density, even in a multimodal setting (cf. section \ref{sec:estimation}).
A situation in which the system highly benefits from this is illustrated in fig. \ref{fig:realWorldMulti}.
Here, a set of particles splits apart, due to uncertain measurements and multiple possible walking directions.
Indicated by the black dotted line, the resulting bimodal posterior reaches its maximum distance between the modes at \SI{13.4}{\second}.
Thus, a weighted average estimation (blue line) results in a position of the pedestrian somewhere outside the building (light green area).
The ground truth is given by the black solid line.
The KDE-based estimation (orange line) is able to provide reasonable results by choosing the ``correct'' mode of the density.
After \SI{20.8}{\second}, the setting becomes unimodal again.
Due to a right turn, the lower red particles walk against a wall and are thus punished with a low weight.
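The difference between both estimators can be sketched as follows, assuming particle positions $\vec{x}_t^i$ with normalized weights $w_t^i$, a kernel $K$ and a bandwidth $h$.
This notation is chosen here for illustration; see section \ref{sec:estimation} for the definitions actually used.
\[
	\hat{\vec{x}}_t^{\text{avg}} = \sum_{i} w_t^i \, \vec{x}_t^i ,
	\qquad
	\hat{\vec{x}}_t^{\text{KDE}} = \arg\max_{\vec{x}} \sum_{i} w_t^i \, K\!\left( \frac{\vec{x} - \vec{x}_t^i}{h} \right) .
\]
For two well-separated modes of equal weight located at $\vec{a}$ and $\vec{b}$, the weighted average lies near the midpoint $(\vec{a} + \vec{b}) / 2$, which may fall into a region without any particles, whereas the maximum of the kernel density estimate lies close to one of the two modes.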

Although situations such as the one displayed in fig. \ref{fig:realWorldMulti} occur frequently, the KDE-estimation is not able to improve the overall estimation results.
This can be seen in the corresponding plot of the error development over time given in fig. \ref{fig:realWorldTime}.
Here, the KDE-estimation performs slightly better than the weighted-average; however, after performing \SI{100}{} Monte Carlo runs, the difference becomes insignificant.
Evidently, the above-mentioned ``correct'' mode does not always provide the lowest error.
In some situations, the weighted-average estimation is closer to the ground truth.
Within our experiments, this happened especially when entering or leaving thick-walled rooms, causing slow and attenuated Wi-Fi signals.
While the system's dynamics move the particles outside, the faulty Wi-Fi readings hold back a majority by assigning corresponding weights.
Only with new measurements coming from the hallway or other parts of the building are the distribution and thus the KDE-estimation able to recover.

\begin{figure}
	\centering
	\input{gfx/errorOverTimeWalk1/errorOverTime.tex}
	\caption{Error development over time of a single Monte Carlo run of the walk calculated between estimation and ground truth. Between \SI{230}{\second} and \SI{290}{\second} the pedestrian was not moving.}
	\label{fig:realWorldTime}
\end{figure}

This leads to the conclusion that a weighted average approach provides a smoother representation of the estimated locations and thus a higher robustness.
In contrast, a KDE-based approach for estimation is able to resolve multimodalities.
It does not always provide the lowest error, since it depends more on an accurate sensor model than a weighted average approach, but it is well suited as an indicator of the real performance of a sensor fusion system.
Finally, in the examples shown here we only searched for the global maximum, even though this approach opens up a wide range of other possibilities for finding a best estimate.

%as in bulli paper.

%last paragraph: once more the overall result of the entire system
%what still goes wrong? where does what cause problems?