
\section{Experiments}

As explained at the very beginning of this work, we wanted to explore the limits of the localization system presented here.
By deploying it in a 13th-century historic building, we created a challenging scenario, not only because of the various architectural factors but also because of the building's function as a museum.
During all experiments, the museum was open to the public, with between \SI{10}{} and \SI{50}{} visitors present while recording.

The \SI{2500}{\square\meter} building consists of \SI{6}{} different levels, which are grouped into 4 floors (see fig. \ref{fig:apfingerprint}).
Thus, the ceiling height is not constant within one floor and varies between \SI{2.6}{\meter} and \SI{3.6}{\meter}.
In the middle of the building is an outdoor area, which is only accessible from one side.
While most of the exterior and ground-level walls are made of solid stone, the floors above are half-timbered constructions.
Due to objects such as exhibits, cabinets or signs, not all positions within the building were walkable.
For the sake of simplicity, we did not incorporate such knowledge into the floorplan.
Thus, the floorplan consists only of walls, ceilings, doors, windows and stairs.
It was created using our 3D map editor software based on architectural drawings from the 1980s.

Sensor measurements are recorded using a simple mobile application that implements the standard Android sensor functionalities.
As smartphones we used either a Samsung Note 2, Google Pixel One or Motorola Nexus 6.
The state estimation and the \docWifi{} optimization are computed offline on an Intel Core i7-4702HQ CPU with a clock frequency of \SI{2.2}{GHz}, \SI{8}{cores} and \SI{16}{GB} of main memory.
However, as with our previous, award-winning system, the setup is able to run entirely on commercial smartphones, since it is implemented in C++ \cite{torres2017smartphone}.
%Sensor measurements are recorded using a simple mobile application that implements the standard Android SensorManager.

The experiments are divided into three parts.
First, we discuss the performance of the novel transition model and compare it to a grid-based approach.
In section \ref{sec:exp:opti}, we examine the \docWIFI{} optimization and how the optimized \docAPshort{} positions differ from the real ones.
Finally, we conduct several test walks throughout the building to examine the estimation accuracy (in \SI{}{\meter}) of the localization system.
There, we evaluate how well the presented method resolves sample impoverishment and compare the different estimation methods presented in section \ref{sec:estimation}.


\subsection{Transition}
To assess the performance of our novel transition model presented in section \ref{}, we chose a simple scenario in which a tester walks up and down a staircase three times.

\todo{Our beloved stair climbing. Comparison of the old and the new movement model.}

\subsection{\docWIFI{} Optimization}
\label{sec:exp:opti}

%how many APs are there in total?
The \docAPshort{} positions as well as the fingerprints used for optimization can be seen in fig. \ref{fig:apfingerprint}.
As described in section \ref{sec:wifi}, we used \SI{42}{} WEMOS D1 mini devices to provide a \docWIFI{} infrastructure throughout the building.
The position of every installed beacon was measured using a laser scanner.
This allows a comparison with the optimized \docAPshort{} positions.
Within all Wi-Fi observations, we only consider these beacons, which are identified by their known MAC addresses.
Other transmitters like smart TVs or smartphone hotspots are ignored as they might cause estimation errors.
%how the fingerprints were recorded, how many ...

\begin{figure}[bt]
	\centering
  \includegraphics[width=0.9\textwidth]{gfx/floorplanDummy.png}
	\caption{Floorplan Dummy}
	\label{fig:apfingerprint}
\end{figure}


%short description of everything we now want to test.


%what does the optimization yield. compare with ground truth. also contrast the errors.
%it should become visible that nothing works without optimization.

\subsection{Location Estimation Error}

\begin{figure}[ht]
	\centering
  	\includegraphics[width=0.9\textwidth]{gfx/floorplanDummy.png}
	\caption{Floorplan Dummy}
	\label{fig:floorplan}
\end{figure}
%
The four chosen walking paths are shown in fig. \ref{fig:floorplan}.
They were carried out by four different male testers using either a Samsung Note 2, Google Pixel One or Motorola Nexus 6 for recording the measurements.
All in all, we recorded \SI{28}{} distinct measurement series, \SI{7}{} for each walk.
The selected walks contain error-prone situations in which many of the problems discussed above occur.
This allows us to discuss them in detail.
A walk is indicated by a set of numbered markers fixed to the ground.
Small icons on those markers give the direction of the next marker and in some cases provide instructions to pause walking for a certain time.
The pauses last between \SI{10}{\second} and \SI{60}{\second}.
The ground truth is then measured by recording a timestamp while passing a marker.
For this, the tester clicks a button on the smartphone application.
Between two consecutive points, a constant movement speed is assumed.
Thus, the ground truth might not be \SI{100}{\percent} accurate, but it is sufficiently accurate for error measurements.
The approximation error is then calculated by comparing the interpolated ground truth position with the current estimation \cite{Fetzer-16}.
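
One way to make this error measure explicit is sketched below.
The notation is introduced here for illustration only and is not taken from the remainder of this work: $\mathbf{g}_k$ denotes the position of the $k$-th marker, $t_k$ the timestamp recorded when passing it, and $\hat{\mathbf{x}}_t$ the position estimated at time $t$.
Assuming a constant movement speed between two consecutive markers, the ground truth at a time $t_k \le t \le t_{k+1}$ follows from linear interpolation,
\[
	\mathbf{g}(t) = \mathbf{g}_k + \frac{t - t_k}{t_{k+1} - t_k} \left( \mathbf{g}_{k+1} - \mathbf{g}_k \right) ,
\]
and, under the additional assumption that the Euclidean distance is used as the metric, the error at time $t$ reads
\[
	e_t = \left\| \hat{\mathbf{x}}_t - \mathbf{g}(t) \right\|_2 .
\]
For example, a tester who passes a marker at $t_k = \SI{10}{\second}$ and the next one, \SI{6}{\meter} further along a straight corridor, at $t_{k+1} = \SI{14}{\second}$ is assumed to be \SI{1.5}{\meter} past the first marker at $t = \SI{11}{\second}$.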

%computation and Monte Carlo runs
For each walk, we performed 100 Monte Carlo runs using \SI{5000}{particles}.
Instead of an initial position and heading, all walks start with a uniform distribution (random position and heading) as prior.
The overall localization results can be seen in table \ref{table:overall}.
Here, we distinguish between the respective impoverishment techniques presented in section \ref{sec:impo}.
For clarity, we only report the KDE-based estimation, as its errors differ from those of the weighted-average estimation by only a few centimetres.


a simple filter (weighted average estimation + simple impoverishment solution) and an advanced filter (KDE estimation + $D_\text{KL}$-based impoverishment solution).
It can be seen that...

The results also include all failed walks, in order to show the benefits of using an anti-impoverishment method. In walk 0 at walk 1 20 percent of walks failed ...
Of course, it would be possible to reset the system at this point, but that is not our aim...
We want to show a real worst-case scenario!

\todo{Provide a penalty for wrong floors. Otherwise we have the problem that the overall error is simply not different enough.}


\newcommand{\STAB}[1]{\begin{tabular}{@{}c@{}}#1\end{tabular}}

\begin{table}[t]
	\centering
	\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
		\hline
		Method & \multicolumn{3}{c|}{none} & \multicolumn{3}{c|}{simple} & \multicolumn{3}{c|}{$D_\text{KL}$}\\
		\hline
		  & $\bar{x}$ & $\bar{\sigma}$ & $\tilde{x}_{75}$ & $\bar{x}$ & $\bar{\sigma}$ & $\tilde{x}_{75}$ & $\bar{x}$ & $\bar{\sigma}$ & $\tilde{x}_{75}$ \\
		\hline \hline
     	Walk 0 & \SI{1315}{\centi\meter} & \SI{1136}{\centi\meter} & \SI{2266}{\centi\meter} & \SI{1189}{\centi\meter} & \SI{1092}{\centi\meter} & \SI{2019}{\centi\meter} & \SI{301}{\centi\meter} & \SI{252}{\centi\meter} & \SI{376}{\centi\meter} \\ \hline
		Walk 1 & \SI{318}{\centi\meter} & \SI{243}{\centi\meter} & \SI{402}{\centi\meter} & \SI{318}{\centi\meter} & \SI{240}{\centi\meter} & \SI{403}{\centi\meter} & \SI{342}{\centi\meter} & \SI{256}{\centi\meter} & \SI{440}{\centi\meter} \\ \hline
		Walk 2 & \SI{589}{\centi\meter} & \SI{403}{\centi\meter} & \SI{843}{\centi\meter} & \SI{321}{\centi\meter} & \SI{210}{\centi\meter} & \SI{431}{\centi\meter} & \SI{342}{\centi\meter} & \SI{219}{\centi\meter} & \SI{455}{\centi\meter} \\ \hline
		Walk 3 & \SI{462}{\centi\meter} & \SI{337}{\centi\meter} & \SI{701}{\centi\meter} & \SI{407}{\centi\meter} & \SI{306}{\centi\meter} & \SI{599}{\centi\meter} & \SI{341}{\centi\meter} & \SI{253}{\centi\meter} & \SI{462}{\centi\meter} \\
		\hline
	\end{tabular}
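		\caption{Overall localization results using the different impoverishment methods. For each method, the average error $\bar{x}$, the standard deviation $\bar{\sigma}$ and the \SI{75}{\percent} quantile $\tilde{x}_{75}$ are given. Only the KDE-based estimation is shown, since the KDE-based and the weighted-average estimation differ very little: the difference is below \SI{10}{\centi\meter} on average and is therefore omitted for clarity.}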
	\label{table:overall}
\end{table}


%maybe drop the avg / kde distinction? instead avg, std and 75%

It is clearly visible that bla outperforms blah at path 4.
Exemplary estimation results for walk 4 can be seen in fig. \ref{}.

The \SI{75}{\percent} quantile provides insight into occurring impoverishment. Due to the 100 runs, some of the walks in which impoverishment occurred are averaged out.

\todo{Parameters: neff = 0.85; boxkde 0.2 point2(1,1); wifi useregionalopt=true}


\todo{FIGURE: a path that gets stuck, together with the two other methods, with the error over time.}

\todo{FIGURE: Wi-Fi errors down by the cellars.}

\todo{FIGURE: estimation error.}


%To analyse the drawbacks and benefits of the here presented method to resolve sample impoverishment,

%The benefits of the here presented solution to resolve sample impoverishment can be seen in the example shown in fig. \ref{}.


%show the problems with impoverishment: where it helps, what it breaks, etc.

%%estimation

\begin{figure}
	\centering
	\input{gfx/walk.tex}
	\caption{A bimodal distribution caused by uncertain measurements in the first \SI{13.4}{\second} of the walk. After \SI{20.8}{\second}, the distribution becomes unimodal. The weighted-average estimation (blue) exhibits a high error compared to the ground truth (solid black), while the BoxKDE approach (orange) does not.}
	\label{fig:realWorldMulti}
\end{figure}

As discussed in section \ref{}, the main advantage of a KDE-based estimation is that it provides the ``correct'' mode of a density, even in a multimodal setting.
A situation in which the system benefits greatly from this is illustrated in fig. \ref{fig:realWorldMulti}.
Here, a set of particles splits apart due to uncertain measurements and multiple possible walking directions.
As indicated by the black dotted line, the resulting bimodal posterior reaches its maximum distance between the modes at \SI{13.4}{\second}.
Thus, the weighted-average estimation (blue line) places the pedestrian somewhere outside the building (light green area).
The ground truth is given by the black solid line.
The KDE-based estimation (orange line) is able to provide reasonable results by choosing the ``correct'' mode of the density.
After \SI{20.8}{\second}, the setting becomes unimodal again.
Due to a right turn, the lower red particles walk into a wall and are thus penalized with a low weight.
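
The behaviour of the weighted-average estimation in such a setting can be illustrated with a simplified example.
It is an assumption made purely for illustration and not the exact particle configuration of the experiment: let the particle set consist of two equally compact clusters centered at $\mathbf{m}_1$ and $\mathbf{m}_2$, carrying the total weights $w$ and $1-w$ with $0 < w < 1$.
The weighted-average estimate is then
\[
	\hat{\mathbf{x}}_\text{avg} = w \, \mathbf{m}_1 + (1 - w) \, \mathbf{m}_2 ,
\]
which lies on the line segment between the two modes and coincides with neither of them.
If this segment crosses a wall or an area outside the building, the estimate can end up there as well.
An estimate that selects the global maximum of the density instead yields a point close to $\mathbf{m}_1$ if $w > 0.5$ and close to $\mathbf{m}_2$ if $w < 0.5$, and therefore always stays within a region that is actually populated by particles.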

Although situations such as the one displayed in fig. \ref{fig:realWorldMulti} occur frequently, the KDE-based estimation is not able to improve the overall estimation results.
This can be seen in the corresponding plot of the error over time in fig. \ref{fig:realWorldTime}.
Here, the KDE-based estimation performs slightly better than the weighted average; however, after \SI{100}{} Monte Carlo runs, the difference becomes insignificant.
Evidently, the above-mentioned ``correct'' mode does not always provide the lowest error.
In some situations, the weighted-average estimation is closer to the ground truth.
Within our experiments, this happened especially when entering or leaving thick-walled rooms, which cause slow and attenuated Wi-Fi signals.
While the system's dynamics move the particles outside, the faulty Wi-Fi readings hold back a majority of them by assigning corresponding weights.
Only when new measurements arrive from the hallway or other parts of the building are the distribution, and thus the KDE-based estimation, able to recover.

\begin{figure}
	\centering
	\input{gfx/errorOverTime.tex}
	\caption{Error over time for a single Monte Carlo run of the walk, calculated between estimation and ground truth. Between \SI{230}{\second} and \SI{290}{\second}, the pedestrian was not moving.}
	\label{fig:realWorldTime}
\end{figure}

As already mentioned in our previous work \cite{}, a KDE-based approach for estimation is able to resolve multimodalities.
It does not always provide the lowest error, since it depends more on an accurate sensor model than a weighted-average approach does, but it is a good indicator of the real performance of a sensor fusion system.
Finally, in the examples shown here we only searched for the global maximum, even though this approach opens up a wide range of other possibilities for finding a best estimate.

%as in the bulli paper.

%last paragraph: once more the overall result of the entire system
%what still goes wrong? where does what cause problems?