
\section{Transition}
\label{sec:transition}

	\begin{figure}[t]
		\centering
		\begin{subfigure}{0.325\textwidth}
			\centering
			\includegraphics[width=5.1cm]{gfx/transition/museumMap.pdf}
			\caption{3D Floorplan}
			\label{fig:museumMap}
		\end{subfigure}
		\begin{subfigure}{0.325\textwidth}
			\centering
			\includegraphics[width=5.1cm]{gfx/transition/museumMapGrid.pdf}
			\caption{Navigation graph}
			\label{fig:museumMapGrid}
		\end{subfigure}
		\begin{subfigure}{0.325\textwidth}
			\centering
			\includegraphics[width=5.1cm]{gfx/transition/museumMapMesh.pdf}
			\caption{Navigation mesh}
			\label{fig:museumMapMesh}
		\end{subfigure}
		\caption{
			Floorplan and transition data structures for the ground floor of the building (\SI{71}{\meter}~x~\SI{53}{\meter}).
			To reach every nook and cranny, the graph-based approach (b) requires many nodes and edges.
			The depicted version uses a coarse node-spacing of \SI{90}{\centi\meter} (1700 nodes) and barely reaches all doors and stairs.
			A navigation mesh (c) requires only 320 triangles to tightly reach every corner within the building.
		}
		\label{fig:transition}
	\end{figure}

	In previous work, we used a graph of equidistant nodes (see \reffig{fig:museumMapGrid})
	to model the building's floorplan, which served as the basis for the transition step \cite{Ebner-15, Ebner-16}.
	% in 15 and 16 we introduced the graph step by step
	%
	The graph corresponds to a grid, where each node constitutes the center of a grid-cell.
	Cells, usually around \SI{30}{\centi\meter}~x~\SI{30}{\centi\meter} in size,
	are only placed in regions that are actually walkable and not intersected by any walls
	or other obstacles. After placement, each cell is connected to its up to eight potential
	neighbors in the plane, creating a walkable graph for each floor. The resulting graphs are
	then connected via stairs or elevators to form the final data structure
	for the whole building.
	This allows for (semi-)random walks along the graph by assigning a probability to each edge
	based on prior knowledge provided by sensors, forming the transition probability
	$p(\mStateVec_{t} \mid \mStateVec_{t-1}, \mObsVec_{t-1})$ \cite{Ebner-16}.

	Due to the equidistant spacing, the resulting graph is rather rigid and
	only well-suited for rectangular buildings. For more contorted buildings, like many
	historic ones, the node-spacing needs to be small to reliably reach every door, stair
	and corner of the building. Within \reffig{fig:museumMapGrid} we used a
	\SI{90}{\centi\meter} spacing, which is barely able to reach all places within
	the lower floors of the building and fails to connect the upper floors reliably.
	While smaller spacings remedy the problem, they require huge amounts of memory:
	up to several hundred megabytes and millions of nodes and edges to model a single building.
	% museum from figure: 90cm grid: approx. 2000 nodes, approx. 6500 edges
	% museum from figure: 30cm grid: approx. 32k nodes and 120k edges
	% whole museum, 20cm grid: approx. 75k nodes, 280k edges
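
	The growth in memory can be illustrated by a rough estimate (a back-of-the-envelope sketch, not a measurement; the symbols $A$, $d$, $n$ and $e$ are introduced here for illustration only).
	A grid with node-spacing $d$ covering an area $A$ contains approximately at most
	\begin{equation*}
		n \le \frac{A}{d^2}
		\qquad\text{nodes and}\qquad
		e \le \frac{8\,n}{2} = 4\,n
		\qquad\text{edges,}
	\end{equation*}
	as each node has up to 8 neighbors and each edge is shared by two nodes.
	Taking the bounding box of the ground floor, $A = \SI{71}{\meter} \cdot \SI{53}{\meter} = \SI{3763}{\square\meter}$,
	as a pessimistic estimate for the walkable area, this yields $n \le 4645$ for $d = \SI{90}{\centi\meter}$,
	but already $n \le 41811$ for $d = \SI{30}{\centi\meter}$ on a single floor:
	reducing the spacing to a third increases the number of nodes ninefold.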

	Because of both the memory requirements and the inaccuracies of the graph-based
	model, we developed a new basis for the transition step that is still able to provide
	$p(\mStateVec_{t} \mid \mStateVec_{t-1}, \mObsVec_{t-1})$.
	The new foundation is provided by well-known navigation meshes \cite{navMesh1},
	where the walkable area is spanned by convex polygons that share
	their outline edges. Each polygon knows its adjacent
	neighbors, creating a walkable mesh.
	Using elements of variable shape and size instead of rigid grid-cells
	provides both higher accuracy for reaching every corner and a reduced
	memory footprint, as a single polygon is able to cover arbitrarily
	large regions. However, polygons impose several drawbacks on
	common operations used within the transition step, like checking whether
	a point is contained within some region. This is much more costly for polygons
	than for grid-cells, which are axis-aligned rectangles.
	% museum from figure: 305 triangles
	% whole museum: 789 for everything
	%
	Such issues can be mitigated by using triangles instead of general polygons, as depicted in \reffig{fig:museumMapMesh}.
	Each element within the mesh then has exactly three edges and at most three neighbors.
	While this usually requires some additional memory, as more triangles are needed compared to polygons,
	operations such as the aforementioned contains-check can now easily be performed,
	\eg{} by using barycentric coordinates.
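
	The contains-check via barycentric coordinates can be sketched as follows, where the vertex names
	$\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$, the query point $\mathbf{p}$ and the weights $\lambda_i$
	are notation assumed for this illustration.
	Using the 2D cross product $\mathbf{a} \times \mathbf{b} = a_x b_y - a_y b_x$,
	the weights of $\mathbf{p}$ with respect to a non-degenerate triangle are
	\begin{equation*}
		\lambda_2 = \frac{(\mathbf{p}-\mathbf{v}_1) \times (\mathbf{v}_3-\mathbf{v}_1)}{(\mathbf{v}_2-\mathbf{v}_1) \times (\mathbf{v}_3-\mathbf{v}_1)}
		,\enskip
		\lambda_3 = \frac{(\mathbf{v}_2-\mathbf{v}_1) \times (\mathbf{p}-\mathbf{v}_1)}{(\mathbf{v}_2-\mathbf{v}_1) \times (\mathbf{v}_3-\mathbf{v}_1)}
		,\enskip
		\lambda_1 = 1 - \lambda_2 - \lambda_3
		\enskip,
	\end{equation*}
	such that $\mathbf{p} = \lambda_1 \mathbf{v}_1 + \lambda_2 \mathbf{v}_2 + \lambda_3 \mathbf{v}_3$.
	The point lies within the triangle (including its boundary) if and only if
	$\lambda_1, \lambda_2, \lambda_3 \ge 0$, which requires only a few multiplications and comparisons per triangle.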

	\newcommand{\turnNoise}{\mathcal{T}}
	\newcommand{\stepSize}{\mathcal{S}}
	This data structure leaves room for various strategies to be applied within the transition step.
	The simplest approach uses an average pedestrian step size together with the
	number of detected steps $\mObsSteps$ and the change in heading $\mObsHeading$
	gathered from the sensor observations $\mObsVec_{t-1}$.
	Combined with the previously estimated position $(x,y)^T$ and heading $\mStateHeading$,
	%from $\mStateVec_{t-1}$
	and including uncertainties for step-size $\stepSize$
	and turn-angle $\turnNoise$,
	this directly defines new potential whereabouts
	$p(\mStateVec_{t} \mid \mStateVec_{t-1}, \mObsVec_{t-1})$:
	%
	\begin{equation}
		\begin{aligned}
			x_t &=&		\overbrace{x_{t-1}}^{\text{old pos.}}&	&	&+& \overbrace{\mObsSteps \cdot \stepSize}^{\text{distance}}&	&	&\cdot& \overbrace{\cos(\mStateHeading_{t})}^{\text{direction}}&		&			,\enskip \turnNoise &\sim \mathcal{N}(\mObsHeading, \sigma_\text{turn}^2)				  \\
			y_t &=&		y_{t-1}\phantom{.}&						&	&+& \mObsSteps \cdot \stepSize&									&	&\cdot& \sin(\mStateHeading_{t})&										&			,\enskip \stepSize &\sim \mathcal{N}(\SI{70}{\centi\meter}, \sigma_\text{step}^2)	\\
			\mStateHeading_{t} &=& \mStateHeading_{t-1} + \turnNoise\\
		\end{aligned}
	\end{equation}
	\noindent{}with
	\begin{equation*}
			\mObsSteps,\mObsHeading	\in	\mObsVec_{t-1}
			\enskip\enskip\enskip
			\text{and}
			\enskip\enskip\enskip
			x_{t-1},y_{t-1},\mStateHeading_{t-1}	\in	\mStateVec_{t-1}
			\enskip.
	\end{equation*}
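	\noindent{}As a noise-free illustration (values chosen purely as an example, with both random variables replaced by their means),
	consider $\mObsSteps = 2$ detected steps, a heading change of $\mObsHeading = \SI{90}{\degree}$
	and a previous heading of $\mStateHeading_{t-1} = \SI{0}{\degree}$:
	\begin{equation*}
		\begin{aligned}
			\mStateHeading_{t} &= \SI{0}{\degree} + \SI{90}{\degree} = \SI{90}{\degree} \\
			x_t &= x_{t-1} + 2 \cdot \SI{70}{\centi\meter} \cdot \cos(\SI{90}{\degree}) = x_{t-1} \\
			y_t &= y_{t-1} + 2 \cdot \SI{70}{\centi\meter} \cdot \sin(\SI{90}{\degree}) = y_{t-1} + \SI{140}{\centi\meter}
		\end{aligned}
	\end{equation*}
	Drawing $\turnNoise$ and $\stepSize$ from their distributions instead scatters the resulting destinations around this point.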

	Whether the newly obtained destination $(x_t, y_t)^T$ is actually reachable from the start $(x_{t-1}, y_{t-1})^T$ can be determined
	by checking if their corresponding triangles are connected with each other.
	If so, the corresponding $z_t$ can be interpolated using the barycentric coordinates of $(x_t, y_t)^T$
	within a 2D projection of the triangle the position belongs to and applying them to the original 3D triangle.
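
	Written out (with $\lambda_1, \lambda_2, \lambda_3$ denoting the barycentric coordinates of $(x_t, y_t)^T$
	within the projected triangle and $z_1, z_2, z_3$ the heights of the three vertices of the original 3D triangle;
	notation assumed here for illustration), this interpolation reads
	\begin{equation*}
		z_t = \lambda_1 z_1 + \lambda_2 z_2 + \lambda_3 z_3
		\enskip,\qquad
		\lambda_1 + \lambda_2 + \lambda_3 = 1
		\enskip.
	\end{equation*}
	For example, on a stair triangle with vertex heights $z_1 = z_2 = \SI{0}{\meter}$ and $z_3 = \SI{3}{\meter}$,
	a position with $\lambda_1 = \lambda_2 = 0.25$ and $\lambda_3 = 0.5$ is assigned $z_t = \SI{1.5}{\meter}$,
	while on a flat floor triangle all positions share the same height.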

	If the destination is unreachable,
	\eg{} due to walls or other obstacles, a different handling strategy is required. Simply trying again might
	be a viable solution, as the uncertainty induced by $\turnNoise$ and $\stepSize$ will yield a slightly different destination
	that might be reachable. Increasing $\sigma_\text{step}$ and $\sigma_\text{turn}$ for those cases is another option.
	Likewise, simply using some random position and omitting heading and steps might work as well.

The detected steps $\mObsSteps$ and the heading change $\mObsHeading$ are obtained using the smartphone's IMU.
To provide a robust heading change, we first need to rotate the gyroscope readings into the east-north-up frame using a suitable transformation matrix.
After the rotation, integrating the gyroscope's $z$-axis over a predefined time interval provides the user's heading change (yaw) \cite{Ebner-15}.
To obtain the matrix in the first place, we assume that the acceleration during walking is cyclic and thus the average acceleration over several cycles has to be almost zero.
This allows us to measure the direction of gravity and use it to construct the transformation matrix.
It should be noted that, especially for cheap IMUs as found in most smartphones, the matrix has to be updated at very short intervals of one or two seconds to preserve good results \cite{davidson2017survey}.
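
One possible way to write this down, using symbols that are assumptions of this sketch
(gyroscope samples $\boldsymbol{\omega}_k$ and accelerometer samples $\mathbf{a}_k$ in the device frame,
sample interval $\Delta t$, rotation matrix $\mathbf{R}$), is
\begin{equation*}
	\hat{\mathbf{g}} = \frac{1}{N} \sum_{k=1}^{N} \mathbf{a}_k
	\enskip,\qquad
	\mathbf{R} \, \frac{\hat{\mathbf{g}}}{\lVert \hat{\mathbf{g}} \rVert} = (0, 0, 1)^T
	\enskip,\qquad
	\mObsHeading \approx \sum_{k} \left( \mathbf{R} \, \boldsymbol{\omega}_k \right)_z \Delta t
	\enskip,
\end{equation*}
where the $N$ samples used for $\hat{\mathbf{g}}$ span several step cycles.
Since $\left( \mathbf{R} \, \boldsymbol{\omega}_k \right)_z = \boldsymbol{\omega}_k \cdot \hat{\mathbf{g}} / \lVert \hat{\mathbf{g}} \rVert$ for any such rotation,
the heading change depends only on the estimated direction of gravity.
The sign convention assumes that the accelerometer reports an upward-pointing vector at rest; it has to be flipped for sensors using the opposite convention.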

To obtain the number of steps, we use a very simple step detection based on the accelerometer magnitude.
For this, we calculate the difference between the average magnitude over the last \SI{200}{\milli\second} and the magnitude of the gravity vector.
If this difference is above a threshold of \SI{0.32}{\m\per\square\s}, a step is detected.
To prevent multiple detections within an unrealistically short interval, we block the detection for \SI{250}{\milli\second} \cite{Koeping14}.
Of course, there are far more advanced methods, as surveyed in \cite{davidson2017survey}; however, this simple method has served us very well in the past.
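
Expressed as a rule (a sketch of the description above; the symbols $\bar{m}_t$, $g$, $t_\text{last}$ and the sample set $W_t$ are introduced here for illustration),
with $W_t$ denoting the accelerometer samples $\mathbf{a}_k$ of the last \SI{200}{\milli\second},
$g$ the magnitude of gravity and $t_\text{last}$ the time of the previously detected step,
a step is detected at time $t$ if
\begin{equation*}
	\bar{m}_t - g > \SI{0.32}{\meter\per\square\second}
	\enskip\enskip\text{and}\enskip\enskip
	t - t_\text{last} \ge \SI{250}{\milli\second}
	\enskip,\qquad
	\bar{m}_t = \frac{1}{\lvert W_t \rvert} \sum_{\mathbf{a}_k \in W_t} \lVert \mathbf{a}_k \rVert
	\enskip.
\end{equation*}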

	%\commentByFrank{there would be entirely different approaches etc., but we probably do not have enough space left :P}
	%\commentByToni{I also think this is sufficient.}