\newcommand{\qq} [1]{``#1''}
\newcommand{\eg} {e.\,g.}
\newcommand{\ie} {i.\,e.}
\newcommand{\figref}[1]{Fig.~\ref{#1}}

% missing math operators
\DeclareMathOperator*{\argmin}{arg\,min}
\DeclareMathOperator*{\argmax}{arg\,max}
\section{Conclusion}

In this paper, a novel approach for the rapid approximation of the KDE was presented.
This is achieved by considering the discrete convolution structure of the BKDE and thus elaborating its connection to digital signal processing, especially the Gaussian filter.
Using a box filter as an appropriate approximation results in an efficient computation scheme with fully linear complexity and negligible overhead, as demonstrated by the conducted experiments.

The analysis of the error showed that the method exhibits an error behaviour similar to that of the BKDE.
In terms of calculation time, our approach outperforms other state-of-the-art implementations.
Despite being more efficient than other methods, the exponent of the algorithmic complexity still grows with an increasing number of dimensions.

%future work, briefly
Finally, such a fast computation scheme makes the KDE more attractive for real-time use cases.
\begin{equation}
\begin{split}
\bm{X} \sim & ~\G{\VecTwo{0}{0}}{0.5\bm{I}} + \G{\VecTwo{3}{0}}{\bm{I}} + \G{\VecTwo{0}{3}}{\bm{I}} \\
&+ \G{\VecTwo{-3}{0} }{\bm{I}} + \G{\VecTwo{0}{-3}}{\bm{I}} \text{,}
\end{split}
\end{equation}
where the majority of the probability mass lies in the range $[-6; 6]^2$.

\begin{figure}[t]
\input{gfx/error.tex}
\caption{MISE relative to the ground truth as a function of $h$. While the error curves of the BKDE (red) and the BoxKDE based on the extended box filter (orange dotted line) resemble the overall course of the error of the exact KDE (green), the regular BoxKDE (orange) exhibits noticeable jumps due to rounding.} \label{fig:errorBandwidth}
\end{figure}

Four estimates are computed with varying bandwidth using the exact KDE, the BKDE, the BoxKDE, and the ExBoxKDE, which uses the extended box filter.
%Evaluated at $50^2$ points the exact KDE is compared to the BKDE, BoxKDE, and extended box filter approximation, which are evaluated at a smaller grid with $30^2$ points.
The graphs of the MISE between $f$ and the estimates as a function of $h\in[0.15; 1.0]$ are given in \figref{fig:errorBandwidth}.
A minimum error is obtained with $h=0.35$; for larger values, oversmoothing occurs and the modes gradually fuse together.
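On a regular grid, the MISE between the ground truth density and an estimate can be approximated by a Riemann sum of squared pointwise differences. A minimal sketch (function and variable names are illustrative, not from the paper's implementation):

```python
import numpy as np

def grid_mise(f_true, f_est, cell_area):
    """Approximate the MISE between two densities sampled on the same
    regular grid by a Riemann sum of squared pointwise differences."""
    return np.sum((f_true - f_est) ** 2) * cell_area

# two toy densities on a [-6, 6]^2 grid with 30^2 points
xs = np.linspace(-6, 6, 30)
dx = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs)
f = np.exp(-(X**2 + Y**2) / 2) / (2 * np.pi)                   # std. normal
g = np.exp(-(X**2 + Y**2) / (2 * 1.1**2)) / (2 * np.pi * 1.1**2)  # oversmoothed
err = grid_mise(f, g, dx * dx)
```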

Both the BKDE and the ExBoxKDE resemble the error curve of the KDE quite well and stably.
They are rather close to each other, with a tendency to diverge for larger $h$.
In contrast, the error curve of the BoxKDE has noticeable jumps at $h=\{0.25, 0.40, 0.67, 0.82\}$.
These jumps are caused by the rounding of the integer-valued box width given by \eqref{eq:boxidealwidth}.

As the extended box filter is able to approximate an exact $\sigma$, such discontinuities do not appear.
Consequently, it reduces the overall error of the approximation, although only marginally in this scenario.
The global average MISE over all values of $h$ is $0.0049$ for the regular box filter and $0.0047$ for the extended version.
Likewise, the maximum MISE is $0.0093$ and $0.0091$, respectively.
The choice between the extended and the regular box filter algorithm depends on the acceptable error and thus on the particular application.

Other test cases of theoretical relevance are the MISE as a function of the grid size $G$ and the sample size $N$.
However, both cases do not give deeper insight into the error behavior of our method, as it closely mimics the error curve of the KDE and only confirms theoretical expectations.


\begin{figure}[t]
\subsection{Performance}
In the following, we underpin the promising theoretical linear time complexity of our method with empirical time measurements, compared to other methods.
All tests are performed on an Intel Core \mbox{i5-7600K} CPU with a frequency of \SI{4.2}{\giga\hertz} and \SI{16}{\giga\byte} of main memory.
We compare our C++ implementation of the BoxKDE approximation, as shown in algorithm~\ref{alg:boxKDE}, to the \texttt{ks} R package and the FastKDE Python implementation \cite{oBrien2016fast}.
The \texttt{ks} package provides an FFT-based BKDE implementation with optimized C functions at its core.
With state estimation problems in mind, we additionally provide a C++ implementation of a weighted-average estimator.
As both of these methods do not use a grid, an equivalent input sample set was used for the weighted-average and the FastKDE.
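A weighted-average estimator over a particle set reduces to a weight-normalized mean of the sample positions. A minimal sketch (names are illustrative, not the paper's C++ implementation), which also shows its weakness for multimodal sets:

```python
import numpy as np

def weighted_average(samples, weights):
    """Estimate the state as the weight-normalized mean of the samples.
    samples: (N, d) array of particle positions; weights: (N,) array."""
    w = np.asarray(weights, dtype=float)
    return (np.asarray(samples) * w[:, None]).sum(axis=0) / w.sum()

# a bimodal particle set: the weighted mean lands between the two modes,
# in a region where no particle actually is
samples = np.array([[0.0, 0.0], [0.0, 0.0], [4.0, 0.0], [4.0, 0.0]])
weights = np.array([1.0, 1.0, 1.0, 1.0])
est = weighted_average(samples, weights)   # [2.0, 0.0], between both modes
```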

The results of the performance comparison are presented in \figref{fig:performance}.
% O(N) clearly visible for box KDE and weighted average
The linear complexity of the BoxKDE and the weighted average is clearly visible.
% Especially for small G up to 10^3 the box KDE is faster than R and FastKDE, but the WA is considerably faster than all the others
Especially for small $G$ up to $10^3$, the BoxKDE is much faster than the BKDE and FastKDE.
% With increasingly large G, the gap between box KDE and WA grows.
Nevertheless, the simple weighted-average approach performs the fastest, and with increasing $G$ its lead over the BoxKDE grows steadily.
However, this obviously comes with major disadvantages, like being prone to multimodalities, as discussed in section \ref{sec:intro}.
% (This may also be because the binning becomes slower with larger G, which I cannot explain! Perhaps cache effects)


% The stepwise increase of the runtime of the R implementation is striking.
Looking at \figref{fig:performance}, the runtime of the BKDE approach increases in a stepwise manner with growing $G$.
% This is caused by the FFT. The input for the FFT must always be padded to the next power of two.
This behavior is caused by the underlying FFT algorithm.
% Hence, the runtime jumps whenever the input is padded to a new power of two; otherwise it remains constant.
The termination of the BKDE graph at $G=4406^2$ is caused by an out-of-memory error.
% While the average runtime over all values of G for the box filter is 0.4092s, the extended box filter needed 0.4169s on average.
Both discussed Gaussian filter approximations, namely the box filter and the extended box filter, yield a similar runtime behavior and therefore a similar curve progression.
While the average runtime over all values of $G$ for the standard box filter is \SI{0.4092}{\second}, the extended one has an average of \SI{0.4169}{\second}.
To keep \figref{fig:performance} uncluttered, we only illustrate the results of the BoxKDE with the regular box filter.

The weighted-average has the great advantage of being independent of the dimensionality of the input and can be implemented effortlessly.
In contrast, the computation of the BoxKDE approach grows exponentially with an increasing number of dimensions.
However, due to the linear time complexity and the very simple computation scheme, the overall computation time is still sufficiently fast for many applications and much smaller than that of other methods.
The BoxKDE approach presents a reasonable alternative to the weighted-average and is easily integrated into existing systems.

In addition, modern CPUs benefit from the recursive computation scheme of the box filter, as the data exhibits a high degree of spatial locality in memory and the accesses are reliably predictable.
Furthermore, the computation is easily parallelized, as there is no data dependency between the one-dimensional filter passes in algorithm~\ref{alg:boxKDE}.
Hence, the inner loops can be parallelized using threads or SIMD instructions, but the overall speedup depends on the particular architecture and the size of the input.

\begin{figure}
\input{gfx/walk.tex}
\caption{Occurring bimodal distribution caused by uncertain measurements in the first \SI{13.4}{\second} of the walk. After \SI{20.8}{\second}, the distribution becomes unimodal. The weighted-average estimation (blue) exhibits a high error compared to the ground truth (solid black), while the BoxKDE approach (orange) does not.}
\label{fig:realWorldMulti}
\end{figure}
%
\figref{fig:realWorldMulti} illustrates a frequently occurring situation, where the particle set splits apart due to uncertain measurements and multiple possible walking directions.
This results in a bimodal posterior distribution, which reaches its maximum distance between the modes at \SI{13.4}{\second} (black dotted line).
Estimating the most probable state over time using the weighted-average thus results in the blue line, placing the pedestrian's position somewhere outside the building (light green area).
In contrast, the method proposed here (orange line) is able to retrieve a good estimate compared to the ground truth path shown by the black solid line.
Due to a right turn, the distribution becomes unimodal after \SI{20.8}{\second}.
This happens because the lower red particles walk against a wall and are therefore punished with a low weight.

This example highlights the main benefits of using our approach.
While being fast enough to be computed in real time, the proposed method reduces the estimation error of the state in this situation, as it is possible to distinguish the two modes of the density.
This clearly enables the system to recover the real state if multimodalities arise.
However, in situations with highly uncertain measurements, the estimation error could further increase, since the real estimate is not equal to the best estimate, \ie{} the real position of the pedestrian.

The error over time for the different estimation methods over the complete walk can be seen in \figref{fig:realWorldTime}.
It is given by the distance between the estimate and the ground truth at a specific time $t$.
Estimates obtained by simply choosing the maximum particle stand out the most.
As expected, this method produces many strong peaks by continuously jumping between single particles.
Additionally, in most real-world scenarios many particles share the same weight, and thus multiple highest-weighted particles exist.

\begin{figure}
\label{fig:realWorldTime}
\end{figure}

Further investigating \figref{fig:realWorldTime}, the BoxKDE performs slightly better than the weighted-average.
However, after 100 Monte Carlo runs, the difference becomes insignificant.
The main reason for this is again multimodality caused by faulty or delayed measurements, especially when entering or leaving rooms.
Within our experiments, the problem occurred due to slow and attenuated Wi-Fi signals inside thick-walled rooms.
With new measurements coming from the hallway or other parts of the building, the distribution and thus the estimation are able to recover.

Nevertheless, it can be seen that our approach is able to resolve multimodalities even under real-world conditions.
It does not always provide the lowest error, since it depends more on an accurate sensor model than a weighted-average approach, but it is very suitable as a good indicator of the real performance of a sensor fusion system.
In the examples shown here, we only searched for a global maximum, even though the BoxKDE approach opens a wide range of other possibilities for finding a best estimate.

%does not jump as much as the maximum

%As the density estimation poses only a single step in the whole process, its computation needs to be as fast as possible.
% not taking too much time from the frame

Consider a set of two-dimensional samples with associated weights, \eg{} generated from a particle filter system.
The overall process for bivariate data is described in Algorithm~\ref{alg:boxKDE}.

Assuming that the given $N$ samples are stored in a sequential list, the first step is to create a grid representation.
In order to efficiently construct the grid and to allocate the required memory, the extrema of the samples need to be known in advance.
These limits might be given by the application.
For example, the position of a pedestrian within a building is limited by the physical dimensions of the building.
Such knowledge should be integrated into the system to avoid a linear search over the sample set, naturally reducing the computation time.

\begin{algorithm}[t]

As the number of grid points directly affects both computation time and accuracy, a suitable grid should be as coarse as possible to produce an estimate sufficiently fast, but at the same time fine enough to keep the approximation error acceptable.

If the extreme values are known in advance, the computation of the grid is $\landau{N}$; otherwise an additional $\landau{N}$ search is required.
The grid is stored as a linear array in memory, thus its space complexity is $\landau{G_1\cdot G_2}$.

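The binning step above can be sketched as follows, assuming the extrema are known: each weighted sample is accumulated into its nearest cell of a linear array. This is a simple-binning sketch with illustrative names; the paper's implementation details may differ:

```python
import numpy as np

def bin_samples(samples, weights, lo, hi, G1, G2):
    """Accumulate weighted 2-D samples onto a G1 x G2 grid spanning
    [lo, hi] in both dimensions, stored as a linear array (O(N) time,
    O(G1*G2) space)."""
    grid = np.zeros(G1 * G2)
    scale1 = (G1 - 1) / (hi - lo)
    scale2 = (G2 - 1) / (hi - lo)
    for (x, y), w in zip(samples, weights):
        i = int(round((x - lo) * scale1))   # nearest grid index, dim 1
        j = int(round((y - lo) * scale2))   # nearest grid index, dim 2
        grid[i * G2 + j] += w               # linear-array indexing
    return grid

# example: two weighted samples on a 2 x 2 grid over [0, 1]^2
g = bin_samples([(0.0, 0.0), (1.0, 1.0)], [0.5, 2.0], 0.0, 1.0, 2, 2)
```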
Next, the binned data is filtered with a Gaussian using the box filter approximation.
The box filter's width is derived via \eqref{eq:boxidealwidth} from the standard deviation of the approximated Gaussian, which is in turn equal to the bandwidth of the KDE.
Each filter pass is computed in $\landau{G}$ operations; however, an additional memory buffer is required \cite{dspGuide1997}.
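The width derivation and one recursive filter pass can be sketched as below. Since \eqref{eq:boxidealwidth} is not reproduced in this excerpt, the sketch uses the commonly cited relation that $K$ successive passes of a box of width $w$ approximate a Gaussian with $\sigma^2 = K(w^2-1)/12$, i.e. $w = \sqrt{12\sigma^2/K + 1}$; the paper's exact formula may differ:

```python
import numpy as np

def box_width(sigma, K=3):
    """Ideal box width for K successive box passes approximating a Gaussian
    of std sigma (common textbook relation, assumed here), rounded to odd."""
    w = int(round(np.sqrt(12.0 * sigma**2 / K + 1.0)))
    return w if w % 2 == 1 else w + 1

def box_pass(data, w):
    """One recursive (running-sum) box filter pass in O(G): each output is
    the previous sum plus the entering sample minus the leaving one."""
    r = w // 2
    padded = np.pad(data, r)                    # zero padding at the borders
    out = np.empty_like(data, dtype=float)
    s = padded[:w].sum()
    out[0] = s / w
    for i in range(1, len(data)):
        s += padded[i + w - 1] - padded[i - 1]  # add entering, drop leaving
        out[i] = s / w
    return out

# impulse response of one pass with width 3: a plateau of height 1/3
data = np.zeros(7)
data[3] = 1.0
out = box_pass(data, 3)
```

Note that the cost per pass is independent of $w$, which is what makes the overall scheme linear in $G$.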

While the integer-sized box filter requires the fewest operations, it causes a larger approximation error due to rounding.
Depending on the required accuracy, the extended box filter algorithm can further improve the estimation results with only a small additional overhead \cite{gwosdek2011theoretical}.
Due to its simple indexing scheme, the recursive box filter can easily be computed in parallel using SIMD operations and parallel computation cores.

Finally, the most likely state can be obtained from the filtered data, \ie{} from the estimated discrete density, by searching the filtered data for its maximum value.
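Reading off the most likely state then amounts to an argmax over the linear array, mapped back to coordinates. A minimal sketch (names and the grid layout are illustrative):

```python
import numpy as np

def most_likely_state(grid, G1, G2, lo, hi):
    """Return the coordinates of the maximum of a density stored as a
    linear G1*G2 array: an O(G) scan, then index -> position mapping."""
    k = int(np.argmax(grid))
    i, j = divmod(k, G2)                    # recover 2-D indices
    x = lo + i * (hi - lo) / (G1 - 1)       # map back to coordinates
    y = lo + j * (hi - lo) / (G2 - 1)
    return x, y

# toy density: a single peak at grid cell (2, 3) of a 5 x 5 grid on [0, 4]^2
grid = np.zeros(25)
grid[2 * 5 + 3] = 1.0
state = most_likely_state(grid, 5, 5, 0.0, 4.0)
```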