From 9f358d69c9ad564082299f9b9fdaac8b846b6603 Mon Sep 17 00:00:00 2001
From: Markus Bullmann
Date: Tue, 8 May 2018 11:08:48 +0200
Subject: [PATCH] Minor changes to wording

---
 tex/chapters/abstract.tex     | 2 +-
 tex/chapters/introduction.tex | 8 ++++----
 tex/chapters/kde.tex          | 10 +++++-----
 tex/chapters/multivariate.tex | 5 ++---
 tex/chapters/relatedwork.tex  | 11 +++++------
 5 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/tex/chapters/abstract.tex b/tex/chapters/abstract.tex
index c8de6ed..31e9ea7 100644
--- a/tex/chapters/abstract.tex
+++ b/tex/chapters/abstract.tex
@@ -1,6 +1,6 @@
\begin{abstract}
It is common practice to use a sample-based representation to solve problems having a probabilistic interpretation.
-In many real world scenarios one is then interested in finding a \qq{best estimate} of the underlying problem, \eg{} the position of a robot.
+In many real-world scenarios one is then interested in finding a best estimate of the underlying problem, \eg{} the position of a robot.
This is often done by means of simple parametric point estimators, providing the sample statistics.
However, in complex scenarios this frequently results in a poor representation, due to multimodal densities and limited sample sizes.
diff --git a/tex/chapters/introduction.tex b/tex/chapters/introduction.tex
index 0e28cea..b9a640f 100644
--- a/tex/chapters/introduction.tex
+++ b/tex/chapters/introduction.tex
@@ -14,7 +14,7 @@ While such methods are computational fast and suitable most of the time, it is n
Especially time-sequential, non-linear and non-Gaussian state spaces, depending upon a high number of different sensor types, frequently suffer from a multimodal representation of the posterior distribution.
As a result, those techniques are not able to provide an accurate statement about the most probable state, rather causing misleading or false outcomes.
For example, in a localization scenario where a bimodal distribution represents the current posterior, a reliable position estimation is more likely to be at one of the modes, instead of somewhere in-between, like provided by a simple weighted-average estimation.
-Additionally, in most practical scenarios the sample size and therefore the resolution is limited, causing the variance of the sample based estimate to be high \cite{Verma2003}.
+Additionally, in most practical scenarios the sample size, and hence the resolution, is limited, causing the variance of the sample-based estimate to be high \cite{Verma2003}.
It is obvious, that a computation of the full posterior could solve the above, but finding such an analytical solution is an intractable problem, which is the reason for applying a sample representation in the first place.
Another promising way is to recover the probability density function from the sample set itself, by using a non-parametric estimator like a kernel density estimation (KDE).
@@ -25,10 +25,10 @@ Nevertheless, the availability of a fast processing density estimate might impro
%Therefore, this paper presents a novel approximation approach for rapid computation of the KDE.
%In this paper, a well known approximation of the Gaussian filter is used to speed up the computation of the KDE.
In this paper, a novel approximation approach for rapid computation of the KDE is presented.
-The basic idea is to interpret the estimation problem as a filtering operation.
+The basic idea is to interpret the density estimation problem as a filtering operation.
We show that computing the KDE with a Gaussian kernel on binned data is equal to applying a Gaussian filter on the binned data.
-This allows us to use a well known approximation scheme for Gaussian filters: the box filter.
-By the central limit theorem, multiple recursion of a box filter yields an approximative Gaussian filter \cite{kovesi2010fast}.
+This allows us to use a well-known approximation scheme based on multiple recursions of a box filter, which, by the central limit theorem, yields an approximate Gaussian filter \cite{kovesi2010fast}.
+
This process converges quite fast to a reasonable close approximation of the ideal Gaussian.
In addition, a box filter can be computed extremely fast by a computer, due to its intrinsic simplicity.
diff --git a/tex/chapters/kde.tex b/tex/chapters/kde.tex
index f81edb4..39aff26 100644
--- a/tex/chapters/kde.tex
+++ b/tex/chapters/kde.tex
@@ -22,7 +22,7 @@ The kernel estimator $\hat{f}$ which estimates $f$ at the point $x$ is given as
where $W=\sum_{i=1}^{N}w_i$ and $h\in\R^+$ is an arbitrary smoothing parameter called bandwidth.
$K$ is a kernel function such that $\int K(u) \dop{u} = 1$.
In general, any kernel can be used, however a common advice is to chose a symmetric and low-order polynomial kernel.
-Thus, several popular kernel functions are used in practice, like the Uniform, Gaussian, Epanechnikov, or Silverman kernel \cite{scott2015}.
+Several popular kernel functions are used in practice, such as the Uniform, Gaussian, Epanechnikov, or Silverman kernel \cite{scott2015}.
While the kernel estimate inherits all the properties of the kernel, usually it is not of crucial matter if a non-optimal kernel was chosen.
As a matter of fact, the quality of the kernel estimate is primarily determined by the smoothing parameter $h$ \cite{scott2015}.
@@ -41,7 +41,7 @@ As a matter of fact, the quality of the kernel estimate is primarily determined
% TODO the bandwidth is assumed to be given here, for reasons
%
%As mentioned above the particular choice of the kernel is only of minor importance as it affects the overall result in an negligible way.
-It is common practice to suspect that the data is approximately Gaussian, and therefore the Gaussian kernel is frequently used.
+It is common practice to assume that the data is approximately Gaussian; hence, the Gaussian kernel is frequently used.
%Note that this assumption is different compared to assuming a concrete distribution family like a Gaussian distribution or mixture distribution.
In this work we choose the Gaussian kernel in favour of computational efficiency as our approach is based on the approximation of the Gaussian filter.
The Gaussian kernel is given as
@@ -109,15 +109,15 @@ This reduces the number of kernel evaluations to $\landau{G}$, but the number of
Using the FFT to perform the discrete convolution, the complexity can be further reduced to $\landau{G\log{G}}$ \cite{silverman1982algorithm}.%, which is currently the fastest exact BKDE algorithm.
The \mbox{FFT-convolution} approach is usually highlighted as the striking computational benefit of the BKDE.
-However, for this work it is the key to recognize the discrete convolution structure of \eqref{eq:binKde}, as this allows to interpret the computation of a density estimate as a signal filter problem.
+However, for this work it is key to recognize the discrete convolution structure of \eqref{eq:binKde}, as this allows us to interpret the computation of a density estimate as a signal filtering problem.
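As a concrete illustration of this convolution view, here is a minimal NumPy sketch (an editorial example with assumed grid size, bandwidth, and pass count, not code from the thesis). It bins an unweighted sample (so $W=N$) onto a regular grid, evaluates \eqref{eq:binKde} with the Gaussian kernel as a direct discrete convolution, and then approximates the same smoothing with $n$ passes of a box filter whose width follows the variance-matching rule $w \approx \sqrt{12\sigma^2/n+1}$ (with $\sigma$ the bandwidth measured in bins) used in \cite{kovesi2010fast}:

import numpy as np

def bkde_gaussian(counts, step, h):
    # Exact BKDE: convolve the bin counts with the Gaussian kernel sampled
    # on the grid, truncated at four standard deviations.
    radius = int(np.ceil(4 * h / step))
    u = np.arange(-radius, radius + 1) * step
    kernel = np.exp(-u**2 / (2 * h**2)) / (np.sqrt(2 * np.pi) * h)
    return np.convolve(counts, kernel, mode="same") / counts.sum()

def bkde_box(counts, step, h, n=3):
    # Approximate BKDE: n passes of a normalized box filter. One box of odd
    # width w has variance (w**2 - 1) / 12 bins^2, so w is chosen such that
    # n stacked boxes roughly reach the target variance (h / step)**2.
    sigma = h / step
    w = int(round(np.sqrt(12 * sigma**2 / n + 1)))
    w += 1 - w % 2                     # force odd width to keep the box centered
    box = np.ones(w) / w
    out = counts.astype(float)
    for _ in range(n):                 # central limit theorem: box^n -> Gaussian
        out = np.convolve(out, box, mode="same")
    return out / (counts.sum() * step)   # bin mass -> density

# Toy bimodal sample, binned on a regular grid of G = 512 bins.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 5000), rng.normal(2.0, 0.8, 5000)])
edges = np.linspace(-6.0, 6.0, 513)
counts, _ = np.histogram(x, bins=edges)
step = edges[1] - edges[0]

exact = bkde_gaussian(counts, step, h=0.3)
approx = bkde_box(counts, step, h=0.3)
print(np.abs(exact - approx).max())      # deviation of the box approximation

Each box pass can additionally be realized with a running sum, making its cost independent of the kernel width; this is the intrinsic simplicity of the box filter mentioned above.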
This makes it possible to apply a wide range of well studied techniques from the broad field of digital signal processing (DSP).
-Using the Gaussian kernel from \eqref{eq:gausKern} in conjunction with \eqref{eq:binKde} results in the following equation
+Using the Gaussian kernel from \eqref{eq:gausKern} in conjunction with \eqref{eq:binKde} gives
\begin{equation}
\label{eq:bkdeGaus}
\tilde{f}(g_x)=\frac{1}{W\sqrt{2\pi}} \sum_{j=1}^{G} \frac{C_j}{h} \expp{-\frac{(g_x-g_j)^2}{2h^2}} \text{.}
\end{equation}
-The above formula is a convolution operation of the data and the Gaussian kernel.
+The above formula is a convolution of the data with the Gaussian kernel.
More precisely, it is a discrete convolution of the finite data grid and the Gaussian function.
In terms of DSP this is analogous to filter the binned data with a Gaussian filter.
This finding allows to speed up the computation of the density estimate by using a fast approximation scheme based on iterated box filters.
diff --git a/tex/chapters/multivariate.tex b/tex/chapters/multivariate.tex
index edfaad0..85c6d88 100644
--- a/tex/chapters/multivariate.tex
+++ b/tex/chapters/multivariate.tex
@@ -24,9 +24,8 @@ The only exception is the Gaussian kernel, which is spherically symmetric and ha
In addition, only smoothing in the direction of the axes is possible.
If smoothing in other directions is necessary, the computation needs to be done on a prerotated sample set and the estimate needs to be rotated back to fit the original coordinate system \cite{wand1994fast}.
-For the multivariate BKDE, in addition to the kernel function, the grid and the binning rules need to be extended to multivariate data.
-Their extensions are rather straightforward, as the grid is easily defined on many dimensions.
-Likewise, the ideas of common and linear binning rule scale with dimensionality \cite{wand1994fast}.
+For the multivariate BKDE, in addition to the kernel function, the grid and the binning rules need to be extended to multivariate data, which is rather straightforward, as the grid is easily defined in multiple dimensions.
+Likewise, the common and linear binning rules scale with dimensionality \cite{wand1994fast}.
In general, multi-dimensional filters are multi-dimensional convolution operations.
However, by utilizing the separability property of convolution, a straightforward and a more efficient implementation can be found.
diff --git a/tex/chapters/relatedwork.tex b/tex/chapters/relatedwork.tex
index bba48a1..f24967a 100644
--- a/tex/chapters/relatedwork.tex
+++ b/tex/chapters/relatedwork.tex
@@ -6,21 +6,20 @@
% -> Fourier transfom
-The Kernel density estimator is a well known non-parametric estimator, originally described independently by Rosenblatt \cite{rosenblatt1956remarks} and Parzen \cite{parzen1962estimation}.
+The kernel density estimator is a well-known non-parametric density estimator, originally described independently by Rosenblatt \cite{rosenblatt1956remarks} and Parzen \cite{parzen1962estimation}.
It was subject to extensive research and its theoretical properties are well understood.
A comprehensive reference is given by Scott \cite{scott2015}.
Although classified as non-parametric, the KDE depends on two free parameters, the kernel function and its bandwidth.
The selection of a \qq{good} bandwidth is still an open problem and heavily researched.
-An extensive overview regarding the topic of automatic bandwith selection is given by \cite{heidenreich2013bandwidth}.
+An extensive overview of automatic bandwidth selection is given in \cite{heidenreich2013bandwidth}.
%However, the automatic selection of the bandwidth is not subject of this work and we refer to the literature \cite{turlach1993bandwidth}.
-The great flexibility of the KDE makes it very useful for many applications.
+The great flexibility of the KDE makes it suitable for many applications.
However, this comes at the cost of a slow computation speed.
% The complexity of a naive implementation of the KDE is \landau{MN}, given by $M$ evaluations of $N$ data samples as input size.
%The complexity of a naive implementation of the KDE is \landau{NM} evaluations of the kernel function, given $N$ data samples and $M$ points of the estimate.
-Therefore, a lot of effort was put into reducing the computation time of the KDE.
-Various methods have been proposed, which can be clustered based on different techniques.
+Various methods have been proposed to reduce the computation time of the KDE.
% k-nearest neighbor searching
An obvious way to speed up the computation is to reduce the number of evaluated kernel functions.
@@ -29,7 +28,7 @@ These algorithms reduce the number of evaluated kernels by taking the distance b
% fast multipole method & Fast Gaus Transform
Another approach is to reduce the algorithmic complexity of the sum over Gaussian functions, by employing a specialized variant of the fast multipole method.
-The term fast Gauss transform was coined by Greengard \cite{greengard1991fast} who suggested this approach to reduce the complexity of the KDE to \landau{N+M}.
+The term fast Gauss transform was coined by Greengard \cite{greengard1991fast}, who suggested this approach to reduce the complexity to \landau{N+M}.
% However, the complexity grows exponentially with dimension. \cite{Improved Fast Gauss Transform and Efficient Kernel Density Estimation}
% FastKDE, passed on ECF and nuFFT
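Finally, the separability argument from the multivariate chapter above can be sketched in the same style (again an editorial NumPy example with hypothetical helper names and arbitrary per-axis widths, covering only the axis-aligned smoothing case discussed there): a $d$-dimensional axis-aligned filter factors into $d$ one-dimensional passes, so the 2-D bin grid below is smoothed by running a box pass along each axis in turn instead of performing one full two-dimensional convolution.

import numpy as np

def box_pass_axis(grid, w, axis):
    # One pass of a normalized width-w box filter along a single axis.
    box = np.ones(w) / w
    return np.apply_along_axis(
        lambda line: np.convolve(line, box, mode="same"), axis, grid)

def separable_box_smooth(grid, widths, n=3):
    # Axis-aligned smoothing of a d-dimensional bin grid: n box passes per
    # axis instead of one d-dimensional convolution. Separability turns a
    # kernel of size prod(widths) into d one-dimensional filters, and each
    # box pass could additionally run in O(G) with a running sum.
    out = grid.astype(float)
    for axis, w in enumerate(widths):
        for _ in range(n):
            out = box_pass_axis(out, w, axis)
    return out

# A single occupied bin in a 2-D grid spreads into a smooth, roughly
# Gaussian bump; distinct per-axis widths mimic a diagonal bandwidth matrix.
grid = np.zeros((65, 65))
grid[32, 32] = 1.0
smoothed = separable_box_smooth(grid, widths=(9, 5))
print(smoothed.sum())   # ~1.0 for this centered example: mass is preserved

With the per-axis widths derived from the per-axis bandwidths as in the one-dimensional sketch earlier, this gives the multivariate form of the box-filter approximation; smoothing in non-axis directions would require the prerotation described in multivariate.tex.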