first draft introduction finished

first draft related work finished
2018-02-14 18:16:29 +01:00
parent a5fc1628e6
commit df18cc87ee
3 changed files with 42 additions and 22 deletions
--- a/tex/chapters/relatedwork.tex
+++ b/tex/chapters/relatedwork.tex
@@ -9,13 +9,16 @@
 Kernel density estimation is a well-known non-parametric estimator, originally described independently by Rosenblatt \cite{rosenblatt1956remarks} and Parzen \cite{parzen1962estimation}.
 It has been subject to extensive research and its theoretical properties are well understood.
 A comprehensive reference is given by Scott \cite{scott2015}.
-Although classified as non-parametric, the KDE has a two free parameters, the kernel function and its bandwidth.
+Although classified as non-parametric, the KDE depends on two free parameters, the kernel function and its bandwidth.
 The selection of a \qq{good} bandwidth is still an open problem and heavily researched.
-However, the automatic selection of the bandwidth is not subject of this work and we refer to the literature \cite{turlach1993bandwidth}.
+An extensive overview of automatic bandwidth selection is given in \cite{heidenreich2013bandwith}.
+%However, the automatic selection of the bandwidth is not subject of this work and we refer to the literature \cite{turlach1993bandwidth}.

 The great flexibility of the KDE renders it very useful for many applications.
-However, its flexibility comes at the cost of a relative slow computation speed.
-The complexity of a naive implementation of the KDE is \landau{NM} evaluations of the kernel function, given $N$ data samples and $M$ points of the estimate.
+However, this comes at the cost of a relatively slow computation.
+%
+The complexity of a naive implementation of the KDE is \landau{MN}, since the kernel sum over all $N$ data samples has to be evaluated at each of the $M$ points of the estimate.
+%The complexity of a naive implementation of the KDE is \landau{NM} evaluations of the kernel function, given $N$ data samples and $M$ points of the estimate.
 Therefore, a lot of effort was put into reducing the computation time of the KDE.
 Various methods have been proposed, which can be clustered based on different techniques.

@@ -32,16 +35,26 @@ The term fast Gauss transform was coined by Greengard \cite{greengard1991fast} w
 % FastKDE, passed on ECF and nuFFT
 Recent methods based on the \qq{self-consistent} KDE proposed by Bernacchia and Pigolotti allow an estimate to be obtained without any assumptions.
 They define a Fourier-based filter on the empirical characteristic function of a given dataset.
-The computation time was further reduced by \etal{O'Brien} using a non-uniform FFT algorithm to efficiently transform the data into Fourier space.
+The computation time was further reduced by \etal{O'Brien} using a non-uniform fast Fourier transform (FFT) algorithm to efficiently transform the data into Fourier space.
 Therefore, the data is not required to be on a grid.

 % binning => FFT
 In general, it is desirable to omit a grid, as the data points do not necessarily fall onto equally spaced points.
 However, reducing the sample size by distributing the data on an equidistant grid can significantly reduce the computation time, if an approximate KDE is acceptable.
-Silverman \cite{silverman1982algorithm} originally suggested to combine adjacent data points into data bins and apply a FFT to quickly compute the estimate.
-This approximation scheme was later called binned KDE an was extensively studied \cite{fan1994fast} \cite{wand1994fast} \cite{hall1996accuracy} \cite{holmstrom2000accuracy}.
+Silverman \cite{silverman1982algorithm} originally suggested combining adjacent data points into data bins, which results in a discrete convolution structure of the KDE.
+This structure allows the estimate to be computed efficiently using an FFT algorithm.
+This approximation scheme was later called binned KDE (BKDE) and was extensively studied \cite{fan1994fast} \cite{wand1994fast} \cite{hall1996accuracy} \cite{holmstrom2000accuracy}.

 The idea to approximate a Gaussian filter using several box filters was first formulated by Wells \cite{wells1986efficient}.
-Kovesi \cite{kovesi2010fast} suggested to use two box filter with different widths to increase accuracy maintaining the same complexity.
+Kovesi \cite{kovesi2010fast} suggested using two box filters with different widths to increase accuracy while maintaining the same complexity.
 To eliminate the approximation error completely, \etal{Gwosdek} \cite{gwosdek2011theoretical} proposed a new approach called the extended box filter.

+This work highlights the discrete convolution structure of the BKDE and elaborates on its connection to digital signal processing, in particular to the Gaussian filter.
+This results in an equivalence between the BKDE and the Gaussian filter.
+It follows that the above-mentioned box filter approach also approximates the BKDE, which yields the efficient computation scheme presented in this paper.
+This approach has a lower complexity than comparable FFT-based algorithms and introduces only a negligibly small error, while improving the performance significantly.
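
The discrete convolution structure described in the added paragraph can be illustrated with a minimal Python sketch. It is not the implementation from the paper: the function names `naive_kde` and `binned_kde`, the nearest-grid-point (simple) binning rule, and the kernel truncation at four bandwidths are all assumptions made for illustration. The convolution is computed directly with `numpy.convolve`; an FFT-based or box-filter-based variant would replace only that one call.

```python
import numpy as np


def gauss(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u * u) / np.sqrt(2.0 * np.pi)


def naive_kde(samples, grid, h):
    """Naive KDE: evaluates the kernel M * N times (M grid points, N samples)."""
    u = (grid[:, None] - samples[None, :]) / h
    return gauss(u).sum(axis=1) / (len(samples) * h)


def binned_kde(samples, grid, h, cutoff=4.0):
    """Binned KDE on an equidistant grid (simple binning, an assumption here).

    Each sample is assigned to its nearest grid point; the estimate is then
    the discrete convolution of the bin counts with the sampled kernel.
    """
    delta = grid[1] - grid[0]
    # Simple binning: index of the nearest grid point, clipped to the grid.
    idx = np.rint((samples - grid[0]) / delta).astype(int)
    idx = np.clip(idx, 0, len(grid) - 1)
    counts = np.bincount(idx, minlength=len(grid)).astype(float)
    # Kernel sampled at grid offsets, truncated at `cutoff` bandwidths.
    half = int(np.ceil(cutoff * h / delta))
    offsets = np.arange(-half, half + 1) * delta
    weights = gauss(offsets / h) / (len(samples) * h)
    # Discrete convolution; requires len(weights) <= len(grid) so that
    # mode="same" returns one value per grid point.
    return np.convolve(counts, weights, mode="same")
```

With, for example, 2000 standard-normal samples, a 512-point grid on $[-5, 5]$ and $h = 0.3$, both functions return estimates on the same grid, and the binned version differs from the naive one only by the binning and truncation error.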
+
+
+
+
+