Fixed many bugs

2018-02-27 10:49:05 +01:00
parent 9d4927a365
commit 1fb9461a5f
8 changed files with 67 additions and 68 deletions


@@ -6,7 +6,7 @@
% -> Fourier transform
Kernel density estimation is well known non-parametric estimator, originally described independently by Rosenblatt \cite{rosenblatt1956remarks} and Parzen \cite{parzen1962estimation}.
The kernel density estimator (KDE) is a well-known non-parametric estimator, originally described independently by Rosenblatt \cite{rosenblatt1956remarks} and Parzen \cite{parzen1962estimation}.
It has been the subject of extensive research, and its theoretical properties are well understood.
A comprehensive reference is given by Scott \cite{scott2015}.
Although classified as non-parametric, the KDE depends on two free parameters: the kernel function and its bandwidth.
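For reference, the estimator in its standard Rosenblatt--Parzen form, for samples $x_1,\dots,x_n$, a kernel $K$, and a bandwidth $h > 0$, can be sketched as
% Notation ($x_i$, $K$, $h$) chosen here for illustration and may differ from the rest of the paper.
\begin{equation}
	\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),
\end{equation}
so both the shape of $K$ and the scale $h$ enter the estimate directly.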
@@ -24,7 +24,7 @@ Various methods have been proposed, which can be clustered based on different te
% k-nearest neighbor searching
An obvious way to speed up the computation is to reduce the number of evaluated kernel functions.
One possible optimization is based on k-nearest neighbour search performed on spatial data structures.
One possible optimization is based on k-nearest neighbour search, performed on spatial data structures.
These algorithms reduce the number of evaluated kernels by taking the distance between clusters of data points into account \cite{gray2003nonparametric}.
% fast multipole method & Fast Gaus Transform
@@ -38,16 +38,16 @@ They define a Fourier-based filter on the empirical characteristic function of a
The computation time was further reduced by \etal{O'Brien} using a non-uniform fast Fourier transform (FFT) algorithm to efficiently transform the data into Fourier space \cite{oBrien2016fast}.
% binning => FFT
In general, it is desirable to omit a grid, as the data points do not necessary fall onto equally spaced points.
However, reducing the sample size by distributing the data on a equidistant grid can significantly reduce the computation time, if an approximative KDE is acceptable.
In general, it is desirable to omit a grid, as the data points do not necessarily fall onto equally spaced points.
However, reducing the sample size by distributing the data on an equidistant grid can significantly lower the computation time if an approximate KDE is acceptable.
Silverman \cite{silverman1982algorithm} originally suggested combining adjacent data points into data bins, which results in a discrete convolution structure of the KDE.
This allows the estimate to be computed efficiently using an FFT algorithm.
This approximation scheme was later called the binned KDE (BKDE) and has been studied extensively \cite{fan1994fast,wand1994fast,hall1996accuracy}.
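As a brief sketch of the binned structure (the notation is chosen here for illustration and need not match the rest of the paper): with bin counts $c_k$ collected on an equidistant grid $g_k = g_0 + k\delta$, the binned estimator evaluated on the grid takes the form of a discrete convolution,
% Simple-binning form; $c_k$ counts the samples assigned to grid point $g_k$, $\delta$ is the grid spacing.
\begin{equation}
	\tilde{f}_h(g_j) = \frac{1}{nh} \sum_{k} c_k \, K\!\left(\frac{(j - k)\,\delta}{h}\right),
\end{equation}
which can be evaluated for all grid points at once by multiplying the discrete Fourier transforms of the bin counts and of the sampled kernel.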
While the FFT algorithm poses an efficient algorithm for large sample sets, it adds an noticeable overhead for smaller ones.
While the FFT algorithm constitutes an efficient approach for large sample sets, it adds a noticeable overhead for smaller ones.
The idea to approximate a Gaussian filter using several box filters was first formulated by Wells \cite{wells1986efficient}.
Kovesi \cite{kovesi2010fast} suggested using two box filters with different widths to increase accuracy while maintaining the same complexity.
To eliminate the approximation error completely \etal{Gwosdek} \cite{gwosdek2011theoretical} proposed a new approach called extended box filter.
To eliminate the approximation error completely, \etal{Gwosdek} \cite{gwosdek2011theoretical} proposed a new approach called the extended box filter.
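The underlying relation, stated here only as a reminder and with constants that may differ from those used in the cited works, is that $n$ passes of a box filter of odd width $w$ approximate a Gaussian filter whose variance is the sum of the individual box variances,
% Variance of a discrete box of width $w$ is $(w^2 - 1)/12$; variances add under convolution.
\begin{equation}
	\sigma^2 \approx n\,\frac{w^2 - 1}{12},
\end{equation}
since convolving filters adds their variances.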
This work highlights the discrete convolution structure of the BKDE and elaborates on its connection to digital signal processing, in particular the Gaussian filter.
Accordingly, this leads to an equivalence relation between the BKDE and the Gaussian filter.