Intro & related work
@@ -1,2 +1,27 @@
|
|||||||
\section{Experiments}
|
\section{Experiments}
|
||||||
|
We now empirically evaluate the accuracy of our method and compare its runtime performance with other state-of-the-art approaches.
|
||||||
|
To conclude our findings, we present a real-world example from an indoor localisation system.
|
||||||
|
|
||||||
|
All tests are performed on an Intel Core \mbox{i5-7600K} CPU with a frequency of $4.5\,\text{GHz}$, which supports the AVX2 instruction set; hence, 256-bit wide SIMD registers are available.
|
||||||
|
We compare our C++ implementation of the box-filter-based KDE to the KernSmooth R package and the \qq{FastKDE} implementation \cite{fastKDE}.
|
||||||
|
The KernSmooth package provides an FFT-based BKDE implementation built on optimized C functions at its core.
|
||||||
|
|
||||||
|
\subsection{Error}
|
||||||
|
To quantify the accuracy of our method, the mean integrated squared error (MISE) is used.
|
||||||
|
The ground truth is given as a synthetic data set drawn from a normal mixture density.
|
||||||
|
Clearly, the choice of the ground truth distribution affects the resulting error.
|
||||||
|
However, as our method approximates the KDE, only its closeness to the KDE is of interest, not its closeness to the ground truth itself.
|
||||||
|
Therefore, the particular choice of the ground truth is only of minor importance here.
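
For reference, the MISE of an estimate $\hat{f}$ with respect to a reference density $f$ is commonly defined as shown below.
The symbols $\hat{f}$ and $f$ are our own notation and are not fixed by the text above; since closeness to the KDE is what matters here, $f$ would presumably be taken to be the exact KDE rather than the ground truth.
\[
  \mathrm{MISE}(\hat{f}) = \mathrm{E}\left[ \int \bigl( \hat{f}(x) - f(x) \bigr)^2 \, dx \right]
\]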
|
||||||
|
|
||||||
|
First, we evaluate the accuracy of our method as a function of the bandwidth $h$ in comparison to the exact KDE and the BKDE.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
% KDE, box filter, exbox as a function of h (figure)
|
||||||
|
% sample size and grid size text
|
||||||
|
% fastKDE error comparison makes no sense because kernel and bandwidth differ
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{Performance}
|
||||||
|
|
||||||
|
\subsection{Real World}
|
||||||
|
|||||||
@@ -28,14 +28,19 @@ We formalize this ...
|
|||||||
Our experiments support our ..
|
Our experiments support our ..
|
||||||
}
|
}
|
||||||
|
|
||||||
In this paper, a novel approximation approach for rapid computation of the KDE is presented.
|
|
||||||
%Therefore, this paper presents a novel approximation approach for rapid computation of the KDE.
|
%Therefore, this paper presents a novel approximation approach for rapid computation of the KDE.
|
||||||
%In this paper, a well known approximation of the Gaussian filter is used to speed up the computation of the KDE.
|
%In this paper, a well known approximation of the Gaussian filter is used to speed up the computation of the KDE.
|
||||||
|
In this paper, a novel approximation approach for rapid computation of the KDE is presented.
|
||||||
|
The basic idea is to interpret the estimation problem as a filtering operation.
|
||||||
|
We show that computing the KDE with a Gaussian kernel on pre-binned data is equivalent to applying a Gaussian filter to the binned data.
|
||||||
|
This allows us to use a well-known approximation scheme for Gaussian filters based on the box filter.
|
||||||
|
Repeated application of a box filter yields an approximate Gaussian filter \cite{kovesi2010fast}.
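
As an illustration of this equivalence, consider the following one-dimensional sketch.
All symbols are our own assumptions and are not fixed by the text above: grid points $g_j = g_0 + j\delta$ with spacing $\delta$, bin counts $c_j$ with $\sum_j c_j = N$, and the Gaussian kernel $K_h(u) = \frac{1}{\sqrt{2\pi}\,h} \exp\bigl(-u^2 / (2h^2)\bigr)$.
Evaluated at a grid point $g_i$, the binned estimate then reads
\[
  \tilde{f}(g_i) = \frac{1}{N} \sum_{j} c_j \, K_h(g_i - g_j) = \frac{1}{N} \sum_{j} c_j \, k_{i-j},
  \qquad k_l = K_h(l\delta),
\]
since $g_i - g_j = (i-j)\delta$.
The right-hand side is the discrete convolution of the bin counts with a sampled Gaussian, that is, a Gaussian filter applied to the binned data.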
|
||||||
|
|
||||||
|
This process converges quite fast to a reasonably close approximation of the ideal Gaussian.
|
||||||
|
In addition, a box filter can be computed extremely fast due to its intrinsic simplicity.
|
||||||
|
While the idea of using several box filter passes to approximate a Gaussian has been around for a long time, its application to obtain a fast KDE is new.
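
To make the speed and convergence arguments concrete, consider a box filter of radius $r$, i.e.\ width $w = 2r + 1$ samples, applied to a sequence $x_i$; the symbols $r$, $w$, $x_i$, $y_i$, $n$ and $\sigma$ are our own notation.
Its output can be updated recursively, so the cost per output sample does not depend on $w$:
\[
  y_i = \frac{1}{w} \sum_{l=-r}^{r} x_{i+l},
  \qquad
  y_i = y_{i-1} + \frac{1}{w} \bigl( x_{i+r} - x_{i-r-1} \bigr).
\]
A single box filter of width $w$ has variance $(w^2 - 1)/12$, so $n$ passes accumulate a variance of $n (w^2 - 1)/12$.
Matching this to a Gaussian with standard deviation $\sigma$ (measured in samples) gives the ideal width
\[
  w_{\mathrm{ideal}} = \sqrt{\frac{12 \sigma^2}{n} + 1},
\]
which is in general not an odd integer and therefore has to be rounded in practice.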
|
||||||
|
% time sequential, fixed computation time, pre binned data!!
|
||||||
|
|
||||||
% KDE wellknown nonparametic estimation method
|
% KDE wellknown nonparametic estimation method
|
||||||
% Flexibility is paid with slow speed
|
% Flexibility is paid with slow speed
|
||||||
|
|||||||
@@ -1,5 +1,47 @@
|
|||||||
\section{Related work}
|
\section{Related work}
|
||||||
% original work rosenblatt/parzen
|
% original work rosenblatt/parzen
|
||||||
|
% slow
|
||||||
|
% other approaches Fast Gaussian Transform
|
||||||
% binned version silverman, scott, härdle
|
% binned version silverman, scott, härdle
|
||||||
% -> Fourier transfom
|
% -> Fourier transfom
|
||||||
% other approaches Fast Gaussian Transform
|
|
||||||
|
|
||||||
|
Kernel density estimation is a well-known non-parametric estimator, originally described independently by Rosenblatt \cite{rosenblatt1956remarks} and Parzen \cite{parzen1962estimation}.
|
||||||
|
It has been the subject of extensive research, and its theoretical properties are well understood.
|
||||||
|
A comprehensive reference is given by Scott \cite{scott2015}.
|
||||||
|
Although classified as non-parametric, the KDE has two free parameters: the kernel function and its bandwidth.
|
||||||
|
The selection of a \qq{good} bandwidth is still an open and heavily researched problem.
|
||||||
|
However, the automatic selection of the bandwidth is not the subject of this work, and we refer to the literature \cite{turlach1993bandwidth}.
|
||||||
|
|
||||||
|
The great flexibility of the KDE renders it very useful for many applications.
|
||||||
|
However, its flexibility comes at the cost of a relatively slow computation speed.
|
||||||
|
A naive implementation of the KDE requires \landau{NM} evaluations of the kernel function, given $N$ data samples and $M$ evaluation points.
|
||||||
|
Therefore, a lot of effort has been put into reducing the computation time of the KDE.
|
||||||
|
Various methods have been proposed, which can be grouped by the technique they employ.
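
The \landau{NM} cost stated above follows directly from the standard form of the estimator.
As a sketch in our own notation, with samples $X_1, \dots, X_N$, evaluation points $x_1, \dots, x_M$, kernel $K$ and bandwidth $h$ for the one-dimensional case:
\[
  \hat{f}(x_j) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left( \frac{x_j - X_i}{h} \right),
  \qquad j = 1, \dots, M.
\]
Each of the $M$ output values is a sum over all $N$ samples, which amounts to $N \cdot M$ kernel evaluations in total.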
|
||||||
|
|
||||||
|
% k-nearest neighbor searching
|
||||||
|
An obvious way to speed up the computation is to reduce the number of evaluated kernel functions.
|
||||||
|
One possible optimization is based on k-nearest neighbour search performed on spatial data structures.
|
||||||
|
These algorithms reduce the number of evaluated kernels by taking the spatial distance between clusters of data points into account \cite{gray2003nonparametric}.
|
||||||
|
|
||||||
|
% fast multipole method & Fast Gauss Transform
|
||||||
|
Another approach is to reduce the algorithmic complexity of the sum over Gaussian functions, by employing a specialized variant of the fast multipole method.
|
||||||
|
The term fast Gauss transform was coined by Greengard \cite{greengard1991fast}, who suggested this approach to reduce the complexity of the KDE to \landau{N+M}.
|
||||||
|
% However, the complexity grows exponentially with dimension. \cite{Improved Fast Gauss Transform and Efficient Kernel Density Estimation}
|
||||||
|
|
||||||
|
% FastKDE, based on ECF and nuFFT
|
||||||
|
Recent methods based on the \qq{self-consistent} KDE proposed by Bernacchia and Pigolotti make it possible to obtain an estimate without any prior assumptions.
|
||||||
|
They define a Fourier-based filter on the empirical characteristic function of a given dataset.
|
||||||
|
The computation time was further reduced by \etal{O'Brien} using a non-uniform FFT algorithm to efficiently transform the data into Fourier space.
|
||||||
|
Therefore, the data is not required to be on a grid.
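
For reference, the empirical characteristic function (ECF) of the samples $X_1, \dots, X_N$ is given below in our own notation ($\Delta$, $\kappa$ and $\hat{\phi}$ are assumed symbols, not taken from the cited works).
Any kernel estimate can be written in Fourier space as the ECF multiplied by a filter $\kappa(t)$, followed by an inverse Fourier transform; the self-consistent approach differs in how $\kappa$ is chosen, which is not reproduced here.
\[
  \Delta(t) = \frac{1}{N} \sum_{j=1}^{N} e^{\mathrm{i} t X_j},
  \qquad
  \hat{\phi}(t) = \kappa(t) \, \Delta(t).
\]
Because $\Delta(t)$ is a sum over the raw sample positions $X_j$, evaluating it does not require the data to lie on a grid.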
|
||||||
|
|
||||||
|
% binning => FFT
|
||||||
|
In general, it is desirable to omit a grid, as the data points do not necessarily fall onto equally spaced points.
|
||||||
|
However, reducing the sample size by distributing the data on an equidistant grid can significantly reduce the computation time if an approximate KDE is acceptable.
|
||||||
|
Silverman \cite{silverman1982algorithm} originally suggested combining adjacent data points into data bins and applying an FFT to quickly compute the estimate.
|
||||||
|
This approximation scheme was later called binned KDE and was extensively studied \cite{fan1994fast,wand1994fast,hall1996accuracy,holmstrom2000accuracy}.
|
||||||
|
|
||||||
|
The idea to approximate a Gaussian filter using several box filters was first formulated by Wells \cite{wells1986efficient}.
|
||||||
|
Kovesi \cite{kovesi2010fast} suggested using two box filters with different widths to increase accuracy while maintaining the same complexity.
|
||||||
|
To eliminate the approximation error completely, \etal{Gwosdek} \cite{gwosdek2011theoretical} proposed a new approach called the extended box filter.
|
||||||
|
|
||||||
|
|||||||
@@ -2890,4 +2890,48 @@ year = {2003}
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@inproceedings{kovesi2010fast,
|
||||||
|
title={Fast almost-{G}aussian filtering},
|
||||||
|
author={Kovesi, Peter},
|
||||||
|
booktitle={Proceedings of the 2010 International Conference on Digital Image Computing: Techniques and Applications},
|
||||||
|
pages={121--125},
|
||||||
|
year={2010},
|
||||||
|
publisher={IEEE}
|
||||||
|
}
|
||||||
|
|
||||||
|
@book{turlach1993bandwidth,
|
||||||
|
title={Bandwidth selection in kernel density estimation: A review},
|
||||||
|
author={Turlach, Berwin A.},
|
||||||
|
year={1993},
|
||||||
|
publisher={CORE and Institut de Statistique Universit{\'e} catholique de Louvain Louvain-la-Neuve}
|
||||||
|
}
|
||||||
|
|
||||||
|
@inproceedings{gray2003nonparametric,
|
||||||
|
title={Nonparametric density estimation: Toward computational tractability},
|
||||||
|
author={Gray, Alexander G and Moore, Andrew W},
|
||||||
|
booktitle={Proceedings of the 2003 SIAM International Conference on Data Mining},
|
||||||
|
pages={203--211},
|
||||||
|
year={2003},
|
||||||
|
organization={SIAM}
|
||||||
|
}
|
||||||
|
|
||||||
|
@article{greengard1991fast,
|
||||||
|
title={The fast Gauss transform},
|
||||||
|
author={Greengard, Leslie and Strain, John},
|
||||||
|
journal={SIAM Journal on Scientific and Statistical Computing},
|
||||||
|
volume={12},
|
||||||
|
number={1},
|
||||||
|
pages={79--94},
|
||||||
|
year={1991},
|
||||||
|
publisher={SIAM}
|
||||||
|
}
|
||||||
|
|
||||||
|
@article{wells1986efficient,
|
||||||
|
title={Efficient synthesis of Gaussian filters by cascaded uniform filters},
|
||||||
|
author={Wells, William M.},
|
||||||
|
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
|
||||||
|
number={2},
|
||||||
|
pages={234--239},
|
||||||
|
year={1986},
|
||||||
|
publisher={IEEE}
|
||||||
|
}
|
||||||