Minor changes to wording

2018-05-08 11:08:48 +02:00
parent 35e1f4e0b3
commit 9f358d69c9
5 changed files with 17 additions and 19 deletions

@@ -22,7 +22,7 @@ The kernel estimator $\hat{f}$ which estimates $f$ at the point $x$ is given as
where $W=\sum_{i=1}^{N}w_i$ and $h\in\R^+$ is an arbitrary smoothing parameter called the bandwidth.
$K$ is a kernel function such that $\int K(u) \dop{u} = 1$.
In general, any kernel can be used; however, common advice is to choose a symmetric, low-order polynomial kernel.
-Thus, several popular kernel functions are used in practice, like the Uniform, Gaussian, Epanechnikov, or Silverman kernel \cite{scott2015}.
+Several popular kernel functions are used in practice, such as the Uniform, Gaussian, Epanechnikov, or Silverman kernel \cite{scott2015}.
While the kernel estimate inherits all the properties of the kernel, it is usually not of crucial importance if a non-optimal kernel is chosen.
As a matter of fact, the quality of the kernel estimate is primarily determined by the smoothing parameter $h$ \cite{scott2015}.
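
For illustration, the weighted estimator described above can be sketched in a few lines of Python, assuming the standard form $\hat{f}(x)=\frac{1}{W}\sum_{i=1}^{N}\frac{w_i}{h}\,K\!\left(\frac{x-x_i}{h}\right)$ with a Gaussian $K$; all names and parameter values below are illustrative and not taken from this work.
\begin{verbatim}
import numpy as np

def gaussian_kernel(u):
    # Standard Gaussian kernel; integrates to 1.
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def weighted_kde(x, samples, weights, h):
    # f_hat(x) = (1 / W) * sum_i (w_i / h) * K((x - x_i) / h)
    W = weights.sum()
    u = (x - samples) / h
    return np.sum(weights * gaussian_kernel(u)) / (W * h)

rng = np.random.default_rng(0)
samples = rng.normal(size=1000)          # toy data
weights = np.ones_like(samples)          # unit weights, so W = N
print(weighted_kde(0.0, samples, weights, h=0.3))
\end{verbatim}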
@@ -41,7 +41,7 @@ As a matter of fact, the quality of the kernel estimate is primarily determined
% TODO for certain reasons, the bandwidth is assumed to be given here
%
%As mentioned above, the particular choice of the kernel is only of minor importance as it affects the overall result in a negligible way.
-It is common practice to suspect that the data is approximately Gaussian, and therefore the Gaussian kernel is frequently used.
+It is common practice to assume that the data is approximately Gaussian, hence the Gaussian kernel is frequently used.
%Note that this assumption is different from assuming a concrete distribution family such as a Gaussian or mixture distribution.
In this work we choose the Gaussian kernel for reasons of computational efficiency, as our approach is based on approximating the Gaussian filter.
The Gaussian kernel is given as
@@ -109,15 +109,15 @@ This reduces the number of kernel evaluations to $\landau{G}$, but the number of
Using the FFT to perform the discrete convolution reduces the complexity further to $\landau{G\log{G}}$ \cite{silverman1982algorithm}.%, which is currently the fastest exact BKDE algorithm.
The \mbox{FFT-convolution} approach is usually highlighted as a striking computational benefit of the BKDE.
-However, for this work it is the key to recognize the discrete convolution structure of \eqref{eq:binKde}, as this allows to interpret the computation of a density estimate as a signal filter problem.
+However, for this work it is key to recognize the discrete convolution structure of \eqref{eq:binKde}, as it allows the computation of a density estimate to be interpreted as a signal filtering problem.
This makes it possible to apply a wide range of well-studied techniques from the broad field of digital signal processing (DSP).
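
As a short illustration of this filtering view, the binned counts can be convolved with a sampled Gaussian kernel via the FFT, which realises the $\landau{G\log{G}}$ route mentioned above; the grid construction, bandwidth, and binning rule in the following Python sketch are simplified assumptions and do not reproduce the implementation used in this work.
\begin{verbatim}
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
samples = rng.normal(size=10_000)
G, h = 512, 0.2                            # grid size and bandwidth (arbitrary)

# Bin the data onto an equally spaced grid of G points (simple binning).
counts, edges = np.histogram(samples, bins=G)
grid = 0.5 * (edges[:-1] + edges[1:])      # bin centres g_1, ..., g_G
delta = grid[1] - grid[0]

# Sample the Gaussian kernel at every possible grid offset.
offsets = delta * np.arange(-(G - 1), G)
kernel = np.exp(-0.5 * (offsets / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

# Discrete convolution via the FFT: O(G log G) instead of O(G^2).
f_tilde = fftconvolve(counts, kernel, mode="same") / counts.sum()
\end{verbatim}
The vector \texttt{f\_tilde} then approximates the binned estimate at the grid points.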
-Using the Gaussian kernel from \eqref{eq:gausKern} in conjunction with \eqref{eq:binKde} results in the following equation
+Using the Gaussian kernel from \eqref{eq:gausKern} in conjunction with \eqref{eq:binKde} gives
\begin{equation}
\label{eq:bkdeGaus}
\tilde{f}(g_x)=\frac{1}{W\sqrt{2\pi}} \sum_{j=1}^{G} \frac{C_j}{h} \expp{-\frac{(g_x-g_j)^2}{2h^2}} \text{.}
\end{equation}
-The above formula is a convolution operation of the data and the Gaussian kernel.
+The above formula is a convolution of the data and the Gaussian kernel.
More precisely, it is a discrete convolution of the finite data grid and the Gaussian function.
In terms of DSP, this is analogous to filtering the binned data with a Gaussian filter.
This finding makes it possible to speed up the computation of the density estimate by using a fast approximation scheme based on iterated box filters.
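
The basic principle behind the box-filter approximation can also be sketched: by the central limit theorem, repeated convolution with a uniform (box) window approaches a Gaussian shape, so a few cheap moving-average passes can stand in for a Gaussian filter. The Python sketch below illustrates only this general idea, not the scheme developed in this work; the box width, the number of passes, and the boundary handling are arbitrary illustrative choices.
\begin{verbatim}
import numpy as np

def box_filter(signal, width):
    # One moving-average pass of odd width (zero padding at the boundaries).
    return np.convolve(signal, np.ones(width) / width, mode="same")

def iterated_box_smooth(signal, h, delta, passes=3):
    # Choose a box width so that `passes` passes have a total variance of
    # roughly (h / delta)^2 bins; one box of width w has variance (w^2 - 1) / 12.
    sigma_bins = h / delta
    width = int(round(np.sqrt(12.0 * sigma_bins**2 / passes + 1.0)))
    width += (width + 1) % 2      # force an odd width to keep the filter centred
    for _ in range(passes):
        signal = box_filter(signal, width)
    return signal
\end{verbatim}
Each box pass can be realised as a running sum in $\landau{G}$ time independent of $h$, which is where the speed-up over direct kernel evaluation comes from; \texttt{np.convolve} is used above only for brevity.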