From b9711be4fbd0628a12b46f593c8404151b4fae6f Mon Sep 17 00:00:00 2001
From: Markus Bullmann
Date: Wed, 14 Feb 2018 18:18:10 +0100
Subject: [PATCH] More box

---
 tex/chapters/mvg.tex | 54 ++++++++++++++++++++++++++------------------
 tex/egbib.bib        | 11 +++++++++
 2 files changed, 43 insertions(+), 22 deletions(-)

diff --git a/tex/chapters/mvg.tex b/tex/chapters/mvg.tex
index de095e6..64c184c 100644
--- a/tex/chapters/mvg.tex
+++ b/tex/chapters/mvg.tex
@@ -5,7 +5,8 @@
 % Repetitive Box filter to approx Gauss
 % Simple multipass, n/m approach, extended box filter
 
-The Gaussian filter is a popular filter to smooth a signal by convoluting an input signal with the Gaussian function
+The Gaussian filter is a widely used smoothing filter.
+It is defined as the convolution of an input signal with the Gaussian function
 \begin{equation}
 \label{eq:gausfx}
 g(x) = \frac{1}{\sigma \sqrt{2\pi}} \expp{-\frac{x^2}{2\sigma^2}} \text{,}
@@ -13,24 +14,23 @@ g(x) = \frac{1}{\sigma \sqrt{2\pi}} \expp{-\frac{x^2}{2\sigma^2}} \text{,}
 where $\sigma$ is a smoothing parameter called standard deviation.
 In the discrete case the Gaussian filter is easily computed with the sliding window algorithm in time domain.
 
-Convolution is separable if the filter kernel is separable, i.e. multidimensional convolution can be computed as individual one-dimensional convolutions with a one-dimensional kernel.
+It is easily extended to multi-dimensional signals, as convolution is separable if the filter kernel is separable, i.e.\ multi-dimensional convolution can be computed as a sequence of one-dimensional convolutions with one-dimensional kernels.
 Because of $\operatorname{e}^{x^2+y^2} = \operatorname{e}^{x^2}\cdot\operatorname{e}^{y^2}$ the Gaussian filter is separable and can be easily applied to multi-dimensional signals.
-Separability of a filter can also be used to reduce the number of required operations to compute the filter result.
 
 % TODO ähnlichkeit Gauss und KDE -> schneller Gaus = schnelle KDE
 
-
 Computation of a filter using the a naive implementation of the sliding window algorithm yields $\landau{NK}$, where $N$ is the length of the input signal and $K$ is the size of the filter kernel.
 Note that in the case of the Gaussian filter $K$ depends on $\sigma$.
 In order to capture all significant values of the Gaussian function the kernel size $K$ must be adopted to the standard deviation of the Gaussian.
 
-Another approach to efficiently compute a filter result is the FFT-convoultion algorithm which is a $\landau{N\log(N)}$ operation.
+A popular approach to efficiently compute a filter result is the FFT-convolution algorithm, which runs in $\landau{N\log(N)}$.
 For large values of $\sigma$ the computation time of the Gaussian filter might be reduced by applying the filter in frequency domain.
-Both signals are transformed into frequency domain using the FFT.
-The filtered result is equal to the point-wise multiplication of the transformed signals.
-In case of the Gaussian filter the Fourier transform of the kernel can be saved, as the Gaussian is a eigenfunction for the Fourier transform.
+In order to do so, both signals are transformed into the frequency domain using the FFT.
+The convolution in the time domain is then equivalent to a point-wise multiplication in the frequency domain, followed by an inverse transform.
+In the case of the Gaussian filter the Fourier transform of the kernel does not have to be computed numerically, as the Fourier transform of a Gaussian is again a Gaussian and known in closed form \cite{?}.
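+For illustration, this scheme can be sketched in a few lines of Python; the sketch assumes NumPy, a one-dimensional signal, periodic boundary handling and $\sigma$ given in samples, and the function name is chosen for this example only:
+\begin{verbatim}
+import numpy as np
+
+def gaussian_fft_smooth(x, sigma):
+    # Spectrum of the input signal.
+    X = np.fft.rfft(x)
+    # Frequencies in cycles per sample.
+    f = np.fft.rfftfreq(len(x))
+    # Closed-form Fourier transform of the Gaussian kernel.
+    G = np.exp(-2.0 * (np.pi * sigma * f) ** 2)
+    # Point-wise product and inverse transform.
+    return np.fft.irfft(X * G, len(x))
+\end{verbatim}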
+While the FFT-convolution algorithm is efficient for large signals, it adds a noticeable overhead for small ones.
 
 While the above mentions algorithms poses efficient computations schemes to compute an exact filter result, approximative algorithms can further speed up the computation.
 A well-known rapid approximation of the Guassian filter is given by the moving average filter.
@@ -39,24 +39,33 @@ A well-known rapid approximation of the Guassian filter is given by the moving a
 The moving average filter is a simplistic filter which takes an input function $x$ and produces a second function $y$.
 A single output value is computed by taking the average of a number of values symmetrical around a single point in the input.
 The number of values in the average can also be seen as the width $w=2r+1$, where $r$ is the \qq{radius} of the filter.
-The computation of an output value using a moving average filter of radius $r$ is defined as
+The computation of a single output value using a moving average filter of radius $r$ is defined as
 \begin{equation}
 \label{eq:symMovAvg}
 y[i]=\frac{1}{2r+1} \sum_{j=-r}^{r}x[i+j] \text{.}
 \end{equation}
-% TODO O(N) complexity
-
+Computed naively, such a filter requires $\landau{Nw}$ operations, where $N$ is the signal length.
 It is well-known that a moving average filter can approximate a Gaussian filter by repetitive recursive computations.
 \eqref{eq:symMovAvg} is equal to convolution of the input signal and the rectangular function.
-Given by the central theorem of probabilistic repetitive convolution of a rectangular function with itself will yield a Gaussian in the limit.
-Filtering a signal with the moving average filter in several passes approximately converges to a Gaussian filter.
-In practice three or five iterations are most likely enough to obtain a reasonable close Gaussian approximation.
-This allows rapid approximations of the Gaussian filter as the moving average can be computed with a few additions and multiplications.
-Opposed to the Gaussian function were exponential functions need to be evaluated for every output value.
+By the central limit theorem, repeated convolution of a rectangular function with itself converges to a Gaussian.
+Likewise, filtering a signal with the moving average filter in several passes approximates a Gaussian filter.
+In practice three or five iterations are usually enough to obtain a reasonably close Gaussian approximation \cite{kovesi2010fast}.
+This allows a rapid approximation of the Gaussian filter, as the moving average only requires a few additions and multiplications per output value,
+whereas the Gaussian filter requires evaluations of the exponential function.
+A single output value of the moving average can be computed even more efficiently with the recursive scheme
+\begin{equation}
+\label{eq:recMovAvg}
+	y[i] = y[i-1] + \frac{x[i+r] - x[i-r-1]}{2r+1} \text{.}
+\end{equation}
+This recursive calculation scheme further reduces the time complexity of the moving average filter to $\landau{N}$ by reusing already computed values.
+Furthermore, each output value requires only one addition, one subtraction and a multiplication with the constant factor $\frac{1}{2r+1}$.
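+The recursive scheme and the repeated box passes can be sketched in Python as follows; the sketch assumes NumPy, border handling by edge replication, and illustrative function names:
+\begin{verbatim}
+import numpy as np
+
+def box_filter(x, r):
+    # One moving-average pass of radius r; each output value reuses
+    # the previous one with a single addition and subtraction.
+    n = len(x)
+    xp = np.pad(x, r + 1, mode="edge")    # replicate the borders
+    y = np.empty(n)
+    acc = xp[1:2 * r + 2].sum()           # window around the first sample
+    y[0] = acc
+    for i in range(1, n):
+        acc += xp[i + 2 * r + 1] - xp[i]  # slide the window by one sample
+        y[i] = acc
+    return y / (2 * r + 1)
+
+def approx_gaussian(x, r, passes=3):
+    # Repeated box passes converge towards a Gaussian filter.
+    for _ in range(passes):
+        x = box_filter(x, r)
+    return x
+\end{verbatim}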
+
+
+Given this fast computation scheme, it remains to construct a box filter that matches a given Gaussian filter.
 As given in \eqref{eq:gausfx} the solely parameter of the Gaussian kernel is the standard deviation $\sigma$.
-In contrast the moving average filter is parametrized by its width $w$.
+In contrast, the moving average filter is parametrized by its width $w$.
 Therefore, in order to approximate the Gaussian filter of a given $\sigma$ a corresponding value of $w$ must be found.
 Given $n$ iterations of moving average filters with identical widths the ideal width $\wideal$, as suggested by Wells~\cite{wells1986efficient}, is
 \begin{equation}
@@ -64,8 +73,8 @@ Given $n$ iterations of moving average filters with identical widths the ideal w
 	\wideal = \sqrt{\frac{12\sigma^2}{n}+1} \text{.}
 \end{equation}
 
-In general $\wideal$ can be any real number but the moving average filter in \eqref{eq:symMovAvg} is restricted to odd integer values.
-Hence the set of possible approximated standard deviations is limited, because the ideal width has to be rounded to the next valid value.
+In general $\wideal$ can be any real number, but the moving average filter in \eqref{eq:symMovAvg} is restricted to odd integer widths.
+Hence, the set of possible approximated standard deviations is limited, because the ideal width must be rounded to the nearest valid value.
 In order to reduce the rounding error Kovesi~\cite{kovesi2010fast} proposes to perform two box filters with different widths
 \begin{align}
 \label{eq:boxwidthtwo}
@@ -79,18 +88,19 @@ In order to reduce the rounding error Kovesi~\cite{kovesi2010fast} proposes to p
 	\end{split}
 \end{align}
 
-Given $w_l$ and $w_u$ the approximation is done by computing $m$ box filters of width $w_l$ followed by $(n-m)$ box filters of size $w_u$ where $0\le m\le n$.
+Given $w_l$ and $w_u$ the approximation is done by computing $m$ box filters of width $w_l$ followed by $(n-m)$ box filters of width $w_u$, where $0\le m\le n$.
 As all other values are known $m$ can be computed with
 \begin{equation}
 \label{eq:boxrepeatm}
 	m=\frac{12\sigma^2-nw_l^2-4nw_l-3n}{-4w_l-4} \text{.}
 \end{equation}
 
-The approximated sigma as a function of the integer width has staircase shaped graph.
+The approximated $\sigma$ as a function of the integer width has a staircase-shaped graph.
 By reducing the rounding error the step size of the function is reduced.
 However, the overall shape will not change.
 
 \etal{Gwosdek}~\cite{gwosdek2011theoretical} proposed an approach which allows to approximate any real-valued value of $\sigma$.
-This is achieved by
+Just like the conventional box filter, the extended box filter has a uniform value in the range $[-r; r]$; unlike the conventional filter, however, it takes different values at its edges.
+This extension introduces only marginal computational overhead compared to conventional box filtering.
 
diff --git a/tex/egbib.bib b/tex/egbib.bib
index d8b58ad..36fa749 100644
--- a/tex/egbib.bib
+++ b/tex/egbib.bib
@@ -2934,4 +2934,15 @@ year = {2003}
   pages={234--239},
   year={1986},
   publisher={IEEE}
+}
+
+@article{heidenreich2013bandwidth,
+  title={Bandwidth selection for kernel density estimation: A review of fully automatic selectors},
+  author={Heidenreich, Nils-Bastian and Schindler, Anja and Sperlich, Stefan},
+  journal={AStA Advances in Statistical Analysis},
+  volume={97},
+  number={4},
+  pages={403--433},
+  year={2013},
+  publisher={Springer}
 }
\ No newline at end of file
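
For illustration, the two-width selection described in the mvg.tex hunks above can be sketched
in Python as follows. The sketch assumes, as in Kovesi's scheme, that w_l is the largest odd
integer not exceeding the ideal width and that w_u = w_l + 2; the function name is chosen for
this example only.

    import math

    def box_widths(sigma, n=3):
        # Ideal width for n box passes approximating a Gaussian of std. dev. sigma.
        w_ideal = math.sqrt(12.0 * sigma ** 2 / n + 1.0)
        # Largest odd integer not exceeding the ideal width, and the next odd width.
        w_l = int(math.floor(w_ideal))
        if w_l % 2 == 0:
            w_l -= 1
        w_u = w_l + 2
        # Number of passes with the smaller width (eq. boxrepeatm), rounded to an integer.
        m = round((12.0 * sigma ** 2 - n * w_l ** 2 - 4 * n * w_l - 3 * n)
                  / (-4.0 * w_l - 4.0))
        return w_l, w_u, m

    # Example: sigma = 2.0 with n = 3 passes -> m passes of width w_l,
    # followed by n - m passes of width w_u.
    print(box_widths(2.0, 3))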