%In contrast,
The KDE is often the preferred tool to estimate a density function from discrete data samples because of its flexibility and ability to produce a continuous estimate.
%
Given a univariate random sample set $X=\{X_1, \dots, X_N\}$, where $X$ has the density function $f$, let $w_1, \dots, w_N$ be the associated weights.
The kernel estimator $\hat{f}$, which estimates $f$ at the point $x$, is given as
\begin{equation}
\label{eq:kde}
\hat{f}(x) = \frac{1}{h} \sum_{i=1}^{N} w_i \, K\!\left( \frac{x - X_i}{h} \right) \text{,}
\end{equation}
where $h>0$ denotes the bandwidth and $K$ the kernel function.
As a matter of fact, the quality of the kernel estimate is primarily determined by the choice of the bandwidth $h$.
%
%Any non-optimal bandwidth causes undersmoothing or oversmoothing.
%An undersmoothed estimate has a large variance, hence a small $h$ leads to undersmoothing.
%On the other hand given a large $h$ the bias increases, which leads to oversmoothing \cite{Cybakov2009}.
%Clearly with an adverse choice of the bandwidth crucial information like modality might get smoothed out.
%All in all, it is not obvious how to determine a good choice of the bandwidth.
%
The Gaussian kernel is given as
\begin{equation}
K_G(u)=\frac{1}{\sqrt{2\pi}} \expp{- \frac{u^2}{2} } \text{.}
\end{equation}
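For illustration, the direct evaluation of the weighted estimator with the Gaussian kernel can be sketched as follows. This is a minimal sketch, not the paper's implementation; the function and variable names are our own, and the weights are assumed to sum to one:

```python
import numpy as np

def gaussian_kernel(u):
    """Gaussian kernel K_G(u) = exp(-u^2 / 2) / sqrt(2*pi)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kde(x, samples, weights, h):
    """Direct weighted KDE at evaluation point x with bandwidth h.

    Costs O(N) per evaluation point, i.e. O(M * N) for M points;
    the weights are assumed to sum to 1.
    """
    u = (x - samples) / h
    return np.sum(weights * gaussian_kernel(u)) / h

# Toy example: four equally weighted samples.
samples = np.array([0.0, 0.5, 1.0, 1.5])
weights = np.full(4, 0.25)
density = kde(0.75, samples, weights, h=0.5)
```

The $O(M \cdot N)$ cost of this direct scheme is exactly what the binned approximation below reduces.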
The flexibility of the KDE comes at the expense of computation speed, which has led to the development of more efficient computation schemes.
The computation time depends, besides the number of calculated points $M$, on the input size, namely the size of the sample set $N$.
In general, reducing the size of the sample set negatively affects the accuracy of the estimate.
Still, $N$ is a suitable parameter to speed up the computation.
The BKDE reduces $N$ by combining each single sample with its adjacent samples into bins, and thus approximates the KDE.
%Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE.
Each bin represents the sample count at a given point of an equidistant grid with spacing $\delta$.
A binning rule distributes each sample among the grid points $g_j=j\delta$, indexed by $j\in\Z$.
% and can be represented as a set of functions $\{ w_j(x,\delta), j\in\Z \}$.
Computation requires a finite grid on the interval $[a,b]$ containing the data; thus the number of grid points is $G=(b-a)/\delta+1$ \cite{hall1996accuracy}.

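As a concrete example, linear binning is one common binning rule: it splits each sample's weight between its two neighbouring grid points in proportion to proximity. The sketch below is an illustrative assumption, not the paper's definition of the rule, and all names are our own:

```python
import numpy as np

def linear_binning(samples, weights, a, delta, num_grid_points):
    """Distribute each weighted sample between its two neighbouring
    grid points g_j = a + j * delta, proportionally to proximity."""
    counts = np.zeros(num_grid_points)
    pos = (samples - a) / delta      # fractional grid position of each sample
    j = np.floor(pos).astype(int)    # index of the left neighbouring grid point
    frac = pos - j                   # distance to the left grid point, in [0, 1)
    # np.add.at accumulates correctly even when indices repeat.
    np.add.at(counts, j, weights * (1.0 - frac))
    np.add.at(counts, np.minimum(j + 1, num_grid_points - 1), weights * frac)
    return counts

# Samples on [0, 1] with grid spacing delta = 0.5, i.e. G = 3 grid points.
counts = linear_binning(np.array([0.25, 1.0]), np.array([0.5, 0.5]),
                        a=0.0, delta=0.5, num_grid_points=3)
```

Note that the total weight is preserved: the bin counts sum to the sum of the sample weights.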
Given a binning rule $r_j$, the BKDE $\tilde{f}$ of a density $f$, computed pointwise at the grid point $g_x$, is given as
\begin{equation}
\tilde{f}(g_x) = \frac{1}{h} \sum_{j \in \Z} c_j \, K\!\left( \frac{g_x - g_j}{h} \right) \text{,}
\end{equation}
where $c_j$ denotes the bin count that the binning rule assigns to the grid point $g_j$.