Fixed FE 1

MBulli
2018-03-12 22:21:39 +01:00
parent c224967b19
commit 316b1d2911
11 changed files with 76 additions and 72 deletions

@@ -13,7 +13,7 @@
%In contrast,
The KDE is often the preferred tool to estimate a density function from discrete data samples because of its flexibility and ability to produce a continuous estimate.
%
-Given a univariate random sample set $X=\{X_1, \dots, X_N\}$, where $X$ has the density function $f$ and let $w_1, \dots w_N$ be associated weights.
+Given an univariate random sample set $X=\{X_1, \dots, X_N\}$, where $X$ has the density function $f$ and let $w_1, \dots w_N$ be associated weights.
The kernel estimator $\hat{f}$ which estimates $f$ at the point $x$ is given as
\begin{equation}
\label{eq:kde}
@@ -31,7 +31,7 @@ As a matter of fact, the quality of the kernel estimate is primarily determined
%
%Any non-optimal bandwidth causes undersmoothing or oversmoothing.
%An undersmoothing estimator has a large variance and hence a small $h$ leads to undersmoothing.
-%On the other hand given a large $h$ the bias increases, which leads to oversmoothing \cite[7]{Cybakov2009}.
+%On the other hand given a large $h$ the bias increases, which leads to oversmoothing \cite{Cybakov2009}.
%Clearly with an adverse choice of the bandwidth crucial information like modality might get smoothed out.
%All in all it is not obvious to determine a good choice of the bandwidth.
%
@@ -50,16 +50,17 @@ The Gaussian kernel is given as
K_G(u)=\frac{1}{\sqrt{2\pi}} \expp{- \frac{u^2}{2} } \text{.}
\end{equation}
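As an aside, the weighted kernel estimator with the Gaussian kernel $K_G$ above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation; since the full estimator formula is elided in this excerpt, the normalisation $1/(h \sum_i w_i)$ is assumed from the standard weighted KDE:

```python
import numpy as np

def gaussian_kernel(u):
    # K_G(u) = exp(-u^2 / 2) / sqrt(2*pi), as in the text.
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def weighted_kde(x, samples, weights, h):
    # Evaluate hat{f}(x) = (1 / (h * sum_i w_i)) * sum_i w_i * K_G((x - X_i) / h)
    # at each of the M points in x; assumed standard weighted-KDE form.
    x = np.asarray(x, dtype=float)
    samples = np.asarray(samples, dtype=float)
    weights = np.asarray(weights, dtype=float)
    u = (x[:, None] - samples[None, :]) / h        # shape (M, N)
    return (gaussian_kernel(u) * weights).sum(axis=1) / (h * weights.sum())
```

Evaluating all $M$ points against all $N$ samples directly, as here, is the $O(MN)$ cost that motivates the binned approximation discussed next.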
-The flexibility of the KDE comes at the expense of computational efficiency, which leads to the development of more efficient computation schemes.
-The computation time depends, besides the number of calculated points $M$, on the input size, namely the number of data points $N$.
-In general, reducing the size of the sample negatively affects the accuracy of the estimate.
-Still, the sample size is a suitable parameter to speed up the computation.
+The flexibility of the KDE comes at the expense of computation speed, which leads to the development of more efficient computation schemes.
+The computation time depends, besides the number of calculated points $M$, on the input size, namely the size of sample $N$.
+In general, reducing the size of the sample set negatively affects the accuracy of the estimate.
+Still, $N$ is a suitable parameter to speed up the computation.
-Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE.
+The BKDE reduces $N$ by combining each single sample with its adjacent samples into bins, and thus, approximates the KDE.
+%Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE.
Each bin represents the count of the sample set at a given point of an equidistant grid with spacing $\delta$.
-A binning rule distributes a sample among the grid points $g_j=j\delta$, indexed by $j\in\Z$.
+A binning rule distributes each sample among the grid points $g_j=j\delta$, indexed by $j\in\Z$.
% and can be represented as a set of functions $\{ w_j(x,\delta), j\in\Z \}$.
-Computation requires a finite grid on the interval $[a,b]$ containing the data, thus the number of grid points is $G=(b-a)/\delta+1$.
+Computation requires a finite grid on the interval $[a,b]$ containing the data, thus the number of grid points is $G=(b-a)/\delta+1$ \cite{hall1996accuracy}.
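For illustration, one common concrete choice of binning rule is *linear binning*, which splits each sample's weight between its two neighbouring grid points in proportion to proximity. The text does not fix a particular rule $r_j$, so the sketch below is an assumed example; it also anchors the grid at $a$ (i.e. $g_j = a + j\delta$) to keep indices non-negative:

```python
import numpy as np

def linear_binning(samples, weights, a, b, delta):
    # Grid points g_j = a + j*delta for j = 0..G-1, with G = (b-a)/delta + 1.
    G = int(round((b - a) / delta)) + 1
    counts = np.zeros(G)
    samples = np.asarray(samples, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Position of each sample in grid units; j is the left neighbour.
    pos = (samples - a) / delta
    j = np.clip(np.floor(pos).astype(int), 0, G - 2)
    frac = pos - j                                  # distance past g_j, in [0, 1]
    # Split each weight between g_j and g_{j+1}; np.add.at accumulates
    # correctly even when several samples share a grid point.
    np.add.at(counts, j, weights * (1.0 - frac))
    np.add.at(counts, j + 1, weights * frac)
    return counts
```

Linear binning preserves the total sample weight, so the resulting bin counts can stand in for the original samples in the BKDE formula below.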
Given a binning rule $r_j$ the BKDE $\tilde{f}$ of a density $f$ computed pointwise at the grid point $g_x$ is given as
\begin{equation}