Fixed FE 1

MBulli
2018-03-12 22:21:39 +01:00
parent c224967b19
commit 316b1d2911
11 changed files with 76 additions and 72 deletions

@@ -13,7 +13,7 @@
%In contrast,
The KDE is often the preferred tool to estimate a density function from discrete data samples because of its flexibility and ability to produce a continuous estimate.
%
-Given a univariate random sample set $X=\{X_1, \dots, X_N\}$, where $X$ has the density function $f$ and let $w_1, \dots w_N$ be associated weights.
+Given an univariate random sample set $X=\{X_1, \dots, X_N\}$, where $X$ has the density function $f$ and let $w_1, \dots w_N$ be associated weights.
The kernel estimator $\hat{f}$ which estimates $f$ at the point $x$ is given as
\begin{equation}
\label{eq:kde}
@@ -31,7 +31,7 @@ As a matter of fact, the quality of the kernel estimate is primarily determined
%
%Any non-optimal bandwidth causes undersmoothing or oversmoothing.
%An undersmoothing estimator has a large variance and hence a small $h$ leads to undersmoothing.
-%On the other hand given a large $h$ the bias increases, which leads to oversmoothing \cite[7]{Cybakov2009}.
+%On the other hand given a large $h$ the bias increases, which leads to oversmoothing \cite{Cybakov2009}.
%Clearly with an adverse choice of the bandwidth crucial information like modality might get smoothed out.
%All in all it is not obvious to determine a good choice of the bandwidth.
%
@@ -50,16 +50,17 @@ The Gaussian kernel is given as
K_G(u)=\frac{1}{\sqrt{2\pi}} \expp{- \frac{u^2}{2} } \text{.}
\end{equation}
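As an aside, the weighted kernel estimator with the Gaussian kernel $K_G$ above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation; since the full estimator formula is elided in this excerpt, the normalisation $1/(h \sum_i w_i)$ is assumed from the standard weighted KDE:

```python
import numpy as np

def gaussian_kernel(u):
    # K_G(u) = exp(-u^2 / 2) / sqrt(2*pi), as in the text.
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def weighted_kde(x, samples, weights, h):
    # Evaluate hat{f}(x) = (1 / (h * sum_i w_i)) * sum_i w_i * K_G((x - X_i) / h)
    # at each of the M points in x; assumed standard weighted-KDE form.
    x = np.asarray(x, dtype=float)
    samples = np.asarray(samples, dtype=float)
    weights = np.asarray(weights, dtype=float)
    u = (x[:, None] - samples[None, :]) / h        # shape (M, N)
    return (gaussian_kernel(u) * weights).sum(axis=1) / (h * weights.sum())
```

Evaluating all $M$ points against all $N$ samples directly, as here, is the $O(MN)$ cost that motivates the binned approximation discussed next.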
-The flexibility of the KDE comes at the expense of computational efficiency, which leads to the development of more efficient computation schemes.
-The computation time depends, besides the number of calculated points $M$, on the input size, namely the number of data points $N$.
-In general, reducing the size of the sample negatively affects the accuracy of the estimate.
-Still, the sample size is a suitable parameter to speed up the computation.
+The flexibility of the KDE comes at the expense of computation speed, which leads to the development of more efficient computation schemes.
+The computation time depends, besides the number of calculated points $M$, on the input size, namely the size of sample $N$.
+In general, reducing the size of the sample set negatively affects the accuracy of the estimate.
+Still, $N$ is a suitable parameter to speed up the computation.
-Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE.
+The BKDE reduces $N$ by combining each single sample with its adjacent samples into bins, and thus, approximates the KDE.
+%Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE.
Each bin represents the count of the sample set at a given point of an equidistant grid with spacing $\delta$.
-A binning rule distributes a sample among the grid points $g_j=j\delta$, indexed by $j\in\Z$.
+A binning rule distributes each sample among the grid points $g_j=j\delta$, indexed by $j\in\Z$.
% and can be represented as a set of functions $\{ w_j(x,\delta), j\in\Z \}$.
-Computation requires a finite grid on the interval $[a,b]$ containing the data, thus the number of grid points is $G=(b-a)/\delta+1$.
+Computation requires a finite grid on the interval $[a,b]$ containing the data, thus the number of grid points is $G=(b-a)/\delta+1$ \cite{hall1996accuracy}.
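For illustration, one common concrete choice of binning rule is *linear binning*, which splits each sample's weight between its two neighbouring grid points in proportion to proximity. The text does not fix a particular rule $r_j$, so the sketch below is an assumed example; it also anchors the grid at $a$ (i.e. $g_j = a + j\delta$) to keep indices non-negative:

```python
import numpy as np

def linear_binning(samples, weights, a, b, delta):
    # Grid points g_j = a + j*delta for j = 0..G-1, with G = (b-a)/delta + 1.
    G = int(round((b - a) / delta)) + 1
    counts = np.zeros(G)
    samples = np.asarray(samples, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Position of each sample in grid units; j is the left neighbour.
    pos = (samples - a) / delta
    j = np.clip(np.floor(pos).astype(int), 0, G - 2)
    frac = pos - j                                  # distance past g_j, in [0, 1]
    # Split each weight between g_j and g_{j+1}; np.add.at accumulates
    # correctly even when several samples share a grid point.
    np.add.at(counts, j, weights * (1.0 - frac))
    np.add.at(counts, j + 1, weights * frac)
    return counts
```

Linear binning preserves the total sample weight, so the resulting bin counts can stand in for the original samples in the BKDE formula below.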
Given a binning rule $r_j$ the BKDE $\tilde{f}$ of a density $f$ computed pointwise at the grid point $g_x$ is given as
\begin{equation}