From a46fe242e8ba101447edcd7c11a288041103da9e Mon Sep 17 00:00:00 2001
From: MBulli
Date: Sun, 18 Feb 2018 21:11:42 +0100
Subject: [PATCH] Added weights to KDE and BKDE

---
 tex/chapters/kde.tex | 32 ++++++++++++++++----------------
 tex/chapters/mvg.tex |  4 ++--
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/tex/chapters/kde.tex b/tex/chapters/kde.tex
index 36886cc..e3eb907 100644
--- a/tex/chapters/kde.tex
+++ b/tex/chapters/kde.tex
@@ -13,12 +13,14 @@
 %In contrast, The KDE is often the preferred tool to estimate a density function from discrete data samples because of its ability to produce a continuous estimate and its flexibility.
 %
-Given a univariate random sample $X=\{X_1, X_2, \dots, X_n\}$, the kernel estimator $\hat{f}$ which defines the estimate at the point $x$ is given as
+Given a univariate random sample $X=\{X_1, \dots, X_n\}$ drawn from a density function $f$, let $w_1, \dots, w_n$ be associated weights.
+The kernel estimator $\hat{f}$, which estimates $f$ at the point $x$, is given as
 \begin{equation} \label{eq:kde}
-\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K \left(\frac{x-X_i}{h}\right)
+\hat{f}(x) = \frac{1}{W} \sum_{i=1}^{n} \frac{w_i}{h} K \left(\frac{x-X_i}{h}\right)
 \end{equation}
-where $K$ is a kernel function such that $\int K(u) \dop{u} = 1$ and $h\in\R^+$ is an arbitrary smoothing parameter called bandwidth \cite[138]{scott2015}.
+where $W=\sum_{i=1}^{n}w_i$ and $h\in\R^+$ is an arbitrary smoothing parameter called bandwidth.
+$K$ is a kernel function such that $\int K(u) \dop{u} = 1$ \cite[138]{scott2015}.
 
 In general any kernel can be used, however the general advice is to chose a symmetric and low-order polynomial kernel.
 Thus, several popular kernel functions are used in practice, like the Uniform, Gaussian, Epanechnikov, or Silverman kernel \cite[152.]{scott2015}.
 
@@ -54,40 +56,38 @@ In general, reducing the size of the sample negatively affects the accuracy of t
 Still, the sample size is a suitable parameter to speedup the computation.
 Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE.
-Each bin represents the \qq{weight} of the sample set at a given point of a equidistant grid with spacing $\delta$.
-A binning rule distributes a sample $x$ among the grid points $g_j=j\delta$ indexed by $j\in\Z$.
+Each bin represents the \qq{count} of the sample set at a given point of an equidistant grid with spacing $\delta$.
+A binning rule distributes a sample $x_i$ and its weight $w_i$ among the grid points $g_j=j\delta$, indexed by $j\in\Z$.
 % and can be represented as a set of functions $\{ w_j(x,\delta), j\in\Z \}$.
 Computation requires a finite grid on the interval $[a,b]$ containing the data, thus the number of grid points is $G=(b-a)/\delta+1$.
-Given a binning rule $w_j$ the BKDE $\tilde{f}$ of a density $f$ computed pointwise at the grid point $g_x$ is given as
+Given a binning rule $b_j$, the BKDE $\tilde{f}$ of a density $f$ computed pointwise at the grid point $g_x$ is given as
 \begin{equation} \label{eq:binKde}
-\tilde{f}(g_x) = \frac{1}{nh} \sum_{j=1}^{G} C_j K \left(\frac{g_x-g_j}{h}\right)
+\tilde{f}(g_x) = \frac{1}{W} \sum_{j=1}^{G} \frac{C_j}{h} K \left(\frac{g_x-g_j}{h}\right)
 \end{equation}
 where $G$ is the number of grid points and
 \begin{equation} \label{eq:gridCnts}
-	C_j=\sum_{i=1}^{n} b_j(x_i,\delta)
+	C_j=\sum_{i=1}^{n} b_j(x_i,\delta)
 \end{equation}
-is the count at grid point $g_j$ \cite{hall1996accuracy}.
+is the count at grid point $g_j$, such that $\sum_{j=1}^{G} C_j = W$ \cite{hall1996accuracy}.
 
-\commentByMarkus{Wording: Count vs. Weight}
-
-In theory, any function which assigns weights to grid points is a valid binning rule.
+In theory, any function which determines the count at grid points is a valid binning rule.
 However, for many applications it is recommend to use the simple binning rule
 \begin{align} \label{eq:simpleBinning}
-	w_j(x,\delta) &=
+	b_j(x_i,\delta) &=
 	\begin{cases}
-		1 & \text{if } x \in ((j-\frac{1}{2})\delta, (j-\frac{1}{2})\delta ] \\
+		w_i & \text{if } x_i \in ((j-\frac{1}{2})\delta, (j+\frac{1}{2})\delta] \\
 		0 & \text{else}
 	\end{cases}
 \end{align}
 or the common linear binning rule which divides the sample into two fractional weights shared by the nearest grid points
 \begin{align}
-	w_j(x,\delta) &=
+	b_j(x_i,\delta) &=
 	\begin{cases}
-		1-|\delta^{-1}x-j| & \text{if } |\delta^{-1}x-j|\le1 \\
+		w_i(1-|\delta^{-1}x_i-j|) & \text{if } |\delta^{-1}x_i-j|\le1 \\
 		0 & \text{else.}
 	\end{cases}
 \end{align}
diff --git a/tex/chapters/mvg.tex b/tex/chapters/mvg.tex
index c85ed47..5d9ea51 100644
--- a/tex/chapters/mvg.tex
+++ b/tex/chapters/mvg.tex
@@ -14,8 +14,8 @@ y[n] = \frac{1}{\sigma\sqrt{2\pi}} \sum_{k=0}^{M-1} x[k]\expp{-\frac{(n-k)^2}{2\
 where $\sigma$ is a smoothing parameter called standard deviation.
 
 Note that \eqref{eq:bkdeGaus} has the same structure as \eqref{eq:gausFilt}, except the varying notational symbol of the smoothing parameter and the different factor in front of the sum.
-While in both equations the constant factor of the Gaussian is removed of the inner sum, \eqref{eq:bkdeGaus} has an additional normalization factor $N^{-1}$.
-This factor is necessary to in order to ensure that the estimate is a valid density function, i.e. that it integrates to one.
+While in both equations the constant factor of the Gaussian is moved out of the inner sum, \eqref{eq:bkdeGaus} has an additional normalization factor $W^{-1}$.
+This factor is necessary to ensure that the estimate is a valid density function, i.e. that it integrates to one.
 Such a restriction is superfluous in the context of digital filters, so the normalization factor is omitted.
 
 Computation of a digital filter using the a naive implementation of the discrete convolution algorithm yields $\landau{NM}$, where $N$ is the length of the input signal and $M$ is the size of the filter kernel.
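---

For reference, a minimal Python sketch of the weighted estimators this patch introduces: the weighted KDE of \eqref{eq:kde}, weighted linear binning for the counts $C_j$ of \eqref{eq:gridCnts}, and the BKDE of \eqref{eq:binKde}. It assumes a Gaussian kernel and a grid shifted to start at $a$; the function names are illustrative and not taken from the thesis code.

import numpy as np

def weighted_kde(x, samples, weights, h):
    # Weighted KDE: f_hat(x) = 1/W * sum_i (w_i / h) * K((x - X_i) / h)
    W = np.sum(weights)
    u = (x - samples) / h
    K = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)  # Gaussian kernel, integrates to 1
    return np.sum(weights / h * K) / W

def linear_binning(samples, weights, a, delta, G):
    # Weighted linear binning: each sample x_i splits its weight w_i between
    # the two nearest grid points, so that sum_j C_j = W.
    C = np.zeros(G)
    t = (samples - a) / delta                 # fractional grid position of each sample
    j = np.floor(t).astype(int)
    frac = t - j
    np.add.at(C, j, weights * (1.0 - frac))   # left neighbour gets w_i * (1 - frac)
    np.add.at(C, np.minimum(j + 1, G - 1), weights * frac)  # right neighbour gets w_i * frac
    return C

def binned_kde(C, a, delta, h):
    # BKDE: f_tilde(g_x) = 1/W * sum_j (C_j / h) * K((g_x - g_j) / h),
    # evaluated at every grid point g_x at once.
    G = len(C)
    g = a + delta * np.arange(G)              # grid points g_j
    W = np.sum(C)
    u = (g[:, None] - g[None, :]) / h         # (g_x - g_j) / h for all pairs
    K = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return K @ (C / h) / W

# Example usage:
# rng = np.random.default_rng(0)
# samples = rng.normal(size=1000)
# weights = np.ones_like(samples)            # uniform weights reduce to the classic KDE
# a, b, delta, h = -4.0, 4.0, 0.05, 0.3
# G = int((b - a) / delta) + 1
# C = linear_binning(samples, weights, a, delta, G)
# density = binned_kde(C, a, delta, h)       # f_tilde on the grid

With all weights set to one, $W = n$ and both estimators reduce to the unweighted forms that this patch replaces.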