\section{Binned Kernel Density Estimation}
% KDE by rosenblatt and parzen
% general KDE
% Gauss Kernel
% Formula Gauss KDE
% -> complexity/operation count
% Binned KDE
% Binned Gauss KDE
% -> complexity/operation count
The histogram is a simple non-parametric density estimator and was, for a long time, the most widely used one.
However, its inability to produce a continuous estimate rules it out for many applications in which a smooth distribution is assumed.
In contrast, the kernel density estimator (KDE) is often the preferred tool because it produces a continuous estimate and is flexible.
Given $n$ independent observations $X=(X_1,\dots,X_n)$ drawn from an unknown distribution with density function $f$, the kernel density estimate $\hat{f}_n$ of $f$ at a point $x$ is given by
\begin{equation}
\label{eq:kde}
\hat{f}_n(x) = \frac{1}{nh} \sum_{i=1}^{n} K \left( \frac{x-X_i}{h} \right) \text{,} %= \frac{1}{n} \sum_{i=1}^{n} K_h(x-X_i)
\end{equation}
where $K$ is the kernel function and $h\in\R^+$ is a smoothing parameter called the bandwidth.
Any density function, i.e. any non-negative function $K$ with $\int K(u) \dop{u} = 1$, can be used as the kernel, and a variety of popular choices exist.
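A short calculation, given here only as an illustrative sketch, shows why the factor $1/(nh)$ is the appropriate normalization.
It assumes nothing beyond $\int K(u) \dop{u} = 1$.
Substituting $u = (x-X_i)/h$, so that $\dop{x} = h \dop{u}$, each summand of the kernel density estimate contributes an integral of $h$, and therefore
\begin{equation}
\int \hat{f}_n(x) \dop{x} = \frac{1}{nh} \sum_{i=1}^{n} h \int K(u) \dop{u} = 1 \text{.}
\end{equation}
Hence $\hat{f}_n$ is itself a density for every bandwidth $h$; the bandwidth only controls how widely the mass of each observation is spread.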
In practice the Gaussian kernel is commonly used:
\begin{equation}
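K(u)=\frac{1}{\sqrt{2\pi}} \expp{- \frac{u^2}{2} } \text{.}
\end{equation}
Inserting this kernel into \eqref{eq:kde} yields the Gaussian kernel density estimate
\begin{equation}
\hat{f}_n(x) = \frac{1}{nh\sqrt{2\pi}} \sum_{i=1}^{n} \expp{-\frac{(x-X_i)^2}{2h^2}} \text{.}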
\end{equation}
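To give a rough idea of the cost of evaluating this estimate directly, consider the following operation count.
It is a sketch under an assumption not made above, namely that the density is required at $m$ evaluation points; the symbols $m$ and $C_{\text{direct}}$ are introduced here for illustration only.
Since the Gaussian kernel has unbounded support, every observation contributes to every evaluation point, so the number of kernel evaluations is
\begin{equation}
C_{\text{direct}} = n \cdot m \in \mathcal{O}(nm) \text{.}
\end{equation}
For example, $n = 10^4$ observations and $m = 10^3$ evaluation points already require $10^7$ evaluations of the exponential function.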