Going thru changes

2018-02-20 10:43:34 +01:00
parent 6ffa7d1c15
commit e49c7a1cbf
5 changed files with 114 additions and 118 deletions
--- a/tex/chapters/multivariate.tex
+++ b/tex/chapters/multivariate.tex
@@ -0,0 +1,109 @@
+\section{Extension to multi-dimensional data}
+\todo{WIP}
+So far, only univariate sample sets have been considered.
+This is because the equations of the KDE \eqref{eq:kde}, the BKDE \eqref{eq:binKde}, the Gaussian filter \eqref{eq:gausFilt}, and the box filter \eqref{eq:boxFilt} extend quite easily to multi-dimensional input.
+Each method can be seen as several one-dimensional problems combined into a multi-dimensional result.
+%However, with an increasing number of dimensions the computation time significantly increases.
+In the following, the generalizations to multi-dimensional input are briefly outlined.
+
+In order to estimate a multivariate density using KDE or BKDE, a multivariate kernel needs to be used.
+Multivariate kernel functions can be constructed in various ways; a popular choice is the product kernel.
+Such a kernel is constructed by multiplying several univariate kernels, one per dimension, each with a possibly different bandwidth.
+
+Let $X=(x_1,\dots ,x_d)$ be a multivariate random variable in $d$ dimensions.
+The sample $\bm{X}$ is an $n\times d$ matrix defined as \cite[162]{scott2015}
+\begin{equation}
+    \bm{X}=
+    \begin{pmatrix}
+        X_1    \\
+        \vdots \\
+        X_n    \\
+    \end{pmatrix}
+    =
+    \begin{pmatrix}
+        x_{11} & \dots & x_{1d} \\
+        \vdots & \ddots & \vdots \\
+        x_{n1} & \dots & x_{nd}
+    \end{pmatrix} \text{.}
+\end{equation}
+
+The multivariate KDE $\hat{f}$, which defines the estimate pointwise at $\bm{x}=(x_1, \dots, x_d)^T$, is given as \cite[162]{scott2015}
+\begin{equation}
+\label{eq:mvKDE}
+    \hat{f}(\bm{x}) = \frac{1}{W} \sum_{i=1}^{n} \frac{w_i}{h_1 \dots h_d} \left[  \prod_{j=1}^{d} K\left( \frac{x_j-x_{ij}}{h_j} \right)  \right]  \text{,}
+\end{equation}
+where the bandwidth is given as a vector $\bm{h}=(h_1, \dots, h_d)$.
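+As a brief illustration, for $d=2$ the product in \eqref{eq:mvKDE} can be written out explicitly. Assuming, as the notation suggests, that the $w_i$ are the sample weights and $W$ is their sum, this gives
+\begin{equation}
+    \hat{f}(x_1, x_2) = \frac{1}{W} \sum_{i=1}^{n} \frac{w_i}{h_1 h_2} \, K\left( \frac{x_1-x_{i1}}{h_1} \right) K\left( \frac{x_2-x_{i2}}{h_2} \right) \text{.}
+\end{equation}
+Under the same assumption, uniform weights $w_i=1$ yield $W=n$, so that the prefactor becomes the $1/(n h_1 h_2)$ of the unweighted estimator.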
+
+Note that \eqref{eq:mvKDE} does not cover all possible multivariate kernels; spherically symmetric kernels, which are obtained by rotating a univariate kernel, are one example that is not included.
+In general, a product kernel and a spherically symmetric kernel based on the same univariate kernel differ.
+The only exception is the Gaussian kernel, which is spherically symmetric and has independent marginals. % TODO scott cite?!
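+For the Gaussian kernel this can be checked directly. Assuming the standard univariate form $K(u)=(2\pi)^{-1/2}\exp(-u^2/2)$ and writing $\bm{u}=(u_1,\dots,u_d)^T$ (a symbol introduced here only for illustration), the product kernel becomes
+\begin{equation}
+    \prod_{j=1}^{d} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{u_j^2}{2} \right) = \frac{1}{(2\pi)^{d/2}} \exp\left( -\frac{1}{2} \sum_{j=1}^{d} u_j^2 \right) = \frac{1}{(2\pi)^{d/2}} \exp\left( -\frac{1}{2} \bm{u}^T \bm{u} \right) \text{,}
+\end{equation}
+which depends on $\bm{u}$ only through its length $\lVert\bm{u}\rVert$ and is therefore spherically symmetric.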
+In addition, smoothing is only possible in the direction of the axes.
+If smoothing in other directions is necessary, the computation needs to be done on a pre-rotated sample set, and the estimate needs to be rotated back to fit the original coordinate system \cite{wand1994fast}.
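+A minimal sketch of this procedure, assuming an orthogonal rotation matrix $\bm{R}$ with $\bm{R}^T\bm{R}=\bm{I}$ (the symbols $\bm{R}$, $\bm{Y}$ and $\hat{g}$ are introduced here for illustration and are not taken from \cite{wand1994fast}): the sample is rotated row-wise, the product-kernel estimate $\hat{g}$ is computed on the rotated sample $\bm{Y}$, and the result is evaluated in the original coordinates,
+\begin{equation}
+    \bm{Y} = \bm{X}\bm{R} \text{,} \qquad \hat{f}(\bm{x}) = \hat{g}\left( \bm{R}^T \bm{x} \right) \text{.}
+\end{equation}
+No additional normalization is required, because $\lvert \det \bm{R} \rvert = 1$.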
+
+For the multivariate BKDE, the grid and the binning rules need to be extended to multivariate data in addition to the kernel function.
+\todo{Is text sufficient here, or are formulas needed?}
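+As one possible sketch of such an extension (the symbols below are illustrative assumptions, not notation from the preceding sections): let the grid in dimension $j$ consist of equidistant points $g_{j,1},\dots,g_{j,G_j}$ with spacing $\delta_j$. The count at the grid point with multi-index $\bm{k}=(k_1,\dots,k_d)$ can then be written as a product of univariate binning rules $r_j$,
+\begin{equation}
+    c_{\bm{k}} = \sum_{i=1}^{n} w_i \prod_{j=1}^{d} r_j\left( x_{ij}, k_j \right) \text{,} \qquad r_j(x, k) = \max\left( 0,\; 1 - \frac{\lvert x - g_{j,k} \rvert}{\delta_j} \right) \text{,}
+\end{equation}
+where the $r_j$ shown is univariate linear binning; simple binning would instead use $r_j(x,k)=1$ if $g_{j,k}$ is the grid point nearest to $x$ and $0$ otherwise.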
+
+
+In general, multi-dimensional filters are multi-dimensional convolution operations.
+However, by exploiting the separability of convolution, a more straightforward and more efficient implementation can be found.
+A convolution is separable if the filter kernel is separable, i.e. if it can be split into successive convolutions with several one-dimensional kernels.
+Likewise, digital filters based on such kernels are called separable filters.
+They are easily applied to multi-dimensional signals, because the input signal can be filtered in each dimension separately by a one-dimensional filter.
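+For two dimensions this can be made explicit. Assuming a discrete input signal $s$ and a separable kernel $k(u,v)=k_1(u)\,k_2(v)$ (symbols chosen here for illustration), the two-dimensional convolution factorizes as
+\begin{equation}
+    (s * k)(x,y) = \sum_{u} \sum_{v} s(x-u,\, y-v)\, k_1(u)\, k_2(v) = \sum_{u} k_1(u) \left[ \sum_{v} k_2(v)\, s(x-u,\, y-v) \right] \text{,}
+\end{equation}
+i.e. a one-dimensional convolution along $y$ followed by one along $x$. For a kernel of $m \times m$ taps this reduces the cost per output sample from $m^2$ to $2m$ multiplications.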
+
+The Gaussian filter is separable because $e^{-(x^2+y^2)} = e^{-x^2}\cdot e^{-y^2}$.
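+Written out in full, and assuming the usual normalized form with standard deviation $\sigma$ (whose notation may differ from \eqref{eq:gausFilt}), the two-dimensional Gaussian factorizes into two one-dimensional Gaussians:
+\begin{equation}
+    \frac{1}{2\pi\sigma^2} \exp\left( -\frac{x^2+y^2}{2\sigma^2} \right) = \left[ \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{x^2}{2\sigma^2} \right) \right] \cdot \left[ \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{y^2}{2\sigma^2} \right) \right] \text{.}
+\end{equation}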
+
+
+% KDE:
+%So far only the univariate case was considered.
+%This is because univariate kernel estimators can quite easily be extended to multivariate distributions.
+%A common approach is to apply a univariate kernel with a possibly different bandwidth in each dimension.
+%This kind of multivariate kernel is called a product kernel, as the multivariate kernel result is the product of the individual univariate kernels.
+%
+%Let $X=(x_1,\dots ,x_d)$ be a multivariate random variable in $d$ dimensions.
+%The sample $\bm{X}$ is an $n\times d$ matrix defined as \cite[162]{scott2015}
+%\begin{equation}
+%    \bm{X}=
+%    \begin{pmatrix}
+%        X_1    \\
+%        \vdots \\
+%        X_n    \\
+%    \end{pmatrix}
+%    =
+%    \begin{pmatrix}
+%        x_{11} & \dots & x_{1d} \\
+%        \vdots & \ddots & \vdots \\
+%        x_{n1} & \dots & x_{nd}
+%    \end{pmatrix} \text{.}
+%\end{equation}
+%
+%The multivariate kernel density estimator $\hat{f}$ which defines the estimate pointwise at $\bm{x}=(x_1, \dots, x_d)^T$ is given as \cite[162]{scott2015}
+%\begin{equation}
+%    \hat{f}(\bm{x}) = \frac{1}{nh_1 \dots h_d} \sum_{i=1}^{n} \left[  \prod_{j=1}^{d} K\left( \frac{x_j-x_{ij}}{h_j} \right)  \right]  \text{.}
+%\end{equation}
+%where the bandwidth is given as a vector $\bm{h}=(h_1, \dots, h_d)$.
+
+% Product kernel allows our method
+% Spherically symmetric kernel not supported, but Gaussian kernel == product & spherically symmetric
+% smoothing not in the direction of the axes -> rotate data, kde, rotate back
+
+%Multivariate Gauss-Kernel
+%\begin{equation}
+%K(u)=\frac{1}{(2\pi)^{d/2}} \expp{-\frac{1}{2} \bm{x}^T \bm{x}}
+%\end{equation}
+
+% Gaus:
+%If the filter kernel is separable, the convolution is also separable i.e. multi-dimensional convolution can be computed as individual one-dimensional convolutions with a one-dimensional kernel.
+%Because of $e^{-(x^2+y^2)} = e^{-x^2}\cdot e^{-y^2}$ the Gaussian filter is separable and can be easily applied to multi-dimensional signals. \todo{source}
+
+
+%how do we use all of this now? what do I need to watch out for?
+
+% Using 2D data as an example
+% Create the histogram (== bin the data)
+% This requires min/max
+% Then filter the histogram with a box filter
+% - In parallel where possible (SIMD, GPU)
+% - separately in each dimension
+% Take the maximum of the filter result
+
+