
\section{Experiments}

\subsection{Mean Integrated Squared Error}


We now empirically evaluate the accuracy of our method, using the mean integrated squared error (MISE).
The ground truth is given as $N=1000$ synthetic samples drawn from a bivariate mixture normal density $f$
\begin{equation}
\begin{split}
  \bm{X} \sim & ~\G{\VecTwo{0}{0}}{0.5\bm{I}} + \G{\VecTwo{3}{0}}{\bm{I}} + \G{\VecTwo{0}{3}}{\bm{I}} \\
              &+ \G{\VecTwo{-3}{0} }{\bm{I}} + \G{\VecTwo{0}{-3}}{\bm{I}}
\end{split}
\end{equation}
where the majority of the probability mass lies in the range $[-6; 6]^2$.
Clearly, the structure of the ground truth affects the error of the estimate, but since our method approximates the KDE, only its closeness to the KDE is of interest.
Therefore, the particular choice of the ground truth is only of minor importance here.

\begin{figure}[t]
    %\includegraphics[width=\columnwidth]{gfx/Eval1Bandwidth_abs.png}
    \input{gfx/error.tex}
    \caption{Placeholder: the single-column bandwidth error plot goes here. It shows the accuracy results for all methods, including the weighted average.}
    \label{fig:evalBandwidth}\label{fig:eval1GroundTruth}
\end{figure}

The exact KDE, evaluated at $50^2$ points, is compared to the BKDE, box filter, and extended box filter approximations, which are evaluated on a coarser grid with $30^2$ points.
The MISE between $f$ and each estimate is evaluated as a function of $h$, and the resulting plot is given in figure~\ref{fig:evalBandwidth}.
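
For reference, the error measure can be written out explicitly.
The following is only a sketch in a notation assumed here for illustration: $\hat{f}_h$ denotes any of the compared estimates with bandwidth $h$, $\bm{x}_j$ are the $G$ evaluation points of a regular grid with spacings $\delta_1$ and $\delta_2$, and the Riemann sum is an assumed discretization, as the quadrature rule is not specified above.
\begin{equation}
\begin{split}
  \mathrm{MISE}(\hat{f}_h) &= \mathrm{E}\left[ \int \left( \hat{f}_h(\bm{x}) - f(\bm{x}) \right)^2 \, d\bm{x} \right] \\
                           &\approx \mathrm{E}\left[ \delta_1 \delta_2 \sum_{j=1}^{G} \left( \hat{f}_h(\bm{x}_j) - f(\bm{x}_j) \right)^2 \right]
\end{split}
\end{equation}
The expectation is taken over the random sample set; in practice it would be replaced by an average over repeated draws, or by a single draw if only one sample set is available.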

Both the BKDE and the extended box filter estimate follow the error curve of the KDE closely and stably.
They are rather close to each other, with a tendency to diverge for larger $h$.
In contrast, the error curve of the box filter estimate has noticeable jumps at $h=(0.4; 0.252; 0.675; 0.825)$.
These jumps are caused by the rounding of the integer-valued box width given by \eqref{eq:boxidealwidth}.
As the extended box filter is able to approximate an exact $\sigma$, it lacks these discontinuities.
Consequently, it reduces the overall error of the approximation, but only marginally in this scenario.
The global average MISE over all values of $h$ is $0.0049$ for the regular box filter and $0.0047$ in case of the extended version.
Likewise, the maximum MISE is $0.0093$ and $0.0091$, respectively.
The choice between the extended and the regular box filter algorithm depends on the acceptable error and thus on the particular application.
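
The origin of the jumps in the box filter curve can be illustrated with a short calculation.
As a sketch, assume that $n$ iterated box filters of odd integer width $w$ are used and that \eqref{eq:boxidealwidth} has the commonly used form $w_{\mathrm{ideal}} = \sqrt{12\sigma^2/n + 1}$; the symbols $n$, $w$, and $\sigma_{\mathrm{eff}}$ as well as this form are assumptions made here for illustration.
A single box filter of width $w$ has variance $(w^2-1)/12$, so after rounding the standard deviation that is actually realized is
\begin{equation}
  \sigma_{\mathrm{eff}} = \sqrt{\frac{n \left( w^2 - 1 \right)}{12}} .
\end{equation}
For example, with $n=3$ every ideal width that is rounded to $w=5$ yields $\sigma_{\mathrm{eff}} = \sqrt{6} \approx 2.45$ grid cells, whereas $w=7$ yields $\sigma_{\mathrm{eff}} = \sqrt{12} \approx 3.46$.
While $h$ varies continuously, $\sigma_{\mathrm{eff}}$ therefore stays constant until the rounded width changes and then jumps, which is consistent with the discontinuities in the error curve.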

Other test cases of theoretical relevance are the MISE as a function of the grid size $G$ and of the sample size $N$.
However, neither case gives deeper insight into the error behavior of our method, as it closely mimics the error curve of the KDE; both only confirm the theoretical expectations.


\begin{figure}[t]
	%\includegraphics[width=\textwidth,height=6cm]{gfx/tmpPerformance.png}
	\input{gfx/perf.tex}
    \caption{Placeholder: the two-column performance plot goes here. Please add some more text so that the caption is not empty; it may even extend into a second line.}
	\label{fig:performance}
\end{figure}

% KDE, box filter, exbox as a function of h (figure)
% sample size and grid size: text
% an error comparison with fastKDE makes no sense because kernel and bandwidth differ


\subsection{Performance}
In the following, we support the theoretical linear time complexity of our method with empirical runtime measurements and compare them to other methods.
All tests are performed on an Intel Core \mbox{i5-7600K} CPU with a frequency of \SI{4.2}{\giga\hertz} and \SI{16}{\giga\byte} of main memory.
We compare our C++ implementation of the extended box filter based KDE approximation to the KernSmooth R package and the \qq{FastKDE} Python implementation \cite{oBrien2016fast}.
The KernSmooth package provides an FFT-based BKDE implementation that relies on optimized C functions at its core.
% Comparison to the weighted average (in C++) to show our large speed advantage.
With state estimation problems in mind, we additionally provide a C++ implementation of a weighted average estimator.
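
For reference, a weighted average estimator of this kind can be sketched as follows, assuming $N$ samples $\bm{X}_i$ with non-negative weights $w_i$; this notation is introduced here for illustration only.
\begin{equation}
  \hat{\bm{x}} = \frac{\sum_{i=1}^{N} w_i \bm{X}_i}{\sum_{i=1}^{N} w_i}
\end{equation}
It requires a single pass over the samples and no grid, which explains its low runtime.
For a density with two equally weighted modes, however, it returns a point between the modes, where the density may be low.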
\commentByToni{Maybe we should say a few words about the implementation here. Is it all standard C++? Do we use anything really fancy? etc. Maybe in the camera-ready version even a link to the code or something.}


The results of the performance comparison are presented in figure~\ref{fig:performance}.
% O(N) clearly visible for box KDE and weighted average
The linear complexity \landau{N} of the boxKDE and the weighted average is clearly visible.
% Especially for small G up to 10^3 the box KDE is faster than R and fastKDE, but the WA is considerably faster than all others
Especially for small $G$ up to $10^3$, the boxKDE is much faster than KernSmooth R and fastKDE.
% With increasingly larger G, the gap between box KDE and WA grows.
Nevertheless, the simple weighted average approach is the fastest, and with increasing $G$ its lead over the boxKDE grows steadily.
However, this comes with major disadvantages, such as its vulnerability to multimodalities, as discussed in section~\ref{sec:intro}.
% (This may also be because the binning becomes slower with larger G, which I cannot explain, though! Maybe cache effects)


% The stepwise increase in runtime of the R implementation is striking.
Looking further at figure~\ref{fig:performance}, the runtime of the KernSmooth R approach increases in a stepwise manner with growing $G$.
% This is due to the FFT. The input to the FFT always has to be rounded up to the next power of two.
This behaviour is caused by the underlying FFT algorithm.
% Hence the runtime jumps whenever the input is padded to a new power of two; otherwise it stays constant.
The FFT approach requires the input size to be rounded up to the next power of two, which causes a constant runtime between two consecutive powers of two and a sharp performance deterioration whenever the next power of two is reached.
% The cutoff at G=4406^2 occurs because larger G trigger an out-of-memory error.
The curve of KernSmooth R ends at $G=4406^2$ because larger values of $G$ cause an out-of-memory error.
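
The stepwise behaviour can be made explicit with a small sketch.
Assume, for illustration, that the FFT operates on a padded length $P$ per dimension, where $m$ denotes the length required before padding; the names $P$ and $m$ are introduced here and are not taken from the KernSmooth documentation.
\begin{equation}
  P = 2^{\lceil \log_2 m \rceil}
\end{equation}
Every $m$ with $P/2 < m \le P$ is padded to the same $P$, so the cost of the FFT, which is of order $P \log P$ per dimension, stays constant in this range.
For example, $m=1025$ and $m=2048$ both give $P=2048$, whereas $m=2049$ gives $P=4096$, which quadruples the number of points of the padded two-dimensional grid.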

% The plot for the regular box filter was omitted for the sake of clarity.
% Both the box filter and the extended box filter have very similar runtime behaviour and thus a very similar curve.
% While the average runtime over all values of G is 0.4092s for the box filter, the extended box filter needed 0.4169s on average.
Both Gaussian filter approximations discussed here, namely the box filter and the extended box filter, exhibit similar runtime behaviour and therefore similar curves.
While the average runtime over all values of $G$ is \SI{0.4092}{\second} for the standard box filter, the extended one requires \SI{0.4169}{\second} on average.
To keep figure~\ref{fig:performance} clear, we only show the results of the boxKDE with the extended box filter.


\commentByToni{We have now described the figures, but I still miss a somewhat deeper discussion}
\commentByToni{Somehow the exbox now appears so suddenly... it was discussed so thinly before and now it is suddenly fully in focus. Feels odd}
\commentByFrank{The colors (blue)(green) in the figure captions are no longer correct}
\commentByFrank{Check Fig4 (error over time) whether the two colored lines are now the right way round. NEVER EDIT GENERATED TEX FIGURES DIRECTLY}


\input{chapters/realworld}