Added some notes
@@ -104,6 +104,7 @@
|
|||||||
\newcommand{\dB}{dB}
|
\newcommand{\dB}{dB}
|
||||||
\newcommand{\hpa}{hPa}
|
\newcommand{\hpa}{hPa}
|
||||||
\newcommand{\degree}{\ensuremath{^{\circ}}}
|
\newcommand{\degree}{\ensuremath{^{\circ}}}
|
||||||
|
\newcommand{\byte}{B}
|
||||||
|
|
||||||
\newcommand{\dop} [1]{\ensuremath{ \mathop{\mathrm{d}#1} }}
|
\newcommand{\dop} [1]{\ensuremath{ \mathop{\mathrm{d}#1} }}
|
||||||
\newcommand{\R} {\ensuremath{ \mathbf{R} }}
|
\newcommand{\R} {\ensuremath{ \mathbf{R} }}
|
||||||
|
|||||||
@@ -13,24 +13,28 @@ where the majority of the probability mass lies in the range $[-6; 6]^2$.
|
|||||||
Clearly, the structure of the ground truth affects the error in the estimate, but as our method approximates the KDE, only the closeness to the KDE is of interest.
|
Clearly, the structure of the ground truth affects the error in the estimate, but as our method approximates the KDE, only the closeness to the KDE is of interest.
|
||||||
Therefore, the particular choice of the ground truth is only of minor importance here.
|
Therefore, the particular choice of the ground truth is only of minor importance here.
|
||||||
|
|
||||||
|
The exact KDE, evaluated at $50^2$ points, is compared to the BKDE, box filter, and extended box filter approximations, which are evaluated on a smaller grid with $30^2$ points.
|
||||||
|
The MISE between $f$ and the estimates is evaluated as a function of $h$, and the resulting plot is given in figure~\ref{fig:evalBandwidth}.
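As a hedged sketch of how such an error value can be obtained on a grid: assuming the integrated squared error is approximated by a Riemann sum over the $G$ evaluation points $x_j$ with cell area $\Delta$ (the symbols $x_j$ and $\Delta$ are illustrative notation, and the actual discretization used for the plot is an assumption), the error of an estimate $\hat{f}_h$ reads
\begin{equation*}
\mathrm{ISE}(h) = \int \bigl(\hat{f}_h(x) - f(x)\bigr)^2 \dop{x} \approx \Delta \sum_{j=1}^{G} \bigl(\hat{f}_h(x_j) - f(x_j)\bigr)^2 .
\end{equation*}
The MISE is the expectation of this quantity over the random sample and can be estimated by averaging over repeated draws.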
|
||||||
|
|
||||||
Both the BKDE and the extended box filter estimate follow the error curve of the KDE closely and stably.
|
Both the BKDE and the extended box filter estimate follow the error curve of the KDE closely and stably.
|
||||||
They are rather close to each other, with a tendency to diverge for larger $h$.
|
They are rather close to each other, with a tendency to diverge for larger $h$.
|
||||||
In contrast, the error curve of the box filter estimate has noticeable jumps at $h=(0.4; 0.252; 0.675; 0.825)$.
|
In contrast, the error curve of the box filter estimate has noticeable jumps at $h=(0.4; 0.252; 0.675; 0.825)$.
|
||||||
These jumps are caused by rounding the box width given by \eqref{eq:boxidealwidth} to an integer value.
|
These jumps are caused by rounding the box width given by \eqref{eq:boxidealwidth} to an integer value.
|
||||||
As the extended box filter is able to approximate an exact $\sigma$, it lacks these discontinuities.
|
As the extended box filter is able to approximate an exact $\sigma$, it lacks these discontinuities.
|
||||||
|
Consequently, it reduces the overall error of the approximation, although only marginally in this scenario.
|
||||||
The exact KDE, evaluated at $50^2$ points, is compared to the BKDE, box filter, and extended box filter approximations, which are evaluated on a smaller grid with $30^2$ points.
|
The average MISE over all values of $h$ is $0.0049$ for the regular box filter and $0.0047$ for the extended version.
|
||||||
The MISE between $f$ and the estimates is evaluated as a function of $h$, and the resulting plot is given in figure~\ref{fig:evalBandwidth}.
|
Likewise, the maximum MISE is $0.0093$ and $0.0091$, respectively.
|
||||||
|
The choice between the extended and the regular box filter algorithm depends on how large an error is acceptable, and thus on the particular application.
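The origin of the jumps can be illustrated with a small worked example. It assumes the commonly used relation for $n$ successive box filters of width $w$, which may differ in detail from \eqref{eq:boxidealwidth}; the choice $n=3$ and the restriction to odd widths are illustrative assumptions, and $\sigma$ is measured in grid cells. A discrete box of width $w$ has variance $(w^2-1)/12$, so
\begin{equation*}
\sigma^2 = n \, \frac{w^2-1}{12} \quad\Longrightarrow\quad w_{\mathrm{ideal}} = \sqrt{\frac{12\sigma^2}{n}+1} .
\end{equation*}
For $n=3$, the widths $w=3$ and $w=5$ realize $\sigma=\sqrt{2}\approx 1.41$ and $\sigma=\sqrt{6}\approx 2.45$, respectively. Every requested $\sigma$ in between is mapped to one of these two values, so the realized smoothing changes abruptly whenever the requested bandwidth crosses a rounding threshold.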
|
||||||
|
|
||||||
|
|
||||||
\begin{figure} [t]
|
\begin{figure} [t]
|
||||||
\label{fig:evalBandwidth}
|
\label{fig:evalBandwidth}
|
||||||
\includegraphics[width=\columnwidth]{gfx/Eval1Bandwidth_abs.png}
|
\includegraphics[width=\columnwidth]{gfx/tmpPerformance.png}
|
||||||
\caption{Performance plot goes here (two-column)} \label{fig:eval1GroundTruth}
|
\caption{Performance plot goes here (two-column)} \label{fig:eval1GroundTruth}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
|
||||||
Other test cases of theoretical relevance are error as a function of the grid size $G$ and the sample size $N$.
|
Other test cases of theoretical relevance are the MISE as a function of the grid size $G$ and the sample size $N$.
|
||||||
However, both cases do not give a deeper insight of the error behaviour of our method, as it closely mimics the error curve of the KDE and only confirm the theoretical expectations.
|
However, neither case gives deeper insight into the error behavior of our method, as it closely mimics the error curve of the KDE; both only confirm the theoretical expectations.
|
||||||
|
|
||||||
% KDE, box filter, exbox as a function of h (figure)
|
% KDE, box filter, exbox as a function of h (figure)
|
||||||
% sample size and grid size text
|
% sample size and grid size text
|
||||||
@@ -38,13 +42,30 @@ However, both cases do not give a deeper insight of the error behaviour of our m
|
|||||||
|
|
||||||
|
|
||||||
\subsection{Performance}
|
\subsection{Performance}
|
||||||
All tests are performed on a Intel Core \mbox{i5-7600K} CPU with a frequency of $4.5 \text{GHz}$, which supports the AVX2 instruction set, hence 256-bit wide SIMD registers are available.
|
In the following, we support the theoretical linear time complexity of our method with empirical runtime measurements and compare them to other methods.
|
||||||
We compare our C++ implementation of the box filter based KDE to the KernSmooth R package and the \qq{FastKDE} implementation \cite{oBrien2016fast}.
|
All tests are performed on an Intel Core \mbox{i5-7600K} CPU with a frequency of \SI{4.2}{\giga\hertz} and \SI{16}{\giga\byte} of main memory.
|
||||||
|
We compare our C++ implementation of the extended box filter based KDE approximation to the KernSmooth R package and the \qq{FastKDE} Python implementation \cite{oBrien2016fast}.
|
||||||
The KernSmooth package provides an FFT-based BKDE implementation built on optimized C functions at its core.
|
The KernSmooth package provides an FFT-based BKDE implementation built on optimized C functions at its core.
|
||||||
|
% Comparison with the weighted average (in C++) to show our large speed advantage.
|
||||||
|
|
||||||
|
%The results are presented in plot \ref{...}
|
||||||
|
% O(N) is clearly visible for the box KDE and the weighted average
|
||||||
|
% Especially for small G up to 10^3, the box KDE is faster than R and fastKDE, but the WA is considerably faster than all others
|
||||||
|
% With increasing G, the gap between the box KDE and the WA grows.
|
||||||
|
% (This may also be because the binning becomes slower for larger G, which I cannot explain, however! Perhaps cache effects)
|
||||||
|
|
||||||
|
% The stepwise increase of the runtime of the R implementation is striking.
|
||||||
|
% This is caused by the FFT. The input to the FFT always has to be rounded up to the next power of two.
|
||||||
|
% Hence, the runtime jumps whenever the input is padded to a new power of two; otherwise it stays constant.
|
||||||
|
% The curve stops at G=4406^2 because larger values of G trigger an out-of-memory error.
|
||||||
|
|
||||||
|
% The plot for the regular box filter was omitted for the sake of clarity.
|
||||||
|
% Both the box filter and the extended box filter show very similar runtime behavior and thus very similar curves.
|
||||||
|
% While the average runtime over all values of G is 0.4092s for the box filter, the extended box filter required 0.4169s on average.
|
||||||
|
|
||||||
\begin{figure} [t]
|
\begin{figure} [t]
|
||||||
\label{fig:evalBandwidth}
|
\label{fig:evalBandwidth}
|
||||||
\includegraphics[width=\columnwidth]{gfx/Eval1Bandwidth_abs.png}
|
\includegraphics[width=\columnwidth]{gfx/tmpPerformance.png}
|
||||||
\caption{Performance plot goes here (two-column)} \label{fig:eval1GroundTruth}
|
\caption{Performance plot goes here (two-column)} \label{fig:eval1GroundTruth}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
|
||||||
|
|||||||
BIN
tex/gfx/tmpPerformance.png
Normal file
Binary file not shown.