From 1fb9461a5fc5d53454de0c32e051535a6aa324a9 Mon Sep 17 00:00:00 2001 From: Markus Bullmann Date: Tue, 27 Feb 2018 10:49:05 +0100 Subject: [PATCH 01/11] Fixed many bugs --- tex/chapters/abstract.tex | 2 +- tex/chapters/experiments.tex | 24 +++++++++++----------- tex/chapters/introduction.tex | 9 ++++----- tex/chapters/kde.tex | 38 +++++++++++++++++------------------ tex/chapters/multivariate.tex | 18 ++++++++--------- tex/chapters/mvg.tex | 18 ++++++++--------- tex/chapters/realworld.tex | 14 ++++++------- tex/chapters/relatedwork.tex | 12 +++++------ 8 files changed, 67 insertions(+), 68 deletions(-) diff --git a/tex/chapters/abstract.tex b/tex/chapters/abstract.tex index 5c5c6fe..3f741fb 100644 --- a/tex/chapters/abstract.tex +++ b/tex/chapters/abstract.tex @@ -1,7 +1,7 @@ \begin{abstract} It is common practice to use a sample-based representation to solve problems having a probabilistic interpretation. In many real world scenarios one is then interested in finding a \qq{best estimate} of the underlying problem, e.g. the position of a robot. -This is often done by means of simple parametric point estimator, providing the sample statistics. +This is often done by means of simple parametric point estimators, providing the sample statistics. However, in complex scenarios this frequently results in a poor representation, due to multimodal densities and limited sample sizes. Recovering the probability density function using a kernel density estimation yields a promising approach to solve the state estimation problem i.e. finding the \qq{real} most probable state, but comes with high computational costs. diff --git a/tex/chapters/experiments.tex b/tex/chapters/experiments.tex index 5eafd9b..187282e 100644 --- a/tex/chapters/experiments.tex +++ b/tex/chapters/experiments.tex @@ -4,7 +4,7 @@ We now empirically evaluate the accuracy of our boxKDE method, using the mean integrated squared error (MISE). -The ground truth is given as $N=1000$ synthetic samples drawn from a bivariate mixture normal density $f$ +The ground truth is given with $N=1000$ synthetic samples drawn from a bivariate mixture normal density $f$ \begin{equation} \begin{split} \bm{X} \sim & ~\G{\VecTwo{0}{0}}{0.5\bm{I}} + \G{\VecTwo{3}{0}}{\bm{I}} + \G{\VecTwo{0}{3}}{\bm{I}} \\ @@ -21,7 +21,7 @@ Therefore, the particular choice of the ground truth is only of minor importance \end{figure} Evaluated at $50^2$ points the exact KDE is compared to the BKDE, boxKDE, and extended box filter approximation, which are evaluated at a smaller grid with $30^2$ points. -The MISE between $f$ and the estimates as a function of $h$ are evaluated, and the resulting plot is given in figure~\ref{fig:errorBandwidth}. +The MISE between $f$ and the estimates as a function of $h$ are evaluated, and the resulting plot is given in fig.~\ref{fig:errorBandwidth}. A minimum error is obtained with $h=0.35$, for larger oversmoothing occurs and the modes gradually fuse together. Both the BKDE and the extended box filter estimate resemble the error curve of the KDE quite well and stable. @@ -42,7 +42,7 @@ However, both cases do not give a deeper insight of the error behavior of our me \begin{figure}[t] %\includegraphics[width=\textwidth,height=6cm]{gfx/tmpPerformance.png} \input{gfx/perf.tex} - \caption{Logarithmic plot of the runtime performance with increasing grid size $G$ and bivariate data. The weighted average estimate (blue) performs fastest followed by the boxKDE (orange) approximation. 
Both the BKDE (red), and the fastKDE (green) are magnitudes slower, especially for $G<10^4$.}\label{fig:performance} + \caption{Logarithmic plot of the runtime performance with increasing grid size $G$ and bivariate data. The weighted-average estimate (blue) performs fastest followed by the boxKDE (orange) approximation. Both the BKDE (red) and the fastKDE (green) are magnitudes slower, especially for $G<10^3$.}\label{fig:performance} \end{figure} % kde, box filter, exbox in abhänigkeit von h (bild) @@ -53,18 +53,18 @@ However, both cases do not give a deeper insight of the error behavior of our me \subsection{Performance} In the following, we underpin the promising theoretical linear time complexity of our method with empirical time measurements compared to other methods. All tests are performed on a Intel Core \mbox{i5-7600K} CPU with a frequency of \SI{4.2}{\giga\hertz}, and \SI{16}{\giga\byte} main memory. -We compare our C++ implementation of the box filter based KDE approximation based on algorithm~\ref{alg:boxKDE} to the \texttt{ks} R package and the fastKDE Python implementation \cite{oBrien2016fast}. -The \texttt{ks} packages provides a FFT-based BKDE implementation based on optimized C functions at its core. -With state estimation problems in mind, we additionally provide a C++ implementation of a weighted average estimator. -As both methods are not using a grid, an equivalent input sample set was used for the weighted average and the fastKDE. +We compare our C++ implementation of the boxKDE approximation as shown in algorithm~\ref{alg:boxKDE} to the \texttt{ks} R package and the fastKDE Python implementation \cite{oBrien2016fast}. +The \texttt{ks} package provides a FFT-based BKDE implementation based on optimized C functions at its core. +With state estimation problems in mind, we additionally provide a C++ implementation of a weighted-average estimator. +As both methods are not using a grid, an equivalent input sample set was used for the weighted-average and the fastKDE. -The results for performance comparison are presented in plot \ref{fig:performance}. +The results for performance comparison are presented in fig.~\ref{fig:performance}. % O(N) gut erkennbar für box KDE und weighted average The linear complexity of the boxKDE and the weighted average is clearly visible. % Gerade bei kleinen G bis 10^3 ist die box KDE schneller als R und fastKDE, aber das WA deutlich schneller als alle anderen Especially for small $G$ up to $10^3$ the boxKDE is much faster compared to BKDE and fastKDE. % Bei zunehmend größeren G wird der Abstand zwischen box KDE und WA größer. -Nevertheless, the simple weighted average approach performs the fastest and with increasing $G$ the distance to the boxKDE grows constantly. +Nevertheless, the simple weighted-average approach performs the fastest and with increasing $G$ the distance to the boxKDE grows constantly. However, it is obvious that this comes with major disadvantages, like being prone to multimodalities, as discussed in section \ref{sec:intro}. % (Das kann auch daran liegen, weil das Binning mit größeren G langsamer wird, was ich mir aber nicht erklären kann! Vlt Cache Effekte) @@ -74,7 +74,7 @@ Further looking at fig. \ref{fig:performance}, the runtime performance of the BK % Dies kommt durch die FFT. Der Input in für die FFT muss immer auf die nächste power of two gerundet werden. This behavior is caused by the underlying FFT algorithm. 
% Therefore the runtime slows down abruptly whenever the input is padded to a new power of two; otherwise it stays constant.
-The FFT approach requires the input to be always rounded up to a power of two, what then causes a constant runtime behavior within those boundaries and a strong performance deterioration at corresponding manifolds.
+The FFT approach requires the input to always be rounded up to the next power of two, which causes a constant runtime behaviour within those boundaries and a strong performance deterioration whenever a new power of two is reached.
% The curve stops at G=4406^2 because larger values of G trigger an out of memory error.
 The termination of the BKDE graph at $G=4406^2$ is caused by an out-of-memory error for even larger $G$ in the \texttt{ks} package.
@@ -85,10 +85,10 @@ Both discussed Gaussian filter approximations, namely box filter and extended bo
 While the average runtime over all values of $G$ for the standard box filter is \SI{0.4092}{\second}, the extended one provides an average of \SI{0.4169}{\second}.
 To keep the arrangement of fig. \ref{fig:performance} clear, we only illustrated the results of the boxKDE with the regular box filter.
-The weighted average has the great advantage of being independent of the dimensionality of the input and effortlessly implemented.
+The weighted-average has the great advantage of being independent of the dimensionality of the input and can be implemented effortlessly.
 In contrast, the computation of the boxKDE approach increases exponentially with increasing number of dimensions.
 However, due to the linear time complexity and the very simple computation scheme, the overall computation time is still sufficient fast for many applications and much smaller compared to other methods.
-The boxKDE approach presents a reasonable alternative to the weighted average and is easily integrated into existing systems.
+The boxKDE approach presents a reasonable alternative to the weighted-average and is easily integrated into existing systems.
 In addition, modern CPUs do benefit from the recursive computation scheme of the box filter, as the data exhibits a high degree of spatial locality in memory and the accesses are reliably predictable.
 Furthermore, the computation is easily parallelized, as there is no data dependency between the one-dimensional filter passes in algorithm~\ref{alg:boxKDE}.
diff --git a/tex/chapters/introduction.tex b/tex/chapters/introduction.tex
index 181b574..f915131 100644
--- a/tex/chapters/introduction.tex
+++ b/tex/chapters/introduction.tex
@@ -4,8 +4,7 @@
 Sensor fusion approaches are often based upon probabilistic descriptions like particle filters, using samples to represent the distribution of a dynamical system.
 To update the system recursively in time, probabilistic sensor models process the noisy measurements and a state transition function provides the system's dynamics.
 Therefore a sample or particle is a representation of one possible system state, e.g. the position of a pedestrian within a building.
-In most real world scenarios one is then interested in finding the most probable state within the state space, to provide the \qq{best estimate} of the underlying problem.
-Generally speaking, solving the state estimation problem.
+In most real world scenarios one is then interested in finding the most probable state within the state space, to provide the best estimate of the underlying problem, generally speaking, solving the state estimation problem.
In the discrete manner of a sample representation this is often done by providing a single value, also known as sample statistic, to serve as a \qq{best guess}. This value is then calculated by means of simple parametric point estimators, e.g. the weighted-average over all samples, the sample with the highest weight or by assuming other parametric statistics like normal distributions \cite{Fetzer2016OMC}. %da muss es doch noch andere methoden geben... verflixt und zugenäht... aber grundsätzlich ist ein weighted average doch ein point estimator? (https://www.statlect.com/fundamentals-of-statistics/point-estimation) @@ -17,9 +16,9 @@ As a result, those techniques are not able to provide an accurate statement abou For example, in a localization scenario where a bimodal distribution represents the current posterior, a reliable position estimation is more likely to be at one of the modes, instead of somewhere in-between, like provided by a simple weighted-average estimation. Additionally, in most practical scenarios the sample size and therefore the resolution is limited, causing the variance of the sample based estimate to be high \cite{Verma2003}. -It is obvious, that a computation of the full posterior could solve the above, but finding such an analytical solution is an intractable problem, what is the reason for applying a sample representation in the first place. +It is obvious, that a computation of the full posterior could solve the above, but finding such an analytical solution is an intractable problem, which is the reason for applying a sample representation in the first place. Another promising way is to recover the probability density function from the sample set itself, by using a non-parametric estimator like a kernel density estimation (KDE). -With this, it is easy to find the \qq{real} most probable state and thus to avoid the aforementioned drawbacks. +With this, it is easy to recover the \qq{real} most probable state and thus to avoid the aforementioned drawbacks. However, non-parametric estimators tend to consume a large amount of computational time, which renders them unpractical for real time scenarios. Nevertheless, the availability of a fast processing density estimate might improve the accuracy of today's sensor fusion systems without sacrificing their real time capability. @@ -34,7 +33,7 @@ By the central limit theorem, multiple recursion of a box filter yields an appro This process converges quite fast to a reasonable close approximation of the ideal Gaussian. In addition, a box filter can be computed extremely fast by a computer, due to its intrinsic simplicity. While the idea to use several box filter passes to approximate a Gaussian has been around for a long time, the application to obtain a fast KDE is new. -Especially in time critical and time sequential sensor fusion scenarios, the here presented approach outperforms other state of the art solutions, due to a fully linear complexity \landau{N} and a negligible overhead, even for small sample sets. +Especially in time critical and time sequential sensor fusion scenarios, the here presented approach outperforms other state of the art solutions, due to a fully linear complexity and a negligible overhead, even for small sample sets. In addition, it requires only a few elementary operations and is highly parallelizable. 
diff --git a/tex/chapters/kde.tex b/tex/chapters/kde.tex index 5e49921..00f035b 100644 --- a/tex/chapters/kde.tex +++ b/tex/chapters/kde.tex @@ -1,4 +1,4 @@ -\section{Kernel Density Estimation} +\section{Kernel Density Estimator} % KDE by rosenblatt and parzen % general KDE % Gauss Kernel @@ -11,17 +11,17 @@ %The histogram is a simple and for a long time the most used non-parametric estimator. %However, its inability to produce a continuous estimate dismisses it for many applications where a smooth distribution is assumed. %In contrast, -The KDE is often the preferred tool to estimate a density function from discrete data samples because of its ability to produce a continuous estimate and its flexibility. +The KDE is often the preferred tool to estimate a density function from discrete data samples because of its flexibility and ability to produce a continuous estimate. % Given a univariate random sample set $X=\{X_1, \dots, X_N\}$, where $X$ has the density function $f$ and let $w_1, \dots w_N$ be associated weights. The kernel estimator $\hat{f}$ which estimates $f$ at the point $x$ is given as \begin{equation} \label{eq:kde} -\hat{f}(x) = \frac{1}{W} \sum_{i=1}^{N} \frac{w_i}{h} K \left(\frac{x-X_i}{h}\right) +\hat{f}(x) = \frac{1}{W} \sum_{i=1}^{N} \frac{w_i}{h} K \left(\frac{x-X_i}{h}\right) \text{,} \end{equation} where $W=\sum_{i=1}^{N}w_i$ and $h\in\R^+$ is an arbitrary smoothing parameter called bandwidth. $K$ is a kernel function such that $\int K(u) \dop{u} = 1$. -In general any kernel can be used, however the general advice is to chose a symmetric and low-order polynomial kernel. +In general, any kernel can be used, however a common advice is to chose a symmetric and low-order polynomial kernel. Thus, several popular kernel functions are used in practice, like the Uniform, Gaussian, Epanechnikov, or Silverman kernel \cite{scott2015}. While the kernel estimate inherits all the properties of the kernel, usually it is not of crucial matter if a non-optimal kernel was chosen. @@ -51,25 +51,25 @@ K_G(u)=\frac{1}{\sqrt{2\pi}} \expp{- \frac{u^2}{2} } \text{.} \end{equation} The flexibility of the KDE comes at the expense of computational efficiency, which leads to the development of more efficient computation schemes. -The computation time depends, besides the number of calculated points, on the number of data points $N$. +The computation time depends, besides the number of calculated points $M$, on the input size, namely the number of data points $N$. In general, reducing the size of the sample negatively affects the accuracy of the estimate. -Still, the sample size is a suitable parameter to speedup the computation. +Still, the sample size is a suitable parameter to speed up the computation. Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE. -Each bin represents the count of the sample set at a given point of a equidistant grid with spacing $\delta$. -A binning rule distributes a sample $x$ among the grid points $g_j=j\delta$, indexed by $j\in\Z$. +Each bin represents the count of the sample set at a given point of an equidistant grid with spacing $\delta$. +A binning rule distributes a sample among the grid points $g_j=j\delta$, indexed by $j\in\Z$. % and can be represented as a set of functions $\{ w_j(x,\delta), j\in\Z \}$. Computation requires a finite grid on the interval $[a,b]$ containing the data, thus the number of grid points is $G=(b-a)/\delta+1$. 
 Given a binning rule $r_j$ the BKDE $\tilde{f}$ of a density $f$ computed pointwise at the grid point $g_x$ is given as
 \begin{equation}
 \label{eq:binKde}
-\tilde{f}(g_x) = \frac{1}{W} \sum_{j=1}^{G} \frac{C_j}{h} K \left(\frac{g_x-g_j}{h}\right)
+\tilde{f}(g_x) = \frac{1}{W} \sum_{j=1}^{G} \frac{C_j}{h} K \left(\frac{g_x-g_j}{h}\right) \text{,}
 \end{equation}
 where $G$ is the number of grid points and
 \begin{equation}
 \label{eq:gridCnts}
- C_j=\sum_{i=1}^{n} r_j(x_i,\delta)
+ C_j=\sum_{i=1}^{N} r_j(x_i,\delta)
 \end{equation}
 is the count at grid point $g_j$, such that $\sum_{j=1}^{G} C_j = W$ \cite{hall1996accuracy}.
@@ -83,7 +83,7 @@ However, for many applications it is recommend to use the simple binning rule
 0 & \text{else}
 \end{cases}
 \end{align}
-or the common linear binning rule which divides the sample into two fractional weights shared by the nearest grid points
+or the common linear binning rule, which divides the sample into two fractional weights shared by the nearest grid points
 \begin{align}
 \label{eq:linearBinning}
 r_j(x,\delta) &=
@@ -94,32 +94,32 @@ or the common linear binning rule which divides the sample into two fractional w
 \end{align}
 An advantage is that their impact on the approximation error is extensively investigated and well understood \cite{hall1996accuracy}.
 Both methods can be computed with a fast $\landau{N}$ algorithm, as simple binning is essentially the quotient of an integer division and the fractional weights of the linear binning are given by the remainder of the division.
-As linear binning is more precise it is often preferred over simple binning \cite{fan1994fast}.
+As linear binning is more precise, it is often preferred over simple binning \cite{fan1994fast}.
-While linear binning improves the accuracy of the estimate the choice of the grid size is of more importance.
+While linear binning improves the accuracy of the estimate, the choice of the grid size is of more importance.
 The number of grid points $G$ determines the trade-off between the approximation error caused by the binning and the computational speed of the algorithm.
-Clearly, a large value of $G$ produces a estimate close to the regular KDE, but requires more evaluations of the kernel compared to a coarser grid.
+Clearly, a large value of $G$ produces an estimate close to the regular KDE, but requires more evaluations of the kernel compared to a coarser grid.
 However, it is unknown what particular $G$ gives the best trade-off for any given sample set.
 In general, there is no definite answer because the amount of binning depends on the structure of the unknown density and the sample size \cite{hall1996accuracy}.
 A naive implementation of \eqref{eq:binKde} reduces the number of kernel evaluations to $\landau{G^2}$, assuming that $G<N$.
-Kernel density estimation is well known non-parametric estimator, originally described independently by Rosenblatt \cite{rosenblatt1956remarks} and Parzen \cite{parzen1962estimation}.
+The kernel density estimator is a well-known non-parametric estimator, originally described independently by Rosenblatt \cite{rosenblatt1956remarks} and Parzen \cite{parzen1962estimation}.
 It was subject to extensive research and its theoretical properties are well understood.
 A comprehensive reference is given by Scott \cite{scott2015}.
 Although classified as non-parametric, the KDE depends on two free parameters, the kernel function and its bandwidth.
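As an illustration of the binning step described in the kde.tex hunk above, the following C++ sketch bins a weighted univariate sample onto an equidistant grid with the linear rule, splitting each sample between its two neighbouring grid points via the integer part and the remainder of its grid coordinate. It is only a sketch under simplified assumptions (samples outside the grid are clamped to the boundary bins); the function name and the flat std::vector layout are illustrative choices and are not taken from the paper's implementation.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of linear binning: every weighted sample is split into two fractional
// weights that are added to its two neighbouring grid points g_j = a + j*delta.
std::vector<double> linearBinning(const std::vector<double>& x,
                                  const std::vector<double>& w,
                                  double a, double delta, std::size_t G)
{
    std::vector<double> counts(G, 0.0);
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double pos  = (x[i] - a) / delta;   // continuous grid coordinate
        const double left = std::floor(pos);      // index of the left neighbour
        const double frac = pos - left;           // remainder -> fractional weight
        long jl = static_cast<long>(left);
        long jr = jl + 1;
        jl = std::max(0L, std::min(jl, static_cast<long>(G) - 1));
        jr = std::max(0L, std::min(jr, static_cast<long>(G) - 1));
        counts[jl] += w[i] * (1.0 - frac);        // share for the left grid point
        counts[jr] += w[i] * frac;                // share for the right grid point
    }
    return counts;
}

The simple binning rule corresponds to assigning the full weight to the nearest grid point instead of splitting it.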
@@ -24,7 +24,7 @@ Various methods have been proposed, which can be clustered based on different te % k-nearest neighbor searching An obvious way to speed up the computation is to reduce the number of evaluated kernel functions. -One possible optimization is based on k-nearest neighbour search performed on spatial data structures. +One possible optimization is based on k-nearest neighbour search, performed on spatial data structures. These algorithms reduce the number of evaluated kernels by taking the distance between clusters of data points into account \cite{gray2003nonparametric}. % fast multipole method & Fast Gaus Transform @@ -38,16 +38,16 @@ They define a Fourier-based filter on the empirical characteristic function of a The computation time was further reduced by \etal{O'Brien} using a non-uniform fast Fourier transform (FFT) algorithm to efficiently transform the data into Fourier space \cite{oBrien2016fast}. % binning => FFT -In general, it is desirable to omit a grid, as the data points do not necessary fall onto equally spaced points. -However, reducing the sample size by distributing the data on a equidistant grid can significantly reduce the computation time, if an approximative KDE is acceptable. +In general, it is desirable to omit a grid, as the data points do not necessarily fall onto equally spaced points. +However, reducing the sample size by distributing the data on an equidistant grid can significantly reduce the computation time, if an approximative KDE is acceptable. Silverman \cite{silverman1982algorithm} originally suggested to combine adjacent data points into data bins, which results in a discrete convolution structure of the KDE. Allowing to efficiently compute the estimate using a FFT algorithm. This approximation scheme was later called binned KDE (BKDE) and was extensively studied \cite{fan1994fast} \cite{wand1994fast} \cite{hall1996accuracy}. -While the FFT algorithm poses an efficient algorithm for large sample sets, it adds an noticeable overhead for smaller ones. +While the FFT algorithm constitutes an efficient algorithm for large sample sets, it adds an noticeable overhead for smaller ones. The idea to approximate a Gaussian filter using several box filters was first formulated by Wells \cite{wells1986efficient}. Kovesi \cite{kovesi2010fast} suggested to use two box filters with different widths to increase accuracy maintaining the same complexity. -To eliminate the approximation error completely \etal{Gwosdek} \cite{gwosdek2011theoretical} proposed a new approach called extended box filter. +To eliminate the approximation error completely, \etal{Gwosdek} \cite{gwosdek2011theoretical} proposed a new approach called extended box filter. This work highlights the discrete convolution structure of the BKDE and elaborates its connection to digital signal processing, especially the Gaussian filter. Accordingly, this results in an equivalence relation between BKDE and Gaussian filter. 
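As background to the box filter references above, the following short derivation explains where the width rule comes from; it is standard box filter material added only as a reading aid and is not taken from the patched sources. A discrete box of odd width $L$ has variance $(L^2-1)/12$, and the variances of $n$ successive passes add up, so matching a target standard deviation $\sigma$ leads to the width rule commonly attributed to Wells:
\begin{equation*}
n \, \frac{L^2 - 1}{12} \approx \sigma^2
\quad\Rightarrow\quad
L \approx \sqrt{\frac{12\,\sigma^2}{n} + 1} \text{.}
\end{equation*}
In practice $L$ has to be rounded to an integer width, and this rounding causes the staircase behaviour that the extended box filter of \etal{Gwosdek} avoids.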
From c224967b1961fb88c2eafac225781918114112f4 Mon Sep 17 00:00:00 2001 From: MBulli Date: Mon, 12 Mar 2018 21:03:15 +0100 Subject: [PATCH 02/11] Fixed FD --- tex/chapters/introduction.tex | 2 +- tex/chapters/realworld.tex | 2 +- tex/egbib.bib | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/tex/chapters/introduction.tex b/tex/chapters/introduction.tex index f915131..af59233 100644 --- a/tex/chapters/introduction.tex +++ b/tex/chapters/introduction.tex @@ -18,7 +18,7 @@ Additionally, in most practical scenarios the sample size and therefore the reso It is obvious, that a computation of the full posterior could solve the above, but finding such an analytical solution is an intractable problem, which is the reason for applying a sample representation in the first place. Another promising way is to recover the probability density function from the sample set itself, by using a non-parametric estimator like a kernel density estimation (KDE). -With this, it is easy to recover the \qq{real} most probable state and thus to avoid the aforementioned drawbacks. +With this, the \qq{real} most probable state is given by the maxima of the density estimation and thus avoids the aforementioned drawbacks. However, non-parametric estimators tend to consume a large amount of computational time, which renders them unpractical for real time scenarios. Nevertheless, the availability of a fast processing density estimate might improve the accuracy of today's sensor fusion systems without sacrificing their real time capability. diff --git a/tex/chapters/realworld.tex b/tex/chapters/realworld.tex index fd87bf5..33595ca 100644 --- a/tex/chapters/realworld.tex +++ b/tex/chapters/realworld.tex @@ -17,7 +17,7 @@ The bivariate state estimation was calculated whenever a step was recognized, ab \begin{figure} \input{gfx/walk.tex} - \caption{Occurring bimodal distribution at the start of the walk, caused by uncertain measurements. After \SI{20.8}{\second}, the distribution gets unimodal. The weigted-average estimation (blue) provides an high error compared to the ground truth (solid black), while the boxKDE approach (orange) does not. } + \caption{Occurring bimodal distribution caused by uncertain measurements in the first \SI{13.4}{\second} of the walk. After \SI{20.8}{\second}, the distribution gets unimodal. The weigted-average estimation (blue) provides an high error compared to the ground truth (solid black), while the boxKDE approach (orange) does not. 
} \label{fig:realWorldMulti} \end{figure} % diff --git a/tex/egbib.bib b/tex/egbib.bib index d66a6a1..2c514ec 100644 --- a/tex/egbib.bib +++ b/tex/egbib.bib @@ -3017,7 +3017,7 @@ year = {2003} @article{oBrien2016fast, title={A fast and objective multidimensional kernel density estimation method: fastKDE}, - author={O’Brien, Travis A and Kashinath, Karthik and Cavanaugh, Nicholas R and Collins, William D and O’Brien, John P}, + author={O'Brien, Travis A and Kashinath, Karthik and Cavanaugh, Nicholas R and Collins, William D and O'Brien, John P}, journal={Computational Statistics \& Data Analysis}, volume={101}, pages={148--160}, From 316b1d2911f4a34363ab0442168778dc2685a884 Mon Sep 17 00:00:00 2001 From: MBulli Date: Mon, 12 Mar 2018 22:21:39 +0100 Subject: [PATCH 03/11] Fixed FE 1 --- tex/bare_conf.tex | 2 ++ tex/chapters/abstract.tex | 6 +++--- tex/chapters/conclusion.tex | 4 ++-- tex/chapters/experiments.tex | 38 +++++++++++++++++------------------ tex/chapters/introduction.tex | 8 ++++---- tex/chapters/kde.tex | 19 +++++++++--------- tex/chapters/multivariate.tex | 14 ++++++------- tex/chapters/mvg.tex | 10 ++++----- tex/chapters/realworld.tex | 23 +++++++++++---------- tex/chapters/relatedwork.tex | 4 ++-- tex/chapters/usage.tex | 20 +++++++++--------- 11 files changed, 76 insertions(+), 72 deletions(-) diff --git a/tex/bare_conf.tex b/tex/bare_conf.tex index e6d92a5..ce66156 100644 --- a/tex/bare_conf.tex +++ b/tex/bare_conf.tex @@ -119,6 +119,8 @@ \newcommand{\VecTwo}[2]{\ensuremath{\left[\begin{smallmatrix} #1 \\ #2 \end{smallmatrix}\right] }} \newcommand{\qq} [1]{``#1''} +\newcommand{\eg} {e.\,g.} +\newcommand{\ie} {i.\,e.} % missing math operators \DeclareMathOperator*{\argmin}{arg\,min} diff --git a/tex/chapters/abstract.tex b/tex/chapters/abstract.tex index 3f741fb..c8de6ed 100644 --- a/tex/chapters/abstract.tex +++ b/tex/chapters/abstract.tex @@ -1,14 +1,14 @@ \begin{abstract} It is common practice to use a sample-based representation to solve problems having a probabilistic interpretation. -In many real world scenarios one is then interested in finding a \qq{best estimate} of the underlying problem, e.g. the position of a robot. +In many real world scenarios one is then interested in finding a \qq{best estimate} of the underlying problem, \eg{} the position of a robot. This is often done by means of simple parametric point estimators, providing the sample statistics. However, in complex scenarios this frequently results in a poor representation, due to multimodal densities and limited sample sizes. -Recovering the probability density function using a kernel density estimation yields a promising approach to solve the state estimation problem i.e. finding the \qq{real} most probable state, but comes with high computational costs. +Recovering the probability density function using a kernel density estimation yields a promising approach to solve the state estimation problem \ie{} finding the \qq{real} most probable state, but comes with high computational costs. Especially in time critical and time sequential scenarios, this turns out to be impractical. Therefore, this work uses techniques from digital signal processing in the context of estimation theory, to allow rapid computations of kernel density estimates. The gains in computational efficiency are realized by substituting the Gaussian filter with an approximate filter based on the box filter. 
 Our approach outperforms other state of the art solutions, due to a fully linear complexity and a negligible overhead, even for small sample sets.
-Finally, our findings are tried and tested within a real world sensor fusion system.
+Finally, our findings are evaluated and tested within a real world sensor fusion system.
 \end{abstract}
diff --git a/tex/chapters/conclusion.tex b/tex/chapters/conclusion.tex
index e80aaf7..a486cea 100644
--- a/tex/chapters/conclusion.tex
+++ b/tex/chapters/conclusion.tex
@@ -1,10 +1,10 @@
 \section{Conclusion}
-Within this paper a novel approach for rapid computation of the KDE was presented.
+Within this paper a novel approach for rapid approximation of the KDE was presented.
 This is achieved by considering the discrete convolution structure of the BKDE and thus elaborating its connection to digital signal processing, especially the Gaussian filter.
 Using a box filter as an appropriate approximation results in an efficient computation scheme with a fully linear complexity and a negligible overhead, as confirmed by the conducted experiments.
-The analysis of the error showed that the method exhibits an expected error behaviour compared to the BKDE.
+The analysis of the error showed that the method exhibits a similar error behaviour to that of the BKDE.
 In terms of calculation time, our approach outperforms other state of the art implementations.
 Despite being more efficient than other methods, the algorithmic complexity still increases in its exponent with increasing number of dimensions.
diff --git a/tex/chapters/experiments.tex b/tex/chapters/experiments.tex
index 187282e..6d3a5d0 100644
--- a/tex/chapters/experiments.tex
+++ b/tex/chapters/experiments.tex
@@ -3,7 +3,7 @@
 \subsection{Mean Integrated Squared Error}
-We now empirically evaluate the accuracy of our boxKDE method, using the mean integrated squared error (MISE).
+We now empirically evaluate the accuracy of our BoxKDE method, using the mean integrated squared error (MISE).
 The ground truth is given with $N=1000$ synthetic samples drawn from a bivariate mixture normal density $f$
 \begin{equation}
 \begin{split}
 \bm{X} \sim & ~\G{\VecTwo{0}{0}}{0.5\bm{I}} + \G{\VecTwo{3}{0}}{\bm{I}} + \G{\VecTwo{0}{3}}{\bm{I}} \\
@@ -17,20 +17,20 @@ Therefore, the particular choice of the ground truth is only of minor importance
 \begin{figure}[t]
 	\input{gfx/error.tex}
-	\caption{MISE relative to the ground truth as a function of $h$. While the error curves of the BKDE (red) and the boxKDE based on the extended box filter (orange dotted line) resemble the overall course of the error of the exact KDE (green), the regular boxKDE (orange) exhibits noticeable jumps to rounding.} \label{fig:errorBandwidth}
+	\caption{MISE relative to the ground truth as a function of $h$. While the error curves of the BKDE (red) and the BoxKDE based on the extended box filter (orange dotted line) resemble the overall course of the error of the exact KDE (green), the regular BoxKDE (orange) exhibits noticeable jumps due to rounding.} \label{fig:errorBandwidth}
 \end{figure}
-Evaluated at $50^2$ points the exact KDE is compared to the BKDE, boxKDE, and extended box filter approximation, which are evaluated at a smaller grid with $30^2$ points.
+Evaluated at $50^2$ points the exact KDE is compared to the BKDE, BoxKDE, and extended box filter approximation, which are evaluated at a smaller grid with $30^2$ points.
 The MISE between $f$ and the estimates as a function of $h$ is evaluated, and the resulting plot is given in fig.~\ref{fig:errorBandwidth}.
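For readers who want to reproduce such error curves, the integrated squared error on a common evaluation grid can be approximated as in the C++ sketch below; the evaluation code itself is not part of this patch, and the function name and flat grid layout are illustrative assumptions.

#include <cstddef>
#include <vector>

// Sketch: Riemann-sum approximation of the integrated squared error between a
// reference density f and an estimate fhat, both evaluated on the same grid.
// cellArea is the product of the grid spacings (e.g. delta_x * delta_y in 2-D).
double integratedSquaredError(const std::vector<double>& f,
                              const std::vector<double>& fhat,
                              double cellArea)
{
    double ise = 0.0;
    for (std::size_t i = 0; i < f.size(); ++i) {
        const double d = fhat[i] - f[i];
        ise += d * d;
    }
    return ise * cellArea;
}

The MISE is, by definition, the expectation of this quantity over repeated sample draws from the ground truth density.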
 A minimum error is obtained at $h=0.35$; for larger values oversmoothing occurs and the modes gradually fuse together.
 Both the BKDE and the extended box filter estimate resemble the error curve of the KDE quite well and remain stable.
 They are rather close to each other, with a tendency to diverge for larger $h$.
-In contrast, the error curve of the boxKDE has noticeable jumps at $h=(0.4; 0.252; 0.675; 0.825)$.
+In contrast, the error curve of the BoxKDE has noticeable jumps at $h=(0.4; 0.252; 0.675; 0.825)$.
 These jumps are caused by the rounding of the integer-valued box width given by \eqref{eq:boxidealwidth}.
-As the extend box filter is able to approximate an exact $\sigma$, it lacks these discontinues.
-Consequently, it reduces the overall error of the approximation, but only marginal in this scenario.
+As the extended box filter is able to approximate an exact $\sigma$, these discontinuities do not appear.
+Consequently, it reduces the overall error of the approximation, but only marginally in this scenario.
 The global average MISE over all values of $h$ is $0.0049$ for the regular box filter and $0.0047$ in case of the extended version.
 Likewise, the maximum MISE is $0.0093$ and $0.0091$, respectively.
 The choice between the extended and regular box filter algorithm depends on how large the acceptable error should be, thus on the particular application.
@@ -42,7 +42,7 @@ However, both cases do not give a deeper insight of the error behavior of our me
 \begin{figure}[t]
 	%\includegraphics[width=\textwidth,height=6cm]{gfx/tmpPerformance.png}
 	\input{gfx/perf.tex}
-	\caption{Logarithmic plot of the runtime performance with increasing grid size $G$ and bivariate data. The weighted-average estimate (blue) performs fastest followed by the boxKDE (orange) approximation. Both the BKDE (red) and the fastKDE (green) are magnitudes slower, especially for $G<10^3$.}\label{fig:performance}
+	\caption{Logarithmic plot of the runtime performance with increasing grid size $G$ and bivariate data. The weighted-average estimate (blue) performs fastest followed by the BoxKDE (orange) approximation. Both the BKDE (red) and the FastKDE (green) are magnitudes slower, especially for $G<10^3$.}\label{fig:performance}
 \end{figure}
 % kde, box filter, exbox as a function of h (figure)
@@ -53,18 +53,18 @@
 \subsection{Performance}
 In the following, we underpin the promising theoretical linear time complexity of our method with empirical time measurements compared to other methods.
 All tests are performed on an Intel Core \mbox{i5-7600K} CPU with a frequency of \SI{4.2}{\giga\hertz} and \SI{16}{\giga\byte} main memory.
-We compare our C++ implementation of the boxKDE approximation as shown in algorithm~\ref{alg:boxKDE} to the \texttt{ks} R package and the fastKDE Python implementation \cite{oBrien2016fast}.
+We compare our C++ implementation of the BoxKDE approximation as shown in algorithm~\ref{alg:boxKDE} to the \texttt{ks} R package and the FastKDE Python implementation \cite{oBrien2016fast}.
 The \texttt{ks} package provides a FFT-based BKDE implementation based on optimized C functions at its core.
 With state estimation problems in mind, we additionally provide a C++ implementation of a weighted-average estimator.
-As both methods are not using a grid, an equivalent input sample set was used for the weighted-average and the fastKDE.
+As both methods are not using a grid, an equivalent input sample set was used for the weighted-average and the FastKDE. -The results for performance comparison are presented in fig.~\ref{fig:performance}. +The results of the performance comparison are presented in fig.~\ref{fig:performance}. % O(N) gut erkennbar für box KDE und weighted average -The linear complexity of the boxKDE and the weighted average is clearly visible. -% Gerade bei kleinen G bis 10^3 ist die box KDE schneller als R und fastKDE, aber das WA deutlich schneller als alle anderen -Especially for small $G$ up to $10^3$ the boxKDE is much faster compared to BKDE and fastKDE. +The linear complexity of the BoxKDE and the weighted average is clearly visible. +% Gerade bei kleinen G bis 10^3 ist die box KDE schneller als R und FastKDE, aber das WA deutlich schneller als alle anderen +Especially for small $G$ up to $10^3$ the BoxKDE is much faster compared to BKDE and FastKDE. % Bei zunehmend größeren G wird der Abstand zwischen box KDE und WA größer. -Nevertheless, the simple weighted-average approach performs the fastest and with increasing $G$ the distance to the boxKDE grows constantly. +Nevertheless, the simple weighted-average approach performs the fastest and with increasing $G$ the distance to the BoxKDE grows constantly. However, it is obvious that this comes with major disadvantages, like being prone to multimodalities, as discussed in section \ref{sec:intro}. % (Das kann auch daran liegen, weil das Binning mit größeren G langsamer wird, was ich mir aber nicht erklären kann! Vlt Cache Effekte) @@ -82,13 +82,13 @@ The termination of BKDE graph at $G=4406^2$ is caused by an out of memory error % Sowohl der box filter als auch der extended box filter haben ein sehr ähnliches Laufzeit Verhalten und somit einen sehr ähnlichen Kurvenverlauf. % Während die durschnittliche Laufzeit über alle Werte von G beim box filter bei 0.4092s liegt, benötigte der extended box filter im Durschnitt 0.4169s. Both discussed Gaussian filter approximations, namely box filter and extended box filter, yield a similar runtime behavior and therefore a similar curve progression. -While the average runtime over all values of $G$ for the standard box filter is \SI{0.4092}{\second}, the extended one provides an average of \SI{0.4169}{\second}. -To keep the arrangement of fig. \ref{fig:performance} clear, we only illustrated the results of the boxKDE with the regular box filter. +While the average runtime over all values of $G$ for the standard box filter is \SI{0.4092}{\second}, the extended one has an average of \SI{0.4169}{\second}. +To keep the arrangement of fig. \ref{fig:performance} clear, we only illustrated the results of the BoxKDE with the regular box filter. The weighted-average has the great advantage of being independent of the dimensionality of the input and can be implemented effortlessly. -In contrast, the computation of the boxKDE approach increases exponentially with increasing number of dimensions. -However, due to the linear time complexity and the very simple computation scheme, the overall computation time is still sufficient fast for many applications and much smaller compared to other methods. -The boxKDE approach presents a reasonable alternative to the weighted-average and is easily integrated into existing systems. +In contrast, the computation of the BoxKDE approach increases exponentially with increasing number of dimensions. 
+However, due to the linear time complexity and the very simple computation scheme, the overall computation time is still sufficiently fast for many applications and much smaller compared to other methods. +The BoxKDE approach presents a reasonable alternative to the weighted-average and is easily integrated into existing systems. In addition, modern CPUs do benefit from the recursive computation scheme of the box filter, as the data exhibits a high degree of spatial locality in memory and the accesses are reliable predictable. Furthermore, the computation is easily parallelized, as there is no data dependency between the one-dimensional filter passes in algorithm~\ref{alg:boxKDE}. diff --git a/tex/chapters/introduction.tex b/tex/chapters/introduction.tex index af59233..0e28cea 100644 --- a/tex/chapters/introduction.tex +++ b/tex/chapters/introduction.tex @@ -3,10 +3,10 @@ Sensor fusion approaches are often based upon probabilistic descriptions like particle filters, using samples to represent the distribution of a dynamical system. To update the system recursively in time, probabilistic sensor models process the noisy measurements and a state transition function provides the system's dynamics. -Therefore a sample or particle is a representation of one possible system state, e.g. the position of a pedestrian within a building. +Therefore a sample or particle is a representation of one possible system state, \eg{} the position of a pedestrian within a building. In most real world scenarios one is then interested in finding the most probable state within the state space, to provide the best estimate of the underlying problem, generally speaking, solving the state estimation problem. In the discrete manner of a sample representation this is often done by providing a single value, also known as sample statistic, to serve as a \qq{best guess}. -This value is then calculated by means of simple parametric point estimators, e.g. the weighted-average over all samples, the sample with the highest weight or by assuming other parametric statistics like normal distributions \cite{Fetzer2016OMC}. +This value is then calculated by means of simple parametric point estimators, \eg{} the weighted-average over all samples, the sample with the highest weight or by assuming other parametric statistics like normal distributions \cite{Fetzer2016OMC}. %da muss es doch noch andere methoden geben... verflixt und zugenäht... aber grundsätzlich ist ein weighted average doch ein point estimator? (https://www.statlect.com/fundamentals-of-statistics/point-estimation) %Für related work brauchen wir hier definitiv quellen. einige berechnen ja auch https://en.wikipedia.org/wiki/Sample_mean_and_covariance oder nehmen eine gewisse verteilung für die sample menge and und berechnen dort die parameter @@ -19,14 +19,14 @@ Additionally, in most practical scenarios the sample size and therefore the reso It is obvious, that a computation of the full posterior could solve the above, but finding such an analytical solution is an intractable problem, which is the reason for applying a sample representation in the first place. Another promising way is to recover the probability density function from the sample set itself, by using a non-parametric estimator like a kernel density estimation (KDE). With this, the \qq{real} most probable state is given by the maxima of the density estimation and thus avoids the aforementioned drawbacks. 
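To make the two point estimates contrasted here concrete, a minimal univariate C++ sketch is given below. It is purely illustrative (the localization system discussed later is bivariate and uses thousands of particles), and none of the names stem from the actual implementation.

#include <cstddef>
#include <vector>

// Classic weighted-average point estimate computed directly from the samples.
double weightedAverage(const std::vector<double>& x, const std::vector<double>& w)
{
    double acc = 0.0, wSum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        acc  += w[i] * x[i];
        wSum += w[i];
    }
    return acc / wSum;
}

// "Most probable state" read off a gridded density estimate: the position of the
// largest grid value, mapped back from grid index to state space.
double mostProbableState(const std::vector<double>& density, double a, double delta)
{
    std::size_t best = 0;
    for (std::size_t j = 1; j < density.size(); ++j)
        if (density[j] > density[best]) best = j;
    return a + static_cast<double>(best) * delta;
}

With a bimodal density the first estimator can land between the modes, while the second one stays on the dominant mode, which is exactly the effect discussed in the introduction.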
-However, non-parametric estimators tend to consume a large amount of computational time, which renders them unpractical for real time scenarios. +However, non-parametric estimators tend to consume a large amount of computation time, which renders them unpractical for real time scenarios. Nevertheless, the availability of a fast processing density estimate might improve the accuracy of today's sensor fusion systems without sacrificing their real time capability. %Therefore, this paper presents a novel approximation approach for rapid computation of the KDE. %In this paper, a well known approximation of the Gaussian filter is used to speed up the computation of the KDE. In this paper, a novel approximation approach for rapid computation of the KDE is presented. The basic idea is to interpret the estimation problem as a filtering operation. -We show that computing the KDE with a Gaussian kernel on pre-binned data is equal to applying a Gaussian filter on the binned data. +We show that computing the KDE with a Gaussian kernel on binned data is equal to applying a Gaussian filter on the binned data. This allows us to use a well known approximation scheme for Gaussian filters: the box filter. By the central limit theorem, multiple recursion of a box filter yields an approximative Gaussian filter \cite{kovesi2010fast}. diff --git a/tex/chapters/kde.tex b/tex/chapters/kde.tex index 00f035b..0ebf8e9 100644 --- a/tex/chapters/kde.tex +++ b/tex/chapters/kde.tex @@ -13,7 +13,7 @@ %In contrast, The KDE is often the preferred tool to estimate a density function from discrete data samples because of its flexibility and ability to produce a continuous estimate. % -Given a univariate random sample set $X=\{X_1, \dots, X_N\}$, where $X$ has the density function $f$ and let $w_1, \dots w_N$ be associated weights. +Given an univariate random sample set $X=\{X_1, \dots, X_N\}$, where $X$ has the density function $f$ and let $w_1, \dots w_N$ be associated weights. The kernel estimator $\hat{f}$ which estimates $f$ at the point $x$ is given as \begin{equation} \label{eq:kde} @@ -31,7 +31,7 @@ As a matter of fact, the quality of the kernel estimate is primarily determined % %Any non-optimal bandwidth causes undersmoothing or oversmoothing. %An undersmoothing estimator has a large variance and hence a small $h$ leads to undersmoothing. -%On the other hand given a large $h$ the bias increases, which leads to oversmoothing \cite[7]{Cybakov2009}. +%On the other hand given a large $h$ the bias increases, which leads to oversmoothing \cite{Cybakov2009}. %Clearly with an adverse choice of the bandwidth crucial information like modality might get smoothed out. %All in all it is not obvious to determine a good choice of the bandwidth. % @@ -50,16 +50,17 @@ The Gaussian kernel is given as K_G(u)=\frac{1}{\sqrt{2\pi}} \expp{- \frac{u^2}{2} } \text{.} \end{equation} -The flexibility of the KDE comes at the expense of computational efficiency, which leads to the development of more efficient computation schemes. -The computation time depends, besides the number of calculated points $M$, on the input size, namely the number of data points $N$. -In general, reducing the size of the sample negatively affects the accuracy of the estimate. -Still, the sample size is a suitable parameter to speed up the computation. +The flexibility of the KDE comes at the expense of computation speed, which leads to the development of more efficient computation schemes. 
+The computation time depends, besides the number of calculated points $M$, on the input size, namely the size of sample $N$. +In general, reducing the size of the sample set negatively affects the accuracy of the estimate. +Still, $N$ is a suitable parameter to speed up the computation. -Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE. +The BKDE reduces $N$ by combining each single sample with its adjacent samples into bins, and thus, approximates the KDE. +%Since each single sample is combined with its adjacent samples into bins, the BKDE approximates the KDE. Each bin represents the count of the sample set at a given point of an equidistant grid with spacing $\delta$. -A binning rule distributes a sample among the grid points $g_j=j\delta$, indexed by $j\in\Z$. +A binning rule distributes each sample among the grid points $g_j=j\delta$, indexed by $j\in\Z$. % and can be represented as a set of functions $\{ w_j(x,\delta), j\in\Z \}$. -Computation requires a finite grid on the interval $[a,b]$ containing the data, thus the number of grid points is $G=(b-a)/\delta+1$. +Computation requires a finite grid on the interval $[a,b]$ containing the data, thus the number of grid points is $G=(b-a)/\delta+1$ \cite{hall1996accuracy}. Given a binning rule $r_j$ the BKDE $\tilde{f}$ of a density $f$ computed pointwise at the grid point $g_x$ is given as \begin{equation} diff --git a/tex/chapters/multivariate.tex b/tex/chapters/multivariate.tex index 28875c3..7df81cd 100644 --- a/tex/chapters/multivariate.tex +++ b/tex/chapters/multivariate.tex @@ -10,15 +10,15 @@ Multivariate kernel functions can be constructed in various ways, however, a pop Such a kernel is constructed by combining several univariate kernels into a product, where each kernel is applied in each dimension with a possibly different bandwidth. Given a multivariate random variable $\bm{X}=(x_1,\dots ,x_d)$ in $d$ dimensions. -The sample set $\mathcal{X}$ is a $n\times d$ matrix \cite[162]{scott2015}. +The sample set $\mathcal{X}$ is a $n\times d$ matrix \cite{scott2015}. The multivariate KDE $\hat{f}$ which defines the estimate pointwise at $\bm{u}=(u_1, \dots, u_d)^T$ is given as \begin{equation} \label{eq:mvKDE} - \hat{f}(\bm{u}) = \frac{1}{W} \sum_{i=1}^{n} \frac{w_i}{h_1 \dots h_d} \left[ \prod_{j=1}^{d} K\left( \frac{u_j-x_{ij}}{h_j} \right) \right] \text{,} + \hat{f}(\bm{u}) = \frac{1}{W} \sum_{i=1}^{n} \frac{w_i}{h_1 \dots h_d} \left[ \prod_{j=1}^{d} K\left( \frac{u_j-x_{i,j}}{h_j} \right) \right] \text{,} \end{equation} where the bandwidth is given as a vector $\bm{h}=(h_1, \dots, h_d)$. -Note that \eqref{eq:mvKDE} does not include all possible multivariate kernels, such as spherically symmetric kernels, which are based on rotation of a univariate kernel. +Note that \eqref{eq:mvKDE} does not include all possible multivariate kernels, such as spherically symmetric kernels, which are based on rotation of an univariate kernel. In general, a multivariate product and spherically symmetric kernel based on the same univariate kernel will differ. The only exception is the Gaussian kernel, which is spherically symmetric and has independent marginals. % TODO scott cite?! In addition, only smoothing in the direction of the axes is possible. @@ -30,7 +30,7 @@ Likewise, the ideas of common and linear binning rule scale with dimensionality In general, multi-dimensional filters are multi-dimensional convolution operations. 
However, by utilizing the separability property of convolution, a straightforward and a more efficient implementation can be found. -Convolution is separable if the filter kernel is separable, i.e. it can be split into successive convolutions of several kernels. +Convolution is separable if the filter kernel is separable, \ie{} it can be split into successive convolutions of several kernels. In example, the Gaussian filter is separable, because of $e^{x^2+y^2} = e^{x^2}\cdot e^{y^2}$. Likewise digital filters based on such kernels are called separable filters. They are easily applied to multi-dimensional signals, because the input signal can be filtered in each dimension individually by an one-dimensional filter \cite{dspGuide1997}. @@ -45,7 +45,7 @@ They are easily applied to multi-dimensional signals, because the input signal c %These kind of multivariate kernel is called product kernel as the multivariate kernel result is the product of each individual univariate kernel. % %Given a multivariate random variable $X=(x_1,\dots ,x_d)$ in $d$ dimensions. -%The sample $\bm{X}$ is a $n\times d$ matrix defined as \cite[162]{scott2015} +%The sample $\bm{X}$ is a $n\times d$ matrix defined as \cite{scott2015} %\begin{equation} % \bm{X}= % \begin{pmatrix} @@ -61,7 +61,7 @@ They are easily applied to multi-dimensional signals, because the input signal c % \end{pmatrix} \text{.} %\end{equation} % -%The multivariate kernel density estimator $\hat{f}$ which defines the estimate pointwise at $\bm{x}=(x_1, \dots, x_d)^T$ is given as \cite[162]{scott2015} +%The multivariate kernel density estimator $\hat{f}$ which defines the estimate pointwise at $\bm{x}=(x_1, \dots, x_d)^T$ is given as \cite{scott2015} %\begin{equation} % \hat{f}(\bm{x}) = \frac{1}{nh_1 \dots h_d} \sum_{i=1}^{n} \left[ \prod_{j=1}^{d} K\left( \frac{x_j-x_{ij}}{h_j} \right) \right] \text{.} %\end{equation} @@ -77,7 +77,7 @@ They are easily applied to multi-dimensional signals, because the input signal c %\end{equation} % Gaus: -%If the filter kernel is separable, the convolution is also separable i.e. multi-dimensional convolution can be computed as individual one-dimensional convolutions with a one-dimensional kernel. +%If the filter kernel is separable, the convolution is also separable \ie{} multi-dimensional convolution can be computed as individual one-dimensional convolutions with a one-dimensional kernel. %Because of $e^{x^2+y^2} = e^{x^2}\cdot e^{y^2}$ the Gaussian filter is separable and can be easily applied to multi-dimensional signals. \todo{quelle} diff --git a/tex/chapters/mvg.tex b/tex/chapters/mvg.tex index b6ead5d..e1f63e3 100644 --- a/tex/chapters/mvg.tex +++ b/tex/chapters/mvg.tex @@ -4,7 +4,7 @@ % Gauss Blur Filter % Repetitive Box filter to approx Gauss % Simple multipass, n/m approach, extended box filter -Digital filters are implemented by convolving the input signal with a filter kernel, i.e. the digital filter's impulse response. +Digital filters are implemented by convolving the input signal with a filter kernel, \ie{} the digital filter's impulse response. Consequently, the filter kernel of a Gaussian filter is a Gaussian with finite support \cite{dspGuide1997}. Assuming a finite-support Gaussian filter kernel of size $M$ and an input signal $x$, discrete convolution produces the smoothed output signal \begin{equation} @@ -14,8 +14,8 @@ Assuming a finite-support Gaussian filter kernel of size $M$ and an input signal where $\sigma$ is a smoothing parameter called standard deviation. 
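The separability property described in the multivariate.tex hunk above translates directly into code: the grid is smoothed row by row and then column by column with the same one-dimensional pass. The C++ sketch below is an illustration under assumptions (row-major storage, an arbitrary length-preserving callable as the 1-D pass) rather than the paper's implementation.

#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Sketch of separable filtering on a row-major (rows x cols) grid: the same
// one-dimensional pass (e.g. a box filter pass) is applied along every row and
// afterwards along every column, instead of one expensive 2-D convolution.
void separableFilter2D(std::vector<double>& grid, std::size_t rows, std::size_t cols,
                       const std::function<void(std::vector<double>&)>& pass1d)
{
    std::vector<double> line;
    for (std::size_t r = 0; r < rows; ++r) {                  // horizontal passes
        line.assign(grid.begin() + r * cols, grid.begin() + (r + 1) * cols);
        pass1d(line);
        std::copy(line.begin(), line.end(), grid.begin() + r * cols);
    }
    line.resize(rows);
    for (std::size_t c = 0; c < cols; ++c) {                  // vertical passes
        for (std::size_t r = 0; r < rows; ++r) line[r] = grid[r * cols + c];
        pass1d(line);
        for (std::size_t r = 0; r < rows; ++r) grid[r * cols + c] = line[r];
    }
}

The individual row passes (and, respectively, the individual column passes) are independent of each other, so the loops parallelize trivially.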
Note that \eqref{eq:bkdeGaus} has the same structure as \eqref{eq:gausFilt}, except the varying notational symbol of the smoothing parameter and the different factor in front of the sum. -While in both equations the constant factor of the Gaussian is removed of the inner sum, \eqref{eq:bkdeGaus} has an additional normalization factor $W^{-1}$. -This factor is necessary to ensure that the estimate is a valid density function, i.e. that it integrates to one. +While in both equations the constant factor of the Gaussian is removed from the inner sum, \eqref{eq:bkdeGaus} has an additional normalization factor $W^{-1}$. +This factor is necessary to ensure that the estimate is a valid density function, \ie{} that it integrates to one. Such a restriction is superfluous in the context of digital filters, so the normalization factor is omitted. Computation of a digital filter using the naive implementation of the discrete convolution algorithm yields $\landau{NM}$, where $N$ is again the input size given by the length of the input signal and $M$ is the size of the filter kernel. @@ -77,7 +77,7 @@ The overall algorithm to efficiently compute \eqref{eq:boxFilt} is listed in Alg \end{algorithm} Given a fast approximation scheme, it is necessary to construct a box filter analogous to a given Gaussian filter. -As seen in \eqref{eq:gausFilt}, the solely parameter of the Gaussian kernel is the standard deviation $\sigma$. +As seen in \eqref{eq:gausFilt}, the sole parameter of the Gaussian kernel is the standard deviation $\sigma$. In contrast, the box function \eqref{eq:boxFx} is parametrized by its width $L$. Therefore, in order to approximate the Gaussian filter of a given $\sigma$, a corresponding value of $L$ must be found. Given $n$ iterations of box filters with identical sizes the ideal size $\Lideal$, as suggested by Wells~\cite{wells1986efficient}, is @@ -112,7 +112,7 @@ The approximated $\sigma$ as a function of the integer width has a staircase sha By reducing the rounding error, the step size of the function is reduced. However, the overall shape will not change. \etal{Gwosdek}~\cite{gwosdek2011theoretical} proposed an approach which allows to approximate any real-valued value of $\sigma$. -Just like the conventional box filter, the extended version has a uniform value in the range $[-l; l]$, but unlike the conventional the extended box filter has different values at its edges. +Just like the conventional box filter, the extended version has a uniform value in the range $[-l; l]$, but unlike the conventional, the extended box filter has different values at its edges. This extension introduces only marginal computational overhead over conventional box filtering. diff --git a/tex/chapters/realworld.tex b/tex/chapters/realworld.tex index 33595ca..c2da623 100644 --- a/tex/chapters/realworld.tex +++ b/tex/chapters/realworld.tex @@ -2,12 +2,12 @@ To demonstrate the real time capabilities of the proposed method a real world scenario was chosen, namely indoor localization. The given problem is to localize a pedestrian walking inside a building. -Ebner et al. proposed a method, which incorporates multiple sensors, e.g. Wi-Fi, barometer, step-detection and turn-detection \cite{Ebner-15}. +Ebner et al. proposed a method, which incorporates multiple sensors, \eg{} Wi-Fi, barometer, step-detection and turn-detection \cite{Ebner-15}. At a given time $t$ the system estimates a state providing the most probable position of the pedestrian. 
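For the recursive box filter computation referenced in the mvg.tex hunk above, the following C++ sketch shows one running-sum pass with constant cost per output sample, plus an integer half-width chosen from a target $\sigma$ via the Wells-style rule; algorithm~\ref{alg:boxKDE} and \eqref{eq:boxidealwidth} themselves are not contained in this diff, so both functions are stated as illustrative assumptions, with simplified zero-padding at the boundaries.

#include <cmath>
#include <cstddef>
#include <vector>

// One pass of a normalized box filter of odd width L = 2*l + 1, computed with a
// running sum so that each output value costs O(1) regardless of L.
// Values outside the signal are treated as zero (simplified boundary handling).
void boxFilterPass(std::vector<double>& s, int l)
{
    const int n = static_cast<int>(s.size());
    const double norm = 1.0 / (2 * l + 1);
    std::vector<double> out(n, 0.0);
    double running = 0.0;
    for (int i = -l; i <= l && i < n; ++i)
        if (i >= 0) running += s[i];                 // initial window sum around index 0
    for (int i = 0; i < n; ++i) {
        out[i] = running * norm;
        const int enter = i + l + 1;                 // sample entering the next window
        const int leave = i - l;                     // sample leaving the next window
        if (enter < n) running += s[enter];
        if (leave >= 0) running -= s[leave];
    }
    s.swap(out);
}

// Integer half-width l for n passes approximating a Gaussian of standard deviation
// sigma, using the widely cited rule L ~ sqrt(12*sigma^2/n + 1); this is an
// assumption standing in for the paper's own width formula, which is not shown here.
int boxHalfWidth(double sigma, int n)
{
    const double L = std::sqrt(12.0 * sigma * sigma / n + 1.0);
    const long   l = std::lround((L - 1.0) / 2.0);
    return l < 0 ? 0 : static_cast<int>(l);
}

Applying the pass a few times (three passes are a common choice in the box filter literature) to the binned grid values in every dimension approximates the Gaussian-smoothed BKDE; note that the bandwidth has to be expressed in grid units first, i.e. $\sigma \approx h/\delta$.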
It is implemented using a particle filter with sample importance resampling and \SI{5000} particles. The dynamics are modelled realistically, which constrains the movement according to walls, doors and stairs. -We arranged a \SI{223}{\meter} long walk within the first floor of a \SI{2500}{m$^2$} museum, which was build in the 13th century and therefore offers non-optimal conditions for localization. +We arranged a \SI{223}{\meter} long walk within the first floor of a \SI{2500}{m$^2$} museum, which was built in the 13th century and therefore offers non-optimal conditions for localization. %The measurements for the walks were recorded using a Motorola Nexus 6 at 2.4 GHz band only. % Since this work only focuses on processing a given sample set, further details of the localisation system and the described scenario can be looked up in \cite{Ebner17} and \cite{Fetzer17}. @@ -17,26 +17,26 @@ The bivariate state estimation was calculated whenever a step was recognized, ab \begin{figure} \input{gfx/walk.tex} - \caption{Occurring bimodal distribution caused by uncertain measurements in the first \SI{13.4}{\second} of the walk. After \SI{20.8}{\second}, the distribution gets unimodal. The weigted-average estimation (blue) provides an high error compared to the ground truth (solid black), while the boxKDE approach (orange) does not. } + \caption{Occurring bimodal distribution caused by uncertain measurements in the first \SI{13.4}{\second} of the walk. After \SI{20.8}{\second}, the distribution gets unimodal. The weigted-average estimation (blue) provides an high error compared to the ground truth (solid black), while the BoxKDE approach (orange) does not. } \label{fig:realWorldMulti} \end{figure} % Fig.~\ref{fig:realWorldMulti} illustrates a frequently occurring situation, where the particle set splits apart, due to uncertain measurements and multiple possible walking directions. This results in a bimodal posterior distribution, which reaches its maximum distances between the modes at \SI{13.4}{\second} (black dotted line). -Thus estimating the most probable state using the weighted-average results in the blue line, describing the pedestrian's position to be somewhere outside the building (light green area). -In contrast, the here proposed method (orange line) is able to retrieve a good estimate compared the the ground truth path shown by the black solid line. +Thus estimating the most probable state over time using the weighted-average results in the blue line, describing the pedestrian's position to be somewhere outside the building (light green area). +In contrast, the here proposed method (orange line) is able to retrieve a good estimate compared to the ground truth path shown by the black solid line. Due to a right turn, the distribution gets unimodal after \SI{20.8}{\second}. -This happens since the lower red particles are walking against a wall and thus punished with a low weight. +This happens since the lower red particles are walking against a wall and are punished with a low weight. This example highlights the main benefits using our approach. While being fast enough to be computed in real time, the proposed method reduces the estimation error of the state in this situation, as it is possible to distinguish the two modes of the density. It is clearly visible, that this enables the system to recover the real state if multimodalities arise. 
-However, in situations with highly uncertain measurements, the estimation error could further increase since the real estimate is not equal to the best estimate, i.e. the real position of the pedestrian. +However, in situations with highly uncertain measurements, the estimation error could further increase since the real estimate is not equal to the best estimate, \ie{} the real position of the pedestrian. The error over time for different estimation methods of the complete walk can be seen in fig. \ref{fig:realWorldTime}. It is given by calculating the distance between estimation and ground truth at a specific time $t$. Estimates provided by simply choosing the maximum particle stand out the most. -As one could have expected beforehand, this method provides many strong peaks through continues jumping between single particles. +As one could have expected beforehand, this method provides many strong peaks through continuously jumping between single particles. Additionally, in most real world scenarios many particles share the same weight and thus multiple highest-weighted particles exist. \begin{figure} @@ -45,16 +45,17 @@ Additionally, in most real world scenarios many particles share the same weight \label{fig:realWorldTime} \end{figure} -Further investigating fig. \ref{fig:realWorldTime}, the boxKDE performs slightly better than the weighted-average, however after deploying \SI{100} Monte Carlo runs, the difference becomes insignificant. +Further investigating fig. \ref{fig:realWorldTime}, the BoxKDE performs slightly better than the weighted-average. +However after deploying \SI{100} Monte Carlo runs, the difference becomes insignificant. The main reason for this are again multimodalities caused by faulty or delayed measurements, especially when entering or leaving rooms. Within our experiments the problem occurred due to slow and attenuated Wi-Fi signals inside thick-walled rooms. While the system's dynamics are moving the particles outside, the faulty Wi-Fi readings are holding back a majority by assigning corresponding weights. Therefore, the average between the modes of the distribution is often closer to the ground truth as the real estimate, which is located on the \qq{wrong} mode. With new measurements coming from the hallway or other parts of the building, the distribution and thus the estimation are able to recover. -Nevertheless, it could be seen that our approach is able to resolve multimodalities even under real world conditions. +Nevertheless, it can be seen that our approach is able to resolve multimodalities even under real world conditions. It does not always provide the lowest error, since it depends more on an accurate sensor model than a weighted-average approach, but is very suitable as a good indicator about the real performance of a sensor fusion system. -At the end, in the here shown examples we only searched for a global maxima, even though the boxKDE approach opens a wide range of other possibilities for finding a best estimate. +In the here shown examples we only searched for a global maxima, even though the BoxKDE approach opens a wide range of other possibilities for finding a best estimate. %springt nicht so viel wie maximum %sehr ähnlich zu weighted-average. in 1000 mc runs ist sind average und std sehr ähnlich. 
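The two baseline point estimators compared above can be sketched in a few lines of C++; the Particle layout and the function names are hypothetical, since the localization system's data structures are not part of this excerpt.

#include <algorithm>
#include <vector>

struct Particle { double x, y, w; };   // hypothetical layout, not taken from the paper

// Weighted-average estimate: collapses the whole sample set into one mean,
// which is what fails for bimodal posteriors. Assumes a positive weight sum.
Particle weightedAverage(const std::vector<Particle>& ps) {
    Particle e{0.0, 0.0, 0.0};
    for (const Particle& p : ps) { e.x += p.w * p.x; e.y += p.w * p.y; e.w += p.w; }
    e.x /= e.w; e.y /= e.w;
    return e;
}

// Maximum-particle estimate: picks the highest-weighted sample and therefore
// jumps between particles, especially when many particles share the same weight.
Particle maximumParticle(const std::vector<Particle>& ps) {
    return *std::max_element(ps.begin(), ps.end(),
        [](const Particle& a, const Particle& b) { return a.w < b.w; });
}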
diff --git a/tex/chapters/relatedwork.tex b/tex/chapters/relatedwork.tex index 695b00b..bba48a1 100644 --- a/tex/chapters/relatedwork.tex +++ b/tex/chapters/relatedwork.tex @@ -33,12 +33,12 @@ The term fast Gauss transform was coined by Greengard \cite{greengard1991fast} w % However, the complexity grows exponentially with dimension. \cite{Improved Fast Gauss Transform and Efficient Kernel Density Estimation} % FastKDE, passed on ECF and nuFFT -Recent methods based on the self-consistent KDE proposed by Bernacchia and Pigolotti \cite{bernacchia2011self} allow to obtain an estimate without any assumptions, i.e. the kernel and bandwidth are both derived during the estimation. +Recent methods based on the self-consistent KDE proposed by Bernacchia and Pigolotti \cite{bernacchia2011self} allow to obtain an estimate without any assumptions, \ie{} the kernel and bandwidth are both derived during the estimation. They define a Fourier-based filter on the empirical characteristic function of a given dataset. The computation time was further reduced by \etal{O'Brien} using a non-uniform fast Fourier transform (FFT) algorithm to efficiently transform the data into Fourier space \cite{oBrien2016fast}. % binning => FFT -In general, it is desirable to omit a grid, as the data points do not necessarily fall onto equally spaced points. +In general, it is desirable to compute the estimate directly from the sample set. However, reducing the sample size by distributing the data on an equidistant grid can significantly reduce the computation time, if an approximative KDE is acceptable. Silverman \cite{silverman1982algorithm} originally suggested to combine adjacent data points into data bins, which results in a discrete convolution structure of the KDE. Allowing to efficiently compute the estimate using a FFT algorithm. diff --git a/tex/chapters/usage.tex b/tex/chapters/usage.tex index 54e0888..98ce473 100644 --- a/tex/chapters/usage.tex +++ b/tex/chapters/usage.tex @@ -5,7 +5,7 @@ %As the density estimation poses only a single step in the whole process, its computation needs to be as fast as possible. % not taking to much time from the frame -Consider a set of two-dimensional samples with associated weights, e.g. presumably generated from a particle filter system. +Consider a set of two-dimensional samples with associated weights, \eg{} presumably generated from a particle filter system. The overall process for bivariate data is described in Algorithm~\ref{alg:boxKDE}. Assuming that the given $N$ samples are stored in a sequential list, the first step is to create a grid representation. @@ -35,7 +35,7 @@ Such knowledge should be integrated into the system to avoid a linear search ove \Statex %\For{$1 \textbf{ to } n$} - \Loop{ $n$ \textbf{times}} \Comment{$n$ box filter iterations} + \Loop{ $n$ \textbf{times}} \Comment{$n$ separated box filter iterations} \For{$ i=1 \textbf{ to } G_1$} @@ -51,26 +51,26 @@ Such knowledge should be integrated into the system to avoid a linear search ove \end{algorithm} Given the extreme values of the samples and grid sizes $G_1$ and $G_2$ defined by the user, a $G_1\times G_2$ grid can be constructed, using a binning rule from \eqref{eq:simpleBinning} or \eqref{eq:linearBinning}. -As the number of grid points directly affects both computation time and accuracy, a suitable grid should be as coarse as possible, but at the same time narrow enough to produce an estimate sufficiently fast with an acceptable approximation error. 
+As the number of grid points directly affects both, computation time and accuracy, a suitable grid should be as coarse as possible, but at the same time narrow enough to produce an estimate sufficiently fast with an acceptable approximation error. If the extreme values are known in advanced, the computation of the grid is $\landau{N}$, otherwise an additional $\landau{N}$ search is required. The grid is stored as an linear array in memory, thus its space complexity is $\landau{G_1\cdot G_2}$. Next, the binned data is filtered with a Gaussian using the box filter approximation. -The box filter width is derived from the standard deviation of the approximated Gaussian, which is in turn equal to the bandwidth of the KDE. +The box filter's width is derived by \eqref{eq:boxidealwidth} from the standard deviation of the approximated Gaussian, which is in turn equal to the bandwidth of the KDE. However, the bandwidth $h$ needs to be scaled according to the grid size. -This is necessary as $h$ is defined in the input space of the KDE, i.e. in relation to the sample data. +This is necessary as $h$ is defined in the input space of the KDE, \ie{} in relation to the sample data. In contrast, the bandwidth of a BKDE is defined in the context of the binned data, which differs from the unbinned data due to the discretisation of the samples. For this reason, $h$ needs to be divided by the bin size to account the discrepancy between the different sampling spaces. -Given the scaled bandwidth the required box filter width can be computed. % as in \eqref{label} +Given the scaled bandwidth the required box filter's width can be computed. % as in \eqref{label} Due to its best runtime performance the recursive box filter implementation is used. If multivariate data is processed, the algorithm is easily extended due to its separability. -Each filter pass is computed in $\landau{G}$ operations, however, an additional memory buffer is required. +Each filter pass is computed in $\landau{G}$ operations, however, an additional memory buffer is required \cite{dspGuide1997}. While the integer-sized box filter requires fewest operations, it causes a larger approximation error due to rounding errors. -Depending on the required accuracy the extended box filter algorithm can further improve the estimation results, with only a small additional overhead. -Due to its simple indexing scheme, the recursive box filter can easily be computed in parallel using SIMD operations or parallel computation cores. +Depending on the required accuracy, the extended box filter algorithm can further improve the estimation results, with only a small additional overhead \cite{gwosdek2011theoretical}. +Due to its simple indexing scheme, the recursive box filter can easily be computed in parallel using SIMD operations and parallel computation cores. -Finally, the most likely state can be obtained from the filtered data, i.e. from the estimated discrete density, by searching filtered data for its maximum value. +Finally, the most likely state can be obtained from the filtered data, \ie{} from the estimated discrete density, by searching filtered data for its maximum value. From b76b07b485c3c522a9874f991a1ef74672b8abcb Mon Sep 17 00:00:00 2001 From: MBulli Date: Mon, 12 Mar 2018 23:24:45 +0100 Subject: [PATCH 04/11] Eval fix? 
--- tex/chapters/experiments.tex | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/tex/chapters/experiments.tex b/tex/chapters/experiments.tex index 6d3a5d0..23b9cac 100644 --- a/tex/chapters/experiments.tex +++ b/tex/chapters/experiments.tex @@ -1,10 +1,13 @@ \section{Experiments} \subsection{Mean Integrated Squared Error} +We now empirically evaluate the feasibility of our BoxKDE method by analyzing its approximation error. +In order to evaluate the error the KDE and various approximations of it are computed and compared using the mean integrated squared error (MISE). +A synthetic sample set $\bm{X}$ with $N=1000$ obtained from a bivariate mixture normal density $f$ provides the basis of the comparison. +For each method an estimate is computed and the MISE of it relative to $f$ is calculated. +The specific structure of the underlying distribution clearly affects the error in the estimate, but only the closeness of the approximation to the KDE is of interest. +Hence, $f$ is of minor importance here and was chosen rather arbitrary to highlight the behavior of the BoxKDE. - -We now empirically evaluate the accuracy of our BoxKDE method, using the mean integrated squared error (MISE). -The ground truth is given with $N=1000$ synthetic samples drawn from a bivariate mixture normal density $f$ \begin{equation} \begin{split} \bm{X} \sim & ~\G{\VecTwo{0}{0}}{0.5\bm{I}} + \G{\VecTwo{3}{0}}{\bm{I}} + \G{\VecTwo{0}{3}}{\bm{I}} \\ @@ -12,21 +15,20 @@ The ground truth is given with $N=1000$ synthetic samples drawn from a bivariate \end{split} \end{equation} where the majority of the probability mass lies in the range $[-6; 6]^2$. -Clearly, the structure of the ground truth affects the error in the estimate, but as our method approximates the KDE only the closeness to the KDE is of interest. -Therefore, the particular choice of the ground truth is only of minor importance here. \begin{figure}[t] \input{gfx/error.tex} \caption{MISE relative to the ground truth as a function of $h$. While the error curves of the BKDE (red) and the BoxKDE based on the extended box filter (orange dotted line) resemble the overall course of the error of the exact KDE (green), the regular BoxKDE (orange) exhibits noticeable jumps to rounding.} \label{fig:errorBandwidth} \end{figure} -Evaluated at $50^2$ points the exact KDE is compared to the BKDE, BoxKDE, and extended box filter approximation, which are evaluated at a smaller grid with $30^2$ points. -The MISE between $f$ and the estimates as a function of $h$ are evaluated, and the resulting plot is given in fig.~\ref{fig:errorBandwidth}. -A minimum error is obtained with $h=0.35$, for larger oversmoothing occurs and the modes gradually fuse together. +Four estimates are computed with varying bandwidth using the exact KDE, BKDE, BoxKDE, and ExBoxKDE, which uses the extended box filter. +%Evaluated at $50^2$ points the exact KDE is compared to the BKDE, BoxKDE, and extended box filter approximation, which are evaluated at a smaller grid with $30^2$ points. +The graphs of the MISE between $f$ and the estimates as a function of $h\in[0.15; 1.0]$ are given in fig.~\ref{fig:errorBandwidth}. +A minimum error is obtained with $h=0.35$, for larger values oversmoothing occurs and the modes gradually fuse together. -Both the BKDE and the extended box filter estimate resemble the error curve of the KDE quite well and stable. +Both the BKDE and the ExBoxKDE resemble the error curve of the KDE quite well and stable. 
They are rather close to each other, with a tendency to diverge for larger $h$. -In contrast, the error curve of the BoxKDE has noticeable jumps at $h=(0.4; 0.252; 0.675; 0.825)$. +In contrast, the error curve of the BoxKDE has noticeable jumps at $h=\{0.25, 0.40, 0.67, 0.82\}$. These jumps are caused by the rounding of the integer-valued box width given by \eqref{eq:boxidealwidth}. As the extend box filter is able to approximate an exact $\sigma$, these discontinues don't appear. From 9f098887dbbbb14c7841c294d4fa87c7c4fcbf6d Mon Sep 17 00:00:00 2001 From: MBulli Date: Mon, 12 Mar 2018 23:24:55 +0100 Subject: [PATCH 05/11] Language --- tex/bare_conf.tex | 2 +- tex/chapters/conclusion.tex | 8 ++++---- tex/chapters/experiments.tex | 24 ++++++++++++------------ tex/chapters/realworld.tex | 14 +++++++------- tex/chapters/usage.tex | 11 ++++++----- 5 files changed, 30 insertions(+), 29 deletions(-) diff --git a/tex/bare_conf.tex b/tex/bare_conf.tex index ce66156..069f978 100644 --- a/tex/bare_conf.tex +++ b/tex/bare_conf.tex @@ -121,7 +121,7 @@ \newcommand{\qq} [1]{``#1''} \newcommand{\eg} {e.\,g.} \newcommand{\ie} {i.\,e.} - +\newcommand{\figref}[1]{Fig.~\ref{#1}} % missing math operators \DeclareMathOperator*{\argmin}{arg\,min} \DeclareMathOperator*{\argmax}{arg\,max} diff --git a/tex/chapters/conclusion.tex b/tex/chapters/conclusion.tex index a486cea..b69df65 100644 --- a/tex/chapters/conclusion.tex +++ b/tex/chapters/conclusion.tex @@ -1,12 +1,12 @@ \section{Conclusion} Within this paper a novel approach for rapid approximation of the KDE was presented. -This is achieved by considering the discrete convolution structure of the BKDE and thus elaborate its connection to digital signal processing, especially the Gaussian filter. -Using a box filter as an appropriate approximation results in an efficient computation scheme with a fully linear complexity and a negligible overhead, as confirmed by the utilized experiments. +This is achieved by considering the discrete convolution structure of the BKDE and thus elaborating its connection to digital signal processing, especially the Gaussian filter. +Using a box filter as an appropriate approximation results in an efficient computation scheme with a fully linear complexity and a negligible overhead, as demonstrated by the utilized experiments. -The analysis of the error showed that the method exhibits an similar error behaviour compared to the BKDE. +The analysis of the error showed that the method shows an similar error behaviour compared to the BKDE. In terms of calculation time, our approach outperforms other state of the art implementations. -Despite being more efficient than other methods, the algorithmic complexity still increases in its exponent with increasing number of dimensions. +Despite being more efficient than other methods, the algorithmic complexity still increases in its exponent with an increasing number of dimensions. %future work kurz Finally, such a fast computation scheme makes the KDE more attractive for real time use cases. 
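The integer rounding blamed for these jumps in the surrounding experiments hunks can be made concrete with a small sketch. It assumes that \eqref{eq:boxidealwidth} is Wells' relation $\Lideal = \sqrt{12\sigma^2/n + 1}$ for $n$ box passes, which is not spelled out in this excerpt, and that the plain box filter is restricted to odd integer widths.

#include <cmath>

// Ideal box width for n repeated box passes approximating a Gaussian with
// standard deviation sigma. Assumption: eq. (boxidealwidth) is Wells'
// relation L = sqrt(12*sigma^2/n + 1); the paper defines the exact form.
double idealBoxWidth(double sigma, int n)
{
    return std::sqrt(12.0 * sigma * sigma / n + 1.0);
}

// The conventional box filter only realises integer widths, so the ideal
// width has to be rounded; the effective sigma therefore changes in steps,
// which produces the jumps in the BoxKDE error curve.
int roundedBoxWidth(double sigma, int n)
{
    int L = static_cast<int>(std::lround(idealBoxWidth(sigma, n)));
    if (L % 2 == 0) ++L;   // assumption: keep the filter centred on a sample
    return L;
}

The extended box filter sidesteps this staircase by adjusting the weights at the two outermost cells, so any real-valued width, and hence any $\sigma$, can be matched.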
diff --git a/tex/chapters/experiments.tex b/tex/chapters/experiments.tex
index 23b9cac..c7df8c7 100644
--- a/tex/chapters/experiments.tex
+++ b/tex/chapters/experiments.tex
@@ -11,19 +11,19 @@ Hence, $f$ is of minor importance here and was chosen rather arbitrary to highli
 \begin{equation}
 	\begin{split}
 		\bm{X} \sim & ~\G{\VecTwo{0}{0}}{0.5\bm{I}} + \G{\VecTwo{3}{0}}{\bm{I}} + \G{\VecTwo{0}{3}}{\bm{I}} \\
-		&+ \G{\VecTwo{-3}{0} }{\bm{I}} + \G{\VecTwo{0}{-3}}{\bm{I}}
+		&+ \G{\VecTwo{-3}{0} }{\bm{I}} + \G{\VecTwo{0}{-3}}{\bm{I}} \text{,}
 	\end{split}
 \end{equation}
 where the majority of the probability mass lies in the range $[-6; 6]^2$.
 
 \begin{figure}[t]
 	\input{gfx/error.tex}
-	\caption{MISE relative to the ground truth as a function of $h$. While the error curves of the BKDE (red) and the BoxKDE based on the extended box filter (orange dotted line) resemble the overall course of the error of the exact KDE (green), the regular BoxKDE (orange) exhibits noticeable jumps to rounding.} \label{fig:errorBandwidth}
+	\caption{MISE relative to the ground truth as a function of $h$. While the error curves of the BKDE (red) and the BoxKDE based on the extended box filter (orange dotted line) resemble the overall course of the error of the exact KDE (green), the regular BoxKDE (orange) exhibits noticeable jumps due to rounding.} \label{fig:errorBandwidth}
 \end{figure}
 
 Four estimates are computed with varying bandwidth using the exact KDE, BKDE, BoxKDE, and ExBoxKDE, which uses the extended box filter.
 %Evaluated at $50^2$ points the exact KDE is compared to the BKDE, BoxKDE, and extended box filter approximation, which are evaluated at a smaller grid with $30^2$ points.
-The graphs of the MISE between $f$ and the estimates as a function of $h\in[0.15; 1.0]$ are given in fig.~\ref{fig:errorBandwidth}.
+The graphs of the MISE between $f$ and the estimates as a function of $h\in[0.15; 1.0]$ are given in \figref{fig:errorBandwidth}.
 A minimum error is obtained with $h=0.35$, for larger values oversmoothing occurs and the modes gradually fuse together.
 
 Both the BKDE and the ExBoxKDE resemble the error curve of the KDE quite well and stable.
@@ -31,14 +31,14 @@ They are rather close to each other, with a tendency to diverge for larger $h$.
 In contrast, the error curve of the BoxKDE has noticeable jumps at $h=\{0.25, 0.40, 0.67, 0.82\}$.
 These jumps are caused by the rounding of the integer-valued box width given by \eqref{eq:boxidealwidth}.
-As the extend box filter is able to approximate an exact $\sigma$, these discontinues don't appear.
-Consequently, it reduces the overall error of the approximation, but only marginally in this scenario.
+As the extended box filter is able to approximate an exact $\sigma$, such discontinuities do not appear.
+Consequently, it reduces the overall error of the approximation, though only marginally in this scenario.
 The global average MISE over all values of $h$ is $0.0049$ for the regular box filter and $0.0047$ in case of the extended version.
 Likewise, the maximum MISE is $0.0093$ and $0.0091$, respectively.
 The choice between the extended and regular box filter algorithm depends on how large the acceptable error should be, thus on the particular application.
 
 Other test cases of theoretical relevance are the MISE as a function of the grid size $G$ and the sample size $N$.
-However, both cases do not give a deeper insight of the error behavior of our method, as it closely mimics the error curve of the KDE and only confirm the theoretical expectations. 
+However, both cases do not give a deeper insight of the error behavior of our method, as it closely mimics the error curve of the KDE and only confirms theoretical expectations. \begin{figure}[t] @@ -54,25 +54,25 @@ However, both cases do not give a deeper insight of the error behavior of our me \subsection{Performance} In the following, we underpin the promising theoretical linear time complexity of our method with empirical time measurements compared to other methods. -All tests are performed on a Intel Core \mbox{i5-7600K} CPU with a frequency of \SI{4.2}{\giga\hertz}, and \SI{16}{\giga\byte} main memory. +All tests are performed on an Intel Core \mbox{i5-7600K} CPU with a frequency of \SI{4.2}{\giga\hertz}, and \SI{16}{\giga\byte} main memory. We compare our C++ implementation of the BoxKDE approximation as shown in algorithm~\ref{alg:boxKDE} to the \texttt{ks} R package and the FastKDE Python implementation \cite{oBrien2016fast}. The \texttt{ks} package provides a FFT-based BKDE implementation based on optimized C functions at its core. With state estimation problems in mind, we additionally provide a C++ implementation of a weighted-average estimator. As both methods are not using a grid, an equivalent input sample set was used for the weighted-average and the FastKDE. -The results of the performance comparison are presented in fig.~\ref{fig:performance}. +The results of the performance comparison are presented in \figref{fig:performance}. % O(N) gut erkennbar für box KDE und weighted average The linear complexity of the BoxKDE and the weighted average is clearly visible. % Gerade bei kleinen G bis 10^3 ist die box KDE schneller als R und FastKDE, aber das WA deutlich schneller als alle anderen Especially for small $G$ up to $10^3$ the BoxKDE is much faster compared to BKDE and FastKDE. % Bei zunehmend größeren G wird der Abstand zwischen box KDE und WA größer. -Nevertheless, the simple weighted-average approach performs the fastest and with increasing $G$ the distance to the BoxKDE grows constantly. +Nevertheless, the simple weighted-average approach performs the fastest, and with increasing $G$ the distance to the BoxKDE grows constantly. However, it is obvious that this comes with major disadvantages, like being prone to multimodalities, as discussed in section \ref{sec:intro}. % (Das kann auch daran liegen, weil das Binning mit größeren G langsamer wird, was ich mir aber nicht erklären kann! Vlt Cache Effekte) % Auffällig ist der Stufenhafte Anstieg der Laufzeit bei der R Implementierung. -Further looking at fig. \ref{fig:performance}, the runtime performance of the BKDE approach is increasing in a stepwise manner with growing $G$. +Looking at \figref{fig:performance}, the runtime performance of the BKDE approach is increasing in a stepwise manner with growing $G$. % Dies kommt durch die FFT. Der Input in für die FFT muss immer auf die nächste power of two gerundet werden. This behavior is caused by the underlying FFT algorithm. % Daher wird die Laufzeit sprunghaft langsamer wenn auf eine neue power of two aufgefüllt wird, ansonsten bleibt sie konstant. @@ -85,14 +85,14 @@ The termination of BKDE graph at $G=4406^2$ is caused by an out of memory error % Während die durschnittliche Laufzeit über alle Werte von G beim box filter bei 0.4092s liegt, benötigte der extended box filter im Durschnitt 0.4169s. Both discussed Gaussian filter approximations, namely box filter and extended box filter, yield a similar runtime behavior and therefore a similar curve progression. 
 While the average runtime over all values of $G$ for the standard box filter is \SI{0.4092}{\second}, the extended one has an average of \SI{0.4169}{\second}.
-To keep the arrangement of fig. \ref{fig:performance} clear, we only illustrated the results of the BoxKDE with the regular box filter.
+To keep \figref{fig:performance} uncluttered, we only illustrate the results of the BoxKDE with the regular box filter.
 
 The weighted-average has the great advantage of being independent of the dimensionality of the input and can be implemented effortlessly.
 In contrast, the computation of the BoxKDE approach increases exponentially with increasing number of dimensions.
 However, due to the linear time complexity and the very simple computation scheme, the overall computation time is still sufficiently fast for many applications and much smaller compared to other methods.
 The BoxKDE approach presents a reasonable alternative to the weighted-average and is easily integrated into existing systems.
 
-In addition, modern CPUs do benefit from the recursive computation scheme of the box filter, as the data exhibits a high degree of spatial locality in memory and the accesses are reliable predictable.
+In addition, modern CPUs do benefit from the recursive computation scheme of the box filter, as the data exhibits a high degree of spatial locality in memory and the accesses are reliably predictable.
 Furthermore, the computation is easily parallelized, as there is no data dependency between the one-dimensional filter passes in algorithm~\ref{alg:boxKDE}.
 Hence, the inner loops can be parallelized using threads or SIMD instructions, but the overall speedup depends on the particular architecture and the size of the input.
 
diff --git a/tex/chapters/realworld.tex b/tex/chapters/realworld.tex
index c2da623..54ae0f7 100644
--- a/tex/chapters/realworld.tex
+++ b/tex/chapters/realworld.tex
@@ -17,26 +17,26 @@ The bivariate state estimation was calculated whenever a step was recognized, ab
 
 \begin{figure}
 	\input{gfx/walk.tex}
-	\caption{Occurring bimodal distribution caused by uncertain measurements in the first \SI{13.4}{\second} of the walk. After \SI{20.8}{\second}, the distribution gets unimodal. The weigted-average estimation (blue) provides an high error compared to the ground truth (solid black), while the BoxKDE approach (orange) does not. }
+	\caption{Occurring bimodal distribution caused by uncertain measurements in the first \SI{13.4}{\second} of the walk. After \SI{20.8}{\second}, the distribution gets unimodal. The weighted-average estimation (blue) provides a high error compared to the ground truth (solid black), while the BoxKDE approach (orange) does not. }
 	\label{fig:realWorldMulti}
 \end{figure}
 %
-Fig.~\ref{fig:realWorldMulti} illustrates a frequently occurring situation, where the particle set splits apart, due to uncertain measurements and multiple possible walking directions.
+\figref{fig:realWorldMulti} illustrates a frequently occurring situation, where the particle set splits apart, due to uncertain measurements and multiple possible walking directions.
 This results in a bimodal posterior distribution, which reaches its maximum distances between the modes at \SI{13.4}{\second} (black dotted line).
 Thus estimating the most probable state over time using the weighted-average results in the blue line, describing the pedestrian's position to be somewhere outside the building (light green area). 
 In contrast, the here proposed method (orange line) is able to retrieve a good estimate compared to the ground truth path shown by the black solid line.
 Due to a right turn, the distribution gets unimodal after \SI{20.8}{\second}.
-This happens since the lower red particles are walking against a wall and are punished with a low weight.
+This happens since the lower red particles are walking against a wall and are therefore punished with a low weight.
 
 This example highlights the main benefits of using our approach.
 While being fast enough to be computed in real time, the proposed method reduces the estimation error of the state in this situation, as it is possible to distinguish the two modes of the density.
 It is clearly visible that this enables the system to recover the real state if multimodalities arise.
 
 However, in situations with highly uncertain measurements, the estimation error could further increase since the real estimate is not equal to the best estimate, \ie{} the real position of the pedestrian.
-The error over time for different estimation methods of the complete walk can be seen in fig. \ref{fig:realWorldTime}.
+The error over time for different estimation methods of the complete walk can be seen in \figref{fig:realWorldTime}.
 It is given by calculating the distance between estimation and ground truth at a specific time $t$.
 Estimates provided by simply choosing the maximum particle stand out the most.
-As one could have expected beforehand, this method provides many strong peaks through continuously jumping between single particles.
+As expected beforehand, this method produces many strong peaks, since it continuously jumps between single particles.
 Additionally, in most real world scenarios many particles share the same weight and thus multiple highest-weighted particles exist.
 
 \begin{figure}
@@ -45,7 +45,7 @@ Additionally, in most real world scenarios many particles share the same weight
 	\label{fig:realWorldTime}
 \end{figure}
 
-Further investigating fig. \ref{fig:realWorldTime}, the BoxKDE performs slightly better than the weighted-average.
+Further investigating \figref{fig:realWorldTime}, the BoxKDE performs slightly better than the weighted-average.
 However, after deploying \SI{100} Monte Carlo runs, the difference becomes insignificant.
 The main reason for this is again multimodalities caused by faulty or delayed measurements, especially when entering or leaving rooms.
 Within our experiments the problem occurred due to slow and attenuated Wi-Fi signals inside thick-walled rooms.
@@ -54,7 +54,7 @@ Therefore, the average between the modes of the distribution is often closer to
 With new measurements coming from the hallway or other parts of the building, the distribution and thus the estimation are able to recover.
 
 Nevertheless, it can be seen that our approach is able to resolve multimodalities even under real world conditions.
-It does not always provide the lowest error, since it depends more on an accurate sensor model than a weighted-average approach, but is very suitable as a good indicator about the real performance of a sensor fusion system.
+It does not always provide the lowest error, since it depends more on an accurate sensor model than a weighted-average approach, but it is a good indicator of the real performance of a sensor fusion system.
 In the examples shown here we only searched for a global maximum, even though the BoxKDE approach opens a wide range of other possibilities for finding a best estimate. 
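The global-maximum search used here amounts to an argmax over the filtered grid; a hedged C++ sketch follows, where the row-major layout, grid origin and bin size are assumptions and not taken from the paper.

#include <algorithm>
#include <array>
#include <vector>

// Most probable state = global maximum of the estimated discrete density.
// density is assumed to be stored row-major with G1 cells per row; minX,
// minY and binSize describe the (hypothetical) grid geometry.
std::array<double, 2> mostProbableState(const std::vector<double>& density, int G1,
                                        double minX, double minY, double binSize)
{
    const auto it = std::max_element(density.begin(), density.end());
    const int idx = static_cast<int>(it - density.begin());
    const int ix = idx % G1;
    const int iy = idx / G1;
    return { minX + (ix + 0.5) * binSize,    // centre of the winning grid cell
             minY + (iy + 0.5) * binSize };
}

Other estimators, for example per-mode maxima or an expectation restricted to the dominant mode, could operate on the same filtered grid at essentially the same cost.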
%springt nicht so viel wie maximum diff --git a/tex/chapters/usage.tex b/tex/chapters/usage.tex index 98ce473..052ee06 100644 --- a/tex/chapters/usage.tex +++ b/tex/chapters/usage.tex @@ -5,12 +5,13 @@ %As the density estimation poses only a single step in the whole process, its computation needs to be as fast as possible. % not taking to much time from the frame -Consider a set of two-dimensional samples with associated weights, \eg{} presumably generated from a particle filter system. +Consider a set of two-dimensional samples with associated weights, \eg{} generated from a particle filter system. The overall process for bivariate data is described in Algorithm~\ref{alg:boxKDE}. Assuming that the given $N$ samples are stored in a sequential list, the first step is to create a grid representation. -In order to efficiently construct the grid and to allocate the required memory the extrema of the samples need to be known in advance. -These limits might be given by the application, for example, the position of a pedestrian within a building is limited by the physical dimensions of the building. +In order to efficiently construct the grid and to allocate the required memory, the extrema of the samples need to be known in advance. +These limits might be given by the application. +For example, the position of a pedestrian within a building is limited by the physical dimensions of the building. Such knowledge should be integrated into the system to avoid a linear search over the sample set, naturally reducing the computation time. \begin{algorithm}[t] @@ -54,7 +55,7 @@ Given the extreme values of the samples and grid sizes $G_1$ and $G_2$ defined b As the number of grid points directly affects both, computation time and accuracy, a suitable grid should be as coarse as possible, but at the same time narrow enough to produce an estimate sufficiently fast with an acceptable approximation error. If the extreme values are known in advanced, the computation of the grid is $\landau{N}$, otherwise an additional $\landau{N}$ search is required. -The grid is stored as an linear array in memory, thus its space complexity is $\landau{G_1\cdot G_2}$. +The grid is stored as a linear array in memory, thus its space complexity is $\landau{G_1\cdot G_2}$. Next, the binned data is filtered with a Gaussian using the box filter approximation. The box filter's width is derived by \eqref{eq:boxidealwidth} from the standard deviation of the approximated Gaussian, which is in turn equal to the bandwidth of the KDE. @@ -69,7 +70,7 @@ If multivariate data is processed, the algorithm is easily extended due to its s Each filter pass is computed in $\landau{G}$ operations, however, an additional memory buffer is required \cite{dspGuide1997}. While the integer-sized box filter requires fewest operations, it causes a larger approximation error due to rounding errors. -Depending on the required accuracy, the extended box filter algorithm can further improve the estimation results, with only a small additional overhead \cite{gwosdek2011theoretical}. +Depending on the required accuracy, the extended box filter algorithm can further improve the estimation results with only a small additional overhead \cite{gwosdek2011theoretical}. Due to its simple indexing scheme, the recursive box filter can easily be computed in parallel using SIMD operations and parallel computation cores. 
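The recursive pass referred to here can be sketched as a running sum; zero padding at the borders is an assumption of this sketch, and the paper's algorithm~\ref{alg:boxKDE} defines the actual procedure.

#include <vector>

// One recursive (running-sum) box filter pass over a 1-D signal: every output
// is obtained from the previous one by adding the entering sample and removing
// the leaving one, so the cost is O(n) regardless of the box width. Zero
// padding at the borders is an assumption of this sketch.
void boxFilterPass(const std::vector<double>& in, std::vector<double>& out, int radius)
{
    const int n = static_cast<int>(in.size());
    out.assign(in.size(), 0.0);
    if (n == 0) return;
    const double scale = 1.0 / (2 * radius + 1);
    double sum = 0.0;
    for (int j = 0; j <= radius && j < n; ++j) sum += in[j];   // window around index 0
    out[0] = sum * scale;
    for (int i = 1; i < n; ++i) {
        const int enter = i + radius;        // sample entering the window
        const int leave = i - radius - 1;    // sample leaving the window
        if (enter < n)  sum += in[enter];
        if (leave >= 0) sum -= in[leave];
        out[i] = sum * scale;
    }
}

Only the indices i, i+radius and i-radius-1 are touched per step, which is what makes the pass cache friendly and easy to vectorise; the second dimension is handled by running the same pass over the columns.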
Finally, the most likely state can be obtained from the filtered data, \ie{} from the estimated discrete density, by searching filtered data for its maximum value. From 2ba6e82acd400d055b17d004d8017207b5c95d6c Mon Sep 17 00:00:00 2001 From: k-a-z-u Date: Tue, 13 Mar 2018 13:09:37 +0100 Subject: [PATCH 06/11] minor plot changes --- tex/gfx/errorOverTime.eps | 8 +- tex/gfx/errorOverTime.tex | 2 +- tex/gfx/errorOverTimePlotterSolo.gp | 2 +- tex/gfx/perf.eps | 861 +++++++++++++++++----------- tex/gfx/perf.gp | 6 +- tex/gfx/perf.tex | 28 +- tex/gfx/perf/BoxSIMD.csv | 60 ++ 7 files changed, 626 insertions(+), 341 deletions(-) create mode 100644 tex/gfx/perf/BoxSIMD.csv diff --git a/tex/gfx/errorOverTime.eps b/tex/gfx/errorOverTime.eps index 30ab44e..8b96477 100644 --- a/tex/gfx/errorOverTime.eps +++ b/tex/gfx/errorOverTime.eps @@ -1,7 +1,7 @@ %!PS-Adobe-2.0 EPSF-2.0 %%Title: errorOverTime.tex -%%Creator: gnuplot 5.2 patchlevel 2 -%%CreationDate: Mon Feb 26 10:56:15 2018 +%%Creator: gnuplot 5.2 patchlevel 2 (Gentoo revision r0) +%%CreationDate: Tue Mar 13 13:08:23 2018 %%DocumentFonts: %%BoundingBox: 50 50 316 194 %%EndComments @@ -438,10 +438,10 @@ systemdict /pdfmark known not { SDict begin [ /Title (errorOverTime.tex) /Subject (gnuplot plot) - /Creator (gnuplot 5.2 patchlevel 2) + /Creator (gnuplot 5.2 patchlevel 2 (Gentoo revision r0)) % /Producer (gnuplot) % /Keywords () - /CreationDate (Mon Feb 26 10:56:15 2018) + /CreationDate (Tue Mar 13 13:08:23 2018) /DOCINFO pdfmark end } ifelse diff --git a/tex/gfx/errorOverTime.tex b/tex/gfx/errorOverTime.tex index 165a374..c80ebba 100644 --- a/tex/gfx/errorOverTime.tex +++ b/tex/gfx/errorOverTime.tex @@ -101,7 +101,7 @@ \csname LTb\endcsname%% \put(4373,2486){\makebox(0,0)[r]{\strut{}\footnotesize{maximum particle}}}% \csname LTb\endcsname%% - \put(4373,2266){\makebox(0,0)[r]{\strut{}\footnotesize{weighted-average particle}}}% + \put(4373,2266){\makebox(0,0)[r]{\strut{}\footnotesize{weighted average particle}}}% \csname LTb\endcsname%% \put(4373,2046){\makebox(0,0)[r]{\strut{}\footnotesize{BoxKDE}}}% }% diff --git a/tex/gfx/errorOverTimePlotterSolo.gp b/tex/gfx/errorOverTimePlotterSolo.gp index 7e32559..42285e3 100644 --- a/tex/gfx/errorOverTimePlotterSolo.gp +++ b/tex/gfx/errorOverTimePlotterSolo.gp @@ -21,7 +21,7 @@ set ytics 0,5,15 plot \ "0_1519064452.csv" using ($1/1000):2 with lines lc rgb "#c8c8c8" lw 2.0 title "\\footnotesize{maximum particle}", \ - "0_1519062956.csv" using ($1/1000):2 with lines lc rgb "#3465A4" lw 1.5 title "\\footnotesize{weighted-average particle}",\ + "0_1519062956.csv" using ($1/1000):2 with lines lc rgb "#3465A4" lw 1.5 title "\\footnotesize{weighted average particle}",\ "0_1519062756.csv" using ($1/1000):2 with lines lc rgb "#FCAF3E" lw 1.5 title "\\footnotesize{BoxKDE}" diff --git a/tex/gfx/perf.eps b/tex/gfx/perf.eps index b65465d..89b2ee3 100644 --- a/tex/gfx/perf.eps +++ b/tex/gfx/perf.eps @@ -1,7 +1,7 @@ %!PS-Adobe-2.0 EPSF-2.0 %%Title: perf.tex -%%Creator: gnuplot 5.2 patchlevel 2 -%%CreationDate: Mon Feb 26 11:24:00 2018 +%%Creator: gnuplot 5.2 patchlevel 2 (Gentoo revision r0) +%%CreationDate: Tue Mar 13 13:04:14 2018 %%DocumentFonts: %%BoundingBox: 50 50 302 230 %%EndComments @@ -438,10 +438,10 @@ systemdict /pdfmark known not { SDict begin [ /Title (perf.tex) /Subject (gnuplot plot) - /Creator (gnuplot 5.2 patchlevel 2) + /Creator (gnuplot 5.2 patchlevel 2 (Gentoo revision r0)) % /Producer (gnuplot) % /Keywords () - /CreationDate (Mon Feb 26 11:24:00 2018) + /CreationDate (Tue Mar 13 13:04:14 2018) /DOCINFO 
pdfmark end } ifelse @@ -485,11 +485,23 @@ newpath LTb LCb setrgbcolor [] 0 setdash -686 726 M -31 0 V -4256 0 R --31 0 V -686 805 M +0.500 UL +LTa +LCa setrgbcolor +686 704 M +4287 0 V +stroke +1.000 UL +LTb +LCb setrgbcolor +[] 0 setdash +686 704 M +63 0 V +4224 0 R +-63 0 V +stroke +LTb +686 797 M 31 0 V 4256 0 R -31 0 V @@ -497,23 +509,27 @@ LCb setrgbcolor 31 0 V 4256 0 R -31 0 V -686 886 M +686 890 M 31 0 V 4256 0 R -31 0 V -686 912 M +686 920 M 31 0 V 4256 0 R -31 0 V -686 934 M +686 945 M 31 0 V 4256 0 R -31 0 V -686 952 M +686 966 M 31 0 V 4256 0 R -31 0 V -686 968 M +686 983 M +31 0 V +4256 0 R +-31 0 V +686 999 M 31 0 V 4256 0 R -31 0 V @@ -521,7 +537,7 @@ stroke 0.500 UL LTa LCa setrgbcolor -686 983 M +686 1013 M 2541 0 V 1614 0 R 132 0 V @@ -530,33 +546,68 @@ stroke LTb LCb setrgbcolor [] 0 setdash -686 983 M +686 1013 M 63 0 V 4224 0 R -63 0 V stroke LTb -686 1283 M +686 1107 M 31 0 V 4256 0 R -31 0 V -686 1362 M +686 1161 M 31 0 V 4256 0 R -31 0 V -686 1409 M +686 1200 M 31 0 V 4256 0 R -31 0 V -686 1443 M +686 1230 M 31 0 V 4256 0 R -31 0 V -686 1469 M +686 1254 M 31 0 V 4256 0 R -31 0 V -686 1491 M +686 1275 M +31 0 V +4256 0 R +-31 0 V +686 1293 M +31 0 V +4256 0 R +-31 0 V +686 1309 M +31 0 V +4256 0 R +-31 0 V +stroke +0.500 UL +LTa +LCa setrgbcolor +686 1323 M +2541 0 V +1614 0 R +132 0 V +stroke +1.000 UL +LTb +LCb setrgbcolor +[] 0 setdash +686 1323 M +63 0 V +4224 0 R +-63 0 V +stroke +LTb +686 1416 M +31 0 V +4256 0 R +-31 0 V +686 1471 M 31 0 V 4256 0 R -31 0 V @@ -564,7 +615,23 @@ LTb 31 0 V 4256 0 R -31 0 V -686 1525 M +686 1539 M +31 0 V +4256 0 R +-31 0 V +686 1564 M +31 0 V +4256 0 R +-31 0 V +686 1584 M +31 0 V +4256 0 R +-31 0 V +686 1602 M +31 0 V +4256 0 R +-31 0 V +686 1618 M 31 0 V 4256 0 R -31 0 V @@ -572,99 +639,48 @@ stroke 0.500 UL LTa LCa setrgbcolor -686 1540 M -2541 0 V -1614 0 R -132 0 V -stroke -1.000 UL -LTb -LCb setrgbcolor -[] 0 setdash -686 1540 M -63 0 V -4224 0 R --63 0 V -stroke -LTb -686 1840 M -31 0 V -4256 0 R --31 0 V -686 1919 M -31 0 V -4256 0 R --31 0 V -686 1966 M -31 0 V -4256 0 R --31 0 V -686 2000 M -31 0 V -4256 0 R --31 0 V -686 2026 M -31 0 V -4256 0 R --31 0 V -686 2048 M -31 0 V -4256 0 R --31 0 V -686 2066 M -31 0 V -4256 0 R --31 0 V -686 2082 M -31 0 V -4256 0 R --31 0 V -stroke -0.500 UL -LTa -LCa setrgbcolor -686 2097 M +686 1632 M 4287 0 V stroke 1.000 UL LTb LCb setrgbcolor [] 0 setdash -686 2097 M +686 1632 M 63 0 V 4224 0 R -63 0 V stroke LTb -686 2397 M +686 1725 M 31 0 V 4256 0 R -31 0 V -686 2476 M +686 1780 M 31 0 V 4256 0 R -31 0 V -686 2523 M +686 1819 M 31 0 V 4256 0 R -31 0 V -686 2557 M +686 1849 M 31 0 V 4256 0 R -31 0 V -686 2583 M +686 1873 M 31 0 V 4256 0 R -31 0 V -686 2605 M +686 1894 M 31 0 V 4256 0 R -31 0 V -686 2623 M +686 1912 M 31 0 V 4256 0 R -31 0 V -686 2639 M +686 1928 M 31 0 V 4256 0 R -31 0 V @@ -672,48 +688,244 @@ stroke 0.500 UL LTa LCa setrgbcolor -686 2654 M +686 1942 M 4287 0 V stroke 1.000 UL LTb LCb setrgbcolor [] 0 setdash -686 2654 M +686 1942 M 63 0 V 4224 0 R -63 0 V stroke LTb -686 2954 M +686 2035 M 31 0 V 4256 0 R -31 0 V -686 3033 M +686 2089 M 31 0 V 4256 0 R -31 0 V -686 3080 M +686 2128 M 31 0 V 4256 0 R -31 0 V -686 3114 M +686 2158 M 31 0 V 4256 0 R -31 0 V -686 3140 M +686 2183 M 31 0 V 4256 0 R -31 0 V -686 3162 M +686 2203 M 31 0 V 4256 0 R -31 0 V +686 2221 M +31 0 V +4256 0 R +-31 0 V +686 2237 M +31 0 V +4256 0 R +-31 0 V +stroke +0.500 UL +LTa +LCa setrgbcolor +686 2251 M +4287 0 V +stroke +1.000 UL +LTb +LCb setrgbcolor +[] 0 setdash +686 2251 M +63 0 V +4224 
0 R +-63 0 V +stroke +LTb +686 2344 M +31 0 V +4256 0 R +-31 0 V +686 2399 M +31 0 V +4256 0 R +-31 0 V +686 2438 M +31 0 V +4256 0 R +-31 0 V +686 2468 M +31 0 V +4256 0 R +-31 0 V +686 2492 M +31 0 V +4256 0 R +-31 0 V +686 2513 M +31 0 V +4256 0 R +-31 0 V +686 2531 M +31 0 V +4256 0 R +-31 0 V +686 2547 M +31 0 V +4256 0 R +-31 0 V +stroke +0.500 UL +LTa +LCa setrgbcolor +686 2561 M +4287 0 V +stroke +1.000 UL +LTb +LCb setrgbcolor +[] 0 setdash +686 2561 M +63 0 V +4224 0 R +-63 0 V +stroke +LTb +686 2654 M +31 0 V +4256 0 R +-31 0 V +686 2708 M +31 0 V +4256 0 R +-31 0 V +686 2747 M +31 0 V +4256 0 R +-31 0 V +686 2777 M +31 0 V +4256 0 R +-31 0 V +686 2801 M +31 0 V +4256 0 R +-31 0 V +686 2822 M +31 0 V +4256 0 R +-31 0 V +686 2840 M +31 0 V +4256 0 R +-31 0 V +686 2856 M +31 0 V +4256 0 R +-31 0 V +stroke +0.500 UL +LTa +LCa setrgbcolor +686 2870 M +4287 0 V +stroke +1.000 UL +LTb +LCb setrgbcolor +[] 0 setdash +686 2870 M +63 0 V +4224 0 R +-63 0 V +stroke +LTb +686 2963 M +31 0 V +4256 0 R +-31 0 V +686 3018 M +31 0 V +4256 0 R +-31 0 V +686 3056 M +31 0 V +4256 0 R +-31 0 V +686 3086 M +31 0 V +4256 0 R +-31 0 V +686 3111 M +31 0 V +4256 0 R +-31 0 V +686 3132 M +31 0 V +4256 0 R +-31 0 V +686 3150 M +31 0 V +4256 0 R +-31 0 V +686 3165 M +31 0 V +4256 0 R +-31 0 V +stroke +0.500 UL +LTa +LCa setrgbcolor 686 3180 M +4287 0 V +stroke +1.000 UL +LTb +LCb setrgbcolor +[] 0 setdash +686 3180 M +63 0 V +4224 0 R +-63 0 V +stroke +LTb +686 3273 M 31 0 V 4256 0 R -31 0 V -686 3196 M +686 3327 M +31 0 V +4256 0 R +-31 0 V +686 3366 M +31 0 V +4256 0 R +-31 0 V +686 3396 M +31 0 V +4256 0 R +-31 0 V +686 3420 M +31 0 V +4256 0 R +-31 0 V +686 3441 M +31 0 V +4256 0 R +-31 0 V +686 3459 M +31 0 V +4256 0 R +-31 0 V +686 3475 M 31 0 V 4256 0 R -31 0 V @@ -721,14 +933,14 @@ stroke 0.500 UL LTa LCa setrgbcolor -686 3211 M +686 3489 M 4287 0 V stroke 1.000 UL LTb LCb setrgbcolor [] 0 setdash -686 3211 M +686 3489 M 63 0 V 4224 0 R -63 0 V @@ -1070,66 +1282,59 @@ Z stroke % Begin plot #1 2.000 UL LTb -0.31 0.60 0.02 C 686 2137 M +0.80 0.00 0.00 C 686 1928 M 59 -2 V -54 2 V -96 2 V -43 9 V -77 2 V -101 14 V -59 4 V -80 12 V -70 12 V -83 16 V -73 10 V -65 18 V -73 18 V -78 19 V -69 23 V -82 24 V -72 23 V -72 22 V -71 26 V -70 23 V -74 28 V -76 26 V -71 27 V -76 28 V -70 26 V -72 28 V -73 27 V -75 29 V -73 27 V -72 29 V -72 27 V -73 28 V -73 29 V -73 28 V -73 28 V -72 28 V -73 29 V -73 28 V -72 28 V -73 29 V -73 28 V -73 28 V -72 29 V -73 28 V -73 28 V -72 28 V -73 29 V -73 -43 V -72 28 V -73 30 V -73 27 V -72 28 V -73 28 V -73 29 V -72 28 V -73 28 V -73 29 V -72 28 V -73 28 V +54 -7 V +96 1 V +43 1 V +77 0 V +101 36 V +59 -8 V +80 2 V +70 1 V +83 4 V +73 1 V +65 121 V +73 2 V +78 1 V +69 5 V +82 -3 V +72 169 V +72 -2 V +71 1 V +70 2 V +74 1 V +76 8 V +71 262 V +76 0 V +70 6 V +72 -3 V +73 2 V +75 3 V +73 252 V +72 0 V +72 -1 V +73 7 V +73 -2 V +73 2 V +73 215 V +72 0 V +73 -2 V +73 3 V +72 -1 V +73 3 V +73 215 V +73 -3 V +72 3 V +73 1 V +73 -1 V +72 1 V +73 217 V +73 4 V +72 1 V +73 0 V +73 0 V +72 0 V stroke LTw % End plot #1 @@ -1138,59 +1343,66 @@ LTw LTb LCb setrgbcolor [] 0 setdash -0.80 0.00 0.00 C 686 1805 M -59 -1 V -54 -6 V -96 1 V +0.99 0.69 0.24 C 686 1048 M +59 -35 V +54 17 V +96 41 V 43 0 V -77 0 V -101 33 V -59 -8 V -80 2 V -70 1 V -83 3 V -73 2 V -65 109 V -73 2 V -78 0 V -69 4 V -82 -2 V -72 152 V -72 -2 V -71 1 V -70 2 V -74 1 V -76 8 V -71 235 V -76 0 V -70 5 V -72 -2 V -73 2 V -75 2 V -73 227 V -72 0 V -72 -1 V -73 6 V -73 -1 V -73 1 V -73 194 V -72 0 V -73 -2 V -73 2 V -72 0 
V -73 2 V -73 194 V -73 -3 V -72 3 V -73 1 V -73 -1 V -72 1 V -73 195 V -73 4 V -72 1 V -73 0 V -73 0 V -72 0 V +77 41 V +101 41 V +59 12 V +80 38 V +70 27 V +83 37 V +73 32 V +65 25 V +73 30 V +78 48 V +69 28 V +82 32 V +72 39 V +72 29 V +71 32 V +70 31 V +74 44 V +76 52 V +71 39 V +76 21 V +70 31 V +72 31 V +73 34 V +75 31 V +73 34 V +72 28 V +72 33 V +73 31 V +73 33 V +73 31 V +73 32 V +72 35 V +73 35 V +73 92 V +72 17 V +73 48 V +73 41 V +73 5 V +72 81 V +73 17 V +73 91 V +72 43 V +73 51 V +73 44 V +72 43 V +73 42 V +73 41 V +72 35 V +73 38 V +73 28 V +72 43 V +73 36 V +73 36 V +72 36 V +73 34 V stroke LTw % End plot #2 @@ -1198,67 +1410,67 @@ LTw 2.000 UL LTb LCb setrgbcolor -[] 0 setdash -0.99 0.69 0.24 C 686 1014 M -59 -32 V -54 16 V -96 37 V -43 0 V -77 36 V -101 38 V -59 9 V -80 35 V -70 25 V -83 33 V -73 29 V -65 22 V -73 27 V -78 44 V -69 24 V +LT1 +0.99 0.69 0.24 C 686 1010 M +59 10 V +54 17 V +96 17 V +43 10 V +77 48 V +101 13 V +59 -2 V +80 39 V +70 13 V +83 -2 V +73 47 V +65 7 V +73 38 V +78 31 V +69 38 V 82 30 V -72 35 V -72 25 V -71 29 V -70 28 V -74 40 V -76 46 V -71 36 V -76 18 V -70 28 V +72 32 V 72 29 V -73 30 V -75 28 V -73 30 V -72 26 V -72 29 V -73 28 V -73 30 V -73 28 V -73 29 V -72 31 V -73 32 V -73 82 V -72 16 V -73 43 V -73 37 V -73 5 V -72 73 V -73 15 V -73 82 V -72 38 V -73 46 V -73 40 V -72 39 V -73 37 V -73 37 V -72 31 V -73 35 V -73 25 V -72 39 V -73 32 V -73 32 V +71 24 V +70 26 V +74 35 V +76 34 V +71 32 V +76 29 V +70 39 V +72 34 V +73 34 V +75 36 V +73 41 V +72 21 V 72 33 V -73 31 V +73 45 V +73 30 V +73 58 V +73 35 V +72 36 V +73 48 V +73 66 V +72 41 V +73 75 V +73 7 V +73 36 V +72 53 V +73 12 V +73 38 V +72 41 V +73 24 V +73 41 V +72 24 V +73 43 V +73 23 V +72 41 V +73 30 V +73 42 V +72 41 V +73 20 V +73 66 V +72 5 V +73 35 V stroke LTw % End plot #3 @@ -1267,66 +1479,66 @@ LTw LTb LCb setrgbcolor [] 0 setdash -0.20 0.40 0.64 C 686 740 M -59 22 V -54 -22 V -96 22 V -43 49 V -77 25 V -101 20 V -59 32 V +0.20 0.40 0.64 C 686 744 M +59 25 V +54 -25 V +96 25 V +43 54 V +77 27 V +101 23 V +59 36 V 80 0 V -70 31 V -83 30 V -73 30 V -65 13 V -73 22 V -78 23 V -69 26 V -82 23 V -72 27 V -72 28 V -71 22 V -70 28 V -74 24 V -76 30 V -71 26 V -76 29 V -70 27 V -72 28 V -73 27 V -75 30 V -73 27 V -72 28 V -72 29 V -73 33 V -73 22 V -73 30 V -73 33 V -72 25 V -73 36 V -73 20 V -72 30 V -73 33 V -73 32 V -73 26 V -72 34 V +70 34 V +83 33 V +73 34 V +65 14 V 73 24 V -73 27 V -72 47 V -73 12 V -73 26 V +78 26 V +69 29 V +82 25 V 72 31 V -73 28 V -73 27 V -72 29 V -73 28 V -73 33 V -72 23 V +72 30 V +71 25 V +70 31 V +74 27 V +76 33 V +71 29 V +76 32 V +70 31 V +72 30 V +73 30 V +75 33 V +73 30 V +72 32 V +72 32 V +73 37 V +73 25 V 73 32 V -73 31 V -72 24 V +73 37 V +72 27 V +73 41 V +73 23 V +72 33 V +73 36 V +73 36 V +73 29 V +72 37 V 73 27 V +73 30 V +72 52 V +73 13 V +73 30 V +72 33 V +73 32 V +73 30 V +72 32 V +73 31 V +73 37 V +72 26 V +73 35 V +73 34 V +72 27 V +73 31 V stroke LTw % End plot #4 @@ -1345,10 +1557,10 @@ Z stroke % Begin plot #5 2.000 UL LTb -0.31 0.60 0.02 C LCb setrgbcolor +0.80 0.00 0.00 C LCb setrgbcolor 2.000 UL LTb -0.31 0.60 0.02 C 4547 1460 M +0.80 0.00 0.00 C 4547 1460 M 162 0 V stroke LTw @@ -1358,10 +1570,10 @@ LTw LTb LCb setrgbcolor [] 0 setdash -0.80 0.00 0.00 C LCb setrgbcolor +0.99 0.69 0.24 C LCb setrgbcolor 2.000 UL LTb -0.80 0.00 0.00 C 4547 1262 M +0.99 0.69 0.24 C 4547 1262 M 162 0 V stroke LTw @@ -1370,10 +1582,11 @@ LTw 2.000 UL LTb LCb setrgbcolor -[] 0 setdash +LT1 0.99 0.69 0.24 C LCb setrgbcolor 2.000 UL LTb +LT1 0.99 
0.69 0.24 C 4547 1064 M 162 0 V stroke diff --git a/tex/gfx/perf.gp b/tex/gfx/perf.gp index 678bb52..da3bd48 100644 --- a/tex/gfx/perf.gp +++ b/tex/gfx/perf.gp @@ -21,11 +21,13 @@ set format y "\\footnotesize{$10^{%T}$}" set xlabel "\\footnotesize{number of grid points $G$}" set ylabel "\\footnotesize{$t$ in seconds}" offset +2.7,0 +#"perf/FastKDE.csv" using (column(1)**2):(column(2)) with lines lc rgb "#4E9A06" lw 2.0 title "\\footnotesize{FastKDE}",\ + plot \ - "perf/FastKDE.csv" using (column(1)**2):(column(2)) with lines lc rgb "#4E9A06" lw 2.0 title "\\footnotesize{FastKDE}",\ "perf/R.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#CC0000" lw 2.0 title "\\footnotesize{BKDE}",\ "perf/Box.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#FCAF3E" lw 2.0 title "\\footnotesize{BoxKDE}",\ - "perf/WeightedAverage.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#3465A4" lw 2.0 title "\\footnotesize{weighted-average}" + "perf/BoxSIMD.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#FCAF3E" lw 2.0 dashtype 2 title "\\footnotesize{BoxKDE (SIMD)}",\ + "perf/WeightedAverage.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#3465A4" lw 2.0 title "\\footnotesize{weighted average}" #FCAF3E #CC0000 diff --git a/tex/gfx/perf.tex b/tex/gfx/perf.tex index c493cc0..79bd2f3 100644 --- a/tex/gfx/perf.tex +++ b/tex/gfx/perf.tex @@ -82,15 +82,25 @@ \begin{picture}(5040.00,3600.00)% \gplgaddtomacro\gplbacktext{% \csname LTb\endcsname%% - \put(554,983){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-6}$}}}% + \put(554,704){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-7}$}}}% \csname LTb\endcsname%% - \put(554,1540){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-4}$}}}% + \put(554,1013){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-6}$}}}% \csname LTb\endcsname%% - \put(554,2097){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-2}$}}}% + \put(554,1323){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-5}$}}}% \csname LTb\endcsname%% - \put(554,2654){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{0}$}}}% + \put(554,1632){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-4}$}}}% \csname LTb\endcsname%% - \put(554,3211){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{2}$}}}% + \put(554,1942){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-3}$}}}% + \csname LTb\endcsname%% + \put(554,2251){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-2}$}}}% + \csname LTb\endcsname%% + \put(554,2561){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{-1}$}}}% + \csname LTb\endcsname%% + \put(554,2870){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{0}$}}}% + \csname LTb\endcsname%% + \put(554,3180){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{1}$}}}% + \csname LTb\endcsname%% + \put(554,3489){\makebox(0,0)[r]{\strut{}\footnotesize{$10^{2}$}}}% \csname LTb\endcsname%% \put(686,484){\makebox(0,0){\strut{}\footnotesize{$10^{2}$}}}% \csname LTb\endcsname%% @@ -111,13 +121,13 @@ \put(30,2096){\rotatebox{-270}{\makebox(0,0){\strut{}\footnotesize{$t$ in seconds}}}}% \put(2829,154){\makebox(0,0){\strut{}\footnotesize{number of grid points $G$}}}% \csname LTb\endcsname%% - \put(4415,1460){\makebox(0,0)[r]{\strut{}\footnotesize{FastKDE}}}% + \put(4415,1460){\makebox(0,0)[r]{\strut{}\footnotesize{BKDE}}}% \csname LTb\endcsname%% - \put(4415,1262){\makebox(0,0)[r]{\strut{}\footnotesize{BKDE}}}% + \put(4415,1262){\makebox(0,0)[r]{\strut{}\footnotesize{BoxKDE}}}% \csname LTb\endcsname%% - \put(4415,1064){\makebox(0,0)[r]{\strut{}\footnotesize{BoxKDE}}}% + \put(4415,1064){\makebox(0,0)[r]{\strut{}\footnotesize{BoxKDE 
(SIMD)}}}% \csname LTb\endcsname%% - \put(4415,866){\makebox(0,0)[r]{\strut{}\footnotesize{weighted-average}}}% + \put(4415,866){\makebox(0,0)[r]{\strut{}\footnotesize{weighted average}}}% }% \gplbacktext \put(0,0){\includegraphics{perf}}% diff --git a/tex/gfx/perf/BoxSIMD.csv b/tex/gfx/perf/BoxSIMD.csv new file mode 100644 index 0000000..a534140 --- /dev/null +++ b/tex/gfx/perf/BoxSIMD.csv @@ -0,0 +1,60 @@ +10 972 +11 1053 +12 1188 +14 1350 +15 1458 +17 2079 +20 2295 +22 2268 +25 3024 +28 3321 +32 3267 +36 4644 +40 4914 +45 6481 +51 8209 +57 10855 +65 13583 +73 17174 +82 21333 +92 25465 +103 31027 +116 40263 +131 51902 +147 65674 +166 81444 +186 1.0902e+05 +209 1.4037e+05 +235 1.8028e+05 +265 2.3677e+05 +298 3.2e+05 +335 3.7625e+05 +376 4.8003e+05 +423 6.727e+05 +476 8.3675e+05 +535 1.2853e+06 +602 1.6774e+06 +676 2.1823e+06 +760 3.1376e+06 +855 5.1244e+06 +961 6.9201e+06 +1081 1.2096e+07 +1215 1.2767e+07 +1366 1.6668e+07 +1536 2.4807e+07 +1726 2.7084e+07 +1941 3.5893e+07 +2182 4.8553e+07 +2453 5.8113e+07 +2758 7.8821e+07 +3101 9.445e+07 +3486 1.2988e+08 +3919 1.5446e+08 +4406 2.09e+08 +4953 2.6146e+08 +5568 3.5769e+08 +6260 4.8602e+08 +7038 5.6313e+08 +7912 9.2187e+08 +8895 9.5432e+08 +10000 1.2414e+09 From fa8feaf152dae5fb4201c2199ee18dd377f2203f Mon Sep 17 00:00:00 2001 From: k-a-z-u Date: Tue, 13 Mar 2018 15:31:32 +0100 Subject: [PATCH 07/11] fixed GFX --- tex/gfx/perf.eps | 120 ++++++++--------------------------------------- tex/gfx/perf.gp | 2 +- tex/gfx/perf.tex | 6 +-- 3 files changed, 22 insertions(+), 106 deletions(-) diff --git a/tex/gfx/perf.eps b/tex/gfx/perf.eps index 89b2ee3..5356ae2 100644 --- a/tex/gfx/perf.eps +++ b/tex/gfx/perf.eps @@ -1,7 +1,7 @@ %!PS-Adobe-2.0 EPSF-2.0 %%Title: perf.tex %%Creator: gnuplot 5.2 patchlevel 2 (Gentoo revision r0) -%%CreationDate: Tue Mar 13 13:04:14 2018 +%%CreationDate: Tue Mar 13 15:31:12 2018 %%DocumentFonts: %%BoundingBox: 50 50 302 230 %%EndComments @@ -441,7 +441,7 @@ SDict begin [ /Creator (gnuplot 5.2 patchlevel 2 (Gentoo revision r0)) % /Producer (gnuplot) % /Keywords () - /CreationDate (Tue Mar 13 13:04:14 2018) + /CreationDate (Tue Mar 13 15:31:12 2018) /DOCINFO pdfmark end } ifelse @@ -1147,8 +1147,8 @@ LTa LCa setrgbcolor 3544 704 M 0 63 V -0 792 R -0 1930 V +0 594 R +0 2128 V stroke 1.000 UL LTb @@ -1198,8 +1198,8 @@ LTa LCa setrgbcolor 4259 704 M 0 63 V -0 792 R -0 1930 V +0 594 R +0 2128 V stroke 1.000 UL LTb @@ -1274,9 +1274,9 @@ LTb 1.000 UL LTb 3227 767 N -0 792 V +0 594 V 1614 0 V -0 -792 V +0 -594 V -1614 0 V Z stroke % Begin plot #1 @@ -1410,74 +1410,6 @@ LTw 2.000 UL LTb LCb setrgbcolor -LT1 -0.99 0.69 0.24 C 686 1010 M -59 10 V -54 17 V -96 17 V -43 10 V -77 48 V -101 13 V -59 -2 V -80 39 V -70 13 V -83 -2 V -73 47 V -65 7 V -73 38 V -78 31 V -69 38 V -82 30 V -72 32 V -72 29 V -71 24 V -70 26 V -74 35 V -76 34 V -71 32 V -76 29 V -70 39 V -72 34 V -73 34 V -75 36 V -73 41 V -72 21 V -72 33 V -73 45 V -73 30 V -73 58 V -73 35 V -72 36 V -73 48 V -73 66 V -72 41 V -73 75 V -73 7 V -73 36 V -72 53 V -73 12 V -73 38 V -72 41 V -73 24 V -73 41 V -72 24 V -73 43 V -73 23 V -72 41 V -73 30 V -73 42 V -72 41 V -73 20 V -73 66 V -72 5 V -73 35 V -stroke -LTw -% End plot #3 -% Begin plot #4 -2.000 UL -LTb -LCb setrgbcolor [] 0 setdash 0.20 0.40 0.64 C 686 744 M 59 25 V @@ -1541,31 +1473,31 @@ LCb setrgbcolor 73 31 V stroke LTw -% End plot #4 +% End plot #3 LCw setrgbcolor -1.000 3227 767 1614 792 BoxColFill +1.000 3227 767 1614 594 BoxColFill 1.000 UL LTb LCb setrgbcolor [] 0 setdash 3227 767 N -0 792 V +0 594 V 1614 0 V -0 
-792 V +0 -594 V -1614 0 V Z stroke -% Begin plot #5 +% Begin plot #4 2.000 UL LTb 0.80 0.00 0.00 C LCb setrgbcolor 2.000 UL LTb -0.80 0.00 0.00 C 4547 1460 M +0.80 0.00 0.00 C 4547 1262 M 162 0 V stroke LTw -% End plot #5 -% Begin plot #6 +% End plot #4 +% Begin plot #5 2.000 UL LTb LCb setrgbcolor @@ -1573,26 +1505,12 @@ LCb setrgbcolor 0.99 0.69 0.24 C LCb setrgbcolor 2.000 UL LTb -0.99 0.69 0.24 C 4547 1262 M -162 0 V -stroke -LTw -% End plot #6 -% Begin plot #7 -2.000 UL -LTb -LCb setrgbcolor -LT1 -0.99 0.69 0.24 C LCb setrgbcolor -2.000 UL -LTb -LT1 0.99 0.69 0.24 C 4547 1064 M 162 0 V stroke LTw -% End plot #7 -% Begin plot #8 +% End plot #5 +% Begin plot #6 2.000 UL LTb LCb setrgbcolor @@ -1604,7 +1522,7 @@ LTb 162 0 V stroke LTw -% End plot #8 +% End plot #6 2.000 UL LTb LCb setrgbcolor diff --git a/tex/gfx/perf.gp b/tex/gfx/perf.gp index da3bd48..15956dc 100644 --- a/tex/gfx/perf.gp +++ b/tex/gfx/perf.gp @@ -22,11 +22,11 @@ set xlabel "\\footnotesize{number of grid points $G$}" set ylabel "\\footnotesize{$t$ in seconds}" offset +2.7,0 #"perf/FastKDE.csv" using (column(1)**2):(column(2)) with lines lc rgb "#4E9A06" lw 2.0 title "\\footnotesize{FastKDE}",\ +#"perf/BoxSIMD.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#FCAF3E" lw 2.0 dashtype 2 title "\\footnotesize{BoxKDE (SIMD)}",\ plot \ "perf/R.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#CC0000" lw 2.0 title "\\footnotesize{BKDE}",\ "perf/Box.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#FCAF3E" lw 2.0 title "\\footnotesize{BoxKDE}",\ - "perf/BoxSIMD.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#FCAF3E" lw 2.0 dashtype 2 title "\\footnotesize{BoxKDE (SIMD)}",\ "perf/WeightedAverage.csv" using (column(1)**2):(column(2)/1e9) with lines lc rgb "#3465A4" lw 2.0 title "\\footnotesize{weighted average}" #FCAF3E diff --git a/tex/gfx/perf.tex b/tex/gfx/perf.tex index 79bd2f3..fecdd1b 100644 --- a/tex/gfx/perf.tex +++ b/tex/gfx/perf.tex @@ -121,11 +121,9 @@ \put(30,2096){\rotatebox{-270}{\makebox(0,0){\strut{}\footnotesize{$t$ in seconds}}}}% \put(2829,154){\makebox(0,0){\strut{}\footnotesize{number of grid points $G$}}}% \csname LTb\endcsname%% - \put(4415,1460){\makebox(0,0)[r]{\strut{}\footnotesize{BKDE}}}% + \put(4415,1262){\makebox(0,0)[r]{\strut{}\footnotesize{BKDE}}}% \csname LTb\endcsname%% - \put(4415,1262){\makebox(0,0)[r]{\strut{}\footnotesize{BoxKDE}}}% - \csname LTb\endcsname%% - \put(4415,1064){\makebox(0,0)[r]{\strut{}\footnotesize{BoxKDE (SIMD)}}}% + \put(4415,1064){\makebox(0,0)[r]{\strut{}\footnotesize{BoxKDE}}}% \csname LTb\endcsname%% \put(4415,866){\makebox(0,0)[r]{\strut{}\footnotesize{weighted average}}}% }% From 7c407f950e08a068bfea6be12c60a3afe796ca00 Mon Sep 17 00:00:00 2001 From: Markus Bullmann Date: Tue, 13 Mar 2018 15:58:41 +0100 Subject: [PATCH 08/11] Going thru changes --- tex/bare_conf.tex | 2 +- tex/chapters/conclusion.tex | 4 +-- tex/chapters/experiments.tex | 48 +++++++++++++++++------------------ tex/chapters/multivariate.tex | 4 +-- tex/chapters/realworld.tex | 13 +++++----- tex/chapters/usage.tex | 3 ++- tex/egbib.bib | 10 ++++---- 7 files changed, 42 insertions(+), 42 deletions(-) diff --git a/tex/bare_conf.tex b/tex/bare_conf.tex index 069f978..ac7276a 100644 --- a/tex/bare_conf.tex +++ b/tex/bare_conf.tex @@ -121,7 +121,7 @@ \newcommand{\qq} [1]{``#1''} \newcommand{\eg} {e.\,g.} \newcommand{\ie} {i.\,e.} -\newcommand{\figref}[1]{Fig.~\ref{#1}} +\newcommand{\figref}[1]{fig.~\ref{#1}} % missing math operators 
\DeclareMathOperator*{\argmin}{arg\,min} \DeclareMathOperator*{\argmax}{arg\,max} diff --git a/tex/chapters/conclusion.tex b/tex/chapters/conclusion.tex index b69df65..4cbd0e7 100644 --- a/tex/chapters/conclusion.tex +++ b/tex/chapters/conclusion.tex @@ -2,13 +2,13 @@ Within this paper a novel approach for rapid approximation of the KDE was presented. This is achieved by considering the discrete convolution structure of the BKDE and thus elaborating its connection to digital signal processing, especially the Gaussian filter. -Using a box filter as an appropriate approximation results in an efficient computation scheme with a fully linear complexity and a negligible overhead, as demonstrated by the utilized experiments. +Using a box filter as an appropriate approximation results in an efficient computation scheme with a fully linear complexity and a negligible overhead, as demonstrated by the experiments. The analysis of the error showed that the method shows an similar error behaviour compared to the BKDE. In terms of calculation time, our approach outperforms other state of the art implementations. Despite being more efficient than other methods, the algorithmic complexity still increases in its exponent with an increasing number of dimensions. %future work kurz -Finally, such a fast computation scheme makes the KDE more attractive for real time use cases. +Finally, such a fast approximation scheme makes the KDE more attractive for real time use cases. In a sensor fusion context, the availability of a reconstructed density of the posterior enables many new approaches and techniques for finding a best estimate of the system's current state. diff --git a/tex/chapters/experiments.tex b/tex/chapters/experiments.tex index c7df8c7..3735d24 100644 --- a/tex/chapters/experiments.tex +++ b/tex/chapters/experiments.tex @@ -1,29 +1,32 @@ \section{Experiments} \subsection{Mean Integrated Squared Error} -We now empirically evaluate the feasibility of our BoxKDE method by analyzing its approximation error. -In order to evaluate the error the KDE and various approximations of it are computed and compared using the mean integrated squared error (MISE). -A synthetic sample set $\bm{X}$ with $N=1000$ obtained from a bivariate mixture normal density $f$ provides the basis of the comparison. -For each method an estimate is computed and the MISE of it relative to $f$ is calculated. -The specific structure of the underlying distribution clearly affects the error in the estimate, but only the closeness of the approximation to the KDE is of interest. +We empirically evaluate the feasibility of our BoxKDE method by analyzing its approximation error. +In order to evaluate the deviation of the estimate from the original density, the mean integrated squared error (MISE) is used. +Those errors are compared for the KDE and its various approximations. +To match the requirements of our application, a synthetic sample set $\mathcal{X}$ with $N=5000$ drawn from a bivariate mixture normal density $f$ given by \eqref{eq:normDist} provides the basis of the comparison. +For each method the estimate is computed and the MISE relative to $f$ is calculated. +The specific structure of the underlying distribution clearly affects the error in the estimate, but only the closeness of our approximation to the KDE is of interest. Hence, $f$ is of minor importance here and was chosen rather arbitrary to highlight the behavior of the BoxKDE. 
- \begin{equation} +\label{eq:normDist} \begin{split} \bm{X} \sim & ~\G{\VecTwo{0}{0}}{0.5\bm{I}} + \G{\VecTwo{3}{0}}{\bm{I}} + \G{\VecTwo{0}{3}}{\bm{I}} \\ - &+ \G{\VecTwo{-3}{0} }{\bm{I}} + \G{\VecTwo{0}{-3}}{\bm{I}} \text{,} + &+ \G{\VecTwo{-3}{0} }{\bm{I}} + \G{\VecTwo{0}{-3}}{\bm{I}} \end{split} \end{equation} -where the majority of the probability mass lies in the range $[-6; 6]^2$. + +%where the majority of the probability mass lies in the range $[-6; 6]^2$. \begin{figure}[t] \input{gfx/error.tex} - \caption{MISE relative to the ground truth as a function of $h$. While the error curves of the BKDE (red) and the BoxKDE based on the extended box filter (orange dotted line) resemble the overall course of the error of the exact KDE (green), the regular BoxKDE (orange) exhibits noticeable jumps due to rounding.} \label{fig:errorBandwidth} + \caption{MISE relative to the ground truth as a function of $h$. While the error curves of the BKDE (red) and the BoxKDE based on the extended box filter (orange dotted line) resemble the overall course of the error of the KDE (green), the regular BoxKDE (orange) exhibits noticeable jumps due to rounding.} \label{fig:errorBandwidth} \end{figure} -Four estimates are computed with varying bandwidth using the exact KDE, BKDE, BoxKDE, and ExBoxKDE, which uses the extended box filter. +Four estimates are computed with varying bandwidth using the KDE, BKDE, BoxKDE, and ExBoxKDE, which uses the extended box filter. +All estimates are calculated at $30\times 30$ equally spaced points. %Evaluated at $50^2$ points the exact KDE is compared to the BKDE, BoxKDE, and extended box filter approximation, which are evaluated at a smaller grid with $30^2$ points. -The graphs of the MISE between $f$ and the estimates as a function of $h\in[0.15; 1.0]$ are given in \figref{fig:errorBandwidth}. +The graphs of the MISE between $f$ and the estimates as a function of $h\in[0.15, 1.0]$ are given in \figref{fig:errorBandwidth}. A minimum error is obtained with $h=0.35$, for larger values oversmoothing occurs and the modes gradually fuse together. Both the BKDE and the ExBoxKDE resemble the error curve of the KDE quite well and stable. @@ -31,7 +34,7 @@ They are rather close to each other, with a tendency to diverge for larger $h$. In contrast, the error curve of the BoxKDE has noticeable jumps at $h=\{0.25, 0.40, 0.67, 0.82\}$. These jumps are caused by the rounding of the integer-valued box width given by \eqref{eq:boxidealwidth}. -As the extended box filter is able to approximate an exact $\sigma$, such discontinuities don't appear. +As the extended box filter is able to approximate an exact $\sigma$, such discontinuities do not appear. Consequently, it reduces the overall error of the approximation, even though only marginal in this scenario. The global average MISE over all values of $h$ is $0.0049$ for the regular box filter and $0.0047$ in case of the extended version. Likewise, the maximum MISE is $0.0093$ and $0.0091$, respectively. @@ -44,7 +47,7 @@ However, both cases do not give a deeper insight of the error behavior of our me \begin{figure}[t] %\includegraphics[width=\textwidth,height=6cm]{gfx/tmpPerformance.png} \input{gfx/perf.tex} - \caption{Logarithmic plot of the runtime performance with increasing grid size $G$ and bivariate data. The weighted-average estimate (blue) performs fastest followed by the BoxKDE (orange) approximation. 
Both the BKDE (red) and the FastKDE (green) are magnitudes slower, especially for $G<10^3$.}\label{fig:performance} + \caption{Logarithmic plot of the runtime performance with increasing grid size $G$ and bivariate data. The weighted-average estimate (blue) performs fastest followed by the BoxKDE (orange) approximation, which is magnitudes slower, especially for $G<10^3$.}\label{fig:performance} \end{figure} % kde, box filter, exbox in abhänigkeit von h (bild) @@ -54,19 +57,18 @@ However, both cases do not give a deeper insight of the error behavior of our me \subsection{Performance} In the following, we underpin the promising theoretical linear time complexity of our method with empirical time measurements compared to other methods. -All tests are performed on an Intel Core \mbox{i5-7600K} CPU with a frequency of \SI{4.2}{\giga\hertz}, and \SI{16}{\giga\byte} main memory. -We compare our C++ implementation of the BoxKDE approximation as shown in algorithm~\ref{alg:boxKDE} to the \texttt{ks} R package and the FastKDE Python implementation \cite{oBrien2016fast}. -The \texttt{ks} package provides a FFT-based BKDE implementation based on optimized C functions at its core. -With state estimation problems in mind, we additionally provide a C++ implementation of a weighted-average estimator. -As both methods are not using a grid, an equivalent input sample set was used for the weighted-average and the FastKDE. +All tests are performed on an Intel Core \mbox{i5-7600K} CPU @ \SI{4.2}{\giga\hertz}, and \SI{16}{\giga\byte} main memory. %, supporting the AVX2 instruction set +We compare our C++ implementation of the BoxKDE approximation as shown in algorithm~\ref{alg:boxKDE} to the R language \texttt{ks} package, which provides a FFT-based BKDE implementation based on optimized C functions at its core. +With state estimation problems in mind, we additionally provide a C++ implementation of a weighted average estimator. +An equivalently sized input sample set was used for the weighted average, as its runtime depends on the sample size and not the grid size. The results of the performance comparison are presented in \figref{fig:performance}. % O(N) gut erkennbar für box KDE und weighted average The linear complexity of the BoxKDE and the weighted average is clearly visible. % Gerade bei kleinen G bis 10^3 ist die box KDE schneller als R und FastKDE, aber das WA deutlich schneller als alle anderen -Especially for small $G$ up to $10^3$ the BoxKDE is much faster compared to BKDE and FastKDE. +Especially for small $G$ up to $10000$ grid points the BoxKDE is much faster compared to BKDE. % Bei zunehmend größeren G wird der Abstand zwischen box KDE und WA größer. -Nevertheless, the simple weighted-average approach performs the fastest, and with increasing $G$ the distance to the BoxKDE grows constantly. +Nevertheless, the simple weighted average approach performs the fastest. However, it is obvious that this comes with major disadvantages, like being prone to multimodalities, as discussed in section \ref{sec:intro}. % (Das kann auch daran liegen, weil das Binning mit größeren G langsamer wird, was ich mir aber nicht erklären kann! Vlt Cache Effekte) @@ -78,7 +80,7 @@ This behavior is caused by the underlying FFT algorithm. % Daher wird die Laufzeit sprunghaft langsamer wenn auf eine neue power of two aufgefüllt wird, ansonsten bleibt sie konstant. 
The FFT approach requires the input to be always rounded up to a power of two, what then causes a constant runtime behaviour within those boundaries and a strong performance deterioration at corresponding manifolds. % Der Abbruch bei G=4406^2 liegt daran, weil für größere Gs eine out of memory error ausgelöst wird. -The termination of BKDE graph at $G=4406^2$ is caused by an out of memory error for even bigger $G$ in the \texttt{ks} package. +The termination of BKDE graph at $G\approx 1.9 \cdot 10^7$ is caused by an out of memory error in the \texttt{ks} package for bigger $G$. % Der Plot für den normalen Box Filter wurde aus Gründen der Übersichtlichkeit weggelassen. % Sowohl der box filter als auch der extended box filter haben ein sehr ähnliches Laufzeit Verhalten und somit einen sehr ähnlichen Kurvenverlauf. @@ -87,14 +89,12 @@ Both discussed Gaussian filter approximations, namely box filter and extended bo While the average runtime over all values of $G$ for the standard box filter is \SI{0.4092}{\second}, the extended one has an average of \SI{0.4169}{\second}. To disambiguate \figref{fig:performance}, we only illustrated the results of the BoxKDE with the regular box filter. -The weighted-average has the great advantage of being independent of the dimensionality of the input and can be implemented effortlessly. +The weighted average has the great advantage of being independent of the dimensionality of the input and can be implemented effortlessly. In contrast, the computation of the BoxKDE approach increases exponentially with increasing number of dimensions. However, due to the linear time complexity and the very simple computation scheme, the overall computation time is still sufficiently fast for many applications and much smaller compared to other methods. The BoxKDE approach presents a reasonable alternative to the weighted-average and is easily integrated into existing systems. In addition, modern CPUs do benefit from the recursive computation scheme of the box filter, as the data exhibits a high degree of spatial locality in memory and the accesses are reliably predictable. -Furthermore, the computation is easily parallelized, as there is no data dependency between the one-dimensional filter passes in algorithm~\ref{alg:boxKDE}. -Hence, the inner loops can be parallelized using threads or SIMD instructions, but the overall speedup depends on the particular architecture and the size of the input. \input{chapters/realworld} diff --git a/tex/chapters/multivariate.tex b/tex/chapters/multivariate.tex index 7df81cd..edfaad0 100644 --- a/tex/chapters/multivariate.tex +++ b/tex/chapters/multivariate.tex @@ -9,8 +9,8 @@ In order to estimate a multivariate density using KDE or BKDE, a multivariate ke Multivariate kernel functions can be constructed in various ways, however, a popular way is given by the product kernel. Such a kernel is constructed by combining several univariate kernels into a product, where each kernel is applied in each dimension with a possibly different bandwidth. -Given a multivariate random variable $\bm{X}=(x_1,\dots ,x_d)$ in $d$ dimensions. -The sample set $\mathcal{X}$ is a $n\times d$ matrix \cite{scott2015}. +Given a multivariate random variable $\bm{X}=(x_1,\dots ,x_d)^T$ in $d$ dimensions. +The sample set $\mathcal{X}=(x_{i,j})=(\bm{X}_1, \dots, \bm{X}_n)$ is a $n\times d$ matrix \cite{scott2015}. 
The multivariate KDE $\hat{f}$ which defines the estimate pointwise at $\bm{u}=(u_1, \dots, u_d)^T$ is given as \begin{equation} \label{eq:mvKDE} diff --git a/tex/chapters/realworld.tex b/tex/chapters/realworld.tex index 54ae0f7..b7772d5 100644 --- a/tex/chapters/realworld.tex +++ b/tex/chapters/realworld.tex @@ -1,17 +1,17 @@ \subsection{Real World} -To demonstrate the real time capabilities of the proposed method a real world scenario was chosen, namely indoor localization. +To demonstrate the capabilities of the proposed method a real world scenario was chosen, namely indoor localization. The given problem is to localize a pedestrian walking inside a building. Ebner et al. proposed a method, which incorporates multiple sensors, \eg{} Wi-Fi, barometer, step-detection and turn-detection \cite{Ebner-15}. At a given time $t$ the system estimates a state providing the most probable position of the pedestrian. It is implemented using a particle filter with sample importance resampling and \SI{5000} particles. The dynamics are modelled realistically, which constrains the movement according to walls, doors and stairs. -We arranged a \SI{223}{\meter} long walk within the first floor of a \SI{2500}{m$^2$} museum, which was built in the 13th century and therefore offers non-optimal conditions for localization. +We arranged a \SI{223}{\meter} long walk within the first floor of a \mbox{\SI{76}{} $\times$ \SI{71}{\meter}} sized museum, which was built in the 13th century and therefore offers non-optimal conditions for localization. %The measurements for the walks were recorded using a Motorola Nexus 6 at 2.4 GHz band only. % Since this work only focuses on processing a given sample set, further details of the localisation system and the described scenario can be looked up in \cite{Ebner17} and \cite{Fetzer17}. -The spacing $\delta$ of the grid was set to \SI{20}{\centimeter} for $x$ and $y$-direction. +The spacing $\delta$ of the grid was set to \SI{27}{\centimeter} for $x$ and $y$-direction, resulting in a grid size of $G=74019$. The bivariate state estimation was calculated whenever a step was recognized, about every \SI{500}{\milli \second}. %The intention of a real world experiment is to investigate the advantages and disadvantages of the here proposed method for finding a best estimate of the pedestrian's position in the wild, compared to conventional used methods like the weighted-average or choosing the maximum weighted particle. @@ -21,7 +21,7 @@ The bivariate state estimation was calculated whenever a step was recognized, ab \label{fig:realWorldMulti} \end{figure} % -\figref{fig:realWorldMulti} illustrates a frequently occurring situation, where the particle set splits apart, due to uncertain measurements and multiple possible walking directions. +Fig.~\ref{fig:realWorldMulti} illustrates a frequently occurring situation, where the particle set splits apart, due to uncertain measurements and multiple possible walking directions. This results in a bimodal posterior distribution, which reaches its maximum distances between the modes at \SI{13.4}{\second} (black dotted line). Thus estimating the most probable state over time using the weighted-average results in the blue line, describing the pedestrian's position to be somewhere outside the building (light green area). In contrast, the here proposed method (orange line) is able to retrieve a good estimate compared to the ground truth path shown by the black solid line. 
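The averaging failure described above is easy to reproduce with made-up numbers. The C++ sketch below uses an invented one-dimensional bimodal particle set (positions and weights are purely illustrative, nothing from the experiment): the weighted average falls between the two clusters, while a mode-based estimate stays on the heavier one.

// Toy numbers only (not taken from the experiment): a bimodal particle set in
// one dimension. The weighted average lands between the clusters, while a
// mode-based estimate stays on the heavier one.
#include <cstdio>

int main() {
    const double x[6] = { 0.0,  0.2, -0.1, 10.0, 10.1,  9.9};   // particle positions
    const double w[6] = { 0.25, 0.20, 0.20, 0.12, 0.12, 0.11};  // normalized weights
    double wavg = 0.0;
    for (int i = 0; i < 6; ++i) wavg += w[i] * x[i];
    std::printf("weighted average: %.2f\n", wavg);    // ~3.52, far from both clusters
    const double left  = w[0] + w[1] + w[2];          // mass of the cluster near 0
    const double right = w[3] + w[4] + w[5];          // mass of the cluster near 10
    std::printf("dominant mode near: %.0f\n", left > right ? 0.0 : 10.0);
    return 0;
}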
@@ -37,15 +37,14 @@ The error over time for different estimation methods of the complete walk can be It is given by calculating the distance between estimation and ground truth at a specific time $t$. Estimates provided by simply choosing the maximum particle stand out the most. As expected beforehand, this method provides many strong peaks through continuously jumping between single particles. -Additionally, in most real world scenarios many particles share the same weight and thus multiple highest-weighted particles exist. \begin{figure} \input{gfx/errorOverTime.tex} - \caption{Error development over time calculated between estimation and ground truth. Between \SI{230}{\second} and \SI{290}{\second} to pedestrian was not moving.} + \caption{Error development over time of a single Monte Carlo run of the walk calculated between estimation and ground truth. Between \SI{230}{\second} and \SI{290}{\second} to pedestrian was not moving.} \label{fig:realWorldTime} \end{figure} -Further investigating \figref{fig:realWorldTime}, the BoxKDE performs slightly better than the weighted-average. +Further investigating \figref{fig:realWorldTime}, the BoxKDE performs slightly better than the weighted-average in this specific Monte Carlo run. However after deploying \SI{100} Monte Carlo runs, the difference becomes insignificant. The main reason for this are again multimodalities caused by faulty or delayed measurements, especially when entering or leaving rooms. Within our experiments the problem occurred due to slow and attenuated Wi-Fi signals inside thick-walled rooms. diff --git a/tex/chapters/usage.tex b/tex/chapters/usage.tex index 052ee06..93cc892 100644 --- a/tex/chapters/usage.tex +++ b/tex/chapters/usage.tex @@ -9,7 +9,7 @@ Consider a set of two-dimensional samples with associated weights, \eg{} generat The overall process for bivariate data is described in Algorithm~\ref{alg:boxKDE}. Assuming that the given $N$ samples are stored in a sequential list, the first step is to create a grid representation. -In order to efficiently construct the grid and to allocate the required memory, the extrema of the samples need to be known in advance. +In order to efficiently construct the grid and to allocate the required memory, the extrema of the samples in each dimension need to be known in advance. These limits might be given by the application. For example, the position of a pedestrian within a building is limited by the physical dimensions of the building. Such knowledge should be integrated into the system to avoid a linear search over the sample set, naturally reducing the computation time. @@ -74,4 +74,5 @@ Depending on the required accuracy, the extended box filter algorithm can furthe Due to its simple indexing scheme, the recursive box filter can easily be computed in parallel using SIMD operations and parallel computation cores. Finally, the most likely state can be obtained from the filtered data, \ie{} from the estimated discrete density, by searching filtered data for its maximum value. +This last step can be integrated into the last filter operation, by recording the largest output value. 
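To make the recursive filtering described in usage.tex above concrete, here is a minimal C++ sketch of a single zero-padded box pass over one row of binned weights, followed by the maximum lookup. It is only an illustration of the idea, not the paper's implementation; the pass count, window width, boundary handling and all identifiers are assumptions chosen for the example.

// Minimal sketch, not the authors' implementation: one zero-padded recursive
// box-filter pass over a single row of binned sample weights, plus the final
// maximum lookup mentioned above. Names, pass count and width are illustrative.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

// One centered box pass of odd width w; the running window sum makes the cost
// O(n) per pass, independent of w (values outside the grid are treated as 0).
std::vector<double> boxPass(const std::vector<double>& row, std::size_t w) {
    const std::size_t n = row.size();
    std::vector<double> out(n, 0.0);
    if (n == 0) return out;
    const std::size_t r = w / 2;                       // window radius
    double sum = 0.0;
    for (std::size_t j = 0; j <= std::min(r, n - 1); ++j) sum += row[j];
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = sum / static_cast<double>(w);         // average over the window
        if (i + r + 1 < n) sum += row[i + r + 1];      // sample entering the window
        if (i >= r)        sum -= row[i - r];          // sample leaving the window
    }
    return out;
}

int main() {
    // one grid row holding binned (weighted) sample counts
    std::vector<double> row = {0, 0, 1, 4, 2, 0, 0, 3, 1, 0, 0};
    // a few successive passes approximate a Gaussian smoothing; the width would
    // normally follow from the bandwidth via the ideal-width equation, here it is fixed
    for (int pass = 0; pass < 3; ++pass) row = boxPass(row, 5);
    // most probable cell of this row; as noted in the text, this lookup can be
    // folded into the last pass by recording the largest output value
    const auto it = std::max_element(row.begin(), row.end());
    std::cout << "mode at grid index " << (it - row.begin()) << "\n";
    return 0;
}

Each output reuses the running window sum, so a pass costs O(G) independently of the bandwidth, which is where the linear complexity discussed in experiments.tex comes from.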
diff --git a/tex/egbib.bib b/tex/egbib.bib index 2c514ec..54af474 100644 --- a/tex/egbib.bib +++ b/tex/egbib.bib @@ -1722,7 +1722,7 @@ doi={10.1109/PLANS.2008.4570051},} title={{Multi Sensor 3D Indoor Localisation}}, year={2015}, publisher = {IEEE}, - address = {Banff, Canada}, + address = {}, IGNOREmonth={October}, pages={}, } @@ -2869,7 +2869,7 @@ year = {2003} year = {2016}, publisher = {IEEE}, pages = {}, - address = {Madrid, Spain}, + address = {}, issn = {} } @@ -3060,7 +3060,7 @@ year = {2003} author = {Scott, David W.}, year = {2015}, title = {Multivariate Density Estimation: Theory, Practice, and Visualization}, - address = {Hoboken, NJ}, + address = {}, edition = {2}, publisher = {Wiley}, isbn = {978-0-471-69755-8}, @@ -3081,13 +3081,13 @@ DOI = {10.3390/ijgi6080233} @INPROCEEDINGS{Fetzer17, author={T. Fetzer and F. Ebner and F. Deinzer and M. Grzegorzek}, -booktitle={2017 Int. Conf. on Indoor Positioning and Indoor Navigation (IPIN)}, +booktitle={Int. Conf. on Indoor Positioning and Indoor Navigation (IPIN)}, title={Recovering from sample impoverishment in context of indoor localisation}, year={2017}, pages={1-8}, doi={10.1109/IPIN.2017.8115863}, ISSN={}, -month={Sept},} +month={},} From cac8868b36250c785b14aa8a5ba7d0fcca9b05ad Mon Sep 17 00:00:00 2001 From: kazu Date: Wed, 14 Mar 2018 08:08:08 +0100 Subject: [PATCH 09/11] adjusted error gfx --- tex/gfx/error.eps | 710 +++++++++++++++++++--------------------- tex/gfx/error.tex | 18 +- tex/gfx/error/Box.csv | 172 +++++----- tex/gfx/error/ExBox.csv | 172 +++++----- tex/gfx/error/KDE.csv | 172 +++++----- tex/gfx/error/R.csv | 172 +++++----- 6 files changed, 680 insertions(+), 736 deletions(-) diff --git a/tex/gfx/error.eps b/tex/gfx/error.eps index 0766daa..d7cbe86 100644 --- a/tex/gfx/error.eps +++ b/tex/gfx/error.eps @@ -1,7 +1,7 @@ %!PS-Adobe-2.0 EPSF-2.0 %%Title: error.tex %%Creator: gnuplot 5.2 patchlevel 2 -%%CreationDate: Mon Feb 26 10:56:22 2018 +%%CreationDate: Wed Mar 14 07:59:35 2018 %%DocumentFonts: %%BoundingBox: 50 50 302 230 %%EndComments @@ -441,7 +441,7 @@ SDict begin [ /Creator (gnuplot 5.2 patchlevel 2) % /Producer (gnuplot) % /Keywords () - /CreationDate (Mon Feb 26 10:56:22 2018) + /CreationDate (Wed Mar 14 07:59:35 2018) /DOCINFO pdfmark end } ifelse @@ -504,30 +504,14 @@ LTb 0.500 UL LTa LCa setrgbcolor -567 1013 M +567 1168 M 4406 0 V stroke 1.000 UL LTb LCb setrgbcolor [] 0 setdash -567 1013 M -63 0 V -4343 0 R --63 0 V -stroke -LTb -0.500 UL -LTa -LCa setrgbcolor -567 1323 M -4406 0 V -stroke -1.000 UL -LTb -LCb setrgbcolor -[] 0 setdash -567 1323 M +567 1168 M 63 0 V 4343 0 R -63 0 V @@ -552,30 +536,14 @@ LTb 0.500 UL LTa LCa setrgbcolor -567 1942 M +567 2097 M 4406 0 V stroke 1.000 UL LTb LCb setrgbcolor [] 0 setdash -567 1942 M -63 0 V -4343 0 R --63 0 V -stroke -LTb -0.500 UL -LTa -LCa setrgbcolor -567 2251 M -4406 0 V -stroke -1.000 UL -LTb -LCb setrgbcolor -[] 0 setdash -567 2251 M +567 2097 M 63 0 V 4343 0 R -63 0 V @@ -602,7 +570,7 @@ LTb 0.500 UL LTa LCa setrgbcolor -567 2870 M +567 3025 M 132 0 V 1218 0 R 3056 0 V @@ -611,25 +579,7 @@ stroke LTb LCb setrgbcolor [] 0 setdash -567 2870 M -63 0 V -4343 0 R --63 0 V -stroke -LTb -0.500 UL -LTa -LCa setrgbcolor -567 3180 M -132 0 V -1218 0 R -3056 0 V -stroke -1.000 UL -LTb -LCb setrgbcolor -[] 0 setdash -567 3180 M +567 3025 M 63 0 V 4343 0 R -63 0 V @@ -837,92 +787,92 @@ Z stroke % Begin plot #1 2.000 UL LTb -0.31 0.60 0.02 C 812 1673 M -49 -195 V -49 -2 V -49 -177 V -49 -50 V -49 -101 V -49 -2 V -48 -158 V +0.31 0.60 0.02 C 812 900 M +49 3 V +49 
-36 V 49 12 V -49 -94 V -49 -61 V -49 42 V -49 -27 V -49 -23 V -49 67 V -49 -73 V -49 -10 V -49 1 V -49 -24 V +49 -16 V 49 -25 V -49 42 V -49 -17 V -49 89 V -49 -70 V -49 29 V -49 27 V -49 31 V -49 -24 V -49 17 V -48 22 V -49 68 V -49 2 V -49 -9 V -49 47 V -49 1 V -49 7 V -49 67 V -49 -7 V -49 55 V -49 10 V -49 40 V -49 55 V -49 28 V -49 13 V -49 34 V -49 25 V -49 23 V -49 88 V 49 -5 V -49 38 V -49 50 V -49 7 V -48 46 V -49 30 V -49 52 V -49 29 V -49 29 V -49 14 V -49 32 V -49 21 V -49 50 V -49 35 V -49 14 V -49 64 V -49 -6 V -49 60 V -49 61 V -49 -32 V -49 61 V -49 47 V -49 12 V -49 54 V -49 27 V -49 16 V -48 18 V +48 -11 V +49 5 V +49 0 V +49 -10 V 49 25 V -49 57 V -49 56 V -49 -2 V -49 49 V -49 19 V -49 20 V -49 26 V -49 50 V -49 15 V +49 4 V +49 -1 V +49 24 V +49 16 V 49 1 V +49 6 V +49 15 V +49 17 V +49 14 V +49 16 V +49 4 V +49 45 V +49 3 V +49 45 V +49 -6 V +49 5 V +49 68 V +48 16 V +49 26 V +49 17 V +49 37 V +49 28 V +49 36 V +49 35 V +49 23 V +49 22 V +49 48 V +49 23 V +49 52 V +49 45 V +49 14 V +49 38 V +49 29 V +49 36 V +49 48 V +49 50 V +49 27 V +49 34 V +49 37 V +49 51 V +48 47 V +49 23 V +49 37 V +49 43 V +49 58 V +49 40 V +49 28 V +49 39 V +49 42 V +49 48 V +49 30 V +49 56 V +49 32 V +49 44 V +49 38 V +49 44 V +49 23 V +49 45 V +49 41 V +49 40 V +49 37 V +49 50 V +48 34 V +49 34 V +49 35 V +49 44 V +49 23 V +49 37 V +49 45 V +49 38 V +49 21 V +49 32 V +49 40 V +49 39 V stroke LTw % End plot #1 @@ -931,92 +881,92 @@ LTw LTb LCb setrgbcolor [] 0 setdash -0.80 0.00 0.00 C 812 1324 M -49 -87 V -49 25 V -49 -94 V -49 46 V -49 -151 V -49 67 V -48 -137 V -49 26 V -49 -46 V -49 -45 V -49 56 V -49 -21 V -49 28 V -49 78 V -49 -76 V -49 -11 V -49 27 V -49 -17 V -49 -11 V -49 58 V -49 -1 V -49 97 V -49 -64 V -49 39 V -49 40 V -49 44 V -49 -21 V -49 14 V -48 33 V -49 91 V -49 6 V -49 -16 V -49 53 V -49 7 V -49 12 V -49 77 V -49 -1 V -49 56 V -49 14 V -49 50 V -49 60 V -49 26 V -49 23 V -49 31 V -49 37 V -49 24 V -49 87 V +0.80 0.00 0.00 C 812 840 M +49 11 V +49 -15 V +49 22 V 49 -3 V -49 42 V -49 59 V -49 10 V -48 47 V -49 30 V -49 59 V -49 31 V -49 27 V +49 -18 V +49 5 V +48 -3 V 49 15 V -49 37 V -49 23 V -49 51 V +49 4 V +49 -3 V 49 35 V -49 17 V -49 67 V -49 -8 V -49 63 V -49 64 V -49 -29 V -49 58 V -49 50 V -49 15 V -49 55 V -49 28 V -49 17 V -48 17 V -49 26 V -49 58 V -49 59 V -49 -4 V -49 54 V +49 10 V +49 2 V +49 32 V +49 21 V +49 3 V +49 11 V +49 16 V +49 24 V 49 16 V -49 23 V -49 26 V -49 51 V 49 17 V -49 -3 V +49 6 V +49 50 V +49 3 V +49 48 V +49 -5 V +49 5 V +49 71 V +48 15 V +49 29 V +49 19 V +49 37 V +49 29 V +49 35 V +49 37 V +49 22 V +49 23 V +49 48 V +49 22 V +49 51 V +49 44 V +49 16 V +49 37 V +49 28 V +49 36 V +49 48 V +49 48 V +49 26 V +49 35 V +49 35 V +49 50 V +48 46 V +49 22 V +49 36 V +49 41 V +49 57 V +49 38 V +49 27 V +49 38 V +49 39 V +49 48 V +49 29 V +49 54 V +49 31 V +49 41 V +49 37 V +49 43 V +49 22 V +49 43 V +49 39 V +49 39 V +49 36 V +49 47 V +48 33 V +49 33 V +49 33 V +49 42 V +49 22 V +49 36 V +49 43 V +49 37 V +49 19 V +49 31 V +49 38 V +49 37 V stroke LTw % End plot #2 @@ -1025,92 +975,92 @@ LTw LTb LCb setrgbcolor [] 0 setdash -0.99 0.69 0.24 C 812 1422 M -49 -44 V -49 96 V -49 -65 V -49 92 V -49 -62 V -49 83 V -48 -172 V -49 95 V -49 -403 V -49 -39 V -49 39 V -49 -16 V -49 44 V -49 60 V -49 -81 V -49 -16 V -49 15 V -49 -25 V -49 -25 V -49 45 V -49 -19 V -49 82 V -49 -87 V -49 17 V -49 20 V -49 21 V -49 315 V -49 -16 V -48 6 V -49 62 V -49 -19 V -49 -44 V -49 22 V -49 -21 V -49 -19 V -49 44 V -49 -33 V -49 23 V -49 355 V +0.99 0.69 0.24 C 812 843 M 
49 14 V -49 22 V -49 -8 V -49 -10 V -49 -4 V -49 0 V -49 -10 V -49 53 V -49 -40 V -49 5 V -49 25 V -49 -30 V -48 13 V -49 709 V -49 17 V -49 -8 V -49 -8 V +49 -14 V +49 23 V 49 -5 V -49 -5 V -49 -6 V -49 8 V -49 2 V -49 -13 V -49 25 V -49 -37 V -49 28 V -49 28 V -49 -60 V -49 472 V -49 13 V -49 -7 V -49 15 V +49 -17 V +49 -1 V +48 -8 V +49 6 V +49 1 V +49 -10 V +49 262 V 49 -2 V 49 -13 V -48 -9 V -49 -8 V -49 23 V -49 24 V -49 -29 V -49 18 V -49 -7 V -49 319 V +49 21 V +49 6 V 49 -12 V -49 26 V +49 -5 V +49 -2 V +49 9 V +49 -6 V +49 -1 V +49 -15 V +49 28 V +49 -19 V +49 24 V +49 -28 V +49 -19 V +49 42 V +48 -9 V +49 441 V +49 -5 V +49 5 V +49 -2 V +49 3 V +49 7 V +49 -11 V +49 -8 V +49 14 V +49 -12 V +49 16 V +49 9 V +49 -21 V +49 437 V +49 -7 V +49 -2 V +49 10 V +49 10 V +49 -11 V +49 -4 V +49 -4 V +49 12 V +48 7 V +49 -17 V +49 -4 V +49 3 V +49 18 V +49 -1 V +49 -14 V +49 841 V +49 -3 V +49 11 V 49 -10 V -49 -26 V +49 12 V +49 -7 V +49 3 V +49 -1 V +49 4 V +49 -13 V +49 4 V +49 2 V +49 3 V +49 -2 V +49 12 V +48 -4 V +49 533 V +49 -2 V +49 4 V +49 -8 V +49 3 V +49 6 V +49 3 V +49 -11 V +49 -3 V +49 6 V +49 4 V stroke LTw % End plot #3 @@ -1119,92 +1069,92 @@ LTw LTb LCb setrgbcolor LT2 -0.99 0.69 0.24 C 812 1180 M -49 -59 V -49 39 V -49 -53 V -49 70 V -49 -133 V -49 85 V -48 -124 V -49 34 V -49 -45 V -49 -42 V -49 57 V -49 -23 V -49 23 V -49 78 V -49 -81 V -49 -12 V -49 23 V -49 -21 V -49 -14 V -49 57 V -49 -2 V -49 97 V -49 -66 V -49 40 V -49 40 V -49 46 V -49 -20 V -49 15 V -48 35 V -49 94 V -49 9 V -49 -12 V -49 56 V -49 12 V -49 16 V -49 81 V -49 4 V -49 61 V -49 21 V -49 54 V -49 65 V -49 32 V -49 31 V -49 34 V -49 33 V -49 25 V -49 82 V -49 -2 V -49 40 V -49 55 V +0.99 0.69 0.24 C 812 839 M 49 11 V -48 47 V -49 31 V -49 58 V -49 30 V -49 29 V -49 20 V -49 36 V -49 27 V -49 50 V -49 38 V -49 21 V -49 66 V -49 -2 V -49 65 V -49 64 V -49 -20 V -49 57 V -49 52 V -49 21 V -49 55 V +49 -13 V +49 26 V +49 0 V +49 -16 V +49 5 V +48 -4 V +49 14 V +49 3 V +49 -5 V 49 33 V +49 8 V +49 -1 V +49 31 V +49 20 V +49 2 V +49 10 V +49 15 V +49 23 V +49 15 V +49 17 V +49 6 V +49 51 V +49 3 V +49 49 V +49 -5 V +49 7 V +49 72 V +48 18 V +49 29 V 49 21 V -48 24 V -49 28 V -49 58 V -49 58 V -49 4 V -49 50 V +49 40 V +49 31 V +49 38 V +49 40 V 49 26 V 49 27 V -49 23 V -49 56 V +49 52 V +49 27 V +49 55 V +49 49 V +49 21 V +49 42 V +49 34 V +49 41 V +49 54 V +49 54 V +49 33 V +49 36 V +49 33 V +49 48 V +48 44 V 49 20 V -49 4 V +49 36 V +49 40 V +49 55 V +49 39 V +49 27 V +49 39 V +49 39 V +49 49 V +49 30 V +49 54 V +49 33 V +49 43 V +49 39 V +49 45 V +49 26 V +49 45 V +49 42 V +49 42 V +49 39 V +49 51 V +48 36 V +49 36 V +49 37 V +49 45 V +49 27 V +49 41 V +49 45 V +49 41 V +49 25 V +49 34 V +49 42 V +49 40 V stroke LTw % End plot #4 diff --git a/tex/gfx/error.tex b/tex/gfx/error.tex index cbcf985..d4dc855 100644 --- a/tex/gfx/error.tex +++ b/tex/gfx/error.tex @@ -82,25 +82,19 @@ \begin{picture}(5040.00,3600.00)% \gplgaddtomacro\gplbacktext{% \csname LTb\endcsname%% - \put(435,704){\makebox(0,0)[r]{\strut{}\footnotesize{1}}}% + \put(435,704){\makebox(0,0)[r]{\strut{}\footnotesize{0}}}% \csname LTb\endcsname%% - \put(435,1013){\makebox(0,0)[r]{\strut{}\footnotesize{2}}}% - \csname LTb\endcsname%% - \put(435,1323){\makebox(0,0)[r]{\strut{}\footnotesize{3}}}% + \put(435,1168){\makebox(0,0)[r]{\strut{}\footnotesize{2}}}% \csname LTb\endcsname%% \put(435,1632){\makebox(0,0)[r]{\strut{}\footnotesize{4}}}% \csname LTb\endcsname%% - \put(435,1942){\makebox(0,0)[r]{\strut{}\footnotesize{5}}}% + 
\put(435,2097){\makebox(0,0)[r]{\strut{}\footnotesize{6}}}% \csname LTb\endcsname%% - \put(435,2251){\makebox(0,0)[r]{\strut{}\footnotesize{6}}}% + \put(435,2561){\makebox(0,0)[r]{\strut{}\footnotesize{8}}}% \csname LTb\endcsname%% - \put(435,2561){\makebox(0,0)[r]{\strut{}\footnotesize{7}}}% + \put(435,3025){\makebox(0,0)[r]{\strut{}\footnotesize{10}}}% \csname LTb\endcsname%% - \put(435,2870){\makebox(0,0)[r]{\strut{}\footnotesize{8}}}% - \csname LTb\endcsname%% - \put(435,3180){\makebox(0,0)[r]{\strut{}\footnotesize{9}}}% - \csname LTb\endcsname%% - \put(435,3489){\makebox(0,0)[r]{\strut{}\footnotesize{10}}}% + \put(435,3489){\makebox(0,0)[r]{\strut{}\footnotesize{12}}}% \csname LTb\endcsname%% \put(567,484){\makebox(0,0){\strut{}\footnotesize{0.1}}}% \csname LTb\endcsname%% diff --git a/tex/gfx/error/Box.csv b/tex/gfx/error/Box.csv index cd20b2a..f2c6cac 100644 --- a/tex/gfx/error/Box.csv +++ b/tex/gfx/error/Box.csv @@ -1,86 +1,86 @@ -0.15 0.0033189 -0.16 0.0031768 -0.17 0.0034876 -0.18 0.003279 -0.19 0.0035742 -0.2 0.0033766 -0.21 0.0036435 -0.22 0.0030874 -0.23 0.0033947 -0.24 0.0020935 -0.25 0.0019648 -0.26 0.0020927 -0.27 0.0020421 -0.28 0.0021843 -0.29 0.0023761 -0.3 0.0021153 -0.31 0.0020616 -0.32 0.0021101 -0.33 0.0020299 -0.34 0.0019515 -0.35 0.0020969 -0.36 0.0020339 -0.37 0.0022978 -0.38 0.0020175 -0.39 0.0020731 -0.4 0.0021381 -0.41 0.002205 -0.42 0.0032218 -0.43 0.0031708 -0.44 0.0031925 -0.45 0.00339 -0.46 0.0033311 -0.47 0.0031871 -0.48 0.0032586 -0.49 0.0031915 -0.5 0.0031306 -0.51 0.003271 -0.52 0.0031646 -0.53 0.0032397 -0.54 0.0043857 -0.55 0.0044305 -0.56 0.0045027 -0.57 0.0044764 -0.58 0.0044453 -0.59 0.004432 -0.6 0.0044327 -0.61 0.0044002 -0.62 0.0045698 -0.63 0.004442 -0.64 0.004459 -0.65 0.0045391 -0.66 0.0044415 -0.67 0.0044821 -0.68 0.0067752 -0.69 0.00683 -0.7 0.0068036 -0.71 0.0067792 -0.72 0.0067628 -0.73 0.006745 -0.74 0.0067275 -0.75 0.0067535 -0.76 0.006759 -0.77 0.0067171 -0.78 0.0067962 -0.79 0.0066775 -0.8 0.0067698 -0.81 0.0068584 -0.82 0.0066662 -0.83 0.0081892 -0.84 0.0082323 -0.85 0.0082112 -0.86 0.0082572 -0.87 0.0082531 -0.88 0.0082088 -0.89 0.0081795 -0.9 0.0081559 -0.91 0.008228 -0.92 0.0083054 -0.93 0.0082122 -0.94 0.008271 -0.95 0.0082487 -0.96 0.0092787 -0.97 0.0092404 -0.98 0.0093242 -0.99 0.0092929 -1 0.0092081 +0.15 0.00059687 +0.16 0.00065779 +0.17 0.0006009 +0.18 0.00069853 +0.19 0.00067814 +0.2 0.00060397 +0.21 0.00060019 +0.22 0.00056329 +0.23 0.00058935 +0.24 0.0005947 +0.25 0.00055101 +0.26 0.0016793 +0.27 0.0016706 +0.28 0.0016165 +0.29 0.0017043 +0.3 0.0017323 +0.31 0.001682 +0.32 0.0016604 +0.33 0.0016488 +0.34 0.0016879 +0.35 0.0016653 +0.36 0.0016577 +0.37 0.0015946 +0.38 0.001717 +0.39 0.001634 +0.4 0.0017368 +0.41 0.001615 +0.42 0.0015334 +0.43 0.0017162 +0.44 0.0016741 +0.45 0.0035742 +0.46 0.0035536 +0.47 0.0035763 +0.48 0.0035662 +0.49 0.0035824 +0.5 0.0036102 +0.51 0.0035645 +0.52 0.0035274 +0.53 0.0035878 +0.54 0.0035357 +0.55 0.0036062 +0.56 0.003644 +0.57 0.0035564 +0.58 0.0054383 +0.59 0.0054059 +0.6 0.0053969 +0.61 0.0054425 +0.62 0.0054856 +0.63 0.0054371 +0.64 0.0054208 +0.65 0.0054052 +0.66 0.0054552 +0.67 0.0054841 +0.68 0.0054098 +0.69 0.0053947 +0.7 0.0054073 +0.71 0.0054836 +0.72 0.005479 +0.73 0.0054226 +0.74 0.0090422 +0.75 0.0090316 +0.76 0.0090807 +0.77 0.0090336 +0.78 0.0090864 +0.79 0.009056 +0.8 0.0090689 +0.81 0.0090651 +0.82 0.0090826 +0.83 0.0090286 +0.84 0.0090444 +0.85 0.0090527 +0.86 0.0090639 +0.87 0.0090555 +0.88 0.0091075 +0.89 0.009093 +0.9 0.011388 +0.91 0.011381 +0.92 0.011396 +0.93 
0.011363 +0.94 0.011375 +0.95 0.011399 +0.96 0.011412 +0.97 0.011366 +0.98 0.011353 +0.99 0.011379 +1 0.011397 diff --git a/tex/gfx/error/ExBox.csv b/tex/gfx/error/ExBox.csv index c2436f5..4995f70 100644 --- a/tex/gfx/error/ExBox.csv +++ b/tex/gfx/error/ExBox.csv @@ -1,86 +1,86 @@ -0.15 0.0025393 -0.16 0.0023461 -0.17 0.0024735 -0.18 0.0023035 -0.19 0.0025283 -0.2 0.0020988 -0.21 0.0023745 -0.22 0.0019721 -0.23 0.0020824 -0.24 0.0019375 -0.25 0.0018002 -0.26 0.0019852 -0.27 0.0019122 -0.28 0.0019868 -0.29 0.0022381 -0.3 0.0019765 -0.31 0.0019362 -0.32 0.0020113 -0.33 0.0019421 -0.34 0.001899 -0.35 0.002081 -0.36 0.0020747 -0.37 0.0023881 -0.38 0.002177 -0.39 0.0023051 -0.4 0.0024355 -0.41 0.002583 -0.42 0.002519 -0.43 0.0025674 -0.44 0.0026817 -0.45 0.0029832 -0.46 0.0030142 -0.47 0.0029741 -0.48 0.003154 -0.49 0.0031947 -0.5 0.0032465 -0.51 0.0035065 -0.52 0.0035215 -0.53 0.0037178 -0.54 0.0037841 -0.55 0.0039603 -0.56 0.0041717 -0.57 0.004275 -0.58 0.0043752 -0.59 0.0044831 -0.6 0.0045918 -0.61 0.0046701 -0.62 0.0049349 -0.63 0.0049283 -0.64 0.0050605 -0.65 0.0052368 -0.66 0.005273 -0.67 0.0054247 -0.68 0.0055228 -0.69 0.0057119 -0.7 0.0058101 -0.71 0.0059023 -0.72 0.0059666 -0.73 0.0060823 -0.74 0.0061708 -0.75 0.0063323 -0.76 0.0064536 -0.77 0.0065243 -0.78 0.0067353 -0.79 0.0067307 -0.8 0.006941 -0.81 0.0071459 -0.82 0.0070831 -0.83 0.0072652 -0.84 0.0074325 -0.85 0.0075025 -0.86 0.0076805 -0.87 0.0077856 -0.88 0.0078533 -0.89 0.0079306 -0.9 0.0080219 -0.91 0.0082087 -0.92 0.0083966 -0.93 0.0084095 -0.94 0.0085726 -0.95 0.0086554 -0.96 0.0087418 -0.97 0.008817 -0.98 0.008999 -0.99 0.0090629 -1 0.0090768 +0.15 0.00058275 +0.16 0.00062877 +0.17 0.00057353 +0.18 0.00068519 +0.19 0.00068682 +0.2 0.00061422 +0.21 0.00063777 +0.22 0.00062259 +0.23 0.00067975 +0.24 0.00069164 +0.25 0.00067199 +0.26 0.00081329 +0.27 0.00084847 +0.28 0.00084521 +0.29 0.00097837 +0.3 0.001063 +0.31 0.0010714 +0.32 0.001116 +0.33 0.001181 +0.34 0.0012786 +0.35 0.001345 +0.36 0.0014194 +0.37 0.0014429 +0.38 0.001663 +0.39 0.0016755 +0.4 0.0018851 +0.41 0.0018659 +0.42 0.0018964 +0.43 0.0022078 +0.44 0.0022825 +0.45 0.0024083 +0.46 0.002501 +0.47 0.0026722 +0.48 0.0028053 +0.49 0.0029689 +0.5 0.0031423 +0.51 0.0032524 +0.52 0.0033713 +0.53 0.0035952 +0.54 0.0037081 +0.55 0.003948 +0.56 0.0041574 +0.57 0.0042471 +0.58 0.0044276 +0.59 0.0045744 +0.6 0.0047528 +0.61 0.0049851 +0.62 0.0052184 +0.63 0.0053606 +0.64 0.0055147 +0.65 0.0056563 +0.66 0.0058633 +0.67 0.0060525 +0.68 0.0061413 +0.69 0.0062951 +0.7 0.0064694 +0.71 0.0067058 +0.72 0.0068741 +0.73 0.0069869 +0.74 0.007156 +0.75 0.0073243 +0.76 0.0075373 +0.77 0.0076646 +0.78 0.0078997 +0.79 0.0080406 +0.8 0.0082263 +0.81 0.0083948 +0.82 0.0085871 +0.83 0.0087002 +0.84 0.0088932 +0.85 0.0090744 +0.86 0.0092566 +0.87 0.0094215 +0.88 0.0096413 +0.89 0.0097972 +0.9 0.0099539 +0.91 0.010114 +0.92 0.010305 +0.93 0.010425 +0.94 0.010599 +0.95 0.010793 +0.96 0.010969 +0.97 0.011077 +0.98 0.011225 +0.99 0.011406 +1 0.011578 diff --git a/tex/gfx/error/KDE.csv b/tex/gfx/error/KDE.csv index f78d3f4..f611822 100644 --- a/tex/gfx/error/KDE.csv +++ b/tex/gfx/error/KDE.csv @@ -1,86 +1,86 @@ -0.15 0.0041324 -0.16 0.0034997 -0.17 0.0034953 -0.18 0.0029233 -0.19 0.0027619 -0.2 0.0024336 -0.21 0.0024278 -0.22 0.0019188 -0.23 0.0019581 -0.24 0.0016537 -0.25 0.0014557 -0.26 0.0015912 -0.27 0.0015055 -0.28 0.0014301 -0.29 0.001647 -0.3 0.0014108 -0.31 0.0013784 -0.32 0.0013799 -0.33 0.0013041 -0.34 0.0012221 -0.35 0.0013579 -0.36 0.0013049 -0.37 0.0015929 -0.38 0.0013657 -0.39 
0.00146 -0.4 0.0015476 -0.41 0.0016478 -0.42 0.0015688 -0.43 0.0016247 -0.44 0.0016942 -0.45 0.0019139 -0.46 0.0019209 -0.47 0.0018917 -0.48 0.0020427 -0.49 0.002047 -0.5 0.002069 -0.51 0.0022854 -0.52 0.0022626 -0.53 0.0024412 -0.54 0.0024725 -0.55 0.0026018 -0.56 0.0027801 -0.57 0.0028716 -0.58 0.002914 -0.59 0.0030222 -0.6 0.0031041 -0.61 0.0031796 -0.62 0.003462 -0.63 0.0034449 -0.64 0.0035697 -0.65 0.0037295 -0.66 0.0037533 -0.67 0.0039028 -0.68 0.003999 -0.69 0.0041675 -0.7 0.0042605 -0.71 0.0043543 -0.72 0.0044005 -0.73 0.0045022 -0.74 0.0045723 -0.75 0.004732 -0.76 0.0048443 -0.77 0.0048918 -0.78 0.0050967 -0.79 0.0050769 -0.8 0.005271 -0.81 0.0054686 -0.82 0.0053674 -0.83 0.0055645 -0.84 0.0057138 -0.85 0.0057533 -0.86 0.0059284 -0.87 0.0060163 -0.88 0.0060672 -0.89 0.0061255 -0.9 0.0062068 -0.91 0.0063892 -0.92 0.0065702 -0.93 0.0065635 -0.94 0.0067229 -0.95 0.0067839 -0.96 0.0068494 -0.97 0.0069324 -0.98 0.0070951 -0.99 0.0071419 -1 0.0071465 +0.15 0.00084575 +0.16 0.0008555 +0.17 0.00070335 +0.18 0.00075477 +0.19 0.00068716 +0.2 0.00057767 +0.21 0.00055525 +0.22 0.00050787 +0.23 0.00053075 +0.24 0.00052867 +0.25 0.00048694 +0.26 0.00059659 +0.27 0.00061149 +0.28 0.00060552 +0.29 0.00070919 +0.3 0.00078033 +0.31 0.00078231 +0.32 0.00081119 +0.33 0.00087494 +0.34 0.00094694 +0.35 0.0010085 +0.36 0.0010783 +0.37 0.0010941 +0.38 0.0012901 +0.39 0.0012999 +0.4 0.0014947 +0.41 0.0014695 +0.42 0.0014919 +0.43 0.0017827 +0.44 0.0018517 +0.45 0.0019647 +0.46 0.0020368 +0.47 0.0021959 +0.48 0.0023185 +0.49 0.0024715 +0.5 0.0026232 +0.51 0.002723 +0.52 0.0028187 +0.53 0.0030264 +0.54 0.0031228 +0.55 0.0033492 +0.56 0.0035401 +0.57 0.0036015 +0.58 0.0037649 +0.59 0.0038893 +0.6 0.0040476 +0.61 0.0042546 +0.62 0.0044674 +0.63 0.0045828 +0.64 0.004733 +0.65 0.0048899 +0.66 0.0051112 +0.67 0.0053131 +0.68 0.0054134 +0.69 0.0055695 +0.7 0.0057556 +0.71 0.0060067 +0.72 0.0061785 +0.73 0.0062981 +0.74 0.0064684 +0.75 0.0066467 +0.76 0.0068563 +0.77 0.006984 +0.78 0.0072279 +0.79 0.0073651 +0.8 0.0075512 +0.81 0.0077164 +0.82 0.0079074 +0.83 0.008007 +0.84 0.0082 +0.85 0.0083749 +0.86 0.0085491 +0.87 0.0087095 +0.88 0.0089233 +0.89 0.0090684 +0.9 0.0092171 +0.91 0.0093685 +0.92 0.0095564 +0.93 0.0096567 +0.94 0.0098157 +0.95 0.010008 +0.96 0.010175 +0.97 0.010262 +0.98 0.010403 +0.99 0.010573 +1 0.010741 diff --git a/tex/gfx/error/R.csv b/tex/gfx/error/R.csv index 4600b8e..1d46fae 100644 --- a/tex/gfx/error/R.csv +++ b/tex/gfx/error/R.csv @@ -1,86 +1,86 @@ -0.15 0.0030048 -0.16 0.0027235 -0.17 0.0028037 -0.18 0.002501 -0.19 0.0026474 -0.2 0.0021599 -0.21 0.0023761 -0.22 0.0019351 -0.23 0.0020188 -0.24 0.0018681 -0.25 0.0017232 -0.26 0.0019055 -0.27 0.0018362 -0.28 0.0019274 -0.29 0.0021803 -0.3 0.0019327 -0.31 0.0018995 -0.32 0.0019861 -0.33 0.0019309 -0.34 0.001895 -0.35 0.0020826 -0.36 0.0020798 -0.37 0.0023937 -0.38 0.0021845 -0.39 0.0023111 -0.4 0.0024397 -0.41 0.0025829 -0.42 0.0025166 -0.43 0.0025595 -0.44 0.0026668 -0.45 0.0029602 -0.46 0.0029813 -0.47 0.0029307 -0.48 0.0030994 -0.49 0.0031245 -0.5 0.0031607 -0.51 0.0034095 -0.52 0.0034081 -0.53 0.0035874 -0.54 0.0036336 -0.55 0.0037945 -0.56 0.0039894 -0.57 0.0040735 -0.58 0.0041469 -0.59 0.0042489 -0.6 0.0043679 -0.61 0.0044458 -0.62 0.0047273 -0.63 0.0047177 -0.64 0.0048532 -0.65 0.0050419 -0.66 0.0050739 -0.67 0.0052268 -0.68 0.0053241 -0.69 0.0055141 -0.7 0.0056137 -0.71 0.005701 -0.72 0.0057504 -0.73 0.005869 -0.74 0.0059459 -0.75 0.0061087 -0.76 0.0062209 -0.77 0.0062782 -0.78 0.0064925 -0.79 0.0064665 -0.8 0.0066716 -0.81 0.0068794 
-0.82 0.0067834 -0.83 0.0069727 -0.84 0.0071346 -0.85 0.0071815 -0.86 0.0073587 -0.87 0.0074514 -0.88 0.0075061 -0.89 0.0075594 -0.9 0.0076439 -0.91 0.0078311 -0.92 0.0080213 -0.93 0.0080097 -0.94 0.0081828 -0.95 0.008234 -0.96 0.0083084 -0.97 0.0083941 -0.98 0.00856 -0.99 0.0086127 -1 0.0086044 +0.15 0.00058513 +0.16 0.00063311 +0.17 0.00056715 +0.18 0.00066154 +0.19 0.0006489 +0.2 0.00057419 +0.21 0.00059376 +0.22 0.00058064 +0.23 0.00064472 +0.24 0.00066439 +0.25 0.00065271 +0.26 0.0008034 +0.27 0.00084639 +0.28 0.00085144 +0.29 0.0009914 +0.3 0.0010808 +0.31 0.0010957 +0.32 0.0011434 +0.33 0.0012088 +0.34 0.0013151 +0.35 0.0013819 +0.36 0.001457 +0.37 0.001481 +0.38 0.0016998 +0.39 0.0017104 +0.4 0.001917 +0.41 0.0018942 +0.42 0.0019193 +0.43 0.0022234 +0.44 0.00229 +0.45 0.0024113 +0.46 0.0024951 +0.47 0.0026559 +0.48 0.0027777 +0.49 0.0029289 +0.5 0.003088 +0.51 0.0031833 +0.52 0.0032852 +0.53 0.0034919 +0.54 0.003586 +0.55 0.0038063 +0.56 0.0039949 +0.57 0.0040647 +0.58 0.0042219 +0.59 0.0043447 +0.6 0.0044993 +0.61 0.0047046 +0.62 0.0049136 +0.63 0.0050255 +0.64 0.005173 +0.65 0.0053243 +0.66 0.0055417 +0.67 0.0057377 +0.68 0.0058326 +0.69 0.0059893 +0.7 0.0061679 +0.71 0.0064099 +0.72 0.0065768 +0.73 0.0066899 +0.74 0.0068573 +0.75 0.0070248 +0.76 0.0072316 +0.77 0.0073546 +0.78 0.0075865 +0.79 0.0077214 +0.8 0.007899 +0.81 0.0080582 +0.82 0.0082415 +0.83 0.0083389 +0.84 0.0085233 +0.85 0.0086915 +0.86 0.0088592 +0.87 0.0090121 +0.88 0.0092176 +0.89 0.0093578 +0.9 0.0095 +0.91 0.0096446 +0.92 0.0098242 +0.93 0.0099207 +0.94 0.010073 +0.95 0.010258 +0.96 0.010418 +0.97 0.0105 +0.98 0.010635 +0.99 0.010799 +1 0.010958 From 885ff2a87e4320a41604b36a733c7d1fa1a5dd43 Mon Sep 17 00:00:00 2001 From: MBulli Date: Wed, 14 Mar 2018 18:04:13 +0100 Subject: [PATCH 10/11] Fixed bandwidth values --- tex/chapters/experiments.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tex/chapters/experiments.tex b/tex/chapters/experiments.tex index 3735d24..a26bc72 100644 --- a/tex/chapters/experiments.tex +++ b/tex/chapters/experiments.tex @@ -31,7 +31,7 @@ A minimum error is obtained with $h=0.35$, for larger values oversmoothing occur Both the BKDE and the ExBoxKDE resemble the error curve of the KDE quite well and stable. They are rather close to each other, with a tendency to diverge for larger $h$. -In contrast, the error curve of the BoxKDE has noticeable jumps at $h=\{0.25, 0.40, 0.67, 0.82\}$. +In contrast, the error curve of the BoxKDE has noticeable jumps at $h=\{0.25, 0.44, 0.57, 0.73, 0.89\}$. These jumps are caused by the rounding of the integer-valued box width given by \eqref{eq:boxidealwidth}. As the extended box filter is able to approximate an exact $\sigma$, such discontinuities do not appear. 
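The rounding effect corrected in this patch can be illustrated with a short sketch. The C++ snippet below assumes the textbook relation K(w^2 - 1)/12 = sigma^2 between the number of box passes K, the odd window width w and the target standard deviation; the paper's own equation for the ideal width is not reproduced here, and K and the grid spacing are likewise assumptions. The printed width changes only at a handful of bandwidths, which is the mechanism behind the jumps in the BoxKDE error curve and which the extended box filter avoids by approximating an exact sigma.

// Illustrative only: why an integer box width produces discrete jumps in the
// error curve. Assumes the textbook relation K*(w^2-1)/12 = sigma^2 for K
// iterated box passes of width w; the paper's ideal-width equation, the pass
// count K and the grid spacing delta are assumptions, not taken from the text.
#include <cmath>
#include <cstdio>

int main() {
    const int    K     = 3;     // assumed number of box passes
    const double delta = 0.2;   // assumed grid spacing
    int lastWidth = -1;
    for (double h = 0.15; h <= 1.0 + 1e-9; h += 0.01) {
        const double sigma  = h / delta;                          // bandwidth in grid cells
        const double wIdeal = std::sqrt(12.0 * sigma * sigma / K + 1.0);
        int w = static_cast<int>(std::lround(wIdeal));
        if (w % 2 == 0) ++w;                                      // keep the window centered
        if (w != lastWidth) {                                     // width changes only at a few h
            std::printf("h = %.2f  ideal width = %5.2f  used width = %d\n", h, wIdeal, w);
            lastWidth = w;
        }
    }
    return 0;
}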
From 19b340e1d4c772205ace4cdb2481f4dbd79d3fb9 Mon Sep 17 00:00:00 2001
From: kazu
Date: Wed, 14 Mar 2018 18:34:56 +0100
Subject: [PATCH 11/11] fixed gfx

---
 tex/gfx/error.eps | 4 ++--
 tex/gfx/error.gp  | 2 +-
 tex/gfx/error.tex | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tex/gfx/error.eps b/tex/gfx/error.eps
index d7cbe86..584ec5d 100644
--- a/tex/gfx/error.eps
+++ b/tex/gfx/error.eps
@@ -1,7 +1,7 @@
 %!PS-Adobe-2.0 EPSF-2.0
 %%Title: error.tex
 %%Creator: gnuplot 5.2 patchlevel 2
-%%CreationDate: Wed Mar 14 07:59:35 2018
+%%CreationDate: Wed Mar 14 18:28:39 2018
 %%DocumentFonts:
 %%BoundingBox: 50 50 302 230
 %%EndComments
@@ -441,7 +441,7 @@ SDict begin [
   /Creator (gnuplot 5.2 patchlevel 2)
%  /Producer (gnuplot)
%  /Keywords ()
-  /CreationDate (Wed Mar 14 07:59:35 2018)
+  /CreationDate (Wed Mar 14 18:28:39 2018)
   /DOCINFO pdfmark
 end
 } ifelse
diff --git a/tex/gfx/error.gp b/tex/gfx/error.gp
index 5307745..1dff7f0 100644
--- a/tex/gfx/error.gp
+++ b/tex/gfx/error.gp
@@ -15,7 +15,7 @@ set rmargin 0.5
 set format x "\\footnotesize{%H}"
 set format y "\\footnotesize{%H}"
 
-set xlabel "\\footnotesize{$h$}"
+set xlabel "\\footnotesize{bandwidth $h$}"
 set ylabel "\\footnotesize{MISE ($\\times 10^{-3}$)}" offset +2.0,0
 
 plot \
diff --git a/tex/gfx/error.tex b/tex/gfx/error.tex
index d4dc855..2b77e82 100644
--- a/tex/gfx/error.tex
+++ b/tex/gfx/error.tex
@@ -119,7 +119,7 @@
 \gplgaddtomacro\gplfronttext{%
   \csname LTb\endcsname%%
   \put(83,2096){\rotatebox{-270}{\makebox(0,0){\strut{}\footnotesize{MISE ($\times 10^{-3}$)}}}}%
-  \put(2770,154){\makebox(0,0){\strut{}\footnotesize{$h$}}}%
+  \put(2770,154){\makebox(0,0){\strut{}\footnotesize{bandwidth $h$}}}%
   \csname LTb\endcsname%%
   \put(1917,2656){\makebox(0,0)[r]{\strut{}\footnotesize{KDE} }}%
   \csname LTb\endcsname%%