Box & boxKDE algos + notation fixes
@@ -1,22 +1,60 @@
\section{Usage}
The objective of our method is to allow a reliable recovery of the most probable state from a time-sequential Monte Carlo sensor fusion system.
Assuming a sample-based representation, our method estimates the density of the unknown state-space distribution within a narrow time frame.
Such systems are often used to obtain an estimate of the most probable state in near real time.
As the density estimation is only a single step in the whole process, its computation needs to be as fast as possible.
%The objective of our method is to allow a reliable recovery of the most probable state from a time-sequential Monte Carlo sensor fusion system.
%Assuming a sample-based representation, our method estimates the density of the unknown state-space distribution within a narrow time frame.
%Such systems are often used to obtain an estimate of the most probable state in near real time.
%As the density estimation is only a single step in the whole process, its computation needs to be as fast as possible.
% not taking too much time from the frame

%Consider a set of two-dimensional samples, presumably generated from e.g. a particle filter system.
\begin{algorithm}[ht]
\caption{Bivariate \textsc{boxKDE}}
\label{alg:boxKDE}
\begin{algorithmic}[1]
\Statex \textbf{Input:} Samples $\bm{X}_1, \dots, \bm{X}_N$ and weights $w_1, \dots, w_N$
\Statex \textbf{Output:} Approximate density estimate $\hat{f}$ on $G_1 \times G_2$
\Statex

\For{$i=1 \textbf{ to } N$} \Comment{Data binning}
  \State Find the $4$ nearest grid points to $\bm{X}_i$
  \State Compute bin count $C_{i,j}$ as recommended by \cite{wand1994fast}
\EndFor

\Statex

\State $\tilde{\bm{h}} := \bm{\delta}^{-1} \bm{h}$ \Comment{Scaled bandwidth}
\State $\bm{L} := \floor{\sqrt{12\tilde{\bm{h}}^2n^{-1}+\bm{1}}}$ \Comment{\eqref{eq:boxidealwidth}}
% \State $l := \floor{(L-1)/2}$

\Statex

%\For{$1 \textbf{ to } n$}
\State $\hat{f} \gets C$ \Comment{Start from the binned data}
\Loop{ $n$ \textbf{times}} \Comment{$n$ box filter iterations}

  \For{$ i=1 \textbf{ to } G_1$}
    \State Compute $\hat{f}_{i,1:G_2} \gets B_{L_2} * \hat{f}_{i,1:G_2}$ \Comment{Alg. \ref{alg:naiveboxalgo}}
  \EndFor

  \For{$ j=1 \textbf{ to } G_2$}
    \State Compute $\hat{f}_{1:G_1,j} \gets B_{L_1} * \hat{f}_{1:G_1,j}$ \Comment{Alg. \ref{alg:naiveboxalgo}}
  \EndFor

\EndLoop
\end{algorithmic}
\end{algorithm}

Consider a set of two-dimensional samples with associated weights, e.g. generated by a particle filter.
The overall process for bivariate data is described in Algorithm~\ref{alg:boxKDE}.

Assuming that the given $N$ samples are stored in a sequential list, the first step is to create a grid representation.
In order to construct the grid efficiently and to allocate the required memory, the extrema of the samples need to be known in advance.
These limits might be given by the application; for example, the position of a pedestrian within a building is limited by the physical dimensions of the building.
Such knowledge should be integrated into the system to avoid a linear search over the sample set, which reduces the computation time.

The second parameter to be defined by the application is the size of the grid, which can be set directly or defined in terms of bin sizes.
Given the extreme values of the samples and the grid sizes $G_1$ and $G_2$ defined by the user, a $G_1\times G_2$ grid can be constructed using a binning rule from \eqref{eq:simpleBinning} or \eqref{eq:linearBinning}.
As the number of grid points directly affects both computation time and accuracy, a suitable grid should be as coarse as possible to keep the computation fast, yet fine enough to keep the approximation error acceptable.

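As a minimal illustration of the linear binning idea, assume equally spaced grid points $g_{k,1} < \dots < g_{k,G_k}$ with spacing $\delta_k$ in dimension $k$; this notation is introduced here for illustration only and is not necessarily that of \eqref{eq:linearBinning}.
A sample coordinate $x$ with $g_{k,j} \le x < g_{k,j+1}$ distributes its mass to the two neighbouring grid points in proportion to its proximity:
\begin{equation*}
	c_{j} = \frac{g_{k,j+1} - x}{\delta_k}, \qquad
	c_{j+1} = \frac{x - g_{k,j}}{\delta_k}, \qquad
	c_{j} + c_{j+1} = 1 .
\end{equation*}
In the bivariate case, the contribution to each of the four surrounding grid points is the product of the two univariate fractions, scaled by the sample weight $w_i$ if the samples are weighted.
For example, with the assumed values $\delta_k = 0.5$, $g_{k,j} = 2.0$ and $x = 2.1$, the fractions are $c_{j} = 0.8$ and $c_{j+1} = 0.2$.
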
If the extreme values are known in advance, the computation of the grid is $\landau{N}$, where $N$ is the number of samples; otherwise an additional $\landau{N}$ search is required.
The grid is stored as a linear array in memory, thus its space complexity is $\landau{G}$ with $G = G_1\cdot G_2$ grid points.

Next, the binned data is filtered with a Gaussian using the box filter approximation.
The box filter width is derived from the standard deviation of the approximated Gaussian, which is in turn equal to the bandwidth of the KDE.
@@ -28,7 +66,7 @@ For this reason, $h$ needs to be divided by the bin size to account the discrepa
Given the scaled bandwidth, the required box filter width can be computed. % as in \eqref{label}
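As a worked example of the bandwidth scaling and the width computation, following the formulas in Algorithm~\ref{alg:boxKDE}, assume purely illustrative values: a bandwidth $h = 2$, a bin size $\delta = 0.5$ and $n = 3$ box filter iterations (none of these values are taken from an experiment).
\begin{equation*}
	\tilde{h} = \frac{h}{\delta} = 4, \qquad
	L = \floor{\sqrt{\frac{12\,\tilde{h}^2}{n} + 1}} = \floor{\sqrt{65}} = 8 .
\end{equation*}
Since a discrete box filter of width $L$ has variance $(L^2-1)/12$, $n$ passes yield a standard deviation of $\sqrt{n(L^2-1)/12} = \sqrt{15.75} \approx 3.97$ grid cells, which is slightly below the target $\tilde{h} = 4$ because of the rounding.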
The recursive box filter implementation is used because it offers the best runtime performance.
If multivariate data is processed, the algorithm is easily extended due to its separability.
Each filter pass is computed in $\landau{G}$ operations; however, an additional memory buffer is required.

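To illustrate why a pass costs $\landau{G}$ regardless of the filter width, consider a one-dimensional sketch of the recursive (running sum) formulation.
It assumes an odd width $L = 2l+1$ and treats values outside the grid as zero; the symbols $x$, $y$ and $l$ are introduced here for illustration and may differ from the actual implementation.
\begin{equation*}
	y_i = \frac{1}{L}\sum_{k=-l}^{l} x_{i+k}, \qquad
	y_{i} = y_{i-1} + \frac{x_{i+l} - x_{i-l-1}}{L} .
\end{equation*}
Each output value thus needs one addition, one subtraction and one scaling, independent of $L$.
Because the update reads the input value $x_{i-l-1}$, which lies behind the current position, the result cannot simply overwrite the input; this is one reason why a separate output buffer is needed.
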
While the integer-sized box filter requires the fewest operations, it causes a larger approximation error due to rounding.
Depending on the required accuracy, the extended box filter algorithm can further improve the estimation results with only a small additional overhead.
@@ -40,4 +78,3 @@ Finally, the most likely state can be obtained from the filtered data, i.e. from

Would it make sense to present the above in algorithmic form somehow, i.e. as pseudocode? After all, somewhere we have to state "THIS IS OUR APPROACH"}.
