Convolution, Correlation, and the FFT

Most scientists and programmers understand the basic implementation details of their chosen math library. However, when algorithms are ported from one library to another, problems are hard to avoid. This seems to be particularly so when dealing with convolutions, correlations and the FFT – fundamental building blocks in many areas of computation. Frequently the theoretical concepts are clear, but when the bits hit the silicon, the confusion (at least for myself) starts.

To start eliminating some of this confusion, it’s important to understand two fundamental relationships between these three transforms.

$\mathcal{F}\{f*g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}$

This is known as the Convolution Theorem, where the italic F represents the Fourier transform, and the splat, convolution. This basic equality, along with the FFT, is used to compute large convolutions efficiently. The correlation operator has a similar analogous theorem, and this is where some of the problems start.

$\mathcal{F}\{f\star g\}=(\mathcal{F}\{f\})^* \cdot \mathcal{F}\{g\}$

The star is the correlation operator. Note that the first Fourier transform is conjugated, and that this breaks some basic symmetries in correlation that are found in convolution.

Convolution in R

If you happen to be porting code from R, the R language (v. 2.10.0) is distributed with a convolve function which actually, by default, returns the correlation. This is unfortunate and is a source of confusion for anybody porting a prototype from R to CenterSpace’s NMath, or to other math library for that matter.

Briefly, in R

kernel <- c(1, 2, 3, 1, 0, 0)
data <- c(1, 2, 3, 4, 5, 6)
convolve(kernel, data)

will result in:

[18 17 22 33 32 25]

Now this is a strange result, on a couple of fronts. First, this isn't the convolution, and second, it isn't the correlation either! Supposing that we want to compute the correlation between this kernel and signal, we would be expecting precisely,

[0 0 1 5 11 18 25 32 32 17 6 ].

Conflating correlation with convolution in one function is certain to cause confusion, because among many other reasons, convolution is communitive, and correlation is not. In general, for correlation,

$f\star g \ne g \star f$

yet for convolution,

$f * g = g * f$

If g or h satisfy certain symmetry properties, correlation can gain back the communitive property.

Back to our correlation example above, if we exchange the arguments kernel and data in R's convolve() function we get a different answer.

convolve(data,kernel)

gives us,

[18 25 32 33 22 17 ].

There is part of a correlation swimming around in that vector, but the last three numbers given by R are not part of a linear correlation. Many users naturally take those 6 numbers incorrectly as the linear correlation (or worse convolution) of the kernel and data . This brings us to our next topic.

Fast Convolution

The fast Fourier transform is used to compute the convolution or correlation for performance reasons. This FFT based algorithm is often referred to as 'fast convolution', and is given by,

$\small f*g = \mathcal{F}^{-1} \{\mathcal{F}\{f\} \cdot \mathcal{F}\{g\}\}.$

In the discrete case, when the two sequences are the same length, N , the FFT based method requires O(N log N) time, where a direct summation would require O(N*N) time.

This asymptotic runtime performance makes the FFT method the defacto standard for computing convolution. However, this is unfortunate because if the kernel is much smaller than the data, the direct summation is actually faster than using the FFT. This is not a rare special case, and is actually very common in signal filtering, wavelet transforms, and image processing applications. This also brings to light that many libraries (including R) require both inputs to be zero padded to the same length (typically a power of 2) - immediately eliminating this optimization and always forcing the use of the FFT technique.

Returning to our example above. If we remove the unnecessary padding from the kernel, and recompute the correlation, we arrive at,

$\small \texttt{[1 2 3 1]} \star \texttt{[1 2 3 4 5 6] = [1 5 11 18 25 32 32 17 6 ]}$

Now since both the correlation (and the convolution) spread the signal data by kernel.Length() - 1 = 3 elements, most (engineering) users are interested in the correlation exclusively where the kernel fully overlaps the signal data. This windowing would then give us,

$\small \texttt{[18 25 32]}$

which are the first three numbers provided by R's convolve function. The latter three numbers are the results of a circular , not a linear, correlation. This is probably not the result most engineers are looking for unless they are filtering a periodic signal or an image wrapped on a cylinder. Circular correlation wraps the data end-to-end in a continuous loop when summing, by effectively joining the first and last elements of the data array.

The circular correlation for this running example would look like the following table.

[1 2 3 4 5 6]
[1 2 3 1 - -] = 18
[- 1 2 3 1 -] = 25
[- - 1 2 3 1] = 32
[1 - - 1 2 3] = 33 (circular)
[3 1 - - 1 2] = 22 (circular)
[2 3 1 - - 1] = 17 (circular)

The top array is the data, and the arrays below represent the kernel sweeping across the data step by step.

The difference between the circular and linear correlation is restricted to the edges of the correlation where the (unpadded) kernel does not fully overlap the data. The circular and linear correlations are identical in the areas where the kernel fully overlaps the data - which in many applications is the area of interest.

Convolution & Correlation Classes in the NMath library

CenterSpace's convolution and correlation classes rigorously and efficiently compute their respective transformation correctly, regardless of the computational technique used. This means that zero padding by the application programmer is no longer necessary, and in fact is discouraged. As is reflexively using the 'fast convolution' technique when direct summation is actually faster.

When a NMath convolution or correlation class is constructed, it estimates the number of MFlops needed by all competing techniques and chooses the fastest computational method. Zero padding will introduce errors into this MFlops estimation process.

Classes

The CenterSpace NMath library offers the following eight classes.

{Double | Float}1DConvolution
{DoubleComplex | FloatComplex}1DConvolution
{Double | Float}1DCorrelation
{DoubleComplex | FloatComplex}1DCorrelation

The two sets of correlation and convolution classes have completely symmetric interfaces.

Code Examples
If you are currently porting code from a system that uses the FFT 'fast correlation' technique, I will now outline how you would port that code to NMath.

Porting our running R-example from above to NMath, and assuming that what you need is linear correlation, the NMath code would look like:

DoubleVector kernel = new DoubleVector(1, 2, 3, 1);
DoubleVector data = new DoubleVector(1, 2, 3, 4, 5, 6);
      
Double1DCorrelation corr = new Double1DCorrelation(kernel, data.Length);
DoubleVector correlation = corr.Correlate(data);
DoubleVector corr_full_kernel_overlap = 
  corr.TrimConvolution(correlation, CorrelationBase.Windowing.FullKernelOverlap);
      
DoubleVector corr_centered = 
  corr.TrimConvolution(correlation, CorrelationBase.Windowing.CenterWindow);

// correlation =            [1 5 11 18 25 32 32 17 6]
// corr_centered =              [11 18 25 32 32 17]
// corr_full_kernel_overlap =      [18 25 32]

Note that the windowing method, TrimConvolution() does not copy any data. It just creates a windowed view (reference) into the underlying convolution vector. Windowing of native arrays are not supported because a copy would be required.

The CenterSpace NMath libraries currently do not support circular convolution, so if that is required due to the circular symmetry / periodicity of the data, the circular convolution or correlation must be computed using our FFT classes directly.

// Compute circular correlation via FFT's.
// Zero-padding is required here.
// Typically pad to the nearest power of 2.
double[] nhkernel = { 1, 2, 3, 1, 0, 0};      
double[] data = { 1, 2, 3, 4, 5, 6 };
      
// Build FFT classes
 // and setup the correct scaling.
 fft = new DoubleComplexForward1DFFT(nhkernel.Length);
 ifft = new DoubleComplexBackward1DFFT(nhkernel.Length);
 ifft.SetScaleFactorByLength();

// Build the complex vectors of the real data
DoubleComplexVector kernelz = 
new DoubleComplexVector(new DoubleVector(nhkernel), new DoubleVector(nhkernel.Length));

DoubleComplexVector dataz = 
  new DoubleComplexVector(new DoubleVector(data), new DoubleVector(nhkernel.Length));

// Compute.  The next five lines 
// implement the fast correlation algorithm.
fft.FFTInPlace(kernelz);
fft.FFTInPlace(dataz);
dataz = NMathFunctions.Conj(dataz)
DoubleComplexVector prodz = kernelz * dataz;
ifft.FFTInPlace(prod);
r = new DoubleVector(NMathFunctions.Real(prod));
// r = [18 17 22 33 32 25]

-Paul

See our FFT landing page for complete documentation and code examples.

Convolution in R

Fast Convolution

Convolution & Correlation Classes in the NMath library

Classes

Leave a Reply