Convolution in CenterSpace’s NMath 4.0

Convolution is a fundamental operation in data smoothing and filtering, and is used in many other applications ranging from discrete wavelet transforms to LTI system theory. NMath provides a high-performance, forward-scaling set of convolution classes that support both complex and real data. These classes scale in performance with the number of processing cores, so no code rewrites are needed to take advantage of multi-core hardware upgrades.

The following four convolution classes are available in the NMath 4.0 library.

  • {Double | DoubleComplex}1DConvolution
  • {Float | FloatComplex}1DConvolution

Additionally, a symmetric set of correlation classes is available; a short usage sketch follows the list below.

  • {Double | DoubleComplex}1DCorrelation
  • {Float | FloatComplex}1DCorrelation
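
Assuming the correlation classes mirror the convolution API shown in the example further below, a cross-correlation sketch might look like the following. Note that the Correlate method name is an assumption here and should be checked against the NMath class reference.

// Sketch only: assumes Double1DCorrelation mirrors the convolution API
// and exposes a Correlate() method; check the NMath class reference.
RandomNumberGenerator rng = new RandGenMTwist(777);
DoubleVector signal = new DoubleVector(500, rng);
DoubleVector template = new DoubleVector("[ 1 2 3 2 1 ]");

Double1DCorrelation corr =
    new Double1DCorrelation(template, signal.Length);
DoubleVector correlated = corr.Correlate(signal);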

Example

Computing a convolution is as simple as defining the convolution kernel, creating the right class object for the data type, and running the convolution.

// Create some random signal data using the 
// Mersenne Twister random number generator.
RandomNumberGenerator rand = new RandGenMTwist(4230987);
DoubleVector data = new DoubleVector(500, rand);
      
// Create a simple averaging kernel.
DoubleVector kernel =  new DoubleVector("[ .25 .25 .25 .25 ]");

// Create the real number domain convolution class.
Double1DConvolution conv = 
    new Double1DConvolution(kernel, data.Length);

// Compute the convolution.
DoubleVector smoothed_data = conv.Convolve(data); 

Optimal performance for all convolution problems

Exploiting the fundamental duality between convolution and the Fourier transform, together with the O(n ln n) FFT algorithm, the convolution of two sequences g and h can be computed in O(n ln n) time:

    g ⊛ h = IFFT( FFT(g) · FFT(h) )

where the product on the right-hand side is taken element-wise.
For convolutions on very large data sets this is clearly the most time-efficient algorithm, even though two forward and one backward FFT are required. However, for shorter data sequences it is slower in practice than directly evaluating the convolution sum, particularly on modern multi-core processors with large on-chip caches. Direct summation is also often faster when the kernel is much shorter than the data, which is frequently the case in signal processing applications.
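
For reference, here is a minimal sketch of what direct summation means, in plain C# and independent of NMath's internals; it simply evaluates the convolution sum term by term:

// Direct evaluation of the full convolution sum: roughly n * k
// multiply-adds for data length n and kernel length k.
// Illustration only; NMath's classes handle this internally.
static double[] DirectConvolve(double[] data, double[] kernel)
{
    double[] result = new double[data.Length + kernel.Length - 1];
    for (int i = 0; i < data.Length; i++)
    {
        for (int j = 0; j < kernel.Length; j++)
        {
            result[i + j] += data[i] * kernel[j];
        }
    }
    return result;
}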

The decision about which technique to use for the problem at hand is made automatically when the class is constructed, so the user always gets the best available convolution performance without having to worry about which technique to pick.
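
In other words, the calling code looks the same for small and large problems; which algorithm runs underneath is NMath's choice. A usage sketch, reusing the types from the example above (the problem sizes and the technique guesses in the comments are assumptions, not documented thresholds):

// The technique is selected internally at construction time; the calling
// code is identical either way.
DoubleVector longSignal = new DoubleVector(100000, rand);
DoubleVector longKernel = new DoubleVector(2048, rand);

// Short kernel over modest data: plausibly served by direct summation.
Double1DConvolution smallProblem =
    new Double1DConvolution(kernel, data.Length);

// Long kernel over a large signal: plausibly served by the FFT route.
Double1DConvolution largeProblem =
    new Double1DConvolution(longKernel, longSignal.Length);

DoubleVector smoothedSmall = smallProblem.Convolve(data);
DoubleVector smoothedLarge = largeProblem.Convolve(longSignal);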


-Paul
