Blog

Archive for the ‘NMath Tutorial’ Category

Filtering with Wavelet Transforms

Friday, December 18th, 2015

Discrete time wavelet transforms have found engineering applications in computer vision, pattern recognition, signal filtering and perhaps most widely in signal and image compression. In 2000 the ISO JPEG committee proposed a new JPEG2000 image compression standard that is based on the wavelet transform using two Daubechies wavelets. This standard made the relatively new image decomposition algorithm ubiquitous on desktops around the world.

In signal processing, wavelets have been widely investigated for use in filtering bio-electric signals, among many other applications. Bio-electric signals are good candidates for multi-resolution wavelet filtering over standard Fourier analysis due to their non-stationary character. In this article we’ll discuss the filtering of electrocardiograms or ECGs and demonstrate with code examples how to filter an ECG waveform using NMath‘s new wavelet classes; keeping in mind that the techniques and code shown here apply to a wide class of time series measurements. If wavelets and their applications to filtering are unfamiliar to the reader, read a gentle and brief introduction to the subject in Wavelets for Kids: A Tutorial Introduction [5].

Filtering a Time Series with Wavelets

PhysioNet provides free access to a large collections of recorded physiologic signals, including many ECG’s. The ECG signal we will filter here, named aami-ec13 on PhysioNet, is shown below.

ECG Signal

ECG Signal

Our goal will be to remove the high frequency noise while preserving the character of the wave form, including the high frequency transitions at the signal peaks. Fourier based filter methods are ill suited for filtering this type of signal due to both it’s non-stationarity, as mentioned, but also the need to preserve the peak locations (phase) and shape.

A Wavelet Filter

As with Fourier analysis there are three basic steps to filtering signals using wavelets.

  1. Decompose the signal using the DWT.
  2. Filter the signal in the wavelet space using thresholding.
  3. Invert the filtered signal to reconstruct the original, now filtered signal, using the inverse DWT.

Briefly, the filtering of signals using wavelets is based on the idea that as the DWT decomposes the signal into details and approximation parts, at some scale the details contain mostly the insignificant noise and can be removed or zeroed out using thresholding without affecting the signal. This idea is discussed in more detail in the introductory paper [5]. To implement this DWT filtering scheme there are two basic filter design parameters: the wavelet type and a threshold. Typically the shape and form of the signal to be filtered is qualitatively matched to the general shape of the wavelet. In this example we will use the Daubechies forth order wavelet.

ECG Signal closeup

ECG Waveform

Daubechies 4 wavelet

Daubechies 4 wavelet

The general shape of this wavelet roughly matches, at various scales, the morphology of the ECG signal. Currently NMath supports the following wavelet families: Harr, Daubechies, Symlet, Best Localized, and Coiflet, 27 in all. Additionally, any custom wavelet of your invention can be created by passing in the wavelet’s low & high pass decimation filter values. The wavelet class then imposes the wavelet’s symmetry properties to compute the reconstruction filters.

   // Build a Coiflet wavelet.
   var wavelet = new FloatWavelet( Wavelet.Wavelets.C4 );
 
   // Build a custom reverse bi-orthogonal wavelet.
   var wavelet = new DoubleWavelet( new double[] {0.0, 0.0, 0.7071068, 0.7071068, 0.0, 0.0}, new double[] {0.0883883, 0.0883883, -0.7071068, 0.7071068, -0.0883883, -0.0883883} );

The FloatDWT class provides four different thresholding strategies: Universal, UniversalMAD, Sure, and Hybrid (a.k.a SureShrink). We’ll use the Universal threshold strategy here. This is a good starting point but this strategy can over smooth the signal. Typically some empirical experimentation is done here to find the best threshold for the data (see [1], also see [4] for a good overview of common thresholding strategies.)

Wavelet Filtering Code

The three steps outlined above are easily coded using two classes in the NMath library: the FloatDWT class and the FloatWavelet class. As always in NMath, the library offers both a float precision and a double precision version of each of these classes. Let’s look at a code snippet that implements a DWT based filter with NMath.

   // Choose wavelet, the Daubechies 4 wavelet
   var wavelet = new FloatWavelet( Wavelet.Wavelets.D4 );
 
   // Build DWT object using our wavelet & data
   var dwt = new FloatDWT( data, wavelet );
 
   // Decompose signal with DWT to level 5
   dwt.Decompose( 5 );
 
   // Find Universal threshold & threshold all detail levels
   double lambdaU = dwt.ComputeThreshold( FloatDWT.ThresholdMethod.Universal, 1 );
   dwt.ThresholdAllLevels( FloatDWT.ThresholdPolicy.Soft, new double[] { lambdaU, 
       lambdaU, lambdaU, lambdaU, lambdaU } );
 
   // Rebuild the filtered signal.
   float[] reconstructedData = dwt.Reconstruct();

The first two lines of code build the wavelet object and the DWT object using both the input data signal and the abbreviated Daubechies wavelet name Wavelet.Wavelets.D4. The third line of code executes the wavelet decomposition at five consecutive scales. Both the signal’s details and approximations are stored in the DWT object at each step in the decomposition. Next, the Universal threshold is computed and the wavelet details are thresholded using the same threshold with a Soft policy (see [1], pg. 63). Lastly, the now filtered signal is reconstructed.

Below, the chart on the left shows the unfiltered ECG signal and the chart on the right shows the wavelet filtered ECG signal. It’s clear that this filter very effectively removed the noise while preserving the signal.

Raw ECG Signal

Raw ECG Signal

Filtered ECG Signal

Filtered ECG Signal

These two charts below show a detail from the chart above from indicies 500 to 1000. Note how well the signal shape, phase, and amplitude has been preserved in this non-stationary wavelet-filtered signal.

Detail Raw ECG Signal

Detail Raw ECG Signal

Detail Filtered ECG Signal

Detail Filtered ECG Signal

It is this ability to preserve phase, form, and amplitude in DWT based filters all while having a O(n log n) runtime that Fourier-based filters enjoy that has made wavelets such an important part of signal processing today. The complete code for this example along with a link to the ECG data is provided below.

Paul

References

[1] Guomin Luo and Daming Zhang (2012). Wavelet Denoising, Advances in Wavelet Theory and Their Applications in Engineering, Physics and Technology, Dr. Dumitru Baleanu (Ed.), ISBN: 978-953-51-0494-0, InTech, pp. 59-80. Available from: http://www.intechopen.com/books/advances-in-wavelet-theory-and-their-applicationsin-engineering-physics-and-technology/wavelet-denoising
[2] Burhan Ergen (2012). Signal and Image Denoising Using Wavelet Transform, Advances in Wavelet Theory and Their Applications in Engineering, Physics and Technology, Dr. Dumitru Baleanu (Ed.), ISBN: 978-953-51-0494-0, InTech, DOI: 10.5772/36434. Available from: http://www.intechopen.com/books/advances-in-wavelet-theory-and-their-applications-in-engineering-physics-and-technology/wavelet-signal-and-image-denoising
[3] Rami Cohen: Signal Denoising Using Wavelets, Project Report, 2012. Available from: http://tx.technion.ac.il/~rc/SignalDenoisingUsingWavelets_RamiCohen.pdf
[4] M. C. E. Rosas-Orea, M. Hernandez-Diaz, V. Alarcon-Aquino, and L. G. Guerrero-Ojeda, A Comparative Simulation Study of Wavelet Based Denoising Algorithms, Proceedings of the 15th International Conference on Electronics, Communications and Computers (CONIELECOMP 2005), 2005 © IEEE
[5] Brani Vidakovic and Peter Mueller, Wavelets for Kids: A Tutorial Introduction, Duke University, 1991. Available from: http://gtwavelet.bme.gatech.edu/wp/kidsA.pdf

Test Data

To copy the data file provided by PhysioNet for this example click: ECG_AAMIEC13.data
This ECG data was taken from the ANSI EC13 test data set waveforms.

Complete Test Code

 
    public void BlogECGExample()
    {
      // Define your own dataDir
      var dataDir = "................";
 
      // Load ECG wave from physionet.org data file.
      string filename = Path.Combine( dataDir, "ECG_AAMIEC13.data.txt" );
      string line;
      int cnt = 0;
      FloatVector ecgMeasurement = new FloatVector( 3000 );
      var fileStrm = new System.IO.StreamReader( filename );
      fileStrm.ReadLine(); fileStrm.ReadLine();
      while ( ( line = fileStrm.ReadLine() ) != null && cnt < 3000 )
      {
        ecgMeasurement[cnt] = Single.Parse( line.Split( ',' )[1] );
        cnt++;
      }
 
      // Choose wavelet
      var wavelet = new FloatWavelet( Wavelet.Wavelets.D4 );
 
      // Build DWT object
      var dwt = new FloatDWT( ecgMeasurement.DataBlock.Data, wavelet );
 
      // Decompose signal with DWT to level 5
      dwt.Decompose( 5 );
 
      // Find Universal threshold & threshold all detail levels with lambdaU
      double lambdaU = dwt.ComputeThreshold( FloatDWT.ThresholdMethod.Universal, 1 );
      dwt.ThresholdAllLevels( FloatDWT.ThresholdPolicy.Soft, new double[] { lambdaU, lambdaU, lambdaU, lambdaU, lambdaU } );
 
      // Rebuild the signal to level 1 - the original (filtered) signal.
      float[] reconstructedData = dwt.Reconstruct();
 
      // Display DWT results.
      BlogECGExampleBuildCharts( dwt, ecgMeasurement, reconstructedData );
 
    }
 
    public void BlogECGExampleBuildCharts( FloatDWT dwt, FloatVector ECGMeasurement, float[] ReconstructedData )
    {
 
      // Plot out approximations at various levels of decomposition.
      var approxAllLevels = new FloatVector();
      for ( int n = 5; n > 0; n-- )
      {
        var approx = new FloatVector( dwt.WaveletCoefficients( DiscreteWaveletTransform.WaveletCoefficientType.Approximation, n ) );
        approxAllLevels.Append( new FloatVector( approx ) );
      }
 
      var detailsAllLevels = new FloatVector();
      for ( int n = 5; n > 0; n-- )
      {
        var approx = new FloatVector( dwt.WaveletCoefficients( DiscreteWaveletTransform.WaveletCoefficientType.Details, n ) );
        detailsAllLevels.Append( new FloatVector( approx ) );
      }
 
      // Create and display charts.
      Chart chart0 = NMathChart.ToChart( detailsAllLevels );
      chart0.Titles.Add( "Concatenated DWT Details to Level 5" );
      chart0.ChartAreas[0].AxisY.Title = "DWT Details";
      chart0.Height = 270;
      NMathChart.Show( chart0 );
 
      Chart chart1 = NMathChart.ToChart( approxAllLevels );
      chart1.Titles.Add("Concatenated DWT Approximations to Level 5");
      chart1.ChartAreas[0].AxisY.Title = "DWT Approximations";
      chart1.Height = 270;
      NMathChart.Show( chart1 );
 
      Chart chart2 = NMathChart.ToChart( (new FloatVector( ReconstructedData ))[new Slice(500,500)] );
      chart2.Titles[0].Text = "Thresholded & Reconstructed ECG Signal";
      chart2.ChartAreas[0].AxisY.Title = "mV";
      chart2.Height= 270;
      NMathChart.Show( chart2 );
 
      Chart chart3 = NMathChart.ToChart( (new FloatVector( ECGMeasurement ))[new Slice(500,500)] );
      chart3.Titles[0].Text = "Raw ECG Signal";
      chart3.ChartAreas[0].AxisY.Title = "mV";
      chart3.Height = 270;
      NMathChart.Show( chart3 );
 
    }

NMath Tutorial videos

Thursday, October 31st, 2013

We are proud to announce a series of NMath tutorial videos on how to use CenterSpace’s math library products. We are starting, naturally, with Getting Started with NMath.

You can download it here: MP4

Please let us know which topics you want us to cover. Email support@centerspace.net

Cheers,
Trevor

Distribution Fitting Demo

Monday, April 9th, 2012

A customer recently asked how to fit a normal (Gaussian) distribution to a vector of experimental data. Here’s a demonstration of how to do it.

Let’s start by creating a data set: 100 values drawn from a normal distribution with known parameters (mean = 0.5, variance = 2.0).

int n = 100;
double mean = .5;
double variance = 2.0;
var data = new DoubleVector( n, new RandGenNormal( mean, variance ) );

Now, compute y values based on the empirical cumulative distribution function (CDF), which returns the probability that a random variable X will have a value less than or equal to x–that is, f(x) = P(X <= x). Here’s an easy way to do, although not necessarily the most efficient for larger data sets:

var cdfY = new DoubleVector( data.Length );
var sorted = NMathFunctions.Sort( data );
for ( int i = 0; i < data.Length; i++ )
{
  int j = 0;
  while ( j < sorted.Length && sorted[j] <= data[i] ) j++;
  cdfY[i] = j / (double)data.Length;
}

The data is sorted, then for each value x in the data, we iterate through the sorted vector looking for the first value that is greater than x.

We’ll use one of NMath’s non-linear least squares minimization routines to fit a normal distribution CDF() function to our empirical CDF. NMath provides classes for fitting generalized one variable functions to a set of points. In the space of the function parameters, beginning at a specified starting point, these classes finds a minimum (possibly local) in the sum of the squared residuals with respect to a set of data points.

A one variable function takes a single double x, and returns a double y:

y = f(x)

A generalized one variable function additionally takes a set of parameters, p, which may appear in the function expression in arbitrary ways:

y = f(p1, p2,..., pn; x)

For example, this code computes y=a*sin(b*x + c):

public double MyGeneralizedFunction( DoubleVector p, double x )
{
  return p[0] * Math.Sin( p[1] * x + p[2] );
}

In the distribution fitting example, we want to define a parameterized function delegate that returns CDF(x) for the distribution described by the given parameters (mean, variance):

Func<DoubleVector, double, double> f =
  ( DoubleVector p, double x ) =>
    new NormalDistribution( p[0], p[1] ).CDF( x );

Now that we have our data and the function we want to fit, we can apply the curve fitting routine. We’ll use a bounded function fitter, because the variance of the fitted normal distribution must be constrained to be greater than 0.

var fitter = new BoundedOneVariableFunctionFitter<TrustRegionMinimizer>( f );
var start = new DoubleVector( new double[] { 0.1, 0.1 } );
var lowerBounds = new DoubleVector( new double[] { Double.MinValue, 0 } );
var upperBounds = 
   new DoubleVector( new double[] { Double.MaxValue, Double.MaxValue } );
var solution = fitter.Fit( data, cdfY, start, lowerBounds, upperBounds );
var fit = new NormalDistribution( solution[0], solution[1] );
 
Console.WriteLine( "Fitted distribution:\nmean={0}\nvariance={1}",
  fit.Mean, fit.Variance );

The output for one run is

Fitted distribution: 
mean=0.567334190790594
variance=2.0361207956132

which is a reasonable approximation to the original distribution (given 100 points).

We can also visually inspect the fit by plotting the original data and the CDF() function of the fitted distribution.

ToChart( data, cdfY, SeriesChartType.Point, fit,
  NMathStatsChart.DistributionFunction.CDF );
 
private static void ToChart( DoubleVector x, DoubleVector y,
  SeriesChartType dataChartType, NormalDistribution dist,
  NMathStatsChart.DistributionFunction distFunction )
{
  var chart = NMathStatsChart.ToChart( dist, distFunction );
  chart.Series[0].Name = "Fit";
 
  var series = new Series() {
    Name = "Data",
    ChartType = dataChartType
  };
  series.Points.DataBindXY( x, y );
  chart.Series.Insert( 0, series );
 
  chart.Legends.Add( new Legend() );
  NMathChart.Show( chart );
}

CDF() of fitted distribution

We can also look at the probability density function (PDF) of the fitted distribution, but to do so we must first construct an empirical PDF using a histogram. The x-values are the midpoints of the histogram bins, and the y-values are the histogram counts converted to probabilities, scaled to integrate to 1.

int numBins = 10;
var hist = new Histogram( numBins, data );
 
var pdfX = new DoubleVector( hist.NumBins );
var pdfY = new DoubleVector( hist.NumBins );
for ( int i = 0; i < hist.NumBins; i++ )
{
  // use bin midpoint for x value
  Interval bin = hist.Bins[i];
  pdfX[i] = ( bin.Min + bin.Max ) / 2;
 
   // convert histogram count to probability for y value
   double binWidth = bin.Max - bin.Min;
   pdfY[i] = hist.Count( i ) / ( data.Length * binWidth );
}
 
ToChart( pdfX, pdfY, SeriesChartType.Column, fit,
  NMathStatsChart.DistributionFunction.PDF );

PDF() of fitted distribution

You might be tempted to try to fit a distribution PDF() function directly to the histogram data, rather than using the CDF() function like we did above, but this is problematic for several reasons. The bin counts have different variability than the original data. They also have a fixed sum, so they are not independent measurements. Also, for continuous data, fitting a model based on aggregated histogram counts, rather than the original data, throws away information.

Ken

Clearing a vector

Wednesday, November 9th, 2011

A customer recently asked us for the best method to zero out a vector. We decided to run some tests to find out. Here are the five methods we tried followed by performance timing and any drawbacks.

The following tests were performed on a DoubleVector of length 10,000,000.

1) Create a new vector. This isn’t really clearing out an existing vector but we thought we should include it for completeness.

1
 DoubleVector v2 = new DoubleVector( v.Length, 0.0 );

The big drawback here is that you’re creating new memory. Time: 419.5ms

2) Probably the first thing to come to mind is to simply iterate through the vector and set everything to zero.

1
2
3
4
for ( int i = 0; i < v.Length; i++ )
{
  v[i] = 0.0;
}

We have to do some checking in the index operator. No new memory is created. Time: 578.5ms

3) In some cases, you could iterate through the underlying array of data inside the DoubleVector.

1
2
3
4
for ( int i = 0; i &lt; v.DataBlock.Data.Length; i++ )
{
  v.DataBlock.Data[i] = 0.0;
}

This is a little less intuitive. And, very importantly, it will not work with many views into other data structures. For example, a row slice of a matrix. However, it’s easier for the CLR to optimize this loop. Time: 173.5ms

4) We can use the power of Intel’s MKL to multiply the vector by zero.

1
 v.Scale( 0.0 );

Scale() does this in-place. No new memory is created. In this example, we assume that MKL has already been loaded and is ready to go which is true if another MKL-based NMath call was already made or if NMath was initialized. This method will work on all views of other data structures. Time: 170ms

5) This surprised us a bit but the best method we could find was to clear out the underlying array using Array.Clear() in .NET

1
 Array.Clear( v.DataBlock.Data, 0, v.DataBlock.Data.Length );

This creates no new memory. However, this will not work with non-contiguous views. However, this method is very fast. Time: 85.8ms

To make efficiently clearing a vector simpler for NMath users we have created a Clear() method and a Clear( Slice ) method on the vector and matrix classes. It will do the right thing in the right circumstance. It will be released in NMath 5.2 in 2012.

Test Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
using System;
using CenterSpace.NMath.Core;
 
namespace Test
{
  class ClearVector
  {
    static int size = 100000000;
    static int runs = 10;
    static int methods = 5;
 
    static void Main( string[] args )
    {
      System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
      DoubleMatrix times = new DoubleMatrix( runs, methods );
      NMathKernel.Init();
 
      for ( int run = 0; run < runs; run++ )
      {
        Console.WriteLine( "Run {0}...", run );
        DoubleVector v = null;
 
        // Create a new one
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Start();
        DoubleVector v2 = new DoubleVector( v.Length, 0.0 );
        sw.Stop();
        times[run, 0] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v2 ) );
 
        // iterate through vector
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();
        sw.Start();
        for ( int i = 0; i < v.Length; i++ )
        {
          v[i] = 0.0;
        }
        sw.Stop();
        times[run, 1] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v ) );
 
        // iterate through array
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();
        sw.Start();
        for ( int i = 0; i < v.DataBlock.Data.Length; i++ )
        {
          v.DataBlock.Data[i] = 0.0;
        }
        sw.Stop();
        times[run, 2] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v ) );
 
        // scale
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();
        sw.Start();
        v.Scale( 0.0 );
        sw.Stop();
        times[run, 3] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v ) );
 
        // Array Clear
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();
        sw.Start();
        Array.Clear( v.DataBlock.Data, 0, v.DataBlock.Data.Length );
        sw.Stop();
        times[run, 4] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v ) );
        Console.WriteLine( times.Row( run ) );
      }
      Console.WriteLine( "Means: " + NMathFunctions.Mean( times ) );
    }
 
    private static bool Assert( DoubleVector v )
    {
      if ( v.Length != size )
      {
        return false;
      }
      for ( int i = 0; i < v.Length; ++i )
      {
        if ( v[i] != 0.0 )
        {
          return false;
        }
      }
      return true;
    }
  }
}

– Trevor

Initializing NMath

Wednesday, November 9th, 2011

NMath uses Intel’s Math Kernel Library (MKL) internally. This code contains native, optimized code to wring out the best performance possible.

There is a one-time delay when the appropriate x86 or x64 native code is loaded. This cost can be easily controlled by the developer by using the NMathKernel.Init() method. Please see Initializing NMath for more details.

– Trevor