Blog

Archive for the ‘.NET’ Category

Clearing a vector

Wednesday, November 9th, 2011

A customer recently asked us for the best method to zero out a vector. We decided to run some tests to find out. Here are the five methods we tried, with any drawbacks and the timings.

These tests were performed on a DoubleVector, v, of length 10,000,000.

1) Create a new vector. This isn’t really clearing out an existing vector but we thought we should include it for completeness.

 DoubleVector v2 = new DoubleVector( v.Length, 0.0 );

The big drawback here is that you’re creating new memory. Time: 419.5ms

2) Probably the first thing to come to mind is to simply iterate through the vector and set everything to zero.

for ( int i = 0; i < v.Length; i++ )
{
  v[i] = 0.0;
}

The index operator has to do some checking on every access, which adds overhead. No new memory is created. Time: 578.5ms

3) In some cases, you could iterate through the underlying array of data inside the DoubleVector.

for ( int i = 0; i < v.DataBlock.Data.Length; i++ )
{
  v.DataBlock.Data[i] = 0.0;
}

This is a little less intuitive. More importantly, it will not work with many views into other data structures, such as a row slice of a matrix. However, it’s easier for the CLR to optimize this loop. Time: 173.5ms
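To see why touching the backing array directly is unsafe for views, here is a minimal sketch that uses plain arrays rather than NMath types. The `StridedView` struct and the column-major layout are hypothetical and only mimic the idea of a row slice sharing storage with a matrix.

```csharp
using System;

// Hypothetical stand-in for a vector view: it shares storage with a parent
// array and visits every Stride-th element starting at Start.
struct StridedView
{
    public double[] Data;
    public int Start, Stride, Length;

    public double this[int i]
    {
        get { return Data[Start + i * Stride]; }
        set { Data[Start + i * Stride] = value; }
    }
}

static class ViewDemo
{
    static void Main()
    {
        // A 3 x 3 matrix stored in column-major order (an assumption), filled with ones.
        double[] storage = new double[9];
        for (int i = 0; i < storage.Length; i++) storage[i] = 1.0;

        // View of the first row: elements 0, 3 and 6 of the backing array.
        StridedView row = new StridedView { Data = storage, Start = 0, Stride = 3, Length = 3 };

        // Clearing through the view's indexer touches only the three row elements.
        for (int i = 0; i < row.Length; i++) row[i] = 0.0;
        Console.WriteLine(string.Join(" ", storage)); // 0 1 1 0 1 1 0 1 1

        // Clearing the whole backing array wipes the entire matrix, not just the row.
        Array.Clear(row.Data, 0, row.Data.Length);
        Console.WriteLine(string.Join(" ", storage)); // 0 0 0 0 0 0 0 0 0
    }
}
```

The second clear is the failure mode described above: the view covers three elements, but the backing array holds nine.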

4) We can use the power of Intel’s MKL to multiply the vector by zero.

 v.Scale( 0.0 );

Scale() does this in place, so no new memory is created. In this example, we assume that MKL has already been loaded and is ready to go, which is true if another MKL-based NMath call was already made or if NMath was initialized. This method works on all views of other data structures. Time: 170ms

5) This surprised us a bit, but the best method we could find was to clear out the underlying array using Array.Clear() in .NET.

 Array.Clear( v.DataBlock.Data, 0, v.DataBlock.Data.Length );

This creates no new memory and is very fast. However, it will not work with non-contiguous views. Time: 85.8ms
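For readers without NMath at hand, the loop-versus-Array.Clear comparison can be approximated on a plain double[] array. This is only a rough sketch: the array length is an assumption, each method is timed once, and the numbers will not match the NMath timings quoted here.

```csharp
using System;
using System.Diagnostics;

static class ClearTiming
{
    static void Main()
    {
        const int size = 10000000; // assumed length; adjust to taste
        double[] data = new double[size];

        // Fill with non-zero values so the clear has real work to do.
        for (int i = 0; i < data.Length; i++) data[i] = 1.0;

        // Method A: element-by-element loop over the array.
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < data.Length; i++) data[i] = 0.0;
        sw.Stop();
        Console.WriteLine("Loop:        {0} ms", sw.Elapsed.TotalMilliseconds);

        // Refill before the second measurement.
        for (int i = 0; i < data.Length; i++) data[i] = 1.0;

        // Method B: Array.Clear over the whole array.
        sw.Reset();
        sw.Start();
        Array.Clear(data, 0, data.Length);
        sw.Stop();
        Console.WriteLine("Array.Clear: {0} ms", sw.Elapsed.TotalMilliseconds);
    }
}
```

A single pass like this is sensitive to JIT warm-up and memory paging, so repeating each measurement several times and averaging gives steadier numbers.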

To make this much simpler, we have created a Clear() method and a Clear( Slice ) method on vectors and matrices. They will do the right thing in the right circumstance, and will be released in NMath 5.2 in 2012.

And, here is the code we used for this test:

using System;
using CenterSpace.NMath.Core;

namespace Test
{
  class ClearVector
  {
    static int size = 100000000;
    static int runs = 10;
    static int methods = 5;

    static void Main( string[] args )
    {
      System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
      DoubleMatrix times = new DoubleMatrix( runs, methods );
      NMathKernel.Init();

      for ( int run = 0; run < runs; run++ )
      {
        Console.WriteLine( "Run {0}...", run );
        DoubleVector v = null;

        // Create a new one
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();  // discard any time accumulated during the previous run
        sw.Start();
        DoubleVector v2 = new DoubleVector( v.Length, 0.0 );
        sw.Stop();
        times[run, 0] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v2 ) );

        // iterate through vector
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();
        sw.Start();
        for ( int i = 0; i < v.Length; i++ )
        {
          v[i] = 0.0;
        }
        sw.Stop();
        times[run, 1] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v ) );

        // iterate through array
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();
        sw.Start();
        for ( int i = 0; i < v.DataBlock.Data.Length; i++ )
        {
          v.DataBlock.Data[i] = 0.0;
        }
        sw.Stop();
        times[run, 2] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v ) );

        // scale
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();
        sw.Start();
        v.Scale( 0.0 );
        sw.Stop();
        times[run, 3] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v ) );

        // Array Clear
        v = new DoubleVector( size, 1.0, 2.0 );
        sw.Reset();
        sw.Start();
        Array.Clear( v.DataBlock.Data, 0, v.DataBlock.Data.Length );
        sw.Stop();
        times[run, 4] = sw.ElapsedMilliseconds;
        Console.WriteLine( Assert( v ) );
        Console.WriteLine( times.Row( run ) );
      }
      Console.WriteLine( "Means: " + NMathFunctions.Mean( times ) );
    }

    private static bool Assert( DoubleVector v )
    {
      if ( v.Length != size )
      {
        return false;
      }
      for ( int i = 0; i < v.Length; ++i )
      {
        if ( v[i] != 0.0 )
        {
          return false;
        }
      }
      return true;
    }
  }
}

- Trevor


FFT Performance Benchmarks in .NET

Wednesday, January 5th, 2011

We’ve had a number of inquiries about the CenterSpace FFT benchmarks, so I thought I would code up a few tests and run them on my machine. I’ve included our FFT performance numbers and the code that generated those numbers so you can try them on your machine. (If you don’t have NMath, you’ll need to download the eval version.) I also did a comparison of one-dimensional real DFTs with FFTW, one of the fastest desktop FFT implementations available.

Benchmarks

These benchmarks were run on a 2.80 GHz Intel Core i7 CPU with 4 GB of memory installed.

The clock resolution is 0.003 ns
1024 point, forward, real FFT required 4361.364 ns, Mflops 4069
1000 point, forward, real FFT required 5338.785 ns, Mflops 3235
4096 point, forward, real FFT required 21708.565 ns, Mflops 3924
4095 point, forward, real FFT required 43012.010 ns, Mflops 1980
1024 * 1024 point, forward, real FFT required 15.635 ms, Mflops 2324

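I’m estimating the megaflop performance during the FFT using:

$$\text{Mflops} = \frac{2.5\, N \log(N)}{t}$$

where N is the FFT length and t is the time for one FFT in microseconds; this is the expression evaluated in the benchmark code below. The numerator is the asymptotic number of floating point operations for the radix-2 Cooley-Tukey FFT algorithm. This FFT MFlop estimate is used in a number of FFT benchmark reports and serves as a good basis for comparing algorithm efficiency.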
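As a quick arithmetic check of the estimate, the sketch below recomputes the Mflops figure for the 1024-point transform from its reported time. It uses only System.Math; the helper name `Mflops` is made up for illustration. The listed value of 4069 is reproduced when log is the natural logarithm. A base-2 logarithm, the usual convention for this estimate in FFT benchmark reports, would give a larger figure for the same timing.

```csharp
using System;

static class MflopsCheck
{
    // Estimated Mflops for a real FFT of length n taking `nanoseconds` per transform.
    // logN is supplied by the caller so that both log bases can be compared.
    static double Mflops(int n, double logN, double nanoseconds)
    {
        double microseconds = nanoseconds / 1000.0;
        return 2.5 * n * logN / microseconds;
    }

    static void Main()
    {
        int n = 1024;
        double ns = 4361.364; // time reported above for the 1024-point FFT

        Console.WriteLine("{0:0}", Mflops(n, Math.Log(n), ns));      // natural log: 4069
        Console.WriteLine("{0:0}", Mflops(n, Math.Log(n, 2.0), ns)); // base-2 log:  5870
    }
}
```

Either convention is fine for comparing runs against each other, as long as the same one is used when comparing against figures published elsewhere.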

As expected, we take a performance hit for non-power-of-2 lengths, but due to various optimizations for processing prime-length FFT kernels (3, 5, 7 & 11), the performance hit is minimal in many cases. The 1000-point FFT has prime factors (2)(2)(2)(5)(5)(5), and the 4095-point FFT has prime factors (3)(3)(5)(7)(13), so those larger prime factors in the 4095-point FFT cost us some performance. Typically, users zero-pad their data vectors to a power-of-two length to get optimal performance.
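Zero-padding to a power-of-two length can be sketched with plain arrays as follows. The helper names are hypothetical and no NMath types are involved; the padded array would then be handed to whatever FFT routine is in use.

```csharp
using System;

static class ZeroPad
{
    // Smallest power of two that is >= n. Assumes 1 <= n <= 2^30,
    // since the shift would overflow an int beyond that.
    static int NextPowerOfTwo(int n)
    {
        int p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    // Copies the signal into a longer array. New .NET arrays are
    // zero-initialized, so the tail is already padded with zeros.
    static double[] PadToPowerOfTwo(double[] signal)
    {
        double[] padded = new double[NextPowerOfTwo(signal.Length)];
        Array.Copy(signal, padded, signal.Length);
        return padded;
    }

    static void Main()
    {
        double[] signal = new double[4095];
        double[] padded = PadToPowerOfTwo(signal);
        Console.WriteLine(padded.Length); // 4096
    }
}
```

Keep in mind that padding changes the transform length, so the frequency bins of the padded FFT are spaced more finely than those of the original-length FFT.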

Side by side comparison with FFTW

FFTW claims to be the “Fastest Fourier Transform in the West”, and is a clever, high-performance implementation of the discrete Fourier transform. It is shipped with all copies of MATLAB. FFTW is implemented in C and has a reputation as one of the fastest desktop FFT implementations.

Both the NMath FFT and FFTW have a pre-computation setup that establishes the best algorithmic approach for the DFT at hand before computing any FFTs. This pre-computational phase is not included in the times below. In the case of the NMath FFT classes, this phase is done in the class constructor; therefore, users should avoid constructing NMath FFT classes in tight loops for best performance (as shown in the benchmark code below). Below is a small side-by-side comparison between FFTW and NMath’s FFT (using the numbers from above).

Comparison of a forward, real, out-of-place FFT.

FFT length    FFTW        NMATH FFT
1024          4.14 μs     4.36 μs
1000          5.98 μs     5.33 μs
4096          20.31 μs    21.71 μs
4095          49.90 μs    43.01 μs
1024^2        17.16 ms    15.63 ms

Clearly NMATH is very competitive with, and at times outperforms, FFTW for real FFTs of both power-of-2 and other signal lengths. I chose 1D real signals as a test case because this is one of the most frequent use cases of our NMATH FFT library.

On a subjective scale, running a 1024-point FFT on a commodity desktop machine at around 4 GFlops (algorithm normalized) is amazing. That means that in a real-time measurement situation, users can compute 1024-point FFTs at a rate of around 220 kHz, all with just a couple of lines of code.

Happy Computing,
Paul

Benchmark Code

    public void BenchMarks()
    {
      Double numberTrials = 10000;
      Double flops;
 
      Stopwatch timer = new System.Diagnostics.Stopwatch();
      // Stopwatch.Frequency is in ticks per second, so one tick lasts 1e9 / Frequency nanoseconds.
      Console.WriteLine( String.Format("The clock resolution is {0:0.000} ns", 1000000000.0 / Stopwatch.Frequency ) );
 
      // Snip one - power of two
      RandGenUniform rand = new RandGenUniform();
      DoubleForward1DFFT fft = new DoubleForward1DFFT( 1024 );
      DoubleVector realsignal = new DoubleVector( 1024, rand );
 
      DoubleVector result = new DoubleVector( 1024 * 1024 );
 
      timer.Reset();
      for( int i = 0; i < numberTrials; i++ )
      {
        timer.Start();
        fft.FFT( realsignal, ref result );
        timer.Stop();
      }
      flops = (2.5 * 1024 * NMathFunctions.Log(1024)) / (((timer.ElapsedTicks / numberTrials) / Stopwatch.Frequency) * 1000000.0 );
      Console.WriteLine( String.Format( "1024 point, forward, real FFT required {0:0.000} ns, Mflops {1:0}", ( ( timer.ElapsedTicks / numberTrials ) / Stopwatch.Frequency ) * 1000000000.0, flops ) );
 
      // length 1000
      fft = new DoubleForward1DFFT( 1000 );
      realsignal = new DoubleVector( 1000, rand );
 
      timer.Reset();
      for( int i = 0; i < numberTrials; i++ )
      {
        timer.Start();
        fft.FFT( realsignal, ref result );
        timer.Stop();
      }
      flops = ( 2.5 * 1000 * NMathFunctions.Log( 1000 ) ) / ( ( ( timer.ElapsedTicks / numberTrials ) / Stopwatch.Frequency ) * 1000000.0 );
      Console.WriteLine( String.Format( "1000 point, forward, real FFT required {0:0.000} ns, Mflops {1:0}", ( ( timer.ElapsedTicks / numberTrials ) / Stopwatch.Frequency ) * 1000000000.0, flops ) );
 
      // length 4096
      fft = new DoubleForward1DFFT( 4096 );
      realsignal = new DoubleVector( 4096, rand );
 
      timer.Reset();
      for( int i = 0; i < numberTrials; i++ )
      {
        timer.Start();
        fft.FFT( realsignal, ref result );
        timer.Stop();
      }
      flops = ( 2.5 * 4096 * NMathFunctions.Log( 4096 ) ) / ( ( ( timer.ElapsedTicks / numberTrials ) / Stopwatch.Frequency ) * 1000000.0 );
      Console.WriteLine( String.Format( "4096 point, forward, real FFT required {0:0.000} ns, Mflops {1:0}", ( ( timer.ElapsedTicks / numberTrials ) / Stopwatch.Frequency ) * 1000000000.0, flops ) );
 
      // length 4095
      fft = new DoubleForward1DFFT( 4095 );
      realsignal = new DoubleVector( 4095, rand );
 
      timer.Reset();
      for( int i = 0; i < numberTrials; i++ )
      {
        timer.Start();
        fft.FFT( realsignal, ref result );
        timer.Stop();
      }
      flops = ( 2.5 * 4095 * NMathFunctions.Log( 4095 ) ) / ( ( ( timer.ElapsedTicks / numberTrials ) / Stopwatch.Frequency ) * 1000000.0 );
      Console.WriteLine( String.Format( "4095 point, forward, real FFT required {0:0.000} ns, Mflops {1:0}", ( ( timer.ElapsedTicks / numberTrials ) / Stopwatch.Frequency ) * 1000000000.0, flops ) );
 
 
      // length 1M
      fft = new DoubleForward1DFFT( 1024 * 1024 );
      realsignal = new DoubleVector( 1024 * 1024, rand );
 
      timer.Reset();
      for( int i = 0; i < 100; i++ )
      {
        timer.Start();
        fft.FFT( realsignal, ref result );
        timer.Stop();
      }
      flops = ( 2.5 * 1024 * 1024 * NMathFunctions.Log( 1024 * 1024 ) ) / ( ( ( timer.ElapsedTicks / 100.0 ) / Stopwatch.Frequency ) * 1000000.0 );
      Console.WriteLine( String.Format( "Million point (1024 * 1024), forward, real point FFT required {0:0.000} ms, Mflops {1:0}", ( ( timer.ElapsedTicks / 100.0 ) / Stopwatch.Frequency ) * 1000.0, flops ) );
 
    }

ClickOnce Deployment

Tuesday, August 17th, 2010

In general, when you deploy an NMath application, you want to ensure that all NMath DLLs in the installation Assemblies directory are installed either in the GAC or next to the application. (You can read an overview of NMath deployment in Section 1.3 of the NMath User’s Guide.)

ClickOnce automagically tries to determine the appropriate dependencies to deploy. However, since the NMath kernel and native DLLs are not required at compile-time, but are required at run-time, the ClickOnce mechanism fails. The best solution we’ve found is to:

  1. Add NMathKernelx86.dll, NMathKernelx64.dll, nmath_native_x86.dll, nmath_native_x64.dll as files to the project.
  2. Set the Build Action to Content.
  3. Set Copy to Output Directory to Copy Always.
  4. Publish your project, then go to the publish directory\Application Files to verify that these four DLLs have been included.
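In the project file, steps 1 to 3 typically end up as entries like the following. This is a sketch that assumes the four DLLs were added at the project root; if they sit in a subfolder, the Include paths would differ.

```xml
<ItemGroup>
  <!-- Build Action = Content, Copy to Output Directory = Copy always -->
  <Content Include="NMathKernelx86.dll">
    <CopyToOutputDirectory>Always</CopyToOutputDirectory>
  </Content>
  <Content Include="NMathKernelx64.dll">
    <CopyToOutputDirectory>Always</CopyToOutputDirectory>
  </Content>
  <Content Include="nmath_native_x86.dll">
    <CopyToOutputDirectory>Always</CopyToOutputDirectory>
  </Content>
  <Content Include="nmath_native_x64.dll">
    <CopyToOutputDirectory>Always</CopyToOutputDirectory>
  </Content>
</ItemGroup>
```

Checking the .csproj for these entries is a quick way to confirm the settings before publishing.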

- Trevor


Nevron Partnership

Wednesday, March 31st, 2010

All of us at CenterSpace Software are proud to announce a partnership with Nevron, leaders in data visualization. Nevron builds .NET components which enable developers to quickly build clean, enterprise-quality user interfaces incorporating well-rendered maps, gauges, diagrams, or charts. Nevron has built its visualization framework with the developer in mind and has created components that are powerful, yet quick to pick up and work with.

Free Project Examples

As part of this partnership, our software engineers are working together to develop a series of example applications to help our customers quickly become productive using our tool set. Our first example demonstrates how to use the Nevron .NET chart component with some of NMath Stat’s fundamental capabilities: regression and prediction, polynomial curve fitting, and data smoothing. To build and run this example, you’ll need to download this VS8 project, along with evaluation copies of the CenterSpace NMath Suite and the Nevron .NET Chart package. All of this is freely available from our partnership page.

As I mentioned, this is the first in a series of examples that we will be jointly developing. Our next example application is in the planning stages, and will cover the building of common charts in statistical process control. Statistical process control is employed across many industries ranging from manufacturing to medicine, and has been an area of interest to customers of both companies. We’ll post a blog article about the example once it’s finished, and you’ll see a tweet about it if you are following us.

If you have an idea for an example using Nevron’s and CenterSpace’s tools together, which would help speed along your own project, leave a comment or drop us an email. We just may cook it up!

Discount Bundle

As part of our partnership with Nevron, we are able to offer a substantial discount on package licenses. If you are interested, please email our sales staff and request the Nevron partnership discount.

Thanks for reading,

-The CenterSpace Team


Using Excel with NMath

Monday, February 15th, 2010

We’ve had several customers ask about porting their Excel model to a .NET language in order to leverage the performance and functionality of NMath or NMath Stats. NMath does have good crossover functionality with Excel, making this porting job easier. It is also possible to accelerate your Excel models by calling the NMath .NET assemblies directly from a VBA macro in Excel. This post provides some guidance for porting all or just a portion of your Excel model to C# and NMath.

Excel is designed to interoperate with external assemblies from VBA using COM. Type libraries built in .NET are not directly COM compatible; however, all .NET class libraries, including NMath and NMath Stats, can be made to present a COM interface, making Excel interop possible. This interop is achieved by building the COM type library directly from the assembly (no recompiling needed) using a tool shipped with the .NET framework, and then adding this type library as a reference to an Excel sheet’s VBA macro. Because of the many differences between the C# language and VBA, only a small portion of NMath will be accessible from Excel using this procedure; however, a simple remedy will be outlined below that can expand the available functionality to all of NMath.

  1. To build the type library interface to the NMath.dll use the regasm.exe utility shipped with the .NET framework
     >regasm.exe NMath.dll /tlb:NMathCom.tlb

    The COM compatible type library now resides in the NMathCom.tlb file. You will see some warning messages regarding incompatibilities between COM and NMath.

  2. Open a spreadsheet, right click on a sheet tab and choose View Code to open the VBA development environment.
  3. In the Tools menu select References… and browse to the location of the new NMath type library.

The COM-compatible portions of NMath are now available for use from VBA. At this point we can code up a simple example for generating random numbers in VBA to test out the NMath interoperability.

Private Sub TestNMath()
  Dim rand As New NMath.RandGenLogNormal
  rand.Mean = 50
  rand.Variance = 10
 
  Dim i As Integer
  For i = 1 To 10
    Cells(i, 1) = rand.Next()
  Next
End Sub

This simple VBA script populates cells A1:A10 with log-normally distributed random numbers with a mean of 50 and a variance of 10. This is not so easily achieved within Excel natively.
(more…)
