A customer recently asked us for the best method to zero out a vector. We decided to run some tests to find out. Here are the five methods we tried, each followed by its timing and any drawbacks.
The following tests were performed on a DoubleVector of length 10,000,000.
1) Create a new vector. This isn’t really clearing out an existing vector but we thought we should include it for completeness.
DoubleVector v2 = new DoubleVector( v.Length, 0.0 );
The big drawback here is that you’re allocating new memory rather than reusing the existing vector. Time: 419.5ms
2) Probably the first thing to come to mind is to simply iterate through the vector and set everything to zero.
for ( int i = 0; i < v.Length; i++ )
{
    v[i] = 0.0;
}
The index operator has to do some checking on every access, which costs time. No new memory is created. Time: 578.5ms
3) In some cases, you could iterate through the underlying array of data inside the DoubleVector.
for ( int i = 0; i < v.DataBlock.Data.Length; i++ )
{
    v.DataBlock.Data[i] = 0.0;
}
This is a little less intuitive and, very importantly, it will not work with many views into other data structures, such as a row slice of a matrix, since the underlying array may hold elements that do not belong to the view. However, it's easier for the CLR to optimize this loop. Time: 173.5ms
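To illustrate why looping over the raw array is unsafe for views, here is a minimal sketch that does not use NMath at all. StridedView is a hypothetical stand-in we made up for this example, and the column-by-column storage layout is an assumption chosen only for illustration.

```csharp
using System;

// Hypothetical stand-in for a strided vector view; this is not an NMath type.
class StridedView
{
    public readonly double[] Data;  // backing storage shared with the owner
    public readonly int Offset;     // index in Data of the view's first element
    public readonly int Stride;     // distance in Data between view elements
    public readonly int Length;     // number of elements in the view

    public StridedView( double[] data, int offset, int stride, int length )
    {
        Data = data; Offset = offset; Stride = stride; Length = length;
    }

    public double this[int i]
    {
        get { return Data[Offset + i * Stride]; }
        set { Data[Offset + i * Stride] = value; }
    }
}

static class ViewPitfallDemo
{
    // A 2x3 matrix stored column by column; row 0 lives at indices 0, 2, 4.
    static StridedView MakeRowZero()
    {
        double[] storage = { 1, 2, 3, 4, 5, 6 };
        return new StridedView( storage, 0, 2, 3 );
    }

    // Zeroes row 0 through the view's indexer: only the row is touched.
    public static double[] ZeroRowThroughView()
    {
        StridedView row = MakeRowZero();
        for ( int i = 0; i < row.Length; i++ ) row[i] = 0.0;
        return row.Data;
    }

    // Zeroes the whole backing array: row 1 is wiped out as well.
    public static double[] ZeroRowThroughRawArray()
    {
        StridedView row = MakeRowZero();
        for ( int i = 0; i < row.Data.Length; i++ ) row.Data[i] = 0.0;
        return row.Data;
    }

    static void Main()
    {
        Console.WriteLine( string.Join( " ", ZeroRowThroughView() ) );     // 0 2 0 4 0 6
        Console.WriteLine( string.Join( " ", ZeroRowThroughRawArray() ) ); // 0 0 0 0 0 0
    }
}
```

The second loop clears data that belongs to the rest of the matrix, which is the failure mode to watch for whenever a vector might be a view.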
4) We can use the power of Intel's MKL to multiply the vector by zero.
v.Scale( 0.0 );
Scale() works in place, so no new memory is created. In this example, we assume that MKL has already been loaded and is ready to go, which is true if another MKL-based NMath call was already made or if NMath was initialized. This method works on all views of other data structures. Time: 170ms
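One thing to keep in mind with any multiply-by-zero approach is that, in IEEE floating-point arithmetic, NaN and infinity multiplied by zero give NaN rather than zero. The sketch below uses a plain managed loop as a stand-in (ScaleInPlace is a hypothetical helper, not the MKL routine). We have not checked whether Scale() itself special-cases a zero factor, so treat this as a general caution rather than a statement about NMath.

```csharp
using System;

static class ScaleByZeroDemo
{
    // Plain managed loop standing in for a scale operation; not the MKL routine.
    public static void ScaleInPlace( double[] data, double factor )
    {
        for ( int i = 0; i < data.Length; i++ )
        {
            data[i] *= factor;
        }
    }

    static void Main()
    {
        double[] data = { 1.5, double.NaN, double.PositiveInfinity };
        ScaleInPlace( data, 0.0 );
        Console.WriteLine( data[0] == 0.0 );          // True
        Console.WriteLine( double.IsNaN( data[1] ) ); // True
        Console.WriteLine( double.IsNaN( data[2] ) ); // True
    }
}
```

If the vector may contain NaN or infinite values, assigning zero directly avoids the question entirely.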
5) This surprised us a bit, but the best method we could find was to clear out the underlying array using Array.Clear() in .NET:
Array.Clear( v.DataBlock.Data, 0, v.DataBlock.Data.Length );
This creates no new memory and is very fast, but it will not work with non-contiguous views. Time: 85.8ms
To make efficiently clearing a vector simpler for NMath users, we have created a Clear() method and a Clear( Slice ) method on the vector and matrix classes. Each will do the right thing in the right circumstance. They will be released in NMath 5.2 in 2012.
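As a rough idea of what "the right thing in the right circumstance" could mean, here is a minimal sketch that picks Array.Clear() for contiguous data and falls back to a strided loop otherwise. This is not the NMath implementation. The ClearSketch class and its offset, stride, and length parameters are hypothetical names we chose for illustration.

```csharp
using System;

static class ClearSketch
{
    // Zeroes 'length' elements of 'data', starting at 'offset' and
    // stepping by 'stride'. Hypothetical helper, not part of NMath.
    public static void Clear( double[] data, int offset, int stride, int length )
    {
        if ( stride == 1 )
        {
            // Contiguous: one bulk clear over exactly the viewed range.
            Array.Clear( data, offset, length );
        }
        else
        {
            // Non-contiguous: touch only the elements that belong to the view.
            for ( int i = 0; i < length; i++ )
            {
                data[offset + i * stride] = 0.0;
            }
        }
    }

    static void Main()
    {
        double[] a = { 1, 2, 3, 4, 5, 6 };
        Clear( a, 1, 1, 3 );  // contiguous range: indices 1, 2, 3
        Console.WriteLine( string.Join( " ", a ) );  // 1 0 0 0 5 6

        double[] b = { 1, 2, 3, 4, 5, 6 };
        Clear( b, 0, 2, 3 );  // strided range: indices 0, 2, 4
        Console.WriteLine( string.Join( " ", b ) );  // 0 2 0 4 0 6
    }
}
```

The point of the dispatch is that callers get the fast path when the data allows it without having to know whether they are holding a view.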
Test Code
using System;
using CenterSpace.NMath.Core;

namespace Test
{
    class ClearVector
    {
        static int size = 100000000;
        static int runs = 10;
        static int methods = 5;

        static void Main( string[] args )
        {
            System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
            // One row of timings (in ms) per run, one column per method.
            DoubleMatrix times = new DoubleMatrix( runs, methods );
            NMathKernel.Init();
            for ( int run = 0; run < runs; run++ )
            {
                Console.WriteLine( "Run {0}...", run );
                DoubleVector v = null;

                // Create a new one
                v = new DoubleVector( size, 1.0, 2.0 );
                // Reset first: otherwise every run after the first would add
                // the previous run's last timing to this measurement.
                sw.Reset();
                sw.Start();
                DoubleVector v2 = new DoubleVector( v.Length, 0.0 );
                sw.Stop();
                times[run, 0] = sw.ElapsedMilliseconds;
                Console.WriteLine( Assert( v2 ) );

                // iterate through vector
                v = new DoubleVector( size, 1.0, 2.0 );
                sw.Reset();
                sw.Start();
                for ( int i = 0; i < v.Length; i++ )
                {
                    v[i] = 0.0;
                }
                sw.Stop();
                times[run, 1] = sw.ElapsedMilliseconds;
                Console.WriteLine( Assert( v ) );

                // iterate through array
                v = new DoubleVector( size, 1.0, 2.0 );
                sw.Reset();
                sw.Start();
                for ( int i = 0; i < v.DataBlock.Data.Length; i++ )
                {
                    v.DataBlock.Data[i] = 0.0;
                }
                sw.Stop();
                times[run, 2] = sw.ElapsedMilliseconds;
                Console.WriteLine( Assert( v ) );

                // scale
                v = new DoubleVector( size, 1.0, 2.0 );
                sw.Reset();
                sw.Start();
                v.Scale( 0.0 );
                sw.Stop();
                times[run, 3] = sw.ElapsedMilliseconds;
                Console.WriteLine( Assert( v ) );

                // Array Clear
                v = new DoubleVector( size, 1.0, 2.0 );
                sw.Reset();
                sw.Start();
                Array.Clear( v.DataBlock.Data, 0, v.DataBlock.Data.Length );
                sw.Stop();
                times[run, 4] = sw.ElapsedMilliseconds;
                Console.WriteLine( Assert( v ) );

                Console.WriteLine( times.Row( run ) );
            }
            Console.WriteLine( "Means: " + NMathFunctions.Mean( times ) );
        }

        // Returns true only if v has the expected length and every element is zero.
        private static bool Assert( DoubleVector v )
        {
            if ( v.Length != size )
            {
                return false;
            }
            for ( int i = 0; i < v.Length; ++i )
            {
                if ( v[i] != 0.0 )
                {
                    return false;
                }
            }
            return true;
        }
    }
}
- Trevor