NMath Stats User's Guide


3.1 Column Types

Most functions in class StatsFunctions require numeric data, although they accept any instance of IDFColumn. If a column is not an instance of DFIntColumn or DFNumericColumn, an attempt is made to convert the data to double using System.Convert.ToDouble().

NOTE—An NMathFormatException is raised if the data cannot be converted to double.

For instance, these functions will work with a DFStringColumn containing numbers represented as strings.

Code Example – C#

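// A string column whose values are all parseable as numbers.
DFStringColumn col =
  new DFStringColumn( "Col1", "1.5", "2", "1.33", "4.76" );
// Each value is converted to double before the mean is computed.
double mean = StatsFunctions.Mean( col );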
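The conversion rule described above can be tried in isolation, without NMath. The following is a minimal sketch that uses only System.Convert.ToDouble() from the base class library. The class ConversionSketch and the helper TryToDouble are hypothetical names introduced for illustration and are not part of NMath. The use of CultureInfo.InvariantCulture is also an assumption made to keep the sketch deterministic; the guide does not say which culture NMath applies.

```csharp
using System;
using System.Globalization;

public static class ConversionSketch
{
    // Hypothetical helper (not an NMath API): attempts the same kind of
    // conversion that System.Convert.ToDouble() performs on a boxed value.
    public static bool TryToDouble(object value, out double result)
    {
        try
        {
            // Assumption: invariant culture, so "1.5" always parses as 1.5.
            result = Convert.ToDouble(value, CultureInfo.InvariantCulture);
            return true;
        }
        catch (FormatException)
        {
            // Strings such as "n/a" are not valid numbers.
            result = double.NaN;
            return false;
        }
        catch (InvalidCastException)
        {
            // The value's type does not support conversion to double.
            result = double.NaN;
            return false;
        }
    }

    public static void Main()
    {
        // A mix of values that a non-numeric column might hold.
        object[] cells = { "1.5", 2, 4.76m, "n/a" };
        foreach (object cell in cells)
        {
            bool ok = TryToDouble(cell, out double d);
            Console.WriteLine(ok
                ? $"{cell} -> {d}"
                : $"{cell} cannot be converted to double");
        }
    }
}
```

Numeric strings, integers, and decimals convert successfully, while the last value fails. This is the kind of failure that NMath is described above as reporting with an NMathFormatException.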

However, there is a processing penalty due to such type conversion. If you need to perform many statistical functions on a column, first create a new DFIntColumn or DFNumericColumn from your data column, so type conversion occurs only once. For example, if column 4 in data frame df is a DFGenericColumn containing decimal types, this works:

Code Example – C#

double mean = StatsFunctions.Mean( df[4] );
double stdev = StatsFunctions.StandardDeviation( df[4] );

but the decimal data is converted to doubles twice. This code first creates a new DFNumericColumn containing doubles from the generic column, then computes the statistics:

Code Example – C#

var col = new DFNumericColumn( df[4].Name, df[4] );
double mean = StatsFunctions.Mean( col );
double stdev = StatsFunctions.StandardDeviation( col );

In some cases, you may want to replace the original generic column in the data frame with the new DFNumericColumn:

Code Example – C#

df.RemoveColumn( 4 );
df.InsertColumn( 4, col );
double mean = StatsFunctions.Mean( df[4] );
double stdev = StatsFunctions.StandardDeviation( df[4] );

Note that you may not always be aware that your data is stored in a generic column. (You can always check the type of a column using the ColumnType property.) This is most likely to occur when you read data from a text file or database directly into a DataFrame. For example, if your database stores data using SQL NUMERIC or DECIMAL types, these are mapped to System.Decimal in ADO.NET. NMath does not silently convert decimals to doubles, because of the loss of precision, so they are stored in the data frame as objects in a DFGenericColumn. If you intend to perform multiple statistical functions on the data, convert the column to a DFNumericColumn first, as shown above.
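The loss of precision mentioned above can be seen with plain .NET types. The sketch below is illustrative only: DecimalPrecisionSketch and SurvivesRoundTrip are hypothetical names, not NMath APIs, and the sample values are made up. It checks whether a System.Decimal value is unchanged after being converted to double and back.

```csharp
using System;

public static class DecimalPrecisionSketch
{
    // Hypothetical helper (not an NMath API): returns true if the value
    // is unchanged after a decimal -> double -> decimal round trip.
    public static bool SurvivesRoundTrip(decimal value)
    {
        double asDouble = (double)value;
        try
        {
            return (decimal)asDouble == value;
        }
        catch (OverflowException)
        {
            // Values near the limits of decimal can round to a double
            // that no longer fits in a decimal.
            return false;
        }
    }

    public static void Main()
    {
        decimal simple = 0.5m;                              // few digits
        decimal precise = 0.1234567890123456789012345678m;  // 28 digits

        Console.WriteLine(SurvivesRoundTrip(simple));   // True
        Console.WriteLine(SurvivesRoundTrip(precise));  // False
    }
}
```

A decimal holds up to 28 or 29 significant digits, while a double holds roughly 15 to 17, so values with many digits cannot be represented exactly after conversion. This is why an explicit conversion to a DFNumericColumn is a deliberate choice, not a default.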

