NMath Stats User's Guide

TOC | Previous | Next | Index

2.2 Creating DataFrames (.NET, C#, CSharp, VB, Visual Basic, F#)

Data frames can be constructed in a variety of ways.

Creating Empty DataFrames

The default constructor creates an empty data frame with no rows or columns. Columns and rows can then be added to the new data frame.

Code Example – C#

var df = new DataFrame();

// Add some columns
df.AddColumn( new DFStringColumn( "Sex" ));  
df.AddColumn( new DFStringColumn( "AgeGroup" ));
df.AddColumn( new DFIntColumn( "Weight" ) );

// Add some rows
df.AddRow( "John Smith", "M", "Child", 45 );
df.AddRow( "Ruth Barnes", "F", "Senior", 115 );
df.AddRow( "Jane Jones", "F", "Adult", 115 );
df.AddRow( "Timmy Toddler", "M", "Child", 42 );
df.AddRow( "Betsy Young", "F", "Adult", 130 );
df.AddRow( "Arthur Smith", "M", "Senior", 142 );
df.AddRow( "Lucy Doe", "F", "Child", 30 );
df.AddRow( "Emma Allen", "F", "Child", 35 );

NOTE—The first parameter to the AddRow() method is the row key. See Section 2.3 and Section 2.4, respectively, for more information on adding columns and rows to a data frame.

Creating DataFrames from Arrays of Columns

You can also construct and populate columns independently, then combine them into a data frame:

Code Example – C#

var col1 = new DFNumericColumn( "Col1", 1.1, 2.2, 3.3, 4.4 );
var col2 = new DFBoolColumn ( "Col2", true, true, false, true );
var col3 =
  new DFStringColumn ( "Col3", "John", "Paulo", "Sam", "Becky" );
var cols = new DFColumn[] { col1, col2, col3 };
var df = new DataFrame( cols );

An InvalidArgumentException is thrown if the columns are not all of the same length.

In this case, the row keys are set to nulls; they can later be initialized using the SetRowKeys() method. Alternatively, you can pass in a collection of row keys at construction time:

Code Example – C#

var keys = new object[] { "Row1", "Row2", "Row3", "Row4" };
var df = new DataFrame( cols, keys );

Creating DataFrames from Matrices

You can construct a data frame from a DoubleMatrix and an array of column names. A new DFNumericColumn is added for each column in the matrix. For instance, this code creates a data frame from an 8 x 3 matrix:

Code Example – C#

var A = new DoubleMatrix( 8, 3, 0, 1 );
var colNames = new string[] { "A", "B", "C" };
var df = new DataFrame( A, colNames );

The number of column names must match the number of columns in the matrix.

Creating DataFrames from ADO.NET Objects

You can construct a data frame from an ADO.NET DataTable. For example, assuming table is a DataTable instance:

Code Example – C#

var df = new DataFrame( table );

In this case, the row keys are set to the default rowIndex + 1—that is, 1...n. You can also specify the row keys in various ways. This code passes in an array of row keys:

Code Example – C#

var keys = new object[] { "Row1", "Row2", "Row3", "Row4" };
var df = new DataFrame( table, keys );

Alternatively, you can indicate a column in the DataTable, either by column index or column name, to use for the row keys. This code uses column ID for row keys:

Code Example – C#

var df = new DataFrame( table, "ID" );

Creating DataFrames from Strings

You can construct a data frame from a string representation. For example, if str is a tab-delimited string containing:

Code Example – C#

Key  Col1 Col2   Col3
Row1 1.1  true   A
Row2 2.2  true   B
Row3 3.3  false  A
Row4 4.4  true   C

Then you could construct a data frame like so:

var df = new DataFrame( str );

For more control, you can also indicate:

whether the first row of data contains column headers

whether the first column of data contains row keys

the delimiter used to separate columns

whether to parse the column types, or to treat everything as string data

For example, if str is a comma-delimited string containing column headers but no row keys:

Code Example – C#

Col1,Col2,Col3
1.1,true,A
2.2,true,B
3.3,false,A
4.4,true,C

you could construct a data frame like so:

Code Example – C#

var df = new DataFrame( str, true, false, ",", true );

Top

Top