VB Data Frame Example


 

Imports System
Imports System.Collections

Imports CenterSpace.NMath.Core


Namespace CenterSpace.NMath.Examples.VisualBasic

  ' A .NET example in Visual Basic showing how to manipulate data using the DataFrame class.
  '
  ' The statistical functions in NMath Stats support the NMath Core types
  ' DoubleVector and DoubleMatrix, as well as simple arrays of doubles. In many
  ' cases, these types are sufficient for storing and manipulating your
  ' statistical data. However, they suffer from two limitations: they can only
  ' store numeric data, and they have limited support for adding, inserting, removing,
  ' and reordering data. Therefore, NMath Stats provides the DataFrame class, which
  ' represents a two-dimensional data object consisting of a list of columns of the
  ' same length. Columns are themselves lists of different types of data: numeric,
  ' string, boolean, generic, and so on.

  Module DataFrameExample

    Sub Main()

      ' Create an empty data frame.
      Dim Data As New DataFrame()

      ' Add some columns. These data describe the relationship between
      ' the size of acorns and various oak tree species. Columns in a data frame
      ' can be accessed by numeric index (0...n-1) or by a name supplied at
      ' construction time.
      Data.AddColumn(New DFStringColumn("Region"))
      Data.AddColumn(New DFNumericColumn("AcornSize"))
      Data.AddColumn(New DFNumericColumn("TreeHeight"))
      Data.AddColumn(New DFBoolColumn("Threatened"))

      ' Add some rows of data. Rows can be accessed by numeric index (0...n-1)
      ' or by a key object. The first parameter to the AddRow() method, in this
      ' case the name of the oak tree species, is the row key.
      Data.AddRow("Quercus alba L.", "Atlantic", 1.4, 27, False)
      Data.AddRow("Quercus bicolor Willd.", "Atlantic", 3.4, 21, False)
      Data.AddRow("Quercus macrocarpa Michx.", "Atlantic", 9.1, 25, False)
      Data.AddRow("Quercus Chapmanii Sarg.", "Atlantic", 0.9, 15, False)
      Data.AddRow("Quercus Durandii Buckl.", "Atlantic", 0.8, 23, True)
      Data.AddRow("Quercus laurifolia Michx.", "Atlantic", 1.1, 27, False)
      Data.AddRow("Quercus marilandica Muenchh.", "Atlantic", 3.7, 9, False)
      Data.AddRow("Quercus nigra L.", "Atlantic", 1.1, 24, True)
      Data.AddRow("Quercus palustris Muenchh.", "Atlantic", 1.1, 23, False)
      Data.AddRow("Quercus texana Buckl.", "Atlantic", 1.1, 9, False)
      Data.AddRow("Quercus coccinea Muenchh.", "Atlantic", 1.2, 4, False)
      Data.AddRow("Quercus Douglasii Hook. & Arn", "California", 4.1, 18, False)
      Data.AddRow("Quercus dumosa Nutt.", "California", 1.6, 6, False)
      Data.AddRow("Quercus Engelmannii Greene", "California", 2.0, 17, False)
      Data.AddRow("Quercus Garryana Hook.", "California", 5.5, 20, True)
      Data.AddRow("Quercus chrysolepis Liebm.", "California", 17.1, 15, False)
      Data.AddRow("Quercus vaccinifolia Engelm.", "California", 0.4, 1, False)
      Data.AddRow("Quercus tomentella Engelm", "California", 7.1, 18, True)

      ' Display the entire, original data frame.
      Console.WriteLine()
      Console.WriteLine(Data)
      Console.WriteLine()

      ' Reorder some columns. Let's move the AcornSize column to the end.
      Data.PermuteColumns(0, 3, 1, 2)
      Console.WriteLine(Data)
      Console.WriteLine()
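
      ' Illustration only, using plain arrays rather than NMath calls: the
      ' permutation (0, 3, 1, 2) used above puts AcornSize last if it is read as
      ' "column I moves to position Perm(I)". That reading is an assumption
      ' inferred from the stated result, and the names below are hypothetical.
      Dim OldOrder As String() = {"Region", "AcornSize", "TreeHeight", "Threatened"}
      Dim Perm As Integer() = {0, 3, 1, 2}
      Dim NewOrder(3) As String
      For Idx As Integer = 0 To 3
        NewOrder(Perm(Idx)) = OldOrder(Idx)
      Next
      Console.WriteLine(String.Join(", ", NewOrder))
      Console.WriteLine()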

      ' If you don't know the index of a column, you can query for it by name.
      Dim AcornSizeCol As Integer = Data.IndexOfColumn("AcornSize")
      Dim treeHeightCol As Integer = Data.IndexOfColumn("TreeHeight")

      ' Sort the rows. Let's sort the rows by AcornSize in ascending order, and secondarily
      ' by TreeHeight in descending order.
      Dim ColIndices As Integer() = {AcornSizeCol, treeHeightCol}
      Dim SortingTypes As SortingType() = {SortingType.Ascending, SortingType.Descending}
      Data.SortRows(ColIndices, SortingTypes)

      Console.WriteLine(Data)
      Console.WriteLine()

      ' Remove some columns and rows.
      Data.RemoveColumn("Threatened")
      Data.RemoveRow("Quercus nigra L.")
      Data.RemoveRow(2)
      Console.WriteLine(Data)
      Console.WriteLine()

      ' Update a value by row and column index.
      Dim RowIndex As Integer = Data.IndexOfKey("Quercus chrysolepis Liebm.")
      Dim ColIndex As Integer = Data.IndexOfColumn("AcornSize")
      Data(RowIndex, ColIndex) = 17.2

      ' Get a row dictionary for one species of oak tree. The keys are the column names,
      ' and the values are the row data.
      Dim Dict As IDictionary = Data.GetRowDictionary("Quercus palustris Muenchh.")
      Console.WriteLine("Quercus palustris Muenchh.")

      Dim Keys As IEnumerator = Dict.Keys().GetEnumerator()
      Dim Key As String
      While Keys.MoveNext
        Key = Keys.Current
        Console.WriteLine(Key & ": " & Dict(Key))
      End While
      Console.WriteLine()

      ' Get a column dictionary for the TreeHeight column. The keys are the row keys, and
      ' the values are the column data.
      Dict = Data.GetColumnDictionary("TreeHeight")
      Console.WriteLine("TreeHeight")

      Keys = Dict.Keys().GetEnumerator()
      While Keys.MoveNext
        Key = Keys.Current
        Console.WriteLine(Key & ": " & Dict(Key))
      End While
      Console.WriteLine()

      ' Compute some descriptive statistics.
      Console.WriteLine("Acorn Size:")
      Console.WriteLine("Mean = " & StatsFunctions.Mean(Data("AcornSize")))
      Console.WriteLine("Var = " & StatsFunctions.Variance(Data("AcornSize")))
      Console.WriteLine()

      ' Export data to a DoubleMatrix. Non-numeric columns are ignored.
      Dim A As DoubleMatrix = Data.ToDoubleMatrix()
      Console.WriteLine(A)
      Console.WriteLine()

      ' Get a DoubleVector for the values in the AcornSize column.
      Dim V As DoubleVector = Data("AcornSize").ToDoubleVector()
      Console.WriteLine(V)
      Console.WriteLine()

      Console.WriteLine()
      Console.WriteLine("Press Enter Key")
      Console.Read()


    End Sub

  End Module

End Namespace
