VB Factor Example

← All NMath Code Examples

 

Imports System
Imports System.Collections
Imports Microsoft.VisualBasic

Imports CenterSpace.NMath.Core

Imports System.IO

Namespace CenterSpace.NMath.Examples.VisualBasic

  A .NET example in Visual Basic showing how to create and manipulate factors.

  The Factor class represents a categorical vector in which all
  elements are drawn from a finite number of factor levels. Thus, a Factor contains
  two parts: a string array of factor levels, and an integer array of
  categorical data, of which each element is an index into the array of levels.
  Module FactorExample

    Sub Main()

      Read in data from the file. The data show test scores for 17 children on a
      simple reading test. The childs gender ("male" or "female") and grade
      (4, 5, or 6) is also recorded.
      Dim DF As DataFrame = DataFrame.Load("FactorExample.dat", True, False, ControlChars.Tab, True)

      Console.WriteLine()
      Console.WriteLine(DF)
      Console.WriteLine()

      Factors are usually constructed from a data frame column using the
      GetFactor() method, which creates a Factor with levels for the sorted, unique
      values in the column.
      Dim Gender As Factor = DF.GetFactor("Gender")

      Display the levels and categorical data for the gender factor.
      Console.WriteLine("Gender factor: " & Gender.ToString())
      Console.WriteLine("Gender levels: " & Gender.LevelsToString())
      Console.WriteLine("Gender data: " & Gender.DataToString())
      Console.WriteLine()

      Construct a factor for grade level.
      Dim Grade As Factor = DF.GetFactor("Grade")

      Display the levels and categorical data for the grade factor.
      Console.WriteLine("Grade factor: " & Grade.ToString())
      Console.WriteLine("Grade levels: " & Grade.LevelsToString())
      Console.WriteLine("Grade data: " & Grade.DataToString())
      Console.WriteLine()

      The principal use of factors is in conjunction with the
      GetGroupings() methods on Subset. One overload of this method accepts
      a single Factor and returns an array of subsets containing the indices
      for each level of the given factor.
      Dim Genders As Subset() = Subset.GetGroupings(Gender)
      Dim Grades As Subset() = Subset.GetGroupings(Grade)

      Display overall mean
      Console.WriteLine("Grand mean = " & Math.Round(StatsFunctions.Mean(DF("Score"))))
      Console.WriteLine()

      Display mean for each level of the Gender and Grade factors.
      Console.WriteLine("Marginal Means")
      Dim I As Integer
      Dim Mean As Double

      For I = 0 To Gender.NumberOfLevels - 1
        Mean = StatsFunctions.Mean(DF(DF.IndexOfColumn("Score"), Genders(I)))
        Mean = Math.Round(Mean, 2)
        Console.WriteLine("Mean for gender " & Gender.Levels(I) & " = " & Mean)
      Next

      For I = 0 To Grade.NumberOfLevels - 1
        Mean = StatsFunctions.Mean(DF(DF.IndexOfColumn("Score"), Grades(I)))
        Mean = Math.Round(Mean, 2)
        Console.WriteLine("Mean for grade " & Grade.Levels(I) & " = " & Mean)
      Next
      Console.WriteLine()

      Another overload of GetGroupings() accepts two Factor objects and returns
      a two-dimensional jagged array of subsets containing the indices for
      each combination of levels in the two factors.
      Console.WriteLine("Cell Means")
      Dim Cells As Subset(,) = Subset.GetGroupings(Gender, Grade)

      Dim J As Integer
      For I = 0 To Gender.NumberOfLevels - 1
        For J = 0 To Grade.NumberOfLevels - 1
          Mean = StatsFunctions.Mean(DF(DF.IndexOfColumn("Score"), Cells(I, J)))
          Mean = Math.Round(Mean, 2)
          Console.WriteLine("Mean for gender " & Gender.Levels(I) & " in grade " & Grade.Levels(I) & " = " & Mean)
        Next
      Next
      Console.WriteLine()

      Combining DataFrame.GetFactor()with Subset.GetGroupings() to access “cells”
      is such a common operation that class DataFrame also provides the Tabulate()
      method as a convenience. This method accepts one or two grouping columns, a
      data column, and a delegate to apply to each data column subset. This code 
      displays the same marginal and cell means shown above, but with far fewer
      lines of code:
      Dim MeanFunction As New Func(Of IDFColumn, Double)(AddressOf StatsFunctions.Mean)
      Console.WriteLine("Same results using cross-tabulation:" & Environment.NewLine)
      Console.WriteLine(DF.Tabulate("Grade", "Score", MeanFunction).ToString() & Environment.NewLine)
      Console.WriteLine(DF.Tabulate("Gender", "Score", MeanFunction).ToString() & Environment.NewLine)
      Console.WriteLine(DF.Tabulate("Grade", "Gender", "Score", MeanFunction).ToString() & Environment.NewLine)

      Factors are used internally by ANOVA classes for grouping data. 
      Dim Anova As TwoWayAnova = New TwoWayAnova(DF, DF.IndexOfColumn("Gender"), DF.IndexOfColumn("Grade"), DF.IndexOfColumn("Score"))
      Console.WriteLine(Anova)

      Console.WriteLine()
      Console.WriteLine("Press Enter Key")
      Console.Read()

    End Sub

  End Module

End Namespace


← All NMath Code Examples
Top