
```
Imports System
Imports System.IO

Imports CenterSpace.NMath.Core

Namespace CenterSpace.NMath.Core.Examples.VisualBasic

' A .NET example in Visual Basic

Sub Main()

' NMath Stats provides classes for performing a factor analysis on a set of case data.
' Case data should be provided to these classes in matrix form - the variable values
' in columns and each row representing a case. In this example we look at
' a hypothetical sample of 300 responses on 6 items from a survey of college students'
' favorite subject matter. The items range in value from 1 to 5, which represent a scale
' from Strongly Dislike to Strongly Like. Our 6 items asked students to rate their liking
' of different college subject matter areas, including biology (BIO), geology (GEO),
' chemistry (CHEM), algebra (ALG), calculus (CALC), and statistics (STAT).

' First load the data, which is in a comma delimited form.
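' The loading step is not shown in this listing, so the line below is only a sketch
' under stated assumptions: the file name "FactorAnalysisExample.csv" is hypothetical,
' and the file is assumed to have a header row of variable names and no row keys.
' Arguments: file name, has column headers, has row keys, delimiter, parse values.
Dim FavoriteSubject As DataFrame = DataFrame.Load("FactorAnalysisExample.csv", True, False, ",", True)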

' NMath Stats provides three classes for performing factor analysis. All of them
' analyze either the correlation matrix or the covariance matrix of the case data.
' In addition, each of these classes has two type parameters, one specifying the
' algorithm used to extract the factors and the other specifying a factor rotation
' method. Here we use the class FactorAnalysisCovariance, which analyzes the
' covariance matrix of the case data, with principal components extraction and
' varimax rotation.
' The other two factor analysis classes are FactorAnalysisCorrelation, for analyzing
' the correlation matrix, and DoubleFactorAnalysis, which can be used if you don't
' have access to the original case data, just the correlation or covariance matrix
' (DoubleFactorAnalysis is a base class for FactorAnalysisCorrelation and
' FactorAnalysisCovariance).

' Construct the factor analysis object we use for our analysis. Here we
' first construct instances of the factor extraction and rotation classes
' and use them in the factor analysis object construction. This gives
' us control of the parameters affecting these algorithms.

' Construct a principal components factor extraction object, specifying the
' function object that determines the number of factors to extract. The
' type of this argument is Func<DoubleVector, DoubleMatrix, int>
' (Func(Of DoubleVector, DoubleMatrix, Integer) in Visual Basic). It
' takes as arguments the vector of eigenvalues and the matrix of eigenvectors
' and returns the number of factors to extract. The class NumberOfFactors
' contains static methods for creating functors for several common
' strategies. Here we extract factors whose eigenvalues are greater
' than 1.2 times the mean of the eigenvalues.
Dim FactorExtraction As New PCFactorExtraction(NumberOfFactors.EigenvaluesGreaterThanMean(1.2))
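' For illustration only (not part of the original example): a hand-written rule with
' the same Func(Of DoubleVector, DoubleMatrix, Integer) shape. This is a sketch of the
' strategy described above, not NMath's own implementation, and the name CustomRule
' is hypothetical. It counts the eigenvalues that exceed 1.2 times their mean.
Dim CustomRule As Func(Of DoubleVector, DoubleMatrix, Integer) = _
  Function(Eigenvalues As DoubleVector, Eigenvectors As DoubleMatrix) As Integer
    Dim Threshold As Double = 1.2 * NMathFunctions.Sum(Eigenvalues) / Eigenvalues.Length
    Dim Count As Integer = 0
    For K As Integer = 0 To Eigenvalues.Length - 1
      If Eigenvalues(K) > Threshold Then Count += 1
    Next
    Return Count
  End Function
' Assuming the constructor accepts any function of this type, as the comment above
' describes, it could be passed in place of the NumberOfFactors functor:
'   Dim FactorExtraction As New PCFactorExtraction(CustomRule)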

' Next construct an instance of the rotation algorithm we want to use,
' which is the varimax algorithm. Here we specify the convergence criterion
' by setting the tolerance to 1e-6. Iteration will stop when the relative
' change in the sum of the singular values is less than this number.
' We also specify that we do NOT want Kaiser normalization to be performed.
Dim FactorRotation As New VarimaxRotation
FactorRotation.Tolerance = 0.000001
FactorRotation.Normalize = False

' We now construct our factor analysis object. We provide the case data as a matrix (columns
' correspond to variables and rows correspond to cases), the bias type - variances will be
' computed as biased, and our extraction and rotation objects.
Dim FA As New FactorAnalysisCovariance(Of PCFactorExtraction, VarimaxRotation)(FavoriteSubject.ToDoubleMatrix(), _
  BiasType.Biased, FactorExtraction, FactorRotation)

Console.WriteLine()
Console.WriteLine("Number of factors extracted:  " & FA.NumberOfFactors)
' Looks like we will retain two factors.

' Extracted communalities are estimates of the proportion of variance in each variable
' accounted for by the factors.
Dim ExtractedCommunalities As DoubleVector = FA.ExtractedCommunalities
Console.WriteLine()
Console.WriteLine("Predictor" & ControlChars.Tab & "Extracted Communality")
Console.WriteLine("-------------------------------------")

For I As Integer = 0 To FavoriteSubject.Cols - 1
  Console.Write(FavoriteSubject(I).Name & ControlChars.Tab & ControlChars.Tab)
  Console.WriteLine(ExtractedCommunalities(I).ToString("G3"))
Next

Console.WriteLine()

' We can get a little better picture of the communalities by looking at their
' rescaled values. The FactorAnalysisCovariance class provides many 'rescaled'
' results for calculations involving the extracted factors. In the rescaled
' version the factors are first rescaled by dividing by the standard deviations
' of the case variables before being used in the calculation.
'
' The rescaled communalities have values between 0 and 1. Most of the values
' are close to 1, except for STAT. Maybe we should extract another factor?
Dim RescaledCommunalities As DoubleVector = FA.RescaledExtractedCommunalities
Console.WriteLine("Predictor" & ControlChars.Tab & "Rescaled Communality")
Console.WriteLine("-------------------------------------")

For I = 0 To FavoriteSubject.Cols - 1
  Console.Write(FavoriteSubject(I).Name & ControlChars.Tab & ControlChars.Tab)
  Console.WriteLine(RescaledCommunalities(I).ToString("G3"))
Next
Console.WriteLine()

' Next we look at the variance explained by the initial solution
' by printing out a table of these values.
' The first column will just be the extracted factor number.
'
' The second 'Total' column gives the eigenvalue, or amount of
' variance in the original variables accounted for by each factor.
' Note that only the first two factors will be kept because their
' value is greater than 1.2 times the mean of the eigenvalues.
'
' The % of Variance column gives the ratio, expressed as a percentage,
' of the variance accounted for by each factor to the total
' variance in all of the variables.
'
' The Cumulative % column gives the percentage of variance accounted
' for by the first n factors. For example, the cumulative percentage
' for the second factor is the sum of the percentage of variance
' for the first and second factors.
Console.WriteLine("Factor" & ControlChars.Tab & "Total" & ControlChars.Tab & "Variance" & ControlChars.Tab _
  & "Cumulative")
Console.WriteLine("----------------------------------------------------")

For I = 0 To FA.VarianceProportions.Length - 1
  Console.Write(I)
  Console.Write(ControlChars.Tab & FA.FactorExtraction.Eigenvalues(I).ToString("G4") & ControlChars.Tab)
  Console.Write(FA.VarianceProportions(I).ToString("P4") & ControlChars.Tab)
  Console.WriteLine(FA.CumulativeVarianceProportions(I).ToString("P4"))
Next

' Looks like we retain over 75% of the variance with just two factors.

' Next we look at the percentages of variance explained by the
' extracted rotated factors. Comparing this table with the first
' two rows of the previous one (two factors are extracted),
' we see that the cumulative percentage of variation explained by the
' extracted factors is maintained by the rotated factors. The variation
' is now spread more evenly over the factors, though only slightly.
' Maybe we could skip rotation, or try a different rotation type.
Dim EigenValueSum As Double = NMathFunctions.Sum(FA.FactorExtraction.Eigenvalues)
Console.WriteLine()
Console.WriteLine()
Console.WriteLine("Factor" & ControlChars.Tab & "Total" & ControlChars.Tab & "Variance" & ControlChars.Tab & "Cumulative")
Console.WriteLine("----------------------------------------------------")

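' The per-factor rows appear to be incomplete in this listing (the cumulative value
' was never updated). As a sketch, the variance explained by each rotated factor is
' computed here directly as the sum of squared loadings in the corresponding column
' of the rotated factor matrix.
Dim RotatedLoadings As DoubleMatrix = FA.RotatedFactors
Dim Cumulative As Double = 0
For I = 0 To FA.NumberOfFactors - 1
  Dim SumOfSquares As Double = 0
  For J As Integer = 0 To RotatedLoadings.Rows - 1
    SumOfSquares += RotatedLoadings(J, I) * RotatedLoadings(J, I)
  Next
  ' Proportion of the total variance (the sum of all eigenvalues) for this factor.
  Dim Proportion As Double = SumOfSquares / EigenValueSum
  Cumulative += Proportion
  Console.Write(I)
  Console.Write(ControlChars.Tab & SumOfSquares.ToString("G4") & ControlChars.Tab)
  Console.Write(Proportion.ToString("P4") & ControlChars.Tab)
  Console.WriteLine(Cumulative.ToString("P4"))
Next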

Console.WriteLine()

' The rotated factor matrix helps you to determine what the factors represent.
Dim RotatedComponentMatrix As DoubleMatrix = FA.RotatedFactors
Console.WriteLine("Rotated Factor Matrix")
Console.WriteLine()
Console.WriteLine("Predictor" & ControlChars.Tab & "Factor")
Console.WriteLine("" & ControlChars.Tab & ControlChars.Tab & "1" & ControlChars.Tab & "2")
Console.WriteLine("-------------------------------------")

For I = 0 To FavoriteSubject.Cols - 1
  Console.Write(FavoriteSubject(I).Name & ControlChars.Tab & ControlChars.Tab)
  If FavoriteSubject(I).Name.Length >= 8 Then
    Console.Write(ControlChars.Tab)
  End If
  Console.Write(RotatedComponentMatrix(I, 0).ToString("G4") & ControlChars.Tab & ControlChars.Tab)
  Console.WriteLine(RotatedComponentMatrix(I, 1).ToString("G4"))
Next

' The first factor is most highly correlated with BIO, GEO, and CHEM.
' CHEM is a better representative, however, because it is less correlated
' with the other factor.
'
' The second factor is most highly correlated with ALG, CALC, and STAT.

Console.WriteLine()
Console.WriteLine("Press Enter Key")