# F# Principal Component Example


```fsharp
namespace CenterSpace.NMath.Core.Examples.FSharp

open System

open CenterSpace.NMath.Core

/// <summary>
/// A .NET example in F# showing how to perform a principal component analysis on a data set.
/// </summary>
module PrincipalComponentExample =

// Read in data from a file. These data give air pollution and related values
// for 41 U.S. cities.
//   SO2: Sulfur dioxide content of air in micrograms per cubic meter
//   Temp: Average annual temperature in degrees Fahrenheit
//   Man: Number of manufacturing enterprises employing 20 or more workers
//   Pop: Population size in thousands from the 1970 census
//   Wind: Average annual wind speed in miles per hour
//   Rain: Average annual precipitation in inches
//   RainDays: Average number of days with precipitation per year
// Source: http://lib.stat.cmu.edu/DASL/Datafiles/AirPollution.html

let df = DataFrame.Load("..\\..\\PrincipalComponentExample.dat", true, true, "\t", true)

printfn "%s" (df.ToString())
printfn ""

// Class DoublePCA performs a double-precision principal component
// analysis on a given data set. The data may optionally be centered and
// scaled before analysis takes place. By default, variables are centered
// but not scaled.
let pca = new DoublePCA(df)
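
// A minimal sketch of the optional centering and scaling described above.
// Assumption: DoublePCA also offers a constructor overload that takes two
// booleans (center, scale); check the NMath API reference for your version.
// Scaling is often worthwhile for data like this, where the variables use
// very different units (degrees Fahrenheit, thousands of people, inches).
let scaledPca = new DoublePCA(df, true, true)
printfn "Scaled analysis: centered? = %A, scaled? = %A" scaledPca.IsCentered scaledPca.IsScaled
printfn ""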

// Once your data is analyzed, you can retrieve information about the data.
// If centering was specified, the column means are subtracted from
// the column values before analysis takes place. If scaling was specified,
// column values are scaled to have unit variance before analysis by dividing
// by the column norm.
printfn "Number of Observations = %A" pca.NumberOfObservations
printfn "Number of Variables = %A" pca.NumberOfVariables
printfn "Column Means = %s" (pca.Means.ToString())
printfn "Column Norms = %s" (pca.Norms.ToString())
printfn "Data was centered? = %A" pca.IsCentered
printfn "Data was scaled? = %A" pca.IsScaled
printfn ""

printfn ""

// You can retrieve a particular principal component using the indexer.
printfn "First principal component = %s" (pca.[0].ToString())
printfn ""
printfn "Second principal component = %s" (pca.[1].ToString())
printfn ""

// The first principal component accounts for as much of the variability in the
// data as possible, and each succeeding component accounts for as much of the
// remaining variability as possible.
printfn "Variance Proportions = %s" (pca.VarianceProportions.ToString())
printfn ""
printfn "Cumulative Variance Proportions = %s" (pca.CumulativeVarianceProportions.ToString())
printfn ""
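
// Illustrative sketch, independent of NMath: how variance proportions relate
// to per-component variances. The variances below are made-up values chosen
// for easy arithmetic, not results from the air pollution data.
let exampleVariances = [| 4.0; 2.0; 1.0; 1.0 |]
let exampleTotalVariance = Array.sum exampleVariances
// Share of the total variance explained by each component.
let exampleProportions = exampleVariances |> Array.map (fun v -> v / exampleTotalVariance)
// Running total of the shares; Array.scan also emits the initial 0.0, so drop it.
let exampleCumulative = exampleProportions |> Array.scan (+) 0.0 |> Array.skip 1
printfn "Example proportions = %A" exampleProportions
printfn "Example cumulative proportions = %A" exampleCumulative
printfn ""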

// You can also get the number of principal components required to account for
// a given proportion of the total variance. In this case, a plane fit to the
// original 7-dimensional space accounts for 99% of the variance.
printfn "PCs that account for 99%% of the variance = %s" (pca.Threshold(0.99).ToString())
printfn ""
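
// Sketch of the idea behind a variance threshold (plain F#, not NMath's
// implementation): count how many leading components are needed before the
// cumulative proportion reaches the target. The values are made up for
// illustration; Array.findIndex throws if no entry reaches the target.
let exampleCumulativeProportions = [| 0.5; 0.75; 0.875; 1.0 |]
let exampleComponentsNeeded target =
    1 + (exampleCumulativeProportions |> Array.findIndex (fun c -> c >= target))
printfn "Example: components needed for 80%% = %d" (exampleComponentsNeeded 0.8)
printfn ""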

// The Score matrix is the data formed by transforming the original data into
// the space of the principal components.
printfn "Scores = %s" (pca.Scores.ToString())
printfn ""
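
// Sketch of what a single score represents (plain F#, made-up numbers):
// center the observation by subtracting the column means, then take the dot
// product with a principal component, which is a unit-length direction.
// If scaling was specified, each centered value would also be divided by
// its column norm before the dot product.
let exampleObservation = [| 3.0; 5.0 |]
let exampleColumnMeans = [| 1.0; 2.0 |]
let exampleComponent = [| 0.6; 0.8 |]
let exampleCentered = Array.map2 (fun x m -> x - m) exampleObservation exampleColumnMeans
let exampleScore = Array.map2 (fun x w -> x * w) exampleCentered exampleComponent |> Array.sum
printfn "Example score = %f" exampleScore
printfn ""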

printfn "Press Enter Key"