Imports System
Imports System.IO

' Assumption: in recent NMath releases DataFrame, StatsSettings and
' StatsFunctions live in CenterSpace.NMath.Core. Older releases may also
' need CenterSpace.NMath.Stats.
Imports CenterSpace.NMath.Core

Namespace CenterSpace.NMath.Examples.VisualBasic

  ' A .NET example in Visual Basic showing how to manipulate data that has
  ' missing values.
  Module MissingValuesExample

    Sub Main()

      ' In the data, missing values are denoted by -1. Set the defaults
      ' accordingly.
      StatsSettings.IntegerMissingValue = -1

      ' Read in data from a tab-delimited file. Data has fields for name,
      ' manufacturer, type, calories, protein, fat, sodium, fiber,
      ' carbohydrates, sugars, potassium, vitamins, shelf weight, cups,
      ' rating, age, gender and grade.
      Dim Data As DataFrame = DataFrame.Load("MissingValuesExample.dat")

      ' Print out initial data.
      Console.WriteLine()
      Console.WriteLine(Data)
      Console.WriteLine()

      ' Check how many valid (non-missing) values are in each numeric column.
      Dim Total As Integer = Data.Rows
      Dim Valid As Integer
      Dim C As Integer
      For C = 0 To Data.Cols - 1
        If Data(C).IsNumeric Then
          Console.Write(Data(C).Label)
          Console.Write(": ")
          Dim NC As Integer = StatsFunctions.NaNCount(Data(C))
          Valid = Total - NC
          Console.WriteLine(Valid)
        End If
      Next
      Console.WriteLine()

      ' The columns "carbo", "sugars" and "potass" contain missing values.
      ' We can still perform descriptive statistics on them.
      Console.WriteLine("Average sugar content: " + StatsFunctions.NaNMean(Data("sugars")).ToString("G5"))

      ' Sorting routines give ambiguous results when columns contain NaN
      ' values. We can strip the missing values first.
      Dim Stripped As DFIntColumn = CType(StatsFunctions.NaNRemove(Data("sugars")), DFIntColumn)
      Console.WriteLine("Median sugar content: " + StatsFunctions.Median(Stripped).ToString("G5"))
      Console.WriteLine("90th percentile sugar content: " + StatsFunctions.Percentile(Stripped, 0.9).ToString("G5"))
      Console.WriteLine()

      ' Create a sub-frame that contains only rows without missing values.
      Dim CleanData As DataFrame = Data.CleanRows()
      Console.WriteLine("Stripped " & (Data.Rows - CleanData.Rows) & " rows containing missing values.")

      Console.WriteLine()
      Console.WriteLine("Press Enter Key")
      Console.Read()

    End Sub

  End Module

End Namespace
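
' The following is a minimal sketch, not NMath's implementation. It shows, in
' plain .NET, the kind of logic a missing-value-aware count and mean might
' perform when -1 marks a missing integer, as in the example above. The
' namespace, module, function names and sample values are all hypothetical
' and exist only for illustration.
Namespace MissingValuesSketch

  Module SentinelStatistics

    ' Hypothetical sentinel, mirroring StatsSettings.IntegerMissingValue = -1.
    Private Const Missing As Integer = -1

    ' Counts the entries that are not the sentinel.
    Function ValidCount(values As Integer()) As Integer
      Dim count As Integer = 0
      For Each v As Integer In values
        If v <> Missing Then
          count += 1
        End If
      Next
      Return count
    End Function

    ' Averages the non-missing entries. Returns NaN when no entry is valid.
    Function MeanIgnoringMissing(values As Integer()) As Double
      Dim sum As Double = 0.0
      Dim count As Integer = 0
      For Each v As Integer In values
        If v <> Missing Then
          sum += v
          count += 1
        End If
      Next
      If count = 0 Then
        Return Double.NaN
      End If
      Return sum / count
    End Function

    ' Made-up sample data: two of the six entries are missing.
    Sub Demo()
      Dim sugars As Integer() = {6, -1, 5, 0, -1, 8}
      System.Console.WriteLine("Valid entries: " & ValidCount(sugars).ToString())
      System.Console.WriteLine("Mean of valid entries: " & MeanIgnoringMissing(sugars).ToString("G5"))
    End Sub

  End Module

End Namespace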