<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>normal distribution Archives - CenterSpace</title>
	<atom:link href="https://www.centerspace.net/tag/normal-distribution/feed" rel="self" type="application/rss+xml" />
	<link>https://www.centerspace.net/tag/normal-distribution</link>
	<description>.NET numerical class libraries</description>
	<lastBuildDate>Mon, 31 Aug 2020 19:48:02 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.1.1</generator>
<site xmlns="com-wordpress:feed-additions:1">104092929</site>	<item>
		<title>Distribution Fitting Demo</title>
		<link>https://www.centerspace.net/distribution-fitting-demo</link>
					<comments>https://www.centerspace.net/distribution-fitting-demo#respond</comments>
		
		<dc:creator><![CDATA[Ken Baldwin]]></dc:creator>
		<pubDate>Mon, 09 Apr 2012 14:49:02 +0000</pubDate>
				<category><![CDATA[NMath Tutorial]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[CDF]]></category>
		<category><![CDATA[CDF C#]]></category>
		<category><![CDATA[gaussian distribution]]></category>
		<category><![CDATA[nonlinear least squares]]></category>
		<category><![CDATA[normal distribution]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[PDF C#]]></category>
		<category><![CDATA[probability distribution]]></category>
		<category><![CDATA[Trust Region minimization]]></category>
		<guid isPermaLink="false">http://www.centerspace.net/blog/?p=3719</guid>

					<description><![CDATA[<p><img class="excerpt" title="Distribution Fit" src="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_pdf.png" alt="Distribution Fit" /><br />
A customer recently asked how to fit a normal (Gaussian) distribution to a vector of experimental data. Here's a demonstration of how to do it.</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/distribution-fitting-demo">Distribution Fitting Demo</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>A customer recently asked how to fit a normal (Gaussian) distribution to a vector of experimental data. Here&#8217;s a demonstration of how to do it.</p>
<p>Let&#8217;s start by creating a data set: 100 values drawn from a normal distribution with known parameters (mean = 0.5, variance = 2.0).</p>
<pre lang="csharp">int n = 100;
double mean = .5;
double variance = 2.0;
var data = new DoubleVector( n, new RandGenNormal( mean, variance ) );</pre>
<p>Now, compute y values based on the empirical cumulative distribution function (CDF), which returns the probability that a random variable X will have a value less than or equal to x; that is, f(x) = P(X &lt;= x). Here&#8217;s an easy way to do it, although not necessarily the most efficient for larger data sets:</p>
<pre lang="csharp">var cdfY = new DoubleVector( data.Length );
var sorted = NMathFunctions.Sort( data );
for ( int i = 0; i &lt; data.Length; i++ )
{
  int j = 0;
  while ( j &lt; sorted.Length &amp;&amp; sorted[j] &lt;= data[i] ) j++;
  cdfY[i] = j / (double)data.Length;
}</pre>
<p>The data is sorted, then for each value x in the data, we iterate through the sorted vector looking for the first value that is greater than x.</p>
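<p>For larger data sets, the inner scan can be replaced with a binary search over the sorted values. The following is a minimal sketch using plain arrays rather than NMath types; the <code>EmpiricalCdf</code> class and its <code>Evaluate</code> method are hypothetical names introduced for illustration and are not part of NMath.</p>
<pre lang="csharp">using System;

public static class EmpiricalCdf
{
  // Returns F(x_i) = (number of values &lt;= x_i) / n for each x_i, in input order.
  public static double[] Evaluate( double[] data )
  {
    int n = data.Length;
    double[] sorted = (double[])data.Clone();
    Array.Sort( sorted );

    var result = new double[n];
    for ( int i = 0; i &lt; n; i++ )
    {
      // Binary search for the first index whose value is greater than data[i].
      int lo = 0, hi = n;
      while ( lo &lt; hi )
      {
        int mid = lo + ( hi - lo ) / 2;
        if ( sorted[mid] &lt;= data[i] ) lo = mid + 1;
        else hi = mid;
      }
      result[i] = lo / (double)n;
    }
    return result;
  }
}</pre>
<p>Each lookup is then O(log n) rather than O(n), and tied values still receive the same CDF value because the search stops at the first element strictly greater than x.</p>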
<p>We&#8217;ll use one of NMath&#8217;s non-linear least squares minimization routines to fit a normal distribution CDF() function to our empirical CDF. NMath provides classes for fitting generalized one variable functions to a set of points. In the space of the function parameters, beginning at a specified starting point, these classes find a minimum (possibly local) in the sum of the squared residuals with respect to a set of data points.</p>
<p>A one variable function takes a single double x, and returns a double y:</p>
<pre class="code">y = f(x)</pre>
<p>A <em>generalized</em> one variable function additionally takes a set of parameters, p, which may appear in the function expression in arbitrary ways:</p>
<pre class="code">y = f(p1, p2,..., pn; x)</pre>
<p>For example, this code computes y=a*sin(b*x + c):</p>
<pre lang="csharp">public double MyGeneralizedFunction( DoubleVector p, double x )
{
  return p[0] * Math.Sin( p[1] * x + p[2] );
}</pre>
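<p>To make the fitting objective concrete, here is a minimal sketch of the sum of squared residuals for a generalized one variable function. It uses plain arrays instead of NMath types, and the <code>Residuals</code> class is a hypothetical helper written for illustration; it is not how NMath computes the objective internally.</p>
<pre lang="csharp">using System;

public static class Residuals
{
  // Sum over i of ( y[i] - f(p; x[i]) )^2 for a parameterized function f.
  public static double SumOfSquares(
    Func&lt;double[], double, double&gt; f, double[] p, double[] x, double[] y )
  {
    double sum = 0.0;
    for ( int i = 0; i &lt; x.Length; i++ )
    {
      double r = y[i] - f( p, x[i] );
      sum += r * r;
    }
    return sum;
  }
}</pre>
<p>A fitter searches the parameter space for the <code>p</code> that makes this value as small as possible. With the sine example, <code>f</code> would be <code>( p, x ) =&gt; p[0] * Math.Sin( p[1] * x + p[2] )</code>.</p>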
<p>In the distribution fitting example, we want to define a parameterized function delegate that returns CDF(x) for the distribution described by the given parameters (mean, variance):</p>
<pre lang="csharp">Func&lt;DoubleVector, double, double&gt; f =
  ( DoubleVector p, double x ) =&gt;
    new NormalDistribution( p[0], p[1] ).CDF( x );</pre>
<p>Now that we have our data and the function we want to fit, we can apply the curve fitting routine. We&#8217;ll use a bounded function fitter, because the variance of the fitted normal distribution must be constrained to be greater than 0.</p>
<pre lang="csharp">var fitter = new BoundedOneVariableFunctionFitter&lt;TrustRegionMinimizer&gt;( f );
var start = new DoubleVector( new double[] { 0.1, 0.1 } );
var lowerBounds = new DoubleVector( new double[] { Double.MinValue, 0 } );
var upperBounds = 
   new DoubleVector( new double[] { Double.MaxValue, Double.MaxValue } );
var solution = fitter.Fit( data, cdfY, start, lowerBounds, upperBounds );
var fit = new NormalDistribution( solution[0], solution[1] );

Console.WriteLine( "Fitted distribution:\nmean={0}\nvariance={1}",
  fit.Mean, fit.Variance );</pre>
<p>The output for one run is</p>
<pre class="code">Fitted distribution: 
mean=0.567334190790594
variance=2.0361207956132</pre>
<p>which is a reasonable approximation to the original distribution (given 100 points).</p>
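<p>As a quick cross-check, which is not part of the original demo, the sample mean and sample variance of the data should land close to the fitted parameters. The sketch below uses plain arrays, and <code>SampleMoments</code> is a hypothetical helper name rather than an NMath class.</p>
<pre lang="csharp">using System;

public static class SampleMoments
{
  // Arithmetic mean of the data.
  public static double Mean( double[] data )
  {
    double sum = 0.0;
    foreach ( double v in data ) sum += v;
    return sum / data.Length;
  }

  // Unbiased sample variance (divides by n - 1).
  public static double Variance( double[] data )
  {
    double m = Mean( data );
    double ss = 0.0;
    foreach ( double v in data ) ss += ( v - m ) * ( v - m );
    return ss / ( data.Length - 1 );
  }
}</pre>
<p>If these moment estimates and the least squares fit disagree badly, that usually points to a problem with the starting point or the bounds rather than with the data.</p>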
<p>We can also visually inspect the fit by plotting the original data and the CDF() function of the fitted distribution.</p>
<pre lang="csharp">ToChart( data, cdfY, SeriesChartType.Point, fit,
  NMathStatsChart.DistributionFunction.CDF );

private static void ToChart( DoubleVector x, DoubleVector y,
  SeriesChartType dataChartType, NormalDistribution dist,
  NMathStatsChart.DistributionFunction distFunction )
{
  var chart = NMathStatsChart.ToChart( dist, distFunction );
  chart.Series[0].Name = "Fit";

  var series = new Series() {
    Name = "Data",
    ChartType = dataChartType
  };
  series.Points.DataBindXY( x, y );
  chart.Series.Insert( 0, series );

  chart.Legends.Add( new Legend() );
  NMathChart.Show( chart );
}</pre>
<p><a href="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_cdf.png"><img decoding="async" class="aligncenter size-full wp-image-3727" title="distribution_fit_cdf" src="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_cdf.png" alt="CDF() of fitted distribution" width="482" height="488" srcset="https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_cdf.png 482w, https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_cdf-296x300.png 296w" sizes="(max-width: 482px) 100vw, 482px" /></a></p>
<p>We can also look at the probability density function (PDF) of the fitted distribution, but to do so we must first construct an empirical PDF using a histogram. The x-values are the midpoints of the histogram bins, and the y-values are the histogram counts converted to probabilities, scaled to integrate to 1.</p>
<pre lang="csharp">int numBins = 10;
var hist = new Histogram( numBins, data );

var pdfX = new DoubleVector( hist.NumBins );
var pdfY = new DoubleVector( hist.NumBins );
for ( int i = 0; i &lt; hist.NumBins; i++ )
{
  // use bin midpoint for x value
  Interval bin = hist.Bins[i];
  pdfX[i] = ( bin.Min + bin.Max ) / 2;

  // convert histogram count to probability for y value
  double binWidth = bin.Max - bin.Min;
  pdfY[i] = hist.Count( i ) / ( data.Length * binWidth );
}

ToChart( pdfX, pdfY, SeriesChartType.Column, fit,
  NMathStatsChart.DistributionFunction.PDF );</pre>
<p><a href="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_pdf.png"><img decoding="async" loading="lazy" class="aligncenter size-full wp-image-3728" title="distribution_fit_pdf" src="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_pdf.png" alt="PDF() of fitted distribution" width="485" height="484" srcset="https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_pdf.png 485w, https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_pdf-150x150.png 150w, https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_pdf-300x300.png 300w" sizes="(max-width: 485px) 100vw, 485px" /></a></p>
<p>You might be tempted to fit a distribution PDF() function directly to the histogram data, rather than using the CDF() function as we did above, but this is problematic for several reasons. The bin counts have different variability than the original data. They also have a fixed sum, so they are not independent measurements. Finally, for continuous data, fitting a model to aggregated histogram counts, rather than to the original data, throws away information.</p>
<p>Ken</p>
<p>Download the <a href="https://drive.google.com/open?id=1KlctDEKniD8SdmQiBGmcrJWMuhvU-WYP">source code</a></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/distribution-fitting-demo/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">3719</post-id>	</item>
		<item>
		<title>Probability Distributions in NMath Stats</title>
		<link>https://www.centerspace.net/probability-distributions-in-nmath-stats</link>
					<comments>https://www.centerspace.net/probability-distributions-in-nmath-stats#respond</comments>
		
		<dc:creator><![CDATA[Paul Shirkey]]></dc:creator>
		<pubDate>Tue, 02 Mar 2010 19:59:48 +0000</pubDate>
				<category><![CDATA[NMath Stats]]></category>
		<category><![CDATA[CDF]]></category>
		<category><![CDATA[CDF C#]]></category>
		<category><![CDATA[kolmogorov-smirnov test]]></category>
		<category><![CDATA[KS test]]></category>
		<category><![CDATA[normal distribution]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[PDF C#]]></category>
		<category><![CDATA[probability distribution]]></category>
		<category><![CDATA[weibull distribution]]></category>
		<guid isPermaLink="false">http://centerspace.net/blog/?p=1633</guid>

					<description><![CDATA[<p><img src="http://centerspace.net/blog/wp-content/uploads/2010/03/CDF-of-running-data.png" alt="Strands 5K finishing times with a Weibull CDF." class="excerpt" />Probability distributions are central to many applications in statistical analysis.  The NMath Stats library offers a large set of probability distributions, covering most domains of application, all with an easy to use common interface.  Each distribution class uses numerically stable, highly accurate, algorithms to compute both the probability distribution and the cumulative distribution. In this post we'll look at some code examples using these distribution classes. </p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/probability-distributions-in-nmath-stats">Probability Distributions in NMath Stats</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<figure style="width: 160px" class="wp-caption alignleft"><a href="http://centerspace.net/blog/wp-content/uploads/2010/03/Normal-PDF-CDF.png"><img decoding="async" class="alignleft " title="Normal Distribution PDF CDF" src="http://centerspace.net/blog/wp-content/uploads/2010/03/Normal-PDF-CDF.png" alt="Normal Distribution PDF CDF" width="160" /></a><figcaption class="wp-caption-text">Gaussian Distribution</figcaption></figure>
<p>Probability distributions are central to many applications in statistical analysis. The NMath Stats library offers a large set of probability distributions, covering most domains of application, all with an easy-to-use common interface. Each distribution class uses numerically stable, accurate algorithms to compute both the probability distribution and the cumulative distribution. In this post we&#8217;ll look at some code examples using these distribution classes. All of the charts in this post were generated using the <a href="http://www.infragistics.com/products/wpf#Overview">Infragistics WPF tool set</a>.</p>
<h3>Available Distributions</h3>
<p>The NMath Stats library offers the following set of probability distributions, with each name linked to its API documentation page. More information can be found on the CenterSpace probability distribution <a href="/probability-distributions/">landing page</a>, including links to code examples in C# and VB, and more documentation.</p>
<table>
<tbody>
<tr>
<th colspan="2"><strong>Distributions in NMath Stats</strong></th>
</tr>
<tr>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_NormalDistribution.htm">Normal Distribution (Gaussian)</a></td>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_LognormalDistribution.htm">Log Normal Distribution</a></td>
</tr>
<tr>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_PoissonDistribution.htm">Poisson Distribution</a></td>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_GeometricDistribution.htm">Geometric Distribution</a></td>
</tr>
<tr>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_WeibullDistribution.htm">Weibull Distribution</a></td>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_UniformDistribution.htm">Uniform Distribution</a></td>
</tr>
<tr>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_ChiSquareDistribution.htm">Chi-Square Distribution</a></td>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_BinomialDistribution.htm">Binomial Distribution</a></td>
</tr>
<tr>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_NegativeBinomialDistribution.htm">Negative Binomial Distribution</a></td>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_ExponentialDistribution.htm">Exponential Distribution</a></td>
</tr>
<tr>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_TDistribution.htm">T Distribution</a></td>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_FDistribution.htm">F Distribution</a></td>
</tr>
<tr>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_TriangularDistribution.htm">Triangular Distribution</a></td>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_LogisticDistribution.htm">Logistic Distribution</a></td>
</tr>
<tr>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_BetaDistribution.htm">Beta Distribution</a></td>
<td><a href="/doc/NMathSuite/ref/html/T_CenterSpace_NMath_Core_GammaDistribution.htm">Gamma Distribution</a></td>
</tr>
</tbody>
</table>
<h3>Distribution of Running Times</h3>
<p>Corvallis hosts many foot races during the year, and this application note analyzes the finishing times from two of those: the annual Fall Festival 10K Fun Run, and the one-off Strands 5K. The central limit theorem tells us to expect the running times for a foot race (with enough participants) to be normally distributed. But what happens to that distribution when a large prize is offered? The Fall Festival 10K Fun Run offers a prize of exactly $0, whereas the Strands 5K offered an amazing $10,000 prize.</p>
<p>We can estimate a normal distribution from each of the two data sets, and then use a Kolmogorov-Smirnov test to check whether the data are consistent with that distribution. If the Kolmogorov-Smirnov null hypothesis is not rejected, then under this statistic the data points are treated as plausibly drawn from the reference distribution (in this case the normal distribution).</p>
<pre lang="csharp">using System.IO;
using CenterSpace.NMath.Core;
using CenterSpace.NMath.Stats;

public static void Main()
{
  // Load fall festival 10K data and strands 5K data
  StreamReader reader = new StreamReader("fall_festival_times.txt", false);
  DoubleVector fallfestival10k = new DoubleVector(reader);
  
  reader = new StreamReader("strands_times.txt", false);
  DoubleVector strands10k = new DoubleVector(reader);

  // Estimate Normal (Gaussian) Distributions and 
  // check the Kolmogorov-Smirnov Test
  NormalDistribution ndist_ff = new NormalDistribution(
    StatsFunctions.Mean(fallfestival10k), StatsFunctions.Variance(fallfestival10k));
  OneSampleKSTest kstest = new OneSampleKSTest(fallfestival10k, ndist_ff);
  bool rejectNH = kstest.Reject; // False

  NormalDistribution ndist_s = new NormalDistribution(
    StatsFunctions.Mean(strands10k), StatsFunctions.Variance(strands10k));
  kstest = new OneSampleKSTest(strands10k, ndist_s);
  rejectNH = kstest.Reject;  // True
}
</pre>
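<p>To make the test less of a black box: the K-S statistic is the largest vertical gap between the empirical CDF of the sample and the reference CDF. Below is a minimal sketch of that computation in plain C#. The <code>KolmogorovSmirnov</code> class is a hypothetical helper, not the NMath implementation, and it computes only the statistic, not the p-value or the reject decision.</p>
<pre lang="csharp">using System;

public static class KolmogorovSmirnov
{
  // D = sup over x of | F_empirical(x) - F_reference(x) |.
  public static double Statistic( double[] sample, Func&lt;double, double&gt; cdf )
  {
    int n = sample.Length;
    double[] sorted = (double[])sample.Clone();
    Array.Sort( sorted );

    double d = 0.0;
    for ( int i = 0; i &lt; n; i++ )
    {
      double f = cdf( sorted[i] );
      // The empirical CDF jumps from i/n to (i+1)/n at sorted[i],
      // so the gap has to be checked on both sides of the jump.
      double above = ( i + 1 ) / (double)n - f;
      double below = f - i / (double)n;
      d = Math.Max( d, Math.Max( above, below ) );
    }
    return d;
  }
}</pre>
<p>A small value of D means the sample tracks the reference CDF closely; the test rejects the null hypothesis when D is larger than a critical value that depends on the sample size and the chosen significance level.</p>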
<p>A look at the data makes the results of the Kolmogorov-Smirnov test look plausible.</p>
<figure id="attachment_1717" aria-describedby="caption-attachment-1717" style="width: 400px" class="wp-caption aligncenter"><a href="http://centerspace.net/blog/wp-content/uploads/2010/03/CDF-of-running-data.png"><img decoding="async" class="size-full wp-image-1717" title="CDF of Fall Festival 10K &amp; the Strands 5K finishing times." src="http://centerspace.net/blog/wp-content/uploads/2010/03/CDF-of-running-data.png" alt="CDF of Fall Festival 10K &amp; the Strands 5K finishing times." width="400" srcset="https://www.centerspace.net/wp-content/uploads/2010/03/CDF-of-running-data.png 562w, https://www.centerspace.net/wp-content/uploads/2010/03/CDF-of-running-data-299x219.png 299w" sizes="(max-width: 562px) 100vw, 562px" /></a><figcaption id="caption-attachment-1717" class="wp-caption-text">CDF of Fall Festival 10K &amp; the Strands 5K finishing times with normalized finishing times.</figcaption></figure>
<p>The Strands 5K finishing times are not normally distributed because the big prize prompted many fast runners to show up and many average runners to enjoy the race from the sidelines. This grouped the finishing times around the winner (many close finishers), so they were no longer normally distributed. This distribution looks more like a <a href="https://en.wikipedia.org/wiki/Weibull_distribution">Weibull</a>, and we can test that intuition with the following code snippet.</p>
<pre lang="csharp">  WeibullDistribution wdist_s = new WeibullDistribution(25,4);

  kstest = new OneSampleKSTest(strands10k, wdist_s);
  rejectNH = kstest.Reject; // now False
</pre>
<p>Let&#8217;s look at the CDF of this Weibull versus the data again.</p>
<figure id="attachment_1721" aria-describedby="caption-attachment-1721" style="width: 400px" class="wp-caption aligncenter"><a href="http://centerspace.net/blog/wp-content/uploads/2010/03/Weibull-CDF-passes-KS-test.png"><img decoding="async" class="size-full wp-image-1721" title="Strands 5K finishing times with a Weibull CDF." src="http://centerspace.net/blog/wp-content/uploads/2010/03/Weibull-CDF-passes-KS-test.png" alt="Strands 5K finishing times with a Weibull CDF." width="400" srcset="https://www.centerspace.net/wp-content/uploads/2010/03/Weibull-CDF-passes-KS-test.png 564w, https://www.centerspace.net/wp-content/uploads/2010/03/Weibull-CDF-passes-KS-test-300x119.png 300w" sizes="(max-width: 564px) 100vw, 564px" /></a><figcaption id="caption-attachment-1721" class="wp-caption-text">Normalized Strands 5K finishing times overlaid on a Weibull CDF.</figcaption></figure>
<h3>Simple Coin Flipping Example</h3>
<p>I&#8217;ll present one more simple example using a discrete distribution. The binomial distribution is used for modeling most coin flipping games, as it represents the distribution of successes in a sequence of independent yes/no questions. The binomial distribution is parameterized by the number of trials, <code>n</code>, and the probability of each independent success, <code>p</code>. For example, using this binomial distribution we can model, say, the number of heads found in a sequence of <code>10</code> coin flips, using <code>n=10</code> and <code>p=1/2</code>.</p>
<figure id="attachment_1683" aria-describedby="caption-attachment-1683" style="width: 280px" class="wp-caption alignleft"><a href="http://centerspace.net/blog/wp-content/uploads/2010/03/BinomialDistribution-n10-p0.5.png"><img decoding="async" loading="lazy" class="size-full wp-image-1683" title="Binomial Distribution n=10, p=0.5" src="http://centerspace.net/blog/wp-content/uploads/2010/03/BinomialDistribution-n10-p0.5.png" alt="Binomial Distribution n=10, p=0.5" width="280" height="204" srcset="https://www.centerspace.net/wp-content/uploads/2010/03/BinomialDistribution-n10-p0.5.png 560w, https://www.centerspace.net/wp-content/uploads/2010/03/BinomialDistribution-n10-p0.5-300x219.png 300w" sizes="(max-width: 280px) 100vw, 280px" /></a><figcaption id="caption-attachment-1683" class="wp-caption-text">Binomial Distribution with n=10 and p=0.5</figcaption></figure>
<p>As expected, the most likely number of heads is 5 (with a probability of <code>0.246</code>), and the probability of getting 4, 5, or 6 heads in 10 flips is the difference of the CDF at 6 and 3, equal to <code>0.656</code>. The CDF at 3 already includes the outcome of exactly 3 heads, so the difference excludes it. Below is the simple C# code used to compute the answers to these questions.</p>
<pre lang="csharp">using CenterSpace.NMath.Core;
using CenterSpace.NMath.Stats;

public static void Main()
{
  int number_of_trials = 10;
  double prob_of_success = 0.5;
  BinomialDistribution dist = 
    new BinomialDistribution(number_of_trials, prob_of_success);

  // Probability of landing 5 heads in 10 flips ( = 0.246)
  Double five_heads = dist.PDF(5); 

  // Probability of landing 4, 5, or 6 heads in 10 flips ( = 0.656)
  // CDF(3) includes exactly 3 heads, so the difference starts at 4.
  Double four_to_six_heads = dist.CDF(6) - dist.CDF(3); 
}
</pre>
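<p>These numbers can be cross-checked directly from the binomial formula <code>P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)</code>. The sketch below uses plain C# with hypothetical helper names (<code>BinomialCheck</code>, <code>Pmf</code>, <code>Range</code>) and no NMath types. Keep in mind that a difference of CDF values, <code>CDF(b) - CDF(a)</code>, equals the sum of the probabilities for <code>k = a + 1</code> through <code>b</code>.</p>
<pre lang="csharp">using System;

public static class BinomialCheck
{
  // P(X = k) for X ~ Binomial(n, p).
  public static double Pmf( int n, double p, int k )
  {
    // Build C(n, k) incrementally to avoid large factorials.
    double coeff = 1.0;
    for ( int i = 1; i &lt;= k; i++ )
      coeff = coeff * ( n - k + i ) / i;
    return coeff * Math.Pow( p, k ) * Math.Pow( 1.0 - p, n - k );
  }

  // P(lo &lt;= X &lt;= hi), summed term by term.
  public static double Range( int n, double p, int lo, int hi )
  {
    double sum = 0.0;
    for ( int k = lo; k &lt;= hi; k++ ) sum += Pmf( n, p, k );
    return sum;
  }
}</pre>
<p>For <code>n = 10</code> and <code>p = 0.5</code>, <code>Pmf( 10, 0.5, 5 )</code> gives 252/1024 and <code>Range( 10, 0.5, 4, 6 )</code> gives 672/1024.</p>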
<p>I hope these code examples can help you get started using the NMath Stats distribution classes quickly and correctly.</p>
<p>-Happy Computing,</p>
<p><em>Paul</em></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/probability-distributions-in-nmath-stats/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1633</post-id>	</item>
	</channel>
</rss>
