<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>Trust Region minimization Archives - CenterSpace</title>
	<atom:link href="https://www.centerspace.net/tag/trust-region-minimization/feed" rel="self" type="application/rss+xml" />
	<link>https://www.centerspace.net/tag/trust-region-minimization</link>
	<description>.NET numerical class libraries</description>
	<lastBuildDate>Thu, 17 Oct 2019 01:13:04 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.1.1</generator>
<site xmlns="com-wordpress:feed-additions:1">104092929</site>	<item>
		<title>Distribution Fitting Demo</title>
		<link>https://www.centerspace.net/distribution-fitting-demo</link>
					<comments>https://www.centerspace.net/distribution-fitting-demo#respond</comments>
		
		<dc:creator><![CDATA[Ken Baldwin]]></dc:creator>
		<pubDate>Mon, 09 Apr 2012 14:49:02 +0000</pubDate>
				<category><![CDATA[NMath Tutorial]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[CDF]]></category>
		<category><![CDATA[CDF C#]]></category>
		<category><![CDATA[gaussian distribution]]></category>
		<category><![CDATA[nonlinear least squares]]></category>
		<category><![CDATA[normal distribution]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[PDF C#]]></category>
		<category><![CDATA[probability distribution]]></category>
		<category><![CDATA[Trust Region minimization]]></category>
		<guid isPermaLink="false">http://www.centerspace.net/blog/?p=3719</guid>

					<description><![CDATA[<p><img class="excerpt" title="Distribution Fit" src="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_pdf.png" alt="Distribution Fit" /><br />
A customer recently asked how to fit a normal (Gaussian) distribution to a vector of experimental data. Here's a demonstration of how to do it.</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/distribution-fitting-demo">Distribution Fitting Demo</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>A customer recently asked how to fit a normal (Gaussian) distribution to a vector of experimental data. Here&#8217;s a demonstration of how to do it.</p>
<p>Let&#8217;s start by creating a data set: 100 values drawn from a normal distribution with known parameters (mean = 0.5, variance = 2.0).</p>
<pre lang="csharp">int n = 100;
double mean = .5;
double variance = 2.0;
var data = new DoubleVector( n, new RandGenNormal( mean, variance ) );</pre>
<p>Now, compute y values based on the empirical cumulative distribution function (CDF), which returns the probability that a random variable X will have a value less than or equal to x&#8211;that is, f(x) = P(X &lt;= x). Here&#8217;s an easy way to do it, although it is not necessarily the most efficient for larger data sets:</p>
<pre lang="csharp">var cdfY = new DoubleVector( data.Length );
var sorted = NMathFunctions.Sort( data );
for ( int i = 0; i &lt; data.Length; i++ )
{
  int j = 0;
  while ( j &lt; sorted.Length &amp;&amp; sorted[j] &lt;= data[i] ) j++;
  cdfY[i] = j / (double)data.Length;
}</pre>
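<p>The data is sorted, then for each value x in the data, we iterate through the sorted vector looking for the first value that is greater than x. The index of that value equals the number of data points less than or equal to x, and dividing it by the total number of points gives the empirical CDF at x.</p>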
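<p>For larger data sets, the linear scan can be replaced by a binary search over the sorted values. The following is a minimal sketch of that idea and is not part of the original demo: it uses plain double[] arrays rather than DoubleVector so that it assumes no NMath API, and the names EmpiricalCdfSketch and Compute are hypothetical.</p>
<pre lang="csharp">using System;

static class EmpiricalCdfSketch
{
  // Returns, for each x[i], the fraction of values in x that are &lt;= x[i].
  public static double[] Compute( double[] x )
  {
    int n = x.Length;
    var sorted = (double[])x.Clone();
    Array.Sort( sorted );

    var y = new double[n];
    for ( int i = 0; i &lt; n; i++ )
    {
      // Binary search for the index of the first element greater than x[i];
      // that index equals the number of elements &lt;= x[i].
      int lo = 0, hi = n;
      while ( lo &lt; hi )
      {
        int mid = lo + ( hi - lo ) / 2;
        if ( sorted[mid] &lt;= x[i] ) lo = mid + 1;
        else hi = mid;
      }
      y[i] = lo / (double)n;
    }
    return y;
  }
}</pre>
<p>For the input { 3.0, 1.0, 2.0, 2.0 }, Compute returns { 1.0, 0.25, 0.75, 0.75 }. The sort dominates the cost, so the whole computation is O(n log n) rather than O(n squared).</p>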
<p>We&#8217;ll use one of NMath&#8217;s non-linear least squares minimization routines to fit a normal distribution CDF() function to our empirical CDF. NMath provides classes for fitting generalized one variable functions to a set of points. In the space of the function parameters, beginning at a specified starting point, these classes find a minimum (possibly local) in the sum of the squared residuals with respect to a set of data points.</p>
<p>A one variable function takes a single double x, and returns a double y:</p>
<pre class="code">y = f(x)</pre>
<p>A <em>generalized</em> one variable function additionally takes a set of parameters, p, which may appear in the function expression in arbitrary ways:</p>
<pre class="code">y = f(p1, p2,..., pn; x)</pre>
<p>For example, this code computes y=a*sin(b*x + c):</p>
<pre lang="csharp">public double MyGeneralizedFunction( DoubleVector p, double x )
{
  return p[0] * Math.Sin( p[1] * x + p[2] );
}</pre>
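<p>To make the fitting objective concrete, here is a minimal sketch of the sum of squared residuals that a least squares fitter minimizes over the parameters p. It illustrates the quantity only and is not how NMath computes it internally; it uses a plain Func over double[] instead of DoubleVector, and the names ResidualSketch and SumOfSquares are hypothetical.</p>
<pre lang="csharp">using System;

static class ResidualSketch
{
  // Sum of squared residuals: S(p) = sum over i of ( y[i] - f(p, x[i]) )^2.
  public static double SumOfSquares(
    Func&lt;double[], double, double&gt; f, double[] p, double[] x, double[] y )
  {
    double sum = 0.0;
    for ( int i = 0; i &lt; x.Length; i++ )
    {
      double residual = y[i] - f( p, x[i] );
      sum += residual * residual;
    }
    return sum;
  }
}</pre>
<p>With f defined as ( p, x ) =&gt; p[0] * Math.Sin( p[1] * x + p[2] ), the result is 0 when y was generated by f with the same p, and it grows as the parameters move away from the values that produced the data.</p>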
<p>In the distribution fitting example, we want to define a parameterized function delegate that returns CDF(x) for the distribution described by the given parameters (mean, variance):</p>
<pre lang="csharp">Func&lt;DoubleVector, double, double&gt; f =
  ( DoubleVector p, double x ) =&gt;
    new NormalDistribution( p[0], p[1] ).CDF( x );</pre>
<p>Now that we have our data and the function we want to fit, we can apply the curve fitting routine. We&#8217;ll use a bounded function fitter, because the variance of the fitted normal distribution must be constrained to be greater than 0.</p>
<pre lang="csharp">var fitter = new BoundedOneVariableFunctionFitter&lt;TrustRegionMinimizer&gt;( f );

// starting guess for (mean, variance)
var start = new DoubleVector( new double[] { 0.1, 0.1 } );

// mean is effectively unbounded; variance is bounded below by 0
var lowerBounds = new DoubleVector( new double[] { Double.MinValue, 0 } );
var upperBounds =
   new DoubleVector( new double[] { Double.MaxValue, Double.MaxValue } );
var solution = fitter.Fit( data, cdfY, start, lowerBounds, upperBounds );
var fit = new NormalDistribution( solution[0], solution[1] );

Console.WriteLine( "Fitted distribution:\nmean={0}\nvariance={1}",
  fit.Mean, fit.Variance );</pre>
<p>The output for one run is</p>
<pre class="code">Fitted distribution: 
mean=0.567334190790594
variance=2.0361207956132</pre>
<p>which is a reasonable approximation to the original distribution (given 100 points).</p>
<p>We can also visually inspect the fit by plotting the original data and the CDF() function of the fitted distribution.</p>
<pre lang="csharp">ToChart( data, cdfY, SeriesChartType.Point, fit,
  NMathStatsChart.DistributionFunction.CDF );

private static void ToChart( DoubleVector x, DoubleVector y,
  SeriesChartType dataChartType, NormalDistribution dist,
  NMathStatsChart.DistributionFunction distFunction )
{
  var chart = NMathStatsChart.ToChart( dist, distFunction );
  chart.Series[0].Name = "Fit";

  var series = new Series() {
    Name = "Data",
    ChartType = dataChartType
  };
  series.Points.DataBindXY( x, y );
  chart.Series.Insert( 0, series );

  chart.Legends.Add( new Legend() );
  NMathChart.Show( chart );
}</pre>
<p><a href="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_cdf.png"><img decoding="async" class="aligncenter size-full wp-image-3727" title="distribution_fit_cdf" src="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_cdf.png" alt="CDF() of fitted distribution" width="482" height="488" srcset="https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_cdf.png 482w, https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_cdf-296x300.png 296w" sizes="(max-width: 482px) 100vw, 482px" /></a></p>
<p>We can also look at the probability density function (PDF) of the fitted distribution, but to do so we must first construct an empirical PDF using a histogram. The x-values are the midpoints of the histogram bins, and the y-values are the histogram counts converted to probabilities, scaled to integrate to 1.</p>
<pre lang="csharp">int numBins = 10;
var hist = new Histogram( numBins, data );

var pdfX = new DoubleVector( hist.NumBins );
var pdfY = new DoubleVector( hist.NumBins );
for ( int i = 0; i &lt; hist.NumBins; i++ )
{
  // use bin midpoint for x value
  Interval bin = hist.Bins[i];
  pdfX[i] = ( bin.Min + bin.Max ) / 2;

  // convert histogram count to probability for y value
  double binWidth = bin.Max - bin.Min;
  pdfY[i] = hist.Count( i ) / ( data.Length * binWidth );
}

ToChart( pdfX, pdfY, SeriesChartType.Column, fit,
  NMathStatsChart.DistributionFunction.PDF );</pre>
<p><a href="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_pdf.png"><img decoding="async" loading="lazy" class="aligncenter size-full wp-image-3728" title="distribution_fit_pdf" src="https://www.centerspace.net/blog/wp-content/uploads/2012/04/distribution_fit_pdf.png" alt="PDF() of fitted distribution" width="485" height="484" srcset="https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_pdf.png 485w, https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_pdf-150x150.png 150w, https://www.centerspace.net/wp-content/uploads/2012/04/distribution_fit_pdf-300x300.png 300w" sizes="(max-width: 485px) 100vw, 485px" /></a></p>
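<p>The scaling used above can be checked numerically: dividing each count by the total count times the bin width gives density values whose areas sum to 1. The sketch below shows this with plain arrays instead of the NMath Histogram class; the names HistogramDensitySketch, ToDensity and Area are hypothetical, and it assumes every data point falls inside one of the bins, so that the counts add up to the number of points.</p>
<pre lang="csharp">using System;

static class HistogramDensitySketch
{
  // Converts bin counts to density values: count / ( total * binWidth ).
  // edges holds the bin boundaries, so it has one more entry than counts.
  public static double[] ToDensity( int[] counts, double[] edges )
  {
    int total = 0;
    foreach ( int c in counts ) total += c;

    var density = new double[counts.Length];
    for ( int i = 0; i &lt; counts.Length; i++ )
    {
      double binWidth = edges[i + 1] - edges[i];
      density[i] = counts[i] / ( total * binWidth );
    }
    return density;
  }

  // Sums density * binWidth over all bins; the result should be close to 1.
  public static double Area( double[] density, double[] edges )
  {
    double area = 0.0;
    for ( int i = 0; i &lt; density.Length; i++ )
      area += density[i] * ( edges[i + 1] - edges[i] );
    return area;
  }
}</pre>
<p>For counts { 2, 5, 3 } and edges { 0, 1, 2, 3 }, Area returns 1 up to floating-point rounding. The same holds for bins of unequal width, because each count is divided by the width of its own bin.</p>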
<p>You might be tempted to fit a distribution PDF() function directly to the histogram data, rather than fitting the CDF() function as we did above, but this is problematic for several reasons. The bin counts have different variability than the original data. They also have a fixed sum, so they are not independent measurements. Finally, for continuous data, fitting a model to aggregated histogram counts rather than to the original data throws away information.</p>
<p>Ken</p>
<p>Download the <a href="https://drive.google.com/open?id=1KlctDEKniD8SdmQiBGmcrJWMuhvU-WYP">source code</a></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/distribution-fitting-demo/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">3719</post-id>	</item>
	</channel>
</rss>
