<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>Statistics Archives - CenterSpace</title>
	<atom:link href="https://www.centerspace.net/category/statistics/feed" rel="self" type="application/rss+xml" />
	<link>https://www.centerspace.net/category/statistics</link>
	<description>.NET numerical class libraries</description>
	<lastBuildDate>Sat, 17 Jul 2021 20:09:29 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.1.1</generator>
<site xmlns="com-wordpress:feed-additions:1">104092929</site>	<item>
		<title>Fitting the Weibull Distribution</title>
		<link>https://www.centerspace.net/fitting-the-weibull-distribution</link>
					<comments>https://www.centerspace.net/fitting-the-weibull-distribution#respond</comments>
		
		<dc:creator><![CDATA[Paul Shirkey]]></dc:creator>
		<pubDate>Wed, 24 Jul 2019 18:30:45 +0000</pubDate>
				<category><![CDATA[NMath]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[.NET weibull]]></category>
		<category><![CDATA[C# weibull]]></category>
		<category><![CDATA[fitting the Weibull distribution]]></category>
		<category><![CDATA[Weibull]]></category>
		<category><![CDATA[weibull distribution]]></category>
		<guid isPermaLink="false">https://www.centerspace.net/?p=7434</guid>

					<description><![CDATA[<p>The Weibull distribution is widely used in reliability analysis, hazard analysis, modeling of part failure rates, and many other applications. The NMath library currently includes 19 probability distributions and has recently added a fitting function to the Weibull distribution class at the request of a customer. The Weibull probability distribution, over the random variable [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/fitting-the-weibull-distribution">Fitting the Weibull Distribution</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>The Weibull distribution is widely used in reliability analysis, hazard analysis, modeling of part failure rates, and many other applications.  The <strong>NMath </strong>library currently includes 19 probability distributions and has recently added a fitting function to the Weibull distribution class at the request of a customer.  </p>



<p>The Weibull probability distribution, over the random variable <em>x</em>, has two parameters:</p>



<ul><li>k &gt; 0, is the <em>shape parameter</em></li><li>λ &gt; 0, is the <em>scale parameter </em></li></ul>



<p>Frequently engineers have data that is known to be well modeled by the Weibull distribution, but the shape and scale parameters are unknown. In this case a data fitting strategy can be used; <strong>NMath </strong>now has a maximum likelihood Weibull fitting function, demonstrated in the code example below.</p>



<pre class="wp-block-code"><code>    public void WeibullFit()
    {
      double[] t = new double[] { 16, 34, 53, 75, 93, 120 };
      double initialShape = 2.2;
      double initialScale = 50.0;

      WeibullDistribution fittedDist = WeibullDistribution.Fit( t, initialScale, initialShape );

      // fittedDist.Shape parameter will equal 1.933
      // fittedDist.Scale parameter will equal 73.526
    }</code></pre>



<p>If the Weibull fitting algorithm fails, the returned distribution will be <code>null</code>.  In this case improving the initial parameter guesses can help. The <code>WeibullDistribution.Fit()</code> function accepts either arrays, as seen above, or <code>DoubleVector</code> instances.</p>
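<p>The same maximum likelihood fit can be cross-checked outside of .NET. The sketch below uses Python and SciPy's <code>weibull_min</code> distribution; it is an independent illustration, not NMath code. Fixing the location parameter at zero gives the two-parameter Weibull used above.</p>

```python
import numpy as np
from scipy.stats import weibull_min

# Failure-time data from the example above.
t = np.array([16, 34, 53, 75, 93, 120], dtype=float)

# Maximum-likelihood fit of the two-parameter Weibull distribution.
# floc=0 pins the location parameter at zero, so only the shape (k)
# and scale (lambda) are estimated.
shape, loc, scale = weibull_min.fit(t, floc=0)

# shape should be close to 1.933 and scale close to 73.526,
# matching the NMath result above.
print(f"shape k      = {shape:.3f}")
print(f"scale lambda = {scale:.3f}")
```

Because both libraries maximize the same Weibull likelihood, the fitted shape and scale should agree with the NMath values to the precision of the optimizer.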



<p>The latest version of <strong>NMath</strong>, including this maximum likelihood Weibull fit function, is available on the CenterSpace <a href="https://www.nuget.org/profiles/centerspace">NuGet</a> gallery.</p>



<p></p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/fitting-the-weibull-distribution">Fitting the Weibull Distribution</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/fitting-the-weibull-distribution/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7434</post-id>	</item>
		<item>
		<title>Principal Components Regression: Part 3 – The NIPALS Algorithm</title>
		<link>https://www.centerspace.net/principal-components-regression</link>
					<comments>https://www.centerspace.net/principal-components-regression#respond</comments>
		
		<dc:creator><![CDATA[Steve Sneller]]></dc:creator>
		<pubDate>Tue, 29 Nov 2016 19:23:13 +0000</pubDate>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Theory]]></category>
		<category><![CDATA[NIPALS]]></category>
		<category><![CDATA[PCR]]></category>
		<category><![CDATA[PCR c#]]></category>
		<category><![CDATA[PCR estimator]]></category>
		<category><![CDATA[principal component analysis C#]]></category>
		<category><![CDATA[Principal Components Regression]]></category>
		<guid isPermaLink="false">http://www.centerspace.net/?p=7075</guid>

					<description><![CDATA[<p>In this final entry of our three-part series on Principal Components Regression (PCR) we describe the NIPALS algorithm used to compute the principal components.  This is followed by a theoretical discussion, accessible to non-experts, of why the NIPALS algorithm works. </p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/principal-components-regression">Principal Components Regression: Part 3 – The NIPALS Algorithm</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h2>Principal Components Regression: Recap of Part 2</h2>



<p>Recall that the least squares solution <img src="https://s0.wp.com/latex.php?latex=%5Cbeta&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;beta" class="latex" /> to the multiple linear problem <img src="https://s0.wp.com/latex.php?latex=X+%5Cbeta+%3D+y&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X &#92;beta = y" class="latex" /> is given by<br>(1) <img decoding="async" src="https://s0.wp.com/latex.php?latex=%5Chat%7B%5Cbeta%7D+%3D+%28X%5ET+X%29%5E%7B-1%7D+X%5ET+y+&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;hat{&#92;beta} = (X^T X)^{-1} X^T y " class="latex" /></p>



<p>And that problems occurred finding <img src="https://s0.wp.com/latex.php?latex=%5Chat%7B%5Cbeta%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;hat{&#92;beta}" class="latex" /> when the matrix<br>(2) <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /></p>



<p>was close to being singular. The Principal Components Regression approach to addressing the problem is to replace <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /> in equation (1) with a better conditioned approximation. This approximation is formed by computing the eigenvalue decomposition for <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /> and retaining only the r largest eigenvalues. This yields the PCR solution:<br>(3) <img src="https://s0.wp.com/latex.php?latex=%5Chat%7B%5Cbeta%7D_r%3D+V_1+%5CLambda_1%5E%7B-1%7D+V_1%5ET+X%5ET+y&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;hat{&#92;beta}_r= V_1 &#92;Lambda_1^{-1} V_1^T X^T y" class="latex" /></p>



<p>where <img src="https://s0.wp.com/latex.php?latex=%5CLambda_1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;Lambda_1" class="latex" /> is an r x r diagonal matrix consisting of the r largest eigenvalues of <img src="https://s0.wp.com/latex.php?latex=X%5ET+X%2CV_1%3D%28v_1%2C...%2Cv_r+%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X,V_1=(v_1,...,v_r )" class="latex" /><br>are the corresponding eigenvectors of <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" />. In this piece we shall develop code for computing the PCR solution using the NMath libraries.</p>
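<p>Equation (3) is short enough to state directly in code. The following NumPy sketch of the PCR estimator is an illustration added here, not NMath code; note that when r equals the full rank of X it reduces to the ordinary least squares solution (1).</p>

```python
import numpy as np

def pcr_solution(X, y, r):
    """PCR estimate beta_r = V1 Lambda1^{-1} V1^T X^T y, keeping
    only the r largest eigenvalues of X^T X (equation (3))."""
    # Eigendecomposition of the symmetric matrix X^T X.
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # ascending order
    idx = np.argsort(eigvals)[::-1][:r]          # indices of r largest
    lam1 = eigvals[idx]                          # Lambda_1 (diagonal entries)
    V1 = eigvecs[:, idx]                         # corresponding eigenvectors
    return V1 @ np.diag(1.0 / lam1) @ V1.T @ (X.T @ y)

# With r equal to the full column rank, PCR reproduces ordinary
# least squares, since V Lambda^{-1} V^T = (X^T X)^{-1}.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 4))
y = rng.standard_normal(20)
beta_full = pcr_solution(X, y, 4)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_full, beta_ols))  # True
```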


<p>[eds: This blog article is the final entry of a three-part series on principal components regression. The first article in this series, &#8220;Principal Component Regression: Part 1 – The Magic of the SVD&#8221; is <a href="https://www.centerspace.net/theoretical-motivation-behind-pcr">here</a>. And the second, &#8220;Principal Components Regression: Part 2 – The Problem With Linear Regression&#8221; is <a href="https://www.centerspace.net/priniciple-components-regression-in-csharp">here</a>.]</p>



<h2>Review: Eigenvalues and Singular Values</h2>



<p>In order to develop the algorithm, I want to go back to the Singular Value Decomposition (SVD) of a matrix and its relationship to the eigenvalue decomposition. Recall that the SVD of a matrix X is given by<br>(4) <img src="https://s0.wp.com/latex.php?latex=X%3DU+%5CSigma+V%5ET&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X=U &#92;Sigma V^T" class="latex" /></p>



<p>where U is the matrix of left singular vectors, V is the matrix of right singular vectors, and Σ is a diagonal matrix with positive entries equal to the singular values. The eigenvalue decomposition of <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /> is given by<br>(5) <img src="https://s0.wp.com/latex.php?latex=X%5ET+X%3DV+%5CLambda+V%5ET&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X=V &#92;Lambda V^T" class="latex" /></p>



<p>where the eigenvalues of <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /> are the diagonal entries of the diagonal matrix <img src="https://s0.wp.com/latex.php?latex=%5CLambda&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;Lambda" class="latex" /> and the columns of V are the eigenvectors of <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /> (V is also composed of the right singular vectors of X).<br>Recall further that if the matrix X has rank r then X can be written as<br>(6) <img src="https://s0.wp.com/latex.php?latex=X%3D+%5Csum_%7Bj%3D1%7D%5E%7Br%7D+%5Csigma_j+u_j+v_j%5ET&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X= &#92;sum_{j=1}^{r} &#92;sigma_j u_j v_j^T" class="latex" /></p>



<p>Where <img src="https://s0.wp.com/latex.php?latex=%5Csigma_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;sigma_j" class="latex" /> is the jth singular value (jth diagonal element of the diagonal matrix <img src="https://s0.wp.com/latex.php?latex=%5CSigma&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;Sigma" class="latex" />), <img src="https://s0.wp.com/latex.php?latex=u_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="u_j" class="latex" /> is the jth column of U, and <img src="https://s0.wp.com/latex.php?latex=v_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v_j" class="latex" /> is the jth column of V. An equivalent way of expressing the PCR solution (3) to the least squares problem in terms of the SVD for X is that we’ve replaced X in the solution (1) by its rank r approximation shown in (6).</p>
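<p>The relationships among the SVD (4), the eigenvalue decomposition (5), and the rank-r expansion (6) are easy to verify numerically. Here is a small NumPy sketch, added as an illustration:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 3))

# SVD of X (equation (4)): X = U Sigma V^T.
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

# The eigenvalues of X^T X (equation (5)) are the squared singular
# values of X, and the right singular vectors V diagonalize X^T X.
eigvals = np.linalg.eigvalsh(X.T @ X)                    # ascending order
assert np.allclose(np.sort(eigvals)[::-1], sigma**2)
assert np.allclose(Vt @ (X.T @ X) @ Vt.T, np.diag(sigma**2))

# The rank-r expansion (6) with r = 3 (full rank) rebuilds X exactly.
X_rebuilt = sum(s * np.outer(u, v) for s, u, v in zip(sigma, U.T, Vt))
assert np.allclose(X_rebuilt, X)
print("SVD / eigendecomposition identities verified")
```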



<h2>Principal Components</h2>



<p>The subject here is Principal Components Regression (PCR), but we have yet to mention principal components. All we have talked about are eigenvalues, eigenvectors, singular values, and singular vectors. We’ve seen how singular stuff and eigen stuff are related, but what are principal components?<br>Principal component analysis applies when one considers statistical properties of data. In linear regression each column of our matrix X represents a variable and each row is a set of observed values of these variables. The variables being observed are random variables and as such have means and variances. If we center the matrix X by subtracting from each column of X its corresponding mean, then we’ve normalized the random variables being observed so that they have zero mean. Once the matrix X is centered in this way, the matrix <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /> is then proportional to the variance/covariance matrix of the variables. In this context the eigenvectors of <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /> are called the Principal Components of X. For completeness (and because they are used in discussing the PCR algorithm), we define two more terms.<br>In the SVD given by equation (4), define the matrix T by<br>(7) <img src="https://s0.wp.com/latex.php?latex=T%3DU%5CSigma&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T=U&#92;Sigma" class="latex" /></p>



<p>The matrix T is called the <em>scores</em> for X. Note that the columns of T are orthogonal, but not necessarily orthonormal. Substituting this into the SVD for X yields<br>(8) <img src="https://s0.wp.com/latex.php?latex=X%3DTV%5ET&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X=TV^T" class="latex" /></p>



<p>Using the fact that V is orthogonal we can also write<br>(9) <img src="https://s0.wp.com/latex.php?latex=T%3DXV&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T=XV" class="latex" /></p>



<p>We call the matrix V the <em>loadings</em>. The goal of our algorithm is to obtain the representation given by equation (8) for X, retaining only the most significant principal components (or eigenvalues, or singular values, depending on your point of view).</p>



<h2>Computing the Solution</h2>



<p>Using equation (3) to compute the solution to our problem involves forming the matrix <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" /> and obtaining its eigenvalue decomposition. This approach is fairly straightforward and has reasonable performance for moderately sized matrices X. However, in practice, the matrix X can be quite large, containing hundreds or even thousands of columns. In addition, many procedures for choosing the optimal number r of eigenvalues/singular values to retain involve computing the solution for many different values of r and comparing them. We therefore introduce an algorithm that computes only the number of eigenvalues we need.</p>



<h2>The NIPALS Algorithm</h2>



<p>We will be using an algorithm known as NIPALS (Nonlinear Iterative Partial Least Squares). The NIPALS algorithm for the matrix X in our least squares problem and r, the number of retained principal components, proceeds as follows:<br>Initialize <img src="https://s0.wp.com/latex.php?latex=j%3D1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="j=1" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=X_1%3DX&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X_1=X" class="latex" />. Then iterate through the following steps:</p>



<ol><li>Choose <img src="https://s0.wp.com/latex.php?latex=t_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_j" class="latex" /> as any column of <img src="https://s0.wp.com/latex.php?latex=X_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X_j" class="latex" /></li><li>Let <img src="https://s0.wp.com/latex.php?latex=v_j+%3D+%28X_j%5ET+t_j%29+%2F+%5Cleft+%5C%7C+X_j%5ET+t_j+%5Cright+%5C%7C&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v_j = (X_j^T t_j) / &#92;left &#92;| X_j^T t_j &#92;right &#92;|" class="latex" /></li><li>Let <img src="https://s0.wp.com/latex.php?latex=t_j%3D+X_j+v_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_j= X_j v_j" class="latex" /></li><li>If <img src="https://s0.wp.com/latex.php?latex=t_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_j" class="latex" /> is unchanged continue to step 5. Otherwise return to step 2.</li><li>Let <img src="https://s0.wp.com/latex.php?latex=X_%7Bj%2B1%7D%3D+X_j-+t_j+v_j%5ET&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X_{j+1}= X_j- t_j v_j^T" class="latex" /></li><li>If <img src="https://s0.wp.com/latex.php?latex=j%3Dr&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="j=r" class="latex" /> stop. Otherwise increment j and return to step 1.</li></ol>
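<p>The steps above translate almost line-for-line into code. The following NumPy sketch is an illustration of the iteration, not CenterSpace's implementation; it returns the first r score and loading vectors:</p>

```python
import numpy as np

def nipals(X, r, tol=1e-12, max_iter=1000):
    """Scores T (n x r) and loadings V (k x r) for the r leading
    principal components of X, computed by the NIPALS iteration."""
    Xj = np.asarray(X, dtype=float).copy()
    n, k = Xj.shape
    T, V = np.zeros((n, r)), np.zeros((k, r))
    for j in range(r):
        # Step 1: choose t_j as a column of X_j (largest-norm column).
        t = Xj[:, np.argmax((Xj**2).sum(axis=0))]
        for _ in range(max_iter):
            v = Xj.T @ t
            v /= np.linalg.norm(v)               # step 2: loading v_j
            t_new = Xj @ v                       # step 3: score t_j
            converged = np.linalg.norm(t_new - t) < tol
            t = t_new
            if converged:                        # step 4: t_j unchanged?
                break
        T[:, j], V[:, j] = t, v
        Xj -= np.outer(t, v)                     # step 5: deflate X_j
    return T, V

rng = np.random.default_rng(2)
X = rng.standard_normal((30, 5))

# With r equal to the rank of X, the scores and loadings reproduce
# X exactly (equation (8)), and the loadings are orthonormal.
T, V = nipals(X, 5)
print(np.allclose(T @ V.T, X))           # True
print(np.allclose(V.T @ V, np.eye(5)))   # True
```

The inner loop (steps 2–4) is the power method noted below, so the first loading converges to the dominant right singular vector of X.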



<h2>Properties of the NIPALS Algorithm</h2>



<p>Let us see how the NIPALS algorithm produces principal components for us.<br>Let <img src="https://s0.wp.com/latex.php?latex=%5Clambda_j+%3D+%5Cleft+%5C%7C+X%5ET+t_j+%5Cright+%5C%7C&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;lambda_j = &#92;left &#92;| X^T t_j &#92;right &#92;|" class="latex" /> and write step (2) as<br>(10) <img src="https://s0.wp.com/latex.php?latex=X%5ET+t_j+%3D+%5Clambda_j+v_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T t_j = &#92;lambda_j v_j" class="latex" /></p>



<p>Setting <img src="https://s0.wp.com/latex.php?latex=t_j+%3D+X+v_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_j = X v_j" class="latex" /> in step 3 yields<br>(11) <img src="https://s0.wp.com/latex.php?latex=X%5ET+X+v_j%3D+%5Clambda_j+v_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X v_j= &#92;lambda_j v_j" class="latex" /></p>



<p>This equation is satisfied upon completion of the loop 2-4. This shows that <img src="https://s0.wp.com/latex.php?latex=%5Clambda_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;lambda_j" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=v_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v_j" class="latex" /> are an eigenvalue and eigenvector of <img src="https://s0.wp.com/latex.php?latex=X%5ET+X&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X^T X" class="latex" />. The astute reader will note that the loop 2-4 is essentially the power method for computing a dominant eigenvalue and eigenvector for a linear transformation. Note further that using <img src="https://s0.wp.com/latex.php?latex=t_j%3DX+v_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_j=X v_j" class="latex" /> and equation (11) we obtain<br>(12)</p>



<ul><li><img src="https://s0.wp.com/latex.php?latex=t_j%5ET+t_j%3D+v_j%5ET+X%5ET+Xv_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_j^T t_j= v_j^T X^T Xv_j" class="latex" /></li><li><img src="https://s0.wp.com/latex.php?latex=%3D+v_j%5ET+%28X%5ET+Xv_j+%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="= v_j^T (X^T Xv_j )" class="latex" /></li><li><img src="https://s0.wp.com/latex.php?latex=%3D+%5Clambda_j+v_j%5ET+v_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="= &#92;lambda_j v_j^T v_j" class="latex" /></li><li><img src="https://s0.wp.com/latex.php?latex=%3D+%5Clambda_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="= &#92;lambda_j" class="latex" /></li></ul>



<p>After one iteration of the NIPALS algorithm we end up at step 5 with <img src="https://s0.wp.com/latex.php?latex=j%3D1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="j=1" class="latex" /> and<br>(13) <img src="https://s0.wp.com/latex.php?latex=X%3D+t_1+v_1%5ET%2B+X_2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X= t_1 v_1^T+ X_2" class="latex" /></p>



<p>Note that <img src="https://s0.wp.com/latex.php?latex=t_1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_1" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=X_2%3DX+-+t_1+v_1%5ET&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X_2=X - t_1 v_1^T" class="latex" /><br>are orthogonal:<br>(14)</p>



<ul><li><img src="https://s0.wp.com/latex.php?latex=%28X-+t_1+v_1%5ET+%29%5ET+t_1+%3D+X%5ET+t_1-+v_1+t_1%5ET+t_1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="(X- t_1 v_1^T )^T t_1 = X^T t_1- v_1 t_1^T t_1" class="latex" /></li><li><img src="https://s0.wp.com/latex.php?latex=%3D+X%5ET+X+v_1-+v_1+%5Clambda_1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="= X^T X v_1- v_1 &#92;lambda_1" class="latex" /></li><li><img src="https://s0.wp.com/latex.php?latex=%3D0&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="=0" class="latex" /></li></ul>



<p>Furthermore, since <img src="https://s0.wp.com/latex.php?latex=t_2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_2" class="latex" /> is initially picked as a column of <img src="https://s0.wp.com/latex.php?latex=X_2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X_2" class="latex" />, it is orthogonal to <img src="https://s0.wp.com/latex.php?latex=t_1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_1" class="latex" />. Upon completion of the algorithm we form the following two matrices:</p>



<ul><li><img src="https://s0.wp.com/latex.php?latex=T_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T_r" class="latex" />, whose columns are the vectors <img src="https://s0.wp.com/latex.php?latex=t_i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_i" class="latex" />; the columns of <img src="https://s0.wp.com/latex.php?latex=T_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T_r" class="latex" /> are mutually orthogonal</li><li><img src="https://s0.wp.com/latex.php?latex=V_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="V_r" class="latex" />, whose columns are the <img src="https://s0.wp.com/latex.php?latex=v_i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v_i" class="latex" />; <img src="https://s0.wp.com/latex.php?latex=V_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="V_r" class="latex" /> is orthonormal.</li></ul>



<p>and we define<br>(15) <img src="https://s0.wp.com/latex.php?latex=X_r%3DT_r+V_r%5ET&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X_r=T_r V_r^T" class="latex" /></p>



<p>If r is equal to the rank of X then, using the information obtained from equations (12) and (14), it follows that (15) yields the matrix decomposition (8). The idea behind Principal Components Regression is that after choosing an appropriate r the important features of X have been captured in <img src="https://s0.wp.com/latex.php?latex=T_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T_r" class="latex" />. We then perform a linear regression with <img src="https://s0.wp.com/latex.php?latex=T_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T_r" class="latex" /> in place of X,<br>(16) <img src="https://s0.wp.com/latex.php?latex=T_r+c%3Dy&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T_r c=y" class="latex" />.</p>



<p>The least squares solution then gives<br>(17) <img src="https://s0.wp.com/latex.php?latex=%5Chat%7Bc%7D%3D+%28T_r%5ET+T_r+%29%5E%7B-1%7D+T_r%5ET+y&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;hat{c}= (T_r^T T_r )^{-1} T_r^T y" class="latex" /></p>



<p>Note that since the columns of <img src="https://s0.wp.com/latex.php?latex=T_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T_r" class="latex" /> are orthogonal, the matrix <img src="https://s0.wp.com/latex.php?latex=T_r%5ET+T_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="T_r^T T_r" class="latex" /> is diagonal and therefore easy to invert. Also note that we left out the loadings matrix <img src="https://s0.wp.com/latex.php?latex=V_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="V_r" class="latex" />. This is because the scores <img src="https://s0.wp.com/latex.php?latex=t_j&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t_j" class="latex" /> are linear combinations of the columns of X, and the PCR method amounts to singling out those combinations that are best for predicting y. Finally, using (9) and (16) we rewrite our linear regression problem <img src="https://s0.wp.com/latex.php?latex=X+%5Cbeta%3Dy&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="X &#92;beta=y" class="latex" /> as<br>(18) <img src="https://s0.wp.com/latex.php?latex=XV_r+c%3Dy&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="XV_r c=y" class="latex" /></p>



<p>From (18) we see that the PCR estimation <img src="https://s0.wp.com/latex.php?latex=%5Chat%7B%5Cbeta%7D_r&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;hat{&#92;beta}_r" class="latex" /> is given by<br>(19) <img src="https://s0.wp.com/latex.php?latex=%5Chat%7B%5Cbeta%7D_r%3D+V_r+%5Chat%7Bc%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;hat{&#92;beta}_r= V_r &#92;hat{c}" class="latex" />.</p>
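<p>Equations (16) through (19) combine into a short computation. The NumPy sketch below is an illustration (scores and loadings are taken from the SVD, per equations (4) and (7)): it regresses y on the scores and maps the coefficients back through the loadings to obtain the PCR estimate.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((40, 6))
X -= X.mean(axis=0)                  # center the explanatory variables
y = rng.standard_normal(40)

r = 3
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
Tr = U[:, :r] * sigma[:r]            # scores T_r = U_r Sigma_r (equation (7))
Vr = Vt[:r].T                        # loadings V_r

# The columns of T_r are orthogonal, so T_r^T T_r is the diagonal
# matrix of squared singular values, and (17) is an elementwise divide.
assert np.allclose(Tr.T @ Tr, np.diag(sigma[:r]**2))
c_hat = (Tr.T @ y) / sigma[:r]**2    # least squares on the scores (17)
beta_r = Vr @ c_hat                  # PCR estimate (19)

# This agrees with the eigendecomposition form (3) from the recap.
beta_eq3 = Vr @ np.diag(1.0 / sigma[:r]**2) @ Vr.T @ (X.T @ y)
print(np.allclose(beta_r, beta_eq3))  # True
```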



<p>Steve</p>



<p></p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/principal-components-regression">Principal Components Regression: Part 3 – The NIPALS Algorithm</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/principal-components-regression/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7075</post-id>	</item>
		<item>
		<title>Principal Components Regression: Part 2 &#8211; The Problem With Linear Regression</title>
		<link>https://www.centerspace.net/priniciple-components-regression-in-csharp</link>
					<comments>https://www.centerspace.net/priniciple-components-regression-in-csharp#comments</comments>
		
		<dc:creator><![CDATA[Steve Sneller]]></dc:creator>
		<pubDate>Thu, 04 Mar 2010 17:17:07 +0000</pubDate>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Theory]]></category>
		<category><![CDATA[PCR]]></category>
		<category><![CDATA[PCR c#]]></category>
		<category><![CDATA[PCR estimator]]></category>
		<category><![CDATA[principal component analysis C#]]></category>
		<category><![CDATA[principal component regression]]></category>
		<guid isPermaLink="false">http://centerspace.net/blog/?p=1816</guid>

					<description><![CDATA[<p>Multiple Linear Regression (MLR) is a powerful approach to modeling the relationship between two or more <em>explanatory</em> variables and a <em>response </em>variable by fitting a linear equation to observed data.  This is the second part in a three-part series on PCR. </p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/priniciple-components-regression-in-csharp">Principal Components Regression: Part 2 &#8211; The Problem With Linear Regression</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><small> This is the second part in a three part series on PCR, the first article on the topic can be found <a href="/theoretical-motivation-behind-pcr/">here</a>.</small></p>
<h3><strong>The Linear Regression Model</strong></h3>
<p>Multiple Linear Regression (MLR) is a common approach to modeling the relationship between two or more <em>explanatory</em> variables and a <em>response </em>variable by fitting a linear equation to observed data. First let’s set up some notation. I will be rather brief, assuming the audience is somewhat familiar with MLR.</p>
<p>In multiple linear regression it is assumed that a <em>response variable</em>, <img decoding="async" title="Y" src="http://latex.codecogs.com/gif.latex?Y" alt="" /> depends on k <em>explanatory variables</em>, <img decoding="async" title="X_1,...X_k" src="http://latex.codecogs.com/gif.latex?X_1,...,X_k" alt="" />, by way of a linear relationship:</p>
<p><img decoding="async" title="Y=b_1X_1+b_2X_2...+b_kX_k" src="http://latex.codecogs.com/gif.latex?Y=b_1X_1+b_2X_2...+b_kX_k" alt="" /></p>
<p>The idea is to perform several observations of the response and explanatory variables and then to choose the linear coefficients <img decoding="async" title="b_1,...b_k" src="http://latex.codecogs.com/gif.latex?b_1,...b_k" alt="" /> which best fit the observed data.</p>
<p>Thus, a multiple linear regression model is:</p>
<p><img decoding="async" title="y_{i} = c + x_{i1}b_{1} + \cdots + x_{ik}b_{k} + f" src="http://latex.codecogs.com/gif.latex?y_{i} = c + x_{i1}b_{1} + \cdots + x_{ik}b_{k} + f_{i}" alt="" /><br />
<img decoding="async" title="i = 1,\cdots,n,\textrm{ where:}" src="http://latex.codecogs.com/gif.latex?i = 1,\cdots,n,\textrm{ where:}" alt="" /><br />
<img decoding="async" title="y_{i}\textrm{ is the }i\textrm{th value of the response variable}" src="http://latex.codecogs.com/gif.latex?y_{i}\textrm{ is the }i\textrm{th value of the response variable}" alt="" /><br />
<img decoding="async" title="x_{ij}\textrm{ is the }i\textrm{th value of the }j\textrm{th explanatory variable}" src="http://latex.codecogs.com/gif.latex?x_{ij}\textrm{ is the }i\textrm{th value of the }j\textrm{th explanatory variable}" alt="" /><br />
<img decoding="async" title="n \textrm{ is the sample size}" src="http://latex.codecogs.com/gif.latex?n \textrm{ is the sample size}" alt="" /><br />
<img decoding="async" title="k \textrm{ is the number of }x \textrm{-variables}" src="http://latex.codecogs.com/gif.latex?k \textrm{ is the number of }x \textrm{-variables}" alt="" /><br />
<img decoding="async" title="c \textrm{ is the intercept of the regression model}" src="http://latex.codecogs.com/gif.latex?c \textrm{ is the intercept of the regression model}" alt="" /><br />
<img decoding="async" title="b_{j} \textrm{ is the regression coefficient for the }j\textrm{th explanatory variable}" src="http://latex.codecogs.com/gif.latex?b_{j} \textrm{ is the regression coefficient for the }j\textrm{th explanatory variable}" alt="" /><br />
<img decoding="async" title="f\textrm{ is the random noise term, assumed independent, with zero mean and common variance }\sigma ^{2}" src="http://latex.codecogs.com/gif.latex?f\textrm{ is the random noise term, assumed independent, with zero mean and common variance }\sigma ^{2}" alt="" /><br />
<img decoding="async" title="c, b_{1},\cdots,b_{k}\textrm{ and }\sigma^{2}\textrm{ are unknown parameters, to be estimated from the data.}" src="http://latex.codecogs.com/gif.latex?c, b_{1},\cdots,b_{k}\textrm{ and }\sigma^{2}\textrm{ are unknown parameters, to be estimated from the data.}" alt="" /></p>
<p>In matrix notation we have</p>
<p><img decoding="async" title="Xb = y" src="http://latex.codecogs.com/gif.latex?\textrm{(1)  }Xb = y + f" alt="" /></p>
<p>where</p>
<p><img decoding="async" title="X = (x_{ij})\textrm{, } y = (y_{i}) \textrm{, and } f = f_{i}" src="http://latex.codecogs.com/gif.latex?X = (x_{ij})\textrm{, } y = (y_{i}) \textrm{, and } f = f_{i}" alt="" />.</p>
<p>The solution for the coefficient vector <img decoding="async" title="b" src="http://latex.codecogs.com/gif.latex?b" alt="" /> which “best” fits the data is given by the so-called “normal equations”</p>
<p><img decoding="async" title="\textrm{(2)  }\widehat{b}=(X'X)^{-1}X'y" src="http://latex.codecogs.com/gif.latex?\textrm{(2)  }\widehat{b}=(X'X)^{-1}X'y" alt="" /></p>
<p>This is known as the least squares solution to the problem because it minimizes the sum of the squares of the errors.</p>
<p>Now, consider the following example in which<br />
<img decoding="async" title="X=\begin{bmatrix} 1 &amp; 1.9\ 1 &amp; 2.1\ 1 &amp; 2\\ 1&amp; 2\\ 1 &amp; 1.8 \end{bmatrix}" src="http://latex.codecogs.com/gif.latex?X=\begin{bmatrix} 1 &amp; 1.9\\ 1 &amp; 2.1\\ 1 &amp; 2\\ 1&amp; 2\\ 1 &amp; 1.8 \end{bmatrix}" alt="" /></p>
<p>and</p>
<p><img decoding="async" title="y=\begin{bmatrix} 6.0521\\ 7.0280\\ 7.1230\\ 4.4441\\ 5.0813 \end{bmatrix}" src="http://latex.codecogs.com/gif.latex?y=\begin{bmatrix} 6.0521\\ 7.0280\\ 7.1230\\ 4.4441\\ 5.0813 \end{bmatrix}" alt="" /></p>
<p>Solving this simple linear regression model using the normal equations yields</p>
<p><img decoding="async" title="\widehat{b}=\begin{bmatrix} -4.2489\\ 5.2013 \end{bmatrix}" src="http://latex.codecogs.com/gif.latex?\widehat{b}=\begin{bmatrix} -4.2489\\ 5.2013 \end{bmatrix}" alt="" /></p>
<p>which is quite far off from the actual solution</p>
<p><img decoding="async" title="b=\begin{bmatrix} 2\\ 2 \end{bmatrix}" src="http://latex.codecogs.com/gif.latex?b=\begin{bmatrix} 2\\ 2 \end{bmatrix}" alt="" /></p>
<p>The reason is that the matrix <img decoding="async" title="X'X" src="http://latex.codecogs.com/gif.latex?X'X" alt="" /> is ill-conditioned. Since the second column of <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> is approximately twice the first, the matrix <img decoding="async" title="X'X" src="http://latex.codecogs.com/gif.latex?X'X" alt="" /> is nearly singular.</p>
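<p>To see the ill-conditioning numerically, the normal equations for this toy example can be solved directly in a few lines of plain C# (a sketch without NMath; since X has only two columns, the 2x2 inverse is written out by hand):</p>
<pre lang="csharp">
double[] x2 = { 1.9, 2.1, 2.0, 2.0, 1.8 };  // second column of X (the first is all ones)
double[] y  = { 6.0521, 7.0280, 7.1230, 4.4441, 5.0813 };
int m = x2.Length;

// Accumulate the entries of X'X and X'y.
double s11 = m, s12 = 0, s22 = 0, t1 = 0, t2 = 0;
for ( int i = 0; i &lt; m; i++ )
{
  s12 += x2[i];
  s22 += x2[i] * x2[i];
  t1  += y[i];
  t2  += x2[i] * y[i];
}

// The determinant of X'X is tiny relative to its entries --
// the matrix is nearly singular.
double det = s11 * s22 - s12 * s12;
Console.WriteLine( "det(X'X) = {0:F4}", det );        // 0.2600

// Invert the 2x2 matrix X'X and apply it to X'y.
double b1 = (  s22 * t1 - s12 * t2 ) / det;
double b2 = ( -s12 * t1 + s11 * t2 ) / det;
Console.WriteLine( "b = ({0:F4}, {1:F4})", b1, b2 );  // (-4.2489, 5.2013)
</pre>
<p>The determinant of X'X is only 0.26 even though its entries are on the order of 10; this near-singularity is what swings the estimate so far from the true coefficients.</p>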
<p>One solution to this problem would be to change the model. Since the second column is approximately twice the first, these two explanatory variables encode essentially the same information, so we could remove one of them from the model.<br />
However, it is usually not as easy to identify the source of the bad conditioning as it is in this example.</p>
<p>Another method for removing information from a model that is responsible for impreciseness in the least squares solution is offered by the technique of <em>principal component regression </em>(PCR). Henceforth we shall assume that the data in the matrix <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> is <em>centered</em>. By this we mean that the mean of each explanatory variable has been subtracted from each column of X so that the explanatory variables all have mean zero. In particular this implies that the matrix <img decoding="async" title="X'X" src="http://latex.codecogs.com/gif.latex?X'X" alt="" /> is proportional to the covariance matrix for the explanatory variables.</p>
<h3>Removing the Source of Imprecision</h3>
<p>Let <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> be an mxn matrix, and recall from the part 1 of this series that we can write <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" /> as</p>
<p><img decoding="async" title="X^TX=V \Lambda V^T" src="http://latex.codecogs.com/gif.latex?X^TX=V \Lambda V^T" alt="" /></p>
<p>where <img decoding="async" title="\Lambda" src="http://latex.codecogs.com/gif.latex?\Lambda" alt="" /> is a diagonal matrix containing the eigenvalues (in ascending order down the diagonal) of <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" />, and <img decoding="async" title="V" src="http://latex.codecogs.com/gif.latex?V" alt="" /> is orthogonal. The condition number <img decoding="async" title="\kappa (X^TX)" src="http://latex.codecogs.com/gif.latex?\kappa (X^TX)" alt="" /> for <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" /> is just the absolute value of the ratio of the largest and smallest eigenvalues:</p>
<p><img decoding="async" title="\kappa(X^TX)=\left | \frac{\lambda_{max}}{\lambda_{min}} \right |" src="http://latex.codecogs.com/gif.latex?\kappa(X^TX)=\left | \frac{\lambda_{max}}{\lambda_{min}} \right |" alt="" /></p>
<p>Thus we can see that if the smallest eigenvalue is much smaller than the largest eigenvalue, we get a very large condition number, which implies a poorly conditioned matrix. The idea, then, is to remove these small eigenvalues from <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" />, giving us a better-conditioned approximation to <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" />. To this end, suppose that we wish to retain the r (r less than or equal to n) largest eigenvalues of <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" /> in our approximation, and thus write</p>
<p><img decoding="async" title="X^TX=(V_1,V_2)\begin{pmatrix} \Lambda_1 &amp; 0 \\ 0 &amp; \Lambda_2 \end{pmatrix} \begin{pmatrix} V_1^T \\ V_2^T \end{pmatrix}" src="http://latex.codecogs.com/gif.latex?X^TX=(V_1,V_2)\begin{pmatrix} \Lambda_1 &amp; 0 \\ 0 &amp; \Lambda_2 \end{pmatrix} \begin{pmatrix} V_1^T \\ V_2^T \end{pmatrix}" alt="" />,</p>
<p>where</p>
<p><img decoding="async" title="\Lambda_1" src="http://latex.codecogs.com/gif.latex?\Lambda_1" alt="" /> is an r x r diagonal matrix consisting of the r largest eigenvalues of <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" />, <img decoding="async" title="\Lambda_2" src="http://latex.codecogs.com/gif.latex?\Lambda_2" alt="" /> is a (n-r) x (n-r) diagonal matrix consisting of the remaining n – r eigenvalues of <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" />, and the n x n matrix <img decoding="async" title="V=(V_1,V_2)" src="http://latex.codecogs.com/gif.latex?V=(V_1,V_2)" alt="" /> is orthogonal with <img decoding="async" title="V_1=(v_1,...v_r)" src="http://latex.codecogs.com/gif.latex?V_1=(v_1,...v_r)" alt="" /> consisting of the first  r columns of  <img decoding="async" title="V" src="http://latex.codecogs.com/gif.latex?V" alt="" />, and <img decoding="async" title="V_2=(v_{r+1},...v_n)" src="http://latex.codecogs.com/gif.latex?V_2=(v_{r+1},...v_n)" alt="" /> consisting of the remaining n – r columns of  <img decoding="async" title="V" src="http://latex.codecogs.com/gif.latex?V" alt="" />. Using this formulation we can write an approximation <img decoding="async" title="\widehat{X^TX}" src="http://latex.codecogs.com/gif.latex?\widehat{X^TX}" alt="" /> to <img decoding="async" title="X^TX" src="http://latex.codecogs.com/gif.latex?X^TX" alt="" /> using the r largest eigenvalues as</p>
<p><img decoding="async" title="\widehat{X^TX}=V_1 \Lambda_1 V_1^T" src="http://latex.codecogs.com/gif.latex?\widehat{X^TX}=V_1 \Lambda_1 V_1^T" alt="" />.</p>
<p>If we substitute this approximation into the normal equations (2), and do some simplification, we end up with the <em>principal components estimator</em></p>
<p><img decoding="async" title="\textrm{(3) }\widehat{\beta^{(r)}}=V_1 \Lambda_1^{-1} V_1^T X^T y" src="http://latex.codecogs.com/gif.latex?\textrm{(3) }\widehat{\beta^{(r)}}=V_1 \Lambda_1^{-1} V_1^T X^T y" alt="" />.</p>
<p>While we could use equation (3) directly, it is usually not the best way to perform principal components regression. The next article in this series will illustrate an algorithm for PCR and implement it using the NMath libraries.</p>
<p>-Steve</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/priniciple-components-regression-in-csharp">Principal Components Regression: Part 2 &#8211; The Problem With Linear Regression</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/priniciple-components-regression-in-csharp/feed</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1816</post-id>	</item>
		<item>
		<title>Principal Component Regression: Part 1 &#8211; The Magic of the SVD</title>
		<link>https://www.centerspace.net/theoretical-motivation-behind-pcr</link>
					<comments>https://www.centerspace.net/theoretical-motivation-behind-pcr#comments</comments>
		
		<dc:creator><![CDATA[Steve Sneller]]></dc:creator>
		<pubDate>Mon, 08 Feb 2010 17:44:45 +0000</pubDate>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Theory]]></category>
		<category><![CDATA[PCR]]></category>
		<category><![CDATA[PCR c#]]></category>
		<category><![CDATA[principal component regression]]></category>
		<category><![CDATA[singular value decomposition]]></category>
		<category><![CDATA[SVD]]></category>
		<category><![CDATA[svd c#]]></category>
		<guid isPermaLink="false">http://www.centerspace.net/blog/?p=1307</guid>

					<description><![CDATA[<p><img src="https://www.centerspace.net/blog/wp-content/uploads/2010/02/StevePCA_width400.jpg" alt="SVD of a 2x2 matrix" title="SVD of a 2x2 matrix" class="excerpt" /><br />
This is the first part of a multi-part series on Principal Component Regression, or PCR for short. We will eventually end up with a computational algorithm for PCR and code it up using C# using the NMath libraries. PCR is a method for constructing a linear regression model in the case that we have a large number of predictor variables which are highly correlated. Of course, we don't know exactly which variables are correlated, otherwise we'd just throw them out and perform a normal linear regression.</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/theoretical-motivation-behind-pcr">Principal Component Regression: Part 1 &#8211; The Magic of the SVD</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p>This is the first part of a multi-part series on Principal Component Regression, or PCR for short. We will eventually end up with a computational algorithm for PCR and code it up using C# using the NMath libraries. PCR is a method for constructing a linear regression model in the case that we have a large number of predictor variables which are highly correlated. Of course, we don&#8217;t know exactly which variables are correlated, otherwise we&#8217;d just throw them out and perform a normal linear regression.</p>
<p>In order to understand what is going on in the PCR algorithm, we need to know a little bit about the SVD (Singular Value Decomposition). Understanding a bit about the SVD and its relationship to the eigenvalue decomposition will go a long way toward understanding the PCR algorithm.</p>
<h2>The Singular Value Decomposition</h2>
<p>The SVD (Singular Value Decomposition) is one of the most revealing matrix decompositions in linear algebra. It is a bit expensive to compute, but the bounty of information it yields is awe-inspiring. Understanding a little about the SVD will illuminate the Principal Components Regression (PCR) algorithm. The SVD may seem like a deep and mysterious thing; at least I thought it was until I read the chapters covering it in the book  <a href="https://www.iri.upc.edu/people/thomas/Collection/details/5350.html">&#8220;Numerical Linear Algebra&#8221;</a> by Lloyd N. Trefethen and David Bau, III, which I summarize below.<br />
<span id="more-1307"></span><br />
We begin with an easy to state, and not too difficult to prove geometric statement about linear transformations.</p>
<h2>A Geometric Fact</h2>
<p>Let <img decoding="async" title="S" src="http://latex.codecogs.com/gif.latex?S" alt="" /> be the unit sphere in <img decoding="async" title="\mathbb{R}^{n}" src="http://latex.codecogs.com/gif.latex?\mathbb{R}^{n}" alt="" />, and let  <img decoding="async" title="X \in \mathbb{R}^{mxn}" src="http://latex.codecogs.com/gif.latex?X \in \mathbb{R}^{mxn}" alt="" /> be any matrix mapping <img decoding="async" title="\mathbb{R}^{n}" src="http://latex.codecogs.com/gif.latex?\mathbb{R}^{n}" alt="" /> into <img decoding="async" title="\mathbb{R}^{n}" src="http://latex.codecogs.com/gif.latex?\mathbb{R}^{m}" alt="" /> and suppose, for the moment, that <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> has full rank. Then the image, <img decoding="async" title="XS" src="http://latex.codecogs.com/gif.latex?XS" alt="" /> of <img decoding="async" title="S" src="http://latex.codecogs.com/gif.latex?S" alt="" /> under <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> is a hyperellipse in <img decoding="async" title="\mathbb{R}^{n}" src="http://latex.codecogs.com/gif.latex?\mathbb{R}^{m}" alt="" /> (see the book for the proof).</p>
<p><figure id="attachment_1360" aria-describedby="caption-attachment-1360" style="width: 400px" class="wp-caption aligncenter"><a href="https://www.centerspace.net/blog/wp-content/uploads/2010/02/StevePCA_width400.jpg"><img decoding="async" loading="lazy" class="size-full wp-image-1360" title="SVD of a 2x2 matrix" src="https://www.centerspace.net/blog/wp-content/uploads/2010/02/StevePCA_width400.jpg" alt="SVD of a 2x2 matrix" width="400" height="184" srcset="https://www.centerspace.net/wp-content/uploads/2010/02/StevePCA_width400.jpg 400w, https://www.centerspace.net/wp-content/uploads/2010/02/StevePCA_width400-300x138.jpg 300w" sizes="(max-width: 400px) 100vw, 400px" /></a><figcaption id="caption-attachment-1360" class="wp-caption-text">Figure 1.  SVD of a 2x2 matrix</figcaption></figure></p>
<p>Given this fact we make the following definitions (refer to Figure 1.):</p>
<p>Define the singular values ,</p>
<p><img decoding="async" title="\sigma _{1}\cdots\sigma_{n}" src="http://latex.codecogs.com/gif.latex?\sigma _{1}\cdots\sigma_{n}" alt="" /></p>
<p>of <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> to be the lengths of the <img decoding="async" title="n" src="http://latex.codecogs.com/gif.latex?n" alt="" /> principal semiaxes of the hyperellipse <img decoding="async" title="XS" src="http://latex.codecogs.com/gif.latex?XS" alt="" />. It is conventional to assume the singular values are numbered in descending order</p>
<p><img decoding="async" title="\inline \sigma {1}\geq \sigma _{2}\geq\cdots\geq \sigma_{n}" src="http://latex.codecogs.com/gif.latex?\inline \sigma {1}\geq \sigma _{2}\geq\cdots\geq \sigma_{n}" alt="" /></p>
<p>Define the left singular vectors</p>
<p><img decoding="async" title="u_{1},\cdots,u_{m}" src="http://latex.codecogs.com/gif.latex?u_{1},\cdots,u_{n}" alt="" /></p>
<p>to be unit vectors in the direction of the principal semiaxes of <img decoding="async" title="XS" src="http://latex.codecogs.com/gif.latex?XS" alt="" /> and define the right singular vectors,</p>
<p><img decoding="async" title="v_{1}\cdots v_{n}" src="http://latex.codecogs.com/gif.latex?v_{1}\cdots v_{n}" alt="" />,</p>
<p>to be the pre-images of the principal semiaxes of <img decoding="async" title="XS" src="http://latex.codecogs.com/gif.latex?XS" alt="" /> so that</p>
<p><img decoding="async" title="Xv_{i} = \sigma_{i}u_{i}" src="http://latex.codecogs.com/gif.latex?Xv_{i} = \sigma_{i}u_{i}" alt="" />.</p>
<p>In matrix form we have</p>
<p><img decoding="async" title="XV = U \Sigma" src="http://latex.codecogs.com/gif.latex?XV = U \Sigma" alt="" />,</p>
<p>where <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> is the <img decoding="async" src="http://latex.codecogs.com/gif.latex?n\textrm{ x }n" alt="" /> orthonormal matrix whose columns are the right singular vectors of <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" />, <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Sigma" alt="" /> is an <img decoding="async" src="http://latex.codecogs.com/gif.latex?n\textrm{ x }n" alt="" /> diagonal matrix with positive entries equal to the singular values, and <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> is an <img decoding="async" src="http://latex.codecogs.com/gif.latex?m\textrm{ x }n" alt="" /> matrix whose orthonormal columns are the left singular vectors.<br />
Since the columns of <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> are orthonormal by construction, <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> is a <em>unitary</em> matrix; that is, its transpose is equal to its inverse, and thus we can write</p>
<p><img decoding="async" title="\textrm{(2) }X = U \Sigma V^{T}" src="http://latex.codecogs.com/gif.latex?\textrm{(2) }X = U \Sigma V^{T}" alt="" /></p>
<p>And there you have it, the SVD in all its majesty! Actually the above decomposition is what is known as the <em>reduced </em>SVD. Note that the columns of <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> are <img decoding="async" src="http://latex.codecogs.com/gif.latex?n" alt="" /> orthonormal vectors in <img decoding="async" src="http://latex.codecogs.com/gif.latex?m" alt="" /> dimensional space. <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> can be extended to a unitary matrix by adjoining an additional <img decoding="async" src="http://latex.codecogs.com/gif.latex?m-n" alt="" /> orthonormal columns. If in addition we append <img decoding="async" src="http://latex.codecogs.com/gif.latex?m-n" alt="" /> rows of zeros to the bottom of the matrix <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Sigma" alt="" />, it will effectively multiply the appended columns in <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> by zero, thus preserving equation (2). When <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> and <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Sigma" alt="" /> are modified in this way equation (2) is called the <em>full</em> SVD.</p>
<h2>The Relationship Between Singular Values and Eigenvalues</h2>
<p>There is an important relationship between the singular values of <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> and the eigenvalues of <img decoding="async" title="X^{T}X" src="http://latex.codecogs.com/gif.latex?X^{T}X" alt="" />. Recall that a vector <img decoding="async" title="v" src="http://latex.codecogs.com/gif.latex?v" alt="" /> is an eigenvector with corresponding eigenvalue <img decoding="async" title="\lambda" src="http://latex.codecogs.com/gif.latex?\lambda" alt="" /> for a matrix <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> if and only if <img decoding="async" title="Xv=\lambda v" src="http://latex.codecogs.com/gif.latex?Xv=\lambda v" alt="" />. Now, suppose we have the full SVD for <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> as in equation (2). Then</p>
<p><img decoding="async" title="X^{T}X=(U\Sigma V^{T})^{T}(U \Sigma V^{T})" src="http://latex.codecogs.com/gif.latex?X^{T}X=(U\Sigma V^{T})^{T}(U \Sigma V^{T})" alt="" /></p>
<p><img decoding="async" title="= V \Sigma ^{T}U^{T}U \Sigma V^{T}" src="http://latex.codecogs.com/gif.latex?= V \Sigma ^{T}U^{T}U \Sigma V^{T}" alt="" /></p>
<p><img decoding="async" title="= V \Sigma^{T} \Sigma V^{T}" src="http://latex.codecogs.com/gif.latex?= V \Sigma^{T} \Sigma V^{T}" alt="" /></p>
<p>or,</p>
<p><img decoding="async" title="(X^{T}X)V = V \Lambda" src="http://latex.codecogs.com/gif.latex?(X^{T}X)V = V \Lambda" alt="" /></p>
<p>where we have used the fact that <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> and <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> are unitary and set</p>
<p><img decoding="async" src="http://latex.codecogs.com/gif.latex?\Lambda = \Sigma^{T} \Sigma" alt="" />.</p>
<p>Note that <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Lambda" alt="" /> is a diagonal matrix with the singular values squared along the diagonal. From this it follows that the columns of <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> are eigenvectors for <img decoding="async" title="X^{T}X" src="http://latex.codecogs.com/gif.latex?X^{T}X" alt="" /> and the main diagonal of <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Lambda" alt="" /> contains the corresponding eigenvalues. Thus the nonzero singular values of <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> are the square roots of the nonzero eigenvalues of <img decoding="async" title="X^{T}X" src="http://latex.codecogs.com/gif.latex?X^{T}X" alt="" />.</p>
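<p>This relationship is easy to verify on a small example. For X = [3 0; 4 5], the eigenvalues of X<sup>T</sup>X work out to 45 and 5, so the singular values of X must be their square roots (a plain C# sketch; no SVD routine is needed for a 2x2 case):</p>
<pre lang="csharp">
// X = [ 3 0; 4 5 ].  Form X'X explicitly.
double a = 3*3 + 4*4;   // 25
double b = 3*0 + 4*5;   // 20
double d = 0*0 + 5*5;   // 25

// Eigenvalues of the symmetric 2x2 matrix X'X from trace and determinant.
double trace = a + d, det = a * d - b * b;
double disc = Math.Sqrt( trace * trace - 4.0 * det );
double lam1 = ( trace + disc ) / 2.0;   // 45
double lam2 = ( trace - disc ) / 2.0;   // 5

// The singular values of X are the square roots; as a sanity check,
// their product equals |det(X)| = 15.
double sigma1 = Math.Sqrt( lam1 );
double sigma2 = Math.Sqrt( lam2 );
Console.WriteLine( "sigma1 = {0:F4}, sigma2 = {1:F4}", sigma1, sigma2 );  // 6.7082, 2.2361
</pre>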
<p>We need one more very cool fact about the SVD before we get to the algorithm: low-rank approximation.</p>
<h2>Low-Rank Approximation</h2>
<p>Suppose now that <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> has rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?r" alt="" /> and write <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Sigma" alt="" /> in equation (2) as the sum of <img decoding="async" src="http://latex.codecogs.com/gif.latex?r" alt="" /> rank one matrices (each <img decoding="async" src="http://latex.codecogs.com/gif.latex?r\textrm{ x }r" alt="" /> rank one matrix will be all zeros except for <img decoding="async" src="http://latex.codecogs.com/gif.latex?\sigma_{j}" alt="" /> as the <img decoding="async" src="http://latex.codecogs.com/gif.latex?j" alt="" />th diagonal element). We can then, using equation (2), write <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> as the sum of rank one matrices,</p>
<p><img decoding="async" title="\textrm{(3)  }X=\sum_{j=1}^{r} \sigma_{j}u_{j}v_{j}^{T}" src="http://latex.codecogs.com/gif.latex?\textrm{(3)  }X=\sum_{j=1}^{r} \sigma_{j}u_{j}v_{j}^{T}" alt="" /></p>
<p>Equation (3) gives us a way to approximate any rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?r" alt="" /> matrix <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> by a lower rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?k &lt; r" alt="" /> matrix. Indeed, given <img decoding="async" src="http://latex.codecogs.com/gif.latex?k &lt; r" alt="" />, form the <img decoding="async" src="http://latex.codecogs.com/gif.latex?k\textrm{th}" alt="" /> partial sum</p>
<p><img decoding="async" title="X_{k}=\sum_{j=1}^{k} \sigma_{j}u_{j}v_{j}^{T}" src="http://latex.codecogs.com/gif.latex?X_{k}=\sum_{j=1}^{k} \sigma_{j}u_{j}v_{j}^{T}" alt="" /></p>
<p>Then <img decoding="async" src="http://latex.codecogs.com/gif.latex?X_{k}" alt="" /> is a rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?k" alt="" /> approximation for <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" />.  How good is this approximation? It turns out to be the best rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?k" alt="" /> approximation you can get, in both the 2-norm and the Frobenius norm (the Eckart&#8211;Young theorem).</p>
<h2>Computing the Low-Rank Approximations Using NMath</h2>
<p>The NMath library provides two classes for computing the SVD for a matrix (actually eight, since there are SVD classes for each of the datatypes <code>Double</code>, <code>Float</code>, <code>DoubleComplex</code> and <code>FloatComplex</code>). There is a basic decomposition class for computing the standard, reduced SVD, and a decomposition server class when more control is desired. Here is a simple C# routine that constructs the low-rank approximations for a matrix <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> and prints out the Frobenius norm of the difference between <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> and each of its low-rank approximations.</p>
<pre lang="csharp">static void LowerRankApproximations( DoubleMatrix X )
{
  // Construct the reduced SVD for X. We will consider
  // all singular values less than 1e-15 to be zero.
  DoubleSVDecomp decomp = new DoubleSVDecomp( X );
  decomp.Truncate( 1e-15 );
  int r = decomp.Rank;
  Console.WriteLine( "The {0}x{1} matrix X has rank {2}", X.Rows, X.Cols, r );

  // Construct the best lower rank approximations to X and
  // look at the frobenius norm of their differences.
  DoubleMatrix LowerRankApprox =
    new DoubleMatrix( X.Rows, X.Cols );
  double differenceNorm;
  for ( int k = 0; k &lt; r; k++ )
  {
    LowerRankApprox += decomp.SingularValues[k] *
      NMathFunctions.OuterProduct( decomp.LeftVectors.Col( k ), decomp.RightVectors.Col( k ) );
    differenceNorm = ( X - LowerRankApprox ).FrobeniusNorm();
    Console.WriteLine( "Rank {0} approximation difference
      norm = {1:F4}", k+1, differenceNorm );
  }
}</pre>
<p>Here&#8217;s the output for a matrix with 10 rows and 20 columns. Note that the rank can be at most 10.</p>
<pre lang="csharp">The 10x20 matrix X has rank 10
Rank 1 approximation difference norm = 3.7954
Rank 2 approximation difference norm = 3.3226
Rank 3 approximation difference norm = 2.9135
Rank 4 approximation difference norm = 2.4584
Rank 5 approximation difference norm = 2.0038
Rank 6 approximation difference norm = 1.5689
Rank 7 approximation difference norm = 1.1829
Rank 8 approximation difference norm = 0.8107
Rank 9 approximation difference norm = 0.3676
Rank 10 approximation difference norm = 0.0000</pre>
<p>-Steve</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/theoretical-motivation-behind-pcr">Principal Component Regression: Part 1 &#8211; The Magic of the SVD</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/theoretical-motivation-behind-pcr/feed</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1307</post-id>	</item>
		<item>
		<title>Savitzky-Golay Smoothing in C#</title>
		<link>https://www.centerspace.net/savitzky-golay-smoothing</link>
					<comments>https://www.centerspace.net/savitzky-golay-smoothing#comments</comments>
		
		<dc:creator><![CDATA[Paul Shirkey]]></dc:creator>
		<pubDate>Mon, 09 Nov 2009 19:45:58 +0000</pubDate>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[data smoothing]]></category>
		<category><![CDATA[data smoothing in c#]]></category>
		<category><![CDATA[experimental data smoothing]]></category>
		<category><![CDATA[Savitzky-Golay]]></category>
		<category><![CDATA[Savitzky-Golay .NET]]></category>
		<category><![CDATA[Savitzky-Golay coefficients]]></category>
		<category><![CDATA[Savitzky-Golay in C#]]></category>
		<category><![CDATA[Savitzky-Golay in VB]]></category>
		<category><![CDATA[Savitzky-Golay smoothing]]></category>
		<guid isPermaLink="false">http://www.centerspace.net/blog/?p=547</guid>

					<description><![CDATA[<p>Savitzky-Golay smoothing effectively removes local signal noise while preserving the shape of the signal. Commonly, it&#8217;s used as a preprocessing step with experimental data, especially spectrometry data, because of its effectiveness at removing random variation while minimally degrading the signal&#8217;s information content. Savitzky-Golay boils down to a fast (multi-core scaling) correlation operation, and therefore can [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/savitzky-golay-smoothing">Savitzky-Golay Smoothing in C#</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>Savitzky-Golay smoothing effectively removes local signal noise while preserving the shape of the signal.  Commonly, it&#8217;s used as a preprocessing step with experimental data, especially spectrometry data, because of its effectiveness at removing random variation while minimally degrading the signal&#8217;s information content.  Savitzky-Golay boils down to a fast (multi-core scaling) correlation operation, and therefore can be used in a real-time environment or on large data sets efficiently.  If higher order information is needed from the signal, Savitzky-Golay can also provide high quality smoothed derivatives of a noisy signal.</p>
<h3>Algorithm</h3>
<p>Savitzky-Golay locally smooths a signal by fitting a polynomial, in a least squares sense, to a sliding window of data.  The degree of the polynomial and the length of the sliding window are the filter&#8217;s two tuning parameters.  If <code>n</code> is the degree of the polynomial that we are fitting, and <code>k</code> is the width of the sliding window, then<br />
<center><br />
<img decoding="async" title="n &lt; k - 1" src="http://latex.codecogs.com/gif.latex?n &lt; k - 1" alt="" /><br />
</center><br />
is needed for smoothing behavior (the least-squares system must be over-determined; an exact fit would simply reproduce the data).  Typically <code>n</code> is 3 or 4, and <code>k</code> depends on the size in samples of the noisy features to be suppressed in your data set.  </p>
<p>For the case of <code>n=0</code> the Savitzky-Golay filter degenerates to a moving average filter &#8211; which is good for removing white noise, but is poor for preserving peak shape (higher order moments).  For <code>n=1,</code> the filter does a least-squares fit of the windowed data to a line.  If <code>n=k-1,</code> the polynomial exactly fits the data points in the window, and so no filtering takes place.</p>
<p>Once the polynomial is fit, the center data point in the window is (typically) replaced by the value of the polynomial at that location.  The window then slides one sample to the right and the process is repeated.</p>
<p>Savitzky-Golay delivers the unexpected surprise that <em>the polynomial fitting coefficients are constant</em> for a given <code>n</code> and <code>k.</code>  This means that once we fix <code>n</code> and <code>k</code> for our filter, the Savitzky-Golay polynomial fitting coefficients are computed once during setup and then used across the entire data set.  This is why Savitzky-Golay is a high performance correlation filter.</p>
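<p>The classic case is a five-point window with a quadratic fit, whose coefficients are (-3, 12, 17, 12, -3)/35. The sketch below derives them from scratch by solving the small normal-equations system directly; this is for illustration only and is not the NMath implementation:</p>
<pre lang="csharp">
int nLeft = 2, nRight = 2, deg = 2;
int k = nLeft + nRight + 1, p = deg + 1;

// Vandermonde design matrix over the window offsets -nLeft..nRight.
double[,] A = new double[k, p];
for ( int i = 0; i &lt; k; i++ )
  for ( int j = 0; j &lt; p; j++ )
    A[i, j] = Math.Pow( i - nLeft, j );

// Normal-equations matrix G = A'A, augmented with e0 on the right.
// Solving G*h = e0 yields the least-squares fit evaluated at offset 0.
double[,] G = new double[p, p + 1];
for ( int r = 0; r &lt; p; r++ )
  for ( int c = 0; c &lt; p; c++ )
    for ( int i = 0; i &lt; k; i++ )
      G[r, c] += A[i, r] * A[i, c];
G[0, p] = 1.0;

// Gaussian elimination with back substitution (no pivoting needed here).
for ( int r = 0; r &lt; p; r++ )
  for ( int rr = r + 1; rr &lt; p; rr++ )
  {
    double f = G[rr, r] / G[r, r];
    for ( int c = r; c &lt;= p; c++ ) G[rr, c] -= f * G[r, c];
  }
double[] h = new double[p];
for ( int r = p - 1; r >= 0; r-- )
{
  h[r] = G[r, p];
  for ( int c = r + 1; c &lt; p; c++ ) h[r] -= G[r, c] * h[c];
  h[r] /= G[r, r];
}

// The filter coefficients c_j = sum over powers of h * j^power;
// they are constant for a fixed window and polynomial degree.
double[] coef = new double[k];
for ( int i = 0; i &lt; k; i++ )
{
  for ( int j = 0; j &lt; p; j++ ) coef[i] += A[i, j] * h[j];
  Console.Write( "{0:F4} ", coef[i] );  // -0.0857 0.3429 0.4857 0.3429 -0.0857
}
Console.WriteLine();
</pre>
<p>Running this prints -0.0857 0.3429 0.4857 0.3429 -0.0857, i.e. (-3, 12, 17, 12, -3)/35, which is what <code>MovingWindowFilter.SavitzkyGolayCoefficients</code> should produce for these parameters.</p>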
<h3>Comparison Example</h3>
<p>The following three images show some real experimental data and a comparison of two filtering algorithms.  The first image shows the raw data, the second image shows the effect of an averaging filter, and the last image demonstrates a Savitzky-Golay smoothing filter of length five.</p>
<p><figure id="attachment_604" aria-describedby="caption-attachment-604" style="width: 400px" class="wp-caption aligncenter"><img decoding="async" loading="lazy" class="aligncenter size-full wp-image-604" title="current_raw_data" src="https://www.centerspace.net/blog/wp-content/uploads/2009/11/current_raw_data.jpg" alt="Unsmoothed Data" width="400" height="316" /><figcaption id="caption-attachment-604" class="wp-caption-text">Raw Data</figcaption></figure></p>
<p><figure id="attachment_603" aria-describedby="caption-attachment-603" style="width: 400px" class="wp-caption aligncenter"><img decoding="async" loading="lazy" class="size-full wp-image-603" title="current_averaging_filter" src="https://www.centerspace.net/blog/wp-content/uploads/2009/11/current_averaging_filter.jpg" alt="Averaging Filter, Length 5" width="400" height="316" /><figcaption id="caption-attachment-603" class="wp-caption-text">Averaging Filter, Length 5</figcaption></figure></p>
<p><figure id="attachment_605" aria-describedby="caption-attachment-605" style="width: 400px" class="wp-caption aligncenter"><img decoding="async" loading="lazy" class="size-full wp-image-605" title="current_savitzky-golay5" src="https://www.centerspace.net/blog/wp-content/uploads/2009/11/current_savitzky-golay5.jpg" alt="Savitzky-Golay Smoothing, Length = 5" width="400" height="316" /><figcaption id="caption-attachment-605" class="wp-caption-text">Savitzky-Golay Smoothing, Length 5</figcaption></figure></p>
<p>The averaging filter introduces a large error into the location of the orange peak whereas Savitzky-Golay removes the noise while maintaining the peak location.  Computationally, they require identical effort.</p>
<h3>Savitzky-Golay Smoothing in C#</h3>
<p>The Savitzky-Golay smoothing filter is implemented in the NMath-Stats package as a generalized correlation filter.  Any filter coefficients can be used with this moving window filter, Savitzky-Golay coefficients are just one possibility.  The moving window filter also does not require the filtering to take place in the center of the sliding window; so when specifying the window, two parameters are required: number to the left, and number to the right of the filtered data point.</p>
<p>Here are the key software components.</p>
<ul>
<li><code>MovingWindowFilter.SavitzkyGolayCoefficients( nLeft, nRight, polyDeg )</code></li>
<li><code>MovingWindowFilter( nLeft, nRight, coefficients )</code></li>
<li><code>MovingWindowFilter.Filter( data, boundaryOption )</code></li>
</ul>
<p>The first is a static method that generates the Savitzky-Golay coefficients; the second is the constructor of the filtering class, which takes the generated coefficients; the third is the method that does the Savitzky-Golay filtering by running a cross-correlation between the data and the saved coefficients.</p>
<p>Below is a complete code example to copy and experiment with using your own data sets.  Only three lines of code are needed to build the filter and do the filtering; the remaining code builds a synthetic noisy signal to filter and displays the results.</p>
<pre lang="csharp">
int nLeft = 2;
int nRight = 2;
int n = 3;  // polynomial degree

// Generate the coefficients.
DoubleVector c =
MovingWindowFilter.SavitzkyGolayCoefficients( nLeft, nRight, n );

// Build the filter of width: nLeft + nRight + 1
MovingWindowFilter filter =
  new MovingWindowFilter( nLeft, nRight, c );
Console.WriteLine( "Filter coeffs = " + filter.Coefficients );

// Generate a clean cubic signal
DoubleVector x = new DoubleVector( 100, -5, .1 );
DoubleVector y = new DoubleVector( x.Length );
for ( int i = 0; i &lt; x.Length; i++ )
{
  double a = x[i];
  y[i] = 0.03*Math.Pow(a, 3) + 0.2*Math.Pow(a, 2) - 0.22*a + 0.5;
}
Console.WriteLine( "Clean signal = " + y );

RandGenUniform rng = new RandGenUniform(-1, 1, 0x124 );
for ( int i = 0; i &lt; y.Length; i++ )
{
  y[i] += rng.NextDouble();
}
Console.WriteLine( "x = " + x );
Console.WriteLine( "Noisy signal = " + y );

// Do the filtering.
DoubleVector z =
filter.Filter(y, MovingWindowFilter.BoundaryOption.PadWithZeros);
Console.WriteLine( "Signal filtered = " + z );

</pre>
<p><em>-Paul </em></p>
<h3>Addendum &#8211; Savitzky-Golay Coefficients</h3>
<p>If you want to quickly try out Savitzky-Golay smoothing without computing the coefficients, or just compare coefficients, here are some coefficients for a sliding window of length five.  They also provide some insight into the relationship between the coefficients and the behavior of the filter.</p>
<pre class="code">
Filter length = 5, nLeft = 2, nRight = 2.
Polynomial of order 0,
[ 0.2 0.2 0.2 0.2 0.2 ]  (averaging filter)
Polynomial of order 1,
[ 0.2 0.2 0.2 0.2 0.2 ]
Polynomial of order 2,
[ -0.0857142 0.3428571 0.4857142 0.3428571 -0.0857142 ]
Polynomial of order 3,
[ -0.0857142 0.3428571 0.4857142 0.3428571 -0.0857142 ]
Polynomial of order 4 and higher,
[ 0 0 1 0 0 ]  (as expected, no filtering here!)
</pre>
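<p>If you would like to verify these numbers independently of NMath, the coefficients can be reproduced from first principles: fit a least-squares polynomial to the window, and read off the weight each sample contributes to the fitted value at the center.  Here is a quick cross-check sketch in Python (standard library only; the function name is purely illustrative, not part of NMath), using exact rational arithmetic:</p>

```python
# Reproduce Savitzky-Golay center-point weights from the normal
# equations of a least-squares polynomial fit, using exact
# rational arithmetic.
from fractions import Fraction

def savgol_center_weights(n_left, n_right, degree):
    """Weight each window sample contributes to the smoothed center value.

    Fitting a degree-`degree` polynomial to the window by least squares
    and evaluating it at x = 0 is a linear operation on the samples;
    this returns the weights of that linear combination.
    """
    xs = range(-n_left, n_right + 1)
    size = degree + 1
    # Vandermonde matrix of the window positions.
    A = [[Fraction(x) ** j for j in range(size)] for x in xs]
    # Normal matrix N = A^T A.
    N = [[sum(A[r][i] * A[r][j] for r in range(len(A)))
          for j in range(size)] for i in range(size)]
    # Solve N b = e0 by Gauss-Jordan elimination; since N is
    # symmetric, b is the first row of N^{-1}, the only row needed
    # to evaluate the fitted polynomial at x = 0.
    aug = [N[i] + [Fraction(1 if i == 0 else 0)] for i in range(size)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    b = [aug[i][size] for i in range(size)]
    # Weight of the sample at position x is sum_j b[j] * x^j.
    return [sum(b[j] * Fraction(x) ** j for j in range(size)) for x in xs]

weights = savgol_center_weights(2, 2, 2)
print([float(w) for w in weights])
```

<p>This yields exactly [-3/35, 12/35, 17/35, 12/35, -3/35], matching the order-2 row above.  By the symmetry of the window, the odd powers contribute nothing at x = 0, which is why the order-1 row matches order 0 and the order-3 row matches order 2.</p>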
<h3>Another Smoothing Example on Real Data</h3>
<p>This is another example of Savitzky-Golay smoothing, applied to some experimental data.  If more smoothing were desired, a longer filtering window could have been used.</p>
<p><figure id="attachment_602" aria-describedby="caption-attachment-602" style="width: 367px" class="wp-caption alignnone"><img decoding="async" loading="lazy" class="size-full wp-image-602" title="blue_rawdata_savitzky_golay5" src="https://www.centerspace.net/blog/wp-content/uploads/2009/11/blue_rawdata_savitzky_golay5.jpg" alt="Savitzky-Golay Smoothing Example, Length = 5." width="367" height="778" /><figcaption id="caption-attachment-602" class="wp-caption-text">Savitzky-Golay Smoothing, Length = 5.</figcaption></figure></p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/savitzky-golay-smoothing">Savitzky-Golay Smoothing in C#</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/savitzky-golay-smoothing/feed</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">547</post-id>	</item>
		<item>
		<title>Variance inflation factors</title>
		<link>https://www.centerspace.net/variance-inflation-factors</link>
					<comments>https://www.centerspace.net/variance-inflation-factors#respond</comments>
		
		<dc:creator><![CDATA[Trevor Misfeldt]]></dc:creator>
		<pubDate>Wed, 27 May 2009 20:11:56 +0000</pubDate>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Computing variance inflation factors]]></category>
		<category><![CDATA[Variance Inflation factors]]></category>
		<category><![CDATA[vif]]></category>
		<guid isPermaLink="false">http://www.centerspace.net/blog/?p=124</guid>

					<description><![CDATA[<p>A customer contacted us about computing &#8220;variance inflation factors&#8221;. Wikipedia defines this as: In statistics, the variance inflation factor (VIF) is a method of detecting the severity of multicollinearity. More precisely, the VIF is an index which measures how much the variance of a coefficient (square of the standard deviation) is increased because of collinearity. [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/variance-inflation-factors">Variance inflation factors</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>A customer contacted us about computing &#8220;variance inflation factors&#8221;.</p>
<p>Wikipedia defines this as:</p>
<blockquote><p>In statistics, the variance inflation factor (VIF) is a method of detecting the severity of multicollinearity. More precisely, the VIF is an index which measures how much the variance of a coefficient (square of the standard deviation) is increased because of collinearity. [<a href="https://en.wikipedia.org/wiki/Variance_inflation_factor">Ref</a>]</p></blockquote>
<p>Here&#8217;s an implementation using CenterSpace&#8217;s NMath and NMath Stats libraries.</p>
<blockquote>
<pre>// Returns all the variance inflation factors
private static DoubleVector Vif( LinearRegression lr )
{
  // iterate through predictors and find variance
  // inflation factor for each
  DoubleVector factors =
    new DoubleVector( lr.NumberOfPredictors );
  for (int i = 0; i &lt; lr.NumberOfPredictors; i++)
  {
    factors[i] = Vif( lr, i );
  }
  return factors;
}

// Returns a single variance inflation factor
private static double Vif( LinearRegression lr, int i )
{
  // remove predictor i and regress it on the remaining predictors
  LinearRegression lr2 = (LinearRegression)lr.Clone();
  lr2.RemovePredictor( i );
  lr2.SetRegressionData( lr2.PredictorMatrix,
    lr.PredictorMatrix.Col( i ), true );

  // calculate variance inflation factor
  LinearRegressionAnova anova =
    new LinearRegressionAnova( lr2 );

  // return factor
  return 1.0 / (1.0 - anova.RSquared);
}</pre>
</blockquote>
<p>And here&#8217;s an example using these functions:</p>
<blockquote>
<pre>DoubleMatrix independent = new DoubleMatrix(
  "30x3[0.270 78 41 0.282 79 56 0.277 81 63 " +
       "0.280 80 68 0.272 76 69 0.262 78 65 " +
       "0.275 82 61 0.267 79 47 0.265 76 32 " +
       "0.277 79 24 0.282 82 28 0.270 85 26 " +
       "0.272 86 32 0.287 83 40 0.277 84 55 " +
       "0.287 82 63 0.280 80 72 0.277 78 72 " +
       "0.277 84 67 0.277 86 60 0.292 85 44 " +
       "0.287 87 40 0.277 94 32 0.285 92 27 " +
       "0.282 95 28 0.265 96 33 0.265 94 41 " +
       "0.265 96 52 0.268 91 64 0.260 90 71]" );

DoubleVector dependent =
  new DoubleVector( "0.386 0.374 0.393 0.425 " +
  "0.406 0.344 0.327 0.288 0.269 0.256 0.286 " +
  "0.298 0.329 0.318 0.381 0.381 0.470 0.443 " +
  "0.386 0.342 0.319 0.307 0.284 0.326 0.309 " +
  "0.359 0.376 0.416 0.437 0.548" );

LinearRegression regression =
  new LinearRegression( independent, dependent, true );
Console.WriteLine( "Is good? " + regression.IsGood );
LinearRegressionAnova anova =
  new LinearRegressionAnova( regression );
Console.WriteLine( "variance: " + regression.Variance );
Console.WriteLine( "r-squared: " + anova.RSquared );
DoubleVector vif = Vif( regression );
Console.WriteLine( "variance inflation factors: " + vif );</pre>
</blockquote>
<p>-Trevor</p>
<p><strong>Note: </strong>This functionality is now in NMath Stats.</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/variance-inflation-factors">Variance inflation factors</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/variance-inflation-factors/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">124</post-id>	</item>
		<item>
		<title>Non-negative Matrix Factorization in NMath, Part 1</title>
		<link>https://www.centerspace.net/non-negative-matrix-factorization-in-nmath-part-1</link>
					<comments>https://www.centerspace.net/non-negative-matrix-factorization-in-nmath-part-1#respond</comments>
		
		<dc:creator><![CDATA[Steve Sneller]]></dc:creator>
		<pubDate>Fri, 09 Jan 2009 22:48:55 +0000</pubDate>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[C# NMF]]></category>
		<category><![CDATA[NMF]]></category>
		<category><![CDATA[NMF clustering]]></category>
		<category><![CDATA[Non-negative matrix factorization]]></category>
		<guid isPermaLink="false">http://www.centerspace.net/blog/?p=61</guid>

					<description><![CDATA[<p>A couple of years ago, we were asked by a customer to provide an implementation of an algorithm called Non-negative Matrix Factorization (NMF). We did a basic implementation, which we later included in our NMath Stats library. I kind of forgot about it until we recently heard from a prospective NMath customer who wanted to [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/non-negative-matrix-factorization-in-nmath-part-1">Non-negative Matrix Factorization in NMath, Part 1</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>A couple of years ago, we were asked by a customer to provide an implementation of an algorithm called Non-negative Matrix Factorization (NMF). We did a basic implementation, which we later included in our NMath Stats library. I kind of forgot about it until we recently heard from a prospective NMath customer who wanted to use NMF for grouping, or clustering. Talking with this customer rekindled my interest in NMF and we decided to provide additional functionality built on the existing NMF code to facilitate using the NMF for clustering.</p>
<p>This entry will proceed in three parts. The first will give a brief introduction to NMF and its uses, the second will briefly cover how to compute the factorization, and the third will cover how NMF can be used for clustering.</p>
<p><strong>The Non-negative Matrix Factorization</strong></p>
<p>Given a non-negative <em>m</em>-row by <em>n</em>-column matrix A, a non-negative matrix factorization of A is a non-negative <em>m</em>-row by <em>k</em>-column matrix W and a non-negative <em>k</em>-row by <em>n</em>-column matrix H whose product approximates the matrix A.</p>
<p>A &#8776; WH</p>
<p>The non-negativity of the elements of W and H is crucial, and is what makes this problem a bit different. The entries of A usually represent some quantity for which negative numbers make no sense. For instance, the numbers <em>a<sub>ij</sub></em> in A might be counts of the <em>i</em><sup>th</sup> term in the <em>j</em><sup>th</sup> document, or the <em>i</em><sup>th</sup> pixel value in the <em>j</em><sup>th</sup> image.</p>
<p>So, why is this useful? Of course, it depends on the particular application, but the basic idea is <em>dimension reduction</em><span style="font-style: normal;">. In general, NMF is used only when the matrix A is large. In the image pixel value example, where each column of the matrix A contains the pixel values of a particular image, the number of rows will be quite large, as may be the number of columns. When we do an NMF of A and make <em>k</em> much smaller than the number of rows or columns in A, the factorization yields a representation of each column of A as a linear combination of the <em>k</em> columns of W, with the coefficients coming from H.</span></p>
<p><span style="font-style: normal;">For example, suppose I have 300 facial images (pictures of people&#8217;s faces). Each image is encoded as 50,000 pixel values. I arrange these into a 50,000 x 300 matrix A. 50,000 is a fairly large number, and if I am looking at each column of A as a vector, it&#8217;s a vector with 50,000 coordinates. Let&#8217;s do an NMF on A with <em>k</em> = 7. Now, each image (column in A) can be approximated by a linear combination of these 7 </span><em>basis images</em><span style="font-style: normal;">. If the approximation is good, these 7 basis images, which are the columns of W, must represent a good chunk of the information in the original 300 images, and we have reduced the dimension of the space we are working in from 50,000 down to 7. Indeed, in this particular application it was found that the columns of W represented facial characteristics.</span></p>
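<p>To make the factorization concrete, here is a toy sketch of how such a factorization can be computed.  This is <em>not</em> the NMath implementation; it is a minimal illustration of the classic Lee-Seung multiplicative update rules, written in plain Python so every step is visible.  It factors a small rank-2 non-negative matrix with <em>k</em> = 2:</p>

```python
# Toy non-negative matrix factorization via the Lee-Seung
# multiplicative update rules, using plain Python lists.
# A (m x n) is approximated by W (m x k) times H (k x n).
import random

def matmul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def frobenius_error(A, B):
    return sum((A[i][j] - B[i][j]) ** 2
               for i in range(len(A)) for j in range(len(A[0])))

def nmf(A, k, iterations=500, eps=1e-9):
    m, n = len(A), len(A[0])
    rng = random.Random(42)
    # Strictly positive random starting factors.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T A) / (W^T W H), elementwise
        WtA = matmul(transpose(W), A)
        WtWH = matmul(matmul(transpose(W), W), H)
        H = [[H[i][j] * WtA[i][j] / (WtWH[i][j] + eps)
              for j in range(n)] for i in range(k)]
        # W <- W * (A H^T) / (W H H^T), elementwise
        AHt = matmul(A, transpose(H))
        WHHt = matmul(W, matmul(H, transpose(H)))
        W = [[W[i][j] * AHt[i][j] / (WHHt[i][j] + eps)
              for j in range(k)] for i in range(m)]
    return W, H

# A small rank-2 non-negative matrix: row 2 is twice row 1, and
# row 4 is twice row 1 plus row 3, so NMF with k = 2 should
# reconstruct it almost exactly.
A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 1, 1, 1],
     [3, 5, 7, 9]]
W, H = nmf(A, k=2)
err = frobenius_error(A, matmul(W, H))
print("reconstruction error:", err)
```

<p>The updates only ever multiply non-negative quantities, so W and H stay non-negative throughout.  Because A has rank 2, the reconstruction error with <em>k</em> = 2 drives toward zero; with <em>k</em> = 1 it could not, which is the dimension-reduction trade-off described above.</p>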
<p>The post <a rel="nofollow" href="https://www.centerspace.net/non-negative-matrix-factorization-in-nmath-part-1">Non-negative Matrix Factorization in NMath, Part 1</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/non-negative-matrix-factorization-in-nmath-part-1/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">61</post-id>	</item>
	</channel>
</rss>
