<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>SVD Archives - CenterSpace</title>
	<atom:link href="https://www.centerspace.net/tag/svd/feed" rel="self" type="application/rss+xml" />
	<link>https://www.centerspace.net/tag/svd</link>
	<description>.NET numerical class libraries</description>
	<lastBuildDate>Sun, 03 May 2020 15:30:07 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.1.1</generator>
<site xmlns="com-wordpress:feed-additions:1">104092929</site>	<item>
		<title>Principal Component Regression: Part 1 &#8211; The Magic of the SVD</title>
		<link>https://www.centerspace.net/theoretical-motivation-behind-pcr</link>
					<comments>https://www.centerspace.net/theoretical-motivation-behind-pcr#comments</comments>
		
		<dc:creator><![CDATA[Steve Sneller]]></dc:creator>
		<pubDate>Mon, 08 Feb 2010 17:44:45 +0000</pubDate>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Theory]]></category>
		<category><![CDATA[PCR]]></category>
		<category><![CDATA[PCR c#]]></category>
		<category><![CDATA[principal component regression]]></category>
		<category><![CDATA[singular value decomposition]]></category>
		<category><![CDATA[SVD]]></category>
		<category><![CDATA[svd c#]]></category>
		<guid isPermaLink="false">http://www.centerspace.net/blog/?p=1307</guid>

					<description><![CDATA[<p><img src="https://www.centerspace.net/blog/wp-content/uploads/2010/02/StevePCA_width400.jpg" alt="SVD of a 2x2 matrix" title="SVD of a 2x2 matrix" class="excerpt" /><br />
This is the first part of a multi-part series on Principal Component Regression, or PCR for short. We will eventually end up with a computational algorithm for PCR and code it up in C# using the NMath libraries. PCR is a method for constructing a linear regression model when we have a large number of predictor variables that are highly correlated. Of course, we don't know exactly which variables are correlated; otherwise we'd just throw them out and perform a normal linear regression.</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/theoretical-motivation-behind-pcr">Principal Component Regression: Part 1 &#8211; The Magic of the SVD</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p>This is the first part of a multi-part series on Principal Component Regression, or PCR for short. We will eventually end up with a computational algorithm for PCR and code it up in C# using the NMath libraries. PCR is a method for constructing a linear regression model when we have a large number of predictor variables that are highly correlated. Of course, we don&#8217;t know exactly which variables are correlated; otherwise we&#8217;d just throw them out and perform a normal linear regression.</p>
<p>In order to understand what is going on in the PCR algorithm, we need to know a little bit about the SVD (Singular Value Decomposition). Understanding a bit about the SVD and its relationship to the eigenvalue decomposition will go a long way toward understanding the PCR algorithm.</p>
<h2>The Singular Value Decomposition</h2>
<p>The SVD (Singular Value Decomposition) is one of the most revealing matrix decompositions in linear algebra. It is a bit expensive to compute, but the bounty of information it yields is awe-inspiring. Understanding a little about the SVD will illuminate the Principal Component Regression (PCR) algorithm. The SVD may seem like a deep and mysterious thing; at least I thought it was until I read the chapters covering it in the book <a href="https://www.iri.upc.edu/people/thomas/Collection/details/5350.html">&#8220;Numerical Linear Algebra&#8221;</a>, by Lloyd N. Trefethen and David Bau, III, which I summarize below.<br />
<span id="more-1307"></span><br />
We begin with an easy-to-state, and not-too-difficult-to-prove, geometric statement about linear transformations.</p>
<h2>A Geometric Fact</h2>
<p>Let <img decoding="async" title="S" src="http://latex.codecogs.com/gif.latex?S" alt="" /> be the unit sphere in <img decoding="async" title="\mathbb{R}^{n}" src="http://latex.codecogs.com/gif.latex?\mathbb{R}^{n}" alt="" />, and let <img decoding="async" title="X \in \mathbb{R}^{m \times n}" src="http://latex.codecogs.com/gif.latex?X \in \mathbb{R}^{m \times n}" alt="" /> be any matrix mapping <img decoding="async" title="\mathbb{R}^{n}" src="http://latex.codecogs.com/gif.latex?\mathbb{R}^{n}" alt="" /> into <img decoding="async" title="\mathbb{R}^{m}" src="http://latex.codecogs.com/gif.latex?\mathbb{R}^{m}" alt="" /> and suppose, for the moment, that <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> has full rank. Then the image <img decoding="async" title="XS" src="http://latex.codecogs.com/gif.latex?XS" alt="" /> of <img decoding="async" title="S" src="http://latex.codecogs.com/gif.latex?S" alt="" /> under <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> is a hyperellipse in <img decoding="async" title="\mathbb{R}^{m}" src="http://latex.codecogs.com/gif.latex?\mathbb{R}^{m}" alt="" /> (see the book for the proof).</p>
<figure id="attachment_1360" aria-describedby="caption-attachment-1360" style="width: 400px" class="wp-caption aligncenter"><a href="https://www.centerspace.net/blog/wp-content/uploads/2010/02/StevePCA_width400.jpg"><img decoding="async" loading="lazy" class="size-full wp-image-1360" title="SVD of a 2x2 matrix" src="https://www.centerspace.net/blog/wp-content/uploads/2010/02/StevePCA_width400.jpg" alt="SVD of a 2x2 matrix" width="400" height="184" srcset="https://www.centerspace.net/wp-content/uploads/2010/02/StevePCA_width400.jpg 400w, https://www.centerspace.net/wp-content/uploads/2010/02/StevePCA_width400-300x138.jpg 300w" sizes="(max-width: 400px) 100vw, 400px" /></a><figcaption id="caption-attachment-1360" class="wp-caption-text">Figure 1.  SVD of a 2x2 matrix</figcaption></figure>
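<p>This geometric fact is easy to check numerically. The following NumPy sketch (my illustration, not part of the original post) maps points on the unit circle through a full-rank 2x2 matrix and verifies that rescaling the image coordinates along the left singular vectors by the reciprocal singular values lands them back on the unit circle, so the image is an ellipse with semiaxes of length <em>&#963;<sub>i</sub></em> in the directions <em>u<sub>i</sub></em>.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 2))          # a full-rank 2x2 map, as in Figure 1
U, s, Vt = np.linalg.svd(X)

theta = np.linspace(0.0, 2.0 * np.pi, 200)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # points on the unit circle S
image = X @ circle                                   # the image XS

# In the (u1, u2) basis, scaling coordinate i by 1/sigma_i maps the
# image back onto the unit circle, confirming XS is an ellipse with
# semiaxes sigma_i * u_i.
coords = np.diag(1.0 / s) @ U.T @ image
radii = np.linalg.norm(coords, axis=0)
print(np.allclose(radii, 1.0))
```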
<p>Given this fact we make the following definitions (refer to Figure 1.):</p>
<p>Define the singular values,</p>
<p><img decoding="async" title="\sigma _{1}\cdots\sigma_{n}" src="http://latex.codecogs.com/gif.latex?\sigma _{1}\cdots\sigma_{n}" alt="" /></p>
<p>of <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> to be the lengths of the <img decoding="async" title="n" src="http://latex.codecogs.com/gif.latex?n" alt="" /> principal semiaxes of the hyperellipse <img decoding="async" title="XS" src="http://latex.codecogs.com/gif.latex?XS" alt="" />. It is conventional to assume the singular values are numbered in descending order</p>
<p><img decoding="async" title="\inline \sigma_{1}\geq \sigma_{2}\geq\cdots\geq \sigma_{n}" src="http://latex.codecogs.com/gif.latex?\inline \sigma_{1}\geq \sigma_{2}\geq\cdots\geq \sigma_{n}" alt="" /></p>
<p>Define the left singular vectors</p>
<p><img decoding="async" title="u_{1},\cdots,u_{n}" src="http://latex.codecogs.com/gif.latex?u_{1},\cdots,u_{n}" alt="" /></p>
<p>to be unit vectors in the direction of the principal semiaxes of <img decoding="async" title="XS" src="http://latex.codecogs.com/gif.latex?XS" alt="" /> and define the right singular vectors,</p>
<p><img decoding="async" title="v_{1}\cdots v_{n}" src="http://latex.codecogs.com/gif.latex?v_{1}\cdots v_{n}" alt="" />,</p>
<p>to be the pre-images of the principal semiaxes of <img decoding="async" title="XS" src="http://latex.codecogs.com/gif.latex?XS" alt="" /> so that</p>
<p><img decoding="async" title="Xv_{i} = \sigma_{i}u_{i}" src="http://latex.codecogs.com/gif.latex?Xv_{i} = \sigma_{i}u_{i}" alt="" />.</p>
<p>In matrix form we have</p>
<p><img decoding="async" title="XV = U \Sigma" src="http://latex.codecogs.com/gif.latex?XV = U \Sigma" alt="" />,</p>
<p>where <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> is the <img decoding="async" src="http://latex.codecogs.com/gif.latex?n\textrm{ x }n" alt="" /> orthonormal matrix whose columns are the right singular vectors of <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" />, <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Sigma" alt="" /> is an <img decoding="async" src="http://latex.codecogs.com/gif.latex?n\textrm{ x }n" alt="" /> diagonal matrix with positive entries equal to the singular values, and <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> is an <img decoding="async" src="http://latex.codecogs.com/gif.latex?m\textrm{ x }n" alt="" /> matrix whose orthonormal columns are the left singular vectors.<br />
Since the columns of <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> are orthonormal by construction, <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> is a <em>unitary</em> matrix; that is, its transpose is equal to its inverse. Thus we can write</p>
<p><img decoding="async" title="\textrm{(2) }X = U \Sigma V^{T}" src="http://latex.codecogs.com/gif.latex?\textrm{(2) }X = U \Sigma V^{T}" alt="" /></p>
<p>And there you have it, the SVD in all its majesty! Actually, the above decomposition is what is known as the <em>reduced</em> SVD. Note that the columns of <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> are <img decoding="async" src="http://latex.codecogs.com/gif.latex?n" alt="" /> orthonormal vectors in <img decoding="async" src="http://latex.codecogs.com/gif.latex?m" alt="" />-dimensional space. <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> can be extended to a unitary matrix by adjoining an additional <img decoding="async" src="http://latex.codecogs.com/gif.latex?m-n" alt="" /> orthonormal columns. If, in addition, we append <img decoding="async" src="http://latex.codecogs.com/gif.latex?m-n" alt="" /> rows of zeros to the bottom of the matrix <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Sigma" alt="" />, the appended columns of <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> are effectively multiplied by zero, preserving equation (2). When <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> and <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Sigma" alt="" /> are modified in this way, equation (2) is called the <em>full</em> SVD.</p>
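<p>The shapes of the reduced and full factorizations are easy to see numerically. Here is a small NumPy sketch (mine, not the post&#8217;s) showing the dimensions of each factor and verifying that both forms reproduce <em>X</em>, exactly as equation (2) claims.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 4
X = rng.standard_normal((m, n))

# Reduced SVD: U is m x n, Sigma is n x n, V is n x n.
U_r, s, Vt = np.linalg.svd(X, full_matrices=False)
print(U_r.shape, Vt.shape)                      # (6, 4) (4, 4)
print(np.allclose(X, U_r @ np.diag(s) @ Vt))    # True

# Full SVD: U extended to m x m with m - n extra orthonormal columns,
# and Sigma padded with m - n rows of zeros, preserving equation (2).
U_f, s_f, Vt_f = np.linalg.svd(X, full_matrices=True)
Sigma_full = np.zeros((m, n))
Sigma_full[:n, :n] = np.diag(s_f)
print(np.allclose(X, U_f @ Sigma_full @ Vt_f))  # True
```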
<h2>The Relationship Between Singular Values and Eigenvalues</h2>
<p>There is an important relationship between the singular values of <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> and the eigenvalues of <img decoding="async" title="X^{T}X" src="http://latex.codecogs.com/gif.latex?X^{T}X" alt="" />. Recall that a vector <img decoding="async" title="v" src="http://latex.codecogs.com/gif.latex?v" alt="" /> is an eigenvector with corresponding eigenvalue <img decoding="async" title="\lambda" src="http://latex.codecogs.com/gif.latex?\lambda" alt="" /> for a matrix <img decoding="async" title="X" src="http://latex.codecogs.com/gif.latex?X" alt="" /> if and only if <img decoding="async" title="Xv=\lambda v" src="http://latex.codecogs.com/gif.latex?Xv=\lambda v" alt="" />. Now, suppose we have the full SVD for <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> as in equation (2). Then</p>
<p><img decoding="async" title="X^{T}X=(U\Sigma V^{T})^{T}(U \Sigma V^{T})" src="http://latex.codecogs.com/gif.latex?X^{T}X=(U\Sigma V^{T})^{T}(U \Sigma V^{T})" alt="" /></p>
<p><img decoding="async" title="= V \Sigma ^{T}U^{T}U \Sigma V^{T}" src="http://latex.codecogs.com/gif.latex?= V \Sigma ^{T}U^{T}U \Sigma V^{T}" alt="" /></p>
<p><img decoding="async" title="= V \Sigma^{T} \Sigma V^{T}" src="http://latex.codecogs.com/gif.latex?= V \Sigma^{T} \Sigma V^{T}" alt="" /></p>
<p>or,</p>
<p><img decoding="async" title="(X^{T}X)V = V \Lambda" src="http://latex.codecogs.com/gif.latex?(X^{T}X)V = V \Lambda" alt="" /></p>
<p>where we have used the fact that <img decoding="async" src="http://latex.codecogs.com/gif.latex?U" alt="" /> and <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> are unitary and set</p>
<p><img decoding="async" src="http://latex.codecogs.com/gif.latex?\Lambda = \Sigma^{T} \Sigma" alt="" />.</p>
<p>Note that <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Lambda" alt="" /> is a diagonal matrix with the squared singular values along the diagonal. From this it follows that the columns of <img decoding="async" src="http://latex.codecogs.com/gif.latex?V" alt="" /> are eigenvectors of <img decoding="async" title="X^{T}X" src="http://latex.codecogs.com/gif.latex?X^{T}X" alt="" /> and the main diagonal of <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Lambda" alt="" /> contains the corresponding eigenvalues. Thus the nonzero singular values of <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> are the square roots of the nonzero eigenvalues of <img decoding="async" title="X^{T}X" src="http://latex.codecogs.com/gif.latex?X^{T}X" alt="" />.</p>
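<p>This relationship takes only a few lines of NumPy to verify (a check of my own, not part of the post): the squared singular values of <em>X</em> match the eigenvalues of <em>X<sup>T</sup>X</em>.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 3))

s = np.linalg.svd(X, compute_uv=False)       # singular values, descending
eigvals = np.linalg.eigvalsh(X.T @ X)[::-1]  # eigenvalues of X^T X, descending

# sigma_i^2 = lambda_i for each i
print(np.allclose(s ** 2, eigvals))          # True
```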
<p>We need one more very cool fact about the SVD before we get to the algorithm: low-rank approximation.</p>
<h2>Low-Rank Approximation</h2>
<p>Suppose now that <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> has rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?r" alt="" /> and write <img decoding="async" src="http://latex.codecogs.com/gif.latex?\Sigma" alt="" /> in equation (2) as the sum of <img decoding="async" src="http://latex.codecogs.com/gif.latex?r" alt="" /> rank one matrices (each <img decoding="async" src="http://latex.codecogs.com/gif.latex?r\textrm{ x }r" alt="" /> rank one matrix will be all zeros except for <img decoding="async" src="http://latex.codecogs.com/gif.latex?\sigma_{j}" alt="" /> as the <img decoding="async" src="http://latex.codecogs.com/gif.latex?j" alt="" />th diagonal element). We can then, using equation (2), write <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> as the sum of rank one matrices,</p>
<p><img decoding="async" title="\textrm{(3)  }X=\sum_{j=1}^{r} \sigma_{j}u_{j}v_{j}^{T}" src="http://latex.codecogs.com/gif.latex?\textrm{(3)  }X=\sum_{j=1}^{r} \sigma_{j}u_{j}v_{j}^{T}" alt="" /></p>
<p>Equation (3) gives us a way to approximate any rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?r" alt="" /> matrix <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> by a lower rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?k &lt; r" alt="" /> matrix. Indeed, given <img decoding="async" src="http://latex.codecogs.com/gif.latex?k &lt; r" alt="" />, form the <img decoding="async" src="http://latex.codecogs.com/gif.latex?k\textrm{th}" alt="" /> partial sum</p>
<p><img decoding="async" title="X_{k}=\sum_{j=1}^{k} \sigma_{j}u_{j}v_{j}^{T}" src="http://latex.codecogs.com/gif.latex?X_{k}=\sum_{j=1}^{k} \sigma_{j}u_{j}v_{j}^{T}" alt="" /></p>
<p>Then <img decoding="async" src="http://latex.codecogs.com/gif.latex?X_{k}" alt="" /> is a rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?k" alt="" /> approximation to <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" />. How good is this approximation? It turns out to be the best rank <img decoding="async" src="http://latex.codecogs.com/gif.latex?k" alt="" /> approximation you can get (this is the Eckart&#8211;Young theorem).</p>
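<p>Here is a quick NumPy sketch of the partial-sum construction (an illustration of mine, not the post&#8217;s code). A handy consequence worth checking: the Frobenius-norm error of the rank-<em>k</em> truncation equals the root-sum-square of the discarded singular values.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((10, 20))
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Rank-k partial sum of equation (3): sum of k rank-one terms.
k = 4
X_k = sum(s[j] * np.outer(U[:, j], Vt[j, :]) for j in range(k))

# The Frobenius-norm error of the rank-k truncation is the
# root-sum-square of the discarded singular values.
err = np.linalg.norm(X - X_k, 'fro')
print(np.allclose(err, np.sqrt(np.sum(s[k:] ** 2))))   # True
```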
<h2>Computing the Low-Rank Approximations Using NMath</h2>
<p>The NMath library provides two classes for computing the SVD of a matrix (actually eight, since there are SVD classes for each of the datatypes <code>Double</code>, <code>Float</code>, <code>DoubleComplex</code> and <code>FloatComplex</code>). There is a basic decomposition class for computing the standard, reduced SVD, and a decomposition server class for when more control is desired. Here is a simple C# routine that constructs the low-rank approximations for a matrix <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> and prints out the Frobenius norm of the difference between <img decoding="async" src="http://latex.codecogs.com/gif.latex?X" alt="" /> and each of its low-rank approximations.</p>
<pre lang="csharp">static void LowerRankApproximations( DoubleMatrix X )
{
  // Construct the reduced SVD for X. We will consider
  // all singular values less than 1e-15 to be zero.
  DoubleSVDecomp decomp = new DoubleSVDecomp( X );
  decomp.Truncate( 1e-15 );
  int r = decomp.Rank;
  Console.WriteLine( "The {0}x{1} matrix X has rank {2}", X.Rows, X.Cols, r );

  // Construct the best lower-rank approximations to X and
  // look at the Frobenius norm of their differences.
  DoubleMatrix LowerRankApprox =
    new DoubleMatrix( X.Rows, X.Cols );
  double differenceNorm;
  for ( int k = 0; k &lt; r; k++ )
  {
    LowerRankApprox += decomp.SingularValues[k] *
      NMathFunctions.OuterProduct( decomp.LeftVectors.Col( k ), decomp.RightVectors.Col( k ) );
    differenceNorm = ( X - LowerRankApprox ).FrobeniusNorm();
    Console.WriteLine( "Rank {0} approximation difference norm = {1:F4}",
      k + 1, differenceNorm );
  }
}</pre>
<p>Here&#8217;s the output for a matrix with 10 rows and 20 columns. Note that the rank can be at most 10.</p>
<pre>The 10x20 matrix X has rank 10
Rank 1 approximation difference norm = 3.7954
Rank 2 approximation difference norm = 3.3226
Rank 3 approximation difference norm = 2.9135
Rank 4 approximation difference norm = 2.4584
Rank 5 approximation difference norm = 2.0038
Rank 6 approximation difference norm = 1.5689
Rank 7 approximation difference norm = 1.1829
Rank 8 approximation difference norm = 0.8107
Rank 9 approximation difference norm = 0.3676
Rank 10 approximation difference norm = 0.0000</pre>
<p>-Steve</p>
<p>The post <a rel="nofollow" href="https://www.centerspace.net/theoretical-motivation-behind-pcr">Principal Component Regression: Part 1 &#8211; The Magic of the SVD</a> appeared first on <a rel="nofollow" href="https://www.centerspace.net">CenterSpace</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.centerspace.net/theoretical-motivation-behind-pcr/feed</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1307</post-id>	</item>
	</channel>
</rss>
