# Statistics

## Fitting the Weibull Distribution

The Weibull distribution is widely used in reliability analysis, hazard analysis, modeling part failure rates, and many other applications. The NMath library currently includes 19 probability distributions and has recently added a fitting function to the Weibull distribution class at the request of a customer. The Weibull probability distribution, over the random variable x, has two para...

## Principal Components Regression: Part 3 – The NIPALS Algorithm

Principal Components Regression: Recap of Part 2. Recall that the least squares solution to the multiple linear regression problem is given by (1) $\hat{\beta} = (X^T X)^{-1} X^T y$, and that problems occurred finding $\hat{\beta}$ when the matrix (2) $X^T X$ was close to being singular. The Principal Components Regression approach to addressing the problem is to replace $X^T X$ in equation (1) with a better conditioned...
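The conditioning problem in equation (1) can be seen on a toy example. The following is a minimal sketch in plain Python, not NMath code: the helper name `normal_equations_2d` and the sample data are hypothetical, and it handles only a two-column $X$ so that $(X^T X)^{-1}$ can be written out by hand. It returns the coefficient estimate together with the determinant of $X^T X$, which shrinks toward zero as the two predictor columns become nearly collinear.

```python
def normal_equations_2d(X, y):
    """Solve (X^T X) beta = X^T y for a two-column X.

    Illustrative helper (hypothetical name). Returns (beta, det),
    where det is the determinant of the 2x2 matrix X^T X.
    """
    # Entries of the symmetric matrix X^T X = [[a, b], [b, d]].
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    # Entries of the vector X^T y.
    g0 = sum(r[0] * yi for r, yi in zip(X, y))
    g1 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a * d - b * b
    # Closed-form inverse of a 2x2 matrix applied to X^T y.
    beta = ((d * g0 - b * g1) / det, (a * g1 - b * g0) / det)
    return beta, det


# Well-conditioned predictors: y was generated with beta = (2, 3).
X_good = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_good = [2.0, 3.0, 5.0]
beta_good, det_good = normal_equations_2d(X_good, y_good)

# Nearly collinear predictors: the second column is the first plus 0.0001.
X_bad = [[1.0, 1.0001], [2.0, 2.0001], [3.0, 3.0001]]
y_bad = [1.0, 2.0, 3.0]
beta_bad, det_bad = normal_equations_2d(X_bad, y_bad)
```

With the well-conditioned data the determinant is 3 and the coefficients are recovered cleanly. With the nearly collinear data the determinant is many orders of magnitude smaller, so the division by `det` amplifies any rounding or measurement error in the inputs.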

## Principal Components Regression: Part 2 – The Problem With Linear Regression

This is the second part in a three-part series on PCR; the first article on the topic can be found here. The Linear Regression Model. Multiple Linear Regression (MLR) is a common approach to modeling the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. First let’s set up some notation. I will be rather brief, ass...

## Principal Component Regression: Part 1 – The Magic of the SVD

Introduction. This is the first part of a multi-part series on Principal Component Regression, or PCR for short. We will eventually end up with a computational algorithm for PCR and code it up in C# using the NMath libraries. PCR is a method for constructing a linear regression model when we have a large number of highly correlated predictor variables. Of course, we don't ...

## Savitzky-Golay Smoothing in C#

Savitzky-Golay smoothing effectively removes local signal noise while preserving the shape of the signal. Commonly, it's used as a preprocessing step with experimental data, especially spectrometry data, because of its effectiveness at removing random variation while minimally degrading the signal's information content. Savitzky-Golay boils down to a fast (multi-core scaling) correlation operati...
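To illustrate the correlation view of the filter, here is a minimal sketch in plain Python rather than the C#/NMath implementation the post describes. It assumes the standard 5-point quadratic smoothing coefficients (-3, 12, 17, 12, -3)/35 and simply drops the two samples at each edge instead of padding; the names `SG5` and `savgol5` are hypothetical.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing weights.
SG5 = [c / 35.0 for c in (-3, 12, 17, 12, -3)]


def savgol5(signal):
    """Correlate `signal` with the 5-point Savitzky-Golay weights.

    Only positions where the full window fits are returned, so the
    output is 4 samples shorter than the input (no edge padding).
    """
    half = len(SG5) // 2
    smoothed = []
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        smoothed.append(sum(w * v for w, v in zip(SG5, window)))
    return smoothed


# A pure quadratic passes through unchanged in the interior, which is
# the shape-preserving property the filter is designed for.
quadratic = [float(x * x) for x in range(10)]
smoothed_quadratic = savgol5(quadratic)

# An isolated spike is attenuated and spread over its neighbours.
spike = [0.0, 0.0, 0.0, 35.0, 0.0, 0.0, 0.0]
smoothed_spike = savgol5(spike)
```

Because the weights come from a local least-squares polynomial fit, any polynomial up to degree three is reproduced exactly, while an isolated spike of height 35 comes out as roughly 12, 17, 12 across the three interior positions.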