Blog

Archive for the ‘NMath’ Category

Searching for Roots in Univariate Functions

Wednesday, July 21st, 2010

Finding the roots of a univariate function is a basic numerical procedure that nearly all engineers and scientists encounters at some point, because the root finding question addresses the basic equation,


where, x can be a real or complex variable. Or more generally, we can solve,

by finding the roots to,

Algorithms

I will demonstrate three well known univariate root finding algorithms, Newton-Raphson, Ridders, and Secant. All three methods require a delegate to the function and a bracketed range where a root is expected to reside, additionally, Newton-Raphson requires a continuous, non-zero derivative of f(x)in the region of the root. None of these methods are recommended for finding roots of polynomials in general, although Newton-Raphson will work well if the polynomial isn’t ill-conditioned by containing an even number of duplicated roots (Bracketing such roots isn’t possible because the function doesn’t change sign in the region of these roots.), or by containing root pairs extremely close together.

Name Convergence Rate Root remains bracketed? Comments
Newton-Raphson Quadratic Yes Often used for root polishing.

Secant Golden Ration – 1.618 No Fast in locally linear regions.

Ridders Quadratic Yes Frequently competitive with the more complex Brent’s method

C# Example with NMath

All of these roots finders have a similar interface and properties in NMath, so exchanging one with the other is easy. As seen in the example below, both the Ridders and Secant root finders implement the IOneVariableRootFinder interface, and the Newton-Raphson root finder implements the slightly different IOneVariableDRootFinder, where the ‘D’ stands for derivative.

Using Wolfram|Alpha’s online services, I’ve computed and graphed the roots of the following non-linear univariate function.
Root finding test function

The code block below demonstrates the use of each of these root finders, and shows the how to define a function delegate using a lamda expression. I find lamda expressions are convenient for defining delegates to mathematical expressions and that they create very readable code. (As an aside, lamda expressions are a C# language feature, and are not part of the .NET framework. Therefore, applications targeting the .NET 2.0 framework can leverage lamda expressions as long as the compiler supports them, as VS9 and VS10 both do.).

using CenterSpace.NMath.Analysis;
 
void RootFindingExample()
    {
 
     // Create the function delegate with a lamda expression, 
     // and the interval which contains a root.
      OneVariableFunction f = 
        new OneVariableFunction(x => 9 * Math.Log(x) * Math.Sin(x) - x * x * x);
      Interval bracket = new Interval(1, 1.5, Interval.Type.Closed);
 
      IOneVariableDRootFinder nrRoots = new NewtonRaphsonRootFinder(3);
      Double rootnewton = nrRoots.Find(f, f.Derivative(), bracket);
      Console.WriteLine(String.Format("Newton-Raphson root is = {0}", rootnewton));
 
      IOneVariableRootFinder rootFinder = new RiddersRootFinder(3);
      Double rootridder = rootFinder.Find(f, bracket);
      Console.WriteLine(String.Format("Ridder's root is = {0}", rootridder));
 
      rootFinder = new SecantRootFinder(3);
      Double rootsecant = rootFinder.Find(f, bracket);
      Console.WriteLine(String.Format("Secant root is = {0}", rootsecant));
 
    }

With each root finder limited to just 3 iterations, this program creates the output,

Newton-Raphson root is = 1.26760302048967
Ridder's root is = 1.26760302063032
Secant root is = 1.3334994219432

Where Newton-Raphson found the first root most accurately with it’s knowledge of derivative information. Besides specifying the maximum number of iterations, a root tolerance can also be used to control convergence effort.

A second somewhat more difficult example which can cause problems for both the Secant and Newton-Raphson methods.
Example root finding function.

I gave the Secant method 1000 iterations, and still it’s root estimate was very poor – in fact with an initial bracket of [-5, -0.01], the Secant method will never converge on the root. There is a stable orbit around which the Secant method spins forever, preventing convergence. Because the Newton-Raphson requires a continuous derivative any starting bracket that contains the origin will cause an exception to be thrown because of the discontinuous derivative at zero.

Newton-Raphson root is = -1.05539938820662 (with 8 iterations)
Ridder's root is = -1.05539938548287 (with 8 iterations)
Secant root is = -0.0912954387917323 (with 1000 iterations)

As a general recommendation, I would start my root search using Ridder’s method and then finish up by polishing with Newton-Raphson.

-Happy Computing,
Paul

  • Share/Bookmark

July Release of NMath and NMath Stats

Tuesday, June 8th, 2010

We are currently working hard on our upcoming July release of our NMath and NMath Stats C# math libraries. This release will add many new features from Runge-Kutta to automatic Peak Finding algorithms as well as address our most frequent support requests. Developers using our math libraries currently will find the new release build-compatible with the prior release. Upgrades are provided free of charge to customers with current annual maintenance contracts.  Maintenance contracts are available through our webstore.

Pure C#

Both libraries are now supported by a new pure C# math kernel doing away with our old C++ kernel. Because we are now a pure .NET assembly, deployment of NMath based applications is simplified by eliminating the Microsoft C++ runtime library dependency. As with all releases we will be posting our updated performance benchmarks at the time of the release.

Full Control

Additionally, our libraries have been re-architected to dynamically link to both native numerical libraries and ( and perhaps more importantly for our customers ) the Intel OMP threading library (libiomp.dll) . This means that our customers will have complete control over the threading library. In the past, we statically linked in OMP. Now, we are picking up OMP dynamically and thereby avoid collisions between statically and dynamically linked OMP libraries. In a nutshell, NMath will now play more nicely with libraries from other vendors.

New Features

Now for the fun stuff. The table below summarized the new features in NMath 4.1 and NMath Stats 3.2.

Product Feature Summary
NMath 4.1 Savitzky-Golay derivatives Class generates correlation coefficients to compute the smoothed Savizky-Golay derivatives of sampled data.
Savitzky-Golay smoothing See blog article on SG smoothing
Peak Finding Class finds peaks in sampled data using Savitzky-Golay smoothed polynomials and their derivatives.
Runge-Kutta ODE solver Class for solving ODE’s
Bounded function fitting Class for fitting general nonlinear models with bounds on the parameters. Also see this blog article for code examples of bounded nonlinear curving fitting.
Correlated random number generators Class creates streams of induced correlated random numbers typically for simulation studies using Monte Carlo.
NMath Stats 3.2 Johnson System of distributions The Johnson system of distributions is based on three possible transformations of a normal distribution–exponential, logistic, and hyperbolic sine–plus the identity transformation:

X = xi + (lambda * T((z – gamma) / delta))

where z is a standard normal random variable, xi and lambda are shape parameters, delta is a scale parameter, gamma is a location parameter, and T is the transformation.

Kruskal-Wallis rank sum test The Kruskal-Wallis rank sum test is a non-parametric test for equality of population medians among groups. It is a non-parametric version of the classical one-way ANOVA.
Regression statistics for PolynomialLeastSquares [see below]
Regression statistics for OneVariableFunctionFitter Class provides a variety of regression statistics including the residual sum of squares, R squared, adjusted R squared, F statistic, and others.

I hope you find these new additions to the library useful in your application work. If you are looking for something specific that isn’t currently supported in our library, please contact us. We build custom numeric classes for existing and new customers on a regular basis.

Happy Computing,
Paul

  • Share/Bookmark

Non-linear Curve Fitting

Monday, April 12th, 2010

SIGA, a public biotechnology company, recently hired CenterSpace consultants to refine their logistic modeling capabilities. Modeling a dose-response system with a logistic curve is one important special case of the more general non-linear curve fitting problem. If you are familiar with linear regression and related statistics, the non-linear case closely parallels the linear case making the jump to non-linear curve fitting easier.

SIGA Human BioArmor
SIGA is a world leader in designing and developing novel countermeasures to prevent and treat serious infectious diseases, with an emphasis on biological warfare defense.

In a previous blog post, Ken outlined the techniques for using NMath to compute various common linear trend lines and non-linear trend lines linearizable via a simple variable substitution.  This post extends that curve fitting article to the non-linear case.

Non-linear Curving Fitting – The Logistic

The logistic model is a fundamental non-linear model for many systems, and is widely used in the life sciences, medicine, and environmental toxicology.

Non-linear curve fit with logistic model

Non-linear curve fit with logistic model


This image shows a fit of a 4-parameter logistic model to the measured inhibitory response of an infectious agent to a treatment at various drug dose levels – this is a classic dose-response curve.  The 4-parameter model was used here because the underlying physical process is expected to be symmetric, and is defined as:

a - response at x = 0
b - slope factor
c - mid-range (50%) concentration
d - response as x -> infinity

The 5-parameter logistic adds an additional asymmetry parameter that allow the leading and trailing tails to have different curvatures. It is defined as:

a - response at x = 0
b - slope factor
c - mid-range (50%) concentration
d - response as x -> infinity
g - asymmetry factor

Using NMath the following C# code fits the four-parameter logistic to the given data. By extending the initial conditions vector by one element, and changing the model function delegate we could use the same code to fit the FiveParameterLogistic function.

using CenterSpace.NMath.Core;
using CenterSpace.NMath.Analysis;
 
 public void NonlinearFitExampleCode()
 {
    // Make some synthetic data
    DoubleVector xValues = new DoubleVector("30000 10000 3000 500 200 60  25 10");
    DoubleVector yValues = new DoubleVector("112.8 115 131 118 26 1.7 -0.8 -1.4");
 
    // This is the function delegate that the curve fitting 
    // class will use - any delegate can be used.
    NMathFunctions.GeneralizedDoubleUnaryFunction f 
      = AnalysisFunctions.FourParameterLogistic;
 
    // The starting point in the function parameter space.
    DoubleVector start = new DoubleVector("0.1 0.1 0.1 0.1");
 
    // Construct a curve fitting object for our function, then perform the fit.
    OneVariableFunctionFitter fitter 
      = new OneVariableFunctionFitter(f);
    DoubleVector solution = fitter.Fit(xValues, yValues, start);
 
    // Display solution
    Console.WriteLine("4 Parameter Logistic Model Parameters: {0}", solution.ToString("0.####"));
}

The model parameters are returned in a vector, in the order defined above [a, b, c, d].

4 Parameter Logistic Model Parameters:
  [ -0.1755 6.0556 246.8343 119.6108 ]

The fitting class OneVariableFunctionFitter is based on the Trust Region algorithm which is an extension of the Levenberg-Marquardt minimization algorithm. As used in the example above, the partial derivatives with respect to the parameters are numerically estimated. In our experience numerically estimating the partials works very well. However, for improved precision and performance it is also possible to explicitly specify delegates for the partials, using the property:

 NMathFunctions.GeneralizedDoubleUnaryFunction[] PartialDeriviates 

If the curvature of the function surface is low, as is typically the case for the logistic model, and the local linearized estimate closely approximates the function surface during the minimization process, the partial derivatives are not even evaluated for performance reasons.

Bounded Parameters

Because the OneVariableFunctionFitter is based on the Trust Region algorithm, it is possible to place bounds on the parameters. This is a very powerful feature, which is frequently employed when modeling a dose-response curve with a logistic. There’s an old debate between the chemists and biologists regarding the dose-response curve on whether the curve should be relative or absolute. In the image above, a relative dose-response curve is shown, as there were no bounds placed on the four parameters. However, we can create an absolute dose-response curve by adding the constraints 100 >= a >= 0 and 0 <= d <= 100 and re-fitting the curve. This will enforce all predicted responses to be between 0% and 100%.

Logistic model fit with parameter bounds

Logistic model fit with parameter bounds 100 >= a >= 0 and 0 <= d <= 100

Here is a C# code example on how to do this with NMath.

using CenterSpace.NMath.Core;
using CenterSpace.NMath.Analysis;
 
 public void BoundedNonlinearFitExampleCode()
 {
 
  // Make some synthetic data
  DoubleVector xValues = new DoubleVector("30000 10000 3000 500 200 60  25 10");
  DoubleVector yValues = new DoubleVector("112.8 115 131 118 26 1.7 -0.8 -1.4");
 
  // The starting point in the function parameter space.
  DoubleVector start = new DoubleVector("0.1 0.1 0.1 100.0");
 
  // 4-parameter bounds in the order of a, b, c, & d
  DoubleVector lowerBounds = new DoubleVector("0.0 0.0 0.0 0.0");
  DoubleVector upperBounds = new DoubleVector("100.0 100000.0 100000.0 100.0");
 
  // Convert the 4-parameter logistic to a Func<>
  Func<Double[], Double, Double> fitFunction = 
    new Func<double[], double, double>((double[] soln, Double x) => AnalysisFunctions.FourParameterLogisticFunction(new   DoubleVector(soln), x));
 
  // This class is not included with the current release.
  FunctionFitterWithBounds fitter = 
    new FunctionFitterWithBounds(fitFunction, lowerBounds, upperBounds);
 
  //Now define and fit the data as shown above.
  DoubleVector solution = fitter.Fit(xValues, yValues, start);
 
  // Display solution
   Console.WriteLine("4 Parameter Logistic Model Parameters: {0}", solution.ToString("0.####"));
}

The bounded model parameters returned, in the order defined above [a, b, c, d] are:

4 Parameter Logistic Model Parameters: [ 0 0.1652 0.787 100 ]

Note that the first and last parameters are now within the specified bounds.

The class FunctionFitterWithBounds is unreleased (and won't be in this form) but you can download it and use it if you have the latest release of NMath installed. The non-linear curving fitting NMath API will be extended and refined in our next NMath release, and bounded non-linear curve fitting will be more clearly accessible.

  • Click here to download the free C# FunctionFitterWithBounds class.

Goodness-Of-Fit Model Statistics

After we are done fitting our model, the fit may look good, but really how good is the fit? To better understand the goodness-of-fit of our model, we need to analyze the residuals of the fit and related statistics. The next release of NMath Stats (expected in June) will offer a new class GoodnessOfFit that will provide a complete set of statistical measures that can be used to assess the goodness-of-fit for a non-linear model, including the F-Statistic, t-Statistic, and parameter confidence intervals. Please contact us if you are interested in obtaining an advanced copy of this class.

Happy Computing,
Paul

Resources

  • The five-parameter logistic: A characterization and comparison with the four-parameter logistic, Paul G. Gotschalk, John R. Dunn, Analytical Biochemistry 343 (2005) pp 54-65.
  • Principles of Curve Fitting for Multiplex Sandwich Immunoassys, D. Davis, A. Zhang, C. Etienne, I. Huang, & M. Malit, tech note 2861.
  • A brief technical introduction to Trust Region methods, by Mark S. Gockenbach
    • Share/Bookmark

Excel Trendlines

Thursday, March 25th, 2010

We are sometimes asked how to reproduce the various Excel Trendline types in NMath, including printing out the form of the equation and the R2 value (coefficient of determination). Excel offers these trend types:

  • Linear
  • Logarithmic
  • Exponential
  • Power
  • Polynomial
  • Moving Average

These can all be easily computed using NMath. Let’s see how. First, let’s start with a simple data series:

DoubleVector x = new DoubleVector(11, 15, 18, 23, 26, 31, 39, 44, 54, 64, 74);
DoubleVector y = new DoubleVector(0.00476, 0.0105, 0.0207, 0.0619, 0.337, 0.74, 1.7, 2.45, 3.5, 4.5, 5.09);

These data represent the evolution of an algal bloom in the Adriatic Sea. The x-values are time expressed in days, and the y-values are the size of the surface of the bloom in mm2.

Linear

The Linear trendline fits a line, y = a*x + b, to the data. This is easily computed using the LinearRegression class in NMath Stats, as shown in the following C# code:

bool addIntercept = true;
LinearRegression lr = new LinearRegression(new DoubleMatrix(x), y, addIntercept);
LinearRegressionAnova lrAnova = new LinearRegressionAnova(lr);
double a = lr.Parameters[0];
double b = lr.Parameters[1];
double r2 = lrAnova.RSquared;
 
Console.WriteLine("y = {0}x + {1}", a, b);
Console.WriteLine("r2 = {0}", r2);

Output

y = -1.63908651994092x + 0.0913403802489977
r2 = 0.967828167696715

Note that to compute the coefficient of determination (R2), we construct a LinearRegressionAnova object from the LinearRegression instance. This class tests overall model significance for regressions computed by LinearRegression.

Logarithmic

The Logarithmic trendline fits a line to ln(x), y–that is, y = a*ln(x) + b. Again, we can use the LinearRegression class for this:

DoubleVector logX = NMathFunctions.Log(x);
lr = new LinearRegression(new DoubleMatrix(logX), y, addIntercept);
a = lr.Parameters[0];
b = lr.Parameters[1];
r2 = new LinearRegressionAnova(lr).RSquared;
 
Console.WriteLine("y = {0}ln(x) + {1}", a, b);
Console.WriteLine("r2 = {0}", r2);

Output:

y = -8.17805531083818ln(x) + 2.87283105614145
r2= 0.840393353070466

Exponential

The Exponential trendline fits the function y = a * e ^(b * x). This can also be computed using LinearRegression by fitting a line to x, ln(y). Taking the log of both sides of the function gives:

ln(y) = ln(a * e ^(b * x)) = ln(a) + bx

The following C# code does this:

DoubleVector logY = NMathFunctions.Log(y);
lr = new LinearRegression(new DoubleMatrix(x), logY, addIntercept);
a = Math.Exp(lr.Parameters[0]);
b = lr.Parameters[1];
r2 = new LinearRegressionAnova(lr).RSquared;
 
Console.WriteLine("y = {0}e^{1}x", a, b);
Console.WriteLine("r2 = {0}", r2);

Output:

y = 0.00552689525130073e^0.112876522212724x
r2 = 0.812366433832176

Note that because the intercept of the fitted line is the natural log of the “a” parameter in the exponential function, we need to take the exponential of the found intercept (using Math.Exp) to recover the value of “a”.

Power

The Power trendline fits the function y = a * x^b. This can be computed using LinearRegression by fitting a line to ln(x), ln(y). Taking the log of both sides of the equation gives:

ln(y) = ln(a * x^b) = ln(a) + b * ln(x)
lr = new LinearRegression(new DoubleMatrix(logX), logY, addIntercept);
lrAnova = new LinearRegressionAnova(lr);
a = Math.Exp(lr.Parameters[0]);
b = lr.Parameters[1];
r2 = new LinearRegressionAnova(lr).RSquared;
 
Console.WriteLine("y = {0}x^{1}", a, b);
Console.WriteLine("r2 = {0}", r2);

Output:

y = 2.46993343563889E-07x^4.11443630332377
r2 = 0.947447653331871

Again, we recover the value of parameter “a” by taking the exponential of the found intercept.

Polynomial

NMath provides class PolynomialLeastSquares for fitting a polynomial of the specified degree to paired vectors of x- and y-values. For example, this code fits a cubic to our data:

int degree = 3;
PolynomialLeastSquares pls = new PolynomialLeastSquares(degree, x, y);
 
// compute R2
DoubleVector predictions = pls.FittedPolynomial.Evaluate(x);
double regressionSumOfSquares = StatsFunctions.SumOfSquares(predictions - StatsFunctions.Mean(y));
double residualSumOfSquares = pls.LeastSquaresSolution.Residuals.TwoNormSquared();
r2 = regressionSumOfSquares / (regressionSumOfSquares + residualSumOfSquares);
 
Console.WriteLine("y = " + pls.FittedPolynomial);
Console.WriteLine("r2 = {0}", r2);

Output:

y = -4.68278158094222E-05x^3 + 0.00640408381023593x^2 -
      0.1643720340709x + 1.12703300920657
r2 = 0.998634459376868

Note that PolynomialLeastSquares does not currently provide the R2 value, so in the code above we compute it directly.

Moving Average

Unlike the trendlines we’ve examined so far, a moving average does not fit a functional form to data. Rather, it filters data in order to smooth out noise. NMath provides class MovingWindowFilter for this purpose.

Class MovingWindowFilter replaces data points with a linear combination of the data points immediately to the left and right of each point, based on a given set of coefficients to use in the linear combination. Static class methods are provided for generating coefficients to implement a moving average filter and a Savitzky-Golay smoothing filter.

For example, the following C# code filters the data using a window of width 3 (the “period” parameter in Excel):

int numberLeft = 1;
int numberRight = 1;
DoubleVector movingAvgCoefficents = MovingWindowFilter.MovingAverageCoefficients(numberLeft, numberRight);
MovingWindowFilter filter = new MovingWindowFilter(numberLeft, numberRight, movingAvgCoefficents);
DoubleVector yfiltered = filter.Filter(y, MovingWindowFilter.BoundaryOption.DoNotFilterBoundaryPoints);
Console.WriteLine("yfiltered = " + yfiltered);

Output:

yfiltered = [ 0.00476 0.0119866666666667 0.0310333333333333
              0.139866666666667 0.379633333333333 0.925666666666667 1.63
              2.55 3.48333333333333 4.36333333333333 5.09 ]

Advanced Curve Fitting

That covers the simple trendlines produced by Excel. For advanced curve fitting, NMath provides class OneVariableFunctionFitter which fits arbitrary one variable functions to a set of points. (You must supply at least as many data points to fit as your function has parameters.) In a future post, we’ll take a look at this class. In the meantime, see this page for an example demonstrating the use of OneVariableFunctionFitter, including how to define your own function.

Ken

  • Share/Bookmark

Using NMath in .NET 4.0 Applications

Wednesday, March 17th, 2010

Version 4.0 of the .NET Framework introduced a new runtime activation policy, which enables an application to activate multiple versions of the common language runtime (CLR) in the same process. (See here for more information.)

Because of this support for side-by-side runtimes, .NET 4.0 has changed the way that it binds to older mixed-mode assemblies, such as the NMath Kernel assembly, which is written in C++\CLI and uses the Intel Math Kernel Library internally.

When using NMath in .NET 4.0 applications, the NMath Kernel fails to load, because it was built against an earlier version of .NET and includes unmanaged code. The error is:

System.TypeInitializationException : The type initializer for
'CenterSpace.NMath.Core.NMathKernel' threw an exception.
----> CenterSpace.NMath.Core.KernelLoadException : Could
not load kernel assembly NMathKernelx86

There are two solutions:

  1. Activate the v2.0 runtime activation policy by adding the following lines to your app.config when using runtime v4.0:
    <configuration>
      <startup useLegacyV2RuntimeActivationPolicy="true">
        <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/>
      </startup>
    </configuration>

    See here for more information on the <startup> element.

  2. Request assemblies for .NET 3.5  from Centerspace Technical Support. (see below for correction)

In the next release of NMath, the shipping assemblies will be built against .NET 3.5. (Assemblies for .NET 2.0 will remain available upon request.) << see below for correction

Ken

Update

This issue can only be solved with the application config change or with assemblies built with .NET 4.0.

If you are widely deploying your application, you should probably target .NET 3.5 or earlier. In this case, the application config is the best solution.

If you know your application will be run within .NET 4.0 or later and don’t wish to make the application config change, please contact us at support@centerspace.net. We can send you NMath assemblies built and tested with .NET 4.0.

Since we don’t use any .NET 3.5 features and it’s too early to target .NET 4.0 with a class library, we will be targeting .NET 2.0 in the next release (expected in June, 2010).

  • Share/Bookmark