This project has moved. For the latest updates, please go here.

Nonconvergence exception when fitting logistic function to data

Mar 12, 2013 at 7:45 PM
Hi,

I'm using the Meta.Numerics class UncertainMeasurementSample to fit a logistic function of the form y = A / (1 + C * exp(B * x)) to some real data representing price decay. Unfortunately, I keep getting Nonconvergence exceptions. I'd appreciate any help - the reproduceable IronPython code is listed below:
import sys
import math
import clr
clr.AddReference('Meta.Numerics')
from Meta.Numerics.Statistics import *

# real data - no convergence
x = [65, 53, 53, 41, 41, 41, 41, 41, 41, 41, 29, 29, 29, 29]
y = [31.04, 37.01, 47.78, 51.34, 51.47, 51.32, 51.45, 52.04, 56.26, 56.28, 58.55, 57.78, 57.78, 57.79]

# artificial data - converges
#x = [0,1,2,3,4,5,6,7]
#y = [4,6,10,16,24,34,46,58]

# artificial data - no convergence
#x = [15, 14, 13, 12, 11, 10, 9, 8]
#y = [99, 98, 98, 94, 91, 86, 79, 69]

# a[0] ~ A, a[1] ~ B, a[2] ~ C, where y = A / (1 + C*exp(B*x))
def logistic(a, x): return(a[0] / (1 + a[2] * math.exp(a[1] * x)))

# a[0] ~ A, a[1] ~ B, where y = A*exp(B*x)
def exponential(a, x): return(a[0] * math.exp(a[1] * x))

def fit_curves():
    A = 120; B = 0.02; C = 0.2
    sample = UncertainMeasurementSample()
    for i in range(0, len(x)):
        sample.Add(x[i], y[i], 1.0)
        print x[i], y[i]
    try:
        # fit exponential first - bad fit, but stable
        fit1 = sample.FitToFunction(exponential, (A, B))
        print 'GoodnessOfFit (exp): %s' % fit1.GoodnessOfFit.LeftProbability
        print 'A: %s' % fit1.Parameter(0)
        print 'B: %s' % fit1.Parameter(1)

        # logistic - should be better fit, but bombs out
        fit2 = sample.FitToFunction(logistic, (A, B, C))
        print 'GoodnessOfFit (log): %s' % fit2.GoodnessOfFit.LeftProbability
        print 'A: %s' % fit2.Parameter(0)
        print 'B: %s' % fit2.Parameter(1)
        print 'C: %s' % fit2.Parameter(2)

    except Exception, ex:
        print ex

fit_curves()
Coordinator
Mar 15, 2013 at 2:24 AM
Great to see people using the library from Iron Python! And thanks for the detailed repro. I will take a look and see what I can figure out.
Coordinator
Mar 18, 2013 at 8:17 PM
For your real data, starting from your initial values (120,0.02,0.2), it just takes the minimization algorithm longer to get to the minimum than the number of iterations it is pre-programmed to allow. If I monitor where it is going and start instead from initial values close to that (60, 0.07, 0.01), it converges to a minimum and returns a result. Also, note that in the end your data don't fit your function all that well: chi-squared per degree of freedom is 8, for a good fit you expect it to be 1. For your example that converges, the fit is much better.

There are a few things you can do and a few things we can do.

For you:
  1. Have some logic to make reasonable initial guesses for paramter values. Averages, moments, and extrema are often useful for this.
  2. Realize that a bad fit is much more likely to thrash about trying to find a minimum on a very non-quadratic surface than a good fit which quickly finds itself on a nice quadratic surface with a clear, unique minimum. For any given maximum allowed number of iterations, you are much more likely to reach that limit for a bad fit than for a good fit.
For us:
  1. The current number of interations allowed is 33 * dimension. I have no idea where that number came from, and it seems rather low. We need to look into it.
  2. Really we need to allow users to tell us a different value from our default for the allowed number of iterations, as we do for integration.
Mar 20, 2013 at 9:09 AM
Yes, I expected it had something to do with my initial values, but couldn't think of a solution. I'll give my application logic more thought, to see if I can find robust initial values based on the data.

Thanks for the support, and for a great library! It's a great complement to .Net scripting languages like IronPython and F#.