I'm using the Meta.Numerics class UncertainMeasurementSample to fit a logistic function of the form y = A / (1 + C * exp(B * x)) to some real data representing price decay. Unfortunately, I keep getting Nonconvergence exceptions. I'd appreciate any help; the reproducible IronPython code is listed below:
import math
import clr
clr.AddReference('Meta.Numerics')  # load the Meta.Numerics assembly
from Meta.Numerics.Statistics import *
# real data - no convergence
x = [65, 53, 53, 41, 41, 41, 41, 41, 41, 41, 29, 29, 29, 29]
y = [31.04, 37.01, 47.78, 51.34, 51.47, 51.32, 51.45, 52.04, 56.26, 56.28, 58.55, 57.78, 57.78, 57.79]
# artificial data - converges
#x = [0,1,2,3,4,5,6,7]
#y = [4,6,10,16,24,34,46,58]
# artificial data - no convergence
#x = [15, 14, 13, 12, 11, 10, 9, 8]
#y = [99, 98, 98, 94, 91, 86, 79, 69]
# a[0] ~ A, a[1] ~ B, a[2] ~ C, where y = A / (1 + C*exp(B*x))
def logistic(a, x): return a[0] / (1 + a[2] * math.exp(a[1] * x))
# a[0] ~ A, a[1] ~ B, where y = A*exp(B*x)
def exponential(a, x): return a[0] * math.exp(a[1] * x)
A = 120; B = 0.02; C = 0.2
sample = UncertainMeasurementSample()
for i in range(0, len(x)):
    sample.Add(x[i], y[i], 1.0)
    print x[i], y[i]
# fit exponential first - bad fit, but stable
fit1 = sample.FitToFunction(exponential, (A, B))
print 'GoodnessOfFit (exp): %s' % fit1.GoodnessOfFit.LeftProbability
print 'A: %s' % fit1.Parameter(0)
print 'B: %s' % fit1.Parameter(1)
# logistic - should be better fit, but bombs out
try:
    fit2 = sample.FitToFunction(logistic, (A, B, C))
    print 'GoodnessOfFit (log): %s' % fit2.GoodnessOfFit.LeftProbability
    print 'A: %s' % fit2.Parameter(0)
    print 'B: %s' % fit2.Parameter(1)
    print 'C: %s' % fit2.Parameter(2)
except Exception, ex:
    print ex
Mar 15, 2013 at 2:24 AM
Great to see people using the library from Iron Python! And thanks for the detailed repro. I will take a look and see what I can figure out.
Mar 18, 2013 at 8:17 PM
For your real data, starting from your initial values (120, 0.02, 0.2), the minimization algorithm simply needs more iterations to reach the minimum than it is pre-programmed to allow. If I monitor where it is heading and instead start from nearby initial values (60, 0.07, 0.01), it converges to a minimum and returns a result. Also note that, in the end, your data don't fit your function all that well: chi-squared per degree of freedom is 8, while for a good fit you expect it to be about 1. For your example that converges, the fit is much better.
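The chi-squared-per-degree-of-freedom figure quoted above can be checked by hand. The sketch below is a hypothetical helper, not part of Meta.Numerics (which reports goodness of fit via the FitResult's GoodnessOfFit property); it assumes unit measurement errors, matching the `sample.Add(x[i], y[i], 1.0)` calls in the repro.

```python
def reduced_chi_squared(xs, ys, model, n_params):
    """Chi-squared per degree of freedom, assuming unit errors on each y.

    For a good fit this should come out near 1; values much larger than 1
    (like the 8 seen for the real data above) signal a poor model.
    """
    chi2 = sum((yi - model(xi)) ** 2 for xi, yi in zip(xs, ys))
    dof = len(xs) - n_params  # data points minus fitted parameters
    return chi2 / float(dof)
```

For example, fitting the identity model y = x (one parameter) to the points (0, 0), (1, 1), (2, 2), (3, 4) gives chi-squared of 1 over 3 degrees of freedom.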
There are a few things you can do and a few things we can do.
- Have some logic to make reasonable initial guesses for parameter values. Averages, moments, and extrema are often useful for this.
- Realize that a bad fit is much more likely to thrash about searching for a minimum on a very non-quadratic surface, whereas a good fit quickly finds itself on a nice quadratic surface with a clear, unique minimum. For any given maximum allowed number of iterations, you are therefore much more likely to hit that limit for a bad fit than for a good one.
- The current number of iterations allowed is 33 * dimension. I have no idea where that number came from, and it seems rather low. We need to look into it.
- Really, we need to allow users to override our default for the allowed number of iterations, as we already do for integration.
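The first suggestion, seeding from extrema, can be sketched in plain Python for this particular model. `logistic_guesses` is a hypothetical helper (not a Meta.Numerics API), and the 1.05 asymptote factor is an arbitrary choice; the idea is to invert y = A / (1 + C*exp(B*x)) at the two extreme y points, which for the real data above yields starting values close to the (60, 0.07, 0.01) that converged.

```python
import math

def logistic_guesses(xs, ys):
    """Data-driven starting values for y = A / (1 + C*exp(B*x)).

    A is the upper asymptote, so start slightly above max(y); then invert
    the model, C*exp(B*x) = A/y - 1, at the smallest and largest observed
    y to solve for B and C. Assumes the extreme y's occur at distinct x's
    and that all y are positive.
    """
    A0 = max(ys) * 1.05                  # asymptote a bit above max(y)
    y_lo, x_lo = min(zip(ys, xs))        # point farthest below the asymptote
    y_hi, x_hi = max(zip(ys, xs))        # point closest to the asymptote
    g_lo = math.log(A0 / y_lo - 1.0)     # log(C) + B*x_lo
    g_hi = math.log(A0 / y_hi - 1.0)     # log(C) + B*x_hi
    B0 = (g_lo - g_hi) / (x_lo - x_hi)
    C0 = math.exp(g_lo - B0 * x_lo)
    return A0, B0, C0
```

Running this on the real data from the original post gives roughly A ~ 61, B ~ 0.08, C ~ 0.005, which could then be passed to FitToFunction in place of the hand-picked (120, 0.02, 0.2).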
Yes, I expected it had something to do with my initial values, but couldn't think of a solution. I'll give my application logic more thought, to see if I can find robust initial values based on the data.
Thanks for the support, and for a great library! It's an excellent complement to .NET scripting languages like IronPython and F#.