Example 2 - Book Chapter- Nonlinear Regression Models

Solution:

Fitting procedure

As it was already stated, the first step in fitting a nonlinear model is to take a look at the data points and see how they look like (Fig 1). In the previous example, we found that by using the original Beta growth function some of the assumptions wouldn’t be valid anymore. So in this example we fit different sigmoid functions including the modified version of Beta to find the model which can represent better the observed yield.

Figure 1: Observed yield during the growth season

So using the ‘data’ function we import the data set from ‘nlraa’ package and subsequently extract the desired predictor (Day of the year) and response variable (yield in Mg/ha) for the Maize and the first block. In the previous example we explained how to use ‘nls’ function to fit our nonlinear model to data. In the following block of code, you can find how the ‘nls’ function has been called and constructed with getting help form ‘nlraa’ package for fitting different nonlinear models for a set of predictor (X which is day of the year) and response variable(Y which is yield in Mg/ha).

############### Nonlinear Regression for Gompertz function
reg.Gompertz= nls(Y~(Y.asim)*exp(-exp(-k*(X-t.m))),data=Maize.B1.low, start=list(Y.asim=12.1,k=0.08,t.m=240.5))
############### Nonlinear Regression for Logistic function
reg.Logistic= nls(Y~(Y.asim)/(1+exp(-k*(X-t.m))),data=Maize.B1.low, start=list(Y.asim=12.1,k=0.08,t.m=240.5))
############### Nonlinear Regression for Richards function
reg.Richards= nls(Y~rich(X, t.m, k, v, Y.asim),data=Maize.B1.low, start=list(Y.asim=14.1,k=0.07,t.m=200,v=0.6))
############### Nonlinear Regression for Beta function 
reg.Beta= nls(Y~SSbgf(X, Y.asim, t.e, t.m),data=Maize.B1.low, start=list(Y.asim=10.1,t.e=250,t.m=220.1))
############### Nonlinear Regression for Beta function 2
reg.Beta2= nls(Y~bgf2(X, w.max, w.b, t.e, t.m, t.b),data=Maize.B1.low, start=list(w.max=24,t.e=252,t.m=215,w.b=0,t.b=141))
############### Nonlinear Regression for Weibull function
reg.Weibull= nls(Y~weibull((X-140),a = a, b = b, Yo=Y.asim),data=Maize.B1.low, start=list(Y.asim=21.0,a=0.0000009,b=3.3))

In the above code, the last four ‘nls’ functions have been called differently. In these parts of the code instead of constructing the nonlinear model, some predefined functions describing the models (from ‘nlraa’ package) have been used to avoid the confusion.

Figure 2: Simulated yield versus observed for different models and high agricultral inputs

As a results of fitting different nonlinear growth models, the following coefficients were obtained. Obviously, each model has reached to a different value for maximum yield, but before we judge about that we need to find out which one those models has shown less bias in the assumptions.

Model	Y.asim	K	t.m	Mean of residuals
Gompertz	23.788	0.049	199.49	0.124
Logistic	22.529	0.083	206.872	0.024
Richards	21.98	0.135	212.105	-0.064
Beta	25.001	-	225.783	-0.265
Beta2	23.278	-	210.774	-1.26510^{-8}
Weibull	22.126	-	-	0.081

Checking the model assumptions

Residuals mean

The mean of residuals is presented in the last column of the above table. The closest value to the zero belongs to the modified version of Beta and Logistic function while, the original Beta function has the largest absolute value.

Equality of the variance

Equality of variance are investigated using a scatter plot between the residual and predicted response variable (Yield) .

Normality

The normality of the residuals has the least importance and also given the fact that the residuals have a small sample size, this test won’t play a major role in our decision.

Figure 3: Here you see some interesting stuff about cars and such.

Based on figures presented above, among all the models the modified Beta, Logistic, Richards and Weibull show the least bias from the assumptions. On the other hand, Gompertz and original Beta function had unacceptable large mean of residuals as well as clear pattern in their residuals. So from now on, we try to stick with the four models had the least bias and perform the further analysis on them.

Statistical descriptors

For those models passed the checking assumption process, the statistical descriptors were calculated to find out which one of them works the best. Cheking out the table below shows that the modified version of Beta and Richards model had very close and best performances. Different statistical indices help us to learn more about the models performances. For example, results show tha the Logistic and Weibull model understimate in general while, the other two model tend to overestimate.

Model	RMSE	NRMSE (%)	Bias	\(R^2\)	EF	AIC	BIC
Logistic	0.59	6.062	-0.024	0.996	0.995	32.232	33.021
Richards	0.461	4.697	0.064	0.997	0.997	21.62	22.606
Beta2	0.463	4.749	1.26510^{-8}	0.997	0.997	23.7	24.883
Weibull	0.542	5.602	-0.081	0.997	0.996	22.526	23.315

Discussion

The first lesson we learned in this example was the fact that how the original Beta function was modified to capture more effectivly the maize growth at the cost of having two extra parameters. So there would be always a trade off between the complexity of the models and their performance, and it would be the modelers expert judgment to choose between the loss of accuracy or dealing with more complicated statistics/programming problem.

It’s also necessary to mention that, what we found as the result of this example would be valid just for the this specific plot, crop and agricultural input and it can not be generalized to other situations.

In the much more complicated problems, you may deal with sets of predictor and response variables in the same time (for example for different blocks or treatments), and you may look for a model that works the best for all those cases. In these types of problems ‘nlsList’ and/or ‘nlme’ (for addition of mixed model effect) function can be used. Further details can be found on the supplementary material of Archontolis and Miguez (2014).

Example 2 - Book Chapter- Nonlinear Regression Models

Hamze Dokoohaki

Tuesday, November 21, 2014

Example 2.

Find the best model represntative of the obeserved yield in high agricultural input and first block.