Abstract

This project tests R's default optim() and nlm() functions on five test functions: Beale, Booth, Matyas, Rosenbrock, and Easom. For the tests, random starting values are generated between -19 and 21, and each optimiser is run 40,000 times on every function. optim() is run with its default Nelder-Mead method on the Booth and Rosenbrock functions, using a control list with an absolute tolerance of 1e-16. Other methods, such as BFGS and CG, are tested with the same control-list settings on the Easom, Matyas, and Beale functions.
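Below is a minimal sketch of the testing set-up described above, using the Beale function as the example; the object names (beale, results, n.runs) and the use of set.seed() are assumptions for illustration, not the project's actual code.

    # Test harness: 40,000 random starting pairs drawn from [-19, 21], each
    # passed to optim() with the default Nelder-Mead method and abstol = 1e-16.
    beale <- function(p) {
      x <- p[1]; y <- p[2]
      (1.5 - x + x*y)^2 + (2.25 - x + x*y^2)^2 + (2.625 - x + x*y^3)^2
    }

    set.seed(1)
    n.runs  <- 40000
    results <- vector("list", n.runs)
    for (i in seq_len(n.runs)) {
      start <- runif(2, min = -19, max = 21)     # random starting point
      fit   <- optim(start, beale,
                     method  = "Nelder-Mead",    # optim()'s default method
                     control = list(abstol = 1e-16))
      results[[i]] <- list(start = start, fit = fit)
    }

Each fit returned by optim() carries $par, $value, $convergence and $counts, which are the quantities examined in the plots below.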

Optim

All five functions were tested with optim(), but the report only includes plots for the functions, highlighted in amber, that showed good convergence.

Nlm

All five functions were tested with nlm(), but the report only includes plots for the functions, highlighted in amber, that showed good convergence.
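For comparison, a minimal sketch of an nlm() run on the same kind of test function (reusing the beale definition from the sketch above; the tightened gradtol is an assumption standing in for the abstol used with optim()):

    start <- runif(2, min = -19, max = 21)
    fit   <- nlm(beale, start, gradtol = 1e-16)
    fit$estimate    # final parameter values
    fit$minimum     # final function value
    fit$gradient    # gradient at the final estimate
    fit$code        # 1 or 2 usually indicates convergence; 3, 4 or 5 indicate problems

Unlike optim(), which returns a $convergence flag, nlm() reports its stopping reason through the integer $code component.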

Nelder-Mead Method:

1. Beale Function:

A. Evidence of Convergence:

In the plot, red points mark starting pairs that did not converge, while blue points converged fully. Most of the unconverged pairs lie close to either the x-axis or the y-axis, and their first input parameter is almost always negative.
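A sketch of how this claimed-convergence plot can be produced from the results list of the earlier harness sketch (blue for claimed convergence, red for claimed failure):

    starts  <- t(sapply(results, function(r) r$start))              # 40,000 x 2 matrix of starting pairs
    claimed <- sapply(results, function(r) r$fit$convergence == 0)  # TRUE when optim() claims success
    plot(starts, col = ifelse(claimed, "blue", "red"), pch = 16,
         xlab = "starting x", ylab = "starting y",
         main = "Beale, Nelder-Mead: claimed convergence")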

Method CG for comparison:

This plot shows many more red (unconverged) points, while the blue points converged fully. The unconverged pairs are spread all along both the x-axis and the y-axis, and their first input parameter again almost always starts from a negative value.
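The CG comparison only changes the method argument of the same call (a sketch, assuming the same control settings):

    fit.cg <- optim(start, beale, method = "CG",
                    control = list(abstol = 1e-16))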

Nelder-Mead Evidence of Actual Convergence:

CG Evidence of Actual Convergence:

For both the default (Nelder-Mead) and CG methods, only three "ok" points, shown in red, actually converged, i.e. satisfied err.norm < 1e-5.
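Here err.norm is taken to be the Euclidean distance between the returned parameters and the known Beale minimum at (3, 0.5); this definition is an assumption, sketched below using the results from the earlier harness.

    beale.opt <- c(3, 0.5)                      # true minimiser of the Beale function
    err.norm  <- sapply(results, function(r) sqrt(sum((r$fit$par - beale.opt)^2)))
    sum(err.norm < 1e-5)                        # runs that actually reached the minimum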

B. Initial & Final Values:

The function returned only a few optimal points. The points near (3, 0.5), with 3 on the x-axis and 0.5 on the y-axis, are the optimal points; the adjoining points can be considered only partially converged. The function also returns several unconverged points, located near (-600, 0) and (0, -250).

C. Function value for actual convergence:

The three "ok" points, shown in red, lie very close together and overlap in the graph. Their function values are very small, below 1e-8.

D. Function Values for un-convergence:

Some red points are those that optim() itself reports as unconverged; for these points, optim() returns a non-zero convergence code.
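These codes can be tallied directly from the stored fits (a sketch using the objects from the earlier harness):

    # $convergence: 0 = success, 1 = iteration limit 'maxit' reached,
    # 10 = degeneracy of the Nelder-Mead simplex
    codes <- sapply(results, function(r) r$fit$convergence)
    table(codes)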

E. Function Values:

A few points lie on the line y = -8, and very few below it; these are the best values. The other points (above -5) are not good and appear widely dispersed.

F. Gradient Graph:

At an optimal solution the gradient should be zero. In the graph, most values lie close to zero, with a few points far away, reaching values around 8000 on the high side and around -2000 on the low side.
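Assuming no analytic gradient was supplied, the gradients plotted here could be approximated by central differences at each returned solution, roughly as follows (a sketch):

    num.grad <- function(f, p, h = 1e-6) {      # central-difference gradient
      sapply(seq_along(p), function(i) {
        e <- replace(numeric(length(p)), i, h)
        (f(p + e) - f(p - e)) / (2 * h)
      })
    }
    grads <- t(sapply(results, function(r) num.grad(beale, r$fit$par)))
    # near an optimum both gradient components should be close to zero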

G. Starting Values Vs Final Values:

The final points span a wide range of function values. Some of the good final points attain function values that are very small (around 1e-10), nearly zero.






2. Booth Function (Nelder-Mead):

A. Evidence of Convergence:

optim() claims that 100% of the results converged.

B. Initial & Final Values:

The output parameters gather near the true optimal point (1, 3). Surprisingly, most of these points did not actually converge: optim() claims to have found the minimum on every run, but those claims are not reliable.
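For reference, the Booth function has its global minimum of 0 at (1, 3); a direct R definition (a sketch) is:

    booth <- function(p) {                      # Booth test function, minimum 0 at (1, 3)
      x <- p[1]; y <- p[2]
      (x + 2*y - 7)^2 + (2*x + y - 5)^2
    }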

C. Function value for actual convergence:

Only one point actually converged, which shows that optim()'s claim of 100% convergence is not reliable. The function value (FV) of the converged point is quite small, around 1e-9.

D. Function Values for un-convergence:

With the default method, optim() claimed that every set of input parameters converged, so there are no unconverged points in this graph. Most of the points that optim() claims converged achieved FVs between 0 and 1e-8; although they did not actually converge, their FVs are very small, and some even fall below 1e-8.

F. Gradient Graph:

At an optimal solution the gradient should be zero. For most runs, optim() returns parameter pairs whose gradient values lie near (0, 0), but a few points lie away from zero; the furthest is near (1, -1.5).

G. Starting Values Vs Final Values:

The final points span a wide range of function values. Some good final points attain function values that are very small (below 1e-8), nearly zero.

BFGS Method:

1. Beale Function:

A. Evidence of claimed Convergence:

In the claimed-convergence plot there are many more good (blue) points; all the unconverged pairs of input parameters are shown in red.
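The BFGS runs again only swap the method argument (a sketch; when no gr argument is given, optim() approximates the gradient by finite differences):

    fit.bfgs <- optim(start, beale, method = "BFGS",
                      control = list(abstol = 1e-16))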

B. Evidence of actual Convergence:

In the plot, yellow points did not converge, while the red ("ok") points converged. There are also some good points, shown in green.
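The good / ok / unconverged labels used in these plots can be reproduced by thresholding err.norm, computed as in the earlier sketch but from the BFGS fits; the cut-offs of 1e-8 and 1e-5 are assumptions for illustration.

    status <- cut(err.norm,
                  breaks = c(-Inf, 1e-8, 1e-5, Inf),
                  labels = c("good", "ok", "not converged"))
    table(status)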

C. Initial & Final Values:

The function returned only a few optimal points. The points near (3, 0.5), with 3 on the x-axis and 0.5 on the y-axis, are the optimal points; the adjoining points can be considered only partially converged. The function also returns several unconverged points, located near (-600, 0) and (0, -50).

D. Function value for actual convergence:

There are some "ok" points, in red, near or below 1e-10; they lie very close together and overlap in the graph. There are also four good points, marked as green triangles. Surprisingly, all of the actually converged points occur at a very early stage.

E. Function Values for un-convergence:

There are some pink points that optim() itself reports as unconverged; for these points, optim() returns a non-zero convergence code.

F. Function Values:

Multiple points lie on the line y = -9, and several fall below it; these are the best values. The other points (above 0) are not good.

G. Gradient Graph:

At an optimal solution the gradient should be zero. In the graph, most values lie close to zero, with a few points far away. The gradient values of the first parameter clearly vary over a wider range than those of the second.

H. Starting Values Vs Final Values:

The final function values mostly lie near 1e-9 and 1e-10. Some good final points, plotted as blue dots, attain function values that are very small (below 1e-11), nearly zero.

2. Rosenbrock Function (BFGS):

A. Evidence of claimed Convergence:

In the claimed-convergence plot there are many more converged points (blue) than unconverged points (red). Most of the unconverged points follow the same pattern, running parallel to the y-axis.

B. Evidence of actual Convergence:

In the plot, yellow points did not converge, while the red ("ok") points converged. There are also some good points, shown in green.

C. Initial and Final Values:

The final points form a parabolic shape. A few optimal points lie near the true minimum at (1, 1), while the others are far away; one returned point even has a second coordinate of 60000.
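For reference, the Rosenbrock function has its global minimum of 0 at (1, 1), and its narrow valley follows the parabola y = x^2, which explains the parabolic shape of the final points. A direct R definition (a sketch) is:

    rosenbrock <- function(p) {                 # Rosenbrock test function, minimum 0 at (1, 1)
      x <- p[1]; y <- p[2]
      (1 - x)^2 + 100 * (y - x^2)^2
    }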

D. Function value for actual convergence:

There are 466 "ok" points, shown as red triangles, and 8 good points marked as green triangles. These points achieved very small FVs (less than 1e-10), close to 0.

E. Function Values for un-convergence:

Some pink points are those that optim() reports as unconverged; for these points, optim() returns a non-zero convergence code. Interestingly, some claimed-converged points have quite high FVs (above 1e+3).

F. Function Values:

Most of the FVs lie on the lines 1e+1, 1e+2, 1e+3, 1e-7, and 1e-8; a few other points attain FVs even below 1e-10.

G. Gradient Graph:

At an optimal solution the gradient should be zero. In the graph, most values lie close to zero, with a few points far from it.

H. Starting Values Vs Final Values:
