This report analyzes three groups of two variables each, with the aim of relating the variables within each group. The first group contains Moons1 and Moons2, the second Filename1 and Filename2, and the third Corners1 and Corners2. The variables in the first group have 9 samples of size 200. The second group also has 9 samples, but of size 400. The variables in the third group have 9 samples of 1000 observations. For each observation, the values that make up the variables were obtained by varying a parameter.
The first analysis is the correlation between the variables in the same group. The correlations were calculated using the same parameter value as reference, and the mean of the resulting correlation coefficients was then computed. In other words, we calculated the correlation between the first 9 values of Moons1 and the first 9 values of Moons2, then between the second 9 values, and so on, and did the same for the other groups. The table below shows the mean of these correlation coefficients for each group.
##                  Moons   Filename    Corners
## Correlation -0.9485886 -0.8969468 -0.8835189
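The averaging procedure described above can be sketched in a few lines. This is only an illustration, not the code used for the report: the helper names `pearson` and `mean_correlation`, the row-per-observation layout, and the toy numbers are all assumptions made for the example.

```python
from math import sqrt
from statistics import fmean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_correlation(var1, var2):
    """Average the per-observation correlations.

    var1[i] and var2[i] hold the values of observation i, one value per
    parameter setting (9 in the report), listed in the same parameter order.
    """
    return fmean(pearson(row1, row2) for row1, row2 in zip(var1, var2))


# Hypothetical toy data: 2 observations x 9 parameter settings.
moons1 = [[1, 2, 3, 4, 5, 6, 7, 8, 9],
          [2, 4, 5, 7, 8, 10, 11, 13, 15]]
moons2 = [[9, 8, 7, 6, 5, 4, 3, 2, 1],
          [30, 27, 26, 22, 21, 18, 17, 12, 10]]

print(round(mean_correlation(moons1, moons2), 4))
```

With the real data, each group would contribute 200, 400, or 1000 such rows, and the printed value would correspond to one entry of the table.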
As we can see, the correlations between the variables in the same group are negative and strong. This means that Moons1 and Moons2 are strongly inversely related: as one increases, the other tends to decrease. The same holds for Filename1 and Filename2, and for Corners1 and Corners2.
In addition to the correlation, the Mann-Whitney test, a rank-based test, was applied to compare the location of the two variables in each group. For each set of 9 observations from the variables belonging to the same group, a p-value was obtained. If the p-value is greater than 0.05, the null hypothesis is not rejected and the variables are treated as not differing. Otherwise, if the p-value is smaller than 0.05, the variables are considered to be significantly different. We then calculated the rejection ratio, that is, the proportion of tests with a p-value below 0.05, presented in the following table.
##                 Moons Corners Filename
## Rejection Ratio     1   0.892   0.5725
The table above indicates that Moons1 and Moons2 are always significantly different, Corners1 and Corners2 are different about 89.2% of the time, and Filename1 and Filename2 are different about 57.25% of the time.
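As a rough illustration of how such a rejection ratio can be computed, the sketch below implements an exact two-sided Mann-Whitney test for small samples and counts how often the p-value falls below 0.05. It assumes there are no tied values, and the function names (`count_u`, `mann_whitney_p`, `rejection_ratio`) and the toy data are hypothetical. It is not necessarily the same procedure or software used for the report.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_u(m, n, u):
    """Number of orderings of m x-values and n y-values with U == u (no ties)."""
    if u < 0:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    # The largest value is either an x (it beats all n y-values) or a y.
    return count_u(m - 1, n, u - n) + count_u(m, n - 1, u)


def mann_whitney_p(x, y):
    """Exact two-sided p-value of the Mann-Whitney U test, assuming no ties."""
    m, n = len(x), len(y)
    u = sum(1 for a in x for b in y if a > b)  # pairs where x exceeds y
    total = comb(m + n, m)
    lower = sum(count_u(m, n, k) for k in range(u + 1)) / total
    upper = sum(count_u(m, n, k) for k in range(u, m * n + 1)) / total
    return min(1.0, 2 * min(lower, upper))


def rejection_ratio(var1, var2, alpha=0.05):
    """Share of observations whose test rejects the null at level alpha."""
    p_values = [mann_whitney_p(r1, r2) for r1, r2 in zip(var1, var2)]
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical toy data: observation 0 is fully separated,
# observation 1 is perfectly interleaved.
var1 = [list(range(1, 10)), list(range(1, 18, 2))]
var2 = [list(range(10, 19)), list(range(2, 19, 2))]

print(rejection_ratio(var1, var2))  # 0.5 for this toy data
```

A ratio of 1, as reported for the Moons group, would mean that every one of the per-observation tests rejected the null hypothesis.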
In this section, several regression models were fitted to analyze the coefficient of determination, or R-squared. First, for each group, 9 models were fitted using the 9 observations of that group, and the mean of all the coefficients of determination was calculated. The values are shown in the following table.
##               Moons   Corners  Filename
## R-squared 0.1912636 0.4163375 0.2205763
The table shows that a linear regression model in which Moons1 is explained by Moons2 explains only 19.13% of the variability of Moons1. For the Corners variables the value is somewhat higher, 41.63%, and for the Filename variables it is 22.06%. Because these values were low, a different approach was necessary: linear regression models were fitted to the 9 observations with the same parameters for the variables. The results are shown in the following table.
##               Moons   Corners  Filename
## R-squared 0.9043593 0.8002564 0.8362099
As we can see, the models using the same parameters for the variables in each group yield a higher R-squared. For the Moons variables it is about 90.44%, for the Corners variables 80.03%, and for the Filename variables 83.62%.
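To make the difference between the two approaches concrete, the sketch below computes a mean R-squared under two ways of slicing the data. The text does not spell out the exact slicing, so this is one plausible reading and should be treated as an assumption: one fit per parameter setting across observations, versus one fit per observation across its parameter settings. The helper names and the toy data are hypothetical.

```python
from statistics import fmean


def r_squared(x, y):
    """R-squared of the least-squares line y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


def mean_r2_per_parameter(response, predictor):
    """One fit per parameter setting (column), across all observations."""
    columns = zip(zip(*response), zip(*predictor))
    return fmean(r_squared(p, r) for r, p in columns)


def mean_r2_per_observation(response, predictor):
    """One fit per observation (row), across its parameter settings."""
    return fmean(r_squared(p, r) for r, p in zip(response, predictor))


# Hypothetical toy data: 3 observations (rows) x 4 parameter settings.
# Each row is exactly linear, but the rows have unrelated offsets.
predictor = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
response = [[9, 8, 7, 6], [11, 10, 9, 8], [7, 6, 5, 4]]

print(mean_r2_per_parameter(response, predictor))
print(mean_r2_per_observation(response, predictor))
```

In this toy example the relationship is clean within each observation but blurred across observations, which is the same qualitative pattern as the gap between the two R-squared tables.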