For centuries, apple quality has been one of the main reasons farmers achieve strong sales in the global market. High-quality apples give nations with large-scale orchards greater bargaining power with potential buyers. Major apple-producing countries include China and India in Asia, and Italy and Poland in Europe. In the United States of America, leading apple-growing states include Michigan, New York and California. Several methods, both natural and scientific, are used to preserve apple quality, and most commercial farmers have adopted them in order to maintain high sales in a competitive market. These methods include temperature and humidity control (32-39°F at 90-95% humidity), modified atmosphere packaging (MAP), wax coating and ethylene control. From an economist's perspective, quality prediction is especially valuable because it allows stakeholders to foresee whether output will increase, remain constant or decrease. Demand for high-quality apples in the competitive global market has risen in recent years, and quality predictions strengthen the stakeholders' approach to potential buyers. Near-infrared (NIR) spectroscopy has been one of the most widely used methods for predicting, evaluating and quantifying apple quality. It provides rapid, accurate and non-destructive analysis of apples and is also used in quality control and sorting operations. The method assesses both external and internal characteristics such as color, size, shape, surface defects, soluble solids content (SSC), total dry matter concentration and nutritional value.
This report uses machine learning to predict apple quality from attributes such as weight, size, crunchiness, juiciness and acidity. A machine learning model reduces workers' risk of exposure to spectrometer radiation, saves considerable time provided accurate readings are collected, and requires less technical expertise to operate.
This report will be used by policy makers to evaluate the methods used to predict the quality of apples. They will be able to identify methods that could harm the health of their people and those that benefit both the people and the environment. It will also enable the governments of these nations, through the economic sector, to determine how much revenue is yielded based on the quality predictions. Other researchers will be able to use this report as a foundation to identify gaps and to improve on previously used methods in order to obtain more accurate predictions.
This section of the report describes the procedures employed for data collection and data cleaning, and how the results were extracted for further insight.
The apple quality data set is secondary data acquired from the Kaggle website. The data was freely accessible and was downloaded as a comma-separated values (CSV) file. It comprised 4000 apples, each described by 8 variables: size, weight, sweetness, crunchiness, juiciness, ripeness, acidity and quality.
The table below shows the data set that was extracted from the Kaggle website.
Summary and descriptive statistics will be generated through univariate, bivariate and multivariate analysis. Relationships between the variables will be examined in order to determine the effect of each variable used to predict the fruit's quality. Furthermore, statistical tests, including two-sample t-tests and correlation tests, will be employed to test the hypotheses.
The data set was split into training and testing sets using a split ratio chosen from the range 0.65-0.9. The training set was used to build the models, whereas the testing set was used to generate the quality predictions and evaluate them. Supervised machine learning was used because the data set was labelled and highly structured. Decision trees, random forests and other classification methods were employed because the dependent variable was labelled and had two outcomes (binary).
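The split described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name, the seed and the 0.8 ratio are assumptions. A ratio of 0.8 is at least consistent with the confusion matrices in this report, each of which totals 800 of the 4000 apples.

```python
import random


def train_test_split_indices(n_rows, train_ratio=0.8, seed=42):
    """Shuffle row indices and split them into training and testing lists.

    train_ratio and seed are illustrative assumptions, not values
    stated in the report.
    """
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle reproducible.
    random.Random(seed).shuffle(indices)
    n_train = round(n_rows * train_ratio)
    return indices[:n_train], indices[n_train:]


# 4000 apples, as in the data set described above.
train_idx, test_idx = train_test_split_indices(4000, train_ratio=0.8)
print(len(train_idx), len(test_idx))
```

Splitting on shuffled indices rather than on the rows themselves keeps the sketch independent of how the data is stored.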
Model performance was evaluated using the confusion matrix, from which key measures such as accuracy, sensitivity, specificity and p-values were identified.
Based on the model performance results, the best model will be identified and relied on for the final predictions.
Quality
| Quality | Frequency | Percentage (%) |
|---|---|---|
| bad | 1996 | 49.9 |
| good | 2004 | 50.1 |
The table above indicated that a slight majority of the apples were of good quality (50.1%), while 49.9% were of bad quality.
The column chart above indicated that good quality apples (2004 apples) were slightly more numerous than bad quality apples (1996 apples).
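The percentages follow directly from the frequencies reported above. A short sketch of the arithmetic, with variable names chosen only for illustration:

```python
# Frequencies of each quality class as reported in the table above.
counts = {"bad": 1996, "good": 2004}
total = sum(counts.values())

# Convert each count to a percentage of all apples, rounded to one decimal.
percentages = {label: round(100 * n / total, 1) for label, n in counts.items()}
print(percentages)
```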
Size
| Minimum | Median | Average | StandardDeviation | Maximum |
|---|---|---|---|---|
| -7.2 | -0.5 | -0.5 | 1.9 | 6.4 |
The table above indicated that apple size ranged from -7.2 to 6.4, with an average of -0.5, a median of -0.5 and a standard deviation of 1.9 around the average. The negative values indicate that size is recorded on a scaled rather than a raw measurement scale.
The histogram above indicated that the size of the apples was approximately normally distributed, with most of them having a size between -4 and 0. Additionally, the level of outliers in apple size was relatively low.
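The five summary measures used in the descriptive tables of this report can be computed as in the sketch below. The helper name and the sample values are hypothetical and are not taken from the data set.

```python
import statistics


def describe(values):
    """Return the five summary statistics used in the descriptive tables."""
    return {
        "minimum": min(values),
        "median": statistics.median(values),
        "average": statistics.mean(values),
        # Sample standard deviation (n - 1 in the denominator).
        "standard_deviation": statistics.stdev(values),
        "maximum": max(values),
    }


# Hypothetical scaled size values, for illustration only.
sample_sizes = [-2.5, -1.0, -0.5, 0.0, 1.5]
summary = describe(sample_sizes)
print(summary)
```

The standard deviation is expressed in the same units as the variable itself, not as a percentage.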
Weight
| Minimum | Median | Average | StandardDeviation | Maximum |
|---|---|---|---|---|
| -7.1 | -1 | -1 | 1.6 | 5.8 |
The table above indicated that apple weight ranged from -7.1 to 5.8, with an average of -1.0, a median of -1.0 and a standard deviation of 1.6 around the average.
The histogram above indicated that the weight of the apples was approximately normally distributed, with most apples weighing between -4 and 0. The level of outliers in the weight distribution was relatively low, which supports its use as an indicator of quality.
Sweetness
| Minimum | Median | Average | StandardDeviation | Maximum |
|---|---|---|---|---|
| -6.9 | -0.5 | -0.5 | 1.9 | 6.4 |
The table above indicated that the sweetness of the apples ranged from -6.9 to 6.4, with an average degree of sweetness of -0.5. This further suggested that apples above the average degree of sweetness were better than those below average.
The histogram above indicated that the degree of sweetness in the apples was approximately normally distributed, with the majority of them having a degree of sweetness between -2 and 0.
Crunchiness
| Minimum | Median | Average | StandardDeviation | Maximum |
|---|---|---|---|---|
| -6.1 | 1 | 1 | 1.4 | 7.6 |
The results presented in the table above indicated that the texture levels of the apples ranged from -6.1 to 7.6, with an average and median texture level of 1 and a standard deviation of 1.4 around the average.
The histogram above indicated that the texture levels of the apples were approximately normally distributed, with most apples having texture levels between 0 and 4.
Juiciness
| Minimum | Median | Average | StandardDeviation | Maximum |
|---|---|---|---|---|
| -6 | 0.5 | 0.5 | 1.9 | 7.4 |
The table above indicated that the juiciness levels ranged from -6.0 to 7.4, with an average juiciness level of 0.5.
The histogram above indicated that the juiciness of the apples was skewed to the right (right skewed), which signified that most of the harvested apples had juiciness levels between -6 and 1.
Ripeness
| Minimum | Median | Average | StandardDeviation | Maximum |
|---|---|---|---|---|
| -5.9 | 0.5 | 0.5 | 1.9 | 7.2 |
The statistics presented in the table above indicated that the stage of ripeness in the apples ranged between -5.9 and 7.2, with an average stage of ripeness of 0.5 and a standard deviation of 1.9 around the average.
The histogram above indicated that the stages of ripeness were approximately normally distributed, with most apples having a ripening stage that lay between -1 and 1.
Acidity
| Minimum | Median | Average | StandardDeviation | Maximum |
|---|---|---|---|---|
| -7 | 0 | 0.1 | 2.1 | 7.4 |
The table above indicated that the acidity levels in the apples ranged between -7 and 7.4, with an average acidity level of 0.1.
The histogram above indicated that the acidity levels in the apples were approximately normally distributed, which implied that they were spread symmetrically around the average.
This section of the report contains a correlogram and scatter plots that visualize the relationships between the numeric variables.
Correlogram
The correlogram above indicated that there was a significant relationship between apple size and weight (p = 0.00), size and sweetness (p = 0.00), size and crunchiness (p = 0.00), size and ripeness (p = 0.00), weight and sweetness (p = 0.00), weight and crunchiness (p = 0.00), weight and juiciness (p = 0.00), weight and ripeness (p = 0.00), sweetness and crunchiness (p = 0.02), sweetness and juiciness (p = 0.00), sweetness and ripeness (p = 0.00), crunchiness and juiciness (p = 0.00), crunchiness and ripeness (p = 0.00), juiciness and ripeness (p = 0.00), and ripeness and acidity (p = 0.00). Additionally, size, sweetness, crunchiness and juiciness had significant relationships with acidity (p = 0.00).
However, there was no significant relationship between the size and juiciness of the apples (p = 0.23) or between weight and acidity (p = 0.30).
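Each cell of a correlogram is a pairwise correlation coefficient. The sketch below assumes the Pearson coefficient (the report does not name the coefficient explicitly), and the paired values are hypothetical rather than drawn from the data set.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scaled measurements for four apples.
size = [-1.2, 0.3, 1.1, -0.4]
weight = [0.5, -0.2, -1.0, 0.1]
r_size_weight = pearson_r(size, weight)
print(round(r_size_weight, 3))
```

A coefficient near 0 indicates almost no linear relationship, while values near -1 or 1 indicate a strong one. The sign gives the direction only and does not establish causation.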
Scatter plots
This section of the report illustrates the linear relationships between apple size, weight, sweetness, crunchiness, juiciness, ripeness and acidity.
Size and weight
The scatter plot above indicated that there was a weak negative linear relationship between the size and weight of the apples (r = -0.17). As the size of an apple increased, its weight tended to decrease gradually.
Size and Sweetness
The scatter plot above indicated that the relationship between the size and sweetness of the apples was negative and linear (r = -0.32), which implied that larger apples tended to have lower sweetness levels.
Size and crunchiness
The scatter plot above indicated that there was a weak positive linear relationship between an apple's size and crunchiness (texture) (r = 0.17), which implied that larger apples tended to have a higher degree of crunchiness.
Size and Juiciness
The scatter plot above indicated that there was almost no linear relationship between the size and juiciness of the apples (r = -0.02). Juiciness decreased only negligibly as the size of an apple increased.
Size and Ripeness
The scatter plot above indicated that there was a weak negative linear relationship between an apple's size and ripeness (r = -0.13), which implied that as the size of the apples increased, their level of ripeness tended to decrease gradually.
Size and Acidity
The scatter plot above indicated that acidity and size had a weak positive linear relationship (r = 0.196), which meant that acidity levels tended to rise as the size of an apple increased.
Weight and sweetness
The scatter plot above indicated that the relationship between apple weight and degree of sweetness was weak, negative and linear (r = -0.15). Heavier apples tended to have a lower degree of sweetness.
Weight and crunchiness
The scatter plot above indicated that crunchiness (texture) gradually decreased as the weight of the apples increased (r = -0.96). Therefore, the more weight the apples gained, the smoother they became.
Weight and Juiciness
The scatter plot above indicated that juiciness levels gradually reduced as the apples gained weight (r = -0.018). Weight was therefore negatively associated with the juiciness of a given apple.
Weight and Ripeness
The scatter plot above indicated that as the apples gained weight, their level of ripeness declined (r = -0.24). Heavier apples therefore tended to have a lower degree of ripeness.
Weight and acidity
The scatter plot above indicated only a very weak association between the weight and acidity of the apples. Heavier apples had marginally higher acidity levels than lighter apples, but the relationship was not statistically significant (r = 0.02, p = 0.30).
Sweetness and crunchiness
The scatter plot above indicated that sweetness and crunchiness (texture) had a very weak negative linear relationship (r = -0.04). As the level of sweetness increased, crunchiness tended to reduce slightly.
Sweetness and Juiciness
The scatter plot above indicated that as the level of sweetness in a given apple increased, its juiciness level gradually increased (r = 0.096). Sweeter apples therefore tended to be slightly juicier.
Sweetness and Ripeness
The scatter plot above indicated that there was a negative linear relationship between ripeness and sweetness (r = -0.27), which signified that riper apples tended to have a lower degree of sweetness.
Crunchiness and Juiciness
The scatter plot above indicated that there was a significant negative linear relationship between apple crunchiness and juiciness (r = -0.26, p = 0.00). Crunchier apples tended to have lower juiciness levels.
Crunchiness and Ripeness
The scatter plot above indicated that there was a significant negative linear relationship between the apples' ripeness and crunchiness levels (r = -0.2, p = 0.00). Riper apples tended to be less crunchy.
Crunchiness and Acidity
The scatter plot above indicated that there was a significant but weak positive linear relationship between crunchiness and acidity (r = 0.07, p = 0.00). Crunchier apples tended to have slightly higher acidity levels.
Juiciness and Ripeness
The scatter plot above indicated that there was a significant but weak negative linear relationship between ripeness and juiciness (r = -0.1, p = 0.00). Riper apples tended to be slightly less juicy.
Juiciness and Acidity
The scatter plot above indicated that there was a significant positive linear relationship between the apples' juiciness and acidity levels (r = 0.25, p = 0.00). Juicier apples tended to have higher acidity levels.
Ripeness and Acidity
The scatter plot above indicated that there was a significant negative linear relationship between the degree of ripeness and the acidity levels in the apples (r = -0.20, p = 0.00). Riper apples tended to have lower acidity levels.
Size and Quality
| Quality | Average_size |
|---|---|
| good | -0.03 |
| bad | -0.97 |
The table above indicated that good quality apples had a higher average size(-0.03) compared to the bad quality apples(-0.97).
Weight and Quality
| Quality | Average weight |
|---|---|
| good | -0.987 |
| bad | -0.992 |
The table above indicated that good quality apples had a marginally higher average weight (-0.987) than bad quality apples (-0.992).
Sweetness and Quality
| Quality | Average degree of sweetness |
|---|---|
| good | 0.02 |
| bad | -0.96 |
The table above indicated that the average degree of sweetness in good quality apples (0.02) was higher than in bad quality apples (-0.96).
Crunchiness and Quality of apples
| Quality | Average level of crunchiness |
|---|---|
| bad | 1.00 |
| good | 0.97 |
The table above indicated that bad quality apples had a slightly higher average level of crunchiness (1.00) than good quality apples (0.97). Apples with a relatively rough texture were therefore slightly more likely to be of bad quality.
Juiciness and Quality
| Quality | Average juiciness level |
|---|---|
| good | 1.01 |
| bad | 0.01 |
The table above indicated that the average juiciness level was higher in good quality apples (1.01) than in bad quality apples (0.01).
Ripeness and Quality
| Quality | Average ripeness |
|---|---|
| bad | 0.99 |
| good | 0.00 |
The table above indicated that the average level of ripeness was relatively higher in the bad quality apples (0.99) compared to the good quality apples (0.00).
Acidity and Quality
| Quality | Average acidity level |
|---|---|
| bad | 0.09 |
| good | 0.06 |
The table above indicated that the average acidity levels in the bad quality apples were higher (0.09) compared to the good quality apples (0.06).
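The report names two-sample t-tests as the method for comparing group averages such as those in the tables above. A minimal sketch of Welch's version of the test statistic is given below. The choice of Welch's variant, the function name and the sample values are all assumptions made for illustration, and none of the numbers come from the data set.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # Variance of each sample mean: sample variance divided by sample size.
    var_mean_a = statistics.variance(sample_a) / len(sample_a)
    var_mean_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(var_mean_a + var_mean_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (len(sample_a) - 1)
        + var_mean_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df


# Hypothetical juiciness values for good and bad quality apples.
good_juiciness = [1.3, 0.8, 1.1, 0.9, 1.0]
bad_juiciness = [0.2, -0.1, 0.1, -0.3, 0.0]
t_stat, df = welch_t(good_juiciness, bad_juiciness)
print(round(t_stat, 2), round(df, 1))
```

A large absolute t statistic suggests that the difference between the two group averages is unlikely to be due to chance alone.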
This section of the report contains the correlation and two sample tests carried out on the juiciness, ripeness, sweetness, acidity, crunchiness and quality of the apples in order to test the hypothesis.
Size and Weight
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.1707017 |
| p-value | 0.0000000 |
The table above indicated that there was a significant relationship between the size and weight in apples (p-value = 0.00).
Size and Sweetness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.3246802 |
| p-value | 0.0000000 |
The statistics presented in the table above indicated that there was a significant association between the size and sweetness in the apples (p-value = 0.00).
Size and Crunchiness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | 0.1698681 |
| p-value | 0.0000000 |
The statistics in the table indicated that there was a significant relationship between the size and crunchiness (texture) in the apples (p-value = 0.00).
Size and Juiciness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.0188924 |
| p-value | 0.2322451 |
The statistics in the table above indicated that there was an insignificant relationship between the size and juiciness in apples (p-value = 0.23).
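As a rough check of how a p-value like the one above follows from the correlation coefficient, the sketch below converts r into a t statistic and then into a two-sided p-value. It assumes n = 4000 apples (the size of the data set) and replaces the t distribution with a normal approximation, which is reasonable only because n is large. The function name is illustrative.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-sided p-value for testing that a correlation is zero.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and a normal approximation to
    the t distribution, which is adequate for large n.
    """
    t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# Size and juiciness correlation as reported in the table above.
p_size_juiciness = correlation_p_value(-0.0188924, 4000)
print(round(p_size_juiciness, 2))
```

With 4000 observations, even a small correlation such as -0.17 produces a p-value that rounds to zero, so statistical significance here does not imply a strong relationship.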
Size and Ripeness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.1347729 |
| p-value | 0.0000000 |
The statistics in the table above indicated that there was a significant relationship between the size and ripeness in the apples (p-value = 0.00).
Size and Acidity
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | 0.1962181 |
| p-value | 0.0000000 |
The table above indicated that there was a significant relationship between the size and acidity in the apples (p-value = 0.00).
Weight and Sweetness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.1542463 |
| p-value | 0.0000000 |
The statistics in the table above indicated that there was a significant relationship between the weight and sweetness levels in apples (p-value = 0.00).
Weight and Crunchiness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.1542463 |
| p-value | 0.0000000 |
The table above indicated that there was a significant relationship between the weight and crunchiness(texture) in apples (p-value = 0.00).
Weight and Juiciness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.0922628 |
| p-value | 0.0000000 |
The statistics in the table above indicated that there was a significant relationship between the weight and juiciness in the apples (p-value = 0.00).
Weight and Ripeness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.0958817 |
| p-value | 0.0000000 |
The table above indicated that there was a significant relationship between the weight and ripeness stages in apples (p-value = 0.00).
Weight and Acidity
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | 0.0164142 |
| p-value | 0.2993295 |
The statistics in the table above indicated that there was an insignificant relationship between the weight and acidity in apples (p-value = 0.30).
Sweetness and Crunchiness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.0375520 |
| p-value | 0.0175445 |
The table above indicated that there was a significant relationship between the sweetness and crunchiness levels in apples (p-value = 0.02).
Sweetness and Juiciness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | 0.0958815 |
| p-value | 0.0000000 |
The statistics in the table above indicated that there was a significant relationship between the sweetness and juiciness levels in apples (p-value = 0.00).
Sweetness and Ripeness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.2738001 |
| p-value | 0.0000000 |
The table above indicated that there was a significant relationship between the sweetness and ripeness in the apples (p-value = 0.00).
Sweetness and Acidity
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | 0.0859992 |
| p-value | 0.0000001 |
The statistics presented in the table above indicated that there was a significant relationship between the sweetness and acidity levels in the apple fruits (p-value = 0.00).
Crunchiness and Juiciness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.2596071 |
| p-value | 0.0000000 |
The statistics in the table above indicated that there was a significant relationship between the crunchiness(texture) and the juiciness levels in apples (p-value = 0.00).
Crunchiness and Ripeness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.2019816 |
| p-value | 0.0000000 |
The table above indicated that there was a significant relationship between the crunchiness and ripeness levels in apples (p-value = 0.00).
Crunchiness and Acidity
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | 0.0699430 |
| p-value | 0.0000095 |
The statistics presented in the table above indicated that there was a significant relationship between the crunchiness (texture) levels and the acidity levels in apples (p-value = 0.00).
Juiciness and Ripeness
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.0971441 |
| p-value | 0.0000000 |
The statistics in the table above indicated that there was a significant relationship between the juiciness and ripeness levels in apples (p-value = 0.00).
Juiciness and Acidity
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | 0.2487139 |
| p-value | 0.0000000 |
The statistics presented in the table above indicated that there was a significant relationship between the juiciness and acidity levels in the apples (p-value = 0.00).
Ripeness and Acidity
Correlation Test Results
| Statistic | Value |
|---|---|
| Correlation | -0.2026691 |
| p-value | 0.0000000 |
The table above indicated that there was a significant relationship between the ripeness and acidity levels in apples (p-value = 0.00).
This section of the report presents the models fitted to the apple data set in order to generate highly accurate predictions. Predicted and actual values are both shown for easier reference by the researcher, other readers and policy makers. It also contains the evaluation results obtained for each model.
The classification model is a machine learning algorithm that assigns categories or labels to new data based on patterns learned from labelled training data.
Below is the complete data set that illustrates the model predictions.
Confusion Matrix of the classification model
| Predicted \ Actual | bad | good |
|---|---|---|
| bad | 256 | 143 |
| good | 57 | 344 |
The confusion matrix indicated that the model had an accuracy of 75%. It correctly predicted 256 apples to be of bad quality, while 143 apples were predicted to be of bad quality although the actual data recorded them as good quality.
A further 57 apples were predicted to be of good quality although the actual data recorded them as bad quality, and 344 apples were correctly predicted to be of good quality.
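The accuracy, sensitivity and specificity mentioned in the methodology can be derived from the four counts above, as in the sketch below. Treating "good" as the positive class is an assumption made here for illustration, since the report does not state which class was treated as positive, and the function name is hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from confusion matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        # Share of actual positives that were predicted as positive.
        "sensitivity": tp / (tp + fn),
        # Share of actual negatives that were predicted as negative.
        "specificity": tn / (tn + fp),
    }


# Counts from the classification model, assuming "good" is the positive class:
# 344 good predicted good, 143 good predicted bad,
# 57 bad predicted good, 256 bad predicted bad.
metrics = confusion_metrics(tp=344, fn=143, fp=57, tn=256)
print({name: round(value, 4) for name, value in metrics.items()})
```

If "bad" were treated as the positive class instead, the sensitivity and specificity values would swap while the accuracy would stay the same.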
A decision tree is a machine learning algorithm that recursively splits the data based on features, creating a tree-like structure of decisions. Each internal node in the tree represents a test on a feature, and the branches represent the possible outcomes of that test.
The diagram below illustrates a decision tree used to predict the quality of apples.
The decision tree above indicated that apples with juiciness < -0.33 and size < -0.86, a proportion of 0.13 of the apples, were predicted to be of bad quality, whereas those with size >= -0.86 and crunchiness < -0.0073, a proportion of 0.02, were predicted to be of good quality.
For apples with juiciness < -0.33, size >= -0.86 and crunchiness >= 0.0073, the tree split further on juiciness. Among those with a much lower juiciness level (< -1.6), apples with weight < 0.64 (a proportion of 0.07) were predicted to be of bad quality, while those with weight >= 0.64 (a proportion of 0.01) were predicted to be of good quality. Among those with juiciness >= -1.6, apples with acidity < -1.1 (0.02 of the apples) were predicted to be of bad quality, while those with acidity >= -1.1 (0.07 of the apples) were predicted to be of good quality.
Additionally, the right-hand side of the decision tree indicated that, among apples with juiciness >= -0.33, ripeness >= 1.7 and sweetness < 1.1, those with size < 1.4 (a proportion of 0.12) were predicted to be of bad quality, while those with size >= 1.4 (a proportion of 0.02) were predicted to be of good quality.
Finally, among apples with juiciness >= -0.33 and ripeness < 1.7, those with acidity < 2.1 (a proportion of 0.39) were predicted to be of good quality. Of those with acidity >= 2.1, apples with size < -0.48 (a proportion of 0.05) were predicted to be of bad quality and those with size >= -0.48 (a proportion of 0.07) were predicted to be of good quality.
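The splits described above can be written out as nested conditions, which makes the path to each prediction easier to follow. This is a sketch of the rules as stated in the text, not the fitted model itself. Two points are assumptions: the text gives the crunchiness threshold with both signs, so it is exposed here as a parameter, and the branch for ripeness >= 1.7 with sweetness >= 1.1 is not described, so the function returns None for it.

```python
def predict_quality(juiciness, size, crunchiness, weight, acidity,
                    ripeness, sweetness, crunch_threshold=0.0073):
    """Follow the decision tree rules described in the text.

    crunch_threshold is an assumption: the text reports the crunchiness
    split as both -0.0073 and 0.0073.
    """
    if juiciness < -0.33:
        if size < -0.86:
            return "bad"
        if crunchiness < crunch_threshold:
            return "good"
        if juiciness < -1.6:
            return "bad" if weight < 0.64 else "good"
        return "bad" if acidity < -1.1 else "good"
    if ripeness >= 1.7:
        if sweetness < 1.1:
            return "bad" if size < 1.4 else "good"
        # Branch not described in the text, so no label is returned.
        return None
    if acidity < 2.1:
        return "good"
    return "bad" if size < -0.48 else "good"


# Hypothetical apple: low juiciness and small size.
example = predict_quality(juiciness=-1.0, size=-1.0, crunchiness=0.0,
                          weight=0.0, acidity=0.0, ripeness=0.0, sweetness=0.0)
print(example)
```

Each return statement corresponds to one leaf of the tree, and the proportions quoted in the text are the shares of apples that reach each leaf.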
Random forest is an ensemble learning technique based on decision trees. It builds multiple decision trees during training and combines their predictions to improve accuracy and reduce overfitting.
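The two ingredients of this ensemble approach, training each tree on a resampled copy of the data and combining the trees' predictions by majority vote, can be sketched as follows. The function names and the example votes are illustrative assumptions, and the sketch omits the tree training itself.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw a sample of the same size as rows, with replacement."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees.

    On a tie, the label that was encountered first is returned.
    """
    if not tree_predictions:
        raise ValueError("at least one prediction is required")
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical votes from five trees for a single apple.
votes = ["good", "bad", "good", "good", "bad"]
forest_prediction = majority_vote(votes)
print(forest_prediction)

# Each tree would be trained on its own bootstrap sample of the rows.
resampled = bootstrap_sample(list(range(10)), random.Random(0))
```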
Random forest predictions
The table below shows the actual and predicted categories of the apple quality
Random Forest algorithm Confusion Matrix
The confusion matrix generated by the random forest algorithm indicated that 340 apples were correctly predicted to be of bad quality, whereas 27 apples were predicted to be of bad quality although the actual data recorded them as good quality. A further 59 apples were predicted to be of good quality although the data recorded them as bad quality, and 374 apples were correctly predicted to be of good quality. Overall, 89.25% of the model's predictions were accurate, which is well above a moderate level.
The Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions between the features.
The results generated by the Naive Bayes algorithm indicated that bad and good quality apples had equal prior probabilities (Prob = 0.5). Furthermore, the average size of bad quality apples (-0.99) was estimated to be lower than that of good quality apples (-0.06), and the average weight of bad quality apples (-0.99) was estimated to be slightly lower than that of good quality apples (-0.95). The average sweetness level in bad quality apples (-0.97) was lower than in good quality apples (0.02). The average crunchiness level was estimated to be higher in bad quality apples (0.98) than in good quality apples (0.94).
The average juiciness level was estimated to be higher in good quality apples (1.0) than in bad quality apples (0.008), while the average ripeness level was estimated to be higher in bad quality apples (1.0) and lower in good quality apples (-0.012). The average acidity level was estimated to be higher in bad quality apples (0.1) than in good quality apples (0.06).
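To illustrate how class averages like these feed into a prediction, the sketch below scores an apple under a Gaussian Naive Bayes model using the reported averages for four of the features and the equal prior of 0.5. The within-class standard deviation is not reported, so the value of 1.9 used here is an assumption borrowed from the overall standard deviations in the descriptive tables, and the Gaussian form itself is also assumed.

```python
import math

# Class averages reported in the text for four of the features.
CLASS_MEANS = {
    "bad": {"size": -0.99, "sweetness": -0.97, "juiciness": 0.008, "ripeness": 1.0},
    "good": {"size": -0.06, "sweetness": 0.02, "juiciness": 1.0, "ripeness": -0.012},
}
PRIOR = {"bad": 0.5, "good": 0.5}
# Assumption: within-class standard deviations are not given in the report.
ASSUMED_SD = 1.9


def gaussian_log_pdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def classify(apple):
    """Return the class with the highest log posterior score."""
    scores = {}
    for label, means in CLASS_MEANS.items():
        score = math.log(PRIOR[label])
        # Independence assumption: per-feature log densities are summed.
        for feature, mean in means.items():
            score += gaussian_log_pdf(apple[feature], mean, ASSUMED_SD)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical apple that is juicy and not very ripe.
juicy_apple = {"size": 0.0, "sweetness": 0.0, "juiciness": 1.0, "ripeness": 0.0}
print(classify(juicy_apple))
```

Summing log densities rather than multiplying probabilities avoids numerical underflow when many features are involved.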
Naive Bayes predictions
Confusion Matrix
The results presented in the confusion matrix indicated that the predictions were 74.2% accurate. The model correctly predicted 301 apples to be of bad quality, whereas 108 apples were predicted to be of bad quality although the data recorded them as good quality. Furthermore, the model correctly predicted 293 apples to be of good quality, while 98 apples were predicted to be of good quality although they were of bad quality.
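The accuracies of the three models can be recomputed and compared from the confusion matrix counts quoted in this report. The sketch below uses only those counts, and the dictionary layout and names are illustrative.

```python
# Correct and total predictions, taken from the confusion matrix counts
# quoted in the text for each model.
confusion_counts = {
    "classification model": {"correct": 256 + 344, "total": 256 + 143 + 57 + 344},
    "random forest": {"correct": 340 + 374, "total": 340 + 27 + 59 + 374},
    "naive Bayes": {"correct": 301 + 293, "total": 301 + 108 + 98 + 293},
}

# Accuracy is the share of test apples whose quality was predicted correctly.
accuracy = {
    name: counts["correct"] / counts["total"]
    for name, counts in confusion_counts.items()
}
best_model = max(accuracy, key=accuracy.get)

for name, value in accuracy.items():
    print(f"{name}: {value:.2%}")
print("best:", best_model)
```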
In conclusion, the best machine learning model for predicting apple quality was the random forest algorithm (accuracy = 89.25%). A slight majority of the apples were of good quality (2004 apples). Size, weight, sweetness, crunchiness, ripeness and acidity were approximately normally distributed, while juiciness was right skewed. Most pairs of predictor variables showed significant linear relationships with one another. Average size, weight, sweetness and juiciness were higher in good quality apples than in bad quality apples, whereas average crunchiness, ripeness and acidity were higher in bad quality apples.
Based on the apple data set provided, the researcher makes the following recommendations for better quality production.
The government and policy makers should encourage farmers to use natural fertilizers that will boost the growth of the apples.
Farmers should be encouraged to employ agricultural mechanization practices to improve the timing of the apples' growth stages in order to achieve the desired levels of weight, size, juiciness and sweetness.
Improved storage facilities should be provided to farmers in order to preserve the shelf life and quality of harvested apples before they reach the market.
The image below shows the use of a spectrometer to predict the quality of apples on a given farm land.
The image below shows the internal operation of a spectrometer.
The image below shows the countries that deal in apple production.