Since its development, technology has been widely adopted and has played a significant part in education. This has been especially evident during the COVID-19 pandemic, when people have continued to learn through online applications despite the hardships and complications it brought. The Internet has served as a reliable bridge that gives learners access to knowledge from more than one source, and it holds a crucial role for anyone in pursuit of knowledge and education. To further support the claim of the Internet’s influence on education, the researchers analyzed a study that focuses on daily internet data traffic.
A study by Adeyemi et al. presents data on how much internet data traffic, both download and upload, is generated daily by a smart university campus in Nigeria. The researchers believe that the study shows how often and how much people at the university contribute to its daily internet traffic.
The study provides a large amount of data for exploring the topic. This analysis, however, focuses on the two main components of the daily internet traffic generated by the university: the upload and download traffic volumes. The researchers aim to determine whether a linear relationship exists between the download and upload traffic volumes.
To test for a linear relationship between the two variables, the researchers carried out a hypothesis test on the slope of a simple linear regression model [1].
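For reference, the test rests on the simple linear regression model in its standard textbook form. The symbols \(\epsilon_i\), \(S_{xx}\), and \(\hat\sigma^2\) are notation assumed here for illustration and are not used elsewhere in this report.

\[
y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad i = 1, \dots, n
\]

\[
t_0 = \frac{\hat\beta_1}{se(\hat\beta_1)}, \qquad se(\hat\beta_1) = \sqrt{\frac{\hat\sigma^2}{S_{xx}}}, \qquad S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2
\]

Here \(x_i\) is the upload traffic, \(y_i\) is the download traffic, and \(\hat\sigma^2\) is the residual mean square. Under \(H_0\) and the usual assumptions of independent, normally distributed errors with constant variance, \(t_0\) follows a \(t\) distribution with \(n-2\) degrees of freedom.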
1. Parameter of Interest: The parameter of interest is the slope \(\beta_1\) of the regression of data download traffic on data upload traffic, which captures the possible linear relationship between the two.
2. Null hypothesis: \(H_0: \beta_1 = 0\)
The null hypothesis states that the slope coefficient of the regression model is equal to \(0\), that is, there is no linear relationship between the download traffic and upload traffic.
3. Alternative hypothesis: \(H_1: \beta_1 \neq 0\)
The alternative hypothesis states that the slope coefficient of the regression model is not equal to \(0\), that is, there is a linear relationship between the download traffic and upload traffic.
4. Test Statistic: The test statistic for the hypothesis is the \(t\)-value of the slope of the regression model, \(t_0 = \hat\beta_1 / se(\hat\beta_1)\).
5. Reject \(H_0\) if: The null hypothesis is rejected if \(|t_0| > t_{\alpha/2, n-2}\) at a significance level of \(\alpha = 0.05\). The \(t\)-value of the slope is reported by the summary() function when applied to the fitted regression model.
6. Computation:
Summary of the Fitted Regression Model
Before fitting a regression model to the data set, a scatter plot of the data is produced with the Upload Traffic as \(x\) and the Download Traffic as \(y\).
Figure 1 suggests a possible positive linear relationship between the two variables: as the Data Upload Traffic increases in a day, the Data Download Traffic also increases. Using the lm() function, a linear regression model is fitted to the data set by the least squares method [2].
With the fitted regression line added, the plot supports the positive linear relationship between the two variables seen in Figure 1. As the Daily Upload Traffic increases, the Daily Download Traffic increases according to the fitted equation \(\hat{y}_i = 0.6752594 + 0.0043017x_i\), obtained using the lm() function.
##
## Call:
## lm(formula = downloadTraffic ~ uploadTraffic)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1.19366 -0.30686 -0.02776  0.26587  1.34476
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.6752594  0.0573629   11.77   <2e-16 ***
## uploadTraffic 0.0043017  0.0001245   34.55   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4424 on 351 degrees of freedom
## Multiple R-squared:  0.7728, Adjusted R-squared:  0.7721
## F-statistic:  1194 on 1 and 351 DF,  p-value: < 2.2e-16
From the summary of the regression model, the \(t\)-value of the slope is \(t_{0(\hat\beta_1)} = 34.55\). The critical \(t\)-value for \(\alpha=0.05\) and 351 degrees of freedom is computed using the qt() function:
## [1] 1.966746
The critical \(t\)-value at a significance level of \(\alpha=0.05\) is \(t_{0.025,351}=1.966746\).
7. Conclusion:
Recall that the null hypothesis is rejected if \(|t_0|>t_{\alpha/2,n-2}\). Since \(t_{0(\hat\beta_1)} = 34.55\) and \(t_{0.025,351} = 1.966746\), the \(t\)-value of \(\hat\beta_1\) is greater than the critical value. Therefore, the null hypothesis is rejected.
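The decision rule can also be checked numerically from the values reported in the summary output. The following is a minimal sketch in R, not the code used in the original analysis: the variable names are illustrative, and the p-value line is an addition for reference only.

```r
# Values copied from the regression summary reported in this document.
beta1_hat <- 0.0043017   # estimated slope
se_beta1  <- 0.0001245   # standard error of the slope
df        <- 351         # residual degrees of freedom (n - 2)
alpha     <- 0.05        # significance level

# Test statistic for H0: beta_1 = 0.
t0 <- beta1_hat / se_beta1

# Two-sided critical value t_{alpha/2, n-2}.
t_crit <- qt(1 - alpha / 2, df = df)

# Two-sided p-value (illustrative addition, not part of the original steps).
p_val <- 2 * pt(abs(t0), df = df, lower.tail = FALSE)

# Decision: reject H0 when |t0| exceeds the critical value.
reject_h0 <- abs(t0) > t_crit
print(c(t0 = t0, t_crit = t_crit))
print(reject_h0)
```

Working from the rounded estimate and standard error reproduces the reported \(t\)-value only up to rounding.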
As shown in the computation, the Daily Download Traffic and Daily Upload Traffic were used to fit a regression model to the data set plotted in Figure 1. The tight clustering of the points in the scatter plot suggests a linear relationship between the variables. This is also seen in Figure 2, where a regression line is added to show the positive linear relationship more clearly. Comparing the test statistic with the critical value, \(t_{0(\hat\beta_1)}\) is greater than \(t_{0.025,351}\) (\(34.55 > 1.966746\)), so the null hypothesis is rejected. In addition, the plots of the residuals were analyzed.
Figure 3, the Normal Probability Plot of Residuals, plots the standardized residuals against the normal scores. It shows no severe departure from normality in the residuals.
Figure 4a shows the residuals plotted against the upload traffic values \(x_i\), while Figure 4b shows the residuals plotted against the predicted download traffic values \(\hat{y}_i\). These plots resemble a “double bow” residual pattern, indicating that there may be inequality of variance in the data set [3].
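Residual plots of this kind can be produced along the following lines. This is a sketch under stated assumptions, not the code behind Figures 3, 4a, and 4b: the data are simulated stand-ins with made-up parameters, so the plots will not match the figures. The simulated errors also have constant variance, so no double-bow pattern should be expected from this example.

```r
# Simulated stand-in data (hypothetical); NOT the dataset analyzed in this report.
set.seed(42)
n <- 353
uploadTraffic   <- runif(n, min = 50, max = 900)
downloadTraffic <- 0.68 + 0.0043 * uploadTraffic + rnorm(n, sd = 0.44)

# Least squares fit of download traffic on upload traffic.
model <- lm(downloadTraffic ~ uploadTraffic)

std_res <- rstandard(model)  # standardized residuals
res     <- resid(model)      # raw residuals
y_hat   <- fitted(model)     # predicted download traffic

# Normal probability plot of the standardized residuals.
qqnorm(std_res, xlab = "Normal Scores", ylab = "Standardized Residuals")
qqline(std_res)

# Residuals against the regressor and against the fitted values.
plot(uploadTraffic, res, xlab = "Upload Traffic", ylab = "Residuals")
abline(h = 0, lty = 2)
plot(y_hat, res, xlab = "Predicted Download Traffic", ylab = "Residuals")
abline(h = 0, lty = 2)
```

In the last two plots, a spread of residuals that changes with \(x_i\) or \(\hat{y}_i\) is what would point to unequal variance.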
The Internet plays a large role for the current generation, as almost everything and everyone is now involved in the digital world. It has become significant in the pursuit of knowledge, as most schools and educational institutions have embraced online platforms for learning and teaching.
For this reason, the researchers chose to look deeper into a study conducted by Adeyemi et al. on how much internet data traffic, both download and upload, is generated daily by a smart university campus in Nigeria. The data collected in the study show how much the people in the university contribute to its internet traffic.
Beyond the results reported in the study, the researchers focused on testing for a linear relationship between two specific variables: the upload and download traffic volumes of the university. The researchers used statistical methods such as hypothesis testing, fitting a regression model, and computing the \(t\)-value of the slope.
As seen in the results, the null hypothesis was rejected. This means that there is a statistically significant positive linear relationship between the download and upload traffic volumes of the university. This suggests that downloading and uploading files are essential for most of the people in the university. Since most of their modules, tests, and activities are presumably held online, a large portion of their school materials can be accessed through the Internet. These data can help facilitate more research in the area of Internet traffic engineering. They can also help educational institutions around the world plan and forecast internet traffic more accurately for the benefit of their students and staff. Since the study used by the researchers is limited to data from one smart university campus, further research on internet traffic in other private and public educational institutions is recommended to obtain more thorough and diverse data.
[1] D. C. Montgomery and G. C. Runger, Applied Statistics and Probability for Engineers, 7th ed., Hoboken, NJ: Wiley, 2018, p. 289.
[2] “Linear Least Squares Regression,” Cyclismo. [Online]. Available: https://www.cyclismo.org/tutorial/R/linearLeastSquares.html. [Accessed: 04-Aug-2021].
[3] D. C. Montgomery and G. C. Runger, Applied Statistics and Probability for Engineers, 7th ed., Hoboken, NJ: Wiley, 2018, p. 297.
Research Article: O. J. Adeyemi, S. I. Popoola, A. A. Atayero, D. G. Afolayan, M. Ariyo, and E. Adetiba, “Exploration of daily Internet data traffic generated in a smart university campus,” Data in Brief, vol. 20, pp. 30–52, 2018.
Dataset: O. J. Adeyemi, S. I. Popoola, A. A. Atayero, D. G. Afolayan, M. Ariyo, and E. Adetiba, Daily Data Download and Upload Traffic, January to December, Ota, Nigeria: Covenant University, 2018. Accessed on: August 4, 2021. [Online]. Available: https://ars.els-cdn.com/content/image/1-s2.0-S2352340918308126-mmc1.xlsx