EMAIL: dheepcho@gmail.com COLLEGE: MISB Bocconi
In this report, we analyze the ranking of the US Fortune 500 companies
and identify the factors that affect a company's rank.
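# stringsAsFactors = TRUE keeps the factor columns shown below on R >= 4.0
rfcus.df <- read.csv("Fortune Companies.csv", stringsAsFactors = TRUE)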
dim(rfcus.df)
## [1] 500 10
summary(rfcus.df)
##       Rank                       Company.Name Number.of.Employees
##  Min.   :  1.0   Alcoa                 :  2   Min.   :   1326
##  1st Qu.:125.8   Avnet                 :  2   1st Qu.:  12125
##  Median :250.5   Regions Financial     :  2   Median :  25600
##  Mean   :250.5   3M                    :  1   Mean   :  56956
##  3rd Qu.:375.2   A-Mark Precious Metals:  1   3rd Qu.:  57625
##  Max.   :500.0   Abbott Laboratories   :  1   Max.   :2300000
##                  (Other)               :491
##  Previous.Rank   Revenues..in...  Revenue.Change    Profits
##  Min.   :  1.0   Min.   :  5145   -0.90% :  7   Min.   :    1.0
##  1st Qu.:127.8   1st Qu.:  7245   1.90%  :  7   1st Qu.:  353.6
##  Median :251.5   Median : 11384   -      :  6   Median :  780.0
##  Mean   :257.1   Mean   : 24112   -0.40% :  6   Mean   : 2034.6
##  3rd Qu.:379.2   3rd Qu.: 22605   0.20%  :  6   3rd Qu.: 2010.5
##  Max.   :761.0   Max.   :485873   2.50%  :  6   Max.   :45687.0
##  NA's   :8                        (Other):462   NA's   :1
##  Profit.Change     Assets         Market.Value
##  -      : 66   Min.   :    437   Min.   :   120
##  5.20%  :  4   1st Qu.:   8436   1st Qu.:  6996
##  -13.50%:  3   Median :  19324   Median : 17696
##  10.60% :  3   Mean   :  80389   Mean   : 41322
##  19.10% :  3   3rd Qu.:  48126   3rd Qu.: 41768
##  4.40%  :  3   Max.   :3287968   Max.   :753718
##  (Other):418                     NA's   :30
library(psych)
describe(rfcus.df)
##                     vars   n     mean        sd  median  trimmed      mad
## Rank                   1 500   250.50    144.48   250.5   250.50   185.32
## Company.Name*          2 500   248.44    143.92   248.5   248.37   185.32
## Number.of.Employees    3 500 56955.53 123622.29 25600.0 35679.33 25352.46
## Previous.Rank          4 492   257.11    154.05   251.5   253.59   186.81
## Revenues..in...        5 500 24111.75  38337.35 11384.0 15331.88  7373.71
## Revenue.Change*        6 500   144.10     84.82   143.0   145.04   111.19
## Profits                7 499  2034.57   3816.66   780.0  1188.83   817.51
## Profit.Change*         8 500   166.07    120.91   164.5   163.39   165.31
## Assets                 9 500 80389.34 270425.70 19324.5 30796.67 20080.33
## Market.Value          10 470 41322.40  75614.01 17696.0 24360.06 18755.63
##                      min     max   range  skew kurtosis       se
## Rank                   1     500     499  0.00    -1.21     6.46
## Company.Name*          1     497     496  0.00    -1.21     6.44
## Number.of.Employees 1326 2300000 2298674 12.46   215.02  5528.56
## Previous.Rank          1     761     760  0.20    -0.81     6.95
## Revenues..in...     5145  485873  480728  5.45    46.90  1714.50
## Revenue.Change*        1     288     287 -0.07    -1.20     3.79
## Profits                1   45687   45686  5.27    41.56   170.86
## Profit.Change*         1     373     372  0.08    -1.36     5.41
## Assets               437 3287968 3287531  7.82    70.10 12093.81
## Market.Value         120  753718  753598  4.65    28.73  3487.81
str(rfcus.df)
## 'data.frame': 500 obs. of 10 variables:
## $ Rank : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Company.Name : Factor w/ 497 levels "3M","A-Mark Precious Metals",..: 473 71 45 177 299 455 131 201 54 191 ...
## $ Number.of.Employees: int 2300000 367700 116000 72700 68000 230000 204000 225000 268540 201000 ...
## $ Previous.Rank : int 1 4 3 2 5 6 7 8 10 9 ...
## $ Revenues..in... : int 485873 223604 215639 205004 192487 184840 177526 166380 163786 151800 ...
## $ Revenue.Change : Factor w/ 288 levels "-","-0.10%","-0.40%",..: 134 257 112 44 258 182 175 282 155 140 ...
## $ Profits : num 13643 24074 45687 7840 2258 ...
## $ Profit.Change : Factor w/ 373 levels "-","-0.20%","-0.30%",..: 155 182 35 131 337 255 191 52 53 94 ...
## $ Assets : int 198825 620854 321686 330314 56563 122810 94462 221690 403821 237951 ...
## $ Market.Value : int 218619 411035 753718 340056 31439 157793 81310 52968 255679 46349 ...
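The structure above shows Revenue.Change and Profit.Change stored as factors, because their values carry a "%" sign and use "-" as a placeholder. The statistics that describe() reports for the starred (*) columns are therefore computed on factor level codes, not on the percentages themselves. A minimal sketch of one way to convert such a column to numbers follows. The helper name `pct_to_numeric` is hypothetical, and treating "-" as a missing value is an assumption about the dataset.

```r
# Hypothetical helper (not part of the original report): convert strings such
# as "-0.90%" to numbers. Assumes a lone "-" means the value is unavailable.
pct_to_numeric <- function(x) {
  x <- as.character(x)        # factor -> character, so the labels are parsed
  x[x == "-"] <- NA           # placeholder becomes a missing value
  x <- gsub("[%,]", "", x)    # drop percent signs and thousands separators
  as.numeric(x)
}

# Small made-up example using the same formats seen in the summary output
example <- factor(c("-0.90%", "1.90%", "-", "2.50%"))
pct_to_numeric(example)
```

Applied to the report's data, the same helper could be called on rfcus.df$Revenue.Change and rfcus.df$Profit.Change before any numeric analysis of those columns.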
boxplot(rfcus.df$Number.of.Employees, xlab="Number of Employees", main="Number of Employees", col=c("yellow"), horizontal = TRUE)
hist(rfcus.df$Revenues..in..., xlab = "Revenues in $", main = "Distribution of Revenues", xlim = c(1, 500000))
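Revenues are strongly right-skewed (skew 5.45 in the describe() output), so most of the histogram's mass falls in the first few bins. One common option, sketched here, is to plot the revenues on a log scale. The data in this block are simulated stand-ins, not the Fortune 500 figures, and the variable names are illustrative.

```r
# Simulated, made-up revenues standing in for rfcus.df$Revenues..in...
set.seed(1)
revenues <- round(exp(rnorm(500, mean = 9.5, sd = 0.9)))

# A log10 transform compresses the long right tail
log_rev <- log10(revenues)

hist(log_rev,
     xlab = "log10(Revenues)",
     main = "Distribution of log10 Revenues")
```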
library(car)
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit
# scatterplot.matrix() is deprecated in car; scatterplotMatrix() replaces it
scatterplotMatrix(~ Rank + Number.of.Employees, data = rfcus.df)
Higher-ranked companies (those with smaller rank numbers) tend to have more employees.
# scatterplot.matrix() is deprecated in car; scatterplotMatrix() replaces it.
# car >= 3.0 syntax for the diagonal; older versions take diagonal = "histogram".
scatterplotMatrix(~ Rank + Profits + Revenues..in..., data = rfcus.df,
                  diagonal = list(method = "histogram"))
Higher-ranked companies (those with smaller rank numbers) tend to have higher profits and revenues.
library(corrgram)
corrgram(rfcus.df[c(1,4)], upper.panel = panel.pie, lower.panel = panel.cor)
cor(rfcus.df$Rank, rfcus.df[, c(3,5,9)])
##      Number.of.Employees Revenues..in...     Assets
## [1,]          -0.3453529       -0.606263 -0.3036207
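Rank is an ordinal variable, and the other columns are heavily skewed (for example, skew 12.46 for Number.of.Employees), so a rank-based correlation may be a useful complement to the Pearson values above. The sketch below compares the two on a small made-up data frame. `toy.df` and its values are hypothetical and only mimic the pattern of one very large company at rank 1.

```r
# Hypothetical toy data standing in for rfcus.df (values are made up)
toy.df <- data.frame(
  Rank = 1:8,
  Number.of.Employees = c(2300000, 90000, 120000, 60000,
                          45000, 30000, 20000, 15000)
)

# Pearson measures linear association and is pulled around by the outlier
pearson <- cor(toy.df$Rank, toy.df$Number.of.Employees)

# Spearman works on ranks, so it captures the monotone trend
spearman <- cor(toy.df$Rank, toy.df$Number.of.Employees,
                method = "spearman")

print(c(pearson = pearson, spearman = spearman))
```

On the report's data, the same comparison could be made by adding method = "spearman" to the cor() call above.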
# corrgram is already loaded above
corrgram(rfcus.df[c(1,3:5,7,9,10)], upper.panel = panel.pie)
corrgram(rfcus.df[c(1,3:5,7,9,10)], upper.panel = panel.pie, lower.panel = panel.cor)
library(corrplot)
## corrplot 0.84 loaded
corrplot(corr=cor(rfcus.df[, c(1,3:5,7,9,10)], use = "complete.obs"), method="ellipse")
Higher-ranked companies (those with smaller rank numbers) tend to have more employees and higher revenues, profits, assets and market value. This is why Rank is negatively correlated with each of these variables.
Let us assume the null hypothesis: market value and assets do not affect a company's rank.
fit <- lm(rfcus.df$Rank ~ rfcus.df$Market.Value + rfcus.df$Assets)
summary(fit)
##
## Call:
## lm(formula = rfcus.df$Rank ~ rfcus.df$Market.Value + rfcus.df$Assets)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -262.01  -97.55   -1.26   99.82  350.64
##
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)
## (Intercept)            2.916e+02  6.580e+00  44.311  < 2e-16 ***
## rfcus.df$Market.Value -8.106e-04  7.970e-05 -10.170  < 2e-16 ***
## rfcus.df$Assets       -8.776e-05  2.174e-05  -4.038 6.31e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 124.4 on 467 degrees of freedom
## (30 observations deleted due to missingness)
## Multiple R-squared: 0.2544, Adjusted R-squared: 0.2512
## F-statistic: 79.66 on 2 and 467 DF, p-value: < 2.2e-16
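Both market value and assets are statistically significant predictors of rank, since the p-value of each coefficient is below 0.05. Thus the null hypothesis is rejected. Both coefficients are negative, so larger market value and assets go with a smaller rank number, that is, a better rank. The model explains about 25% of the variance in rank (R-squared = 0.2544).
Let us assume the null hypothesis: profit does not affect a company's rank.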
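To make the fitted coefficients concrete, here is a small worked prediction. The coefficients are copied from the summary(fit) output above. The company is hypothetical, with market value and assets expressed in the same units as the dataset.

```r
# Coefficients copied from the summary(fit) output
b0       <- 2.916e+02    # intercept
b_mv     <- -8.106e-04   # slope for Market.Value
b_assets <- -8.776e-05   # slope for Assets

# Hypothetical company (made-up values, in the dataset's units)
market_value <- 100000
assets       <- 200000

# 291.6 - 81.06 - 17.552
predicted_rank <- b0 + b_mv * market_value + b_assets * assets
predicted_rank   # 192.988
```

With a residual standard error of 124.4, an individual prediction like this one is quite imprecise. The sketch only illustrates how the coefficients combine.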
cor.test(rfcus.df$Rank, rfcus.df$Profits)
##
## Pearson's product-moment correlation
##
## data: rfcus.df$Rank and rfcus.df$Profits
## t = -11.882, df = 497, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.5359989 -0.3990469
## sample estimates:
## cor
## -0.47035
The correlation between profit and rank is negative (r = -0.47) and statistically significant, since its p-value is below 0.05. Thus the null hypothesis is rejected: more profitable companies tend to rank higher.
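As a sanity check on the test output, the t statistic of a Pearson correlation test can be recomputed from the reported correlation and degrees of freedom using t = r * sqrt(df) / sqrt(1 - r^2). The sketch below uses only the two numbers printed by cor.test. The variable names are illustrative.

```r
r  <- -0.47035   # sample correlation reported by cor.test
df <- 497        # degrees of freedom reported by cor.test (n - 2)

# t statistic for H0: true correlation equals 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

round(t_stat, 3)   # agrees with the reported t = -11.882
p_value < 2.2e-16
```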