Please delete all the intro text I wrote from line 22 to line 69 and start writing your short biography after this blockquote.
Bio
I am Suize (Syuzanna) Melkonyan, a current MFA student at London Business School. I was born and raised in Armenia and later moved to the United States for my undergraduate studies. In 2019, I received my bachelor's degree in Management - Finance from the University of Massachusetts Boston. I have always been passionate about finance and have long wanted to deepen my understanding of it. The opportunity to pursue a Masters in Financial Analysis at one of the top-ranked universities in the world makes me even more eager to learn as much as I can and reach my career goals.
Besides gaining knowledge and developing skills in finance, I want to expand my network and be part of the diverse LBS community. To that end, I have joined a number of clubs at LBS and am looking forward to meeting other students from many different backgrounds and cultures. A few of these clubs are:
Further details on my education and experience are available on my LinkedIn profile.
gapminder country comparison
You have seen the gapminder dataset, which has data on life expectancy, population, and GDP per capita for 142 countries from 1952 to 2007. To get a glimpse of the dataframe, namely to see the variable names, variable types, etc., we use the glimpse function. We also want to have a look at the first 20 rows of data.
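The chunk that produced the output that follows is not shown in this rendering. A minimal sketch of what it could look like, assuming the gapminder and dplyr packages are installed (the name first_20 is only illustrative):

```r
library(gapminder)  # provides the gapminder data frame
library(dplyr)      # provides glimpse()

# Variable names, variable types, and the first few values of each column
glimpse(gapminder)

# First 20 rows of the data (first_20 is an illustrative name)
first_20 <- head(gapminder, 20)
print(first_20)
```

head() keeps the rows in their stored order, so the first 20 rows cover all of Afghanistan and the start of Albania.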
## Rows: 1,704
## Columns: 6
## $ country <fct> Afghanistan, Afghanistan, Afghanistan, Afghanistan, Afgha...
## $ continent <fct> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asi...
## $ year <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 199...
## $ lifeExp <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 4...
## $ pop <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372,...
## $ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.113...
## # A tibble: 20 x 6
## country continent year lifeExp pop gdpPercap
## <fct> <fct> <int> <dbl> <int> <dbl>
## 1 Afghanistan Asia 1952 28.8 8425333 779.
## 2 Afghanistan Asia 1957 30.3 9240934 821.
## 3 Afghanistan Asia 1962 32.0 10267083 853.
## 4 Afghanistan Asia 1967 34.0 11537966 836.
## 5 Afghanistan Asia 1972 36.1 13079460 740.
## 6 Afghanistan Asia 1977 38.4 14880372 786.
## 7 Afghanistan Asia 1982 39.9 12881816 978.
## 8 Afghanistan Asia 1987 40.8 13867957 852.
## 9 Afghanistan Asia 1992 41.7 16317921 649.
## 10 Afghanistan Asia 1997 41.8 22227415 635.
## 11 Afghanistan Asia 2002 42.1 25268405 727.
## 12 Afghanistan Asia 2007 43.8 31889923 975.
## 13 Albania Europe 1952 55.2 1282697 1601.
## 14 Albania Europe 1957 59.3 1476505 1942.
## 15 Albania Europe 1962 64.8 1728137 2313.
## 16 Albania Europe 1967 66.2 1984060 2760.
## 17 Albania Europe 1972 67.7 2263554 3313.
## 18 Albania Europe 1977 68.9 2509048 3533.
## 19 Albania Europe 1982 70.4 2780097 3631.
## 20 Albania Europe 1987 72 3075321 3739.
Your task is to produce two graphs of how life expectancy has changed over the years for the country and the continent you come from.
I have created the country_data and continent_data with the code below.
country_data <- gapminder %>%
  filter(country == "United States")
continent_data <- gapminder %>%
  filter(continent == "Americas")
# I am originally from Armenia. However, gapminder does not seem to include data for Armenia, so I have completed the task using the US, where I have lived for the past few years.
First, create a plot of life expectancy over time for the single country you chose. You should use geom_point() to see the actual data points and geom_smooth(se = FALSE) to plot the underlying trendlines. You need to remove the comments # from the lines below for your code to run.
plot1 <- ggplot(data = country_data, mapping = aes(x = year, y = lifeExp)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  NULL
plot1
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Next we need to add a title. Create a new plot, or extend plot1, using the labs() function to add an informative title to the plot.
plot1 <- ggplot(data = country_data, mapping = aes(x = year, y = lifeExp)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  labs(title = "Life expectancy in the US 1952-2007",
       x = "Year",
       y = "Life Expectancy") +
  NULL
print(plot1)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Secondly, produce a plot for all countries in the continent you come from. (Hint: map the country variable to the colour aesthetic).
ggplot(data = continent_data, mapping = aes(x = year, y = lifeExp, colour = country)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  NULL
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Finally, using the original gapminder data, produce a life expectancy over time graph, grouped (or faceted) by continent. We will remove all legends by adding theme(legend.position="none") at the end of our ggplot call.
ggplot(data = gapminder, mapping = aes(x = year, y = lifeExp, colour = continent)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  facet_wrap(~continent) +
  theme(legend.position = "none") +
  NULL
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Given these trends, what can you say about life expectancy since 1952? Again, don’t just say what’s happening in the graph. Tell some sort of story and speculate about the differences in the patterns.
Type your answer after this blockquote.
Since 1952, life expectancy has risen steadily on all five continents. Several factors lie behind this gradual upward trend. First, societies have developed and socio-economic conditions have improved. People have become more conscious of personal hygiene, which has lowered mortality rates over time. Second, innovations in healthcare and medicine have played an important role: many diseases can now be treated with technology and medicines that were not available for much of the 20th century. Finally, a greater focus on healthy lifestyles, nutrition, and exercise has contributed to the trend as well.
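The claim that life expectancy rose on every continent can also be checked numerically. The sketch below is one possible check, not part of the original assignment: it compares the unweighted mean life expectancy across countries in each continent in 1952 and 2007. The names life_exp_change, start_1952, end_2007, and gain are illustrative.

```r
library(gapminder)
library(dplyr)

# Unweighted mean life expectancy per continent in the first and last years
life_exp_change <- gapminder %>%
  filter(year %in% c(1952, 2007)) %>%
  group_by(continent, year) %>%
  summarise(mean_life_exp = mean(lifeExp), .groups = "drop") %>%
  group_by(continent) %>%
  summarise(
    start_1952 = mean_life_exp[year == 1952],
    end_2007   = mean_life_exp[year == 2007],
    gain       = end_2007 - start_1952  # positive means life expectancy rose
  )

print(life_exp_change)
```

Because the means are not weighted by population, large and small countries count equally, which matches how the smoothed lines in the faceted plot treat each point.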
We will have a quick look at the results of the 2016 Brexit vote in the UK. First we read the data using read_csv() and take a quick glimpse at the data.
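The chunk that reads the file is not shown here. A minimal sketch, assuming the readr and dplyr packages are installed: the helper load_brexit_results and the file name brexit_results.csv are hypothetical, since the actual file location is not given in this document.

```r
library(readr)
library(dplyr)

# Hypothetical helper: read the constituency results from a CSV file at `path`
load_brexit_results <- function(path) {
  read_csv(path, show_col_types = FALSE)
}

# Hypothetical location; replace with wherever the file actually lives
brexit_path <- "brexit_results.csv"

# Only read and inspect the data if the file is really there
if (file.exists(brexit_path)) {
  brexit_results <- load_brexit_results(brexit_path)
  glimpse(brexit_results)
}
```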
## Rows: 632
## Columns: 11
## $ Seat <chr> "Aldershot", "Aldridge-Brownhills", "Altrincham and Sal...
## $ con_2015 <dbl> 50.592, 52.050, 52.994, 43.979, 60.788, 22.418, 52.454,...
## $ lab_2015 <dbl> 18.333, 22.369, 26.686, 34.781, 11.197, 41.022, 18.441,...
## $ ld_2015 <dbl> 8.824, 3.367, 8.383, 2.975, 7.192, 14.828, 5.984, 2.423...
## $ ukip_2015 <dbl> 17.867, 19.624, 8.011, 15.887, 14.438, 21.409, 18.821, ...
## $ leave_share <dbl> 57.89777, 67.79635, 38.58780, 65.29912, 49.70111, 70.47...
## $ born_in_uk <dbl> 83.10464, 96.12207, 90.48566, 97.30437, 93.33793, 96.96...
## $ male <dbl> 49.89896, 48.92951, 48.90621, 49.21657, 48.00189, 49.17...
## $ unemployed <dbl> 3.637000, 4.553607, 3.039963, 4.261173, 2.468100, 4.742...
## $ degree <dbl> 13.870661, 9.974114, 28.600135, 9.336294, 18.775591, 6....
## $ age_18to24 <dbl> 9.406093, 7.325850, 6.437453, 7.747801, 5.734730, 8.209...
The data comes from Elliott Morris, who cleaned it and made it available through his DataCamp class on analysing election and polling data in R.
Our main outcome variable (or y) is leave_share, which is the percent of votes cast in favour of Brexit, or leaving the EU. Each row is a UK parliament constituency.
To get a sense of the spread of the data, plot a histogram and a density plot of the leave share in all constituencies.
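One way to draw both plots is sketched below, assuming ggplot2 is installed. To keep the example self-contained, it uses only the first five leave_share values visible in the glimpse output above; with the full data, brexit_results would be passed instead. The function names, bin width, and label text are illustrative choices.

```r
library(ggplot2)

# Illustrative sample: the first five leave_share values from the glimpse output
brexit_sample <- data.frame(
  leave_share = c(57.89777, 67.79635, 38.58780, 65.29912, 49.70111)
)

# Histogram of the leave share (the 2.5-point bin width is an arbitrary choice)
plot_leave_histogram <- function(data) {
  ggplot(data, aes(x = leave_share)) +
    geom_histogram(binwidth = 2.5) +
    labs(title = "Distribution of the leave share across constituencies",
         x = "% of votes for leaving the EU",
         y = "Number of constituencies")
}

# Density plot of the same variable
plot_leave_density <- function(data) {
  ggplot(data, aes(x = leave_share)) +
    geom_density() +
    labs(title = "Density of the leave share across constituencies",
         x = "% of votes for leaving the EU",
         y = "Density")
}

print(plot_leave_histogram(brexit_sample))
print(plot_leave_density(brexit_sample))
```

The histogram shows raw counts per bin, while the density plot smooths them into a curve, so the two together give a quick sense of where most constituencies fall.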
One common explanation for the Brexit outcome was fear of immigration and opposition to the EU’s more open border policy. We can check the relationship (or correlation) between the proportion of native born residents (born_in_uk) in a constituency and its leave_share. To do this, let us compute the correlation between the two variables.
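A correlation matrix like the one that follows can be obtained by selecting the two columns and passing them to cor(). The sketch below assumes dplyr is installed and uses only the first five rows visible in the glimpse output, so its numbers will differ from the full-data result; with the full data, brexit_results would take the place of brexit_sample.

```r
library(dplyr)

# Illustrative sample: first five values of each column from the glimpse output
brexit_sample <- data.frame(
  leave_share = c(57.89777, 67.79635, 38.58780, 65.29912, 49.70111),
  born_in_uk  = c(83.10464, 96.12207, 90.48566, 97.30437, 93.33793)
)

# Pearson correlation matrix of the two variables
leave_birth_cor <- brexit_sample %>%
  select(leave_share, born_in_uk) %>%
  cor()

print(leave_birth_cor)
```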
## leave_share born_in_uk
## leave_share 1.0000000 0.4934295
## born_in_uk 0.4934295 1.0000000
The correlation is about 0.49, which indicates a moderate positive relationship: constituencies with a higher share of UK-born residents tended to have a higher leave share.
We can also create a scatterplot between these two variables using geom_point. We also add the best fit line, using geom_smooth(method = "lm").
ggplot(brexit_results, aes(x = born_in_uk, y = leave_share)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "lm") +
  theme_bw() +
  labs(title = "Brexit referendum results by constituency",
       x = "% of residents born in the UK",
       y = "% of votes for leaving the EU")
## `geom_smooth()` using formula 'y ~ x'
You have the code for the plots. I would like you to revisit all of them and use the labs() function to add an informative title, subtitle, and axis titles to each plot.
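As one illustration of this step, the faceted continent plot from earlier could be extended as sketched below, assuming gapminder and ggplot2 are installed. The title, subtitle, and axis wording are only suggestions, and continent_plot is an illustrative name.

```r
library(gapminder)
library(ggplot2)

# Faceted life expectancy plot with a title, subtitle, and axis titles
continent_plot <- ggplot(gapminder,
                         aes(x = year, y = lifeExp, colour = continent)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  facet_wrap(~continent) +
  theme(legend.position = "none") +
  labs(
    title = "Life expectancy by continent, 1952-2007",
    subtitle = "Each point is one country in one year; lines are smoothed trends",
    x = "Year",
    y = "Life expectancy (years)"
  )

print(continent_plot)
```

The same labs() call, with different text, can be appended to each of the other plots.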
What can you say about the relationship shown above? Again, don’t just say what’s happening in the graph. Tell some sort of story and speculate about the differences in the patterns.
Type your answer after, and outside, this blockquote.
The visualization of the 2016 Brexit referendum results indicates a positive correlation between the proportion of UK-born residents in a constituency and the share of votes for leaving the EU. This is consistent with the explanation that UK-born residents may have had more fears about the EU’s open immigration policy than non-UK-born residents, i.e. immigrants.
Knit the completed R Markdown file as an HTML or Word document (use the “Knit” button at the top of the script editor window) and upload it to Canvas.
If you want to, please answer the following