The WHO package, prepared by Eric Persson in February 2016, provides simple access to data collected by the WHO (World Health Organization).
You can also read more about what the package can do here
This post builds a file of WHO data for later analysis
library(WHO)
library(tidyverse)
Modified versions of the functions suggested in Peter’s Stats Stuff will be used
#----------------helper functions---------------
prep <- function(data){
  # drops rows with missing values in the region or country variables
  data %>%
    filter(!is.na(region)) %>%
    filter(!is.na(country))
}
latest <- function(data){
  # returns the most recent available data for each country
  data %>%
    group_by(country) %>%
    filter(year == max(year))
}
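To see what the two helpers do, here is a minimal sketch on a toy table. The data frame `toy` and its values are hypothetical and are not WHO figures; the helpers are repeated only so that the example runs on its own.

```r
library(dplyr)

# Copies of the helpers, repeated so the example is self-contained
prep <- function(data) {
  data %>%
    filter(!is.na(region)) %>%
    filter(!is.na(country))
}

latest <- function(data) {
  data %>%
    group_by(country) %>%
    filter(year == max(year))
}

# Hypothetical toy data, not real WHO figures
toy <- tibble(
  country = c("A", "A", "B", NA),
  region  = c("Europe", "Europe", "Africa", "Global"),
  year    = c(2010, 2012, 2011, 2012),
  value   = c(1, 2, 3, 4)
)

# The row with a missing country is dropped by prep();
# latest() then keeps only the newest year per country
toy_latest <- toy %>% prep() %>% latest()
toy_latest
```

Note that the result is still grouped by `country`; calling `ungroup()` afterwards is an option if later steps should not inherit the grouping.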
Working with WHO data requires knowing the indicator codes that the WHO uses
codes <- get_codes()
glimpse(codes)
## Observations: 2,228
## Variables: 3
## $ label <chr> "MDG_0000000001", "MDG_0000000003", "MDG_0000000005", ...
## $ display <chr> "Infant mortality rate (probability of dying between b...
## $ url <chr> "http://apps.who.int/gho/indicatorregistry/App_Main/vi...
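With more than two thousand codes, a keyword search over the `display` column is one way to find an indicator. The sketch below uses a small stand-in table, `codes_demo`, which is hypothetical: it only mimics the `label` and `display` columns shown above so that the example runs without a network connection. The same `grepl()` call could be applied to `codes` itself.

```r
# Hypothetical stand-in for the result of get_codes():
# three rows with the same column names as in the output above
codes_demo <- data.frame(
  label   = c("WHOSIS_000001", "WHOSIS_000002", "WHS9_95"),
  display = c("Life expectancy at birth (years)",
              "Healthy life expectancy (HALE) at birth (years)",
              "Total fertility rate (per woman)"),
  stringsAsFactors = FALSE
)

# Case-insensitive keyword search in the indicator descriptions
hits <- codes_demo[grepl("life expectancy", codes_demo$display,
                         ignore.case = TRUE), ]
hits$label
```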
Names of the indicators used by the WHO (a selection is shown in the table)
| Код индикатора | Описание индикатора |
|---|---|
| WHOSIS_000001 | Life expectancy at birth (years) |
| WHOSIS_000002 | Healthy life expectancy (HALE) at birth (years) |
| WHOSIS_000004 | Adult mortality rate (probability of dying between 15 and 60 years per 1000 population) |
| GBD_76 | GBD - Disability-adjusted life years (DALYS) |
| GBD_77 | GBD - Years of life lost (YLL) |
| GBD_78 | GBD - Years lost due to disability (YLD) |
| MDG_0000000001 | Infant mortality rate (probability of dying between birth and age 1 per 1000 live births) |
| MDG_0000000003 | Adolescent birth rate (per 1000 women aged 15-19 years) |
| MDG_0000000005 | Contraceptive prevalence (%) |
| WHOSIS_000005 | Low-birth-weight newborns (%) |
| WHS9_85 | Literacy rate among adults aged >= 15 years (%) |
| WHS9_93 | Gross national income per capita (PPP int. $) |
| WHS9_95 | Total fertility rate (per woman) |
| WHS6_102 | Hospital beds (per 10 000 population) |
| WHS7_105 | Per capita total expenditure on health (PPP int. $) |
Per capita total expenditure on health
# Get the data
wh105 <- get_data("WHS7_105")
glimpse(wh105)
## Observations: 4,015
## Variables: 8
## $ publishstate <chr> "Published", "Published", "Published", "P...
## $ year <dbl> 2002, 2009, 2010, 2005, 2007, 1996, 2013,...
## $ country <chr> "Andorra", "Andorra", "Andorra", "United ...
## $ worldbankincomegroup <chr> "High-income", "High-income", "High-incom...
## $ region <chr> "Europe", "Europe", "Europe", "Eastern Me...
## $ gho <chr> "Per capita total expenditure on health (...
## $ value <dbl> 2101.67, 2926.50, 3325.20, 1973.48, 1953....
## $ datasource <chr> NA, NA, NA, NA, NA, "Argentina note", "Ar...
# Prepare the data
wh105 <- latest(wh105) %>%
  # drop global aggregates and rows without an income group;
  # the string "NA" alone would not match real missing values
  filter(!is.na(worldbankincomegroup),
         !worldbankincomegroup %in% c("Global", "NA")) %>%
  prep() %>%
  select(country, worldbankincomegroup, value) %>%
  rename(PerCapitaTotalHealthExpenditure = value) %>%
  mutate(worldbankincomegroup = factor(worldbankincomegroup,
    levels = c("Low-income", "Lower-middle-income", "Upper-middle-income", "High-income")))
Life expectancy at birth
# Get the data
wh001 <- get_data("WHOSIS_000001")
glimpse(wh001)
## Observations: 9,120
## Variables: 7
## $ publishstate <chr> "Published", "Published", "Published", "Published...
## $ sex <chr> "Female", "Both sexes", "Female", "Male", "Female...
## $ gho <chr> "Life expectancy at birth (years)", "Life expecta...
## $ region <chr> "Eastern Mediterranean", "Eastern Mediterranean",...
## $ year <dbl> 2001, 2009, 2012, 2005, 2009, 2015, 2000, 2003, 2...
## $ country <chr> "Afghanistan", "Afghanistan", "Afghanistan", NA, ...
## $ value <dbl> 56.5, 58.6, 60.8, 51.8, 57.6, 58.3, 46.8, 48.3, 4...
# Prepare the data
wh001 <- wh001 %>%
  filter(sex == "Both sexes") %>%
  prep() %>%
  latest() %>%
  select(country, region, value) %>%
  rename(LifeExpectancy = value) %>%
  mutate(region = factor(region))
Adolescent birth rate (per 1000 women aged 15-19 years)
# Get the data
mdg003 <- get_data("MDG_0000000003")
glimpse(mdg003)
## Observations: 200
## Variables: 8
## $ publishstate <chr> "Published", "Published", "Published", "P...
## $ region <chr> "Europe", "Europe", "Europe", "Europe", "...
## $ country <chr> "Albania", "Armenia", "Iceland", "Israel"...
## $ gho <chr> "Adolescent birth rate (per 1000 women ag...
## $ worldbankincomegroup <chr> "Upper-middle-income", "Lower-middle-inco...
## $ sex <chr> "Female", "Female", "Female", "Female", "...
## $ year <dbl> 2013, 2013, 2013, 2014, 2013, 2012, 2013,...
## $ value <dbl> 19.7, 22.7, 7.1, 10.2, 14.2, 28.4, 40.8, ...
# Prepare the data
mdg003 <- mdg003 %>% prep() %>% latest() %>%
  select(country, value) %>%
  rename(AdolescentBirthRate = value)
Contraceptive prevalence
# Get the data
mdg005 <- get_data("MDG_0000000005")
glimpse(mdg005)
## Observations: 256
## Variables: 7
## $ country <chr> "Cambodia", "Nicaragua", "Sudan", "Albani...
## $ gho <chr> "Contraceptive prevalence (%)", "Contrace...
## $ year <dbl> 2011, 2012, 2010, 2009, 2012, 2005, 2010,...
## $ publishstate <chr> "Published", "Published", "Published", "P...
## $ region <chr> "Western Pacific", "Americas", "Eastern M...
## $ value <dbl> 50.5, 80.4, 9.0, 69.3, 5.6, 47.3, 38.5, 7...
## $ worldbankincomegroup <chr> NA, NA, NA, NA, NA, NA, "Low-income", NA,...
# Prepare the data
mdg005 <- mdg005 %>% prep() %>% latest() %>%
  select(country, value) %>%
  rename(ContraceptivePrevalence = value)
Share of low-birth-weight newborns
# Get the data
wh005 <- get_data("WHOSIS_000005")
glimpse(wh005)
## Observations: 226
## Variables: 7
## $ gho <chr> "Low-birth-weight newborns (%)", "Low-bir...
## $ year <dbl> 2007, 2010, 2010, 2010, 2007, 2009, 2009,...
## $ country <chr> "Barbados", "Bhutan", "Democratic Republi...
## $ region <chr> "Americas", "South-East Asia", "Africa", ...
## $ publishstate <chr> "Published", "Published", "Published", "P...
## $ value <dbl> 12, 10, 10, 7, 10, 9, 10, 10, 5, 4, 9, 13...
## $ worldbankincomegroup <chr> NA, NA, NA, NA, NA, NA, NA, "Lower-middle...
# Prepare the data
wh005 <- wh005 %>% prep() %>% latest() %>%
  select(country, value) %>%
  rename(LowBirthWeight = value)
Literacy rate among adults aged 15 years and older
# Get the data
wh85 <- get_data("WHS9_85")
glimpse(wh85)
## Observations: 142
## Variables: 7
## $ publishstate <chr> "Published", "Published", "Published", "P...
## $ gho <chr> "Literacy rate among adults aged >= 15 ye...
## $ worldbankincomegroup <chr> "Lower-middle-income", "Low-income", "Glo...
## $ country <chr> "Senegal", "Zimbabwe", NA, "Oman", "Saudi...
## $ year <chr> "2007-2012", "2007-2012", "2007-2012", "2...
## $ region <chr> "Africa", "Africa", "Global", "Eastern Me...
## $ value <dbl> 50, 84, 84, 87, 87, 55, 57, 58, 70, 74, 2...
# Prepare the data
wh85 <- wh85 %>% prep() %>% latest() %>%
  select(country, value) %>%
  rename(LiteracyRate = value)
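For this indicator `year` is a character column holding periods such as "2007-2012", so `max(year)` inside `latest()` compares strings rather than numbers. The short sketch below shows that behaviour. The value "2000-2006" is hypothetical and added only for contrast, and reading the first four characters as the start year is an assumption about the format.

```r
# Periods stored as text, as in the glimpse() output for WHS9_85;
# "2000-2006" is a hypothetical value added for illustration
years <- c("2007-2012", "2000-2006", "2007-2012")

# max() on a character vector returns the last string in sort order
latest_period <- max(years)
latest_period

# Assumption: the first four characters are the start year of the period
start_year <- as.integer(substr(years, 1, 4))
max(start_year)
```

For four-digit years in this layout, string order and numeric order agree, which is why `latest()` still behaves sensibly here.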
Number of children per woman
# Get the data
wh95 <- get_data("WHS9_95")
glimpse(wh95)
## Observations: 570
## Variables: 7
## $ country <chr> "Hungary", "Lao People's Democratic Repub...
## $ year <dbl> 2012, 2011, 2012, 2011, 2011, 2011, 2012,...
## $ region <chr> "Europe", "Western Pacific", "Africa", "A...
## $ gho <chr> "Total fertility rate (per woman)", "Tota...
## $ publishstate <chr> "Published", "Published", "Published", "P...
## $ value <dbl> 1.40, 3.20, 3.09, 4.95, 4.59, 1.38, 3.05,...
## $ worldbankincomegroup <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
# Prepare the data
wh95 <- wh95 %>% prep() %>% latest() %>%
  select(country, value) %>%
  rename(TotalFertilityRate = value)
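The last five indicators go through the same steps: `prep()`, `latest()`, keep `country` and `value`, then rename `value`. As a sketch, those steps could be wrapped in one function. The name `tidy_indicator` and the toy data are hypothetical and are not part of the WHO package or of the original helpers.

```r
library(dplyr)

# Copies of the helpers, repeated so the example is self-contained
prep <- function(data) {
  data %>%
    filter(!is.na(region)) %>%
    filter(!is.na(country))
}

latest <- function(data) {
  data %>%
    group_by(country) %>%
    filter(year == max(year))
}

# Hypothetical wrapper: clean one indicator and name its value column
tidy_indicator <- function(data, new_name) {
  out <- data %>%
    prep() %>%
    latest() %>%
    ungroup() %>%
    select(country, value)
  # rename the value column to the requested name
  names(out)[names(out) == "value"] <- new_name
  out
}

# Hypothetical toy data, not real WHO figures
toy <- tibble(
  country = c("A", "A", "B"),
  region  = c("Europe", "Europe", "Africa"),
  year    = c(2011, 2012, 2012),
  value   = c(1.4, 1.5, 4.9)
)

res <- tidy_indicator(toy, "TotalFertilityRate")
res
```

With real data the call would look like `tidy_indicator(get_data("WHS9_95"), "TotalFertilityRate")`, which needs a network connection.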
Joining the data frames gives the table that will be used for further study
whoJ <- inner_join(wh105, wh001, by = "country")
whoJ <- left_join(whoJ, mdg003, by = "country")
whoJ <- left_join(whoJ, mdg005, by = "country")
whoJ <- left_join(whoJ, wh005, by = "country")
whoJ <- left_join(whoJ, wh85, by = "country")
whoJ <- left_join(whoJ, wh95, by = "country")
summary(whoJ)
## country worldbankincomegroup
## Length:181 Low-income :29
## Class :character Lower-middle-income:49
## Mode :character Upper-middle-income:48
## High-income :55
##
##
##
## PerCapitaTotalHealthExpenditure region
## Min. : 24.96 Eastern Mediterranean:20
## 1st Qu.: 202.16 Europe :50
## Median : 698.30 Africa :47
## Mean :1269.48 Americas :33
## 3rd Qu.:1718.02 Western Pacific :21
## Max. :9402.54 South-East Asia :10
##
## LifeExpectancy AdolescentBirthRate ContraceptivePrevalence
## Min. :50.10 Min. : 1.70 Min. : 4.00
## 1st Qu.:65.70 1st Qu.: 17.00 1st Qu.:34.40
## Median :73.50 Median : 45.70 Median :54.80
## Mean :71.38 Mean : 56.77 Mean :51.21
## 3rd Qu.:76.70 3rd Qu.: 84.00 3rd Qu.:70.30
## Max. :83.70 Max. :229.00 Max. :88.40
## NA's :16
## LowBirthWeight LiteracyRate TotalFertilityRate
## Min. : 3.00 Min. : 29.00 Min. :1.300
## 1st Qu.: 6.00 1st Qu.: 71.00 1st Qu.:1.800
## Median : 9.00 Median : 90.00 Median :2.300
## Mean :10.52 Mean : 83.36 Mean :2.863
## 3rd Qu.:13.00 3rd Qu.: 98.00 3rd Qu.:3.800
## Max. :34.00 Max. :100.00 Max. :7.600
## NA's :7 NA's :50
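The `NA's` counts in the summary come from the choice of join. A minimal sketch of the difference between `inner_join()` and `left_join()`, using hypothetical toy tables rather than WHO data:

```r
library(dplyr)

# Hypothetical toy tables, not real WHO figures
spending <- tibble(country = c("A", "B", "C"),
                   PerCapitaTotalHealthExpenditure = c(100, 200, 300))
life     <- tibble(country = c("A", "B"),
                   LifeExpectancy = c(70, 80))
literacy <- tibble(country = "A",
                   LiteracyRate = 95)

# inner_join keeps only countries present in both tables ("C" is dropped)
joined <- inner_join(spending, life, by = "country")

# left_join keeps every row of the left table and fills the gaps with NA
joined <- left_join(joined, literacy, by = "country")
joined
```

Country "B" stays in the result with a missing `LiteracyRate`, which is the same mechanism that produces missing values for countries without a given indicator in `whoJ`.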
To avoid repeating all of these steps, the result is saved to a file that will be used later.
write.csv(whoJ, "WHOData.csv", row.names = FALSE)
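A CSV file stores factors as plain text, so the ordering of the income groups has to be restored after reading the file back. The sketch below shows one way to do that. The two-row data frame `demo` is a hypothetical stand-in for `whoJ`, and a temporary file is used instead of "WHOData.csv".

```r
income_levels <- c("Low-income", "Lower-middle-income",
                   "Upper-middle-income", "High-income")

# Hypothetical two-row stand-in for whoJ
demo <- data.frame(
  country = c("A", "B"),
  worldbankincomegroup = factor(c("High-income", "Low-income"),
                                levels = income_levels),
  stringsAsFactors = FALSE
)

# Write to a temporary file and read it back
path <- tempfile(fileext = ".csv")
write.csv(demo, path, row.names = FALSE)
restored <- read.csv(path, stringsAsFactors = FALSE)

# The income group is now a character column; rebuild the factor
restored$worldbankincomegroup <- factor(restored$worldbankincomegroup,
                                        levels = income_levels)
levels(restored$worldbankincomegroup)
```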
To be continued