1 データ情報および変更履歴:

  • “DataFormat.csv” latest version 2021年2月10日
  • “DataFormat.csv” latest version 2021年3月11日受領

<追記修正箇所等>

  • V列

    • 熊本県の欠損値について、国の示す推定方法により数値を記載 -
    • 熊本県女性の健康寿命の推定値
  • FI列、FJ列、FK列、FL列  収集元データの年度修正(新しいデータに置き換え)

    • 野菜摂取量_2016(20歳以上平均値(g/日)
    • 食塩摂取量_2016(20歳以上平均値(g/日)
    • BMI平均値_2016(男性20〜69歳)(女性40〜69歳)(単位Kg/u)
    • 歩数_2016(20歳以上平均値(歩/日)
  • DH列、DI列、DJ列、DK列のデータ収集元URLを修正

    • 悪性新生物(子宮)_年齢調整死亡率2015
    • 心疾患_年齢調整死亡率2015
    • 肺炎_年齢調整死亡率2015
    • 急性心筋梗塞_年齢調整死亡率2015
    • (✕「患者調査」から→〇「人口動態統計特殊報告」)
  • EF列、EG列、EH列、EI列、EJ列

    • バリアフリー化の総数であったものをバリアフリー化率に置き換え
      • 一定のバリアフリー化率_2018
      • 高度のバリアフリー化率_2018
      • バリアフリー_手すりがある2018
      • バリアフリー_廊下などが車いすで通行可能な幅2018
      • バリアフリー_段差のない屋内2018
    • 参考事項:バリアフリー化率の出典元を発見
# devtools::install_github(repo = "luka3117/JcPackage/OsakaUniv2020")

# 使用package
suppressMessages(library(readxl))
suppressMessages(library(dplyr))
suppressMessages(library(data.table))
suppressMessages(library(kableExtra))
suppressMessages(library(curl))
suppressMessages(library(tidyverse))
suppressMessages(library(plotly))

2 data import and pre-processing

  • row data data 次元

  • 94行と 168変数

  • 168変数の中、一部の変数は男女の区別がないため、県単位のデータ(47行のデータ)として扱う

3 寿命データの説明

3.1 平均寿命_2015

出典:平成27年都道府県別生命表の概況 取得元:以下のリンク

https://www.mhlw.go.jp/toukei/saikin/hw/life/tdfk15/index.html

詳細説明: ・「都道府県別にみた平均寿命の推移」をダウンロード ・男性の平均寿命は「表5-1 平均寿命の推移(男)」より「平成27年」のものを、女性の平均寿命は「表5-2 平均寿命の推移(女)」より「平成27年」のものを取得

3.2 変数名:健康寿命_2016

出典:厚生労働科学研究 健康寿命のページ(http://toukei.umin.jp/kenkoujyumyou/)

「都道府県別健康寿命(2010~2016年)」をダウンロード ・男性の健康寿命はシート「付表1-1」より「I列」の2016年の推定値を、女性の健康寿命はシート「付表1-2」より「日常生活に制限のない期間の平均」を取得。

・国民生活基礎調査は熊本地震により2016年の熊本県の健康情報を調査していないが、熊本県の2016年健康寿命のデータに熊本県の3年次の平均を下式による、2016年の仮定に基づく推定値を用いて算定した  (熊本県の2016年の仮定に基づく推定値)=(熊本県の2013年の推定値)×(全国の2016年の推定値)/(全国の2013年の推定値)

3.3 順位

Align labels on the left or right edge
Set direction to “y” and try hjust 0.5, 0, and 1:

set.seed(42)

p <- ggplot(mtcars, aes(y = wt, x = 1, label = rownames(mtcars))) +
  geom_point(color = "red") +
  ylim(1, 5.5) +
  theme(
    axis.line.x  = element_blank(),
    axis.ticks.x = element_blank(),
    axis.text.x  = element_blank(),
    axis.title.x = element_blank()
  )

p1 <- p +
  xlim(1, 1.375) +
  geom_text_repel(
    force        = 0.5,
    nudge_x      = 0.15,
    direction    = "y",
    hjust        = 0,
    segment.size = 0.2
  ) +
  ggtitle("hjust = 0")

p2 <- p +
  xlim(1, 1.375) +
  geom_text_repel(
    force        = 0.5,
    nudge_x      = 0.2,
    direction    = "y",
    hjust        = 0.5,
    segment.size = 0.2
  ) +
  ggtitle("hjust = 0.5 (default)")

p3 <- p +
  xlim(0.25, 1) +
  scale_y_continuous(position = "right") +
  geom_text_repel(
    force        = 0.5,
    nudge_x      = -0.25,
    direction    = "y",
    hjust        = 1,
    segment.size = 0.2
  ) +
  ggtitle("hjust = 1")

gridExtra::grid.arrange(p1, p2, p3, ncol = 3)
getwd()
## [1] "/Users/jc/Dropbox/00000\u5065\u5eb7\u548c\u6b4c\u5c71\u770c/0 wakayamaPkg/on working/slide1"
# d<-read_csv("../..//data/DataFormat.csv", skip=1, locale = readr::locale(encoding = "CP932"))

file<-"../../data/DataFormat-32021年3月11日受領/DataFormat.csv"
d<-read_csv(file, skip=1, locale = readr::locale(encoding = "CP932"))


d<-d[-95,]
d %>% dim()
## [1]  94 168
d<-d %>% rename(key=X1)
d %>% DT::datatable()

3.4 変数名を英語に変換 :data名d

var<-read.csv("../..//var_name_Eng.csv")
colnames(d)<-var$var_name_Eng
# d %>% DT::datatable()

3.4.1 変数名:和英対応表

read.csv("../../JpnEng.csv") %>% DT::datatable()

3.5 男女区別のないデータ d_common

  • 変数は 104
d_common<-d[, sapply(d[48,], is.na)]

d_common<-d_common[1:47, ]

# d_common%>% DT::datatable()

d_common<-d %>% select(1:6) %>% .[1:47,] %>% bind_cols(d_common)

d_common %>% DT::datatable()
d_common %>% dim()
## [1]  47 104

3.5.1 d_common data 変数名

{.tabset .tabset-fade .tabset-pills}

3.5.1.1 全体

d_common %>% colnames() %>%tbl_df() %>%  DT::datatable()

3.5.1.2 健康/疾患

Trt_rate_Hospitalization_Malignant_neoplasm_2017
Trt_rate_hospitalization_heart_dz_2017
Trt_rate_Hospitalization_Cerebrovascular_dz_2017
Trt_rate_Outpatient_Malignant_neoplasm_2017
Trt_rate_outpatient_heart_dz_2017
Trt_rate_Outpatient_Cerebrovascular_dz_2017

3.5.1.3 医療施設

Num_of_hospitals_2019
Num_of_clinics_2019
Num_of_general_clinics
HM_Num_of_general_hospitals
HM_Num_of_general_dental_clinics_per_100k_pop
HM_Num_of_doctors_engaged_in_medical_facilities_per_100k_pop
HM_Num_of_public_health_nurses_per_100k_pop

3.5.1.4 医療従事者

Num_of_certified_cancer_doctors_2020
Num_of_cardiologists_2020
Num_of_endoscopists_2020

3.5.1.5 文化

Book_purchase_price_2019
Num_of_libraries
Sports_Participant_Rate
Travel_Rate1
Volunteer_Activity_Participant_Rate

3.5.1.6 人口

pop_Young_pop_Ratio_2020
pop_oldElderly_pop_Ratio_2020
pop_Working_Age_pop_Ratio_2020
pop_Rough_Mortality_2020
pop_Double_income_household_ratio_2020

3.5.1.7 自然環境

Natural_environment_annual_avg_temperature
Natural_environment_annual_avg_relative_humidity
Natural_environment_annual_rain
Natural_environment_annual_Num_of_snow_days

3.5.1.8 経済

Economic_pref_income
Admin_base_Financial_strength_index
Admin_base_balance_ratio
Admin_infrastructure_living_protection_cost_ratio_(prefectural_finance)
Admin_infrastructure_Edu_cost_ratio_(prefectural_finance)
Gini_coeff_2014
Income_Gini_Coeff_Working_Household_2014

3.5.1.9 教育

Edu_Ptc_of_university_graduate_students_with_a_final_academic_background
Academic_ability_middle_school_2015
Academic_ability_elementary_school_2015

3.5.1.10 労働

Labor_primary_industry_emp_ratio
Labor_secondary_industry_emp_ratio
Labor_tertiary_industry_emp_ratio
Labor_Unemp_rate
Total_working_hours_2016
Total_salary_2016

3.5.1.11 住居

Residence_owner_ratio
Residence_house_ratio
Residence_water_supply_pop_ratio
Residence_sewerage_ratio
Residential_city_park_area_(per_pop)
Residence_road_pavement_rate
Residence_simachi_pavement_rate
Safety_Num_of_traffic_accidents_per_100k_pop
Residence_Num_of_city_parks

3.5.1.12 家計経済

Household_actual_income
Household_consumption_expenditure
Household_Edu_cost_ratio
Household_Liberal_Arts_and_Entertainment_Expenditure_Ratio
Household_Savings
Household_Smartphone_ownership_quantity
Household_PC_ownership_quantity
Household_Car_ownership_quantity
Household_Tablet_terminal_Ownership_quantity
pop_Household_Ratio_of_elderly_single_person_households

3.5.1.13 疾患

Hypertension_Hospitalization_2014
Hypertension_Outpatient_2014
Diabetes_hospitalization_2014
Diabetes_Outpatient_2014
Total_Num_of_caries_2014
Total_Num_of_periodontal_2014
Bone_density_disorder_2014
Fracture_2014
Tooth_supplement_2014
Alzheimer_dz_2014

3.5.1.14

Meat_2014
Seafood_2014
Milk_2014
Dairy_2014
Egg_2014
Soybean_2014

3.5.1.15 住居施設

Usual_barrier_free_handrails_2013
Usual_barrier_free_no_steps_2013
High_barrier_free_handrails_2013
High_barrier_free_no_steps_2013
High_barrier_free_wheelchairs_pass_Width

3.5.1.16 食消費

Fish_meat_consumption_2014
Fish_meat_consumption_2015
Fish_meat_consumption_2016
Fish_meat_consumption_avg_2014_2016
Confectionery_consumption_2014
Confectionery_consumption_2015
Confectionery_consumption_2016
Confectionery_consumption_avg_2014_2016
Fruits_consumption_2014
Fruits_consumption_2015
Fruits_consumption_2016
Fruit_consumption_avg_2014_2016
  • 男女区別のないデータの変数は98個

3.6 男女区別のあるデータ d_mf,d_m,d_f

{.tabset .tabset-fade .tabset-pills}

3.6.1 d_mf

name <- function(x) {
  !is.na(x)
}
d_mf<-d[, sapply(d[48,], name)]
d_mf %>% DT::datatable()
d_mf %>% dim()
## [1] 94 70
d_m<-d_mf %>% filter(sex=="M")
d_f<-d_mf %>% filter(sex=="F")
# summary(d_mf)

3.6.2 男性データd_m

d_m %>% DT::datatable()

3.6.3 女性データd_f

d_f %>% DT::datatable()

3.7 以下分析対象データd_common, d_m, d_f

4 dataの変数属性を確認

{.tabset .tabset-fade .tabset-pills}

4.1 d_common

d_common %>% colnames() %>% enframe() %>% DT::datatable()

4.2 d_m

d_m %>% colnames() %>% enframe() %>% DT::datatable()

4.3 d_f

d_f %>% colnames() %>% enframe() %>% DT::datatable()

5 寿命(目的変数)の都道府県の順位確認

{.tabset .tabset-fade .tabset-pills}

5.1 男性の平均寿命と健康寿命の差

suppressMessages(library(plotly))

t<-d_m %>% select(pref.J,HLE_2016,LE_2015) %>%
  mutate(pref.J=forcats::fct_reorder(pref.J, LE_2015))
t$pref.J
##  [1] \u5317\u6d77\u9053 \u9752\u68ee       \u5ca9\u624b       \u5bae\u57ce      
##  [5] \u79cb\u7530       \u5c71\u5f62       \u798f\u5cf6       \u8328\u57ce      
##  [9] \u6803\u6728       \u7fa4\u99ac       \u57fc\u7389       \u5343\u8449      
## [13] \u6771\u4eac       \u795e\u5948\u5ddd \u65b0\u6f5f       \u5bcc\u5c71      
## [17] \u77f3\u5ddd       \u798f\u4e95       \u5c71\u68a8       \u9577\u91ce      
## [21] \u5c90\u961c       \u9759\u5ca1       \u611b\u77e5       \u4e09\u91cd      
## [25] \u6ecb\u8cc0       \u4eac\u90fd       \u5927\u962a       \u5175\u5eab      
## [29] \u5948\u826f       \u548c\u6b4c\u5c71 \u9ce5\u53d6       \u5cf6\u6839      
## [33] \u5ca1\u5c71       \u5e83\u5cf6       \u5c71\u53e3       \u5fb3\u5cf6      
## [37] \u9999\u5ddd       \u611b\u5a9b       \u9ad8\u77e5       \u798f\u5ca1      
## [41] \u4f50\u8cc0       \u9577\u5d0e       \u718a\u672c       \u5927\u5206      
## [45] \u5bae\u5d0e       \u9e7f\u5150\u5cf6 \u6c96\u7e04      
## 47 Levels: \u9752\u68ee \u79cb\u7530 \u5ca9\u624b ... \u6ecb\u8cc0
t %>%
  plot_ly() %>%
    add_segments(
  x=~HLE_2016,y=~pref.J ,
  xend=~LE_2015,yend=~pref.J ,
    # x = ~c, y = ~model,
    # xend = ~h, yend = ~model,
    color = I("gray"), showlegend = FALSE
  ) %>%
  add_markers(
    # x = ~c, y = ~model,
  x=~HLE_2016,y=~pref.J ,
    color = I("blue"),
    name = "健康寿命"
  ) %>%
  add_markers(
    # x = ~h, y = ~model,
  x=~LE_2015,y=~pref.J ,
    color = I("red"),
    name  = "平均寿命"
  ) %>%
  layout(
    xaxis = list(
      range=c(60,83),
      title="男性の平均寿命と健康寿命の差"
    )
  )

5.2 女性の平均寿命と健康寿命の差

suppressMessages(library(plotly))

t<-d_f %>% select(pref.J,HLE_2016,LE_2015) %>%
  mutate(pref.J=forcats::fct_reorder(pref.J, LE_2015))
t$pref.J
##  [1] \u5317\u6d77\u9053 \u9752\u68ee       \u5ca9\u624b       \u5bae\u57ce      
##  [5] \u79cb\u7530       \u5c71\u5f62       \u798f\u5cf6       \u8328\u57ce      
##  [9] \u6803\u6728       \u7fa4\u99ac       \u57fc\u7389       \u5343\u8449      
## [13] \u6771\u4eac       \u795e\u5948\u5ddd \u65b0\u6f5f       \u5bcc\u5c71      
## [17] \u77f3\u5ddd       \u798f\u4e95       \u5c71\u68a8       \u9577\u91ce      
## [21] \u5c90\u961c       \u9759\u5ca1       \u611b\u77e5       \u4e09\u91cd      
## [25] \u6ecb\u8cc0       \u4eac\u90fd       \u5927\u962a       \u5175\u5eab      
## [29] \u5948\u826f       \u548c\u6b4c\u5c71 \u9ce5\u53d6       \u5cf6\u6839      
## [33] \u5ca1\u5c71       \u5e83\u5cf6       \u5c71\u53e3       \u5fb3\u5cf6      
## [37] \u9999\u5ddd       \u611b\u5a9b       \u9ad8\u77e5       \u798f\u5ca1      
## [41] \u4f50\u8cc0       \u9577\u5d0e       \u718a\u672c       \u5927\u5206      
## [45] \u5bae\u5d0e       \u9e7f\u5150\u5cf6 \u6c96\u7e04      
## 47 Levels: \u9752\u68ee \u6803\u6728 \u8328\u57ce \u79cb\u7530 ... \u9577\u91ce
t %>%
  # plot_ly(width = 600, height = 1000) %>%
  plot_ly() %>%
    add_segments(
  x=~HLE_2016,y=~pref.J ,
  xend=~LE_2015,yend=~pref.J ,
    # x = ~c, y = ~model,
    # xend = ~h, yend = ~model,
    color = I("gray"), showlegend = FALSE
  ) %>%
  add_markers(
    # x = ~c, y = ~model,
  x=~HLE_2016,y=~pref.J ,
    color = I("blue"),
    name = "健康寿命"
  ) %>%
  add_markers(
    # x = ~h, y = ~model,
  x=~LE_2015,y=~pref.J ,
    color = I("red"),
    name  = "平均寿命"
  ) %>%
  layout(
    xaxis = list(
      range=c(60,88),
      title="女性の平均寿命と健康寿命の差"
    )
  )

5.3 男性の平均寿命

d_m %>% select(LE_2015) %>% bind_cols(Wakayama::pref["pref.J"]) %>% dplyr::arrange(LE_2015) %>% mutate(rank=row_number()) %>%DT::datatable()
jc.dotplot <- function(x) {
  # x is HLE or LE in c() vector
  names(x)<-Wakayama::pref$pref.J
  x<-x[order(x)]

}

colfunc <- colorRampPalette(c("gray90","black"))

par(family= "HiraKakuProN-W3")

dotchart(
  main = "平均寿命(2015年, 男性)",
  jc.dotplot(d_m$LE_2015),
  cex = 0.7,
  lcolor = "gray90",
  pch = 19,
  col = colfunc(47),
  pt.cex = 1.5
)

abline(v = 79, lty = 2)

5.4 男性の健康寿命

d_m %>% select(HLE_2016) %>% bind_cols(Wakayama::pref["pref.J"]) %>% dplyr::arrange(HLE_2016) %>% mutate(rank=row_number()) %>%DT::datatable()
jc.dotplot <- function(x) {
  # x is HLE or LE in c() vector
  names(x)<-Wakayama::pref$pref.J
  x<-x[order(x)]

}

colfunc <- colorRampPalette(c("gray90","black"))

par(family= "HiraKakuProN-W3")

dotchart(
  main = "健康寿命(2016年, 男性)",
  jc.dotplot(d_m$HLE_2016),
  cex = 0.7,
  lcolor = "gray90",
  pch = 19,
  col = colfunc(47),
  pt.cex = 1.5
)

abline(v = 79, lty = 2)

5.5 女性の平均寿命

d_f %>% select(LE_2015) %>% bind_cols(Wakayama::pref["pref.J"]) %>% dplyr::arrange(LE_2015) %>% mutate(rank=row_number()) %>%DT::datatable()
jc.dotplot <- function(x) {
  # x is HLE or LE in c() vector
  names(x)<-Wakayama::pref$pref.J
  x<-x[order(x)]

}

colfunc <- colorRampPalette(c("gray90","black"))

par(family= "HiraKakuProN-W3")

dotchart(
  main = "平均寿命(2015年, 女性)",
  jc.dotplot(d_f$LE_2015),
  cex = 0.7,
  lcolor = "gray90",
  pch = 19,
  col = colfunc(47),
  pt.cex = 1.5
)

abline(v = 79, lty = 2)

5.6 女性の健康寿命

d_f %>% select(HLE_2016) %>% bind_cols(Wakayama::pref["pref.J"]) %>% dplyr::arrange(HLE_2016) %>% mutate(rank=row_number()) %>%DT::datatable()
jc.dotplot <- function(x) {
  # x is HLE or LE in c() vector
  names(x)<-Wakayama::pref$pref.J
  x<-x[order(x)]

}

colfunc <- colorRampPalette(c("gray90","black"))

par(family= "HiraKakuProN-W3")

dotchart(
  main = "平均寿命(2015年, 女性)",
  jc.dotplot(d_f$HLE_2016),
  cex = 0.7,
  lcolor = "gray90",
  pch = 19,
  col = colfunc(47),
  pt.cex = 1.5
)

abline(v = 79, lty = 2)

6 d_common dataの説明変数の順位確認及び正規性検定

{.tabset .tabset-fade .tabset-pills}

6.1 d_common dataの和歌山県の順位確認

d_common %>% dplyr::select_if(is.numeric) %>% sapply(rank) %>%
  tbl_df() %>%dplyr::filter(key==30) %>% t() %>% as.data.frame() %>% DT::datatable()

6.2 d_common dataの連続データ分布確認1

# purrr::map(iris[,-5], ~hist(.x))

suppressMessages(library(plotly))


d_common_standarize<-
d_common %>% dplyr::select_if(is.numeric) %>%
  select(-key, -pref.id) %>% scale() %>% tbl_df()

d_common_standarize<-bind_cols(d_common["pref.id"], d_common_standarize)


d_common_standarize_long<-

  d_common_standarize %>% dplyr::select_if(is.numeric) %>%
  pivot_longer(
       cols = -pref.id,
    names_to ="var_name" ,
    values_to ="value"
  )


plot_ly(
  x =d_common_standarize_long$value,
  type = "histogram",
  name = "Histogram",
  frame =  ~d_common_standarize_long$var_name
)

6.3 d_common dataの連続データ分布確認2

name1 <- function(temp) {
  temp %>% select(value) %>% .[[1]] %>% density() %>%
    broom::tidy()
}

d_common_standarize_long_density<-
d_common_standarize_long %>% group_by(var_name) %>% nest() %>% mutate(dens=map(data, name1)) %>% select(var_name, dens) %>% unnest()
## Warning: `cols` is now required when using unnest().
## Please use `cols = c(dens)`
d_common_standarize_long_density %>% DT::datatable()
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
d_common_standarize_long_density %>%
  plot_ly(x=~x, y=~y) %>%
  # add_lines(frame=~var_name)
  add_lines(frame=~var_name, 
            text = ~paste(var_name),
            hoverinfo = "text"
            ) %>% 
  layout(title="分布")
# add_lines(color=~status) %>%
  # storms %>%

6.4 d_common dataの連続データの正規性検定

  • shapiro 検定
jc_shapiro <- function(x) {
  shapiro.test(x) %>% broom::tidy() %>% select(p.value) %>% round(3)
}


d_common_normality_test<-d_common %>% dplyr::select_if(is.numeric) %>%
  select(-key, -pref.id) %>%
  # scale() %>%
  tbl_df() %>% purrr::map_df(jc_shapiro)

rownames(d_common_normality_test)<-d_common %>% dplyr::select_if(is.numeric) %>%
  select(-key, -pref.id) %>% colnames()

DT::datatable(d_common_normality_test)

6.5 d_common data連続データの正規性を満たさない変数

  • 要対数変換:47個の変数
d_common_normality_test %>% filter(p.value<0.05) %>% DT::datatable()

7 d_common data 和歌山,青森,滋賀,長野の説明変数の様子

library(plotly)
temp<-left_join(d_common_standarize_long, d_common,
          by = c("pref.id"="key")) %>% select(pref.J, 2,3)


temp %>% filter(pref.J=="和歌山"|
                pref.J=="青森"|
                pref.J=="滋賀"|
                pref.J=="長野") %>%
plot_ly(y =  ~ pref.J,
        x =  ~ value,
        color =  ~ pref.J) %>%
  add_bars(frame=~var_name, 
           hoverinfo="text", 
           text=~paste(var_name))

8 出典整理

{.tabset .tabset-fade .tabset-pills}

8.1 出典リンク先の詳細