Visualizing a Standard Operating-Cost Model for Public Sports Facilities in Gyeonggi-do

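## Problem Statement

Gyeonggi-do has close to 100 public sports facilities.

Many more are planned for construction.

Once built, a public sports facility operates for at least 30 years.

Because these facilities exist to improve residents' health and quality of life, their public purpose takes priority.

User fees therefore cannot place an excessive financial burden on residents.

This means that local governments must cover a substantial share of operating costs from their own budgets.

It follows that the operating cost of a facility of a given size should be analyzed in advance, starting at the planning stage.

Unfortunately, there is currently no systematic information on how much it costs to operate a public sports facility in Gyeonggi-do after completion.

As a result, operating-cost estimates at the planning stage are usually made by rule of thumb.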

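## Research Question

This study therefore proposes standard operating costs for Gyeonggi-do public sports facilities that planners can consult at the planning stage.

The goal is to produce systematic data that shows, before construction, how much financial support a facility will need after completion.

The sections below use visualization throughout to show how the standard operating-cost model was derived.

Data Preparation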

## To be uploaded
library(readxl)
library(dplyr)

library(ggplot2)

ath0 <- read_xlsx("D:\\_work\\2020\\12\\ath\\ath0_final.xlsx", sheet = "total1")

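# Note: despite the ln_ prefix, these are base-10 logs (the axis labels below assume log10).
ath0$ln_area <- log10(ath0$area)
ath0$ln_total_cost <- log10(ath0$total_cost)
# Legend labels; the numeric prefixes fix the sort order of the groups.
ath0$pool_yn_l <- ifelse(ath0$pool_yn == 1, "1_Pool", "2_No pool")
# "대형" is the value that marks large facilities in area_type.
ath0$pool_yn_l2 <- ifelse(ath0$pool_yn == 0 & ath0$area_type == "대형", "3_Large facility,\n     no pool", ath0$pool_yn_l)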

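Relationship Between Size and Operating Cost

How are operating costs distributed across Gyeonggi-do's public sports facilities?

How large is the gap between facilities with low and high annual operating costs?

And how much does annual operating cost vary with a facility's size and with its functions (swimming pool, fitness room, and so on)?

To find out, we plot total floor area against total annual operating cost as a scatter plot.

Because both variables span several orders of magnitude, both are log-transformed (base 10) before plotting.

The plot shows two things.

First, Gyeonggi-do's public sports facilities vary widely in both size and total annual operating cost. Second, total annual operating cost generally increases with size (total floor area).

The key point is that total annual operating cost differs substantially by size and by factors other than size, so separate standard operating-cost models are needed for appropriately defined groups of facilities.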

ggplot(ath0, aes(x = ln_area, y = ln_total_cost)) + geom_point(size = 3, alpha = .5)
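
As a rough illustration of what a log-log scatter plot conveys, the sketch below fits a straight line to log10-transformed values. The five facilities and the exponent 1.2 are made-up numbers, not study data; on real data the slope would be read as the approximate percentage change in cost per one percent change in floor area.

```r
# Hypothetical data: cost grows as area^1.2 (not taken from the study).
area <- c(250, 500, 1000, 4000, 10000)   # total floor area, m^2
total_cost <- 50 * area^1.2              # annual cost, thousand KRW

# Same base-10 transform as ln_area and ln_total_cost above.
fit <- lm(log10(total_cost) ~ log10(area))

# Slope of the fitted line on the log-log scale.
slope <- unname(coef(fit)[2])
print(round(slope, 3))
```

Because the toy data follow an exact power law, the fitted slope simply recovers the exponent used to generate them.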

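Classification by Function

Prior studies agree on one point.

Facilities with a swimming pool spend more on operations than facilities without one.

A pool requires not only water supply and water heating but also a large number of swimming instructors.

These costs are substantial: according to prior studies, operating cost per unit area differs by a factor of two to three depending on whether a facility has a pool.

We therefore add pool status to the scatter plot above.

The facilities split clearly into two groups, one with high and one with low total annual operating cost (y-axis), according to whether they have a pool.

Two implications follow.

First, from an operating-cost perspective, the key functional distinction among sports facilities is whether they operate a swimming pool. Second, in addition to the functional classification, facilities need to be grouped by size at an appropriate threshold.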

ggplot(ath0, aes(x = ln_area, y = ln_total_cost, shape = pool_yn_l)) + geom_point(size = 3, alpha = .5)

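Classification by Size

The previous section proposed a functional criterion, the presence of a swimming pool, for grouping facilities.

This section adds a size criterion.

Where should the line between the two size groups be drawn?

There are several possible approaches; here we apply a criterion from prior work.

The largest facility in the Ministry of Culture, Sports and Tourism's standard model for municipal-level public sports facilities has a total floor area of 3,760 m^2.

Classifying Gyeonggi-do's public sports facilities by this threshold fits the data fairly well.

Applying this threshold also separates large facilities without a pool into a clearly distinct group.

The basic classification criteria for Gyeonggi-do sports facilities are therefore pool / no pool and large / small total floor area.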

ggplot(ath0, aes(x = ln_area, y = ln_total_cost, shape = pool_yn_l2, color = pool_yn_l2)) + geom_point(size = 3) +
  scale_color_manual(values = c("grey20", "grey50", "grey80")) +
  # Dashed line at the 3,760 m^2 size threshold: log10(3760) = 3.575188
  geom_vline(xintercept = log10(3760), color = "grey50", size = 1, linetype = 2) +
  labs(x = "Total floor area (m^2, log10)", y = "Total annual operating cost (thousand KRW, log10)") +
  scale_x_continuous(limits = c(2, 4.2), breaks = c(2, 3, 4), labels = c("log(10^2)", "log(10^3)", "log(10^4)")) +
  scale_y_continuous(limits = c(4, 7), breaks = c(4, 5, 6, 7), 
                     labels = c("log(10^4)", "log(10^5)", "log(10^6)", "log(10^7)"))
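
The position of the dashed line and the resulting pool-by-size grouping can be sketched as follows. The `facilities` data frame is hypothetical, and treating a facility as large only when its area is strictly above 3,760 m^2 is an assumption; the source does not say how a facility exactly at the threshold is handled.

```r
# Size threshold taken from the text: 3,760 m^2 of total floor area.
threshold_m2 <- 3760
threshold_log <- log10(threshold_m2)   # x-position of the dashed line
print(round(threshold_log, 6))

# Hypothetical facilities (not study data).
facilities <- data.frame(
  area    = c(234, 3759, 3760, 7500, 5200),
  pool_yn = c(0, 1, 0, 1, 0)
)

# Assumption: only facilities strictly above the threshold count as large.
facilities$size <- ifelse(facilities$area > threshold_m2, "large", "small")
facilities$pool <- ifelse(facilities$pool_yn == 1, "pool", "no pool")

# Cross the two criteria into one group label per facility.
facilities$group <- paste(facilities$pool, facilities$size, sep = " / ")
print(facilities)
```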

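Classification of Self-Directed Use Facilities

Small facilities without a pool show an interesting pattern.

We usually think of a community sports facility as a place where people exercise under an instructor's guidance.

Yet a surprisingly large number of facilities of this type operate without employing any instructors at all.

To examine this, we take only the small facilities without a pool and plot the number of instructors against size. Facilities with low total annual operating cost appear as light points and those with high cost as dark points.

Among small facilities with low total annual operating cost, many employ no instructors at all, and their total annual operating cost is markedly lower than that of facilities that do employ instructors.

Therefore, even within the small, no-pool type, facilities that employ instructors should be analyzed separately from those that do not.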

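# Random noise column (not used in the plots below); one value per facility.
ath0$error <- rnorm(nrow(ath0), 0, 1)

# Keeps every facility without a pool, large ones included; add `& area_type == "소형"` to keep only small ones.
ath1 <- subset(ath0, pool_yn == 0)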

ggplot(ath1, aes(x = hr_teacher, y = ln_area, color = total_cost)) + geom_point(size = 8, alpha = .7) +
  scale_color_gradient(low = "grey60", high = "black") +
  scale_y_continuous(limits = c(2, 4.5), breaks = c(2, 3, 4), 
                     labels = c("log(10^2)", "log(10^3)", "log(10^4)"))
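
One simple way to put a number on the gap between the two kinds of facility is to compare median cost by instructor status. The sketch below uses six made-up facilities, not study data; the `has_instructor` flag is a hypothetical helper derived from `hr_teacher`.

```r
# Hypothetical small, no-pool facilities (not study data).
facilities <- data.frame(
  hr_teacher = c(0, 0, 0, 5, 8, 11),                        # instructor head count
  total_cost = c(40000, 65000, 90000, 300000, 420000, 510000)  # thousand KRW
)

# Flag facilities that employ at least one instructor.
facilities$has_instructor <- ifelse(facilities$hr_teacher > 0,
                                    "instructors", "no instructors")

# Median total cost within each group.
med <- tapply(facilities$total_cost, facilities$has_instructor, median)
print(med)
```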

Remaining Analysis, All at Once

## Staffing distribution across the four types
head(ath0)

## # A tibble: 6 x 43
##   sigungu name   area 강사유무 small pool_yn area_type type0 type_r lab_cost
##   <chr>   <chr> <dbl>    <dbl> <chr>   <dbl> <chr>     <dbl>  <dbl>    <dbl>
## 1 양평군  문화체육~   234        0 대여        0 소형          1      1    2620.
## 2 이천시  장호원국~   260        0 대여        0 소형          1      1   11332 
## 3 안산시  장화체육~   300        0 대여        0 소형          1      1   71479 
## 4 포천시  이동교육~   345        0 대여        0 소형          1      1   21128 
## 5 여주시  강천농어~   390        0 대여        0 소형          1      1       0 
## 6 여주시  산북농어~   526        0 대여        0 소형          1      1       0 
## # ... with 33 more variables: gen_cost <dbl>, main_cost <dbl>, op_cost <dbl>,
## #   etc_work <dbl>, mult_room <dbl>, multi_gym <dbl>, pool <dbl>,
## #   outdoor_fac <dbl>, spc_gym <dbl>, gx <dbl>, total <dbl>, hr_gx <dbl>,
## #   hr_pool <dbl>, hr_etc <dbl>, hr_teacher <dbl>, hr_admin <dbl>,
## #   hr_facmag <dbl>, hr_mag <dbl>, hr_total <dbl>, cost_gx <dbl>,
## #   cost_pool <dbl>, cost_etc <dbl>, cost_teacher <dbl>, cost_admin <dbl>,
## #   cost_facmag <dbl>, cost_mag <dbl>, cost_lab <dbl>, total_cost <dbl>,
## #   ln_area <dbl>, ln_total_cost <dbl>, pool_yn_l <chr>, pool_yn_l2 <chr>,
## #   error <dbl>

teacher <- ath0 %>%
  group_by(type0) %>%
  summarize(p000 = min(hr_teacher, na.rm = TRUE),
            p010 = quantile(hr_teacher, prob = .1, na.rm = TRUE)[[1]],
            p025 = quantile(hr_teacher, na.rm = TRUE)[[2]],
            p050 = median(hr_teacher, na.rm = TRUE),
            p075 = quantile(hr_teacher, na.rm = TRUE)[[4]],
            p090 = quantile(hr_teacher, prob = .9, na.rm = TRUE)[[1]],
            p100 = max(hr_teacher, na.rm = TRUE))

## `summarise()` ungrouping output (override with `.groups` argument)

teacher

## # A tibble: 4 x 8
##   type0  p000  p010  p025  p050  p075  p090  p100
##   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1     1     0   0     0     0     0.5   6.8    32
## 2     2     4   4.6   5.5   7.5   9.5  10.4    11
## 3     3     6  11.6  16    25    44    57.8    83
## 4     4     2  15.2  18.5  38    51.5  64.3    74
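
The seven percentiles above can also be requested in a single `quantile()` call by passing a vector of probabilities. The sketch below uses a made-up `hr_teacher` vector, not the study data, and names the results `p000` through `p100` to mirror the column names used above.

```r
# Hypothetical instructor head counts for one facility type (not study data).
hr_teacher <- c(0, 0, 0, 0, 2, 5, 8, 12, 20, 32)

# Minimum, 10th, 25th, 50th, 75th, 90th percentile and maximum in one call.
probs <- c(0, .10, .25, .50, .75, .90, 1)
q <- quantile(hr_teacher, probs = probs, na.rm = TRUE, names = FALSE)

# Label each value p000, p010, and so on.
names(q) <- sprintf("p%03d", round(probs * 100))
print(q)
```

With `probs = 0` and `probs = 1`, `quantile()` returns the minimum and maximum, so separate `min()` and `max()` calls are not needed.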

library(reshape2)


teacher_t <- melt(teacher, id.var = "type0")

teacher_t

##    type0 variable value
## 1      1     p000   0.0
## 2      2     p000   4.0
## 3      3     p000   6.0
## 4      4     p000   2.0
## 5      1     p010   0.0
## 6      2     p010   4.6
## 7      3     p010  11.6
## 8      4     p010  15.2
## 9      1     p025   0.0
## 10     2     p025   5.5
## 11     3     p025  16.0
## 12     4     p025  18.5
## 13     1     p050   0.0
## 14     2     p050   7.5
## 15     3     p050  25.0
## 16     4     p050  38.0
## 17     1     p075   0.5
## 18     2     p075   9.5
## 19     3     p075  44.0
## 20     4     p075  51.5
## 21     1     p090   6.8
## 22     2     p090  10.4
## 23     3     p090  57.8
## 24     4     p090  64.3
## 25     1     p100  32.0
## 26     2     p100  11.0
## 27     3     p100  83.0
## 28     4     p100  74.0

teacher_t$x <- as.numeric(substr(teacher_t$variable, 2, 4))
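
The line above turns percentile labels into numeric x-positions by dropping the leading `p`. A minimal standalone check of that step, using the same label format:

```r
# Percentile labels as melt() leaves them in the variable column (a factor).
variable <- factor(c("p000", "p010", "p025", "p050", "p075", "p090", "p100"))

# Characters 2 to 4 hold the digits; substr() coerces the factor to its labels.
x <- as.numeric(substr(variable, 2, 4))
print(x)
```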

ggplot(teacher_t, aes(x = x, y = value, shape = factor(type0), linetype = factor(type0))) + 
  geom_point() + geom_line()

mag <- ath0 %>%
  group_by(type0) %>%
  summarize(p000 = min(hr_mag, na.rm = TRUE),
            p010 = quantile(hr_mag, prob = .1, na.rm = TRUE)[[1]],
            p025 = quantile(hr_mag, na.rm = TRUE)[[2]],
            p050 = median(hr_mag, na.rm = TRUE),
            p075 = quantile(hr_mag, na.rm = TRUE)[[4]],
            p090 = quantile(hr_mag, prob = .9, na.rm = TRUE)[[1]],
            p100 = max(hr_mag, na.rm = TRUE))

## `summarise()` ungrouping output (override with `.groups` argument)

mag

## # A tibble: 4 x 8
##   type0  p000  p010  p025  p050  p075  p090  p100
##   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1     1     0   1       1   1     3.5   8      21
## 2     2     0   2.4     6  10.5  16.2  22.1    26
## 3     3     3   5       8  11    16    26.     43
## 4     4     3  10.3    14  15    18.8  40.2    66

mag_t <- melt(mag, id.var = "type0")

mag_t$x <- as.numeric(substr(mag_t$variable, 2, 4))

mag_t

##    type0 variable value   x
## 1      1     p000  0.00   0
## 2      2     p000  0.00   0
## 3      3     p000  3.00   0
## 4      4     p000  3.00   0
## 5      1     p010  1.00  10
## 6      2     p010  2.40  10
## 7      3     p010  5.00  10
## 8      4     p010 10.30  10
## 9      1     p025  1.00  25
## 10     2     p025  6.00  25
## 11     3     p025  8.00  25
## 12     4     p025 14.00  25
## 13     1     p050  1.00  50
## 14     2     p050 10.50  50
## 15     3     p050 11.00  50
## 16     4     p050 15.00  50
## 17     1     p075  3.50  75
## 18     2     p075 16.25  75
## 19     3     p075 16.00  75
## 20     4     p075 18.75  75
## 21     1     p090  8.00  90
## 22     2     p090 22.10  90
## 23     3     p090 26.00  90
## 24     4     p090 40.20  90
## 25     1     p100 21.00 100
## 26     2     p100 26.00 100
## 27     3     p100 43.00 100
## 28     4     p100 66.00 100

ggplot(mag_t, aes(x = x, y = value, shape = factor(type0), linetype = factor(type0))) + 
  geom_point() + geom_line()

## Floor area vs. operating cost, by whether instructors are employed

ath1 <- subset(ath0, type0 == 1)

ggplot(ath1, aes(x = area, y = total_cost, shape = factor(강사유무), color = factor(강사유무))) + 
  geom_point(size = 3) +
  scale_color_manual(values = c("grey70", "grey40"))

ggplot(ath1, aes(x = hr_total, y = ln_total_cost, shape = factor(강사유무), color = factor(강사유무))) + 
  geom_point(size = 3) +
  scale_color_manual(values = c("grey70", "grey40"))

## Facility counts by type

table(ath0$type_r)

## 
##   1   2   3 3.5   4 
##  26   9  17   4  14

## Large facilities without a pool?

ath2 <- subset(ath0, type0 == 2)

ggplot(ath2, aes(x = area, y = total_cost)) + geom_point()

## Exclude large facilities without a pool

ath3 <- subset(ath0, type_r != 3.5)

ggplot(ath3, aes(x = ln_area, y = ln_total_cost, shape = factor(type_r), color = factor(type_r))) + geom_point()

ggplot(ath3, aes(x = hr_total, y = ln_total_cost, shape = factor(type_r), color = factor(type_r))) + 
  geom_point(size = 3)

ggplot(ath3, aes(x = hr_total, y = ln_area, shape = factor(type_r), color = factor(type_r))) + 
  geom_point(size = 3)

ggplot(ath3, aes(x = ln_area, y = ln_total_cost, shape = factor(type_r), color = factor(type_r))) + 
  geom_point(size = 3)

## Instructors by type?

hr <- ath3 %>%
  group_by(type_r) %>%
  summarize(teacher = mean(hr_teacher),
            mag = mean(hr_mag))

## `summarise()` ungrouping output (override with `.groups` argument)

hr_t <- melt(hr, id.vars = "type_r")

hr_t

##   type_r variable     value
## 1      1  teacher  0.000000
## 2      2  teacher  9.222222
## 3      3  teacher 32.470588
## 4      4  teacher 37.000000
## 5      1      mag  2.423077
## 6      2      mag  5.222222
## 7      3      mag 14.117647
## 8      4      mag 21.000000

ggplot(hr_t, aes(x = type_r, y = value, fill = variable)) + geom_bar(position = "dodge", stat = "identity")

hr1 <- ath3 %>%
  group_by(type_r) %>%
  summarize(hr_gx = mean(hr_gx),
            hr_pool = mean(hr_pool),
            hr_etc = mean(hr_etc))

## `summarise()` ungrouping output (override with `.groups` argument)

hr1

## # A tibble: 4 x 4
##   type_r hr_gx hr_pool hr_etc
##    <dbl> <dbl>   <dbl>  <dbl>
## 1      1  0        0     0   
## 2      2  1.78     0     7.44
## 3      3  1.82    17.1  13.6 
## 4      4  3.14    19.1  14.8

hr1_t <- melt(hr1, id.vars = "type_r")

hr1_t

##    type_r variable     value
## 1       1    hr_gx  0.000000
## 2       2    hr_gx  1.777778
## 3       3    hr_gx  1.823529
## 4       4    hr_gx  3.142857
## 5       1  hr_pool  0.000000
## 6       2  hr_pool  0.000000
## 7       3  hr_pool 17.058824
## 8       4  hr_pool 19.071429
## 9       1   hr_etc  0.000000
## 10      2   hr_etc  7.444444
## 11      3   hr_etc 13.588235
## 12      4   hr_etc 14.785714

ggplot(hr1_t, aes(x = type_r, y = value, fill = variable)) + geom_bar(position = "dodge", stat = "identity")

ggplot(hr1_t, aes(x = type_r, y = value, fill = variable)) + geom_bar(position = "stack", stat = "identity")

hr2 <- ath3 %>%
  group_by(type_r) %>%
  summarise(hr_admin = mean(hr_admin),
            hr_facmag = mean(hr_facmag))

## `summarise()` ungrouping output (override with `.groups` argument)

hr2

## # A tibble: 4 x 3
##   type_r hr_admin hr_facmag
##    <dbl>    <dbl>     <dbl>
## 1      1     1.31      1.12
## 2      2     1.78      3.44
## 3      3     4.41      9.71
## 4      4     4.79     16.2

hr2_t <- melt(hr2, id.vars = "type_r")

hr2_t

##   type_r  variable     value
## 1      1  hr_admin  1.307692
## 2      2  hr_admin  1.777778
## 3      3  hr_admin  4.411765
## 4      4  hr_admin  4.785714
## 5      1 hr_facmag  1.115385
## 6      2 hr_facmag  3.444444
## 7      3 hr_facmag  9.705882
## 8      4 hr_facmag 16.214286

ggplot(hr2_t, aes(x = type_r, y = value, fill = variable)) + geom_bar(position = "stack", stat = "identity")

## Labor costs and labor-related expenses

head(ath3)

## # A tibble: 6 x 43
##   sigungu name   area 강사유무 small pool_yn area_type type0 type_r lab_cost
##   <chr>   <chr> <dbl>    <dbl> <chr>   <dbl> <chr>     <dbl>  <dbl>    <dbl>
## 1 양평군  문화체육~   234        0 대여        0 소형          1      1    2620.
## 2 이천시  장호원국~   260        0 대여        0 소형          1      1   11332 
## 3 안산시  장화체육~   300        0 대여        0 소형          1      1   71479 
## 4 포천시  이동교육~   345        0 대여        0 소형          1      1   21128 
## 5 여주시  강천농어~   390        0 대여        0 소형          1      1       0 
## 6 여주시  산북농어~   526        0 대여        0 소형          1      1       0 
## # ... with 33 more variables: gen_cost <dbl>, main_cost <dbl>, op_cost <dbl>,
## #   etc_work <dbl>, mult_room <dbl>, multi_gym <dbl>, pool <dbl>,
## #   outdoor_fac <dbl>, spc_gym <dbl>, gx <dbl>, total <dbl>, hr_gx <dbl>,
## #   hr_pool <dbl>, hr_etc <dbl>, hr_teacher <dbl>, hr_admin <dbl>,
## #   hr_facmag <dbl>, hr_mag <dbl>, hr_total <dbl>, cost_gx <dbl>,
## #   cost_pool <dbl>, cost_etc <dbl>, cost_teacher <dbl>, cost_admin <dbl>,
## #   cost_facmag <dbl>, cost_mag <dbl>, cost_lab <dbl>, total_cost <dbl>,
## #   ln_area <dbl>, ln_total_cost <dbl>, pool_yn_l <chr>, pool_yn_l2 <chr>,
## #   error <dbl>

hr_cost0 <- ath3 %>%
  group_by(type_r) %>%
  summarize(l_cost1 = mean(cost_lab),
            l_cost2 = mean(lab_cost))

## `summarise()` ungrouping output (override with `.groups` argument)

hr_cost0

## # A tibble: 4 x 3
##   type_r l_cost1 l_cost2
##    <dbl>   <dbl>   <dbl>
## 1      1  59625.  31747.
## 2      2 197526.  55090.
## 3      3 776882. 458028.
## 4      4 960864. 695792.

hr_cost0_t <- melt(hr_cost0, id.vars = "type_r")

hr_cost0_t

##   type_r variable     value
## 1      1  l_cost1  59625.19
## 2      2  l_cost1 197525.56
## 3      3  l_cost1 776881.85
## 4      4  l_cost1 960863.72
## 5      1  l_cost2  31747.47
## 6      2  l_cost2  55090.11
## 7      3  l_cost2 458028.33
## 8      4  l_cost2 695792.13

ggplot(hr_cost0_t, aes(x = type_r, y = value, fill = variable)) + geom_bar(position = "stack", stat = "identity")

hr_cost <- ath3 %>%
  group_by(type_r) %>%
  summarize(n = mean(hr_total),
            l_cost1 = mean(cost_lab),
            l_cost2 = mean(lab_cost))

## `summarise()` ungrouping output (override with `.groups` argument)

## Staffing costs
hr_cost

## # A tibble: 4 x 4
##   type_r     n l_cost1 l_cost2
##    <dbl> <dbl>   <dbl>   <dbl>
## 1      1  2.42  59625.  31747.
## 2      2 14.4  197526.  55090.
## 3      3 46.6  776882. 458028.
## 4      4 58    960864. 695792.

hr_cost$n2 <- round(hr_cost$n, 1)

hr_cost$l_cost_total <- with(hr_cost, l_cost1 + l_cost2)

hr_cost$avr <- with(hr_cost, l_cost_total / n2)

ggplot(hr_cost, aes(x = type_r, y = avr)) + geom_bar(stat = "identity")
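
The per-head figure `avr` is the sum of the two labor cost components divided by the rounded average head count. The sketch below reproduces that arithmetic from the type-level means printed above, so the data frame is a hand-typed copy of those values rather than the original `hr_cost` object.

```r
# Type-level means copied from the printed summaries above.
hr_cost <- data.frame(
  type_r  = c(1, 2, 3, 4),
  n2      = c(2.4, 14.4, 46.6, 58),                        # mean head count, 1 decimal
  l_cost1 = c(59625.19, 197525.56, 776881.85, 960863.72),  # mean of cost_lab
  l_cost2 = c(31747.47, 55090.11, 458028.33, 695792.13)    # mean of lab_cost
)

# Total labor cost per type, then cost per head.
hr_cost$l_cost_total <- hr_cost$l_cost1 + hr_cost$l_cost2
hr_cost$avr <- hr_cost$l_cost_total / hr_cost$n2
print(round(hr_cost$avr))
```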

## Management and operating costs

mag <- ath3 %>%
  group_by(type_r) %>%
  summarize(gen_cost = mean(gen_cost),
            main_cost = mean(main_cost))

## `summarise()` ungrouping output (override with `.groups` argument)

mag

## # A tibble: 4 x 3
##   type_r gen_cost main_cost
##    <dbl>    <dbl>     <dbl>
## 1      1   55548.    14264.
## 2      2  119813.    48569.
## 3      3  466023.   196524.
## 4      4  945368.   385996.

mag_t <- melt(mag, id.vars = "type_r")

mag_t

##   type_r  variable     value
## 1      1  gen_cost  55548.47
## 2      2  gen_cost 119812.76
## 3      3  gen_cost 466022.97
## 4      4  gen_cost 945367.87
## 5      1 main_cost  14264.06
## 6      2 main_cost  48568.70
## 7      3 main_cost 196524.33
## 8      4 main_cost 385995.81

ggplot(mag_t, aes(x = type_r, y = value, fill = variable)) + geom_bar(position = "stack", stat = "identity")

mag2 <- ath3 %>%
  group_by(type_r) %>%
  summarize(area = sum(area),
            gen_cost = sum(gen_cost),
            main_cost = sum(main_cost))

## `summarise()` ungrouping output (override with `.groups` argument)

mag2$gen_per <- with(mag2, gen_cost / area)
mag2$main_per <- with(mag2, main_cost / area)

## Operating and management costs
mag2

## # A tibble: 4 x 6
##   type_r    area  gen_cost main_cost gen_per main_per
##    <dbl>   <dbl>     <dbl>     <dbl>   <dbl>    <dbl>
## 1      1  29979.  1444260.   370866.    48.2     12.4
## 2      2  16021.  1078315.   437118.    67.3     27.3
## 3      3  35212.  7922390.  3340914.   225.      94.9
## 4      4 107665. 13235150.  5403941.   123.      50.2
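
The per-area rates are ratios of sums: total cost across a type divided by total floor area across that type. The sketch below recomputes them from the rounded totals printed above, so the last digits can differ slightly from the original output.

```r
# Type-level totals copied from the table above (rounded as printed).
mag2 <- data.frame(
  type_r    = c(1, 2, 3, 4),
  area      = c(29979, 16021, 35212, 107665),
  gen_cost  = c(1444260, 1078315, 7922390, 13235150),
  main_cost = c(370866, 437118, 3340914, 5403941)
)

# Cost per m^2 of floor area: summed cost divided by summed area.
mag2$gen_per  <- mag2$gen_cost / mag2$area
mag2$main_per <- mag2$main_cost / mag2$area
print(round(mag2[, c("type_r", "gen_per", "main_per")], 1))
```

Dividing summed cost by summed area weights each facility by its floor area, which is not the same as averaging the per-facility ratios.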

mag2_1 <- subset(mag2, select = c("type_r", "gen_per", "main_per"))

mag2_1

## # A tibble: 4 x 3
##   type_r gen_per main_per
##    <dbl>   <dbl>    <dbl>
## 1      1    48.2     12.4
## 2      2    67.3     27.3
## 3      3   225.      94.9
## 4      4   123.      50.2

mag2_1$pool <- ifelse(mag2_1$type_r <= 2, 0, 1)

mag2_1

## # A tibble: 4 x 4
##   type_r gen_per main_per  pool
##    <dbl>   <dbl>    <dbl> <dbl>
## 1      1    48.2     12.4     0
## 2      2    67.3     27.3     0
## 3      3   225.      94.9     1
## 4      4   123.      50.2     1

mag2_1_t <- melt(mag2_1, id.vars = c("type_r", "pool"))

mag2_1_t

##   type_r pool variable     value
## 1      1    0  gen_per  48.17511
## 2      2    0  gen_per  67.30647
## 3      3    1  gen_per 224.98910
## 4      4    1  gen_per 122.92857
## 5      1    0 main_per  12.37068
## 6      2    0 main_per  27.28414
## 7      3    1 main_per  94.87908
## 8      4    1 main_per  50.19201

mag2_1_t$type_p <- ifelse(mag2_1_t$type_r == 1 | mag2_1_t$type_r == 3, 1, 2)

ggplot(mag2_1_t, aes(x = type_p, y = value, fill = variable)) + 
  geom_bar(position = "dodge", stat = "identity") +
  facet_wrap(~pool)

## Standard operating model

std_hr <- ath3 %>%
  group_by(type_r) %>%
  summarize(hr_gx = mean(hr_gx),
            hr_pool = mean(hr_pool),
            hr_etc = mean(hr_etc),
            hr_admin = mean(hr_admin),
            hr_facmag = mean(hr_facmag))

## `summarise()` ungrouping output (override with `.groups` argument)

std_hr

## # A tibble: 4 x 6
##   type_r hr_gx hr_pool hr_etc hr_admin hr_facmag
##    <dbl> <dbl>   <dbl>  <dbl>    <dbl>     <dbl>
## 1      1  0        0     0        1.31      1.12
## 2      2  1.78     0     7.44     1.78      3.44
## 3      3  1.82    17.1  13.6      4.41      9.71
## 4      4  3.14    19.1  14.8      4.79     16.2

std_hr1 <- ceiling(std_hr)

std_hr1$total <- with(std_hr1, hr_gx + hr_pool + hr_etc + hr_admin + hr_facmag)

std_hr1

## # A tibble: 4 x 7
##   type_r hr_gx hr_pool hr_etc hr_admin hr_facmag total
##    <dbl> <dbl>   <dbl>  <dbl>    <dbl>     <dbl> <dbl>
## 1      1     0       0      0        2         2     4
## 2      2     2       0      8        2         4    16
## 3      3     2      18     14        5        10    49
## 4      4     4      20     15        5        17    61
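
Rounding every staffing category up before summing makes the standard head count larger than the sum of the raw means. The sketch below repeats the step on the means printed earlier, applying `ceiling()` only to the staffing columns; the `extra` column is a hypothetical addition that shows how many positions the rounding adds.

```r
# Mean head counts per type, copied from the summaries printed above.
std_hr <- data.frame(
  type_r    = c(1, 2, 3, 4),
  hr_gx     = c(0, 1.777778, 1.823529, 3.142857),
  hr_pool   = c(0, 0, 17.058824, 19.071429),
  hr_etc    = c(0, 7.444444, 13.588235, 14.785714),
  hr_admin  = c(1.307692, 1.777778, 4.411765, 4.785714),
  hr_facmag = c(1.115385, 3.444444, 9.705882, 16.214286)
)
staff_cols <- setdiff(names(std_hr), "type_r")

# Round each staffing category up to a whole person, then total per type.
rounded <- std_hr
rounded[staff_cols] <- lapply(std_hr[staff_cols], ceiling)
rounded$total <- rowSums(rounded[staff_cols])

# Positions added by rounding, relative to the sum of the raw means.
rounded$extra <- rounded$total - rowSums(std_hr[staff_cols])
print(rounded[, c("type_r", "total", "extra")])
```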

std_model <- subset(std_hr1, select = c("type_r", "total"))

std_model

## # A tibble: 4 x 2
##   type_r total
##    <dbl> <dbl>
## 1      1     4
## 2      2    16
## 3      3    49
## 4      4    61

std_model$area <- c(1000, 2000, 2000, 7500)

## Staffing costs
hr_cost

## # A tibble: 4 x 7
##   type_r     n l_cost1 l_cost2    n2 l_cost_total    avr
##    <dbl> <dbl>   <dbl>   <dbl> <dbl>        <dbl>  <dbl>
## 1      1  2.42  59625.  31747.   2.4       91373. 38072.
## 2      2 14.4  197526.  55090.  14.4      252616. 17543.
## 3      3 46.6  776882. 458028.  46.6     1234910. 26500.
## 4      4 58    960864. 695792.  58       1656656. 28563.

std_model$l_cost <- hr_cost$avr
std_model$l_cost_total <- with(std_model, total * l_cost)


## Operating and management costs
mag2

## # A tibble: 4 x 6
##   type_r    area  gen_cost main_cost gen_per main_per
##    <dbl>   <dbl>     <dbl>     <dbl>   <dbl>    <dbl>
## 1      1  29979.  1444260.   370866.    48.2     12.4
## 2      2  16021.  1078315.   437118.    67.3     27.3
## 3      3  35212.  7922390.  3340914.   225.      94.9
## 4      4 107665. 13235150.  5403941.   123.      50.2

std_model$gen_per <- round(mag2$gen_per, 1)
std_model$main_per <- round(mag2$main_per, 1)

std_model

## # A tibble: 4 x 7
##   type_r total  area l_cost l_cost_total gen_per main_per
##    <dbl> <dbl> <dbl>  <dbl>        <dbl>   <dbl>    <dbl>
## 1      1     4  1000 38072.      152288.    48.2     12.4
## 2      2    16  2000 17543.      280684.    67.3     27.3
## 3      3    49  2000 26500.     1298511.   225       94.9
## 4      4    61  7500 28563.     1742345.   123.      50.2

std_model$gen_cost <- with(std_model, area * gen_per)
std_model$main_cost <- with(std_model, area * main_per)

std_model$op_main_total <- with(std_model, gen_cost + main_cost)

std_model$total_cost <- with(std_model, op_main_total + l_cost_total)

std_model

## # A tibble: 4 x 11
##   type_r total  area l_cost l_cost_total gen_per main_per gen_cost main_cost
##    <dbl> <dbl> <dbl>  <dbl>        <dbl>   <dbl>    <dbl>    <dbl>     <dbl>
## 1      1     4  1000 38072.      152288.    48.2     12.4    48200     12400
## 2      2    16  2000 17543.      280684.    67.3     27.3   134600     54600
## 3      3    49  2000 26500.     1298511.   225       94.9   450000    189800
## 4      4    61  7500 28563.     1742345.   123.      50.2   921750    376500
## # ... with 2 more variables: op_main_total <dbl>, total_cost <dbl>

std_model_t <- melt(std_model, id.vars = "type_r", measure.vars = c("l_cost_total", "op_main_total"))

std_model_t

##   type_r      variable     value
## 1      1  l_cost_total  152287.8
## 2      2  l_cost_total  280684.1
## 3      3  l_cost_total 1298510.7
## 4      4  l_cost_total 1742344.9
## 5      1 op_main_total   60600.0
## 6      2 op_main_total  189200.0
## 7      3 op_main_total  639800.0
## 8      4 op_main_total 1298250.0

ggplot(std_model_t, aes(x = type_r, y = value, fill = variable)) + geom_bar(stat = "identity")

## Breaking down each model

std_model$hr_gx <- std_hr1$hr_gx
std_model$hr_pool <- std_hr1$hr_pool
std_model$hr_etc <- std_hr1$hr_etc
std_model$hr_admin <- std_hr1$hr_admin
std_model$hr_facmag <- std_hr1$hr_facmag

std_model$hr_teacher <- with(std_model, hr_gx + hr_pool + hr_etc)
std_model$hr_mag <- with(std_model, hr_admin + hr_facmag)

std_model_final <- subset(std_model, select = c("type_r", "area", "hr_teacher", "hr_mag", "total", 
                                                "l_cost_total", "op_main_total", "total_cost"))
std_model_final

## # A tibble: 4 x 8
##   type_r  area hr_teacher hr_mag total l_cost_total op_main_total total_cost
##    <dbl> <dbl>      <dbl>  <dbl> <dbl>        <dbl>         <dbl>      <dbl>
## 1      1  1000          0      4     4      152288.         60600    212888.
## 2      2  2000         10      6    16      280684.        189200    469884.
## 3      3  2000         34     15    49     1298511.        639800   1938311.
## 4      4  7500         39     22    61     1742345.       1298250   3040595.
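
The final table combines two pieces: head count times cost per head, plus floor area times the two per-area rates. The sketch below wraps that arithmetic in a small function. `standard_cost()` is a hypothetical helper, not part of the original script, and the inputs are copied from the tables above (amounts presumably in thousand KRW, as in the axis label used earlier).

```r
# Hypothetical helper: annual cost of a standard facility.
standard_cost <- function(staff, cost_per_head, area, gen_per, main_per) {
  labor  <- staff * cost_per_head           # staffing cost
  upkeep <- area * (gen_per + main_per)     # area-based operating cost
  data.frame(l_cost_total = labor, op_main_total = upkeep,
             total_cost = labor + upkeep)
}

# Inputs for the four types, copied from the tables above.
cost_per_head <- c(91372.66, 252615.67, 1234910.18, 1656655.85) /
  c(2.4, 14.4, 46.6, 58)
model <- standard_cost(
  staff         = c(4, 16, 49, 61),
  cost_per_head = cost_per_head,
  area          = c(1000, 2000, 2000, 7500),
  gen_per       = c(48.2, 67.3, 225.0, 122.9),
  main_per      = c(12.4, 27.3, 94.9, 50.2)
)
print(round(model))
```

Keeping the staffing and area-based parts as separate columns makes it easy to see which one drives the total for each type.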