Exercise 1: Swimming Pools

setwd("C:/Users/Angelica Elena/Desktop/Estadistica Computacional/dataset")
df_Angie <- read.csv("swimming_pools.csv", stringsAsFactors = TRUE)
str(df_Angie)

## 'data.frame':    20 obs. of  4 variables:
##  $ Name     : Factor w/ 20 levels "Acacia Ridge Leisure Centre",..: 1 2 3 4 5 6 19 7 8 9 ...
##  $ Address  : Factor w/ 20 levels "1 Fairlead Crescent, Manly",..: 5 20 18 10 9 11 6 15 12 17 ...
##  $ Latitude : num  -27.6 -27.6 -27.6 -27.5 -27.4 ...
##  $ Longitude: num  153 153 153 153 153 ...

summary(df_Angie)

##                           Name                                     Address  
##  Acacia Ridge Leisure Centre: 1   1 Fairlead Crescent, Manly           : 1  
##  Bellbowrie Pool            : 1   100 Edmonstone Street, South Brisbane: 1  
##  Carole Park                : 1   11 Yallambee Road, Jindalee          : 1  
##  Centenary Pool (inner City): 1   131 Caxton Street, Paddington        : 1  
##  Chermside Pool             : 1   1391 Beaudesert Road, Acacia Ridge   : 1  
##  Colmslie Pool (Morningside): 1   14 Torrington Street, Springhill     : 1  
##  (Other)                    :14   (Other)                              :14  
##     Latitude        Longitude    
##  Min.   :-27.61   Min.   :152.9  
##  1st Qu.:-27.55   1st Qu.:153.0  
##  Median :-27.49   Median :153.0  
##  Mean   :-27.49   Mean   :153.0  
##  3rd Qu.:-27.45   3rd Qu.:153.1  
##  Max.   :-27.31   Max.   :153.2  
##

print(df_Angie)

##                                         Name
## 1                Acacia Ridge Leisure Centre
## 2                            Bellbowrie Pool
## 3                                Carole Park
## 4                Centenary Pool (inner City)
## 5                             Chermside Pool
## 6                Colmslie Pool (Morningside)
## 7             Spring Hill Baths (inner City)
## 8                 Dunlop Park Pool (Corinda)
## 9                      Fortitude Valley Pool
## 10 Hibiscus Sports Complex (upper MtGravatt)
## 11                 Ithaca Pool ( Paddington)
## 12                             Jindalee Pool
## 13                                Manly Pool
## 14            Mt Gravatt East Aquatic Centre
## 15       Musgrave Park Pool (South Brisbane)
## 16                            Newmarket Pool
## 17                              Runcorn Pool
## 18                             Sandgate Pool
## 19      Langlands Parks Pool (Stones Corner)
## 20                         Yeronga Park Pool
##                                        Address  Latitude Longitude
## 1           1391 Beaudesert Road, Acacia Ridge -27.58616  153.0264
## 2                 Sugarwood Street, Bellbowrie -27.56547  152.8911
## 3   Cnr Boundary Road and Waterford Road Wacol -27.60744  152.9315
## 4             400 Gregory Terrace, Spring Hill -27.45537  153.0251
## 5                 375 Hamilton Road, Chermside -27.38583  153.0351
## 6                 400 Lytton Road, Morningside -27.45516  153.0789
## 7             14 Torrington Street, Springhill -27.45960  153.0215
## 8                      794 Oxley Road, Corinda -27.54652  152.9806
## 9         432 Wickham Street, Fortitude Valley -27.45390  153.0368
## 10         90 Klumpp Road, Upper Mount Gravatt -27.55183  153.0735
## 11               131 Caxton Street, Paddington -27.46226  153.0103
## 12                 11 Yallambee Road, Jindalee -27.53236  152.9427
## 13                  1 Fairlead Crescent, Manly -27.45228  153.1874
## 14 Cnr wecker Road and Newnham Road, Mansfield -27.53214  153.0943
## 15       100 Edmonstone Street, South Brisbane -27.47978  153.0168
## 16                71 Alderson Stret, Newmarket -27.42968  153.0062
## 17                   37 Bonemill Road, Runcorn -27.59156  153.0764
## 18               231 Flinders Parade, Sandgate -27.31196  153.0691
## 19             5 Panitya Street, Stones Corner -27.49769  153.0487
## 20                     81 School Road, Yeronga -27.52053  153.0185

The swimming_pools.csv dataset lists 20 swimming pools with their name, address, latitude, and longitude. The output of str() and summary() shows the variable types and the coordinate ranges (latitude from -27.61 to -27.31, longitude from 152.9 to 153.2), which is a quick way to spot outliers or data-entry errors before further analysis.
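Every pool name and address is unique, so storing them as factors yields 20 levels for 20 rows and adds little. The following is a minimal sketch of reading the identifiers as character and checking the coordinate ranges directly. It assumes three rows copied from the printed table as inline text; `csv_text`, `pools`, `lat_range`, and `lon_range` are illustrative names, not part of the original script.

```r
# Three rows copied from the printed data; csv_text stands in for swimming_pools.csv.
csv_text <- 'Name,Address,Latitude,Longitude
"Bellbowrie Pool","Sugarwood Street, Bellbowrie",-27.56547,152.8911
"Manly Pool","1 Fairlead Crescent, Manly",-27.45228,153.1874
"Sandgate Pool","231 Flinders Parade, Sandgate",-27.31196,153.0691'

# Keep identifiers as character vectors rather than factors.
pools <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# range() returns the same Min./Max. pair that summary() reports per column.
lat_range <- range(pools$Latitude)
lon_range <- range(pools$Longitude)
```

With the real file, the same call would only need `"swimming_pools.csv"` in place of `text = csv_text`.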

##Exercise 2. Hotdogs -------------------------------------------------------------
# Import hotdogs.txt
setwd("C:/Users/Angelica Elena/Desktop/Estadistica Computacional/dataset")
df_hotdogs_txt=read.delim("hotdogs.txt",sep="",stringsAsFactors = FALSE)
str(df_hotdogs_txt)

## 'data.frame':    53 obs. of  3 variables:
##  $ Beef: chr  "Beef" "Beef" "Beef" "Beef" ...
##  $ X186: int  181 176 149 184 190 158 139 175 148 152 ...
##  $ X495: int  477 425 322 482 587 370 322 479 375 330 ...

summary(df_hotdogs_txt)

##      Beef                X186            X495      
##  Length:53          Min.   : 86.0   Min.   :144.0  
##  Class :character   1st Qu.:132.0   1st Qu.:360.0  
##  Mode  :character   Median :144.0   Median :405.0  
##                     Mean   :144.7   Mean   :423.5  
##                     3rd Qu.:172.0   3rd Qu.:506.0  
##                     Max.   :195.0   Max.   :645.0

print(df_hotdogs_txt)

##       Beef X186 X495
## 1     Beef  181  477
## 2     Beef  176  425
## 3     Beef  149  322
## 4     Beef  184  482
## 5     Beef  190  587
## 6     Beef  158  370
## 7     Beef  139  322
## 8     Beef  175  479
## 9     Beef  148  375
## 10    Beef  152  330
## 11    Beef  111  300
## 12    Beef  141  386
## 13    Beef  153  401
## 14    Beef  190  645
## 15    Beef  157  440
## 16    Beef  131  317
## 17    Beef  149  319
## 18    Beef  135  298
## 19    Beef  132  253
## 20    Meat  173  458
## 21    Meat  191  506
## 22    Meat  182  473
## 23    Meat  190  545
## 24    Meat  172  496
## 25    Meat  147  360
## 26    Meat  146  387
## 27    Meat  139  386
## 28    Meat  175  507
## 29    Meat  136  393
## 30    Meat  179  405
## 31    Meat  153  372
## 32    Meat  107  144
## 33    Meat  195  511
## 34    Meat  135  405
## 35    Meat  140  428
## 36    Meat  138  339
## 37 Poultry  129  430
## 38 Poultry  132  375
## 39 Poultry  102  396
## 40 Poultry  106  383
## 41 Poultry   94  387
## 42 Poultry  102  542
## 43 Poultry   87  359
## 44 Poultry   99  357
## 45 Poultry  107  528
## 46 Poultry  113  513
## 47 Poultry  135  426
## 48 Poultry  142  513
## 49 Poultry   86  358
## 50 Poultry  143  581
## 51 Poultry  152  588
## 52 Poultry  146  522
## 53 Poultry  144  545

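The dataset describes hot dogs of three types (Beef, Meat, Poultry) together with their calories and sodium content, which supports nutritional comparisons between types. Because read.delim() assumes a header row by default and the file has none, the first record (Beef, 186, 495) was taken as the column names. That is why the columns are called Beef, X186, and X495 and only 53 rows are reported.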
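A possible way to keep the first record is to declare that the file has no header and supply the column names at import time. The sketch below assumes the first three records of the file as inline text; the first line is recovered from the auto-generated column names shown above, and `hotdog_text` and `hotdogs` are illustrative names.

```r
# First three records; "Beef 186 495" is recovered from the column names
# that R generated above (Beef, X186, X495).
hotdog_text <- "Beef 186 495
Beef 181 477
Beef 176 425"

hotdogs <- read.table(
  text = hotdog_text,
  header = FALSE,                               # the file has no header row
  col.names = c("type", "calories", "sodium"),  # names used in Exercise 3
  stringsAsFactors = FALSE
)
```

Applied to the real file (replacing `text = hotdog_text` with `"hotdogs.txt"`), this should return 54 rows instead of 53, since the first record is no longer consumed as a header.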

setwd("C:/Users/Angelica Elena/Desktop/Estadistica Computacional/dataset")
##Exercise 3. Hotdogs with instructions-------------------------------------------------
# Import the hotdogs.txt file with read.table
df_Hotdogs_TABLE=read.table("hotdogs.txt",header=TRUE,sep="",stringsAsFactors = TRUE)
str(df_Hotdogs_TABLE)

## 'data.frame':    53 obs. of  3 variables:
##  $ Beef: Factor w/ 3 levels "Beef","Meat",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ X186: int  181 176 149 184 190 158 139 175 148 152 ...
##  $ X495: int  477 425 322 482 587 370 322 479 375 330 ...

# Name the columns (type, calories, sodium) using the setNames function:
df_Hotdogs_TABLE= setNames(df_Hotdogs_TABLE,c("type", "calories", "sodium"))
summary(df_Hotdogs_TABLE)

##       type       calories         sodium     
##  Beef   :19   Min.   : 86.0   Min.   :144.0  
##  Meat   :17   1st Qu.:132.0   1st Qu.:360.0  
##  Poultry:17   Median :144.0   Median :405.0  
##               Mean   :144.7   Mean   :423.5  
##               3rd Qu.:172.0   3rd Qu.:506.0  
##               Max.   :195.0   Max.   :645.0

# Show the first 7 rows of hotdogs
df_Hotdogs_TABLE[1:7, c("type", "calories","sodium")]

##   type calories sodium
## 1 Beef      181    477
## 2 Beef      176    425
## 3 Beef      149    322
## 4 Beef      184    482
## 5 Beef      190    587
## 6 Beef      158    370
## 7 Beef      139    322

# Select the hot dog with the most calories: Tom

max_calories= -Inf  
max_index=1  
for (i in 1:nrow(df_Hotdogs_TABLE)) {
  if (df_Hotdogs_TABLE[i, "calories"] > max_calories) {
    max_calories=df_Hotdogs_TABLE[i, "calories"]
    max_index =i  
  }
}
Tom=df_Hotdogs_TABLE[max_index, c("type", "calories","sodium")] # Extract type, calories, and sodium
print(Tom)

##    type calories sodium
## 33 Meat      195    511

print(paste("Levels:", paste(levels(df_Hotdogs_TABLE$type), collapse = " ")))

## [1] "Levels: Beef Meat Poultry"

# Select the hot dog with the fewest calories: Lili
min_calories= Inf  
min_index=1  
for (i in 1:nrow(df_Hotdogs_TABLE)) {
  if (df_Hotdogs_TABLE[i, "calories"] < min_calories) {
    min_calories=df_Hotdogs_TABLE[i, "calories"]
    min_index =i  
  }
}

Lili=df_Hotdogs_TABLE[min_index, c("type", "calories","sodium" )] # Extract type, calories, and sodium
print(Lili)

##       type calories sodium
## 49 Poultry       86    358

print(paste("Levels:", paste(levels(df_Hotdogs_TABLE$type), collapse = " ")))

## [1] "Levels: Beef Meat Poultry"

The columns were renamed to make them readable, and the first 7 rows were displayed, which makes the nutritional information easier to interpret. The hot dogs with the most and the fewest calories were also identified, which allows products to be classified by their impact on the diet.

The analysis identifies the calorie extremes in the dataset:

Tom: the most caloric hot dog (Meat, 195 calories, 511 sodium), relevant for studies of high-calorie consumption.

Lili: the least caloric hot dog (Poultry, 86 calories, 358 sodium), relevant when looking for healthier options.

The levels of type (Beef, Meat, Poultry) categorize the hot dogs and make analysis by type straightforward.
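The two loops can also be written with base R's `which.max()` and `which.min()`, which return the position of the first maximum or minimum. This is a minimal sketch on five rows copied from the printed table; the object names `hotdogs`, `tom`, and `lili` are illustrative.

```r
# Five rows copied from the printed hotdogs table (rows 5, 33, 32, 49, 51).
hotdogs <- data.frame(
  type     = c("Beef", "Meat", "Meat", "Poultry", "Poultry"),
  calories = c(190, 195, 107, 86, 152),
  sodium   = c(587, 511, 144, 358, 588)
)

# which.max()/which.min() return the index of the first extreme value,
# which matches the strict > and < comparisons used in the loops.
tom  <- hotdogs[which.max(hotdogs$calories), ]
lili <- hotdogs[which.min(hotdogs$calories), ]
```

Both versions pick the first row in case of ties, so they should select the same records on the full data.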

##Exercise 4. Urban Pop --------------------------------------------------------------
# Get the names of all the sheets in urbanpop.xlsx
library(readxl)
setwd("C:/Users/Angelica Elena/Desktop/Estadistica Computacional/dataset")
sheets=excel_sheets("urbanpop.xlsx")
# Load all the sheets, one by one, and put them in a list: pop_list
df_1960_1966= read_excel("urbanpop.xlsx",sheet = "1960-1966")
df_1967_1974= read_excel("urbanpop.xlsx",sheet = "1967-1974")
df_1975_2011= read_excel("urbanpop.xlsx",sheet = "1975-2011")
pop_list = list(
  "1960-1966" = df_1960_1966,
  "1967-1974" = df_1967_1974,
  "1975-2011" = df_1975_2011
)
# Do the same with the lapply function
pop_list=lapply(sheets, read_excel, path = "urbanpop.xlsx")
# Show the structure of pop_list
str(pop_list)

## List of 3
##  $ : tibble [209 × 8] (S3: tbl_df/tbl/data.frame)
##   ..$ country: chr [1:209] "Afghanistan" "Albania" "Algeria" "American Samoa" ...
##   ..$ 1960   : num [1:209] 769308 494443 3293999 NA NA ...
##   ..$ 1961   : num [1:209] 814923 511803 3515148 13660 8724 ...
##   ..$ 1962   : num [1:209] 858522 529439 3739963 14166 9700 ...
##   ..$ 1963   : num [1:209] 903914 547377 3973289 14759 10748 ...
##   ..$ 1964   : num [1:209] 951226 565572 4220987 15396 11866 ...
##   ..$ 1965   : num [1:209] 1000582 583983 4488176 16045 13053 ...
##   ..$ 1966   : num [1:209] 1058743 602512 4649105 16693 14217 ...
##  $ : tibble [209 × 9] (S3: tbl_df/tbl/data.frame)
##   ..$ country: chr [1:209] "Afghanistan" "Albania" "Algeria" "American Samoa" ...
##   ..$ 1967   : num [1:209] 1119067 621180 4826104 17349 15440 ...
##   ..$ 1968   : num [1:209] 1182159 639964 5017299 17996 16727 ...
##   ..$ 1969   : num [1:209] 1248901 658853 5219332 18619 18088 ...
##   ..$ 1970   : num [1:209] 1319849 677839 5429743 19206 19529 ...
##   ..$ 1971   : num [1:209] 1409001 698932 5619042 19752 20929 ...
##   ..$ 1972   : num [1:209] 1502402 720207 5815734 20263 22406 ...
##   ..$ 1973   : num [1:209] 1598835 741681 6020647 20742 23937 ...
##   ..$ 1974   : num [1:209] 1696445 763385 6235114 21194 25482 ...
##  $ : tibble [209 × 38] (S3: tbl_df/tbl/data.frame)
##   ..$ country: chr [1:209] "Afghanistan" "Albania" "Algeria" "American Samoa" ...
##   ..$ 1975   : num [1:209] 1793266 785350 6460138 21632 27019 ...
##   ..$ 1976   : num [1:209] 1905033 807990 6774099 22047 28366 ...
##   ..$ 1977   : num [1:209] 2021308 830959 7102902 22452 29677 ...
##   ..$ 1978   : num [1:209] 2142248 854262 7447728 22899 31037 ...
##   ..$ 1979   : num [1:209] 2268015 877898 7810073 23457 32572 ...
##   ..$ 1980   : num [1:209] 2398775 901884 8190772 24177 34366 ...
##   ..$ 1981   : num [1:209] 2493265 927224 8637724 25173 36356 ...
##   ..$ 1982   : num [1:209] 2590846 952447 9105820 26342 38618 ...
##   ..$ 1983   : num [1:209] 2691612 978476 9591900 27655 40983 ...
##   ..$ 1984   : num [1:209] 2795656 1006613 10091289 29062 43207 ...
##   ..$ 1985   : num [1:209] 2903078 1037541 10600112 30524 45119 ...
##   ..$ 1986   : num [1:209] 3006983 1072365 11101757 32014 46254 ...
##   ..$ 1987   : num [1:209] 3113957 1109954 11609104 33548 47019 ...
##   ..$ 1988   : num [1:209] 3224082 1146633 12122941 35095 47669 ...
##   ..$ 1989   : num [1:209] 3337444 1177286 12645263 36618 48577 ...
##   ..$ 1990   : num [1:209] 3454129 1198293 13177079 38088 49982 ...
##   ..$ 1991   : num [1:209] 3617842 1215445 13708813 39600 51972 ...
##   ..$ 1992   : num [1:209] 3788685 1222544 14248297 41049 54469 ...
##   ..$ 1993   : num [1:209] 3966956 1222812 14789176 42443 57079 ...
##   ..$ 1994   : num [1:209] 4152960 1221364 15322651 43798 59243 ...
##   ..$ 1995   : num [1:209] 4347018 1222234 15842442 45129 60598 ...
##   ..$ 1996   : num [1:209] 4531285 1228760 16395553 46343 60927 ...
##   ..$ 1997   : num [1:209] 4722603 1238090 16935451 47527 60462 ...
##   ..$ 1998   : num [1:209] 4921227 1250366 17469200 48705 59685 ...
##   ..$ 1999   : num [1:209] 5127421 1265195 18007937 49906 59281 ...
##   ..$ 2000   : num [1:209] 5341456 1282223 18560597 51151 59719 ...
##   ..$ 2001   : num [1:209] 5564492 1315690 19198872 52341 61062 ...
##   ..$ 2002   : num [1:209] 5795940 1352278 19854835 53583 63212 ...
##   ..$ 2003   : num [1:209] 6036100 1391143 20529356 54864 65802 ...
##   ..$ 2004   : num [1:209] 6285281 1430918 21222198 56166 68301 ...
##   ..$ 2005   : num [1:209] 6543804 1470488 21932978 57474 70329 ...
##   ..$ 2006   : num [1:209] 6812538 1512255 22625052 58679 71726 ...
##   ..$ 2007   : num [1:209] 7091245 1553491 23335543 59894 72684 ...
##   ..$ 2008   : num [1:209] 7380272 1594351 24061749 61118 73335 ...
##   ..$ 2009   : num [1:209] 7679982 1635262 24799591 62357 73897 ...
##   ..$ 2010   : num [1:209] 7990746 1676545 25545622 63616 74525 ...
##   ..$ 2011   : num [1:209] 8316976 1716842 26216968 64817 75207 ...

# Import the second sheet of "urbanpop.xlsx", but skip the first 21 rows.
read_excel("urbanpop.xlsx", sheet = "1967-1974", skip = 21)

## # A tibble: 188 × 9
##    Benin          `382022.12157999998` `411859.45118400001` `443013.10755199997`
##    <chr>                         <dbl>                <dbl>                <dbl>
##  1 Bermuda                      52000                53000                54000 
##  2 Bhutan                       14379.               15617.               16946.
##  3 Bolivia                    1527065.             1575177.             1625173.
##  4 Bosnia and He…              851692.              890270.              929450.
##  5 Botswana                     34320.               40576.               47222.
##  6 Brazil                    47193517.            49316879.            51489096.
##  7 Brunei                       61289.               66222.               71503.
##  8 Bulgaria                   4019906.             4158186.             4300669.
##  9 Burkina Faso                296824.              308661.              320961.
## 10 Burundi                      76166.               78816.               81356.
## # ℹ 178 more rows
## # ℹ 5 more variables: `475611.38027999998` <dbl>, `515819.52856800001` <dbl>,
## #   `557937.59942999994` <dbl>, `602093.16211999999` <dbl>,
## #   `648409.65390599996` <dbl>

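The urban population data were read from an Excel workbook with three sheets and collected in a list for easier handling. Two details of the output are worth noting. First, lapply() over an unnamed character vector returns an unnamed list, which is why str() shows "$ :" instead of the sheet names. Second, with skip = 21 and the default col_names = TRUE, read_excel() uses the first row it reads (Benin) as the header, so the original column names are lost and that country is dropped from the data.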
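The sketch below shows one way to keep the sheet names on the list and to skip rows without losing the header. It is an illustration under assumptions: readxl's bundled example workbook (`readxl_example("datasets.xlsx")`) stands in for urbanpop.xlsx, and `sheet_list`, `header`, and `tail_rows` are hypothetical names.

```r
library(readxl)

# Assumption: readxl's bundled example workbook stands in for urbanpop.xlsx.
path <- readxl_example("datasets.xlsx")
sheets <- excel_sheets(path)

# Naming the vector first makes lapply() return a list named by sheet.
sheet_list <- lapply(setNames(sheets, sheets), read_excel, path = path)

# Reuse the real header when skipping rows, so that a data row is not
# promoted to column names. skip = 21 drops the header line plus 20 data rows.
header <- names(sheet_list[["iris"]])
tail_rows <- read_excel(path, sheet = "iris", skip = 21, col_names = header)
```

For urbanpop.xlsx the same pattern would pass the names of the full "1967-1974" sheet as `col_names`.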

##Exercise 5. Density --------------------------------------------------------------
library(tidyverse)

## Warning: package 'tidyverse' was built under R version 4.4.2

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

counties <- readRDS(file="C:/Users/Angelica Elena/Desktop/Estadistica Computacional/2. Manipulación/dataset/counties.rds")
counties$state <- factor(counties$state)
counties %>%  
  select(state, county, population, land_area) %>% 
  group_by(state) %>%
  summarize(totalpol = sum(population), totalar = sum(land_area)) %>%
  mutate(density = totalpol / totalar) %>%
  count(state, wt = density, sort = TRUE)

## # A tibble: 50 × 2
##    state             n
##    <fct>         <dbl>
##  1 New Jersey    1211.
##  2 Rhode Island  1019.
##  3 Massachusetts  860.
##  4 Connecticut    742.
##  5 Maryland       611.
##  6 Delaware       475.
##  7 New York       417.
##  8 Florida        366.
##  9 Pennsylvania   286.
## 10 Ohio           283.
## # ℹ 40 more rows

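Population density was computed per state as total population divided by total land area. New Jersey is the densest state (about 1,211), followed by Rhode Island (about 1,019) and Massachusetts (about 860). The column n in the output holds the density: count() is called with wt = density only to sort the rows, and since each state appears once after summarize(), the weighted count equals the density itself.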
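An alternative that keeps the column name `density` is to sort with `arrange(desc())` instead of a weighted count. This is a minimal sketch on invented toy data; the tibble `toy` and all of its values are hypothetical and only illustrate the shape of the pipeline.

```r
library(dplyr)

# Hypothetical toy data: the names and numbers below are invented.
toy <- tibble(
  state      = c("A", "A", "B", "B"),
  county     = c("a1", "a2", "b1", "b2"),
  population = c(100, 300, 50, 150),
  land_area  = c(10, 10, 5, 95)
)

density_by_state <- toy %>%
  group_by(state) %>%
  summarize(
    total_population = sum(population),
    total_area       = sum(land_area)
  ) %>%
  mutate(density = total_population / total_area) %>%
  arrange(desc(density))   # densest state first
```

Summing population and area before dividing gives the state-level density, which generally differs from the average of the county densities.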

##Exercise 6. Workers Who Walk--------------------------------------------
counties %>%
  select(region, state, county, metro, population, walk) %>% 
  group_by(region) %>%
  top_n(1, walk)

## # A tibble: 4 × 6
## # Groups:   region [4]
##   region        state        county                 metro    population  walk
##   <chr>         <fct>        <chr>                  <chr>         <dbl> <dbl>
## 1 West          Alaska       Aleutians East Borough Nonmetro       3304  71.2
## 2 Northeast     New York     New York               Metro       1629507  20.7
## 3 North Central North Dakota McIntosh               Nonmetro       2759  17.5
## 4 South         Virginia     Lexington city         Nonmetro       7071  31.7

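For each region, the county with the highest value of walk (the share of workers who walk to work) was identified: Aleutians East Borough, Alaska (71.2) in the West; Lexington city, Virginia (31.7) in the South; New York, New York (20.7) in the Northeast; and McIntosh, North Dakota (17.5) in the North Central region. Such counties are a natural starting point for planning pedestrian infrastructure.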
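In current dplyr releases `top_n()` is superseded by `slice_max()`, which makes the ordering column and the number of rows explicit. A minimal sketch on invented toy data follows; `toy` and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: names and percentages are invented.
toy <- tibble(
  region = c("West", "West", "South", "South"),
  county = c("w1", "w2", "s1", "s2"),
  walk   = c(5.0, 12.5, 3.1, 2.0)
)

top_walkers <- toy %>%
  group_by(region) %>%
  slice_max(walk, n = 1) %>%   # top row per region; ties are kept by default
  ungroup()
```

Setting `with_ties = FALSE` would force exactly one row per region even when several counties share the maximum.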

##Exercise 7. Highest Income---------------------------------------------------
counties %>%
  # Select the columns (region, state, county, metro, population, walk, income)
  select(region, state, county, metro, population, walk, income) %>% 
  # Group by region and state
  group_by(region,state) %>%
  # Compute the mean income
  summarize(prom_income = mean(income))%>%
  # Find the state with the highest mean income in each region
  top_n(1,prom_income)

## `summarise()` has grouped output by 'region'. You can override using the
## `.groups` argument.

## # A tibble: 4 × 3
## # Groups:   region [4]
##   region        state        prom_income
##   <chr>         <fct>              <dbl>
## 1 North Central North Dakota      55575.
## 2 Northeast     New Jersey        73014.
## 3 South         Maryland          69200.
## 4 West          Alaska            65125.

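The mean county income was computed for each state within each region, and the state with the highest mean in each region was kept: New Jersey in the Northeast (73,014), Maryland in the South (69,200), Alaska in the West (65,125), and North Dakota in the North Central region (55,575). Note that mean(income) gives every county the same weight regardless of its population.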
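Two refinements may be worth considering here: passing `.groups` to `summarize()` so the grouping of the result is explicit (which also silences the message printed above), and comparing the plain mean with a population-weighted mean. The sketch uses invented toy data; `toy`, `mean_income`, and `weighted_income` are hypothetical names.

```r
library(dplyr)

# Hypothetical toy data: states, incomes, and populations are invented.
toy <- tibble(
  region     = c("West", "West", "West", "South", "South"),
  state      = c("X", "X", "Y", "Z", "Z"),
  income     = c(40000, 60000, 45000, 30000, 50000),
  population = c(1000, 3000, 500, 100, 100)
)

state_income <- toy %>%
  group_by(region, state) %>%
  summarize(
    mean_income     = mean(income),                       # each county counts equally
    weighted_income = weighted.mean(income, population),  # weighted by residents
    .groups = "drop_last"                                 # result stays grouped by region
  )
```

In the toy data, state X has a plain mean of 50,000 but a weighted mean of 55,000, because its larger county has the higher income.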

##Exercise 8. People in Metro and Nonmetro Areas--------------------------------------
counties %>%
  # Select the columns (state, county, population, metro, land_area)
  select(state, county, population, metro, land_area) %>%
  # Find the total population for each combination of state and metro
  group_by(state, metro) %>%
  mutate(total_population = sum(population)) %>%
  # Keep the rows with the largest total_population within each group
  top_n(1,total_population)%>%
  # Sum total_population within each group
  # (intended: count the states with more people in metro or nonmetro areas)
  count(metro, wt = total_population, sort = TRUE)

## # A tibble: 97 × 3
## # Groups:   state, metro [97]
##    state          metro             n
##    <fct>          <chr>         <dbl>
##  1 Texas          Metro    1927875760
##  2 California     Metro    1390734873
##  3 Florida        Metro     833440124
##  4 New York       Metro     694157932
##  5 Georgia        Metro     609307564
##  6 Virginia       Metro     575628240
##  7 Texas          Nonmetro  517756707
##  8 Illinois       Metro     454745880
##  9 Pennsylvania   Metro     417545888
## 10 North Carolina Metro     351439632
## # ℹ 87 more rows

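The goal was to compare the population living in metropolitan and non-metropolitan areas in each state. As written, however, mutate() keeps one row per county, so count() with wt = total_population adds the group total once for every county in the group. The column n is therefore the group's population multiplied by its number of counties (for example, 1,927,875,760 for Texas Metro), which is neither a population nor a count of states. Using summarize() instead of mutate() would give one row per combination of state and metro.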
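A sketch of how the intended count could be obtained is shown here on invented toy data: total the population per state and metro category with `summarize()`, keep the larger category in each state, and count how many states fall in each. The tibble `toy` and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: states and populations are invented.
toy <- tibble(
  state      = c("A", "A", "A", "B", "B"),
  metro      = c("Metro", "Metro", "Nonmetro", "Metro", "Nonmetro"),
  population = c(100, 200, 50, 10, 40)
)

metro_counts <- toy %>%
  group_by(state, metro) %>%
  summarize(total_population = sum(population), .groups = "drop_last") %>%
  slice_max(total_population, n = 1) %>%  # larger of Metro/Nonmetro per state
  ungroup() %>%
  count(metro)                            # number of states in each category
```

In the toy data, state A is mostly metropolitan and state B mostly non-metropolitan, so each category is counted once.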

##Exercise 9. Public-Sector Workers--------------------------------------------
counties %>%
  # Select state, county, population, and the columns ending in "work"
  select(state, county, population, ends_with("work")) %>%
  # Keep the counties where public_work is at least 50%
  filter(public_work>=50)

## # A tibble: 7 × 6
##   state        county            population private_work public_work family_work
##   <fct>        <chr>                  <dbl>        <dbl>       <dbl>       <dbl>
## 1 Alaska       Lake and Peninsu…       1474         42.2        51.6         0.2
## 2 Alaska       Yukon-Koyukuk Ce…       5644         33.3        61.7         0  
## 3 California   Lassen                 32645         42.6        50.5         0.1
## 4 Hawaii       Kalawao                   85         25          64.1         0  
## 5 North Dakota Sioux                   4380         32.9        56.8         0.1
## 6 South Dakota Todd                    9942         34.4        55           0.8
## 7 Wisconsin    Menominee               4451         36.8        59.1         0.4

Seven counties have a public_work value of at least 50%, led by Kalawao, Hawaii (64.1) and Yukon-Koyukuk, Alaska (61.7). These counties depend heavily on the public sector, which is relevant for employment studies.

##Exercise 10. Renaming Column n-------------------------------------------------
counties %>%
  # Count the number of counties in each state
  select(state, county) %>% group_by(state) %>% count(state) %>% 
  # Rename column n to num_counties
  rename(num_counties = n)

## # A tibble: 50 × 2
## # Groups:   state [50]
##    state       num_counties
##    <fct>              <int>
##  1 Alabama               67
##  2 Alaska                28
##  3 Arizona               15
##  4 Arkansas              75
##  5 California            58
##  6 Colorado              64
##  7 Connecticut            8
##  8 Delaware               3
##  9 Florida               67
## 10 Georgia              159
## # ℹ 40 more rows

The column n was renamed to num_counties so that the output reads directly as the number of counties per state.
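The count and the rename can also be done in one step with the `name` argument of `count()`, which makes the `group_by()` and `rename()` calls unnecessary. A minimal sketch on invented toy data; `toy` and `county_counts` are hypothetical names.

```r
library(dplyr)

# Hypothetical toy data: state and county labels are invented.
toy <- tibble(
  state  = c("A", "A", "B"),
  county = c("a1", "a2", "b1")
)

county_counts <- toy %>%
  count(state, name = "num_counties")   # names the count column directly
```

Because the input is not grouped beforehand, the result is an ungrouped tibble.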

##Exercise 11. Sorted Density------------------------------------------------------
counties %>%
  # Keep the state, county, and population columns,
  # and add the density column (population per unit of land area, land_area)
  transmute(state, county, population, density  = population/land_area) %>% 
  # Keep the counties with a population above one million
  filter(population>1000000) %>% 
  # Sort by density in ascending order
  arrange(density)

## # A tibble: 41 × 4
##    state      county         population density
##    <fct>      <chr>               <dbl>   <dbl>
##  1 California San Bernardino    2094769    104.
##  2 Nevada     Clark             2035572    258.
##  3 California Riverside         2298032    319.
##  4 Arizona    Maricopa          4018143    437.
##  5 Florida    Palm Beach        1378806    700.
##  6 California San Diego         3223096    766.
##  7 Washington King              2045756    967.
##  8 Texas      Travis            1121645   1133.
##  9 Florida    Hillsborough      1302884   1277.
## 10 Florida    Orange            1229039   1360.
## # ℹ 31 more rows

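Density was computed for the 41 counties with more than one million inhabitants, and the result was sorted in ascending order, so the least dense of these counties appears first (San Bernardino, California, with about 104). Reversing the order would put the counties under the greatest demographic pressure at the top.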

Workshop 2: Gabriel Fernandez and Angelica Sierra

Angelica Sierra

2025-02-07
