LUIS ALFREDO LEMUS PAZ

Econometrics 1 assignment. Analysis of data on the primary elections in the United States.

setwd("~/Desktop/Primarias")
primary_results <- read.csv(file = "primary_results.csv")

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

These two commands give an overview of the structure of the data.

tbl_df(primary_results)
## # A tibble: 24,611 × 8
##      state state_abbreviation  county  fips    party       candidate votes
##     <fctr>             <fctr>  <fctr> <int>   <fctr>          <fctr> <int>
## 1  Alabama                 AL Autauga  1001 Democrat  Bernie Sanders   544
## 2  Alabama                 AL Autauga  1001 Democrat Hillary Clinton  2387
## 3  Alabama                 AL Baldwin  1003 Democrat  Bernie Sanders  2694
## 4  Alabama                 AL Baldwin  1003 Democrat Hillary Clinton  5290
## 5  Alabama                 AL Barbour  1005 Democrat  Bernie Sanders   222
## 6  Alabama                 AL Barbour  1005 Democrat Hillary Clinton  2567
## 7  Alabama                 AL    Bibb  1007 Democrat  Bernie Sanders   246
## 8  Alabama                 AL    Bibb  1007 Democrat Hillary Clinton   942
## 9  Alabama                 AL  Blount  1009 Democrat  Bernie Sanders   395
## 10 Alabama                 AL  Blount  1009 Democrat Hillary Clinton   564
## # ... with 24,601 more rows, and 1 more variables: fraction_votes <dbl>
glimpse(primary_results)
## Observations: 24,611
## Variables: 8
## $ state              <fctr> Alabama, Alabama, Alabama, Alabama, Alabam...
## $ state_abbreviation <fctr> AL, AL, AL, AL, AL, AL, AL, AL, AL, AL, AL...
## $ county             <fctr> Autauga, Autauga, Baldwin, Baldwin, Barbou...
## $ fips               <int> 1001, 1001, 1003, 1003, 1005, 1005, 1007, 1...
## $ party              <fctr> Democrat, Democrat, Democrat, Democrat, De...
## $ candidate          <fctr> Bernie Sanders, Hillary Clinton, Bernie Sa...
## $ votes              <int> 544, 2387, 2694, 5290, 222, 2567, 246, 942,...
## $ fraction_votes     <dbl> 0.182, 0.800, 0.329, 0.647, 0.078, 0.906, 0...

1. Find the number of candidates in the primaries.

unique(primary_results$candidate)
##  [1] Bernie Sanders  Hillary Clinton Ben Carson      Donald Trump   
##  [5] John Kasich     Marco Rubio     Ted Cruz         Uncommitted   
##  [9] Martin O'Malley Carly Fiorina   Chris Christie  Jeb Bush       
## [13] Mike Huckabee   Rand Paul       Rick Santorum    No Preference 
## 16 Levels:  No Preference  Uncommitted Ben Carson ... Ted Cruz

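The candidate column has 16 distinct values, but two of them ("No Preference" and "Uncommitted") are ballot options rather than people, so there were 14 candidates in the primaries.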
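The count can also be computed directly instead of being read off the printed levels. The following is a minimal sketch on a toy data frame with hypothetical rows (`toy`, `placeholders`, and `n_candidates` are illustrative names, not part of the original notebook). It assumes the two placeholder labels may carry a leading space, as the level listing above suggests, and trims it before comparing.

```r
library(dplyr)

# Toy stand-in for primary_results (hypothetical rows, not the real data).
toy <- data.frame(
  candidate = c("Bernie Sanders", "Hillary Clinton", " Uncommitted",
                "Donald Trump", " No Preference", "Donald Trump"),
  stringsAsFactors = FALSE
)

# Ballot options that are not actual candidates.
placeholders <- c("No Preference", "Uncommitted")

n_candidates <- toy %>%
  mutate(candidate = trimws(candidate)) %>%   # drop stray leading/trailing spaces
  filter(!candidate %in% placeholders) %>%    # keep only real candidates
  summarise(n = n_distinct(candidate)) %>%    # count distinct names
  pull(n)

n_candidates
```

On the real data, the same pipeline applied to `primary_results` (after converting the factor with `as.character()`) should reproduce the count of 14.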

2. Republican candidates

republican <- filter(primary_results, party == "Republican")
unique(republican$candidate)
##  [1] Ben Carson     Donald Trump   John Kasich    Marco Rubio   
##  [5] Ted Cruz       Carly Fiorina  Chris Christie Jeb Bush      
##  [9] Mike Huckabee  Rand Paul      Rick Santorum 
## 16 Levels:  No Preference  Uncommitted Ben Carson ... Ted Cruz

There are 11 Republican candidates.

3. Votes by party in Florida.

unique(primary_results$state)
##  [1] Alabama        Alaska         Arizona        Arkansas      
##  [5] California     Colorado       Connecticut    Delaware      
##  [9] Florida        Georgia        Hawaii         Idaho         
## [13] Illinois       Indiana        Iowa           Kansas        
## [17] Kentucky       Louisiana      Maine          Maryland      
## [21] Massachusetts  Michigan       Mississippi    Missouri      
## [25] Montana        Nebraska       Nevada         New Hampshire 
## [29] New Jersey     New Mexico     New York       North Carolina
## [33] North Dakota   Ohio           Oklahoma       Oregon        
## [37] Pennsylvania   Rhode Island   South Carolina South Dakota  
## [41] Tennessee      Texas          Utah           Vermont       
## [45] Virginia       Washington     West Virginia  Wisconsin     
## [49] Wyoming       
## 49 Levels: Alabama Alaska Arizona Arkansas California ... Wyoming
Florida <- filter(primary_results, state == "Florida")
# Caution: this groups the full dataset (not Florida), and n() counts rows, not votes
by_party <- group_by(primary_results,party)
summarise(by_party, votes = n())
## # A tibble: 2 × 2
##        party votes
##       <fctr> <int>
## 1   Democrat  8959
## 2 Republican 15652

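The table above does not give Florida vote totals. The grouping is applied to the full dataset rather than to the Florida data frame, and n() counts rows instead of summing votes. The values 8,959 (Democrat) and 15,652 (Republican) are therefore the numbers of result rows per party nationwide, which add up to the 24,611 rows of the dataset. The Florida totals require grouping the Florida data frame by party and computing sum(votes).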
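The difference between counting rows and summing votes can be sketched on a toy data frame. The rows below are hypothetical, and `toy` and `florida_by_party` are illustrative names; the point is only to show `n()` and `sum(votes)` side by side after restricting to one state.

```r
library(dplyr)

# Toy stand-in for primary_results (hypothetical values).
toy <- data.frame(
  state = c("Florida", "Florida", "Florida", "Florida", "Texas", "Texas"),
  party = c("Democrat", "Democrat", "Republican", "Republican",
            "Democrat", "Republican"),
  votes = c(100, 50, 80, 30, 500, 700),
  stringsAsFactors = FALSE
)

florida_by_party <- toy %>%
  filter(state == "Florida") %>%   # restrict to Florida first
  group_by(party) %>%
  summarise(rows  = n(),           # number of result rows per party
            votes = sum(votes))    # number of votes per party

florida_by_party
```

In this toy case each party has 2 rows, while the vote totals are 150 and 110, which shows why `n()` cannot answer a question about votes.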

4. Florida county with the most voters.

unique(Florida$state)
## [1] Florida
## 49 Levels: Alabama Alaska Arizona Arkansas California ... Wyoming
nrow(Florida)
## [1] 402
unique(Florida$county)
##  [1] Alachua      Baker        Bay          Bradford     Brevard     
##  [6] Broward      Calhoun      Charlotte    Citrus       Clay        
## [11] Collier      Columbia     DeSoto       Dixie        Duval       
## [16] Escambia     Flagler      Franklin     Gadsden      Gilchrist   
## [21] Glades       Gulf         Hamilton     Hardee       Hendry      
## [26] Hernando     Highlands    Hillsborough Holmes       Indian River
## [31] Jackson      Jefferson    Lafayette    Lake         Lee         
## [36] Leon         Levy         Liberty      Madison      Manatee     
## [41] Marion       Martin       Miami-Dade   Monroe       Nassau      
## [46] Okaloosa     Okeechobee   Orange       Osceola      Palm Beach  
## [51] Pasco        Pinellas     Polk         Putnam       Santa Rosa  
## [56] Sarasota     Seminole     St. Johns    St. Lucie    Sumter      
## [61] Suwannee     Taylor       Union        Volusia      Wakulla     
## [66] Walton       Washington  
## 2633 Levels: Abbeville Abbot Abington Acadia Accomack Acton Acushnet ... Ziebach
by_county<-group_by(Florida, county)
Florida_county<-summarise(by_county, sumvotes = sum(votes, na.rm = FALSE))
arrange(Florida_county, desc(sumvotes))
## # A tibble: 67 × 2
##          county sumvotes
##          <fctr>    <int>
## 1    Miami-Dade   344894
## 2       Broward   285433
## 3    Palm Beach   266832
## 4  Hillsborough   232702
## 5      Pinellas   227497
## 6        Orange   203012
## 7         Duval   193375
## 8       Brevard   147315
## 9           Lee   146936
## 10         Polk   116161
## # ... with 57 more rows

The Florida county with the most votes is Miami-Dade, with 344,894 votes.

5. In the Florida county with the most voters, which candidate received the most votes, and from which party?

Miami_Dade<-filter(Florida, county == "Miami-Dade")
by_votes<-group_by(Miami_Dade, candidate)
Miami_Dade_candidate<-summarise(by_votes, sumvotes = sum(votes, na.rm = FALSE))
arrange(Miami_Dade_candidate, desc(sumvotes))
## # A tibble: 6 × 2
##         candidate sumvotes
##            <fctr>    <int>
## 1 Hillary Clinton   129467
## 2     Marco Rubio   111898
## 3  Bernie Sanders    42009
## 4    Donald Trump    40156
## 5        Ted Cruz    16170
## 6     John Kasich     5194

The candidate with the most votes in Miami-Dade County, Florida, was Hillary Clinton of the Democratic Party.
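Questions 4 and 5 can be chained so that the top county is selected programmatically rather than typed by hand. This is a minimal sketch on hypothetical data (`toy`, `top_county`, and `top_candidate` are illustrative names). Keeping the `party` column in the result answers the "which party" part directly.

```r
library(dplyr)

# Toy stand-in for the Florida rows (hypothetical counties, candidates, votes).
toy <- data.frame(
  county    = c("A", "A", "A", "B", "B", "B"),
  party     = c("Democrat", "Republican", "Republican",
                "Democrat", "Republican", "Republican"),
  candidate = c("X", "Y", "Z", "X", "Y", "Z"),
  votes     = c(10, 40, 5, 20, 15, 30),
  stringsAsFactors = FALSE
)

# Step 1: county with the largest total vote count.
top_county <- toy %>%
  group_by(county) %>%
  summarise(sumvotes = sum(votes)) %>%
  arrange(desc(sumvotes)) %>%
  slice(1) %>%
  pull(county)

# Step 2: best-placed candidate within that county, with the party kept.
top_candidate <- toy %>%
  filter(county == top_county) %>%
  arrange(desc(votes)) %>%
  slice(1) %>%
  select(candidate, party, votes)

top_candidate
```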

6. How many people voted for Hillary Clinton and how many for Donald Trump in the United States?

Hillary_votes<-filter(primary_results,candidate == "Hillary Clinton")
sum(Hillary_votes$votes)
## [1] 15692452
Donald_votes<-filter(primary_results,candidate == "Donald Trump")
sum(Donald_votes$votes)
## [1] 13302541

In the United States, Hillary Clinton received 15,692,452 votes and Donald Trump received 13,302,541 votes.

7. What is the probability that a Republican voter in Florida voted for Jeb Bush?

Jeb_Bush<-filter(primary_results, candidate == "Jeb Bush")
unique(Jeb_Bush$state)
## [1] Iowa           New Hampshire  South Carolina
## 49 Levels: Alabama Alaska Arizona Arkansas California ... Wyoming

The probability is 0. Jeb Bush has no votes recorded in Florida in this dataset; he appears only in Iowa, New Hampshire, and South Carolina.

8. Given that a person voted for Ted Cruz, what is the probability that they are from California?

# All Ted Cruz rows nationwide (despite the name, not restricted to California)
California_Ted<-filter(primary_results, candidate == "Ted Cruz")
by_state_Ted<-group_by(California_Ted,state)
state_sumvotes_Ted<-summarise(by_state_Ted, sumvotes = sum(votes, na.rm = FALSE))
# Extract the numeric vote count; dividing the whole data frame triggers a factor warning
a<-filter(state_sumvotes_Ted, state == "California")$sumvotes
b<-sum(state_sumvotes_Ted$sumvotes)
a/b
## [1] 0.01895632

Given a vote for Ted Cruz, the probability that it was cast in California is about 1.90%.

9. Given that a person is from Texas, what is the probability that they voted for Donald Trump?

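Donald_Trump<-filter(primary_results,  candidate == "Donald Trump")
by_state_Donald<-group_by(Donald_Trump,state)
Donald_Texas<-summarise(by_state_Donald, sumvotes = sum(votes, na.rm = FALSE))
# Trump's votes in Texas, as a number rather than a data frame
a<-filter(Donald_Texas, state == "Texas")$sumvotes
# Trump's votes in all states
b<-sum(Donald_Texas$sumvotes)
a/b
## [1] 0.05695288

This ratio divides Trump's Texas votes by Trump's nationwide votes, so it is the probability that a Trump voter is from Texas, P(Texas | Trump), about 5.70%. The question asks for P(Trump | Texas), which requires dividing Trump's Texas votes by the total votes cast in Texas; that figure is not computed here.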
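Questions 8 and 9 condition in opposite directions, and the two ratios share a numerator but not a denominator. The sketch below makes that explicit on a toy data frame with hypothetical vote counts; `toy`, `total_votes`, and the `p_*` names are illustrative and do not appear in the original notebook.

```r
library(dplyr)

# Toy stand-in for primary_results (hypothetical values).
toy <- data.frame(
  state     = c("Texas", "Texas", "Ohio", "Ohio"),
  candidate = c("Donald Trump", "Ted Cruz", "Donald Trump", "Ted Cruz"),
  votes     = c(30, 70, 60, 40),
  stringsAsFactors = FALSE
)

# Helper (hypothetical): total votes in a filtered data frame.
total_votes <- function(df) sum(df$votes)

trump_texas <- total_votes(filter(toy, state == "Texas",
                                  candidate == "Donald Trump"))
texas_total <- total_votes(filter(toy, state == "Texas"))
trump_total <- total_votes(filter(toy, candidate == "Donald Trump"))

# P(Trump | Texas): share of Texas votes that went to Trump.
p_trump_given_texas <- trump_texas / texas_total

# P(Texas | Trump): share of Trump's votes that came from Texas.
p_texas_given_trump <- trump_texas / trump_total

c(p_trump_given_texas, p_texas_given_trump)
```

With these toy numbers the first probability is 30/100 and the second is 30/90, so swapping the denominator changes the answer.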

10. Which county in the United States had the most voters?

by_county_total<-group_by(primary_results,county)
condado_total<-summarise(by_county_total, sumvotestotal = sum(votes, na.rm = FALSE))
arrange(condado_total, desc(sumvotestotal))
## # A tibble: 2,633 × 2
##          county sumvotestotal
##          <fctr>         <int>
## 1   Los Angeles       1268622
## 2    Montgomery        823976
## 3       Chicago        760894
## 4        Orange        740240
## 5  Cook Suburbs        678313
## 6     Jefferson        635690
## 7        Harris        545932
## 8         Wayne        522322
## 9      Franklin        488365
## 10     Maricopa        464471
## # ... with 2,623 more rows

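The county with the most votes in the United States was Los Angeles, with 1,268,622. Because the grouping uses the county name alone, counties that share a name across states (for example Montgomery, Orange, Jefferson, and Franklin) are merged in this table. Los Angeles still ranks first because even those merged totals are smaller. Grouping by fips, or by state and county together, avoids the merging.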
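The effect of grouping by name alone can be sketched with a toy data frame. The vote counts are hypothetical and deliberately chosen so that merging two same-named counties changes the ranking, which does not happen in the real output above. `toy`, `by_name`, and `by_state_county` are illustrative names.

```r
library(dplyr)

# Toy stand-in for primary_results (hypothetical values).
toy <- data.frame(
  state  = c("California", "Ohio", "Maryland", "Ohio"),
  county = c("Los Angeles", "Montgomery", "Montgomery", "Franklin"),
  votes  = c(100, 60, 70, 50),
  stringsAsFactors = FALSE
)

# Grouping by county name merges the two Montgomery counties.
by_name <- toy %>%
  group_by(county) %>%
  summarise(total = sum(votes)) %>%
  arrange(desc(total))

# Grouping by state and county keeps them apart.
by_state_county <- toy %>%
  group_by(state, county) %>%
  summarise(total = sum(votes)) %>%
  ungroup() %>%
  arrange(desc(total))

by_name
by_state_county
```

On the real data, `group_by(primary_results, fips)` would serve the same purpose, assuming every county row has a valid FIPS code.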

11. Who won in Los Angeles among the Democrats? Among the Republicans?

filter(primary_results, county == "Los Angeles")
##        state state_abbreviation      county fips      party
## 1 California                 CA Los Angeles 6037   Democrat
## 2 California                 CA Los Angeles 6037   Democrat
## 3 California                 CA Los Angeles 6037 Republican
## 4 California                 CA Los Angeles 6037 Republican
## 5 California                 CA Los Angeles 6037 Republican
##         candidate  votes fraction_votes
## 1  Bernie Sanders 434656          0.420
## 2 Hillary Clinton 590502          0.570
## 3    Donald Trump 179130          0.698
## 4     John Kasich  33559          0.131
## 5        Ted Cruz  30775          0.120
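# Caution: this filter keeps every Democratic row nationwide, not only Los Angeles,
# so the summary below gives national totals per candidate
Los_Angeles_Democrat<-filter(primary_results, party == "Democrat")
by_Los_Angeles<-group_by(Los_Angeles_Democrat,candidate)
LA_candidate<-summarise(by_Los_Angeles, sumvotesLA = sum(votes, na.rm = FALSE))
arrange(LA_candidate, desc(sumvotesLA))
## # A tibble: 5 × 2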
##         candidate sumvotesLA
##            <fctr>      <int>
## 1 Hillary Clinton   15692452
## 2  Bernie Sanders   11959102
## 3   No Preference       8152
## 4 Martin O'Malley        752
## 5     Uncommitted         43
# Caution: this filter keeps every Republican row nationwide, not only Los Angeles,
# so the summary below gives national totals per candidate
Los_Angeles_Republican<-filter(primary_results, party == "Republican")
by_Los_AngelesR<-group_by(Los_Angeles_Republican,candidate)
LA_candidateR<-summarise(by_Los_AngelesR, sumvotesLA = sum(votes, na.rm = FALSE))
arrange(LA_candidateR, desc(sumvotesLA))
## # A tibble: 11 × 2
##         candidate sumvotesLA
##            <fctr>      <int>
## 1    Donald Trump   13302541
## 2        Ted Cruz    7603006
## 3     John Kasich    4159949
## 4     Marco Rubio    3321076
## 5      Ben Carson     564553
## 6        Jeb Bush      94411
## 7  Chris Christie      24353
## 8   Carly Fiorina      15191
## 9       Rand Paul       8479
## 10  Mike Huckabee       3345
## 11  Rick Santorum       1782

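In Los Angeles County, Hillary Clinton won the Democratic primary with 590,502 votes (57.0%), and Donald Trump won the Republican primary with 179,130 votes (69.8%), as the county rows show. The two summaries filter only on party, so their top figures (15,692,452 for Clinton and 13,302,541 for Trump) are nationwide totals, not Los Angeles results.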
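The winner per party can be extracted in one step by filtering on the county and keeping the row with the most votes within each party. The sketch below rebuilds the five Los Angeles rows from the filter output shown in this question; the data frame name `la` and the result name `la_winners` are illustrative.

```r
library(dplyr)

# The five Los Angeles rows, copied from the filter() output above.
la <- data.frame(
  county    = rep("Los Angeles", 5),
  party     = c("Democrat", "Democrat",
                "Republican", "Republican", "Republican"),
  candidate = c("Bernie Sanders", "Hillary Clinton",
                "Donald Trump", "John Kasich", "Ted Cruz"),
  votes     = c(434656, 590502, 179130, 33559, 30775),
  stringsAsFactors = FALSE
)

la_winners <- la %>%
  filter(county == "Los Angeles") %>%   # restrict to the county first
  group_by(party) %>%
  filter(votes == max(votes)) %>%       # keep the top row within each party
  ungroup() %>%
  select(party, candidate, votes)

la_winners
```

Applied to `primary_results` instead of `la`, the same pipeline would avoid the separate per-party summaries.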

12. How many people voted for the Republicans and how many for the Democrats in the whole United States?

by_votes_USA<-group_by(primary_results,party)
summarise(by_votes_USA, sumvotes = sum(votes, na.rm = FALSE))
## # A tibble: 2 × 2
##        party sumvotes
##       <fctr>    <int>
## 1   Democrat 27660501
## 2 Republican 29098686

Nationwide, 29,098,686 people voted in the Republican primaries and 27,660,501 in the Democratic primaries.
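These party totals can be cross-checked against the nationwide per-candidate totals printed in question 11. The sketch below copies those figures by hand from the outputs above (the vector names are illustrative) and verifies that each party's candidates add up to the party total.

```r
# Nationwide per-candidate totals, copied from the question 11 outputs.
democrat_totals   <- c(15692452, 11959102, 8152, 752, 43)
republican_totals <- c(13302541, 7603006, 4159949, 3321076, 564553,
                       94411, 24353, 15191, 8479, 3345, 1782)

# Party totals, copied from the question 12 output.
party_totals <- c(Democrat = 27660501, Republican = 29098686)

# TRUE when the candidate totals add up to the party total.
checks <- c(
  Democrat   = sum(democrat_totals)   == party_totals[["Democrat"]],
  Republican = sum(republican_totals) == party_totals[["Republican"]]
)

checks
```

Both checks come out TRUE, which confirms that the two summaries are consistent with each other.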