Hood’s Texas Brigade Dataset Collection:

These are the CSV files I created from your Excel spreadsheets. They only load the data for everything below, so you can ignore this part.

HTB_Vets <- read.csv("Compare_HTB_Vets_R.csv")
Stats_1860 <- read.csv("1860_Stats_R.csv")
Stats_1870 <- read.csv("1870_Stats_R.csv")
htb_all <- read.csv("htb_all.csv", stringsAsFactors = FALSE)

R Libraries:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)
library(stringr)
library(USAboundaries)
library(maptools)
## Loading required package: sp
## Checking rgeos availability: TRUE
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
## 
##     date
library(broom)
library(readr)
library(scales)
## 
## Attaching package: 'scales'
## The following objects are masked from 'package:readr':
## 
##     col_factor, col_numeric
library(plyr)
## -------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## -------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following object is masked from 'package:lubridate':
## 
##     here
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize

These graphs start with the data collected from your 10% sample.

ggplot(HTB_Vets, aes(x= en_state)) +
  geom_bar(width = 1) +
  labs(title = "States Where HTB Enlisted",
       x= "States Enlisted")

ggplot(HTB_Vets, aes(x = rank, y = en_state)) +
  geom_count(color = "blue") +
  labs(title = "Enlistment Rank Divided By State",
       x= "Rank",
       y= "States Enlisted")

Average Total Wealth of Sample in 1860:

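This code shows that the average total wealth of your sample was 11451.19 in 1860. The sum is divided by every row in HTB_Vets, so men with no recorded value count as zero in this average.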

wealth1 <- as.character(HTB_Vets$total_1860)
wealth1a <- as.numeric(wealth1)
## Warning: NAs introduced by coercion
sum_wealth1 <- sum(wealth1a, na.rm = TRUE)

mean_wealth1 <- sum_wealth1/nrow(HTB_Vets)

mean_wealth1
## [1] 11451.19
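
Because the calculation above divides by every row, it may help to compare it against an average over only the men who have a recorded value. This is a minimal sketch with made-up toy values (not your data) to show how far the two averages can differ. The variable names here are just for illustration.

```r
# Hypothetical toy values standing in for HTB_Vets$total_1860 (not real data).
total_1860 <- c("1200", "", "300", "unknown", "4500")

# Blank or non-numeric entries become NA (the coercion warning is suppressed).
wealth <- suppressWarnings(as.numeric(total_1860))

# Average over every man in the sample: missing values effectively count as 0.
mean_all_rows <- sum(wealth, na.rm = TRUE) / length(wealth)

# Average over only the men with a recorded value.
mean_recorded <- mean(wealth, na.rm = TRUE)

# How many men have a recorded value at all.
n_recorded <- sum(!is.na(wealth))

print(c(all_rows = mean_all_rows, recorded_only = mean_recorded, n = n_recorded))
```

Which average is the right one depends on whether a blank means "no wealth" or "not found in the census", so that is your call as the historians.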

Average Total Wealth of Sample in 1870:

This code shows that the average total wealth of your 10% sample in 1870 was 2966.068, which means they took a MAJOR hit. You already knew their total wealth went down, but the average net loss was about 8485.12 per man (11451.19 - 2966.068). These are still averages and not individual comparisons (which we can do!), but it is still really telling.

wealth2 <- as.character(HTB_Vets$total_1870)

wealth2a <- as.numeric(wealth2)

sum_wealth2 <- sum(wealth2a, na.rm = TRUE)

mean_wealth2 <- sum_wealth2/nrow(HTB_Vets)

mean_wealth2
## [1] 2966.068
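
An individual comparison could look something like the sketch below. It uses a small made-up data frame rather than HTB_Vets itself; the column names total_1860 and total_1870 match your file, but the values and the helper variable names are hypothetical. It assumes the wealth columns have already been converted to numbers.

```r
# Hypothetical toy rows; only the column names come from HTB_Vets.
vets <- data.frame(
  total_1860 = c(1200, NA, 300, 4500),
  total_1870 = c(400, 150, NA, 1000)
)

# Keep only men with a value in both census years.
both <- vets[!is.na(vets$total_1860) & !is.na(vets$total_1870), ]

# Per-person change: negative means the man lost wealth between 1860 and 1870.
both$change <- both$total_1870 - both$total_1860

# How many men lost wealth, and the typical (median) change.
n_lost <- sum(both$change < 0)
median_change <- median(both$change)

print(both)
print(c(n_lost = n_lost, median_change = median_change))
```

The median is used here because a few very wealthy men can pull an average around a lot, but a mean would work the same way.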

From here on, the data covers ALL of your Texas Brigade soldiers, not just the 10% sample.

Information surrounding Rank:

This just gives you the different ranks listed:

unique(htb_all$rank)
## [1] "N"   "E"   "O"   "HQO" "HQN" "HQE"

How Many of Each Rank? These are the totals of how many soldiers fell into the different ranks:

count(htb_all$rank)
##     x freq
## 1   E 1068
## 2 HQE   11
## 3 HQN   11
## 4 HQO   31
## 5   N  144
## 6   O   70
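
If percentages are easier to read than raw totals, the counts above can be turned into shares. This sketch simply re-types the counts from the table above, so it does not need the CSV file. The variable names are my own.

```r
# Counts copied from the rank table above.
rank_counts <- c(E = 1068, HQE = 11, HQN = 11, HQO = 31, N = 144, O = 70)

# Total number of soldiers with a rank code.
total_men <- sum(rank_counts)

# Share of each rank as a percentage, rounded to one decimal place.
rank_pct <- round(100 * rank_counts / total_men, 1)

print(total_men)
print(rank_pct)
```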

Number of People who Mustered by Year:

table(year(as.Date(htb_all$muster_date)))
## 
## 1861 1862 1863 
##  849   26    2

Number of People who Enlisted by Year:

table(year(as.Date(htb_all$enlist_date)))
## 
## 1861 1862 1863 1864 
##   54  290   22    3

Wound 1 Data, how many people were injured and in what year:

table(year(as.Date(htb_all$wound_date_1)))
## 
## 1862 1863 1864 1865 
##  330   88   83    1

Wound 2 Data, how many people were injured a second time and in what year:

table(year(as.Date(htb_all$wound_date_2)))
## 
## 1862 1863 1864 1865 
##   50   19   59    1

Wound 3 Data, how many people were injured a third time and in what year:

table(year(as.Date(htb_all$wound_date_3)))
## 
## 1863 1864 
##    5   15

Wound 4 Data, how many people were injured a fourth time and in what year:

table(year(as.Date(htb_all$wound_date_4)))
## 
## 1863 1864 
##    1    5

How many people went AWOL and in what year?

table(year(as.Date(htb_all$awol_date)))
## 
## 1861 1862 1863 1864 1865 
##    1    8   10    1    1

How many people went AWOL twice and in what year? Just 1!

table(year(as.Date(htb_all$awol_2_date)))
## 
## 1864 
##    1

How many people deserted and in what year:

table(year(as.Date(htb_all$desert_date)))
## 
## 1861 1862 1863 1864 1865 
##   14   29   17   26    2

Total number of deaths and the years of death. These numbers seem really low, although there are quite a few men in the data sheets that you guys haven’t been able to locate information on yet. So these numbers will go up for sure:

table(year(as.Date(htb_all$death_date)))
## 
## 1861 1862 1863 1864 1865 
##   37  198   79   40    4
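
All of the year-by-year tables above repeat the same pattern, so a small helper could cut down on the copy and paste. This is only a sketch: the function name year_table is hypothetical, it uses base R instead of lubridate, and it assumes the dates are stored as "YYYY-MM-DD" text. The toy dates are made up.

```r
# Helper (hypothetical name): count the non-missing dates in a column by year.
# Assumes dates are stored as "YYYY-MM-DD" text; anything else becomes NA.
year_table <- function(dates) {
  parsed <- as.Date(dates, format = "%Y-%m-%d")
  # format() pulls out the four-digit year; table() drops the NA entries.
  table(format(parsed, "%Y"))
}

# Hypothetical toy column standing in for something like htb_all$wound_date_1.
toy_dates <- c("1862-06-27", "1862-09-17", NA, "1863-07-02", "")

yt <- year_table(toy_dates)
print(yt)
```

With the real data, the call would look like year_table(htb_all$death_date), and so on for each date column.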

Here is the big data you were looking for. In the spreadsheets you guys supplied, I noticed that whenever y’all left a man’s “enlist” date empty, you normally had his “muster” date. So instead of just giving you the enlist vs. desert stats, I also ran the muster vs. desert stats. The first list is your muster comparison: 55 of the deserters mustered in 1861, and only one mustered in 1862. The second list is by enlistment date: 5 of the deserters enlisted in 1861, 20 in 1862, 2 in 1863, and 1 in 1864.

deserters <- htb_all %>%
  filter(!is.na(desert_date))

test <- as.Date(deserters$desert_date)
test2 <- year(test)

table(year(as.Date(deserters$muster_date)))
## 
## 1861 1862 
##   55    1
table(year(as.Date(deserters$enlist_date)))
## 
## 1861 1862 1863 1864 
##    5   20    2    1
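
Since most men seem to have either an enlist date or a muster date, one option is to fold the two into a single "entry" date and compare that against the year of desertion. This is not what the code above does; it is a sketch of an alternative on made-up rows. The column names match htb_all, but the values and the name entry_date are hypothetical, and the dates are assumed to be "YYYY-MM-DD" text.

```r
# Hypothetical toy rows; only the column names come from htb_all.
deserters_toy <- data.frame(
  enlist_date = c(NA, "1862-03-15", NA, "1863-01-10"),
  muster_date = c("1861-08-01", NA, "1861-09-20", "1863-02-01"),
  desert_date = c("1862-05-01", "1863-07-10", "1864-02-02", "1864-06-30"),
  stringsAsFactors = FALSE
)

# Use the enlistment date when present, otherwise fall back to the muster date.
entry_date <- ifelse(is.na(deserters_toy$enlist_date),
                     deserters_toy$muster_date,
                     deserters_toy$enlist_date)

# Pull the four-digit year out of each date.
entry_year  <- format(as.Date(entry_date, format = "%Y-%m-%d"), "%Y")
desert_year <- format(as.Date(deserters_toy$desert_date, format = "%Y-%m-%d"), "%Y")

# Rows: year the man joined; columns: year he deserted.
entry_vs_desert <- table(entry = entry_year, deserted = desert_year)
print(entry_vs_desert)
```

A cross-table like this would show not just when the deserters joined but how long they stayed before leaving.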

Who died of disease vs. wounds vs. accidents? This shows that 56 died of wounds and 22 died of disease. However, you might want to consider whether some of the other causes of death (pneumonia, typhoid fever, dysentery, measles) should be counted under the larger “disease” category. This list needs to be cleaned up, so let me know how you would like me to fix it!

count(htb_all$death_cause)
##                                  x freq
## 1                                     4
## 2                         Apoplexy    1
## 3                           Asthma    3
## 4                      Brain fever    1
## 5                 Brain hemmorhage    1
## 6                       Bronchitis    2
## 7                Chronic hepatitis    1
## 8                   Died of wounds   56
## 9                          Disease   22
## 10                         Drowned    3
## 11                       Dysentery   11
## 12       Gunshot wound to the head    1
## 13                             KIA  142
## 14   Killed before joining company    1
## 15                    Lung Trouble    1
## 16                         Measles    6
## 17                       Pneumonia   23
## 18        Pneumonia; Typhoid fever    1
## 19               Probably of wound    1
## 20                        RR wreck    1
## 21 Rumored to have died of disease    1
## 22                    Tuberculosis    2
## 23                   Typhoid fever   18
## 24                          Ulcers    1
## 25                         Unknown    1
## 26                           Wound    1
## 27            Wound; Typhoid fever    1
## 28                            <NA> 1028
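
One possible way to clean up this list is to map each cause onto a handful of broader groups. The sketch below is only a starting point: the function name, the group names, and the matching rules are all placeholders of mine, and you would need to decide where the ambiguous entries (for example "Wound; Typhoid fever") belong.

```r
# Hypothetical grouping rules; group names and patterns are placeholders to adjust.
# Later rules override earlier ones, so "Wound; Typhoid fever" ends up as "Wounds".
group_death_cause <- function(cause) {
  cause_lower <- tolower(cause)
  grouped <- rep("Other/unknown", length(cause))
  disease_pattern <- paste(
    "disease", "fever", "pneumonia", "dysentery", "measles",
    "bronchitis", "tuberculosis", "hepatitis", "asthma",
    sep = "|"
  )
  grouped[grepl(disease_pattern, cause_lower)] <- "Disease"
  grouped[grepl("wound", cause_lower)] <- "Wounds"
  grouped[grepl("drown|wreck", cause_lower)] <- "Accident"
  grouped[cause_lower %in% "kia"] <- "KIA"
  # Blank or missing causes stay missing rather than being counted as a group.
  grouped[is.na(cause) | cause == ""] <- NA
  grouped
}

# A few labels taken from the listing above, plus a blank and a missing value.
causes <- c("Pneumonia", "KIA", "Died of wounds", "Drowned", "Typhoid fever",
            "", NA, "Wound; Typhoid fever", "Apoplexy")

grouped_causes <- group_death_cause(causes)
print(table(grouped_causes))
```

On the full data the call would be table(group_death_cause(htb_all$death_cause)), once the grouping rules match how you want the causes classified.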

Also, according to calculations that we did not need R for: there are only 5 men who re-enlisted. Of those 5, 3 were wounded at Second Manassas and re-enlisted!
