---
title: "US Military Fatal Casualties During the Vietnam War"
subtitle: "Daily Graph Series"
author: "Nguyen Chi Dung"
output:
  html_document:
    code_download: yes
    code_folding: hide
    highlight: zenburn
    theme: flatly
    toc: yes
    toc_float: yes
  word_document:
    toc: yes
---

```{r setup,include=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE, fig.retina=2)
```


![](C:\\Users\\Zbook\\Desktop\\pic\\war11.jpg)

# Vietnam War Casualties


Estimates of casualties of the Vietnam War vary widely. Estimates include both civilian and military deaths in North and South Vietnam, Laos, and Cambodia.

The war lasted from 1955 to 1975. Most of the fighting took place in South Vietnam, which accordingly suffered the most casualties. The war also spilled over into neighboring Cambodia and Laos, which endured casualties from aerial and ground fighting.

Civilian deaths caused by both sides made up a significant share of total deaths. Some resulted from assassinations, massacres and terror tactics; others from mortar and artillery fire, extensive aerial bombing, and the use of firepower in military operations conducted in heavily populated areas. One source estimates that some 365,000 Vietnamese civilians died as a result of the war during the period of American involvement.

A number of incidents occurred during the war in which civilians were deliberately targeted or killed. The best-known are the Massacre at Huế and the My Lai massacre.

According to [The Virtual Wall Vietnam Veterans Memorial](http://www.virtualwall.org/), the number of U.S. casualties during the Vietnam War was 58,226.

# Collecting Data for 58,226 US Deaths

In this section I present the R code for collecting detailed data (first name, last name, gender, death date, category, birth date, service number, rank, death location, and other information) on the 58,226 US deaths during the Vietnam War.

```{r, eval=FALSE}

# https://www.honorstates.org/index.php?page=wars&war=Vietnam+Conflict&do=states
# https://laahgp.genealogyvillage.com/MilitaryIndex/vietnam-war-causalities.html
# https://www.jstor.org/stable/2137774?seq=1#page_scan_tab_contents
# https://www.nwitimes.com/the-region-fallen-of-the-vietnam-war/table_1f7fde93-a183-5f1c-ae69-ccac4fdebb5e.html
# http://www.vietnammemorial.com/vietnam-memorial-list-of-heroes.html
# Korean https://www.vvmf.org/Wall-of-Faces/?fbclid=IwAR3JJi8fPklR5QtfRHGfN0oxaDjf4uuHmHFDxulYqICS1aBuzjJIrEmKIqs


# Load some R packages and clear workspace:

rm(list = ls())
library(rvest)
library(xml2)
library(tidyverse)

#=============================================
#    Stage 1:  Get links for all soldiers
#=============================================


get_panel_link <- function(url) {
  
  # Create an html document from the url: 
  webpage <- xml2::read_html(url)
  
  # Extract the URLs: 
  url_ <- webpage %>%
    rvest::html_nodes("a") %>%
    rvest::html_attr("href")
  
  # Extract the link text: 
  link_ <- webpage %>%
    rvest::html_nodes("a") %>%
    rvest::html_text()
  # tibble() replaces the deprecated data_frame():
  return(tibble::tibble(link = link_, url = url_))
}


all_panels <- "http://www.virtualwall.org/iPanels.htm"
panel_links <- get_panel_link(all_panels)


# All suffixes: 

all_suffix <- panel_links$url[str_detect(panel_links$url, "iPanels/ipan")] %>% na.omit()
all_suffix <- str_sub(all_suffix, start = 2)
all_names_InPanel <- paste0("http://www.virtualwall.org", all_suffix)


# All soldier links: 

get_link_for_soldiers <- function(panel_selected) {
  
  soldier_suffix <- get_panel_link(panel_selected)
  
  soldier_suffix %>% 
    filter(!str_detect(url, "misspelledname")) %>% 
    filter(!str_detect(url, "noscriptIndexMenu")) %>% 
    filter(!str_detect(url, "index")) %>% 
    filter(!str_detect(url, "ipan")) %>% 
    mutate(url = paste0("http://www.VirtualWall.org", str_sub(url, start = 3))) %>% 
    rename(name = link, link_for_soldier = url) %>% 
    return()
  
}


lapply(all_names_InPanel, get_link_for_soldiers) -> link_soldiers_list
do.call("bind_rows", link_soldiers_list) -> df_link_soldiers

df_link_soldiers %>% 
  filter(!duplicated(link_for_soldier)) -> df_link_soldiers


# Save our data: 

write.csv(df_link_soldiers, "df_link_soldiers.csv", row.names = FALSE)

#=============================================
#    Stage 2:  Get data for all soldiers
#=============================================

# Function for collecting all data for one soldier: 

collect_all_DataSoldier <- function(my_link) {
  
  pg <- read_html(my_link)
  
  html_text(pg, trim = TRUE) %>% 
    str_split("\n", simplify = TRUE) %>% 
    as.character() %>% 
    str_squish() %>% 
    .[3:27] -> m
  
  
  m %>% 
    str_split("\\/", simplify = TRUE) %>% 
    data.frame() %>% 
    mutate_all(as.character) -> df
  
  
  df_date <- df %>% 
    slice(c(17:19)) %>% 
    mutate(my_date = paste0(X1, X2, X3)) %>% 
    select(my_date, X5)
  
  t(df_date %>% select(my_date)) %>% 
    as.data.frame() -> df_date_soldier
  
  names(df_date_soldier) <- c("birth", "start", "cas_date")
  

  df_remain <- df %>% 
    slice(-c(17:19)) %>% 
    select(X1, X3)
  
  t(df_remain %>% select(X1)) %>% 
    as.data.frame() -> df
  
  
  names(df) <- c(df_remain$X3[1:2], "notKnown", "Grade_at_loss", df_remain$X3[5:22])   
  all_data_soldier <- bind_cols(df, df_date_soldier)
  return(all_data_soldier %>% mutate(link = my_link))
  
}

get_SoldierData <- function(link_selected) {
  return(tryCatch(collect_all_DataSoldier(link_selected), error = function(e) {NULL}))
}


# Use this function: 

df_link_soldiers <- read_csv("df_link_soldiers.csv")
soldier_links <- df_link_soldiers$link_for_soldier
lapply(soldier_links, get_SoldierData) -> all_data_list
save(all_data_list, file = "all_data_list.RData")

do.call("bind_rows", all_data_list) %>% 
  select(-V1) -> all_data_df

# Save our data: 
write.csv(all_data_df, "all_us_deaths_in_Vietnam_war.csv", row.names = FALSE)
```
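
The Stage 2 loop above requests every soldier page in a single `lapply()` call, with no pause between requests and no record of which links failed. A minimal sketch of a gentler pattern follows. The names `fetch_with_retry()`, `max_tries`, `pause`, and the stand-in `fake_fetch()` are hypothetical and are not part of the original script; `fake_fetch()` only simulates failures so that the example runs offline.

```r
# Hypothetical helper: call fetch_fun(link) up to max_tries times,
# pausing `pause` seconds after each failed attempt.
fetch_with_retry <- function(link, fetch_fun, max_tries = 3, pause = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch_fun(link), error = function(e) NULL)
    if (!is.null(result)) {
      return(result)
    }
    Sys.sleep(pause)
  }
  NULL
}

# Stand-in for collect_all_DataSoldier(): fails on the first call for a link,
# succeeds on later calls, and always fails for links containing "broken".
calls <- new.env()
fake_fetch <- function(link) {
  n <- if (is.null(calls[[link]])) 0 else calls[[link]]
  calls[[link]] <- n + 1
  if (grepl("broken", link) || n == 0) stop("request failed")
  data.frame(link = link, stringsAsFactors = FALSE)
}

links <- c("page_a", "page_broken")
results <- lapply(links, fetch_with_retry, fetch_fun = fake_fetch, pause = 0)

# Links that never succeeded can be logged and retried later.
is_failed <- vapply(results, is.null, logical(1))
failed_links <- links[is_failed]
scraped <- do.call(rbind, results[!is_failed])
```

With the real scraper, the same call would presumably look like `lapply(soldier_links, fetch_with_retry, fetch_fun = collect_all_DataSoldier)`, and `failed_links` would give a list of pages to revisit instead of silently dropping them.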

# US Fatal Deaths by Military Rank

![](C:\\Users\\Zbook\\Desktop\\pic\\w1.jpg)


```{r, eval=FALSE}
#==================================================
#  Stage 3: Data-preprocessing and visualization
#==================================================

library(tidyverse)

# Import data (the CSV written at the end of Stage 2, read from the working directory): 
all_data_df <- read_csv("all_us_deaths_in_Vietnam_war.csv")

all_data_df %>% 
  select(1:26) -> df_USdeaths


library(hrbrthemes)
library(scales)
my_colors <- c("#3E606F")
my_font <- "Roboto Condensed"


my_cleanText1 <- function(x) {
  str_replace_all(x, "[^A-Za-z]", " ") %>% 
    str_squish() %>% 
    return()
}



df_USdeaths %>% 
  mutate(Rank = my_cleanText1(Rank)) %>% 
  group_by(Rank) %>% 
  count() %>% 
  ungroup() %>% 
  top_n(20, n) %>% 
  arrange(n) %>% 
  mutate(Rank = factor(Rank, levels = Rank)) %>% 
  mutate(label = comma_format()(n)) -> deaths_byRank

deaths_byRank %>% 
  ggplot(aes(Rank, n)) + 
  geom_col(width = 0.8, fill = "firebrick", color = "firebrick") + 
  coord_flip() + 
  theme_ft_rc() + 
  scale_y_continuous(expand = c(0.015, 0)) + 
  theme(panel.grid = element_blank()) + 
  theme(axis.text.x = element_blank()) + 
  theme(axis.text.y = element_text(color = "white", size = 14, family = my_font)) + 
  theme(plot.margin = unit(c(1.2, 1.2, 1.2, 1.2), "cm")) + 
  geom_text(aes(label = label), hjust = -0.2, color = "white", size = 5, family = my_font) + 
  geom_text(data = deaths_byRank %>% slice(which.max(n)), aes(label = label), hjust = 1.1, color = "white", size = 5, family = my_font) + 
  theme(plot.title = element_text(size = 23)) + 
  theme(plot.subtitle = element_text(size = 14, color = "grey90")) + 
  theme(plot.caption = element_text(size = 12, face = "italic")) + 
  labs(x = NULL, y = NULL, 
       title = "Figure 1: US Fatal Deaths by Military Rank", 
       subtitle = "Note: Top 20 ranks by number of deaths.", 
       caption = "Data Source: http://www.virtualwall.org")


```
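
The chunk above cleans the `Rank` column by replacing every non-letter character with a space and then counts deaths per rank. The sketch below shows the same idea on a toy vector, using base R so that it runs without the scraped CSV. The helper name `clean_text()` and the rank strings are invented for illustration and are not taken from the data.

```r
# Base R equivalent of my_cleanText1(): replace every non-letter with a space,
# then collapse runs of whitespace and trim both ends.
clean_text <- function(x) {
  x <- gsub("[^A-Za-z]", " ", x)
  trimws(gsub("\\s+", " ", x))
}

# Invented rank strings, for illustration only.
toy_ranks <- c("Private First Class", "Private  First Class.", "Sergeant",
               "Sergeant ", "Corporal", "Private First Class")

# Count each cleaned rank and keep the two most frequent ones.
rank_counts <- sort(table(clean_text(toy_ranks)), decreasing = TRUE)
top_two <- head(rank_counts, 2)
```

Because the pattern `[^A-Za-z]` also strips digits, any number embedded in a rank string is lost during cleaning; this is worth keeping in mind if two ranks differ only by a number.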

# US Fatal Deaths by Location

![](C:\\Users\\Zbook\\Desktop\\pic\\w2.jpg)

```{r, eval=FALSE}
df_USdeaths %>% 
  mutate(location = my_cleanText1(location)) %>% 
  mutate(location1 = case_when(str_detect(location, "North Vietnam") ~ paste0("X Province", location), 
                               str_detect(location, "Cambodia") ~ paste0("Cambodia Province", location), 
                               str_detect(location, "Laos") ~ paste0("Laos Province", location), 
                               str_detect(location, "Thailand") ~ paste0("Thailand Province", location), 
                               str_detect(location, "China") ~ paste0("China Province", location), 
                               str_detect(location, "not reported") ~ paste0("Unknown Province", location), 
                               TRUE ~ location)) -> df_USdeaths_location1


df_USdeaths_location1 %>% 
  filter(str_detect(location1, "Province")) %>% 
  pull(location1) %>% 
  str_split("Province", simplify = TRUE) %>% 
  as.data.frame() %>% 
  mutate_all(as.character) %>% 
  mutate_all(str_squish) %>% 
  rename(Province = V1) %>% 
  mutate(Province = case_when(str_detect(Province, "X") ~ "North Vietnam", TRUE ~ Province)) %>% 
  group_by(Province) %>% 
  count() %>% 
  ungroup() %>% 
  arrange(n) %>% 
  mutate(Province = factor(Province, levels = Province)) %>% 
  mutate(bar_color = case_when(str_detect(Province, "North Vietnam") ~ my_colors, 
                               str_detect(Province, "Laos") ~ my_colors,
                               str_detect(Province, "Cambodia") ~ my_colors, 
                               str_detect(Province, "Thailand") ~ my_colors, 
                               TRUE ~ "firebrick")) %>% 
  mutate(label = comma_format()(n)) -> df_death_byProvince



df_death_byProvince %>% 
  ggplot(aes(Province, n)) + 
  geom_col(width = 0.8, fill = "firebrick", color = "firebrick") + 
  coord_flip() + 
  theme_ft_rc() + 
  scale_y_continuous(expand = c(0.01, 0)) + 
  theme(panel.grid = element_blank()) + 
  theme(axis.text.x = element_blank()) + 
  theme(axis.text.y = element_text(color = "white", size = 11, family = my_font)) + 
  theme(plot.margin = unit(c(1.2, 1.2, 1.2, 1.2), "cm")) + 
  geom_text(data = df_death_byProvince %>% slice(1:45), aes(label = label), hjust = -0.2, color = "white", size = 3.5, family = my_font) + 
  geom_text(data = df_death_byProvince %>% slice(46:48), aes(label = label), hjust = 1.1, color = "white", size = 3.5, family = my_font) + 
  theme(plot.title = element_text(size = 23)) + 
  theme(plot.subtitle = element_text(size = 14, color = "grey90")) + 
  theme(plot.caption = element_text(size = 12, face = "italic")) + 
  labs(x = NULL, y = NULL, 
       title = "Figure 2: US Fatal Deaths by Location", 
       subtitle = "Note: Unverified Locations are labelled as Unknown.", 
       caption = "Data Source: http://www.virtualwall.org")


```
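
The chunk above derives a province label by taking the text that precedes the word "Province". For locations outside South Vietnam it first prepends a marker such as "Laos Province", so that the same split works for every row. The sketch below reproduces that logic in base R on a toy vector. The helper `province_label()` and the location strings are hypothetical; the real `location` values may be formatted differently.

```r
# Markers prepended to locations outside South Vietnam, in the same order
# as the case_when() above ("X" stands for North Vietnam).
markers <- c("North Vietnam" = "X", "Cambodia" = "Cambodia", "Laos" = "Laos",
             "Thailand" = "Thailand", "China" = "China",
             "not reported" = "Unknown")

# Hypothetical helper mirroring the chunk above on a character vector.
province_label <- function(location) {
  tagged <- location
  done <- rep(FALSE, length(location))
  for (pattern in names(markers)) {
    # Only the first matching pattern is applied, as in case_when().
    hit <- !done & grepl(pattern, location, fixed = TRUE)
    tagged[hit] <- paste0(markers[[pattern]], " Province", location[hit])
    done <- done | hit
  }
  # Keep only entries containing "Province", as the filter() step does.
  tagged <- tagged[grepl("Province", tagged, fixed = TRUE)]
  # Keep the text before the first "Province" and trim whitespace.
  label <- trimws(sub("Province.*$", "", tagged))
  ifelse(grepl("X", label, fixed = TRUE), "North Vietnam", label)
}

# Invented, already-cleaned location strings.
toy_locations <- c("Quang Tri Province South Vietnam", "North Vietnam",
                   "Laos", "not reported", "Offshore")
labels <- province_label(toy_locations)
province_counts <- table(labels)
```

Entries that contain neither "Province" nor one of the listed patterns (the invented "Offshore" here) are dropped, which is how the original `filter()` step behaves as well.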

# Reflections Of My Life

```{r, echo=FALSE}
library(vembedr)
embed_url("https://www.youtube.com/watch?v=_1lIANRJNbA")
```

The changing

Of sunlight to moonlight

Reflections of my life

Oh how they fill my eyes

The greetings

Of people in trouble

Reflections of my life

Oh how they fill my eyes

All my sorrows

Sad tomorrows

Take me back to my own home

All my cryings

Feel I'm dying, dying

Take me back to my own home

I'm changing, arranging

I'm changing, I'm changing everything

Oh, everything around me

The world is a bad place

A bad place, a terrible place to live

Oh, but I don't wanna die

All my sorrows

Sad tomorrows

Take me back to my own home

All my cryings

Feel I'm dying, dying

Take me back to my own home

All my sorrows

Sad tomorrows

Take me back to my own home

All my cryings...

# To Be Continued


