Report Prepared by Ahmed Muhammad
This report examines the causes of the Civil War by analyzing the declarations issued by Georgia, Mississippi, South Carolina, Texas, and Virginia when they seceded. The Civil War is often said to have been fought over slavery, but historians disagree about what really caused it. The questions below are answered using R.
Each declaration can be broken down into individual words. We can then remove words that add no value to the analysis, compute word frequencies, and present the results visually. Here are the top five words from each declaration.
Georgia:

| word | n |
|---|---|
| states | 28 |
| slavery | 26 |
| government | 22 |
| north | 20 |
| constitution | 16 |

Mississippi:

| word | n |
|---|---|
| union | 6 |
| institution | 4 |
| state | 4 |
| fathers | 3 |
| government | 3 |

South Carolina:

| word | n |
|---|---|
| states | 49 |
| government | 25 |
| constitution | 20 |
| state | 16 |
| united | 12 |

Texas:

| word | n |
|---|---|
| states | 32 |
| slave | 16 |
| holding | 15 |
| federal | 14 |
| texas | 13 |

Virginia:

| word | n |
|---|---|
| constitution | 8 |
| state | 8 |
| virginia | 8 |
| states | 7 |
| people | 6 |

Taken together, these word frequencies suggest that states’ rights were a bigger issue than slavery: words such as “states”, “government”, and “constitution” lead most of the lists. Of course, this doesn’t necessarily mean that the Civil War wasn’t fought to end slavery. It simply means that the rhetoric in these declarations focuses more on states’ rights than on slavery. The states’ true motives could be different, as they often are in politics and diplomacy. Looking at the data alone, though, it seems that the Civil War was fought over states’ rights.
Sentiment analysis works by breaking a text into words and matching them against a lexicon that labels each word as negative or positive. Because words like “miss” can be either positive or negative depending on context, we have to be careful about which lexicon we use. Using a lexicon appropriate for historical documents, we computed the overall sentiment of each of the declarations, shown below.
| State | Score |
|---|---|
| Georgia | -72 |
| Mississippi | -3 |
| South Carolina | -9 |
| Texas | -16 |
| Virginia | 8 |
Four of the five states have a negative sentiment score, which makes sense because war is usually surrounded by violent and angry language.
The first question asked what individual words say about the foremost cause of secession. To answer it, we loaded the declarations from a CSV file into a text table. We then used a function called “unnest_tokens” to break each declaration into individual words so that we could analyze them separately. Finally, we removed stop words, which are words such as ‘the’ and ‘as’ that add no meaningful value to the analysis.
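The tokenizing and stop word steps could look roughly like the sketch below. It is an illustration only, not the code behind this report: the two sentences are made-up placeholders standing in for the CSV contents, the column names `state` and `text` are assumptions, and the report does not say which stop word list was used, so the `stop_words` table bundled with tidytext is assumed here.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for the CSV contents: one row per declaration.
# (Placeholder sentences, not quotations from the real documents.)
declarations <- data.frame(
  state = c("Georgia", "Virginia"),
  text  = c("The people of Georgia and the constitution.",
            "The people of Virginia and the constitution."),
  stringsAsFactors = FALSE
)

# One row per word. By default unnest_tokens() lowercases the text
# and strips punctuation.
words <- declarations %>%
  unnest_tokens(word, text)

# Drop stop words by anti-joining against tidytext's stop_words table
# (an assumed choice of list; other lists would give different results).
content_words <- words %>%
  anti_join(stop_words, by = "word")
```

Which words survive depends entirely on the stop word list, so the choice of list is worth recording alongside any frequency results.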
Next, we used simple count functions and rearranged the data to find the most frequently used words and exactly how many times each was used. We found that words like “state”, “constitution”, and “government” were almost always used more often than words like “slavery”.
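As a rough sketch of the counting step, assuming the words are already in a table with one row per word: the input data and the column names `state` and `word` are hypothetical, and `slice_max()` is just one reasonable way to keep the top five rows per declaration.

```r
library(dplyr)

# Hypothetical tokenized input: one row per (state, word),
# with stop words already removed.
content_words <- data.frame(
  state = c("A", "A", "A", "B", "B", "B"),
  word  = c("union", "union", "state", "people", "state", "people"),
  stringsAsFactors = FALSE
)

top_words <- content_words %>%
  count(state, word, sort = TRUE) %>%          # frequency of each word per state
  group_by(state) %>%
  slice_max(n, n = 5, with_ties = FALSE) %>%   # keep at most five rows per state
  ungroup()
```

Setting `with_ties = FALSE` keeps each list at exactly five rows even when several words share the fifth-highest count; the order among tied words is then arbitrary.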
To produce the graphs and tables, we converted the data into a form that functions like ‘ggplot’ and ‘kable’ could read and display clearly.
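For illustration, the display step might be sketched as follows, using the counts from the last top-five table in this report. The chart type, axis label, and object names are assumptions; the report does not describe how its graphs were styled.

```r
library(ggplot2)
library(knitr)

# Counts copied from the last top-five table in this report.
top_five <- data.frame(
  word = c("constitution", "state", "virginia", "states", "people"),
  n    = c(8, 8, 8, 7, 6)
)

# Markdown table; inside an R Markdown chunk, calling kable() directly
# is enough to render it.
tbl <- kable(top_five)

# Horizontal bar chart with words ordered by count.
p <- ggplot(top_five, aes(x = reorder(word, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Occurrences")
```

Printing `p` draws the chart, and printing `tbl` emits the table markup.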
To answer the question about the overall sentiment of each declaration, we took the unnested version of each declaration and matched its words against a sentiment lexicon. As noted above, words like “miss” can be either positive or negative depending on context, so we had to be careful about which lexicon we used. We then counted the matches, added them up into an overall sentiment score, and displayed the scores in a table using ‘kable’.
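A minimal sketch of the scoring step is shown below. Everything in it is hypothetical: the report does not name its lexicon or its exact scoring rule, so the example uses a tiny made-up lexicon and assumes that each positive match counts as +1 and each negative match as -1. A real analysis would swap in a published lexicon, for example one returned by tidytext's `get_sentiments()`.

```r
library(dplyr)
library(tidytext)

# Hypothetical declarations (placeholder sentences, not real quotations).
declarations <- data.frame(
  state = c("A", "B"),
  text  = c("A hostile and unjust policy has ruined a happy peace.",
            "A free and loyal people enjoy peace."),
  stringsAsFactors = FALSE
)

# Toy lexicon standing in for whichever lexicon the report used.
toy_lexicon <- data.frame(
  word      = c("hostile", "unjust", "ruined", "happy", "peace", "free", "loyal"),
  sentiment = c("negative", "negative", "negative",
                "positive", "positive", "positive", "positive"),
  stringsAsFactors = FALSE
)

scores <- declarations %>%
  unnest_tokens(word, text) %>%                # one row per word
  inner_join(toy_lexicon, by = "word") %>%     # keep only words in the lexicon
  mutate(value = ifelse(sentiment == "positive", 1, -1)) %>%
  group_by(state) %>%
  summarise(score = sum(value), .groups = "drop")  # assumed rule: positives minus negatives
```

Words that do not appear in the lexicon are dropped by the inner join and contribute nothing, so a score reflects only the matched vocabulary.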