PS1.1 When you roll a fair die 3 times, how many possible outcomes are there?

# Let X hold the possible faces of the first roll
# Let Y hold the possible faces of the second roll
# Let Z hold the possible faces of the third roll

X<- matrix(c(1:6))
Y<- matrix(c(1:6), nrow= 1, byrow= FALSE)
Z<- matrix(c(1:6))

# Each roll has 6 possible faces, so the total number of outcomes is 6*6*6 = 216.

# Count the outcomes of three rolls by enumerating them with the variable "count".
# Each outcome "L" is built inside the loop; its print call is commented out.
count <- 0
for (i in 1:6) {
  for (j in 1:6) {
    for (n in 1:6) {
      L <- c(X[i], Y[j], Z[n])
      count <- count + 1
      # print(L)  # Uncomment this line to see all possible outcomes
    }
  }
}
print(count)
## [1] 216
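
As a cross-check on the loop, the same count can be obtained without explicit iteration. The sketch below uses base R's expand.grid to build every ordered triple of faces; the names rolls and n_outcomes are introduced here for illustration and are not part of the original solution.

```r
# Enumerate every ordered triple of faces (illustrative names, not from the original)
rolls <- expand.grid(first = 1:6, second = 1:6, third = 1:6)
n_outcomes <- nrow(rolls)

# The multiplication rule gives the same number directly
stopifnot(n_outcomes == 6^3)
print(n_outcomes)
```

Each row of rolls is one outcome, so nrow(rolls) plays the role of the loop counter above.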

PS1.2 What is the probability of getting a sum total of 3 when you roll a die two times?

# Let X hold the possible faces of the first roll
# Let Y hold the possible faces of the second roll

X<- matrix(c(1:6))
Y<- matrix(c(1:6), nrow= 1, byrow= FALSE)

# Enumerate all outcomes, tracking the total with the variable "count".
count <- 0
sum3 <- 0
for (i in 1:6) {
  for (j in 1:6) {
    z <- c(X[i], Y[j])
    count <- count + 1

    # Count the outcomes where the faces of X and Y sum to 3
    if (X[i] + Y[j] == 3) {
      sum3 <- sum3 + 1
    }
  }
}

print ("The probability of getting a sum total of 3 when you roll a die two times is:")
## [1] "The probability of getting a sum total of 3 when you roll a die two times is:"
(sum3/count)
## [1] 0.05555556
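
The same probability can be sketched without loops by tabulating all 36 sums at once. This assumes nothing beyond base R's outer; the names sums, favorable and p_sum3 are hypothetical and used only for this illustration.

```r
# 6 x 6 matrix whose (i, j) entry is the sum of faces i and j
sums <- outer(1:6, 1:6, "+")

# Only (1, 2) and (2, 1) give a total of 3
favorable <- sum(sums == 3)

# Favorable outcomes divided by all equally likely outcomes
p_sum3 <- favorable / length(sums)
print(p_sum3)
```

With 2 favorable outcomes out of 36, this agrees with the 0.05555556 computed above.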

PS1.3

Assume a room of 25 strangers. What is the probability that two of them have the
same birthday? Assume that all birthdays are equally likely and equal to 1/365
each. What happens to this probability when there are 50 people in the room?

PS1.3 Solution

Fix the first person's birthday; any day works, so this has probability 365/365 = 1.
The second person's birthday either matches the first, with probability 1/365,
or it does not, with probability 364/365.
Given that the first two differ, the third person matches one of them with probability 2/365
and avoids both with probability 363/365.
The no-match probabilities 364/365 and 363/365 can be written as 1 - 1/365 and 1 - 2/365
for persons 2 and 3 respectively.
For persons 4 and 5 they are 1 - 3/365 and 1 - 4/365.
Hence, for k people the probability that no two share a birthday is the product of (1 - i/365) for i = 1 to k-1.
The probability that at least two share a birthday is 1 minus that product.
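birthday <- function(k) {
  # x accumulates the probability that no two of the k people share a birthday
  x <- 1
  # seq_len(k - 1) yields 1..k-1 (note that 1:k-1 would parse as (1:k) - 1)
  for (i in seq_len(k - 1)) {
    x <- x * (1 - i / 365)
  }
  return(1 - x)
}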

# Room of 25 strangers: probability that at least two of them share a birthday
birthday(25)
## [1] 0.5686997
# Room of 50 strangers: probability that at least two of them share a birthday
birthday(50)
## [1] 0.9703736
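
The product in the solution can also be written as a single vectorized expression. The following is a minimal sketch under the same equally-likely-birthdays assumption; birthday_vec and its days parameter are hypothetical names introduced here, not part of the original answer.

```r
# Probability that at least two of k people share a birthday,
# assuming `days` equally likely birthdays (illustrative helper)
birthday_vec <- function(k, days = 365) {
  # prod() over an empty vector is 1, so k = 1 correctly gives 0
  1 - prod(1 - seq_len(k - 1) / days)
}

probs <- sapply(c(25, 50), birthday_vec)
print(round(probs, 7))
```

Going from 25 to 50 people raises the probability from roughly 57% to roughly 97%, matching the two results above.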

Problem Set 2

Write a program to take a document in English and print out the estimated probabilities
for each of the words that occur in that document. Your program should take in a file
containing a large document and write out the probabilities of each of the words that
appear in that document. Please remove all punctuation (quotes, commas, hyphens etc)
and convert the words to lower case before you perform your calculations.
# Load the text file 
datapath <- "C:/CUNY/Courses/IS605/Assignments/Assignment06/assign6.sample.txt"
text <- scan(datapath, what = 'char', sep = '\n')

# Convert to lower case
text <- tolower(text)

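# Split on every run of non-letters; this removes punctuation (quotes, commas, hyphens etc)
# and digits, and also splits contractions such as "don't" into "don" and "t"
text.list <- unlist(strsplit(text, "[^a-z]+"))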

# Remove the empty strings left behind by the split
text.list<- text.list[text.list != ""]

# Compute N, the number of tokens
N<- length(text.list)

# Flatten to a plain vector and tabulate the count of each word
text.vector<-unlist(text.list)
freq.list<-table(text.vector, useNA = "no")

# sort the list 
sorted.freq.list<-sort(freq.list, decreasing=TRUE)

# Just curious... the top 10
top10 <- sorted.freq.list[1:10] 
top10
## text.vector
##  the    a  and  for   in   of   to   is said that 
##   76   45   38   31   28   28   28   22   22   21
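
The tokenizing and counting steps above can be checked on a tiny inline string before running them on a full file. This is only a sketch: sample_text, words and word_probs are made-up names, and the sentence is invented for the example rather than taken from the assignment document.

```r
# Hypothetical sample text standing in for the document file
sample_text <- "The cat sat. The cat ran, the dog sat!"

# Lower-case, split on non-letters, and drop empty strings
words <- unlist(strsplit(tolower(sample_text), "[^a-z]+"))
words <- words[words != ""]

# Estimated probability of each word = its count / total number of tokens
word_probs <- table(words) / length(words)
print(sort(word_probs, decreasing = TRUE))

# Sanity check: the estimated probabilities must sum to 1
stopifnot(isTRUE(all.equal(sum(word_probs), 1)))
```

Here "the" occurs 3 times among 9 tokens, so its estimated probability is 3/9; the same sum-to-one check applies to the full document.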
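# The probabilities of each of the words that appear in the document.
# as.vector() drops the table class so each column is a plain vector.
data.frame(word = names(sorted.freq.list),
           freq = as.vector(sorted.freq.list),
           probability = as.vector(sorted.freq.list) / N,
           row.names = names(sorted.freq.list))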
##                              word freq  probability
## the                           the   76 0.0558412932
## a                               a   45 0.0330639236
## and                           and   38 0.0279206466
## for                           for   31 0.0227773696
## in                             in   28 0.0205731080
## of                             of   28 0.0205731080
## to                             to   28 0.0205731080
## is                             is   22 0.0161645849
## said                         said   22 0.0161645849
## that                         that   21 0.0154298310
## it                             it   18 0.0132255694
## s                               s   16 0.0117560617
## tutwiler                 tutwiler   15 0.0110213079
## prison                     prison   13 0.0095518001
## at                             at   12 0.0088170463
## corrections           corrections   10 0.0073475386
## are                           are    9 0.0066127847
## been                         been    9 0.0066127847
## but                           but    9 0.0066127847
## have                         have    9 0.0066127847
## he                             he    9 0.0066127847
## she                           she    9 0.0066127847
## was                           was    9 0.0066127847
## who                           who    9 0.0066127847
## department             department    8 0.0058780309
## with                         with    8 0.0058780309
## has                           has    7 0.0051432770
## prisoners               prisoners    7 0.0051432770
## than                         than    7 0.0051432770
## about                       about    6 0.0044085231
## alabama                   alabama    6 0.0044085231
## an                             an    6 0.0044085231
## as                             as    6 0.0044085231
## conditions             conditions    6 0.0044085231
## had                           had    6 0.0044085231
## just                         just    6 0.0044085231
## justice                   justice    6 0.0044085231
## more                         more    6 0.0044085231
## officers                 officers    6 0.0044085231
## prisons                   prisons    6 0.0044085231
## state                       state    6 0.0044085231
## still                       still    6 0.0044085231
## there                       there    6 0.0044085231
## they                         they    6 0.0044085231
## women                       women    6 0.0044085231
## years                       years    6 0.0044085231
## after                       after    5 0.0036737693
## federal                   federal    5 0.0036737693
## like                         like    5 0.0036737693
## mr                             mr    5 0.0036737693
## on                             on    5 0.0036737693
## only                         only    5 0.0036737693
## we                             we    5 0.0036737693
## be                             be    4 0.0029390154
## by                             by    4 0.0029390154
## get                           get    4 0.0029390154
## government             government    4 0.0029390154
## inmates                   inmates    4 0.0029390154
## its                           its    4 0.0029390154
## million                   million    4 0.0029390154
## report                     report    4 0.0029390154
## sex                           sex    4 0.0029390154
## six                           six    4 0.0029390154
## this                         this    4 0.0029390154
## were                         were    4 0.0029390154
## year                         year    4 0.0029390154
## abuse                       abuse    3 0.0022042616
## among                       among    3 0.0022042616
## better                     better    3 0.0022042616
## change                     change    3 0.0022042616
## crimes                     crimes    3 0.0022042616
## from                         from    3 0.0022042616
## her                           her    3 0.0022042616
## here                         here    3 0.0022042616
## i                               i    3 0.0022042616
## into                         into    3 0.0022042616
## investigation       investigation    3 0.0022042616
## legislature           legislature    3 0.0022042616
## money                       money    3 0.0022042616
## months                     months    3 0.0022042616
## need                         need    3 0.0022042616
## not                           not    3 0.0022042616
## now                           now    3 0.0022042616
## policy                     policy    3 0.0022042616
## remains                   remains    3 0.0022042616
## sexual                     sexual    3 0.0022042616
## since                       since    3 0.0022042616
## system                     system    3 0.0022042616
## t                               t    3 0.0022042616
## thomas                     thomas    3 0.0022042616
## ward                         ward    3 0.0022042616
## almost                     almost    2 0.0014695077
## anything                 anything    2 0.0014695077
## asked                       asked    2 0.0014695077
## bad                           bad    2 0.0014695077
## basics                     basics    2 0.0014695077
## before                     before    2 0.0014695077
## bentley                   bentley    2 0.0014695077
## budget                     budget    2 0.0014695077
## came                         came    2 0.0014695077
## can                           can    2 0.0014695077
## capacity                 capacity    2 0.0014695077
## changing                 changing    2 0.0014695077
## child                       child    2 0.0014695077
## colby                       colby    2 0.0014695077
## commissioner         commissioner    2 0.0014695077
## conviction             conviction    2 0.0014695077
## don                           don    2 0.0014695077
## drug                         drug    2 0.0014695077
## employees               employees    2 0.0014695077
## equal                       equal    2 0.0014695077
## even                         even    2 0.0014695077
## female                     female    2 0.0014695077
## fix                           fix    2 0.0014695077
## governor                 governor    2 0.0014695077
## guard                       guard    2 0.0014695077
## guards                     guards    2 0.0014695077
## health                     health    2 0.0014695077
## highest                   highest    2 0.0014695077
## him                           him    2 0.0014695077
## how                           how    2 0.0014695077
## improve                   improve    2 0.0014695077
## initiative             initiative    2 0.0014695077
## inside                     inside    2 0.0014695077
## interview               interview    2 0.0014695077
## issued                     issued    2 0.0014695077
## items                       items    2 0.0014695077
## january                   january    2 0.0014695077
## julia                       julia    2 0.0014695077
## last                         last    2 0.0014695077
## least                       least    2 0.0014695077
## life                         life    2 0.0014695077
## live                         live    2 0.0014695077
## low                           low    2 0.0014695077
## many                         many    2 0.0014695077
## medical                   medical    2 0.0014695077
## mental                     mental    2 0.0014695077
## most                         most    2 0.0014695077
## ms                             ms    2 0.0014695077
## much                         much    2 0.0014695077
## nation                     nation    2 0.0014695077
## near                         near    2 0.0014695077
## needs                       needs    2 0.0014695077
## no                             no    2 0.0014695077
## offenders               offenders    2 0.0014695077
## one                           one    2 0.0014695077
## organization         organization    2 0.0014695077
## other                       other    2 0.0014695077
## others                     others    2 0.0014695077
## over                         over    2 0.0014695077
## plan                         plan    2 0.0014695077
## policies                 policies    2 0.0014695077
## problems                 problems    2 0.0014695077
## rampant                   rampant    2 0.0014695077
## raped                       raped    2 0.0014695077
## re                             re    2 0.0014695077
## recently                 recently    2 0.0014695077
## reform                     reform    2 0.0014695077
## released                 released    2 0.0014695077
## reports                   reports    2 0.0014695077
## republican             republican    2 0.0014695077
## running                   running    2 0.0014695077
## say                           say    2 0.0014695077
## says                         says    2 0.0014695077
## sentencing             sentencing    2 0.0014695077
## series                     series    2 0.0014695077
## served                     served    2 0.0014695077
## situation               situation    2 0.0014695077
## so                             so    2 0.0014695077
## some                         some    2 0.0014695077
## sometimes               sometimes    2 0.0014695077
## staffing                 staffing    2 0.0014695077
## support                   support    2 0.0014695077
## three                       three    2 0.0014695077
## top                           top    2 0.0014695077
## treatment               treatment    2 0.0014695077
## up                             up    2 0.0014695077
## use                           use    2 0.0014695077
## ve                             ve    2 0.0014695077
## very                         very    2 0.0014695077
## washington             washington    2 0.0014695077
## well                         well    2 0.0014695077
## what                         what    2 0.0014695077
## where                       where    2 0.0014695077
## whether                   whether    2 0.0014695077
## abundance               abundance    1 0.0007347539
## abundant                 abundant    1 0.0007347539
## abysmal                   abysmal    1 0.0007347539
## according               according    1 0.0007347539
## across                     across    1 0.0007347539
## act                           act    1 0.0007347539
## acting                     acting    1 0.0007347539
## administration     administration    1 0.0007347539
## aging                       aging    1 0.0007347539
## agreed                     agreed    1 0.0007347539
## alabaster               alabaster    1 0.0007347539
## also                         also    1 0.0007347539
## although                 although    1 0.0007347539
## analyst                   analyst    1 0.0007347539
## angel                       angel    1 0.0007347539
## appalled                 appalled    1 0.0007347539
## appetite                 appetite    1 0.0007347539
## approval                 approval    1 0.0007347539
## april                       april    1 0.0007347539
## arbuthnot               arbuthnot    1 0.0007347539
## argues                     argues    1 0.0007347539
## arise                       arise    1 0.0007347539
## armed                       armed    1 0.0007347539
## assaults                 assaults    1 0.0007347539
## assistant               assistant    1 0.0007347539
## attention               attention    1 0.0007347539
## attorney                 attorney    1 0.0007347539
## autopsy                   autopsy    1 0.0007347539
## average                   average    1 0.0007347539
## aware                       aware    1 0.0007347539
## back                         back    1 0.0007347539
## backward                 backward    1 0.0007347539
## banned                     banned    1 0.0007347539
## basic                       basic    1 0.0007347539
## bathtub                   bathtub    1 0.0007347539
## beaten                     beaten    1 0.0007347539
## because                   because    1 0.0007347539
## began                       began    1 0.0007347539
## beginning               beginning    1 0.0007347539
## believe                   believe    1 0.0007347539
## beyond                     beyond    1 0.0007347539
## bigger                     bigger    1 0.0007347539
## birth                       birth    1 0.0007347539
## blind                       blind    1 0.0007347539
## bodies                     bodies    1 0.0007347539
## botched                   botched    1 0.0007347539
## both                         both    1 0.0007347539
## box                           box    1 0.0007347539
## build                       build    1 0.0007347539
## building                 building    1 0.0007347539
## built                       built    1 0.0007347539
## buried                     buried    1 0.0007347539
## called                     called    1 0.0007347539
## calls                       calls    1 0.0007347539
## cam                           cam    1 0.0007347539
## cameras                   cameras    1 0.0007347539
## candidate               candidate    1 0.0007347539
## capita                     capita    1 0.0007347539
## care                         care    1 0.0007347539
## case                         case    1 0.0007347539
## caution                   caution    1 0.0007347539
## chairman                 chairman    1 0.0007347539
## challenging           challenging    1 0.0007347539
## charlotte               charlotte    1 0.0007347539
## choices                   choices    1 0.0007347539
## citizens                 citizens    1 0.0007347539
## civil                       civil    1 0.0007347539
## clean                       clean    1 0.0007347539
## clinical                 clinical    1 0.0007347539
## cologne                   cologne    1 0.0007347539
## coming                     coming    1 0.0007347539
## committee               committee    1 0.0007347539
## commodity               commodity    1 0.0007347539
## congress                 congress    1 0.0007347539
## constitutional     constitutional    1 0.0007347539
## contact                   contact    1 0.0007347539
## contraband             contraband    1 0.0007347539
## convicted               convicted    1 0.0007347539
## corners                   corners    1 0.0007347539
## court                       court    1 0.0007347539
## courts                     courts    1 0.0007347539
## created                   created    1 0.0007347539
## crime                       crime    1 0.0007347539
## criminals               criminals    1 0.0007347539
## crisis                     crisis    1 0.0007347539
## culture                   culture    1 0.0007347539
## curb                         curb    1 0.0007347539
## currency                 currency    1 0.0007347539
## custodial               custodial    1 0.0007347539
## damning                   damning    1 0.0007347539
## dangerously           dangerously    1 0.0007347539
## daughter                 daughter    1 0.0007347539
## dealing                   dealing    1 0.0007347539
## death                       death    1 0.0007347539
## december                 december    1 0.0007347539
## defendants             defendants    1 0.0007347539
## deliberate             deliberate    1 0.0007347539
## deprivation           deprivation    1 0.0007347539
## designed                 designed    1 0.0007347539
## disparages             disparages    1 0.0007347539
## do                             do    1 0.0007347539
## document                 document    1 0.0007347539
## doing                       doing    1 0.0007347539
## double                     double    1 0.0007347539
## down                         down    1 0.0007347539
## drowned                   drowned    1 0.0007347539
## drugs                       drugs    1 0.0007347539
## dynamite                 dynamite    1 0.0007347539
## elderly                   elderly    1 0.0007347539
## enough                     enough    1 0.0007347539
## environment           environment    1 0.0007347539
## examiner                 examiner    1 0.0007347539
## exchanged               exchanged    1 0.0007347539
## eyes                         eyes    1 0.0007347539
## f                               f    1 0.0007347539
## faced                       faced    1 0.0007347539
## faces                       faces    1 0.0007347539
## failed                     failed    1 0.0007347539
## family                     family    1 0.0007347539
## far                           far    1 0.0007347539
## favors                     favors    1 0.0007347539
## fearful                   fearful    1 0.0007347539
## few                           few    1 0.0007347539
## filled                     filled    1 0.0007347539
## finally                   finally    1 0.0007347539
## findings                 findings    1 0.0007347539
## food                         food    1 0.0007347539
## former                     former    1 0.0007347539
## forward                   forward    1 0.0007347539
## fresh                       fresh    1 0.0007347539
## gave                         gave    1 0.0007347539
## general                   general    1 0.0007347539
## george                     george    1 0.0007347539
## getting                   getting    1 0.0007347539
## give                         give    1 0.0007347539
## going                       going    1 0.0007347539
## good                         good    1 0.0007347539
## gov                           gov    1 0.0007347539
## grave                       grave    1 0.0007347539
## great                       great    1 0.0007347539
## group                       group    1 0.0007347539
## guidelines             guidelines    1 0.0007347539
## gun                           gun    1 0.0007347539
## half                         half    1 0.0007347539
## happened                 happened    1 0.0007347539
## harassed                 harassed    1 0.0007347539
## helped                     helped    1 0.0007347539
## highly                     highly    1 0.0007347539
## hire                         hire    1 0.0007347539
## hired                       hired    1 0.0007347539
## his                           his    1 0.0007347539
## home                         home    1 0.0007347539
## ignoring                 ignoring    1 0.0007347539
## important               important    1 0.0007347539
## improved                 improved    1 0.0007347539
## included                 included    1 0.0007347539
## includes                 includes    1 0.0007347539
## including               including    1 0.0007347539
## indifference         indifference    1 0.0007347539
## indigent                 indigent    1 0.0007347539
## inhumane                 inhumane    1 0.0007347539
## inmate                     inmate    1 0.0007347539
## instead                   instead    1 0.0007347539
## institute               institute    1 0.0007347539
## institutions         institutions    1 0.0007347539
## intervention         intervention    1 0.0007347539
## investigate           investigate    1 0.0007347539
## investigating       investigating    1 0.0007347539
## investigations     investigations    1 0.0007347539
## jail                         jail    1 0.0007347539
## jocelyn                   jocelyn    1 0.0007347539
## judiciary               judiciary    1 0.0007347539
## june                         june    1 0.0007347539
## kim                           kim    1 0.0007347539
## lack                         lack    1 0.0007347539
## larger                     larger    1 0.0007347539
## larry                       larry    1 0.0007347539
## law                           law    1 0.0007347539
## lawyer                     lawyer    1 0.0007347539
## legal                       legal    1 0.0007347539
## legislator             legislator    1 0.0007347539
## less                         less    1 0.0007347539
## level                       level    1 0.0007347539
## levels                     levels    1 0.0007347539
## liberal                   liberal    1 0.0007347539
## likely                     likely    1 0.0007347539
## living                     living    1 0.0007347539
## locked                     locked    1 0.0007347539
## long                         long    1 0.0007347539
## longtime                 longtime    1 0.0007347539
## look                         look    1 0.0007347539
## make                         make    1 0.0007347539
## makeup                     makeup    1 0.0007347539
## male                         male    1 0.0007347539
## management             management    1 0.0007347539
## marginally             marginally    1 0.0007347539
## marked                     marked    1 0.0007347539
## marsha                     marsha    1 0.0007347539
## matter                     matter    1 0.0007347539
## may                           may    1 0.0007347539
## me                             me    1 0.0007347539
## met                           met    1 0.0007347539
## middle                     middle    1 0.0007347539
## minimal                   minimal    1 0.0007347539
## misconduct             misconduct    1 0.0007347539
## monica                     monica    1 0.0007347539
## montgomery             montgomery    1 0.0007347539
## month                       month    1 0.0007347539
## morrison                 morrison    1 0.0007347539
## mother                     mother    1 0.0007347539
## moved                       moved    1 0.0007347539
## murder                     murder    1 0.0007347539
## named                       named    1 0.0007347539
## national                 national    1 0.0007347539
## never                       never    1 0.0007347539
## new                           new    1 0.0007347539
## nonviolent             nonviolent    1 0.0007347539
## number                     number    1 0.0007347539
## odds                         odds    1 0.0007347539
## offenses                 offenses    1 0.0007347539
## officer                   officer    1 0.0007347539
## officials               officials    1 0.0007347539
## often                       often    1 0.0007347539
## once                         once    1 0.0007347539
## open                         open    1 0.0007347539
## organize                 organize    1 0.0007347539
## original                 original    1 0.0007347539
## out                           out    1 0.0007347539
## overhaul                 overhaul    1 0.0007347539
## overturned             overturned    1 0.0007347539
## own                           own    1 0.0007347539
## page                         page    1 0.0007347539
## paper                       paper    1 0.0007347539
## parole                     parole    1 0.0007347539
## part                         part    1 0.0007347539
## past                         past    1 0.0007347539
## people                     people    1 0.0007347539
## per                           per    1 0.0007347539
## percent                   percent    1 0.0007347539
## period                     period    1 0.0007347539
## personally             personally    1 0.0007347539
## perspective           perspective    1 0.0007347539
## places                     places    1 0.0007347539
## political               political    1 0.0007347539
## practices               practices    1 0.0007347539
## premature               premature    1 0.0007347539
## pressing                 pressing    1 0.0007347539
## primary                   primary    1 0.0007347539
## primitive               primitive    1 0.0007347539
## problem                   problem    1 0.0007347539
## procedures             procedures    1 0.0007347539
## programs                 programs    1 0.0007347539
## project                   project    1 0.0007347539
## prominence             prominence    1 0.0007347539
## promising               promising    1 0.0007347539
## prompt                     prompt    1 0.0007347539
## property                 property    1 0.0007347539
## psychologist         psychologist    1 0.0007347539
## question                 question    1 0.0007347539
## quit                         quit    1 0.0007347539
## raise                       raise    1 0.0007347539
## ranging                   ranging    1 0.0007347539
## rate                         rate    1 0.0007347539
## recent                     recent    1 0.0007347539
## recruiting             recruiting    1 0.0007347539
## rectify                   rectify    1 0.0007347539
## relatives               relatives    1 0.0007347539
## releasing               releasing    1 0.0007347539
## remained                 remained    1 0.0007347539
## repeat                     repeat    1 0.0007347539
## replaced                 replaced    1 0.0007347539
## represents             represents    1 0.0007347539
## request                   request    1 0.0007347539
## rescinding             rescinding    1 0.0007347539
## resellable             resellable    1 0.0007347539
## review                     review    1 0.0007347539
## right                       right    1 0.0007347539
## rights                     rights    1 0.0007347539
## robbery                   robbery    1 0.0007347539
## robert                     robert    1 0.0007347539
## rodney                     rodney    1 0.0007347539
## routinely               routinely    1 0.0007347539
## row                           row    1 0.0007347539
## rules                       rules    1 0.0007347539
## same                         same    1 0.0007347539
## samuels                   samuels    1 0.0007347539
## scrutinizing         scrutinizing    1 0.0007347539
## second                     second    1 0.0007347539
## secure                     secure    1 0.0007347539
## see                           see    1 0.0007347539
## seen                         seen    1 0.0007347539
## sell                         sell    1 0.0007347539
## senate                     senate    1 0.0007347539
## senator                   senator    1 0.0007347539
## sending                   sending    1 0.0007347539
## senior                     senior    1 0.0007347539
## sent                         sent    1 0.0007347539
## sentence                 sentence    1 0.0007347539
## serious                   serious    1 0.0007347539
## services                 services    1 0.0007347539
## serving                   serving    1 0.0007347539
## session                   session    1 0.0007347539
## several                   several    1 0.0007347539
## sexualized             sexualized    1 0.0007347539
## show                         show    1 0.0007347539
## showed                     showed    1 0.0007347539
## showering               showering    1 0.0007347539
## sick                         sick    1 0.0007347539
## soft                         soft    1 0.0007347539
## solution                 solution    1 0.0007347539
## son                           son    1 0.0007347539
## spending                 spending    1 0.0007347539
## split                       split    1 0.0007347539
## spots                       spots    1 0.0007347539
## stacy                       stacy    1 0.0007347539
## step                         step    1 0.0007347539
## stephen                   stephen    1 0.0007347539
## stepped                   stepped    1 0.0007347539
## stetson                   stetson    1 0.0007347539
## stillborn               stillborn    1 0.0007347539
## stockades               stockades    1 0.0007347539
## strikes                   strikes    1 0.0007347539
## strip                       strip    1 0.0007347539
## strong                     strong    1 0.0007347539
## stuff                       stuff    1 0.0007347539
## stupid                     stupid    1 0.0007347539
## take                         take    1 0.0007347539
## tampons                   tampons    1 0.0007347539
## telephone               telephone    1 0.0007347539
## texas                       texas    1 0.0007347539
## their                       their    1 0.0007347539
## them                         them    1 0.0007347539
## then                         then    1 0.0007347539
## these                       these    1 0.0007347539
## thing                       thing    1 0.0007347539
## things                     things    1 0.0007347539
## think                       think    1 0.0007347539
## third                       third    1 0.0007347539
## those                       those    1 0.0007347539
## tied                         tied    1 0.0007347539
## toilet                     toilet    1 0.0007347539
## toting                     toting    1 0.0007347539
## toxic                       toxic    1 0.0007347539
## track                       track    1 0.0007347539
## tracked                   tracked    1 0.0007347539
## transparent           transparent    1 0.0007347539
## troubled                 troubled    1 0.0007347539
## trying                     trying    1 0.0007347539
## two                           two    1 0.0007347539
## unconstitutional unconstitutional    1 0.0007347539
## uncovered               uncovered    1 0.0007347539
## unfolding               unfolding    1 0.0007347539
## uniforms                 uniforms    1 0.0007347539
## using                       using    1 0.0007347539
## violations             violations    1 0.0007347539
## wanted                     wanted    1 0.0007347539
## wants                       wants    1 0.0007347539
## warden                     warden    1 0.0007347539
## watched                   watched    1 0.0007347539
## way                           way    1 0.0007347539
## week                         week    1 0.0007347539
## weighing                 weighing    1 0.0007347539
## which                       which    1 0.0007347539
## while                       while    1 0.0007347539
## whose                       whose    1 0.0007347539
## wide                         wide    1 0.0007347539
## will                         will    1 0.0007347539
## without                   without    1 0.0007347539
## woman                       woman    1 0.0007347539
## wood                         wood    1 0.0007347539
## work                         work    1 0.0007347539
## worked                     worked    1 0.0007347539
## working                   working    1 0.0007347539
## worse                       worse    1 0.0007347539
## would                       would    1 0.0007347539
## yes                           yes    1 0.0007347539
## you                           you    1 0.0007347539

Visualizing the single-word frequencies with the wordcloud package.

library(wordcloud)
## Loading required package: RColorBrewer
library(RColorBrewer)

dm = data.frame(word=names(sorted.freq.list), freq=sorted.freq.list)

# plot wordcloud
wordcloud(dm$word, dm$freq, random.order=FALSE, colors=brewer.pal(8, "Dark2"))

PS 2.2

Extend your program to calculate the probability of two words occurring adjacent to
each other. It should take in a document and two words (say "the" and "for") and compute
the probability of each of the words occurring in the document and the joint probability
of both of them occurring together. The order of the two words is not important.

Solution PS 2.2

If we affix (n - 1) dummy symbols to the start or end of a text,
then the text will contain N n-grams.
Therefore, a text of length N always has N n-grams (for any n).
In our case n = 2 (bigrams), so we need 2 - 1 = 1 dummy symbol.
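
The padding argument and the order-insensitive joint probability can be sketched in R as follows. This is a minimal illustration, not the code that produced the results in this solution: the helper names make_ngrams and adjacent_prob, the "<s>" padding symbol, and the toy token vector are all assumptions made for the example.

```r
# Hypothetical helper (not from the original solution): pad the token vector
# with (n - 1) dummy symbols at the start and return every n-gram as a string.
make_ngrams <- function(tokens, n = 2, pad = "<s>") {
  padded <- c(rep(pad, n - 1), tokens)
  starts <- seq_len(length(padded) - n + 1)
  vapply(starts,
         function(i) paste(padded[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Hypothetical helper: probability of each word, plus the probability that
# the two words occur next to each other in either order.
adjacent_prob <- function(tokens, w1, w2) {
  N <- length(tokens)
  bigrams <- make_ngrams(tokens, n = 2)
  # Count "w1 w2" and "w2 w1" together because order is not important
  joint <- sum(bigrams %in% c(paste(w1, w2), paste(w2, w1)))
  c(p_w1 = sum(tokens == w1) / N,
    p_w2 = sum(tokens == w2) / N,
    p_joint = joint / length(bigrams))
}

# Toy document (assumed data, for illustration only)
tokens <- c("the", "prison", "for", "the", "state",
            "and", "for", "the", "inmates")

# With one dummy symbol, N tokens yield exactly N bigrams
length(make_ngrams(tokens, n = 2)) == length(tokens)

adjacent_prob(tokens, "the", "for")
```

Because of the padding, the single-word and bigram probabilities share the same denominator N. The listing that follows is the original bigram table for the article, not output from this sketch.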
## at tutwiler  6   0.004408523 
## in a 6   0.004408523 
## it s 6   0.004408523 
## in the   5   0.003673769 
## of the   5   0.003673769 
## federal government   4   0.002939015 
## for the  4   0.002939015 
## had been 4   0.002939015 
## he said  4   0.002939015 
## justice department   4   0.002939015 
## of corrections   4   0.002939015 
## she said 4   0.002939015 
## the department   4   0.002939015 
## the federal  4   0.002939015 
## the prison   4   0.002939015 
## to be    4   0.002939015 
## who is   4   0.002939015 
## by a 3   0.002204262 
## corrections officers 3   0.002204262 
## department of    3   0.002204262 
## have been    3   0.002204262 
## more than    3   0.002204262 
## that is  3   0.002204262 
## the justice  3   0.002204262 
## the legislature  3   0.002204262 
## the state    3   0.002204262 
## to get   3   0.002204262 
## a series 2   0.001469508 
## a situation  2   0.001469508 
## a state  2   0.001469508 
## about million    2   0.001469508 
## and abuse    2   0.001469508 
## and other    2   0.001469508 
## and others   2   0.001469508 
## are running  2   0.001469508 
## as the   2   0.001469508 
## asked the    2   0.001469508 
## at least 2   0.001469508 
## at the   2   0.001469508 
## but the  2   0.001469508 
## commissioner he  2   0.001469508 
## conditions for   2   0.001469508 
## corrections to   2   0.001469508 
## don t    2   0.001469508 
## employees have   2   0.001469508 
## equal justice    2   0.001469508 
## for a    2   0.001469508 
## for inmates  2   0.001469508 
## has been 2   0.001469508 
## i ve 2   0.001469508 
## in january   2   0.001469508 
## into tutwiler    2   0.001469508 
## is just  2   0.001469508 
## is sometimes 2   0.001469508 
## issued a 2   0.001469508 
## julia tutwiler   2   0.001469508 
## justice initiative   2   0.001469508 
## last year    2   0.001469508 
## legislature is   2   0.001469508 
## mental health    2   0.001469508 
## mr thomas    2   0.001469508 
## mr ward  2   0.001469508 
## need to  2   0.001469508 
## prison s 2   0.001469508 
## report came  2   0.001469508 
## s a  2   0.001469508 
## s still  2   0.001469508 
## said the 2   0.001469508 
## series of    2   0.001469508 
## sex is   2   0.001469508 
## since the    2   0.001469508 
## six months   2   0.001469508 
## state s  2   0.001469508 
## that s   2   0.001469508 
## that the 2   0.001469508 
## the conditions   2   0.001469508 
## the equal    2   0.001469508 
## the nation   2   0.001469508 
## there is 2   0.001469508 
## they are 2   0.001469508 
## thomas the   2   0.001469508 
## to fix   2   0.001469508 
## to improve   2   0.001469508 
## tutwiler but 2   0.001469508 
## was a    2   0.001469508 
## we need  2   0.001469508 
## we re    2   0.001469508 
## years and    2   0.001469508 
## a bathtub    1   0.0007347539    
## a bigger 1   0.0007347539    
## a box    1   0.0007347539    
## a candidate  1   0.0007347539    
## a clinical   1   0.0007347539    
## a court  1   0.0007347539    
## a culture    1   0.0007347539    
## a daughter   1   0.0007347539    
## a deliberate 1   0.0007347539    
## a document   1   0.0007347539    
## a female 1   0.0007347539    
## a former 1   0.0007347539    
## a good   1   0.0007347539    
## a larger 1   0.0007347539    
## a legal  1   0.0007347539    
## a liberal    1   0.0007347539    
## a life   1   0.0007347539    
## a long   1   0.0007347539    
## a marked 1   0.0007347539    
## a medical    1   0.0007347539    
## a month  1   0.0007347539    
## a mother 1   0.0007347539    
## a murder 1   0.0007347539    
## a page   1   0.0007347539    
## a percent    1   0.0007347539    
## a plan   1   0.0007347539    
## a policy 1   0.0007347539    
## a primitive  1   0.0007347539    
## a prison 1   0.0007347539    
## a rate   1   0.0007347539    
## a republican 1   0.0007347539    
## a senior 1   0.0007347539    
## a strip  1   0.0007347539    
## a system 1   0.0007347539    
## a telephone  1   0.0007347539    
## a third  1   0.0007347539    
## a very   1   0.0007347539    
## a wide   1   0.0007347539    
## a woman  1   0.0007347539    
## about a  1   0.0007347539    
## about because    1   0.0007347539    
## about officers   1   0.0007347539    
## about the    1   0.0007347539    
## abundance of 1   0.0007347539    
## abundant blind   1   0.0007347539    
## abuse not    1   0.0007347539    
## abuse that   1   0.0007347539    
## abuse the    1   0.0007347539    
## abysmal staffing 1   0.0007347539    
## according to 1   0.0007347539    
## across alabama   1   0.0007347539    
## act of   1   0.0007347539    
## acting assistant 1   0.0007347539    
## administration s 1   0.0007347539    
## after a  1   0.0007347539    
## after its    1   0.0007347539    
## after julia  1   0.0007347539    
## after the    1   0.0007347539    
## after two    1   0.0007347539    
## aging prison 1   0.0007347539    
## agreed that  1   0.0007347539    
## alabama a    1   0.0007347539    
## alabama faces    1   0.0007347539    
## alabama more 1   0.0007347539    
## alabama prisons  1   0.0007347539    
## alabama s    1   0.0007347539    
## alabama said 1   0.0007347539    
## alabaster who    1   0.0007347539    
## almost double    1   0.0007347539    
## almost years 1   0.0007347539    
## also asked   1   0.0007347539    
## although the 1   0.0007347539    
## among prisoners  1   0.0007347539    
## among the    1   0.0007347539    
## among them   1   0.0007347539    
## an abundance 1   0.0007347539    
## an act   1   0.0007347539    
## an important 1   0.0007347539    
## an interview 1   0.0007347539    
## an open  1   0.0007347539    
## an unfolding 1   0.0007347539    
## analyst with 1   0.0007347539    
## and as   1   0.0007347539    
## and changing 1   0.0007347539    
## and elderly  1   0.0007347539    
## and for  1   0.0007347539    
## and gave 1   0.0007347539    
## and harassed 1   0.0007347539    
## and have 1   0.0007347539    
## and hire 1   0.0007347539    
## and i    1   0.0007347539    
## and living   1   0.0007347539    
## and look 1   0.0007347539    
## and mental   1   0.0007347539    
## and named    1   0.0007347539    
## and once 1   0.0007347539    
## and only 1   0.0007347539    
## and policies 1   0.0007347539    
## and prison   1   0.0007347539    
## and prisoners    1   0.0007347539    
## and procedures   1   0.0007347539    
## and property 1   0.0007347539    
## and secure   1   0.0007347539    
## and sending  1   0.0007347539    
## and she  1   0.0007347539    
## and staffing 1   0.0007347539    
## and still    1   0.0007347539    
## and tampons  1   0.0007347539    
## and that 1   0.0007347539    
## and the  1   0.0007347539    
## and to   1   0.0007347539    
## and track    1   0.0007347539    
## and what 1   0.0007347539    
## and with 1   0.0007347539    
## angel of 1   0.0007347539    
## anything like    1   0.0007347539    
## anything that    1   0.0007347539    
## appalled at  1   0.0007347539    
## appetite for 1   0.0007347539    
## approval for 1   0.0007347539    
## april six    1   0.0007347539    
## arbuthnot served 1   0.0007347539    
## are better   1   0.0007347539    
## are enough   1   0.0007347539    
## are few  1   0.0007347539    
## are locked   1   0.0007347539    
## are most 1   0.0007347539    
## are not  1   0.0007347539    
## are so   1   0.0007347539    
## argues that  1   0.0007347539    
## arise citizens   1   0.0007347539    
## armed robbery    1   0.0007347539    
## as alabama   1   0.0007347539    
## as far   1   0.0007347539    
## as serious   1   0.0007347539    
## as that  1   0.0007347539    
## assaults and 1   0.0007347539    
## assistant attorney   1   0.0007347539    
## at almost    1   0.0007347539    
## at it    1   0.0007347539    
## attention from   1   0.0007347539    
## attorney general 1   0.0007347539    
## autopsy had  1   0.0007347539    
## average legislator   1   0.0007347539    
## aware of 1   0.0007347539    
## back up  1   0.0007347539    
## backward prison  1   0.0007347539    
## bad right    1   0.0007347539    
## bad that 1   0.0007347539    
## banned items 1   0.0007347539    
## basic needs  1   0.0007347539    
## basics like  1   0.0007347539    
## basics the   1   0.0007347539    
## bathtub but  1   0.0007347539    
## be here  1   0.0007347539    
## be soft  1   0.0007347539    
## be the   1   0.0007347539    
## be transparent   1   0.0007347539    
## beaten and   1   0.0007347539    
## because i    1   0.0007347539    
## been aware   1   0.0007347539    
## been botched 1   0.0007347539    
## been convicted   1   0.0007347539    
## been drowned 1   0.0007347539    
## been in  1   0.0007347539    
## been met 1   0.0007347539    
## been raped   1   0.0007347539    
## been stillborn   1   0.0007347539    
## been years   1   0.0007347539    
## before but   1   0.0007347539    
## before the   1   0.0007347539    
## began its    1   0.0007347539    
## beginning to 1   0.0007347539    
## believe it   1   0.0007347539    
## bentley in   1   0.0007347539    
## bentley that 1   0.0007347539    
## better investigate   1   0.0007347539    
## better this  1   0.0007347539    
## better treatment 1   0.0007347539    
## beyond capacity  1   0.0007347539    
## bigger problem   1   0.0007347539    
## birth to 1   0.0007347539    
## blind spots  1   0.0007347539    
## bodies don   1   0.0007347539    
## botched she  1   0.0007347539    
## both for 1   0.0007347539    
## box of   1   0.0007347539    
## budget session   1   0.0007347539    
## budget the   1   0.0007347539    
## build more   1   0.0007347539    
## building was 1   0.0007347539    
## built in 1   0.0007347539    
## buried him   1   0.0007347539    
## but going    1   0.0007347539    
## but in   1   0.0007347539    
## but it   1   0.0007347539    
## but they 1   0.0007347539    
## but to   1   0.0007347539    
## but tutwiler 1   0.0007347539    
## but women    1   0.0007347539    
## by promising 1   0.0007347539    
## called the   1   0.0007347539    
## calls for    1   0.0007347539    
## cam ward 1   0.0007347539    
## came about   1   0.0007347539    
## came out 1   0.0007347539    
## cameras created  1   0.0007347539    
## can change   1   0.0007347539    
## can then 1   0.0007347539    
## candidate disparages 1   0.0007347539    
## capacity and 1   0.0007347539    
## capacity just    1   0.0007347539    
## capita in    1   0.0007347539    
## care there   1   0.0007347539    
## case of  1   0.0007347539    
## caution and  1   0.0007347539    
## chairman of  1   0.0007347539    
## challenging mr   1   0.0007347539    
## change . 1   0.0007347539    
## change remains   1   0.0007347539    
## change the   1   0.0007347539    
## changing sentencing  1   0.0007347539    
## changing several 1   0.0007347539    
## charlotte morrison   1   0.0007347539    
## child had    1   0.0007347539    
## child support    1   0.0007347539    
## choices for  1   0.0007347539    
## citizens policy  1   0.0007347539    
## civil rights 1   0.0007347539    
## clean uniforms   1   0.0007347539    
## clinical psychologist    1   0.0007347539    
## colby a  1   0.0007347539    
## colby said   1   0.0007347539    
## cologne anything 1   0.0007347539    
## coming year  1   0.0007347539    
## committee we 1   0.0007347539    
## commodity there  1   0.0007347539    
## conditions and   1   0.0007347539    
## conditions are   1   0.0007347539    
## conditions at    1   0.0007347539    
## conditions remained  1   0.0007347539    
## congress to  1   0.0007347539    
## constitutional violations    1   0.0007347539    
## contact with 1   0.0007347539    
## contraband items 1   0.0007347539    
## convicted of 1   0.0007347539    
## conviction her   1   0.0007347539    
## conviction was   1   0.0007347539    
## corners of   1   0.0007347539    
## corrections argues   1   0.0007347539    
## corrections commissioner 1   0.0007347539    
## corrections employees    1   0.0007347539    
## corrections officer  1   0.0007347539    
## corrections says 1   0.0007347539    
## court agreed 1   0.0007347539    
## courts only  1   0.0007347539    
## created a    1   0.0007347539    
## crime but    1   0.0007347539    
## crimes a 1   0.0007347539    
## crimes since 1   0.0007347539    
## crimes that  1   0.0007347539    
## criminals the    1   0.0007347539    
## crisis even  1   0.0007347539    
## culture of   1   0.0007347539    
## curb it  1   0.0007347539    
## currency for 1   0.0007347539    
## custodial sexual 1   0.0007347539    
## damning investigations   1   0.0007347539    
## dangerously low  1   0.0007347539    
## daughter who 1   0.0007347539    
## dealing with 1   0.0007347539    
## death row    1   0.0007347539    
## december she 1   0.0007347539    
## defendants and   1   0.0007347539    
## deliberate indifference  1   0.0007347539    
## department began 1   0.0007347539    
## department investigation 1   0.0007347539    
## department is    1   0.0007347539    
## department s 1   0.0007347539    
## department who   1   0.0007347539    
## deprivation and  1   0.0007347539    
## designed for 1   0.0007347539    
## disparages criminals 1   0.0007347539    
## do it    1   0.0007347539    
## document from    1   0.0007347539    
## doing this   1   0.0007347539    
## double capacity  1   0.0007347539    
## down and 1   0.0007347539    
## drowned in   1   0.0007347539    
## drug and 1   0.0007347539    
## drug offenders   1   0.0007347539    
## drugs and    1   0.0007347539    
## dynamite the 1   0.0007347539    
## elderly and  1   0.0007347539    
## enough to    1   0.0007347539    
## environment she  1   0.0007347539    
## even so  1   0.0007347539    
## even stacy   1   0.0007347539    
## examiner said    1   0.0007347539    
## exchanged both   1   0.0007347539    
## eyes the 1   0.0007347539    
## f wood   1   0.0007347539    
## faced a  1   0.0007347539    
## faces federal    1   0.0007347539    
## failed to    1   0.0007347539    
## family is    1   0.0007347539    
## far as   1   0.0007347539    
## favors she   1   0.0007347539    
## fearful and  1   0.0007347539    
## federal intervention 1   0.0007347539    
## female corrections   1   0.0007347539    
## female inmate    1   0.0007347539    
## few places   1   0.0007347539    
## filled the   1   0.0007347539    
## finally getting  1   0.0007347539    
## findings he  1   0.0007347539    
## fix alabama  1   0.0007347539    
## fix the  1   0.0007347539    
## food and 1   0.0007347539    
## for about    1   0.0007347539    
## for armed    1   0.0007347539    
## for at   1   0.0007347539    
## for banned   1   0.0007347539    
## for basic    1   0.0007347539    
## for basics   1   0.0007347539    
## for change   1   0.0007347539    
## for changing 1   0.0007347539    
## for civil    1   0.0007347539    
## for custodial    1   0.0007347539    
## for drug 1   0.0007347539    
## for favors   1   0.0007347539    
## for her  1   0.0007347539    
## for many 1   0.0007347539    
## for me   1   0.0007347539    
## for mental   1   0.0007347539    
## for more 1   0.0007347539    
## for most 1   0.0007347539    
## for nonviolent   1   0.0007347539    
## for prison   1   0.0007347539    
## for repeat   1   0.0007347539    
## for that 1   0.0007347539    
## for women    1   0.0007347539    
## former corrections   1   0.0007347539    
## forward it   1   0.0007347539    
## fresh eyes   1   0.0007347539    
## from alabaster   1   0.0007347539    
## from gov 1   0.0007347539    
## from the 1   0.0007347539    
## gave birth   1   0.0007347539    
## general for  1   0.0007347539    
## george a 1   0.0007347539    
## get better   1   0.0007347539    
## get food 1   0.0007347539    
## get makeup   1   0.0007347539    
## get the  1   0.0007347539    
## getting about    1   0.0007347539    
## give corrections 1   0.0007347539    
## going forward    1   0.0007347539    
## good thing   1   0.0007347539    
## gov robert   1   0.0007347539    
## government has   1   0.0007347539    
## government says  1   0.0007347539    
## government to    1   0.0007347539    
## government was   1   0.0007347539    
## governor in  1   0.0007347539    
## governor this    1   0.0007347539    
## grave near   1   0.0007347539    
## great but    1   0.0007347539    
## group even   1   0.0007347539    
## guard and    1   0.0007347539    
## guard rodney 1   0.0007347539    
## guards have  1   0.0007347539    
## guards was   1   0.0007347539    
## guidelines that  1   0.0007347539    
## gun toting   1   0.0007347539    
## had last 1   0.0007347539    
## had sex  1   0.0007347539    
## half the 1   0.0007347539    
## happened at  1   0.0007347539    
## harassed women   1   0.0007347539    
## has faced    1   0.0007347539    
## has improved 1   0.0007347539    
## has since    1   0.0007347539    
## has stepped  1   0.0007347539    
## has the  1   0.0007347539    
## have failed  1   0.0007347539    
## have filled  1   0.0007347539    
## have had 1   0.0007347539    
## have raped   1   0.0007347539    
## have routinely   1   0.0007347539    
## have the 1   0.0007347539    
## he also  1   0.0007347539    
## he has   1   0.0007347539    
## he issued    1   0.0007347539    
## he quit  1   0.0007347539    
## he would 1   0.0007347539    
## health care  1   0.0007347539    
## health services  1   0.0007347539    
## helped prisoners 1   0.0007347539    
## her home 1   0.0007347539    
## her premature    1   0.0007347539    
## her work 1   0.0007347539    
## here for 1   0.0007347539    
## here period  1   0.0007347539    
## here said    1   0.0007347539    
## highest in   1   0.0007347539    
## highest number   1   0.0007347539    
## highly sexualized    1   0.0007347539    
## him down 1   0.0007347539    
## him in   1   0.0007347539    
## hire about   1   0.0007347539    
## hired at 1   0.0007347539    
## his request  1   0.0007347539    
## home a   1   0.0007347539    
## how much 1   0.0007347539    
## how they 1   0.0007347539    
## i wanted 1   0.0007347539    
## ignoring the 1   0.0007347539    
## important commodity  1   0.0007347539    
## improve conditions   1   0.0007347539    
## improve well 1   0.0007347539    
## improved only    1   0.0007347539    
## in after 1   0.0007347539    
## in alabama   1   0.0007347539    
## in an    1   0.0007347539    
## in and   1   0.0007347539    
## in april 1   0.0007347539    
## in child 1   0.0007347539    
## in contact   1   0.0007347539    
## in december  1   0.0007347539    
## in he    1   0.0007347539    
## in institutions  1   0.0007347539    
## in jail  1   0.0007347539    
## in many  1   0.0007347539    
## in may   1   0.0007347539    
## in prisons   1   0.0007347539    
## in to    1   0.0007347539    
## included recruiting  1   0.0007347539    
## includes million 1   0.0007347539    
## including some   1   0.0007347539    
## indifference on  1   0.0007347539    
## indigent defendants  1   0.0007347539    
## inhumane for 1   0.0007347539    
## initiative a 1   0.0007347539    
## initiative report    1   0.0007347539    
## inmate there 1   0.0007347539    
## inmates in   1   0.0007347539    
## inmates per  1   0.0007347539    
## inmates to   1   0.0007347539    
## inmates use  1   0.0007347539    
## inside say   1   0.0007347539    
## inside the   1   0.0007347539    
## instead the  1   0.0007347539    
## institute of 1   0.0007347539    
## institutions across  1   0.0007347539    
## intervention and 1   0.0007347539    
## interview has    1   0.0007347539    
## interview ms 1   0.0007347539    
## into treatment   1   0.0007347539    
## investigate and  1   0.0007347539    
## investigating tutwiler   1   0.0007347539    
## investigation in 1   0.0007347539    
## investigation into   1   0.0007347539    
## investigation more   1   0.0007347539    
## investigations into  1   0.0007347539    
## is a 1   0.0007347539    
## is about 1   0.0007347539    
## is among 1   0.0007347539    
## is an    1   0.0007347539    
## is chairman  1   0.0007347539    
## is challenging   1   0.0007347539    
## is dangerously   1   0.0007347539    
## is finally   1   0.0007347539    
## is in    1   0.0007347539    
## is no    1   0.0007347539    
## is not   1   0.0007347539    
## is now   1   0.0007347539    
## is often 1   0.0007347539    
## is only  1   0.0007347539    
## is resellable    1   0.0007347539    
## is serving   1   0.0007347539    
## is still 1   0.0007347539    
## is weighing  1   0.0007347539    
## it calls 1   0.0007347539    
## it don   1   0.0007347539    
## it for   1   0.0007347539    
## it had   1   0.0007347539    
## it has   1   0.0007347539    
## it is    1   0.0007347539    
## it needs 1   0.0007347539    
## it remains   1   0.0007347539    
## it the   1   0.0007347539    
## it to    1   0.0007347539    
## it will  1   0.0007347539    
## it with  1   0.0007347539    
## items like   1   0.0007347539    
## items that   1   0.0007347539    
## its budget   1   0.0007347539    
## its investigation    1   0.0007347539    
## its own  1   0.0007347539    
## its spending 1   0.0007347539    
## jail for 1   0.0007347539    
## january that 1   0.0007347539    
## january the  1   0.0007347539    
## jocelyn samuels  1   0.0007347539    
## judiciary committee  1   0.0007347539    
## june republican  1   0.0007347539    
## just a   1   0.0007347539    
## just at  1   0.0007347539    
## just over    1   0.0007347539    
## just stupid  1   0.0007347539    
## just to  1   0.0007347539    
## just tutwiler    1   0.0007347539    
## kim t    1   0.0007347539    
## lack of  1   0.0007347539    
## larger overhaul  1   0.0007347539    
## larry f  1   0.0007347539    
## law for  1   0.0007347539    
## lawyer with  1   0.0007347539    
## least six    1   0.0007347539    
## least years  1   0.0007347539    
## legal organization   1   0.0007347539    
## legislator it    1   0.0007347539    
## legislature for  1   0.0007347539    
## less than    1   0.0007347539    
## level drug   1   0.0007347539    
## levels abundant  1   0.0007347539    
## liberal policy   1   0.0007347539    
## life at  1   0.0007347539    
## life sentence    1   0.0007347539    
## like an  1   0.0007347539    
## like clean   1   0.0007347539    
## like drugs   1   0.0007347539    
## like this    1   0.0007347539    
## like toilet  1   0.0007347539    
## likely unconstitutional  1   0.0007347539    
## live it  1   0.0007347539    
## live there   1   0.0007347539    
## living with  1   0.0007347539    
## locked up    1   0.0007347539    
## long while   1   0.0007347539    
## longtime warden  1   0.0007347539    
## look at  1   0.0007347539    
## low level    1   0.0007347539    
## low said 1   0.0007347539    
## make their   1   0.0007347539    
## makeup cologne   1   0.0007347539    
## male guards  1   0.0007347539    
## management who   1   0.0007347539    
## many corners 1   0.0007347539    
## many years   1   0.0007347539    
## marginally monica    1   0.0007347539    
## marked grave 1   0.0007347539    
## marsha colby 1   0.0007347539    
## matter he    1   0.0007347539    
## may the  1   0.0007347539    
## me personally    1   0.0007347539    
## medical and  1   0.0007347539    
## medical examiner 1   0.0007347539    
## met by   1   0.0007347539    
## middle of    1   0.0007347539    
## million for  1   0.0007347539    
## million less 1   0.0007347539    
## million more 1   0.0007347539    
## million of   1   0.0007347539    
## minimal the  1   0.0007347539    
## misconduct he    1   0.0007347539    
## money and    1   0.0007347539    
## money are    1   0.0007347539    
## money she    1   0.0007347539    
## monica washington    1   0.0007347539    
## montgomery the   1   0.0007347539    
## month in 1   0.0007347539    
## months after 1   0.0007347539    
## months appalled  1   0.0007347539    
## months in    1   0.0007347539    
## more female  1   0.0007347539    
## more money   1   0.0007347539    
## more prisons 1   0.0007347539    
## morrison a   1   0.0007347539    
## most likely  1   0.0007347539    
## most of  1   0.0007347539    
## mother of    1   0.0007347539    
## moved to 1   0.0007347539    
## mr bentley   1   0.0007347539    
## ms colby 1   0.0007347539    
## ms washington    1   0.0007347539    
## much a   1   0.0007347539    
## much new 1   0.0007347539    
## murder conviction    1   0.0007347539    
## named after  1   0.0007347539    
## nation no    1   0.0007347539    
## nation now   1   0.0007347539    
## national institute   1   0.0007347539    
## near her 1   0.0007347539    
## near montgomery  1   0.0007347539    
## need just    1   0.0007347539    
## needs like   1   0.0007347539    
## needs million    1   0.0007347539    
## never seen   1   0.0007347539    
## new money    1   0.0007347539    
## no ignoring  1   0.0007347539    
## no one   1   0.0007347539    
## nonviolent offenses  1   0.0007347539    
## not great    1   0.0007347539    
## not just 1   0.0007347539    
## not to   1   0.0007347539    
## now and  1   0.0007347539    
## now as   1   0.0007347539    
## now for  1   0.0007347539    
## number of    1   0.0007347539    
## odds of  1   0.0007347539    
## of a 1   0.0007347539    
## of alabama   1   0.0007347539    
## of approval  1   0.0007347539    
## of assaults  1   0.0007347539    
## of caution   1   0.0007347539    
## of congress  1   0.0007347539    
## of constitutional    1   0.0007347539    
## of damning   1   0.0007347539    
## of deprivation   1   0.0007347539    
## of dynamite  1   0.0007347539    
## of his   1   0.0007347539    
## of inmates   1   0.0007347539    
## of its   1   0.0007347539    
## of prison    1   0.0007347539    
## of sexual    1   0.0007347539    
## of six   1   0.0007347539    
## of support   1   0.0007347539    
## of troubled  1   0.0007347539    
## of years 1   0.0007347539    
## offenders into   1   0.0007347539    
## offenders releasing  1   0.0007347539    
## offenses that    1   0.0007347539    
## officer who  1   0.0007347539    
## officers a   1   0.0007347539    
## officers have    1   0.0007347539    
## officers pressing    1   0.0007347539    
## officers she 1   0.0007347539    
## officers the 1   0.0007347539    
## officers were    1   0.0007347539    
## officials and    1   0.0007347539    
## often tied   1   0.0007347539    
## on crime 1   0.0007347539    
## on death 1   0.0007347539    
## on the   1   0.0007347539    
## on tutwiler  1   0.0007347539    
## on whether   1   0.0007347539    
## once helped  1   0.0007347539    
## one in   1   0.0007347539    
## one wants    1   0.0007347539    
## only currency    1   0.0007347539    
## only marginally  1   0.0007347539    
## only one 1   0.0007347539    
## only recently    1   0.0007347539    
## only three   1   0.0007347539    
## open question    1   0.0007347539    
## organization asked   1   0.0007347539    
## organization that    1   0.0007347539    
## organize a   1   0.0007347539    
## original building    1   0.0007347539    
## other basics 1   0.0007347539    
## other top    1   0.0007347539    
## others believe   1   0.0007347539    
## others say   1   0.0007347539    
## out in   1   0.0007347539    
## over a   1   0.0007347539    
## over half    1   0.0007347539    
## overhaul at  1   0.0007347539    
## overturned after 1   0.0007347539    
## own investigation    1   0.0007347539    
## page report  1   0.0007347539    
## paper and    1   0.0007347539    
## parole for   1   0.0007347539    
## part of  1   0.0007347539    
## past week    1   0.0007347539    
## people who   1   0.0007347539    
## per capita   1   0.0007347539    
## percent raise    1   0.0007347539    
## period marsha    1   0.0007347539    
## personally it    1   0.0007347539    
## perspective to   1   0.0007347539    
## places worse 1   0.0007347539    
## plan for 1   0.0007347539    
## plan in  1   0.0007347539    
## policies and 1   0.0007347539    
## policies at  1   0.0007347539    
## policy analyst   1   0.0007347539    
## policy group 1   0.0007347539    
## policy project   1   0.0007347539    
## political prominence 1   0.0007347539    
## practices and    1   0.0007347539    
## premature son    1   0.0007347539    
## pressing the 1   0.0007347539    
## primary by   1   0.0007347539    
## primitive very   1   0.0007347539    
## prison crisis    1   0.0007347539    
## prison for   1   0.0007347539    
## prison guard 1   0.0007347539    
## prison here  1   0.0007347539    
## prison management    1   0.0007347539    
## prison officers  1   0.0007347539    
## prison officials 1   0.0007347539    
## prison problems  1   0.0007347539    
## prison reform    1   0.0007347539    
## prison system    1   0.0007347539    
## prison was   1   0.0007347539    
## prisoners and    1   0.0007347539    
## prisoners are    1   0.0007347539    
## prisoners in 1   0.0007347539    
## prisoners organize   1   0.0007347539    
## prisoners were   1   0.0007347539    
## prisoners which  1   0.0007347539    
## prisoners who    1   0.0007347539    
## prisons are  1   0.0007347539    
## prisons but  1   0.0007347539    
## prisons for  1   0.0007347539    
## prisons in   1   0.0007347539    
## prisons that 1   0.0007347539    
## prisons well 1   0.0007347539    
## problem than 1   0.0007347539    
## problems before  1   0.0007347539    
## problems it  1   0.0007347539    
## procedures among 1   0.0007347539    
## programs instead 1   0.0007347539    
## project a    1   0.0007347539    
## prominence is    1   0.0007347539    
## promising to 1   0.0007347539    
## prompt reform    1   0.0007347539    
## property crimes  1   0.0007347539    
## psychologist who 1   0.0007347539    
## question whether 1   0.0007347539    
## quit after   1   0.0007347539    
## raise and    1   0.0007347539    
## rampant sexual   1   0.0007347539    
## rampant the  1   0.0007347539    
## ranging plan 1   0.0007347539    
## raped beaten 1   0.0007347539    
## raped by 1   0.0007347539    
## rate for 1   0.0007347539    
## re dealing   1   0.0007347539    
## re doing 1   0.0007347539    
## recent reports   1   0.0007347539    
## recently released    1   0.0007347539    
## recently tracked 1   0.0007347539    
## recruiting more  1   0.0007347539    
## rectify the  1   0.0007347539    
## reform it    1   0.0007347539    
## reform yes   1   0.0007347539    
## relatives near   1   0.0007347539    
## released and 1   0.0007347539    
## released in  1   0.0007347539    
## releasing the    1   0.0007347539    
## remained bad 1   0.0007347539    
## remains an   1   0.0007347539    
## remains in   1   0.0007347539    
## remains minimal  1   0.0007347539    
## repeat offenders 1   0.0007347539    
## replaced said    1   0.0007347539    
## report said  1   0.0007347539    
## report to    1   0.0007347539    
## reports of   1   0.0007347539    
## reports on   1   0.0007347539    
## represents indigent  1   0.0007347539    
## republican from  1   0.0007347539    
## republican primary   1   0.0007347539    
## request to   1   0.0007347539    
## rescinding the   1   0.0007347539    
## resellable that  1   0.0007347539    
## review practices 1   0.0007347539    
## right now    1   0.0007347539    
## rights for   1   0.0007347539    
## robbery said 1   0.0007347539    
## robert bentley   1   0.0007347539    
## rodney arbuthnot 1   0.0007347539    
## routinely watched    1   0.0007347539    
## row although 1   0.0007347539    
## rules rescinding 1   0.0007347539    
## running at   1   0.0007347539    
## running it   1   0.0007347539    
## s abysmal    1   0.0007347539    
## s budget 1   0.0007347539    
## s commissioner   1   0.0007347539    
## s how    1   0.0007347539    
## s inhumane   1   0.0007347539    
## s lack   1   0.0007347539    
## s like   1   0.0007347539    
## s prison 1   0.0007347539    
## s prisoners  1   0.0007347539    
## s prisons    1   0.0007347539    
## s problems   1   0.0007347539    
## s stuff  1   0.0007347539    
## said but 1   0.0007347539    
## said charlotte   1   0.0007347539    
## said he  1   0.0007347539    
## said in  1   0.0007347539    
## said it  1   0.0007347539    
## said jocelyn 1   0.0007347539    
## said kim 1   0.0007347539    
## said larry   1   0.0007347539    
## said male    1   0.0007347539    
## said mr  1   0.0007347539    
## said she 1   0.0007347539    
## said state   1   0.0007347539    
## said stephen 1   0.0007347539    
## said still   1   0.0007347539    
## said that    1   0.0007347539    
## said there   1   0.0007347539    
## said they    1   0.0007347539    
## said was 1   0.0007347539    
## said we  1   0.0007347539    
## said were    1   0.0007347539    
## same as  1   0.0007347539    
## samuels the  1   0.0007347539    
## say is   1   0.0007347539    
## say life 1   0.0007347539    
## says conditions  1   0.0007347539    
## says they    1   0.0007347539    
## scrutinizing medical 1   0.0007347539    
## second highest   1   0.0007347539    
## secure contraband    1   0.0007347539    
## see what 1   0.0007347539    
## seen anything    1   0.0007347539    
## sell to  1   0.0007347539    
## senate judiciary 1   0.0007347539    
## senator cam  1   0.0007347539    
## sending low  1   0.0007347539    
## senior lawyer    1   0.0007347539    
## sent a   1   0.0007347539    
## sentence without 1   0.0007347539    
## sentencing guidelines    1   0.0007347539    
## sentencing rules 1   0.0007347539    
## serious as   1   0.0007347539    
## served almost    1   0.0007347539    
## served six   1   0.0007347539    
## services i   1   0.0007347539    
## serving years    1   0.0007347539    
## session working  1   0.0007347539    
## several policies 1   0.0007347539    
## sex among    1   0.0007347539    
## sex with 1   0.0007347539    
## sexual abuse 1   0.0007347539    
## sexual crimes    1   0.0007347539    
## sexual misconduct    1   0.0007347539    
## sexualized environment   1   0.0007347539    
## she and  1   0.0007347539    
## she buried   1   0.0007347539    
## she had  1   0.0007347539    
## she remains  1   0.0007347539    
## she was  1   0.0007347539    
## show sex 1   0.0007347539    
## showed rampant   1   0.0007347539    
## showering and    1   0.0007347539    
## sick and 1   0.0007347539    
## since moved  1   0.0007347539    
## situation as 1   0.0007347539    
## situation where  1   0.0007347539    
## six corrections  1   0.0007347539    
## six served   1   0.0007347539    
## so bad   1   0.0007347539    
## so for   1   0.0007347539    
## soft on  1   0.0007347539    
## solution mr  1   0.0007347539    
## some on  1   0.0007347539    
## some tutwiler    1   0.0007347539    
## sometimes exchanged  1   0.0007347539    
## sometimes the    1   0.0007347539    
## son had  1   0.0007347539    
## spending choices 1   0.0007347539    
## split on 1   0.0007347539    
## spots and    1   0.0007347539    
## stacy george 1   0.0007347539    
## staffing is  1   0.0007347539    
## staffing levels  1   0.0007347539    
## state has    1   0.0007347539    
## state senator    1   0.0007347539    
## state system 1   0.0007347539    
## state where  1   0.0007347539    
## step in  1   0.0007347539    
## stephen stetson  1   0.0007347539    
## stepped in   1   0.0007347539    
## stetson a    1   0.0007347539    
## still fearful    1   0.0007347539    
## still in 1   0.0007347539    
## still inside 1   0.0007347539    
## still investigating  1   0.0007347539    
## still the    1   0.0007347539    
## still these  1   0.0007347539    
## stillborn and    1   0.0007347539    
## stockades for    1   0.0007347539    
## strikes law  1   0.0007347539    
## strip show   1   0.0007347539    
## strong case  1   0.0007347539    
## stuff that   1   0.0007347539    
## stupid mr    1   0.0007347539    
## support for  1   0.0007347539    
## support in   1   0.0007347539    
## system said  1   0.0007347539    
## system that  1   0.0007347539    
## system to    1   0.0007347539    
## t have   1   0.0007347539    
## t matter 1   0.0007347539    
## t thomas 1   0.0007347539    
## take a   1   0.0007347539    
## tampons but  1   0.0007347539    
## telephone interview  1   0.0007347539    
## texas the    1   0.0007347539    
## than a   1   0.0007347539    
## than it  1   0.0007347539    
## than just    1   0.0007347539    
## than last    1   0.0007347539    
## than the 1   0.0007347539    
## than they    1   0.0007347539    
## than women   1   0.0007347539    
## that conditions  1   0.0007347539    
## that happened    1   0.0007347539    
## that has 1   0.0007347539    
## that have    1   0.0007347539    
## that included    1   0.0007347539    
## that includes    1   0.0007347539    
## that it  1   0.0007347539    
## that much    1   0.0007347539    
## that prisoners   1   0.0007347539    
## that report  1   0.0007347539    
## that represents  1   0.0007347539    
## that there   1   0.0007347539    
## that they    1   0.0007347539    
## that uncovered   1   0.0007347539    
## the acting   1   0.0007347539    
## the administration   1   0.0007347539    
## the aging    1   0.0007347539    
## the angel    1   0.0007347539    
## the appetite 1   0.0007347539    
## the autopsy  1   0.0007347539    
## the average  1   0.0007347539    
## the child    1   0.0007347539    
## the coming   1   0.0007347539    
## the conviction   1   0.0007347539    
## the corrections  1   0.0007347539    
## the courts   1   0.0007347539    
## the crimes   1   0.0007347539    
## the employees    1   0.0007347539    
## the family   1   0.0007347539    
## the governor 1   0.0007347539    
## the guard    1   0.0007347539    
## the gun  1   0.0007347539    
## the highest  1   0.0007347539    
## the inmates  1   0.0007347539    
## the julia    1   0.0007347539    
## the june 1   0.0007347539    
## the longtime 1   0.0007347539    
## the middle   1   0.0007347539    
## the national 1   0.0007347539    
## the odds 1   0.0007347539    
## the officers 1   0.0007347539    
## the only 1   0.0007347539    
## the organization 1   0.0007347539    
## the original 1   0.0007347539    
## the part 1   0.0007347539    
## the people   1   0.0007347539    
## the perspective  1   0.0007347539    
## the prisons  1   0.0007347539    
## the recent   1   0.0007347539    
## the report   1   0.0007347539    
## the same 1   0.0007347539    
## the second   1   0.0007347539    
## the senate   1   0.0007347539    
## the sentencing   1   0.0007347539    
## the sick 1   0.0007347539    
## the solution 1   0.0007347539    
## the stockades    1   0.0007347539    
## the things   1   0.0007347539    
## the three    1   0.0007347539    
## the top  1   0.0007347539    
## the toxic    1   0.0007347539    
## the way  1   0.0007347539    
## the women    1   0.0007347539    
## their money  1   0.0007347539    
## them was 1   0.0007347539    
## then sell    1   0.0007347539    
## there are    1   0.0007347539    
## there including  1   0.0007347539    
## there it 1   0.0007347539    
## there ms 1   0.0007347539    
## these bodies 1   0.0007347539    
## they can 1   0.0007347539    
## they get 1   0.0007347539    
## they have    1   0.0007347539    
## they make    1   0.0007347539    
## thing sex    1   0.0007347539    
## things you   1   0.0007347539    
## think that   1   0.0007347539    
## third of 1   0.0007347539    
## this he  1   0.0007347539    
## this is  1   0.0007347539    
## this past    1   0.0007347539    
## this year    1   0.0007347539    
## thomas said  1   0.0007347539    
## those findings   1   0.0007347539    
## three cameras    1   0.0007347539    
## three strikes    1   0.0007347539    
## tied to  1   0.0007347539    
## to a 1   0.0007347539    
## to an    1   0.0007347539    
## to back  1   0.0007347539    
## to better    1   0.0007347539    
## to build 1   0.0007347539    
## to change    1   0.0007347539    
## to curb  1   0.0007347539    
## to give  1   0.0007347539    
## to how   1   0.0007347539    
## to live  1   0.0007347539    
## to prompt    1   0.0007347539    
## to rectify   1   0.0007347539    
## to review    1   0.0007347539    
## to see   1   0.0007347539    
## to step  1   0.0007347539    
## to texas 1   0.0007347539    
## to the   1   0.0007347539    
## toilet paper 1   0.0007347539    
## top of   1   0.0007347539    
## top prison   1   0.0007347539    
## toting governor  1   0.0007347539    
## toxic highly 1   0.0007347539    
## track reports    1   0.0007347539    
## tracked him  1   0.0007347539    
## transparent mr   1   0.0007347539    
## treatment and    1   0.0007347539    
## treatment programs   1   0.0007347539    
## troubled prisons 1   0.0007347539    
## trying to    1   0.0007347539    
## tutwiler a   1   0.0007347539    
## tutwiler are 1   0.0007347539    
## tutwiler has 1   0.0007347539    
## tutwiler in  1   0.0007347539    
## tutwiler prison  1   0.0007347539    
## tutwiler prisoners   1   0.0007347539    
## tutwiler said    1   0.0007347539    
## tutwiler scrutinizing    1   0.0007347539    
## tutwiler showed  1   0.0007347539    
## tutwiler using   1   0.0007347539    
## tutwiler we  1   0.0007347539    
## tutwiler were    1   0.0007347539    
## tutwiler whose   1   0.0007347539    
## two months   1   0.0007347539    
## unconstitutional is  1   0.0007347539    
## uncovered by 1   0.0007347539    
## unfolding justice    1   0.0007347539    
## uniforms at  1   0.0007347539    
## up and   1   0.0007347539    
## up for   1   0.0007347539    
## use about    1   0.0007347539    
## use it   1   0.0007347539    
## using those  1   0.0007347539    
## ve never 1   0.0007347539    
## ve worked    1   0.0007347539    
## very backward    1   0.0007347539    
## very strong  1   0.0007347539    
## violations here  1   0.0007347539    
## wanted an    1   0.0007347539    
## wants to 1   0.0007347539    
## ward a   1   0.0007347539    
## ward and 1   0.0007347539    
## ward said    1   0.0007347539    
## warden and   1   0.0007347539    
## was built    1   0.0007347539    
## was designed 1   0.0007347539    
## was hired    1   0.0007347539    
## was overturned   1   0.0007347539    
## was rampant  1   0.0007347539    
## was released 1   0.0007347539    
## was the  1   0.0007347539    
## washington said  1   0.0007347539    
## washington who   1   0.0007347539    
## watched women    1   0.0007347539    
## way we   1   0.0007347539    
## we think 1   0.0007347539    
## week issued  1   0.0007347539    
## weighing its 1   0.0007347539    
## well before  1   0.0007347539    
## well beyond  1   0.0007347539    
## were beginning   1   0.0007347539    
## were replaced    1   0.0007347539    
## were split   1   0.0007347539    
## were still   1   0.0007347539    
## what can 1   0.0007347539    
## what he  1   0.0007347539    
## where political  1   0.0007347539    
## where sex    1   0.0007347539    
## whether attention    1   0.0007347539    
## whether the  1   0.0007347539    
## which is 1   0.0007347539    
## while said   1   0.0007347539    
## who are  1   0.0007347539    
## who have 1   0.0007347539    
## who sent 1   0.0007347539    
## who she  1   0.0007347539    
## who was  1   0.0007347539    
## whose conditions 1   0.0007347539    
## wide ranging 1   0.0007347539    
## will take    1   0.0007347539    
## with a   1   0.0007347539    
## with arise   1   0.0007347539    
## with fresh   1   0.0007347539    
## with guards  1   0.0007347539    
## with prisoners   1   0.0007347539    
## with relatives   1   0.0007347539    
## with some    1   0.0007347539    
## with the 1   0.0007347539    
## without parole   1   0.0007347539    
## woman called 1   0.0007347539    
## women corrections    1   0.0007347539    
## women do 1   0.0007347539    
## women inside 1   0.0007347539    
## women live   1   0.0007347539    
## women recently   1   0.0007347539    
## women showering  1   0.0007347539    
## wood a   1   0.0007347539    
## work trying  1   0.0007347539    
## worked in    1   0.0007347539    
## working over 1   0.0007347539    
## worse than   1   0.0007347539    
## would use    1   0.0007347539    
## year alabama 1   0.0007347539    
## year it  1   0.0007347539    
## year s   1   0.0007347539    
## year than    1   0.0007347539    
## years according  1   0.0007347539    
## years for    1   0.0007347539    
## years of 1   0.0007347539    
## years since  1   0.0007347539    
## yes we   1   0.0007347539    
## you need 1   0.0007347539    

Visualizing the bigram frequencies using the wordcloud package.

Please note that only high-frequency bigrams are plotted.
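
The long tail of bigrams that occur only once (relative frequency 0.0007347539, roughly 1 in 1361) would clutter a word cloud, which is why a frequency cutoff is useful. The following is a minimal sketch of such a cutoff, not the code used in this analysis. The data frame `bigram_freq`, its columns `bigram`, `count` and `rel_freq`, the threshold `min_count`, and the toy counts are all assumptions made up for illustration. The plot is drawn only if the wordcloud package is installed.

```r
# Toy bigram table: names and counts are illustrative assumptions,
# not the values computed in the analysis above.
bigram_freq <- data.frame(
  bigram = c("in the", "of the", "the state", "prison system", "toilet paper"),
  count  = c(9, 7, 4, 1, 1),
  stringsAsFactors = FALSE
)

# Relative frequency = count of a bigram / total count of all bigrams
bigram_freq$rel_freq <- bigram_freq$count / sum(bigram_freq$count)

# Keep only bigrams that occur at least `min_count` times (hypothetical cutoff)
min_count <- 2
high_freq <- bigram_freq[bigram_freq$count >= min_count, ]

# Draw the word cloud only when the wordcloud package is available
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(
    words        = high_freq$bigram,
    freq         = high_freq$count,
    min.freq     = min_count,     # wordcloud() also drops words below this count
    random.order = FALSE          # place the most frequent bigrams first
  )
}
```

Filtering the data frame before plotting keeps the cutoff explicit and makes the retained bigrams easy to inspect. The `min.freq` argument of `wordcloud()` would apply the same cutoff by itself.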