veggies <- "tomato, cucumber"
nchar(veggies)
## [1] 16
VeggiesandFruits <- c(veggies, "corn, wheat")
nchar(VeggiesandFruits[1])
## [1] 16
nchar(VeggiesandFruits[2])
## [1] 11
VeggiesandFruits[2]
## [1] "corn, wheat"
VeggiesandFruitsPlus <- paste(VeggiesandFruits[1], VeggiesandFruits[2], sep = "++")
nchar(VeggiesandFruitsPlus)
## [1] 29
VeggiesandFruitsPlus <- paste(VeggiesandFruits[1], VeggiesandFruits[2], sep = "+++")
nchar(VeggiesandFruitsPlus)
## [1] 30
substr(veggies, 1, 6)
## [1] "tomato"
substr(VeggiesandFruits, 1, 4)
## [1] "toma" "corn"
substr(VeggiesandFruits, nchar(VeggiesandFruits)-3, nchar(VeggiesandFruits))
## [1] "mber" "heat"
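The last-four-characters pattern above can be wrapped in a small helper. This is only a sketch: `last_chars` is a hypothetical name introduced here for illustration, not a base R function.

```r
# Hypothetical helper: return the last n characters of each string in x.
last_chars <- function(x, n) {
  len <- nchar(x)
  # substr() treats a start position below 1 as 1,
  # so strings shorter than n come back whole.
  substr(x, len - n + 1, len)
}

last_chars(c("tomato, cucumber", "corn, wheat"), 4)
## [1] "mber" "heat"
```

Because `nchar()` and `substr()` are both vectorized, the helper works on a whole character vector at once, just like the call above.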
Splitting and combining strings
Split the veggies into two strings, the first and second favorites. Save the resulting list as veggies.list, and display it (the result should be a list). How do you access your second favorite veggie in this list?
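# IMPORTANT: Sierra has been helping me on these, and their strsplit() is behaving differently than mine. Mine returns a list with only one item in it, and that item is a character vector of two strings:
# veggies.list[[1]]: [1] "tomato" "cucumber"
# Theirs returns a list of 2. (For a single input string, a list of length 1 is what strsplit() gives: one list element per input string.)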
veggies
## [1] "tomato, cucumber"
veggies.list <- strsplit(veggies, ", ")
veggies.list[[1]][2]
## [1] "cucumber"
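The "list with only one item" behavior noted above follows from how `strsplit()` is organized: it returns one list element per input string, and each element holds the pieces of that string. A minimal sketch, re-creating `veggies` so the block runs on its own; the name `both` is an assumption made for this example.

```r
veggies <- "tomato, cucumber"
veggies.list <- strsplit(veggies, ", ")

length(veggies.list)       # 1: one list element per input string
length(veggies.list[[1]])  # 2: the pieces of that one string
lengths(veggies.list)      # number of pieces for each input string
unlist(veggies.list)       # flatten to a plain character vector

# With two input strings, the list has two elements.
both <- strsplit(c(veggies, "corn, wheat"), ", ")
length(both)               # 2
both[[2]][1]               # "corn"
```

So `veggies.list[[1]][2]` first selects the only list element with `[[1]]`, then takes the second piece with `[2]`.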
setwd("C:/Users/ethan/Desktop")
##...I've been using a friend's computer
AustenLines <- readLines("Austen.txt")
Austen.lines (stored here as AustenLines) should be a vector of strings, each element representing a “line” of text. Do a few basic vector operations with it so you can convince yourself of this. What is the 10,000th line? Display the first 50 lines.
AustenLines[10000]
## [1] "of finding him still with them--a hope which, when it proved to be"
AustenLines[1:50]
## [1] ""
## [2] "Project Gutenberg's The Complete Works of Jane Austen, by Jane Austen"
## [3] ""
## [4] "This eBook is for the use of anyone anywhere at no cost and with"
## [5] "almost no restrictions whatsoever. You may copy it, give it away or"
## [6] "re-use it under the terms of the Project Gutenberg License included"
## [7] "with this eBook or online at www.gutenberg.org"
## [8] ""
## [9] ""
## [10] "Title: The Complete Project Gutenberg Works of Jane Austen"
## [11] ""
## [12] "Author: Jane Austen"
## [13] ""
## [14] "Editor: David Widger"
## [15] ""
## [16] "Release Date: January 25, 2010 [EBook #31100]"
## [17] ""
## [18] "Language: English"
## [19] ""
## [20] "Character set encoding: ASCII"
## [21] ""
## [22] "*** START OF THIS PROJECT GUTENBERG EBOOK THE WORKS OF JANE AUSTEN ***"
## [23] ""
## [24] ""
## [25] ""
## [26] ""
## [27] "Produced by many Project Gutenberg volunteers."
## [28] ""
## [29] ""
## [30] ""
## [31] ""
## [32] ""
## [33] ""
## [34] ""
## [35] "THE WORKS OF JANE AUSTEN"
## [36] ""
## [37] ""
## [38] ""
## [39] "Edited by David Widger"
## [40] ""
## [41] "Project Gutenberg Editions"
## [42] ""
## [43] ""
## [44] ""
## [45] " DEDICATION"
## [46] ""
## [47] " This Jane Austen collection"
## [48] " is dedicated to"
## [49] " Alice Goodson [Hart] Woodby"
## [50] ""
length(AustenLines)
## [1] 80478
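One more way to convince yourself that `readLines()` returns one string per line is a round trip through a small file. This is a sketch that does not need Austen.txt; the file contents and the name `toy.lines` are made up for the example.

```r
# Write three lines (one of them empty) to a temporary file, then read them back.
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "", "third line"), tmp)

toy.lines <- readLines(tmp)
is.character(toy.lines)  # TRUE: the result is a character vector
length(toy.lines)        # 3: one element per line, empty lines included
toy.lines[2]             # "": an empty line is kept as an empty string
unlink(tmp)              # remove the temporary file
```

This is also why the first 50 elements of the Austen vector contain so many `""` entries: blank lines in the file are kept as empty strings.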
b. How many characters are in the longest line? Where are the longest lines located?
linLens <- nchar(AustenLines)
max(linLens)
## [1] 74
which(linLens == max(linLens))
## [1] 66986 66987 66997 67000 67005 67012
AustenLines[which(linLens == max(linLens))]
## [1] "problems which delight the cummin-splitters of criticism. In the _Cecilia_"
## [2] "of Madame D'Arblay--the forerunner, if not the model, of Miss Austen--is a"
## [3] "before _Sense and Sensibility_--its original title for several years being"
## [4] "she re-christened _Sense and Sensibility._ This, as we know, was her first"
## [5] "Marianne_ before she changed the title of _First Impressions_, as she well"
## [6] "simply substituted the leading characteristics of her principal personages"
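Comparing against `max()` matters here because several lines tie for the longest. A short sketch on a made-up vector (`toy.lens` is an assumption for this example) shows how it differs from `which.max()`, which reports only the first maximum.

```r
toy.lens <- nchar(c("aa", "bbbb", "c", "dddd"))

which.max(toy.lens)               # 2: only the first of the longest strings
which(toy.lens == max(toy.lens))  # 2 4: every string that ties for longest
```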
mean(linLens)
## [1] 53.34525
length(which(linLens == 0))
## [1] 12601
NewAustenLines <- AustenLines[which(linLens != 0)]
length(AustenLines)
## [1] 80478
length(NewAustenLines)+length(which(linLens == 0))
## [1] 80478
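The empty-line filter can be written a few equivalent ways. The sketch below uses a made-up vector (`toy.lines` and the `kept.*` names are assumptions for this example) and assumes there are no `NA` values, where `which()` and logical indexing would behave differently.

```r
toy.lines <- c("Chapter 1", "", "Sir Walter Elliot", "", "")
toy.lens <- nchar(toy.lines)

kept.which   <- toy.lines[which(toy.lens != 0)]  # index by position, as above
kept.logical <- toy.lines[toy.lens != 0]         # logical index, no which()
kept.nzchar  <- toy.lines[nzchar(toy.lines)]     # nzchar() is TRUE for non-empty strings

identical(kept.which, kept.logical)  # TRUE
identical(kept.which, kept.nzchar)   # TRUE
sum(toy.lens == 0)                   # 3, same count as length(which(toy.lens == 0))
```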
Austen.all <- paste(NewAustenLines, collapse = " ")
nchar(Austen.all)
## [1] 4360995
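As a sanity check, the length of a collapsed string should equal the total characters in the pieces plus one separator between each adjacent pair. The figures above are consistent with this: 80478 - 12601 = 67877 non-empty lines means 67876 separators, and 4360995 - 67876 = 4293119, which agrees with mean(linLens) times the number of lines (about 53.34525 * 80478). A minimal sketch of the same check on a made-up vector (`toy.lines` and `toy.all` are assumptions for this example):

```r
toy.lines <- c("Chapter 1", "Sir Walter Elliot", "of Kellynch Hall")
toy.all <- paste(toy.lines, collapse = " ")

# Characters in the pieces, plus one single-character separator per gap.
expected <- sum(nchar(toy.lines)) + (length(toy.lines) - 1)
nchar(toy.all) == expected  # TRUE
```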
substr(Austen.all, 1, 2000)
## [1] "Project Gutenberg's The Complete Works of Jane Austen, by Jane Austen This eBook is for the use of anyone anywhere at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included with this eBook or online at www.gutenberg.org Title: The Complete Project Gutenberg Works of Jane Austen Author: Jane Austen Editor: David Widger Release Date: January 25, 2010 [EBook #31100] Language: English Character set encoding: ASCII *** START OF THIS PROJECT GUTENBERG EBOOK THE WORKS OF JANE AUSTEN *** Produced by many Project Gutenberg volunteers. THE WORKS OF JANE AUSTEN Edited by David Widger Project Gutenberg Editions DEDICATION This Jane Austen collection is dedicated to Alice Goodson [Hart] Woodby [Note: The accompanying HTML file has active links to all the volumes and chapters in this set.] CONTENTS: PERSUASION NORTHANGER ABBEY MANSFIELD PARK EMMA LADY SUSAN LOVE AND FREINDSHIP AND OTHER EARLY WORKS PRIDE AND PREJUDICE SENSE AND SENSIBILITY PERSUASION by Jane Austen (1818) Chapter 1 Sir Walter Elliot, of Kellynch Hall, in Somersetshire, was a man who, for his own amusement, never took up any book but the Baronetage; there he found occupation for an idle hour, and consolation in a distressed one; there his faculties were roused into admiration and respect, by contemplating the limited remnant of the earliest patents; there any unwelcome sensations, arising from domestic affairs changed naturally into pity and contempt as he turned over the almost endless creations of the last century; and there, if every other leaf were powerless, he could read his own history with an interest which never failed. This was the page at which the favourite volume always opened: \"ELLIOT OF KELLYNCH HALL. \"Walter Elliot, born March 1, 1760, married, July 15, 1784, Elizabeth, daughter of James Stevenson, Esq. of South Park, in the county of"
Austen.words <- strsplit(Austen.all, " ")
length(Austen.words)
## [1] 1
length(Austen.words[[1]])
## [1] 784869
Austen.words.unique <- unique(Austen.words[[1]])
length(Austen.words.unique)
## [1] 44361
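These counts depend on what counts as a "word". Splitting on a single space produces an empty string wherever two spaces sit next to each other, and punctuation stays attached to the word before it. The sketch below uses a made-up sentence (`toy.all`, `words.single`, and `words.runs` are assumptions for this example); whether Austen.all actually contains consecutive spaces is not shown above, so treat this as something to check rather than a correction.

```r
toy.all <- "It is  a truth, universally acknowledged, that a truth"

# Splitting on one space leaves an empty string where two spaces meet.
words.single <- strsplit(toy.all, " ")[[1]]
sum(words.single == "")   # 1

# Splitting on runs of spaces avoids the empty strings.
words.runs <- strsplit(toy.all, " +")[[1]]
sum(words.runs == "")     # 0

# Punctuation stays attached, so "truth," and "truth" are distinct entries.
unique(words.runs)
```

A quick check on the real data would be `sum(Austen.words[[1]] == "")`, which counts how many of the split pieces are empty.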
To be continued! Save your work!