Harold Nelson
9/29/2020
You have a file “Pride and Prejudice.txt” in a folder “Pride and Prejudice”. In that folder, create a Jupyter notebook and write a program that reads the contents of the file line by line into a list called lines. In the reading loop, count the number of lines you read. Compare this count with the length of the list reported by the len() function.
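fh = open("Pride and Prejudice.txt","r")
count = 0
lines = []
# Read the file one line at a time, counting as we go
for line in fh:
    lines.append(line)
    count = count + 1
fh.close()
# The loop counter and the list length should agree
print(count)
print(len(lines))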
## 13427
## 13427
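The same reading loop can also be written with a `with` block, which closes the file automatically. The following is only a minimal sketch: so that it runs anywhere, it first writes a small stand-in file, and the helper name `read_lines`, the temporary file, and the sample text are assumptions for illustration, not part of the exercise.

```python
import os
import tempfile

def read_lines(path):
    """Read a text file line by line; return the lines and a running count."""
    lines = []
    count = 0
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            lines.append(line)
            count += 1
    return lines, count

# Hypothetical stand-in for "Pride and Prejudice.txt"
sample_text = "It is a truth universally acknowledged,\n\nthat a single man\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write(sample_text)
    sample_path = tmp.name

lines, count = read_lines(sample_path)
os.remove(sample_path)  # clean up the stand-in file

# The loop counter and len() should always agree
print(count, len(lines))
```

With the real file, calling `read_lines("Pride and Prejudice.txt")` should give the same two matching numbers as the loop above.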
What is the type of an item in the list?
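One way to check this is with `type()`. The sketch below uses a short made-up list in place of the full `lines` list, so the sample strings are assumptions; a file opened in text mode yields `str` objects, so the real list should behave the same way.

```python
# Stand-in for the list built by the reading loop (sample data, not the novel)
lines = ["Chapter 1\n", "\n", "It is a truth universally acknowledged,\n"]

# Type of a single item
item_type = type(lines[0])
print(item_type)

# Check that every item in the list is a string
all_strings = all(isinstance(line, str) for line in lines)
print(all_strings)
```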
What is the average length of a line?
## 52.445669174052284
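The average is the total number of characters divided by the number of lines. A minimal sketch, again using a small made-up list rather than the novel; note that each line's trailing newline is included in its length.

```python
# Stand-in for the list built by the reading loop (sample data)
lines = ["Chapter 1\n", "\n", "It is a truth\n"]

# Total characters across all lines, newline characters included
total_chars = sum(len(line) for line in lines)

# Average line length
average_length = total_chars / len(lines)
print(average_length)
```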
How many lines have length 0?
How many lines have length 1?
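Both questions can be answered by counting the lines that satisfy a length test. The sketch below uses a made-up list, so its counts are not those of the novel. Because lines read by iterating over a file keep their trailing newline, a line that looks empty in an editor has length 1, and no line read this way has length 0.

```python
# Stand-in for the list built by the reading loop (sample data)
lines = ["Chapter 1\n", "\n", "It is a truth\n", "\n"]

# Count lines of length 0 and of length 1
zero_length = sum(1 for line in lines if len(line) == 0)
one_length = sum(1 for line in lines if len(line) == 1)

print("Lines of length 0:", zero_length)
print("Lines of length 1:", one_length)
```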
Create a list “short_lines” of all the lines with length 1. What is in these lines? Look at the first 3 by printing them out character by character.
short_lines = []
for line in lines:
    if len(line) == 1:
        short_lines.append(line)
for i in range(3):
    print("line",i)
    for j in short_lines[i]:
        print(j)
    print("End of line",i)
    print(" ")
## line 0
##
##
## End of line 0
##
## line 1
##
##
## End of line 1
##
## line 2
##
##
## End of line 2
##
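The output above shows only blank space because the single character in each short line is invisible when printed; it is almost certainly the newline character. Printing with `repr()` makes it visible. A minimal sketch, assuming the short lines are newline-only strings as in this made-up list:

```python
# Stand-in for the first three short lines (assumed to be bare newlines)
short_lines = ["\n", "\n", "\n"]

# repr() shows escape sequences instead of rendering them
for i, line in enumerate(short_lines):
    print("line", i, repr(line))

# Confirm that every short line is exactly one newline character
is_all_newlines = all(line == "\n" for line in short_lines)
print(is_all_newlines)
```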
Consider a word to be defined as any sequence of non-blank characters separated by whitespace. Create a list of all of the words in the file. Use the string method split(). Print out the first 100 words in the list of words. Print the length of the list of words.
words = []
for line in lines:
    lwords = line.split()
    for word in lwords:
        words.append(word)
for i in range(100):
    print(words[i])
## The
## Project
## Gutenberg
## EBook
## of
## Pride
## and
## Prejudice,
## by
## Jane
## Austen
## This
## eBook
## is
## for
## the
## use
## of
## anyone
## anywhere
## at
## no
## cost
## and
## with
## almost
## no
## restrictions
## whatsoever.
## You
## may
## copy
## it,
## give
## it
## away
## or
## re-use
## it
## under
## the
## terms
## of
## the
## Project
## Gutenberg
## License
## included
## with
## this
## eBook
## or
## online
## at
## www.gutenberg.org
## Title:
## Pride
## and
## Prejudice
## Author:
## Jane
## Austen
## Posting
## Date:
## August
## 26,
## 2008
## [EBook
## #1342]
## Release
## Date:
## June,
## 1998
## Last
## Updated:
## March
## 10,
## 2018
## Language:
## English
## Character
## set
## encoding:
## UTF-8
## ***
## START
## OF
## THIS
## PROJECT
## GUTENBERG
## EBOOK
## PRIDE
## AND
## PREJUDICE
## ***
## Produced
## by
## Anonymous
## Volunteers
## PRIDE
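Called with no arguments, `split()` splits on any run of whitespace and returns an empty list for a blank line, so blank lines contribute no words. The sketch below shows this on a few made-up lines and also prints the length of the word list; `extend()` is used here as a shorter alternative to the inner loop.

```python
# Stand-in for the list built by the reading loop (sample data)
lines = ["The  Project Gutenberg\n", "\n", "\tEBook of Pride and Prejudice,\n"]

words = []
for line in lines:
    # split() with no arguments drops leading, trailing and repeated whitespace
    words.extend(line.split())

print(words[:3])
print(len(words))
```

Punctuation stays attached to a word under this definition, which is why “Prejudice,” appears with its comma in the output above.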
Count the number of occurrences of the word “and” in the list of words. Be sure to allow for capitalization. What fraction of all the words is this?
count = 0
for word in words:
    if word.lower() == "and":
        count = count + 1
print("Number of ands is",count)
print("Fraction of ands is",count/len(words))
## Number of ands is 3398
## Fraction of ands is 0.02727301913445486
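The same count can be written as a single expression. The sketch below runs on a small made-up word list, so its numbers are not those of the novel. It also shows a limitation of the word definition used here: a token such as “and,” keeps its comma and is not counted.

```python
# Stand-in for the word list (sample data)
words = ["Pride", "and", "Prejudice,", "And", "sand", "AND", "and,"]

# Case-insensitive count of tokens that are exactly "and"
and_count = sum(1 for word in words if word.lower() == "and")

# Fraction of all words
fraction = and_count / len(words)

print("Number of ands is", and_count)
print("Fraction of ands is", fraction)
```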