Files Exercises

Harold Nelson

9/29/2020

Ex1

You have a file “Pride and Prejudice.txt” in a folder “Pride and Prejudice”. In that folder, create a Jupyter notebook and write a program that reads the contents of the file line by line into a list called lines. Inside the reading loop, count the lines as you read them. Compare this count with the length of the list reported by the len() function.

Answer

fh = open("Pride and Prejudice.txt","r")
count = 0
lines = []
for line in fh:
    lines.append(line)
    count = count + 1
fh.close()
print(len(lines))
## 13427
print(count)
## 13427
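
A with statement closes the file automatically, even if an error occurs inside the loop. The sketch below applies the same counting pattern to a small temporary file; the file name "sample.txt" and its contents are made up so the example runs without the novel, and encoding="utf-8" is an assumption based on the encoding the file's own header reports.

```python
import os
import tempfile

# Hypothetical sample text standing in for the novel.
sample = "first line\n\nthird line\n"

with tempfile.TemporaryDirectory() as folder:
    # Write the sample to a temporary file so there is something to read.
    path = os.path.join(folder, "sample.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(sample)

    count = 0
    lines = []
    # The with block closes the file as soon as the loop finishes.
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            lines.append(line)
            count = count + 1

print(count, len(lines))
```

Both numbers agree because the loop appends exactly one item per line read.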

Ex2

What is the type of an item in the list?

Answer

type(lines[0])
## <class 'str'>

Ex3

What is the average length of a line?

Answer

total = 0
for line in lines:
    total = total + len(line)
avg_length = total/len(lines)
print(avg_length)
## 52.445669174052284
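
The same average can be computed without an explicit loop. The sketch below uses a small made-up list in place of lines. It also computes the average with trailing newline characters removed, since len() counts every character in the string, including any newline at the end.

```python
# Hypothetical stand-in for the list built in Ex1.
lines = ["It is a truth\n", "\n", "universally acknowledged\n"]

# sum() over a generator adds up the line lengths in one pass.
total = sum(len(line) for line in lines)
avg_length = total / len(lines)

# Lengths without the trailing newline, for comparison.
total_visible = sum(len(line.rstrip("\n")) for line in lines)
avg_visible = total_visible / len(lines)

print(avg_length, avg_visible)
```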

Ex4

How many lines have length 0?

Answer

total = 0
for line in lines:
    if len(line) == 0:
        total = total + 1
print(total)
## 0

Ex5

How many lines have length 1?

Answer

total = 0
for line in lines:
    if len(line) == 1:
        total = total + 1
print(total)
## 2394

Ex6

Create a list “short_lines” of all the lines with length 1. What is in these lines? Look at the first 3 by printing them out character by character.

Answer

short_lines = []
for line in lines:
    if len(line) == 1:
        short_lines.append(line)
        
for i in range(3):
    print("line",i)
    for j in short_lines[i]:
        print(j)
    print("End of line",i)
    print(" ")
    
## line 0
## 
## 
## End of line 0
##  
## line 1
## 
## 
## End of line 1
##  
## line 2
## 
## 
## End of line 2
## 
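
Printing a newline character produces only a blank line, so output like the above is hard to interpret. It is consistent with each short line holding a single newline character. A minimal sketch, using a made-up list in place of lines, shows that repr() makes such a character visible and that stripping it leaves an empty string:

```python
# Hypothetical stand-in for the list built in Ex1.
lines = ["Chapter 1\n", "\n", "It is a truth\n", "\n"]

short_lines = [line for line in lines if len(line) == 1]

# repr() shows escape sequences instead of rendering them.
for line in short_lines:
    print(repr(line))

# Removing the newline leaves nothing, so these are blank lines.
stripped = [line.rstrip("\n") for line in short_lines]
print(stripped)
```

This would also explain the result of Ex4: a line read from a file keeps its newline, so a blank line has length 1 rather than 0.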

Ex7

Consider a word to be defined as any sequence of non-blank characters separated by whitespace. Create a list of all of the words in the file. Use the string method split(). Print out the first 100 words in the list of words. Print the length of the list of words.

Answer

words = []
for line in lines:
    lwords = line.split()
    for word in lwords:
        words.append(word)
for i in range(100):
    print(words[i])
## The
## Project
## Gutenberg
## EBook
## of
## Pride
## and
## Prejudice,
## by
## Jane
## Austen
## This
## eBook
## is
## for
## the
## use
## of
## anyone
## anywhere
## at
## no
## cost
## and
## with
## almost
## no
## restrictions
## whatsoever.
## You
## may
## copy
## it,
## give
## it
## away
## or
## re-use
## it
## under
## the
## terms
## of
## the
## Project
## Gutenberg
## License
## included
## with
## this
## eBook
## or
## online
## at
## www.gutenberg.org
## Title:
## Pride
## and
## Prejudice
## Author:
## Jane
## Austen
## Posting
## Date:
## August
## 26,
## 2008
## [EBook
## #1342]
## Release
## Date:
## June,
## 1998
## Last
## Updated:
## March
## 10,
## 2018
## Language:
## English
## Character
## set
## encoding:
## UTF-8
## ***
## START
## OF
## THIS
## PROJECT
## GUTENBERG
## EBOOK
## PRIDE
## AND
## PREJUDICE
## ***
## Produced
## by
## Anonymous
## Volunteers
## PRIDE
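
The exercise also asks for the length of the word list, which len(words) provides. The sketch below, on a made-up list in place of lines, shows this along with two shortcuts: extend() appends every word from a line at once, and a slice such as words[:100] avoids an IndexError when the list has fewer than 100 items.

```python
# Hypothetical stand-in for the list built in Ex1.
lines = ["The Project Gutenberg EBook\n", "\n", "of Pride and Prejudice,\n"]

words = []
for line in lines:
    # split() with no argument splits on any run of whitespace
    # and returns an empty list for a blank line.
    words.extend(line.split())

# A slice stops at the end of the list if it is shorter than 100.
for word in words[:100]:
    print(word)
print(len(words))
```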

Ex8

Count the number of occurrences of the word “and” in the list of words. Be sure to allow for capitalization. What fraction of all the words is this?

Answer

# Use a separate name so the line count from Ex1 is not overwritten.
and_count = 0
for word in words:
    if word.lower() == "and":
        and_count = and_count + 1
print("Number of ands is",and_count)
## Number of ands is 3398
print("Fraction of ands is", and_count/len(words))
## Fraction of ands is 0.02727301913445486
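
When several words need counting, collections.Counter from the standard library tallies all of them in one pass. A minimal sketch on a made-up word list (not the novel's actual words):

```python
from collections import Counter

# Hypothetical stand-in for the word list built in Ex7.
words = ["Pride", "and", "Prejudice", "And", "sense", "AND", "and,"]

# Lowercase each word so "And" and "AND" are counted with "and".
counts = Counter(word.lower() for word in words)

and_count = counts["and"]
print("Number of ands is", and_count)
print("Fraction of ands is", and_count / len(words))
```

Note that "and," with attached punctuation is counted as a different word, which matches the whitespace-based definition of a word used in Ex7.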