Python coding: HW `declaration.py`
* define function `calc_histogram(lines)`
* returns histogram dictionary and total non-whitespace characters
* takes list of strings as argument
* ignore comment lines denoted by `#`
* if invoking script directly, load `data/declaration.txt` and calculate histogram
* print histogram by sorted characters
* count letter instances not characters, so `A` and `a` count as same
(This is the local test I was given):
if __name__ == '__main__':
    import os.path
    file_path = os.path.join("data", 'declaration.txt')
    print(file_path)
    letter = open(file_path)
    text = letter.readlines()  # Read entire file into a list of strings
    letter.close()  # If you open it, close it!
    hist, count = calc_histogram(text)
    keys = list(hist.keys())
    keys.sort()
    print("Keys:", keys)
    sum = 0.0
    for a in keys:
        val = 100.0*(hist[a]/count)
        print("{} : {:4d} : {:7.3f}%".format(a, hist[a], val))
        sum += val
    print("total={:.5f}\ncnt={}".format(sum, count))
First we define an empty dictionary and iterate through each line in the given list:
hist = {}  # define an empty dictionary to store character occurrences
for line in text:  # iterate through each line in the list
    line = line.strip()  # remove leading and trailing whitespace, including the newline
After that, we remove all the comments:
hist = {}  # define an empty dictionary to store character occurrences
for line in text:  # iterate through each line in the list
    line = line.strip()  # remove leading and trailing whitespace, including the newline
    # if a comment is present, drop everything from the first #
    # to the end of the line
    if "#" in line:
        index = line.find("#")
        line = line[:index]
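To see what the comment-stripping step does on its own, here is a small sketch. The helper name `strip_comment` and the sample strings are made up for illustration and are not part of the assignment.

```python
def strip_comment(line):
    """Return the part of line before the first '#' (hypothetical helper)."""
    line = line.strip()          # drop surrounding whitespace and the newline
    index = line.find("#")       # find() returns -1 when there is no '#'
    if index != -1:
        line = line[:index]      # keep only the text before the comment
    return line

print(repr(strip_comment("# a full comment line\n")))        # ''
print(repr(strip_comment("We hold these truths # note\n")))  # 'We hold these truths '
```

A line that starts with `#` becomes an empty string, so it contributes nothing to the histogram. A trailing comment leaves the text before it, including any space in front of the `#`, which the later whitespace removal takes care of.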
Now we convert the line to lower case and remove all the whitespace, so that interior tabs are dropped along with spaces:
hist = {}  # define an empty dictionary to store character occurrences
for line in text:  # iterate through each line in the list
    line = line.strip()  # remove leading and trailing whitespace, including the newline
    # if a comment is present, drop everything from the first #
    # to the end of the line
    if "#" in line:
        index = line.find("#")
        line = line[:index]
    line = line.lower()  # change to lower case as the question is case insensitive
    line = "".join(line.split())  # remove all whitespace (spaces and tabs) in the line
Now, still inside the loop over the lines, we add the occurrences to the dictionary:
    for character in line:
        try:
            hist[character] += 1
        except KeyError:
            hist[character] = 1
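The `try`/`except KeyError` pattern works fine, but the same counting can also be written in two other common ways. The sketch below is only an alternative, not what the assignment requires, and the sample string is made up.

```python
from collections import Counter

line = "weholdthesetruths"  # made-up sample, already lower case with no spaces

# Variant 1: dict.get() supplies 0 when the character is not yet a key
hist = {}
for character in line:
    hist[character] = hist.get(character, 0) + 1

# Variant 2: Counter does the counting in one call
counts = Counter(line)

print(hist == dict(counts))  # True
print(hist["h"])             # 3
```

Both variants produce the same dictionary as the `try`/`except` version. Which one to use is mostly a matter of taste.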
The entire function is given below:
def calc_histogram(text):
    hist = {}  # define an empty dictionary to store character occurrences
    for line in text:  # iterate through each line in the list
        line = line.strip()  # remove leading and trailing whitespace, including the newline
        # if a comment is present, drop everything from the first #
        # to the end of the line
        if "#" in line:
            index = line.find("#")
            line = line[:index]
        line = line.lower()  # change to lower case as the question is case insensitive
        line = "".join(line.split())  # remove all whitespace (spaces and tabs) in the line
        # if the character is already present in the dictionary, increment the count,
        # else set the count of that character to 1. Since the characters are the
        # keys in the dictionary, we get a KeyError when that particular character is
        # not in the dictionary yet
        for character in line:
            try:
                hist[character] += 1
            except KeyError:
                hist[character] = 1
    # return the hist dictionary and the total number of non-whitespace characters counted
    return hist, sum(hist.values())
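If you want to check the function without `data/declaration.txt`, you can feed it a small list of strings. The sketch below uses a condensed version of the function so that it runs on its own. The sample lines are invented, not taken from the real file.

```python
def calc_histogram(text):
    """Condensed version of the function above, for a quick self-contained check."""
    hist = {}
    for line in text:
        line = line.strip()
        if "#" in line:
            line = line[:line.find("#")]        # drop the comment part
        line = line.lower().replace(" ", "")    # case-insensitive, no spaces
        for character in line:
            hist[character] = hist.get(character, 0) + 1
    return hist, sum(hist.values())

# Invented sample: one comment line, one mixed-case line, one trailing comment
sample = ["# header comment\n", "Aa Bb\n", "b c # trailing note\n"]
hist, count = calc_histogram(sample)

for key in sorted(hist):
    print("{} : {:4d} : {:7.3f}%".format(key, hist[key], 100.0 * hist[key] / count))
print("cnt={}".format(count))  # cnt=6
```

Here `A` and `a` are counted together, the comment line is ignored, and the total is 6 (two `a`, three `b`, one `c`).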
Good luck!