Write a Python program that performs text analysis on an input text using all of the following steps:
1. Take the name of an input file as a command line argument. For example, your program might be called using
python3 freq.py example1
to direct it to process a plain text file called example1. More information is given on how to do this below.
2. Read the contents of the file into your program and divide the text into a list of individual, space-separated words.
3. Analyze the word list to count the number of times each word appears in the text. Hint: You may want to use a dictionary to store and track the counts.
4. When the counting is finished, your program should write a frequency table of each word, its count, and its relative frequency to an output file. The frequency table must be sorted by word in lexicographic order (alphabetical, with uppercase words first). You may want to look up the sorted function to help you with this.
Each line in the table should look like
word counted_value frequency
where
word is the word in the text,
counted_value is the number of times the word occurs, and
frequency is a number in the range [0,1] that is the ratio of the count for the word to the total number of words found in the text. You will need to calculate this frequency once you have counted the number of times each word appears.
You must write this frequency table to an output file.
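As a small illustration of the sorting and frequency requirements above, the following sketch uses a made-up word list (an assumption, not taken from any example file). It shows that Python's built-in sorted places uppercase words before lowercase ones, and that the frequency is the count divided by the total number of words rather than by the number of distinct words.

```python
# Hypothetical sample words, chosen only for illustration
words = "the Cat saw the dog and the Dog saw the cat".split()

# Count occurrences of each word with a plain dictionary
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# Total number of words in the text (not the number of distinct words)
total = len(words)

# sorted() compares strings by code point, so uppercase letters come first
table = [(w, counts[w], counts[w] / total) for w in sorted(counts)]

for w, c, f in table:
    print(w, c, round(f, 4))
```

Because every word contributes to exactly one count, the frequencies in such a table should add up to 1, which is a quick sanity check for the output file.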
CODE -
import sys

# Build the name of the file to open from the command line argument (".txt" is appended)
filename = sys.argv[1] + ".txt"
# Read the contents of the file and split the text into a list of individual space-separated words
with open(filename) as file:
    list_words = file.read().split()
# Create an empty dictionary to store the count of each word
dict_words = {}
# Iterate over each word in the list
for word in list_words:
    # Add the word with a count of 1 if it is not already in the dictionary
    if word not in dict_words:
        dict_words[word] = 1
    # Otherwise increase the count of the word by 1
    else:
        dict_words[word] += 1
# Total number of words in the text, used as the denominator of the relative frequency
total_words = len(list_words)
# Open the output file for writing (it is closed automatically at the end of the block)
with open("output.txt", "w") as file:
    # Write the header for the frequency table
    file.write("{:<15}{:<15}{:<15}\n".format("Word", "Counted_value", "Frequency"))
    # Write each word, its count, and its relative frequency in lexicographic order
    for word in sorted(dict_words):
        file.write("{:<15}{:<15}{:<15.4f}\n".format(word, dict_words[word], dict_words[word] / total_words))
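The counting step can also be sketched with collections.Counter from the standard library. This is an alternative to the dictionary loop above, not part of the original solution. The function name frequency_table and the sample text are assumptions made for illustration.

```python
from collections import Counter


def frequency_table(text):
    """Return (word, count, relative frequency) rows sorted lexicographically."""
    words = text.split()      # split on whitespace
    counts = Counter(words)   # maps each word to its number of occurrences
    total = len(words)        # total words, used as the denominator
    return [(w, counts[w], counts[w] / total) for w in sorted(counts)]


# Hypothetical sample text
rows = frequency_table("b a B a")
for word, count, freq in rows:
    print("{:<15}{:<15}{:<15.4f}".format(word, count, freq))
```

Keeping the counting in a function that takes a string makes it possible to check the table on small inputs before reading from and writing to files.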
If you have any doubts regarding the solution, please leave a comment.
Do upvote.