Question

This assignment involves using a binary search tree (BST) to keep track of all words in a text document. It produces a cross-reference, or a concordance. This is very much like assignment 4, except that you must use a different data structure. You may use some of the code you wrote for that assignment, such as input parsing, for this one.

Remember that in a binary search tree, every value in a node's left subtree is less than the node's value and every value in its right subtree is greater. This holds for every node, not only the root.

The program will ask for the name of a text file. It will then read the file and keep track of each of the words in the file, the number of times it occurs, and which line numbers contain the word. If a word occurs more than once in a line, count it more than once but do not duplicate the line number. Words in the document are separated by spaces and punctuation, which are the following: ? , . ! ; : - (that is, question mark, comma, period, exclamation point, semicolon, colon, and hyphen). Ignore parentheses and quotation marks. Contractions such as “don’t” are considered a single word. Your test data will not contain numbers. It may contain blank lines, which count in the line numbering but which, containing no words, are otherwise ignored. Plurals and variations of a word are considered different. Ignore capitalization; Word and word are the same. Your program will exit after printing the output.
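The parsing rules above can be sketched as one small helper. This is only an illustration, not the required design: the name split_words is made up here, and treating only straight and curly double quotes as "quotation marks" is an assumption, made so that the apostrophe in a contraction is left alone.

```python
import re

# Separators named in the assignment: ? , . ! ; : - (plus whitespace).
_SEPARATORS = re.compile(r"[\s?,.!;:\-]+")

# Characters to ignore: parentheses and double quotation marks.
# Assumption: only straight and curly double quotes are removed, so the
# apostrophe inside a contraction such as "don't" survives.
_IGNORED = str.maketrans("", "", '()"\u201c\u201d')


def split_words(line):
    """Return the lowercase words of one line, in order of appearance."""
    cleaned = line.translate(_IGNORED)
    # re.split can yield empty strings at the ends; drop them.
    return [w.lower() for w in _SEPARATORS.split(cleaned) if w]


print(split_words('Word, word; "don\'t" (stop)!'))
# prints ['word', 'word', "don't", 'stop']
```

A blank line produces an empty list, so it adds no words while still taking up a line number.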

Since part of this is learning how to use classes, you will have to create your own binary search tree node and binary search tree classes. You may not use those classes from the textbook nor any other source. Write only those functions you need to fulfill the assignment.
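One possible shape for the two classes is sketched below. The names WordNode and WordTree, and the choice to store the count and line list on the node, are assumptions made for illustration; the assignment only says that you must write a node class and a tree class yourself. The sketch also assumes lines are processed in order, so a repeated line number can only be the last one stored.

```python
class WordNode:
    """One word with its occurrence count and the lines it appears on."""

    def __init__(self, word, line_no):
        self.word = word
        self.count = 1
        self.lines = [line_no]
        self.left = None
        self.right = None


class WordTree:
    """Binary search tree keyed on the (lowercase) word."""

    def __init__(self):
        self.root = None
        self.size = 0  # number of unique words

    def add(self, word, line_no):
        if self.root is None:
            self.root = WordNode(word, line_no)
            self.size = 1
            return
        node = self.root
        while True:
            if word == node.word:
                node.count += 1
                # Lines arrive in increasing order, so a duplicate line
                # number can only be the most recent entry.
                if node.lines[-1] != line_no:
                    node.lines.append(line_no)
                return
            branch = "left" if word < node.word else "right"
            child = getattr(node, branch)
            if child is None:
                setattr(node, branch, WordNode(word, line_no))
                self.size += 1
                return
            node = child

    def in_order(self):
        """Yield nodes in alphabetical order (left, node, right)."""
        stack, node = [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node
            node = node.right


tree = WordTree()
for word, line_no in [("words", 1), ("and", 1), ("words", 1), ("words", 3)]:
    tree.add(word, line_no)
for node in tree.in_order():
    print(node.word, node.count, node.lines)
```

Here "words" is added three times on lines 1, 1, and 3, so it ends up with a count of 3 and the line list [1, 3]. The in-order traversal is what gives the alphabetical output without a separate sort.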

Remember how string functions work. Remember to write small “helper” functions for various tasks.

Print the text as you read it, preceded by a line number. Once you have reached the end of the file, print a blank line, then the output.
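Echoing the text while reading it could look roughly like the sketch below. The helper name echo_numbered and the "N: " prefix are assumptions, since the assignment does not specify how the line number should be formatted. An in-memory io.StringIO stands in for the opened file so the example runs on its own.

```python
import io


def echo_numbered(stream):
    """Print each line with its number; return the (number, text) pairs."""
    numbered = []
    for number, raw in enumerate(stream, start=1):
        text = raw.rstrip("\n")
        print(f"{number}: {text}")  # the "N: " prefix is an assumed format
        numbered.append((number, text))
    print()  # blank line between the echoed text and the concordance
    return numbered


# Stand-in for a real file; note that the blank second line is still numbered.
sample = io.StringIO("First line\n\nThird line\n")
pairs = echo_numbered(sample)
```

Because enumerate starts at 1 and counts every line, blank lines keep their place in the numbering, as the assignment requires.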

Your output will be the words in alphabetical order, the number of times the word occurs in the file, and the line numbers on which it occurs. Separate the line numbers with a comma and a space, as shown. To make things a little more interesting, ignore the following words: the, a, an.

Sample output for the above paragraph would start thus:

alphabetical    1 1
and             1 1
word            1 1
words           2 1,3
your            1 1
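Producing one output row might be sketched as follows. The column width of 16, the name format_entry, and the decision to filter the ignored words at print time (rather than when inserting into the tree) are all assumptions. The line numbers are joined with a comma and a space, as the assignment text asks.

```python
IGNORED_WORDS = {"the", "a", "an"}


def format_entry(word, count, lines, width=16):
    """Return one concordance row: padded word, count, then line numbers."""
    line_list = ", ".join(str(n) for n in lines)
    return f"{word:<{width}}{count} {line_list}"


# Hypothetical (word, count, lines) entries, already in alphabetical order.
entries = [("the", 4, [1, 2]), ("words", 2, [1, 3]), ("your", 1, [1])]
report = [format_entry(w, c, ls) for w, c, ls in entries
          if w not in IGNORED_WORDS]
for row in report:
    print(row)
```

Skipping the ignored words before they ever reach the tree would work just as well and keeps the tree smaller; which one you choose affects whether those words can be counted in the totals.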

At the end print the total number of words, the total number of unique words, and the total number of lines.
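The three totals follow directly from what is already stored, as in this sketch. The function name totals and the output labels are assumptions, and so is the choice to leave the ignored words (the, a, an) out of the counts; the assignment does not say either way, so check with your instructor.

```python
def totals(entries, line_count):
    """Summarize (word, count) pairs taken from the finished tree.

    Returns (total words, unique words, total lines).
    """
    entries = list(entries)
    total_words = sum(count for _, count in entries)
    unique_words = len(entries)
    return total_words, unique_words, line_count


# Hypothetical data: three unique words, four occurrences, three lines read.
total, unique, lines = totals([("and", 1), ("words", 2), ("your", 1)], 3)
print("Total words:", total)
print("Unique words:", unique)
print("Total lines:", lines)
```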

Homework Answers

Answer #1
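import re

# Note: this computes only the three totals; it does not build the BST concordance.
# Ask for the file name, as the assignment requires, rather than hard-coding a path.
filename = input("Enter the name of the text file: ")
with open(filename, "rt") as file:
    data = file.read()

# Drop parentheses and quotation marks, then split on whitespace and the
# separators ? , . ! ; : - and lowercase so that "Word" and "word" match.
cleaned = re.sub(r'[()"“”]', '', data)
word_list = [w.lower() for w in re.split(r'[\s?,.!;:\-]+', cleaned) if w]

print('Number of words in text file :', len(word_list))

# A set keeps exactly one copy of each distinct word.
unique = len(set(word_list))
print('Number of unique words in text file :', unique)

# Blank lines count in the line numbering, so count every line.
line_count = len(data.splitlines())

print("This is the number of lines in the file")
print(line_count)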