Question

Please do the following in python:

Write a program (twitter_sort.py) that merges and sorts two twitter feeds. At a high level, your program is going to perform the following:

  1. Read in two files containing twitter feeds.
  2. Merge the twitter feeds in reverse chronological order (most recent first).
  3. Write the merged feeds to an output file.
  4. Provide some basic summary information about the files.

The names of the files will be passed to your program via command-line arguments. Use the following input files to test your program: tweet1.txt (https://raw.githubusercontent.com/gsprint23/cpts215/master/progassignments/files/tweet1.txt) and tweet2.txt (https://raw.githubusercontent.com/gsprint23/cpts215/master/progassignments/files/tweet2.txt).
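As a rough illustration of the command-line handling described above, the sketch below pulls the three filenames out of an argv-style list. The helper name parse_filenames() and the usage message are assumptions made for this example; the assignment does not prescribe them. The three-argument layout follows the example run shown later in the question.

```python
def parse_filenames(argv):
    """Return (first_input, second_input, output) from an argv-style list.

    Hypothetical helper: the assignment only says the filenames arrive
    as command line arguments.
    """
    # argv[0] is the script name; the three filenames follow it.
    if len(argv) != 4:
        raise ValueError(
            "usage: python twitter_sort.py <feed1> <feed2> <output>")
    return argv[1], argv[2], argv[3]
```

Inside main(), this would typically be called as parse_filenames(sys.argv) after importing sys.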

The output of your program includes the following:

  1. Console
    1. The name of the file that contained the most tweets followed by the number of tweets tweeted. In the event of a tie, print both filenames along with the number of tweets (Note: a file may be empty).
    2. The five earliest tweets along with the tweeter.
  2. sorted_tweets.txt: the lines from the inputted files sorted in reverse chronological order (most recent tweets first and earliest tweets at the end).
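The "most tweets" rule in the console output, including the tie case, could be sketched as follows. The function name most_tweets_message() is hypothetical, and the wording of the tie message is an assumption, since the question only shows the non-tie format.

```python
def most_tweets_message(name1, count1, name2, count2):
    """Build the summary line for two feeds (hypothetical helper)."""
    if count1 > count2:
        return f"{name1} contained the most tweets with {count1}."
    if count2 > count1:
        return f"{name2} contained the most tweets with {count2}."
    # Tie (this also covers two empty files): report both filenames.
    # The exact wording here is an assumption.
    return f"{name1} and {name2} both contained {count1} tweets."
```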

Program Details

File Format

Each input file will contain a list of records with one record appearing on each line of the file. The format of a record is as follows:

@TWEETER "TWEET" YEAR MONTH DAY HR:MN:SC

Your job will be to read in each file and for each line in the file, create a record with the above information. In the above format, a tweet is a string that can contain a list of tokens. Also, HR:MN:SC should be treated as a single field of the record, the time.

Note: you should remove the "@" symbol from each tweeter's name.
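To make the record format concrete, here is a minimal sketch that splits one line into its fields using a regular expression. It does not use the provided Scanner class; parse_record_line() and the list layout [tweeter, tweet, year, month, day, time] are assumptions chosen for this example.

```python
import re

# @TWEETER "TWEET" YEAR MONTH DAY HR:MN:SC
# The greedy ".*" lets the tweet itself contain spaces or quotes, because
# the date and time fields anchor the end of the line.
RECORD_PATTERN = re.compile(
    r'^@(\S+)\s+(".*")\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d{1,2}:\d{2}:\d{2})\s*$')

def parse_record_line(line):
    """Return [tweeter, tweet, year, month, day, time] for one line."""
    match = RECORD_PATTERN.match(line)
    if match is None:
        raise ValueError("not a valid tweet record: " + repr(line))
    tweeter, tweet, year, month, day, time = match.groups()
    # The "@" is excluded by the pattern; the time stays a single field.
    return [tweeter, tweet, int(year), int(month), int(day), time]
```

The tweet keeps its surrounding double quotes so that it can be written back out unchanged.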

Reading from Files

You may use the provided Scanner class in the scanner.py (https://raw.githubusercontent.com/gsprint23/cpts215/master/progassignments/files/scanner.py) module to help you parse different fields from the tweets.

Functions to Define

In addition to a main() function, define the following functions in your code:

  • read_records(): a function that given a filename creates a Scanner object and creates a record for each line in the file and returns a list containing the records
  • create_record(): a function that takes in a Scanner object and creates a record then returns a list representing the record; note, the "@" symbol should also be removed from the tweeter's name
  • is_more_recent(): a function that compares two records based on date and returns True if the first record is more recent than the second and False otherwise
  • merge_and_sort_tweets(): a function that merges two lists of records by placing more recent records before earlier records and returns the merged records as a single list
  • write_records(): a function that takes in a list of records and writes each record to the output file on its own line.
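One possible shape for the comparison and merge functions is sketched below. It assumes each record is a list laid out as [tweeter, tweet, year, month, day, time] with integer date fields, which is an assumption of this example rather than something the assignment fixes. It also relies on Python's built-in sorted() instead of a hand-written merge, and record_datetime() is a hypothetical helper.

```python
from datetime import datetime

def record_datetime(record):
    """Convert a record's date and time fields to a datetime (hypothetical helper)."""
    _, _, year, month, day, time = record
    hour, minute, second = (int(part) for part in time.split(":"))
    return datetime(year, month, day, hour, minute, second)

def is_more_recent(first, second):
    """Return True if the first record is more recent than the second."""
    return record_datetime(first) > record_datetime(second)

def merge_and_sort_tweets(records1, records2):
    """Return one list holding both feeds, most recent record first."""
    return sorted(records1 + records2, key=record_datetime, reverse=True)
```

Because the merged list is ordered most recent first, the earliest tweets can be read off the end of it.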

Example Run

File 1 (tweet1_demo.txt):

@poptardsarefamous "Sometimes I wonder 2 == b or !(2 == b)" 2013 10 1 13:46:42
@nohw4me "i have no idea what my cs prof is saying" 2013 10 1 12:07:14
@pythondiva "My memory is great <3 64GB android" 2013 10 1 10:36:11
@enigma "im so clever, my code is even unreadable to me!" 2013 10 1 09:27:00

File 2 (tweet2_demo.txt):

@ocd_programmer "140 character limit? so i cant write my variable names" 2013 10 1 13:18:01
@caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt" 2011 10 2 02:53:47

Run the program: python twitter_sort.py tweet1_demo.txt tweet2_demo.txt sorted_demo.txt

Example Console Output

Reading files...
tweet1_demo.txt contained the most tweets with 4.
Merging files...
Writing file...
File written. Displaying 5 earliest tweeters and tweets.
caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt"
enigma "im so clever, my code is even unreadable to me!"
pythondiva "My memory is great <3 64GB android"
nohw4me "i have no idea what my cs prof is saying"
ocd_programmer "140 character limit? so i cant write my variable names"

Example Output File (sorted_demo.txt)

poptardsarefamous "Sometimes I wonder 2 == b or !(2 == b)" 2013 10 1 13:46:42
ocd_programmer "140 character limit? so i cant write my variable names" 2013 10 1 13:18:01
nohw4me "i have no idea what my cs prof is saying" 2013 10 1 12:07:14
pythondiva "My memory is great <3 64GB android" 2013 10 1 10:36:11
enigma "im so clever, my code is even unreadable to me!" 2013 10 1 09:27:00
caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt" 2011 10 2 02:53:47

Homework Answers

Answer #1

# Please find all the relevant comments alongside the code itself

import sys
from scanner import Scanner
from functools import cmp_to_key

# Class for a single tweet
class Tweet:
   def __init__(self, tweeter, tweet, time):
       # drop the leading "@" from the tweeter's name
       self.tweeter = tweeter[1:]
       self.tweet = tweet
       self.time = time
   def __str__(self):
       return self.tweeter+" "+self.tweet+" "+self.time
   def display(self):
       return self.tweeter+" "+self.tweet

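def create_record(s):
   # note: despite its name, this reads every record from the Scanner
   # and returns them all as a list of Tweet objects
   tweets=[]
   tweeter=s.readtoken()
   # reads until readtoken() returns an empty string (end of file)
   while tweeter!= "":
       tweet=s.readstring()
       # the rest of the line is "YEAR MONTH DAY HR:MN:SC"
       t1=Tweet(tweeter,tweet,s.readline())
       tweets.append(t1)
       tweeter=s.readtoken()
   return tweets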

def read_records(file):
   s = Scanner(file)
   return create_record(s)

def timestamp(t):
   # t.time holds "YEAR MONTH DAY HR:MN:SC"; convert it to a tuple of ints
   # so that the comparison does not depend on zero padding
   year,month,day,time=t.time.split()
   hour,minute,second=time.split(":")
   return (int(year),int(month),int(day),int(hour),int(minute),int(second))

def is_more_recent(t1,t2):
   # tuples compare field by field, so the later date and time is the larger tuple
   return timestamp(t1) > timestamp(t2)

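def merge_and_sort_tweets(tweets1,tweets2):
   tweets=tweets1+tweets2
   # cmp_to_key expects a comparator that returns a negative, zero, or positive
   # number, so wrap the boolean is_more_recent() accordingly
   def compare(t1,t2):
       if is_more_recent(t1,t2):
           return -1
       if is_more_recent(t2,t1):
           return 1
       return 0
   # more recent tweets compare as "smaller", so they end up first
   tweets.sort(key=cmp_to_key(compare))
   return tweets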

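def write_records(file, tweets):
   # the with block closes the file even if a write fails
   with open(file,"w") as out:
       for t in tweets:
           # make sure every record ends with exactly one newline
           out.write(str(t).rstrip("\n")+"\n")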

def main():
   # command line arguments are stored in sys.argv, where the first value is the name of the script
   print("Reading files...")
   # reading record of first file whose name is stored in argv[1]
   tweets1=read_records(sys.argv[1])
   # reading record of second file whose name is stored in argv[2]
   tweets2=read_records(sys.argv[2])
   # number of tweets in first file
   tweet1count=len(tweets1)
   # number of tweets in second file
   tweet2count=len(tweets2)
   # report the actual filenames from the command line rather than hardcoded names
   if tweet1count>tweet2count:
       print(sys.argv[1],"contained the most tweets with",str(tweet1count)+".")
   elif tweet1count<tweet2count:
       print(sys.argv[2],"contained the most tweets with",str(tweet2count)+".")
   else:
       # if both the files have the same number of tweets, print both
       print(sys.argv[1],"contains",tweet1count,"tweets.")
       print(sys.argv[2],"contains",tweet2count,"tweets.")

   print("Merging files...")
   tweets=merge_and_sort_tweets(tweets1,tweets2)

   print("Writing file...")
   write_records(sys.argv[3],tweets)

   # the list is sorted most recent first, so the earliest tweets are at the end;
   # display up to 5 of them, earliest first
   count=min(5,len(tweets))
   print("File written. Displaying",count,"earliest tweeters and tweets.")
   for t in tweets[::-1][:count]:
       print(t.display())

# It implies that the module is being run standalone by the user and not imported by some other script
if __name__== "__main__" :
   main()
