Please do the following in Python:
Write a program (twitter_sort.py) that merges and sorts two twitter feeds. At a high level, your program is going to perform the following:
The names of the files will be passed to your program via command line arguments. Use the following input files to test your program: tweet1.txt (https://raw.githubusercontent.com/gsprint23/cpts215/master/progassignments/files/tweet1.txt) and tweet2.txt (https://raw.githubusercontent.com/gsprint23/cpts215/master/progassignments/files/tweet2.txt)
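One way to pick up the file names is sketched below. It assumes the argument order used in the example run of this question (two input files, then the output file). The helper name parse_args is hypothetical and not part of the assignment.

```python
def parse_args(argv):
    """Return (input1, input2, output) from an argv-style list.

    Hypothetical helper: the assignment only says the file names arrive
    as command line arguments; the order is taken from the example run.
    """
    # argv[0] is the script name, so three file names means four entries
    if len(argv) != 4:
        raise SystemExit("usage: python twitter_sort.py <in1> <in2> <out>")
    return argv[1], argv[2], argv[3]


# In the real program you would pass sys.argv; a literal list is used here
# so the sketch runs on its own.
print(parse_args(["twitter_sort.py", "tweet1.txt", "tweet2.txt", "sorted.txt"]))
```

Exiting with a usage message keeps a missing argument from surfacing later as an IndexError.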
The output of your program includes the following:
Program Details
File Format
Each input file will contain a list of records with one record appearing on each line of the file. The format of a record is as follows:
@TWEETER "TWEET" YEAR MONTH DAY HR:MN:SC
Your job will be to read in each file and for each line in the file, create a record with the above information. In the above format, a tweet is a string that can contain a list of tokens. Also, HR:MN:SC should be treated as a single field of the record, the time.
Note: you should remove the "@" symbol from each tweeter's name.
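To make the record format concrete, here is a minimal sketch that splits one line into its fields with a regular expression. This is an alternative to the provided Scanner class, not the approach the assignment asks for. The names RECORD_RE and parse_record are hypothetical, and the sketch assumes every line follows the format above exactly.

```python
import re

# Pattern for: @TWEETER "TWEET" YEAR MONTH DAY HR:MN:SC
# The greedy .* lets the tweet itself contain double quotes.
RECORD_RE = re.compile(
    r'^@(?P<tweeter>\S+)\s+"(?P<tweet>.*)"\s+'
    r'(?P<year>\d{4})\s+(?P<month>\d{1,2})\s+(?P<day>\d{1,2})\s+'
    r'(?P<time>\d{2}:\d{2}:\d{2})$'
)


def parse_record(line):
    """Parse one record line into a dict (hypothetical helper)."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a tweet record: {line!r}")
    return {
        "tweeter": match.group("tweeter"),  # the "@" is left out by the pattern
        "tweet": match.group("tweet"),      # stored without the surrounding quotes
        "year": int(match.group("year")),
        "month": int(match.group("month")),
        "day": int(match.group("day")),
        "time": match.group("time"),        # HR:MN:SC kept as a single field
    }


record = parse_record('@pythondiva "My memory is great <3 64GB android" 2013 10 1 10:36:11')
print(record["tweeter"], record["time"])
```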
Reading from Files
You may use the provided Scanner class in the scanner.py (https://raw.githubusercontent.com/gsprint23/cpts215/master/progassignments/files/scanner.py) module to help you parse different fields from the tweets.
Functions to Define
In addition to a main() function, define the following functions in your code:
Example Run
File 1 (tweet1_demo.txt):
@poptardsarefamous "Sometimes I wonder 2 == b or !(2 == b)" 2013 10 1 13:46:42
@nohw4me "i have no idea what my cs prof is saying" 2013 10 1 12:07:14
@pythondiva "My memory is great <3 64GB android" 2013 10 1 10:36:11
@enigma "im so clever, my code is even unreadable to me!" 2013 10 1 09:27:00
File 2 (tweet2_demo.txt):
@ocd_programmer "140 character limit? so i cant write my variable names" 2013 10 1 13:18:01
@caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt" 2011 10 2 02:53:47
Run the program: python twitter_sort.py tweet1_demo.txt tweet2_demo.txt sorted_demo.txt
Example Console Output
Reading files...
tweet1_demo.txt contained the most tweets with 4.
Merging files...
Writing file...
File written.
Displaying 5 earliest tweeters and tweets.
caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt"
enigma "im so clever, my code is even unreadable to me!"
pythondiva "My memory is great <3 64GB android"
nohw4me "i have no idea what my cs prof is saying"
ocd_programmer "140 character limit? so i cant write my variable names"
Example Output File (sorted_demo.txt)
poptardsarefamous "Sometimes I wonder 2 == b or !(2 == b)" 2013 10 1 13:46:42
ocd_programmer "140 character limit? so i cant write my variable names" 2013 10 1 13:18:01
nohw4me "i have no idea what my cs prof is saying" 2013 10 1 12:07:14
pythondiva "My memory is great <3 64GB android" 2013 10 1 10:36:11
enigma "im so clever, my code is even unreadable to me!" 2013 10 1 09:27:00
caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt" 2011 10 2 02:53:47
# Please find all the relevant comments alongside the code itself
import sys
from scanner import Scanner
from functools import cmp_to_key
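# Class for a single tweet
class Tweet:
    def __init__(self, tweeter, tweet, time):
        # drop the leading "@" from the tweeter's name
        self.tweeter = tweeter[1:]
        self.tweet = tweet
        # collapse stray whitespace so the date and time split and print cleanly
        self.time = " ".join(time.split())

    def __str__(self):
        # full record, as written to the output file
        return self.tweeter + " " + self.tweet + " " + self.time

    def display(self):
        # tweeter and tweet only, as shown on the console
        return self.tweeter + " " + self.tweet
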
def create_record(s):
    # builds a list of Tweet objects from an open Scanner
    tweets = []
    tweeter = s.readtoken()
    # reads until readtoken() returns an empty string
    while tweeter != "":
        tweet = s.readstring()
        # the rest of the line holds the date and time
        tweets.append(Tweet(tweeter, tweet, s.readline()))
        tweeter = s.readtoken()
    return tweets

def read_records(file):
    s = Scanner(file)
    return create_record(s)

def is_more_recent(t1, t2):
    # compares the dates as strings of the form YYYYMMDDHH:MM:SS,
    # which sort chronologically once every field is zero-padded
    year, month, day, time = t1.time.split()
    # pad a single-digit month or day to 2 digits with a leading 0
    month = ("0" + month)[-2:]
    day = ("0" + day)[-2:]
    timestamp1 = year + month + day + time
    year, month, day, time = t2.time.split()
    # same padding for the second tweet
    month = ("0" + month)[-2:]
    day = ("0" + day)[-2:]
    timestamp2 = year + month + day + time
    # True only when t1 is strictly more recent than t2
    return timestamp1 > timestamp2

def merge_and_sort_tweets(tweets1, tweets2):
    tweets = tweets1 + tweets2

    # cmp_to_key() expects a function that returns a negative, zero, or
    # positive number, so wrap the boolean is_more_recent() in a
    # three-way comparison
    def compare(t1, t2):
        if is_more_recent(t1, t2):
            return -1  # t1 is newer, so it goes first
        if is_more_recent(t2, t1):
            return 1
        return 0

    tweets.sort(key=cmp_to_key(compare))
    return tweets

def write_records(file, tweets):
    # one record per line; the with block closes the file when done
    with open(file, "w") as out:
        for t in tweets:
            out.write(str(t) + "\n")

def main():
    # command line arguments are stored in sys.argv, where the first
    # value is the name of the script
    print("Reading files...")
    # reading the records of the first file, whose name is stored in argv[1]
    tweets1 = read_records(sys.argv[1])
    # reading the records of the second file, whose name is stored in argv[2]
    tweets2 = read_records(sys.argv[2])
    # number of tweets in each file
    tweet1count = len(tweets1)
    tweet2count = len(tweets2)
    if tweet1count > tweet2count:
        print(sys.argv[1], "contained the most tweets with", str(tweet1count) + ".")
    elif tweet1count < tweet2count:
        print(sys.argv[2], "contained the most tweets with", str(tweet2count) + ".")
    else:
        # both files have the same number of tweets
        print(sys.argv[1], "contains", tweet1count, "tweets.")
        print(sys.argv[2], "contains", tweet2count, "tweets.")
    print("Merging files...")
    tweets = merge_and_sort_tweets(tweets1, tweets2)
    print("Writing file...")
    write_records(sys.argv[3], tweets)
    print("File written.")
    # displaying the 5 earliest tweets, or all of them if there are fewer than 5;
    # the list is sorted most recent first, so the earliest tweets are at the end
    count = min(5, len(tweets))
    print("Displaying", count, "earliest tweeters and tweets.")
    for t in tweets[::-1][:count]:
        print(t.display())

# The module is being run standalone by the user and not imported by
# some other script
if __name__ == "__main__":
    main()
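As an alternative to going through cmp_to_key, the merge step can also be sketched with a plain sort key. The code below is an illustration under assumptions, not part of the answer above: Record is a stand-in for the Tweet class, and timestamp_key and merge_most_recent_first are hypothetical names. It assumes the time field has the form YEAR MONTH DAY HR:MN:SC.

```python
from collections import namedtuple

# Stand-in for the Tweet class so this sketch runs on its own
Record = namedtuple("Record", ["tweeter", "tweet", "time"])


def timestamp_key(time_field):
    """Turn 'YEAR MONTH DAY HR:MN:SC' into a tuple of ints.

    Tuples compare field by field, so they sort chronologically without
    any zero-padding.
    """
    year, month, day, clock = time_field.split()
    hour, minute, second = clock.split(":")
    return (int(year), int(month), int(day), int(hour), int(minute), int(second))


def merge_most_recent_first(records1, records2):
    """Concatenate two lists and sort them newest first."""
    return sorted(records1 + records2,
                  key=lambda r: timestamp_key(r.time),
                  reverse=True)


feed1 = [Record("nohw4me", '"i have no idea what my cs prof is saying"', "2013 10 1 12:07:14")]
feed2 = [Record("caffeine4life", '"BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt"', "2011 10 2 02:53:47"),
         Record("ocd_programmer", '"140 character limit? so i cant write my variable names"', "2013 10 1 13:18:01")]
for r in merge_most_recent_first(feed1, feed2):
    print(r.tweeter, r.time)
```

A key function is called once per element, whereas a comparator is called once per comparison, so this version does less work on large feeds. The assignment may still require is_more_recent to be defined and used.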