PYTHON: Create an Email Address Parser (*Please include comments*)
Oftentimes, you may be given a list of raw email addresses and asked to generate meaningful information from it. This project involves parsing such a list and generating names and summary information from it.
The script, eparser.py, should:
Each email address will be in the form “[email protected]” OR “[email protected]”, where “first” and “last” are the first and last names of the person respectively, and “i” is the first initial. Each “line” (step 4) should contain the email address itself, the name (either “Last, First” or “Last, I.”, where “I” is the first initial), and the domain name of the email address. The name components should be in “title case” (capitalize the first letter of all words/initials). So, for example, the following four email addresses in the input file:
[email protected]
[email protected]
[email protected]
[email protected]
should result in the following printed to the screen:
[email protected] Deer, Doe ray.me
[email protected] Robot, I. sneaker.net
[email protected] Sew, Far ray.me
[email protected] Tea, L. ray.me
Number of Addresses: 4
Unique Domain Names: 2
Allow 35 characters for the email address, 25 characters for the full name and 20 characters for the domain name for a total of 80 characters wide per line of output. All fields should be left aligned (that’s the default for strings).
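The column layout described above can be sketched with format specifications. This is only an illustration: the helper names format_row and format_row_with_format and the example.com address are hypothetical, not part of the assignment.

```python
# Column widths taken from the assignment text: 35 + 25 + 20 = 80.
EMAIL_W, NAME_W, DOMAIN_W = 35, 25, 20

def format_row(email, name, domain):
    """Return one left-aligned, 80-character output line (f-string version)."""
    # "<" means left-align; the widths are nested replacement fields.
    return f"{email:<{EMAIL_W}}{name:<{NAME_W}}{domain:<{DOMAIN_W}}"

def format_row_with_format(email, name, domain):
    """Same layout using str.format() instead of an f-string."""
    return "{:<35}{:<25}{:<20}".format(email, name, domain)

# Hypothetical sample values, used only to show the padding.
row = format_row("first.last@example.com", "Last, First", "example.com")
print(repr(row))   # repr() makes the trailing padding visible
print(len(row))    # 80
```

Either helper satisfies the "format() or f-string" requirement. Note that a field longer than its width is not truncated, so an unusually long address would push the line past 80 characters.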
HINTS
import sys
filename = sys.argv[1]
REQUIRED IMPLEMENTATION NOTES
def parse_address(email):
    ... DO STUFF HERE ...
You MUST use either the format() function or an f-string to format your lines of text. In other words, you cannot use the center(), ljust() and rjust() methods. Those methods are okay, but don’t use them here.
You MUST run your script against the file emails.txt and redirect the output to contacts.txt. In other words, you should run python eparser.py emails.txt > contacts.txt. This file should be in your final submission. DO NOT WRITE TO A FILE INSIDE OF YOUR SCRIPT. ONLY READ AND PRINT. WRITING OCCURS HERE BY REDIRECTING OUTPUT.
OPTIONAL CHALLENGES (IF YOU CAN DO THEM, THAT'S FINE; IF YOU CANNOT, IT'S ALRIGHT)
Create a second version of the script, eparser2.py. This version should import the parse_address function from eparser (from eparser import parse_address). Instead of asking for a file name and reading the text from the file, read the text line by line from sys.stdin. This should allow you to run it interactively AND to pipe in text from another command (e.g., type emails.txt | python eparser2.py).
Improve the “logic” for parse_address to remove any numbers (digits) from the name before running your “rules.” So for instance, [email protected] would parse to a name of “Cunningham, D.” (dropping the “12”).
Improve the logic of parse_address to ALSO capitalize the third letter of any name that starts with Mc. So, for instance, Mcshady would become McShady.
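The three optional challenges above can be sketched together as follows. This is a rough sketch under stated assumptions: parse_address is defined inline so the example is self-contained (eparser2.py would import it from eparser instead), the (name, domain) return value is an assumption, tidy_part and report are hypothetical helper names, and the example.com addresses are made up. An io.StringIO object stands in for sys.stdin so the sketch runs without piped input.

```python
import io
import re

def tidy_part(part):
    """Title-case one name component, handling the 'Mc' prefix."""
    part = part.title()
    # Challenge 3: "Mcshady" -> "McShady" (capitalize the third letter).
    if part.startswith("Mc") and len(part) > 2:
        part = "Mc" + part[2].upper() + part[3:]
    return part

def parse_address(email):
    """Return (name, domain); this return shape is an assumption."""
    local, domain = email.split("@", 1)
    # Challenge 2: drop digits before applying the name rules.
    local = re.sub(r"\d", "", local)
    if "." in local:
        # "first.last" form -> "Last, First"
        first, last = local.split(".", 1)
        name = f"{tidy_part(last)}, {tidy_part(first)}"
    else:
        # "ilast" form -> "Last, I."
        name = f"{tidy_part(local[1:])}, {local[0].upper()}."
    return name, domain

def report(lines):
    """Build the output lines from any iterable of raw text lines."""
    rows, domains, count = [], set(), 0
    for line in lines:
        email = line.strip()
        if not email:
            continue  # skip blank lines
        name, domain = parse_address(email)
        domains.add(domain)
        count += 1
        rows.append(f"{email:<35}{name:<25}{domain:<20}")
    rows.append(f"Number of Addresses: {count}")
    rows.append(f"Unique Domain Names: {len(domains)}")
    return rows

# Challenge 1: report() accepts any stream of lines; in eparser2.py the
# call would be report(sys.stdin). StringIO is a stand-in for the demo.
fake_stdin = io.StringIO("dcunningham12@example.com\nseedy.mcshady@example.com\n")
for row in report(fake_stdin):
    print(row)
```

With the hypothetical input above, the two names come out as "Cunningham, D." and "McShady, Seedy". Keeping the line-processing in report() rather than in the top-level script means the same code works for a file object, sys.stdin, or a plain list of strings.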
BELOW WILL BE THE FILES REQUIRED FOR THIS TO WORK:
emails.txt FILE will have:
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
sample_contacts.txt FILE:
[email protected] Smith, A. boring.com
[email protected] Witcher, A. coldasheck.net
[email protected] Johnson, Demarcus semper.org
[email protected] Baker, Shelly supplyshack.com
[email protected] Parker, P. spider.org
[email protected] Markle, Meghan royals.gov
[email protected] Mcshady, Seedy criminal.inc
[email protected] Bird, B. sesame.st
Number of Addresses: 8
Unique Domain Names: 7
ANSWER:-
GIVEN:-
import sys

def parse_address(email):
    # Split "local@domain" into the part before and after the "@".
    # Returning (name, domain) is a choice made here; the assignment's
    # exact return requirement is not shown above.
    local, domain = email.split("@", 1)
    if "." in local:
        # "first.last" form -> "Last, First"
        first, last = local.split(".", 1)
        name = f"{last.title()}, {first.title()}"
    else:
        # "ilast" form -> "Last, I."
        name = f"{local[1:].title()}, {local[0].upper()}."
    return name, domain

def main():
    filename = sys.argv[1]  # e.g. emails.txt
    with open(filename) as f:
        # Strip newlines and skip blank lines
        emails = [line.strip() for line in f if line.strip()]
    domains = set()  # a set keeps only the unique domain names
    for email in emails:
        name, domain = parse_address(email)
        domains.add(domain)
        # 35 + 25 + 20 = 80 columns, all left aligned
        print(f"{email:<35}{name:<25}{domain:<20}")
    print(f"Number of Addresses: {len(emails)}")
    print(f"Unique Domain Names: {len(domains)}")

# Only print here; contacts.txt is produced by redirecting the output:
#   python eparser.py emails.txt > contacts.txt
if __name__ == "__main__":
    main()