Question

  1. The following training dataset is the “reading email dataset”.

This dataset has four features: author, thread, length, and where to read the mail. From these features, the algorithm has to predict the user’s action: whether to read or skip the mail.

Use a Naïve Bayes classifier to predict the user’s action (skips or reads) when the author of the mail is known, the thread of the mail is a follow up, the length of the mail is short, and the mail is read at home.

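| Author  | Thread    | Length | Where to read | User’s Action |
|---------|-----------|--------|---------------|---------------|
| Known   | New       | Long   | Home          | Skips         |
| Unknown | New       | Short  | Work          | Reads         |
| Unknown | Follow up | Long   | Work          | Skips         |
| Known   | Follow up | Long   | Home          | Skips         |
| Known   | New       | Short  | Home          | Reads         |
| Known   | Follow up | Long   | Work          | Skips         |
| Unknown | New       | Short  | Work          | Skips         |
| Unknown | New       | Short  | Work          | Reads         |
| Known   | Follow up | Long   | Home          | Skips         |
| Known   | New       | Long   | Work          | Skips         |
| Unknown | Follow up | Short  | Home          | Skips         |
| Known   | New       | Long   | Work          | Skips         |
| Known   | Follow up | Short  | Home          | Reads         |
| Known   | New       | Short  | Work          | Reads         |
| Known   | New       | Short  | Home          | Reads         |
| Known   | Follow up | Short  | Work          | Reads         |
| Known   | New       | Short  | Home          | Reads         |
| Unknown | New       | Short  | Work          | Reads         |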

  1. Write Python code that implements a naïve Bayes classifier to predict the user’s action (skips or reads) when the author of the mail is known, the thread of the mail is a follow up, the length of the mail is short, and the mail is read at home. (Do not use Scikit-Learn.)
  2. Use Scikit-Learn to predict the user’s action (skips or reads) for the same mail: known author, follow-up thread, short length, read at home.

Hint: in the author feature you can use 0 and 1 instead of unknown and known. In the thread feature you can use 0 and 1 instead of follow up and new. In the length feature you can use 0 and 1 instead of short and long. In the where-to-read feature you can use 0 and 1 instead of home and work. In the target you can use 0 instead of skips and 1 instead of reads.

Two different programs are required: program 1 does not use scikit-learn and program 2 does.
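
The 0/1 encoding suggested in the hint can be written out explicitly. The following is a minimal sketch, not part of the original question; the names ENCODING, TARGET and encode are made up for illustration.

```python
# Encoding taken from the hint; dictionary and function names are illustrative.
ENCODING = {
    "author": {"unknown": 0, "known": 1},
    "thread": {"follow up": 0, "new": 1},
    "length": {"short": 0, "long": 1},
    "where": {"home": 0, "work": 1},
}
TARGET = {"skips": 0, "reads": 1}


def encode(author, thread, length, where):
    """Map one email's feature values to a 0/1 vector, ignoring case."""
    values = (author, thread, length, where)
    # Dictionaries keep insertion order, so features are encoded in table order.
    return [ENCODING[name][value.strip().lower()]
            for name, value in zip(ENCODING, values)]


# The mail to classify: known author, follow-up thread, short, read at home.
query = encode("Known", "Follow up", "Short", "Home")
print(query)  # [1, 0, 0, 0]
```

With this mapping the query becomes [1, 0, 0, 0]. A different but equally valid coding (for example one assigned automatically by a library) gives a different vector for the same mail, so the query must always be encoded with the same mapping as the training rows.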

Homework Answers

Answer #1

Without scikit-learn

import numpy as np
import pandas as pd

def accuracy_score(pre, y):
    return 1 - sum(1.0 * (pre - y)**2)/len(y)


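class GaussianNB:
    """Gaussian naive Bayes: each feature is modelled as a normal distribution per class."""

    def fit(self, X, y, epsilon=1e-10):
        # Class labels and how often each occurs in the training data.
        self.y_classes, y_counts = np.unique(y, return_counts=True)
        # Class priors P(y).
        self.phi_y = y_counts / y_counts.sum()
        # Per-class feature means and variances; epsilon avoids division by
        # zero when a feature is constant within a class.
        self.u = np.array([X[y == k].mean(axis=0) for k in self.y_classes])
        self.var_x = np.array([X[y == k].var(axis=0) + epsilon for k in self.y_classes])
        return self

    def predict(self, X):
        # Classify every row of X.
        return np.apply_along_axis(self.compute_probs, 1, np.asarray(X))

    def compute_probs(self, x):
        # Score each class and return the label with the highest score.
        probs = np.array([self.compute_prob(x, y) for y in range(len(self.y_classes))])
        return self.y_classes[np.argmax(probs)]

    def compute_prob(self, x, y):
        # Prior P(y) times the product of the per-feature Gaussian densities.
        c = 1.0 / np.sqrt(2.0 * np.pi * self.var_x[y])
        likelihood = np.prod(c * np.exp(-np.square(x - self.u[y]) / (2.0 * self.var_x[y])))
        return self.phi_y[y] * likelihood

    def evaluate(self, X, y):
        # Fraction of rows whose predicted label matches y.
        return (self.predict(X) == y).mean()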


# Load the dataset (the CSV path is specific to the author's machine).
pop = pd.read_csv('./Downloads/convertcsv.csv', dtype='category')
# Normalise case so that "Known" and "known" count as the same value.
pop = pop.apply(lambda x: x.astype(str).str.lower())
pop.columns = pop.columns.str.replace(' ', '')

# pd.factorize assigns integer codes in order of first appearance, so the
# codes depend on row order. If the CSV follows the table in the question,
# the first row (known, new, long, home, skips) maps to 0 in every column.
for col in ['Author', 'Thread', 'Length', 'Wheretoread', 'UsersAction']:
    pop[col] = pd.factorize(pop[col])[0]

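# Naive Bayes

X = pop[['Author', 'Thread', 'Length', 'Wheretoread']].to_numpy()
print(X)  # inspect the encoded feature matrix
Y = pop['UsersAction'].to_numpy()

clf = GaussianNB().fit(X, Y)
# Accuracy on the training data itself.
print(clf.evaluate(X, Y))

# Query: known author, follow-up thread, short mail, read at home. This is
# [0, 1, 1, 0] when codes follow the order of first appearance in the table
# (known=0, follow up=1, short=1, home=0); a predicted 0 then means skips
# and 1 means reads.
Xtest = np.array([[0, 1, 1, 0]])

print(clf.predict(Xtest))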
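
The Gaussian model above treats the 0/1 codes as continuous values. Since every feature in this dataset is categorical, a count-based naive Bayes is an alternative worth comparing. The following is a minimal sketch under that assumption, not the original answer's method. It uses only the standard library, embeds the 18 rows from the question, and applies no smoothing; the names ROWS, train and posterior are illustrative.

```python
from collections import Counter, defaultdict

# Training rows from the question: (author, thread, length, where, action).
ROWS = [
    ("known", "new", "long", "home", "skips"),
    ("unknown", "new", "short", "work", "reads"),
    ("unknown", "follow up", "long", "work", "skips"),
    ("known", "follow up", "long", "home", "skips"),
    ("known", "new", "short", "home", "reads"),
    ("known", "follow up", "long", "work", "skips"),
    ("unknown", "new", "short", "work", "skips"),
    ("unknown", "new", "short", "work", "reads"),
    ("known", "follow up", "long", "home", "skips"),
    ("known", "new", "long", "work", "skips"),
    ("unknown", "follow up", "short", "home", "skips"),
    ("known", "new", "long", "work", "skips"),
    ("known", "follow up", "short", "home", "reads"),
    ("known", "new", "short", "work", "reads"),
    ("known", "new", "short", "home", "reads"),
    ("known", "follow up", "short", "work", "reads"),
    ("known", "new", "short", "home", "reads"),
    ("unknown", "new", "short", "work", "reads"),
]


def train(rows):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    for *features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def posterior(query, class_counts, value_counts):
    """Return normalised P(class | query) from unsmoothed relative frequencies."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(query):
            score *= value_counts[(i, label)][value] / n  # P(value | class)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


class_counts, value_counts = train(ROWS)
probs = posterior(("known", "follow up", "short", "home"), class_counts, value_counts)
prediction = max(probs, key=probs.get)
print(prediction, probs)
```

On these rows both classes have prior 9/18. The unnormalised scores are 0.5 × (6/9)(2/9)(9/9)(4/9) for reads and 0.5 × (6/9)(5/9)(2/9)(4/9) for skips, so the sketch predicts reads with a posterior of 9/14, about 0.64. Without smoothing, any feature value never seen with a class drives that class's score to zero, which is why Laplace smoothing is commonly added on larger problems.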

With scikit-learn

import numpy as np
import pandas as pd

# Load the dataset (the CSV path is specific to the author's machine).
pop = pd.read_csv('./Downloads/convertcsv.csv', dtype='category')
# Normalise case so that "Known" and "known" count as the same value.
pop = pop.apply(lambda x: x.astype(str).str.lower())
pop.columns = pop.columns.str.replace(' ', '')

# pd.factorize assigns integer codes in order of first appearance, so the
# codes depend on row order. If the CSV follows the table in the question,
# the first row (known, new, long, home, skips) maps to 0 in every column.
for col in ['Author', 'Thread', 'Length', 'Wheretoread', 'UsersAction']:
    pop[col] = pd.factorize(pop[col])[0]

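# Naive Bayes

X = pop[['Author', 'Thread', 'Length', 'Wheretoread']].to_numpy()
print(X)  # inspect the encoded feature matrix
Y = pop['UsersAction'].to_numpy()

from sklearn.naive_bayes import GaussianNB

clf = GaussianNB().fit(X, Y)

# Query: known author, follow-up thread, short mail, read at home. This is
# [0, 1, 1, 0] when codes follow the order of first appearance in the table
# (known=0, follow up=1, short=1, home=0); a predicted 0 then means skips
# and 1 means reads.
Xtest = [[0, 1, 1, 0]]

print(clf.predict(Xtest))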
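
scikit-learn also provides CategoricalNB, which models each feature as a discrete distribution rather than a Gaussian and may suit this dataset better. The following is a minimal sketch, not the original answer's method. It assumes scikit-learn 0.22 or later, and it embeds the 18 rows already encoded with the hint's mapping instead of reading the CSV.

```python
from sklearn.naive_bayes import CategoricalNB

# Rows from the question, encoded with the hint's mapping:
# author (unknown=0, known=1), thread (follow up=0, new=1),
# length (short=0, long=1), where to read (home=0, work=1).
X = [
    [1, 1, 1, 0], [0, 1, 0, 1], [0, 0, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0],
    [1, 0, 1, 1], [0, 1, 0, 1], [0, 1, 0, 1], [1, 0, 1, 0], [1, 1, 1, 1],
    [0, 0, 0, 0], [1, 1, 1, 1], [1, 0, 0, 0], [1, 1, 0, 1], [1, 1, 0, 0],
    [1, 0, 0, 1], [1, 1, 0, 0], [0, 1, 0, 1],
]
# Target: skips=0, reads=1.
y = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

# alpha=1.0 is Laplace smoothing (the library default).
clf = CategoricalNB(alpha=1.0).fit(X, y)

# Query: known author, follow-up thread, short mail, read at home.
query = [[1, 0, 0, 0]]
pred = clf.predict(query)
proba = clf.predict_proba(query)
print(pred, proba)
```

With this encoding a prediction of 1 corresponds to reads. Note that the query vector here is [1, 0, 0, 0], not [0, 1, 1, 0], because the hint's mapping differs from the order-of-appearance codes produced by pd.factorize.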