You're tasked with performing sentiment analysis on the top articles in a subreddit. If you are unfamiliar with Reddit, here are the API URLs for a few subreddits:
- `worldnews` URL: https://www.reddit.com/r/worldnews/top.json
- `writingprompts` URL: https://www.reddit.com/r/writingprompts/top.json
- `todayilearned` URL: https://www.reddit.com/r/todayilearned/top.json
- `explainlikeimfive` URL: https://www.reddit.com/r/explainlikeimfive/top.json
- `politics` URL: https://www.reddit.com/r/politics/top.json
From these URLs you should be able to figure out the pattern for any subreddit, such as `news` or `ama`, as sketched below.
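For example, the top-listing URL is just the subreddit name spliced into a fixed path; a minimal sketch (the `subreddit` variable is illustrative):
```
subreddit = "news"  # any subreddit name
url = f"https://www.reddit.com/r/{subreddit}/top.json"
```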
Start by getting the Reddit API to work for the URLs above and extracting a list of titles only. You can easily view the JSON in a browser by clicking a link above. However, to use the same URL with Python requests, you'll need to set a custom `User-Agent` in your HTTP headers, as explained in the **Rules** section here: https://github.com/reddit/reddit/wiki/API. Figuring this out is the point of the homework, as the rest is rather trivial. Note: you do NOT need to follow the OAuth2 instructions, as you will access the Reddit API unauthenticated.
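For reference, here is a minimal sketch of such a request (the `User-Agent` string is a placeholder; Reddit's rules ask for a descriptive one):
```
import requests

# placeholder User-Agent string; use something descriptive of your app
headers = {"User-Agent": "my-sentiment-homework/0.1"}
response = requests.get("https://www.reddit.com/r/worldnews/top.json", headers=headers)
stories = response.json()  # the listing parsed into Python dicts/lists
```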
You should perform the analysis on the titles only.
After you get Reddit working, move on to sentiment analysis. Once again, we will use http://text-processing.com/api/sentiment/ like we did in the in-class coding lab. Figuring this out should be trivial.
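The endpoint expects a form-encoded POST with a `text` field (the field name also appears in the sample answer below); a minimal sketch, with the response shape assumed from the lab:
```
import requests

response = requests.post(
    "http://text-processing.com/api/sentiment/",
    data={"text": "some text to score"},  # requests form-encodes the dict
)
print(response.json())  # a dict of sentiment information
```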
We will start by writing the `GetRedditStories` and `GetSentiment` functions, then put it all together.
1.
First let's write a function `GetRedditStories` to get the top news articles from the https://www.reddit.com site.
Inputs: a subreddit as a string, e.g. `news` or `worldnews`
Outputs: the top `stories` as a Python object converted from JSON
Algorithm (Steps in Program):
```
todo write algorithm here
```
Code:
```
import requests

def GetRedditStories(subreddit):
    # todo: write code to return a list of dicts of stories (write code on this line)
    pass

# testing
GetRedditStories('worldnews')  # you should see some stories from r/worldnews
```
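Since the analysis is on titles only, here is a hedged sketch of pulling the titles out of the returned object, assuming the standard Reddit listing layout (`data` -> `children` -> each child's `data['title']`) that the test harness at the end also relies on:
```
stories = GetRedditStories('worldnews')
titles = [child['data']['title'] for child in stories['data']['children']]
print(titles)  # a plain list of title strings
```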
2.
Now let's write a function that, when given `text`, will return the sentiment score for that text. We will use the API from http://text-processing.com for this.
Inputs: `text` string
Outputs: a Python dictionary of sentiment information based on `text`
Algorithm (Steps in Program):
```
todo write algorithm here
```
Code:
```
def GetSentiment(text):
    # todo: write code to return a dict of sentiment for text (write code on this line)
    pass

# testing
GetSentiment("You are a very bad, bad man!")
```
Python language (you only need to write the code for the "todo" parts).
Find the Python 3 code below. Read the comments provided for a better understanding.
```
import requests

#----------------------------------------------------------------------------------
# Function GetRedditStories(subreddit)
#----------------------------------------------------------------------------------
def GetRedditStories(subreddit):
    # build the top-listing url for the given subreddit
    url = "https://www.reddit.com/r/" + subreddit + "/top.json"
    # custom User-Agent header, as required by Reddit's API rules
    headers = {'user-agent': 'my-app/1.0.0'}
    return requests.get(url, headers=headers).json()

# testing
print(GetRedditStories('worldnews'))  # you should see some stories from r/worldnews

#----------------------------------------------------------------------------------
# Function GetSentiment(text)
#----------------------------------------------------------------------------------
def GetSentiment(text):
    # api url
    url = "http://text-processing.com/api/sentiment/"
    # post the text as a form field; requests form-encodes the dict,
    # which is safer than hand-building a "text=..." string
    data = {'text': text}
    return requests.post(url, data=data).json()

# testing
print(GetSentiment("You are a very bad, bad man!"))

#----------------------------------------------------------------------------------
# Test the functions
#----------------------------------------------------------------------------------
subreddit = input("Enter subreddit: ")
for article in GetRedditStories(subreddit)['data']['children']:
    # to print only the label, uncomment the line below
    # print(GetSentiment(article['data']['title'])['label'])
    # print the full sentiment dict
    print(GetSentiment(article['data']['title']))
```
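If Reddit rate-limits the request or returns an error page, `.json()` will raise a decoding error. A slightly more defensive variant, sketched here as an optional extra (not part of the graded answer; `GetRedditStoriesSafe` is a hypothetical name):
```
def GetRedditStoriesSafe(subreddit):
    # hypothetical defensive wrapper, not required by the assignment
    url = "https://www.reddit.com/r/" + subreddit + "/top.json"
    headers = {'user-agent': 'my-app/1.0.0'}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # surface HTTP errors (e.g. 429 rate limiting) early
    return response.json()
```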