Question

For a given dataset in a CSV file, answer the following using Unix commands only:
3) The 2nd column is the unique identifier for a Facebook post. What are the other
columns?
4) How many Facebook posts are there in the file?
5) What is the date range for Facebook posts in this file? (Assume that the data is
in order)
6) How many unique pages are there?
7) How many unique posts are there? [Hint: one page can have multiple posts]
8) When was the first mention in the file regarding “Italian Dishes” and what was
the post?
9) How many times is “Barack Obama” mentioned in the file? How did you find
this? (Do not ignore the case)
10) What about “Donald Trump”? Who is more popular on Facebook, Obama or
Trump? (Do not ignore the case)

The columns of the dataset are given below. Filename: xyz.csv

page_name post_id page_id post_name message description caption post_type status_type likes_count comments_count shares_count love_count wow_count haha_count sad_count thankful_count angry_count post_link picture posted_at

Homework Answers

Answer #1
3) To print all column names of the csv file, we can use:
awk 'BEGIN{ FS="," } { for(fn=1;fn<=NF;fn++) {print fn" = "$fn;}; exit; }' xyz.csv
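An alternative way to list the header with column numbers is sketched below. The sample file is a hypothetical stand-in for xyz.csv with only four of its columns, and the sketch assumes the header contains no quoted commas.

```shell
# Hypothetical sample file standing in for xyz.csv (header plus two rows).
csv=$(mktemp)
cat > "$csv" <<'EOF'
page_name,post_id,page_id,posted_at
PageA,101,1,2016-01-01
PageB,102,2,2016-01-02
EOF

# Take the header line, put each column name on its own line, and number the lines.
columns=$(head -n 1 "$csv" | tr ',' '\n' | nl)
echo "$columns"
```

The number printed next to each name is the value to pass to cut -f when selecting that column.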

4) To count the number of Facebook posts, count the data rows, that is, every line except the header (this assumes one post per line):

awk 'NR>1' xyz.csv | wc -l
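A post count can mean either the number of data rows or the number of distinct post_id values (column 2). The sketch below computes both on a hypothetical sample file in which one post_id appears twice. It assumes no field contains an embedded comma or line break.

```shell
# Hypothetical sample file standing in for xyz.csv; post_id 102 appears twice.
csv=$(mktemp)
cat > "$csv" <<'EOF'
page_name,post_id,page_id,posted_at
PageA,101,1,2016-01-01
PageA,102,1,2016-01-02
PageB,102,2,2016-01-03
EOF

# Every line after the header is one post record.
total_posts=$(tail -n +2 "$csv" | wc -l | tr -d ' ')

# Distinct values in column 2 (post_id); sort -u sorts and drops duplicates.
unique_posts=$(tail -n +2 "$csv" | cut -d, -f2 | sort -u | wc -l | tr -d ' ')

echo "rows: $total_posts, unique post_id values: $unique_posts"
```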

5) To find the date range, you need the number of the date column. In the column list above, posted_at is column 21. Assuming no earlier field contains an embedded comma:

Start date can be fetched by:

  cut -f21 -d, xyz.csv | head -2 | tail -1

  End date can be fetched by:

  cut -f21 -d, xyz.csv | tail -1
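Because posted_at is the last column in the list, both ends of the range can also be read in one pass with awk's $NF (the last field), which avoids counting columns. This is a minimal sketch on a hypothetical sample file; it assumes the rows are in date order, as the question states, and that the date itself contains no comma.

```shell
# Hypothetical sample file standing in for xyz.csv, with posted_at as the last column.
csv=$(mktemp)
cat > "$csv" <<'EOF'
page_name,post_id,page_id,posted_at
PageA,101,1,2016-01-01
PageA,102,1,2016-01-02
PageB,103,2,2016-01-03
EOF

# NR==2 is the first data row; "last" is overwritten on every data row,
# so after the final row it holds the last date in the file.
range=$(awk -F, 'NR==2 { first=$NF } NR>1 { last=$NF } END { print first " to " last }' "$csv")
echo "$range"
```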

6) To count the number of unique pages, using page_id (column 3):

  awk 'NR>1' xyz.csv | cut -f3 -d, | sort | uniq | wc -l
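To see how posts are distributed across pages as well as how many pages there are, uniq -c can be used in place of a plain uniq. The sketch below runs on a hypothetical sample file and assumes page_id is column 3 with no embedded commas in earlier fields.

```shell
# Hypothetical sample file standing in for xyz.csv; page_id 1 has two posts.
csv=$(mktemp)
cat > "$csv" <<'EOF'
page_name,post_id,page_id,posted_at
PageA,101,1,2016-01-01
PageA,102,1,2016-01-02
PageB,103,2,2016-01-03
EOF

# Number of distinct page_id values.
unique_pages=$(tail -n +2 "$csv" | cut -d, -f3 | sort -u | wc -l | tr -d ' ')

# Posts per page_id: uniq -c prefixes each value with its count,
# and sort -rn puts the page with the most posts first.
per_page=$(tail -n +2 "$csv" | cut -d, -f3 | sort | uniq -c | sort -rn)

echo "unique pages: $unique_pages"
echo "$per_page"
```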
