
MIS780 Advanced Artificial Intelligence for Business

Trimester 2, 2022

Assessment 1 (Individual): Data Analysis and Report

Assignment Objectives

This assignment aims for students to learn how to analyse data relating to a business problem and propose artificial intelligence solutions based on machine learning and data mining techniques. The report will discuss and interpret the results. In particular, students will learn to:

•    Articulate problems and solutions in business terms.

•    Prepare data for different analytics tasks.

•    Develop and justify sentiment analysis and topic models.

•    Assess and report valuable insights to business.

Case Study Description

Social Analytics is a company that specializes in gathering data from social channels and finding meaning in it to support business decisions. Social Analytics is usually hired by marketers to track online conversations about products and companies. In the role of a data analyst working at Social Analytics, you are appointed to analyse a large data set containing conversations between consumers and customer support agents on Twitter, in order to provide insights into modern customer support practices and their impact.

You are provided with a sample of 100,000 tweets from the Customer Support on Twitter data set. The data include the following information:

•    tweet_id: A unique, anonymized ID for the Tweet. Referenced by response_tweet_id and in_response_to_tweet_id.

•    author_id: A unique, anonymized user ID. @s in the dataset have been replaced with their associated anonymized user ID.

•    Inbound: Whether the tweet is "inbound" to a company doing customer support on Twitter. This feature is useful when re-organizing data for training conversational models.

•    created_at: Date and time when the tweet was sent.

•    text: Tweet content. Sensitive information such as phone numbers and email addresses is replaced with mask values like __email__.

•    response_tweet_id: IDs of tweets that are responses to this tweet, comma-separated.

•    in_response_to_tweet_id: ID of the tweet this tweet is in response to, if any.
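
As a rough illustration of how these fields might be turned into typed values, the sketch below parses a tiny, made-up CSV sample using only the Python standard library. The sample rows, the lowercase header names, the timestamp layout in CREATED_AT_FORMAT and the helper name parse_row are all assumptions made for illustration; check them against the file you are actually given.

```python
import csv
import io
from datetime import datetime

# Hypothetical two-row sample laid out with the seven fields described above.
SAMPLE_CSV = """tweet_id,author_id,inbound,created_at,text,response_tweet_id,in_response_to_tweet_id
1,AmazonHelp,False,Tue Oct 31 22:10:47 +0000 2017,@115712 How can we help?,2,3
3,115712,True,Tue Oct 31 22:08:27 +0000 2017,@AmazonHelp my parcel is late,"1,4",
"""

# Assumed timestamp layout; adjust it if the provided file differs.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_row(row):
    """Convert one raw CSV row (a dict of strings) into typed Python values."""
    reply_to = row["in_response_to_tweet_id"].strip()
    return {
        "tweet_id": int(row["tweet_id"]),
        "author_id": row["author_id"],
        "inbound": row["inbound"].strip().lower() == "true",
        "created_at": datetime.strptime(row["created_at"], CREATED_AT_FORMAT),
        "text": row["text"],
        # Comma-separated IDs become a list; an empty field becomes [].
        "response_tweet_id": [
            int(x) for x in row["response_tweet_id"].split(",") if x.strip()
        ],
        # An empty field means the tweet is not a reply to another tweet.
        "in_response_to_tweet_id": int(reply_to) if reply_to else None,
    }


tweets = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

In a notebook the same conversions would more commonly be done with a data-frame library; the standard-library version is used here only so the sketch runs without extra packages.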

Your task is to use Python in a Jupyter Notebook to process and explore the provided data. In particular, you are to generate some insights and provide answers to these questions of interest:

A.   What are the top 10 brands according to the number of tweets? (Hint: tweets generated by both the companies and their customers are considered.)

B.    How many tweets were posted on each day of the week (Mon to Sun) for the top 5 companies? (Hint: generate one figure for each company.)

C.    How many customers requested support from AmazonHelp in the data set?

D.   Among the top 5 companies, which company received the most positive sentiment from customers? (Hint: determine the sentiment of the tweets generated by customers, compute the proportions of positive vs. negative tweets, and compare the proportion of positive tweets across the top 5 companies.)

E.    What topics are frequently mentioned in the tweets from customers of AppleSupport?
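
One possible way to approach the counting in questions (A) to (C) is sketched below on a small, invented data set. It assumes that company handles are non-numeric while anonymized customer IDs are numeric, that a customer tweet counts towards every brand it mentions, and that "customers who requested support" means distinct authors of inbound tweets mentioning the brand. These interpretations, the sample tweets and the helper name brands_of are illustrative assumptions, not part of the brief.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical mini data set: (author_id, inbound, created_at, text).
TWEETS = [
    ("AmazonHelp", False, "2017-10-30 09:00", "@1001 Sorry to hear that."),
    ("1001", True, "2017-10-30 08:55", "@AmazonHelp my order is late"),
    ("1002", True, "2017-10-31 10:15", "@AmazonHelp refund please"),
    ("1001", True, "2017-10-31 11:00", "@AmazonHelp still waiting"),
    ("AppleSupport", False, "2017-11-01 12:00", "@1003 Please DM us."),
    ("1003", True, "2017-11-01 11:30", "@AppleSupport battery drains fast"),
]

# Assumption: brand handles start with a letter; customer IDs are digits only.
BRAND_MENTION = re.compile(r"@([A-Za-z_]\w*)")
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def brands_of(author_id, inbound, text):
    """Return the set of brands a tweet counts towards."""
    if not inbound:
        return {author_id}  # company tweet: the author is the brand
    return set(BRAND_MENTION.findall(text))  # customer tweet: brands mentioned


brand_counts = Counter()        # (A) tweets per brand
by_day = defaultdict(Counter)   # (B) tweets per brand per weekday
customers = defaultdict(set)    # (C) distinct customers per brand

for author, inbound, created, text in TWEETS:
    day = DAYS[datetime.strptime(created, "%Y-%m-%d %H:%M").weekday()]
    for brand in brands_of(author, inbound, text):
        brand_counts[brand] += 1
        by_day[brand][day] += 1
        if inbound:
            customers[brand].add(author)

top_brands = brand_counts.most_common(10)
amazon_customers = len(customers["AmazonHelp"])
```

The per-day counters in by_day hold the values that would be plotted, one figure per company, for question (B).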

Task and Deliverables:

•   Executive Summary: Define your problem in business terms and present your proposed approaches. Present your major findings and explain how they help to address the business problem. Cross-reference other report sections for support.

•    Data Exploration: Process and explore the characteristics of the attributes of the provided data set. Use tables or figures to support your answers to questions (A), (B) and (C).

•    Sentiment Analysis: Use lexicon-based sentiment analysis to answer question (D).

•   Topic Modelling: Use text-processing techniques to process and prepare textual data for topic modelling. Use LDA to explore topics discussed in the tweets. Carry out experiments and demonstrate how an appropriate topic number is determined for your model. Interpret the discovered topics and answer question (E).

•    Practical Implication: Based on the insights discovered in your analysis, provide some recommendations to the businesses on how to better support customers.
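
To make the sentiment and text-preparation steps above more concrete, here is a minimal sketch under stated assumptions. The toy lexicon, the stop-word list, the sample tweets and the function names are all invented for illustration; a real analysis would rely on an established sentiment lexicon and a fuller stop-word list. The LDA fitting itself and the topic-number experiments are not shown, since they depend on the library you choose.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: word -> polarity score.
LEXICON = {"great": 1, "love": 1, "thanks": 1, "good": 1,
           "late": -1, "broken": -1, "worst": -1, "slow": -1}
# Hypothetical, deliberately short stop-word list.
STOPWORDS = {"the", "a", "is", "my", "to", "and", "it", "for", "after"}
TOKEN = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase, drop @mentions and URLs, then split into word tokens."""
    text = re.sub(r"@\w+|https?://\S+", " ", text.lower())
    return TOKEN.findall(text)


def sentiment(text):
    """Label a tweet by the sign of its summed lexicon scores."""
    score = sum(LEXICON.get(tok, 0) for tok in tokenize(text))
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def positive_share(texts):
    """Share of positive tweets among those labelled positive or negative."""
    labels = Counter(sentiment(t) for t in texts)
    decided = labels["positive"] + labels["negative"]
    return labels["positive"] / decided if decided else 0.0


def bag_of_words(text):
    """Token counts after stop-word removal, a typical input for LDA."""
    return Counter(t for t in tokenize(text)
                   if t not in STOPWORDS and len(t) > 2)


# Hypothetical customer tweets grouped by the brand they address.
CUSTOMER_TWEETS = {
    "AmazonHelp": ["@AmazonHelp my parcel is late",
                   "@AmazonHelp thanks, great service",
                   "@AmazonHelp worst delivery ever"],
    "AppleSupport": ["@AppleSupport love the new update",
                     "@AppleSupport battery is slow after update https://t.co/x",
                     "@AppleSupport thanks for the good help"],
}

shares = {brand: positive_share(texts) for brand, texts in CUSTOMER_TWEETS.items()}
best_brand = max(shares, key=shares.get)
```

Comparing the values in shares across the top 5 companies is one way to read question (D); the bag_of_words counts for AppleSupport customer tweets would form the corpus handed to the topic model for question (E).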

Submission Instructions

See CloudDeakin for more information about this assignment, especially the assignment template and the assessment rubric.

The assignment must be prepared using the provided assignment template (.ipynb file) in Jupyter Notebook. Your assignment should contain all necessary code and be ready to run. If you use any new Python package, ensure that you include the installation code in your .ipynb file. All Python code should be ready to execute without any further modification.
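
As one possible form of such installation code, the sketch below installs a package with pip only when its module cannot already be imported. The helper name ensure_package is hypothetical, and the call at the end uses a standard-library module purely so that the example has no side effects; in a notebook, a `%pip install` cell is a common alternative.

```python
import importlib.util
import subprocess
import sys


def ensure_package(module_name, pip_name=None):
    """Install a package with pip only if its module cannot be imported.

    Returns True if an install was attempted, False if the module
    was already available.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    # Use the interpreter running the notebook so the package lands
    # in the same environment.
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
    return True


# "json" ships with Python, so nothing is installed by this call.
installed = ensure_package("json")
```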

Upon completion of the assignment, execute all Python code and then generate a PDF file. Your files should be named firstname_lastname_MIS780A1 (e.g. John_Smith_MIS780A1.pdf and John_Smith_MIS780A1.ipynb).

You are to submit your assignment (both the PDF file and the source .ipynb file) in the individual Assignment Dropbox in the MIS780 CloudDeakin unit site on or before the due date. Do NOT zip the files. Any submission contained in a zip file will not be marked.

Notes

•    You are allowed to use any sample code provided in the lab materials or online resources. However, you must modify/customize such sample code for your own assignment (e.g. rename variables, labels and titles; restructure code flow; modify chart types, colours and symbols). References and citations must be provided where appropriate.

•    Any work you submit may be checked by electronic or other means for the purposes of detecting collusion and/or plagiarism.

•    Feel free to discuss concepts and ideas with peers, but remember that your submission must be your own work. Be careful not to allow others to copy your work. Submissions whose Python code is significantly similar (e.g. mostly identical except for some variable names) are subject to investigation for potential copying. The authors of such submissions may also be asked to present their work to an academic panel if necessary.

•    You must keep a backup copy of every assignment you submit, until the marked assignment has been returned to you. In the unlikely event that one of your assignments is misplaced, you will need to submit your backup copy.

•    When you are required to submit an assignment through your CloudDeakin unit site, you will receive an email to your Deakin email address confirming that it has been submitted. You should check that you can see your assignment in the Submissions view of the Assignment dropbox folder after upload, and check for, and keep, the email receipt of the submission. You are responsible for submitting the correct documents for the correct unit, in the required content or format. Should you wish to correct your submission, you can resubmit with any applicable penalties. You will not be able to submit, resubmit or correct your submission after the 5-day lateness period (or your extension deadline).

•    Penalties for late submission: The following marking penalties will apply if you submit an assessment task after the due date without an approved extension: 5% will be deducted from available marks for each day up to five days, and work that is submitted more than five days after the due date will not be marked. You will receive 0% for the task. 'Day' means working day for paper submissions and calendar day for electronic submissions. The Unit Chair may refuse to accept a late submission where it is unreasonable or impracticable to assess the task after the due date.