A1. Customer Problem Identification – Individual Report [CLO1, CLO2, CLO5]


Due: Thursday (June 15) 3 PM / Weighting: 15% / Length: Max 900 words


This assessment provides the opportunity to identify customer problems from product review data and generate new product or service ideas.

After customers purchase and use products or services, they share their experiences on online product review platforms. Many companies try to identify customer problems and unmet needs from large-scale product review data by using natural language processing techniques such as topic modelling and sentiment analysis. Recent AI methods can categorize text data automatically by performing topic modelling and labelling together, although they are not yet perfect. Your task is to identify customer problems using both an AI machine and human manual labelling, and to generate new product or service ideas.


A1-1000-split-sentences.csv contains 1,000 sentences (rows) that customers wrote about their product or shopping experience. Follow the steps below to complete your task:

1.    Product Review Categorization

Categorize the product review sentences into about 20 to 30 categories using AI Machine categorization (e.g., IBM’s Intent Recommendation) followed by Human categorization. Candidate groups are products (e.g., skincare), product attributes (e.g., longevity), different stages of the customer journey (e.g., online shopping, browsing, delivery), and so on. (a) What is the number of categories labelled by the AI Machine? (b) What is the number of review sentences selected by the AI Machine for each category and across all the categories? (c) How many sentences did you change to different categories? (d) What is the number of final categories labelled by a human? (e) In doing human categorization, why did you change the initial categories the AI Machine gave? In other words, what kind of categories did you try to make, and why?
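The counts asked for in (a) to (d) can be tallied directly from the labelled data. The following is only a minimal sketch on toy rows, not the method taught in class: the keys `ai_category` and `human_category` are hypothetical column names, the sentences are made up, and treating a sentence the AI Machine left unlabelled as "changed" once a human assigns it a category is an assumption you should state in your report if you adopt it.

```python
from collections import Counter

# Toy stand-in for the labelled data. "ai_category" and "human_category"
# are hypothetical column names; None means the AI Machine did not select
# the sentence for any category.
rows = [
    {"sentence": "Parcel arrived two days late.", "ai_category": "delivery", "human_category": "delivery"},
    {"sentence": "The box was crushed.", "ai_category": "delivery", "human_category": "packaging"},
    {"sentence": "Too expensive for the size.", "ai_category": "price", "human_category": "price"},
    {"sentence": "Nobody answered my email.", "ai_category": None, "human_category": "customer service"},
    {"sentence": "The cream feels light on my skin.", "ai_category": "skincare", "human_category": "skincare"},
    {"sentence": "Great value during the sale.", "ai_category": "price", "human_category": "price"},
]

# (b) sentences selected by the AI Machine, per category
ai_counts = Counter(r["ai_category"] for r in rows if r["ai_category"] is not None)
human_counts = Counter(r["human_category"] for r in rows)

n_ai_categories = len(ai_counts)         # (a) categories labelled by the AI Machine
n_ai_selected = sum(ai_counts.values())  # (b) sentences selected across all categories
# (c) sentences whose final label differs from the AI label
# (assumption: None -> some category also counts as a change)
n_changed = sum(r["ai_category"] != r["human_category"] for r in rows)
n_human_categories = len(human_counts)   # (d) final categories after human labelling

print(ai_counts.most_common())
print(n_ai_categories, n_ai_selected, n_changed, n_human_categories)
```

With real data, the rows would come from your own spreadsheet rather than an inline list; the counting logic stays the same.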

2.    Importance-Satisfaction Plot

Using the final categories in task 1, make an Importance-Satisfaction plot using the Python code for counting frequencies and measuring sentiment, which you learned during the lecture and tutorial. For interpretation, divide the plot into four quadrants: Right-Bottom, Left-Bottom, Right-Top, and Left-Top. (a) What does each quadrant mean? (b) What categories are located in each quadrant?
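As a rough illustration of the plot's inputs, the sketch below computes a frequency and a mean sentiment score per category and assigns each category to a quadrant. It is not the lecture code and rests on several assumptions: importance is taken as mention frequency on the horizontal axis, satisfaction as mean sentiment on the vertical axis, the quadrant boundaries are the medians across categories, and the sentiment scores are made-up values in [-1, 1] standing in for whatever sentiment tool was used in class.

```python
from collections import defaultdict
from statistics import mean, median

# Toy (final category, sentiment score) pairs. Scores are invented for
# illustration; in practice they come from the sentiment tool used in class.
labelled = [
    ("delivery", -0.6), ("delivery", -0.4), ("delivery", -0.2), ("delivery", -0.4),
    ("price", 0.5), ("price", 0.3), ("price", 0.7),
    ("packaging", -0.5),
    ("skincare", 0.8), ("skincare", 0.6),
]

# Group sentiment scores by category.
scores = defaultdict(list)
for category, score in labelled:
    scores[category].append(score)

# One point per category: (frequency, mean sentiment).
points = {c: (len(v), mean(v)) for c, v in scores.items()}

# Assumed quadrant boundaries: medians across categories.
freq_cut = median(freq for freq, _ in points.values())
sat_cut = median(sat for _, sat in points.values())


def quadrant(freq, sat):
    """Name the quadrant of one point; ties fall on the Right/Top side."""
    horizontal = "Right" if freq >= freq_cut else "Left"
    vertical = "Top" if sat >= sat_cut else "Bottom"
    return f"{horizontal}-{vertical}"


quadrants = {c: quadrant(freq, sat) for c, (freq, sat) in points.items()}
for category, name in sorted(quadrants.items()):
    print(category, points[category], name)
```

The `points` dictionary holds the coordinates for a scatter plot, and `freq_cut` and `sat_cut` give the positions of the two dividing lines. Using means instead of medians as boundaries is an equally defensible choice; whichever you use, state it in the report.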

3.    Customer Problem Identification & New Product or Service Idea Generation

Based on the Importance-Satisfaction plot, make recommendations for the company. (a) What is the company doing relatively well? (b) What are the primary customer problems? (c) What does the company need to improve, with high and low priority? (d) To address the identified customer problems, suggest new product or service ideas.

In completing this task, apply appropriate data analytics and consider the concepts introduced in class. Ensure that your discussion is logical, clearly structured, and professionally presented. Your report should not exceed the word limit, excluding the title page, relevant images, tables or charts.

The title page (1 page) includes (1) the title of your report, (2) the word count, (3) the course name, tutorial session and group, and tutor’s name, and (4) your first and last name and zID.

Submission instructions

A.   Submit your report to Turnitin via Moodle.

-  The .doc file contains your report. File name: “Tutorial session_Group_your first and last name & zID_A1.doc” (e.g., T9_1_Junbum Kwon_zXXXXX_A1.doc)

B.   Submit other supporting files (data, and code) to Moodle submission folder.

1)    .xlsx file contains data including the AI Machine & Human categorization columns.

2)    .ipynb contains all relevant code to get the results in your report.

●   1 mark will be deducted for each missing file among (1) and (2) above.

Marking Criteria

Your assignment will be marked based on the following marking criteria:

1.   Analysis: Quality of analysis - categorization and plotting

2.    Interpretation & Recommendations: Quality of interpretation and new product idea

3.   Written Presentation: Quality of written report

For further information, see the marking rubric below.

Marking Rubric for Assessment 1: Customer Problem Identification Individual Report







High Distinction



Analysis of

Sufficient analysis of the

Proper analysis of the

Effective and proper analysis of

Highly effective and proper analysis

advertising image

data analytics

does not meet the required standard.

which categorizes product reviews by AI Machine and Human. Attempts Importance-Satisfaction plot.

which properly categorizes product reviews by AI Machine and Human. Makes Importance-Satisfaction plot.

properly categorizes product reviews by AI Machine and Human. Accurately makes Importance-Satisfaction plot.

properly categorizes product reviews by AI Machine and Human. Highly accurately makes Importance-Satisfaction plot.

Interpretation &


Interpretation of

Sufficient interpretation

Data is mostly accurately

Data is accurately interpreted

Data is accurately and meaningfully