A2. Industry Benchmarking


A2. Industry Benchmarking Group Report [CLO1, CLO2, CLO3, CLO4]

Week 7 Friday (March 31) 5 PM / Weighting: 30% / Length: Max 3,000 Words


This assessment provides you with the opportunity to apply the concepts learned in class to develop and communicate an appropriate image advertising strategy.

Your team is working for an advertising agency. One client has just launched its business in industry X. The client requires information on what effective image advertising content strategies look like in that industry.


You will collaborate with your peers in groups of 3-5 students to complete this task. You can choose any industry from the given candidate industries. Then choose N companies (N = the number of your group members). Explain why you chose those companies by citing sources (e.g., top 5 companies on the Fortune 500 list, top 5 most active companies on social media). Each company should have at least 1,000 posts on its Instagram account. Present the companies in a table with the following columns: (1) the Instagram account, (2) the Instagram web link, (3) the total number of posts, and (4) the number of analysed posts. Note that you will analyse only single-image posts, excluding posts with videos or multiple images (i.e., sidecar).
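The single-image filter described above could be sketched as follows. The field names (`type`, `like_count`) and the type labels (`Image`, `Video`, `Sidecar`) are assumptions about how a scraped dataset might describe posts; adjust them to match your own data.

```python
# Hypothetical post records; field names and type labels are assumptions.
posts = [
    {"id": "p1", "type": "Image", "like_count": 120},
    {"id": "p2", "type": "Video", "like_count": 300},
    {"id": "p3", "type": "Sidecar", "like_count": 80},
    {"id": "p4", "type": "Image", "like_count": 45},
]


def keep_single_image(records):
    """Return only the posts that consist of exactly one image."""
    return [r for r in records if r.get("type") == "Image"]


single_image_posts = keep_single_image(posts)

# Number of collected posts vs number of posts kept for analysis.
collected_count = len(posts)
analysed_count = len(single_image_posts)
print(collected_count, analysed_count)
```

The analysed count feeds column (4) of the company table; column (3) would normally come from the account itself rather than from the collected sample.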

Follow the steps below to complete your task:

1) Quick-Hypothesis: Consulting companies often need to present quick hypotheses to clients. Find and report the top 10 posts vs the worst 10 posts by like count for each company. Then make your N hypotheses by critically discussing the reasons, with supporting evidence from those top and worst posts. Hypotheses can be about either a main effect or a moderator.
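As a rough sketch of the ranking in step 1, the following picks the 10 most-liked and 10 least-liked posts for each account with pandas. The column names (`account`, `post_id`, `like_count`) and the generated data are placeholders, not a required schema.

```python
import pandas as pd

# Placeholder data: two hypothetical accounts with 25 posts each.
df = pd.DataFrame({
    "account": ["brand_a"] * 25 + ["brand_b"] * 25,
    "post_id": range(50),
    "like_count": list(range(100, 125)) + list(range(500, 525)),
})


def top_and_worst(posts, n=10):
    """Return the n most-liked and n least-liked posts for each account."""
    ranked = posts.sort_values("like_count", ascending=False)
    top = ranked.groupby("account").head(n).assign(group="top")
    worst = ranked.groupby("account").tail(n).assign(group="worst")
    return pd.concat([top, worst], ignore_index=True)


extremes = top_and_worst(df)
print(extremes.groupby(["account", "group"])["like_count"].agg(["min", "max"]))
```

Viewing the two groups side by side for each account makes it easier to spot visual differences worth turning into a hypothesis.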

2) Data-Analytics-Hypothesis: In the tutorial, you learned how to make a variety of visual and textual features (e.g., color, composition, visual objects, face emotion, text within an image or in the caption). Use all the visual and textual features within an image (not in the caption). Run regressions with Y = log(like count + 1) using (a) each account’s data and (b) the combined data, carefully addressing multicollinearity issues. Report each regression result in a table (columns: X variables, coefficients, p-value, and VIF), including only the variables whose p-value is less than 0.1. Considering all the regression results, make your N hypotheses by critically discussing the reasons, with supporting evidence from your data-analytics results.

3) Final Hypothesis: Make your group’s final N hypotheses by considering the above hypotheses, visual patterns, and industry and academic articles. Note that you need to construct dictionaries for at least two hypotheses. Make an XY plot for each candidate hypothesis (X = the variable corresponding to your hypothesis, Y = log(like count + 1)). This visual pattern may help you choose your final hypotheses among the candidates. Specifically, discuss why you keep, drop, or add hypotheses in your final set compared to your quick- or data-analytics-hypothesis set. Explain each final hypothesis logically (i.e., why does your X increase Y?) by citing related theories and/or other supporting evidence.
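One way to read "construct dictionaries" is as keyword lists that map detected image labels or OCR words to a hypothesis variable. That reading, along with the category names and keywords below, is an assumption for illustration; follow the tutorial's definition if it differs.

```python
# Hypothetical dictionaries: each hypothesis variable maps to a keyword set.
DICTIONARIES = {
    "nature": {"tree", "plant", "sky", "mountain", "beach"},
    "discount": {"sale", "off", "discount", "deal", "free"},
}


def dictionary_features(tokens, dictionaries=DICTIONARIES):
    """Return a 0/1 indicator per dictionary: 1 if any keyword is in tokens."""
    lowered = {t.lower() for t in tokens}
    return {name: int(bool(lowered & words)) for name, words in dictionaries.items()}


# Tokens could be image labels or OCR words (assumed already extracted).
example = dictionary_features(["Sky", "Person", "SALE"])
print(example)
```

Each indicator can then serve as the X variable for one hypothesis, both in the XY plot and in the regression.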

4) Hypotheses Test:

(a) State your regression model with the definition of each variable and coefficient. X variables include the variables from your hypotheses and the following control variables: (1) text length in the caption, (2) text sentiment in the caption, (3) OCR text length within an image, (4) OCR text sentiment within an image, and (5) posting time dummies: Year, Month-of-Year (January, …, December), Day-of-Week (Monday, ..., Sunday), Time-of-day (Morning, Afternoon, Evening, Night). Run regressions using (1) each account’s data and (2) the combined data. Report each regression result in a table. Also, make a summary table that includes only the X variables and their coefficients from all the above regression results; its columns are each account and the combined data. Highlight significant coefficients in colour. Interpret the regression results. For the combined data only, (1) report summary statistics (count/frequency, mean, median, minimum, maximum) in a table for each X and Y variable, (2) make a correlation matrix among all the main and control variables, including Y but excluding the time and account dummies, and (3) state which hypotheses are supported. For any that are not, discuss why.
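The posting-time dummies, summary statistics and correlation matrix in (a) could be built with pandas roughly as follows. The column names, the example rows and the time-of-day cut points (Night 0-5, Morning 6-11, Afternoon 12-17, Evening 18-23) are assumptions; the brief does not define the boundaries.

```python
import numpy as np
import pandas as pd


def add_time_dummies(df, ts_col="timestamp"):
    """Append Year, Month, Day-of-Week and Time-of-day dummy columns."""
    ts = pd.to_datetime(df[ts_col])
    # Assumed cut points by posting hour; change them to suit your definition.
    time_of_day = pd.cut(
        ts.dt.hour,
        bins=[-1, 5, 11, 17, 23],
        labels=["Night", "Morning", "Afternoon", "Evening"],
    )
    parts = pd.DataFrame({
        "year": ts.dt.year.astype(str),
        "month": ts.dt.month_name(),
        "dow": ts.dt.day_name(),
        "tod": time_of_day.astype(str),
    }, index=df.index)
    # drop_first leaves out one level per group so that the dummies are not
    # perfectly collinear with the intercept.
    dummies = pd.get_dummies(parts, drop_first=True, dtype=int)
    return pd.concat([df, dummies], axis=1)


def summary_table(df, cols):
    """Count, mean, median, minimum and maximum for each listed column."""
    return df[cols].agg(["count", "mean", "median", "min", "max"]).T


# Placeholder rows for illustration only.
posts = pd.DataFrame({
    "timestamp": ["2023-01-02 08:30", "2023-01-07 14:00",
                  "2023-02-10 21:15", "2022-12-25 02:00"],
    "like_count": [120, 45, 300, 10],
    "caption_length": [80, 20, 150, 5],
})
posts["log_likes"] = np.log1p(posts["like_count"])
posts = add_time_dummies(posts)

stats = summary_table(posts, ["log_likes", "caption_length"])
corr = posts[["log_likes", "caption_length"]].corr()
print(stats)
print(corr)
```

Passing only the main and control columns to `corr()` keeps the time and account dummies out of the correlation matrix, as the brief requires.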

(b) Robustness test: Repeat the above analysis with Y = log(comment count + 1). Do not produce the summary statistics and correlation matrix again. Interpret the results and discuss which results in (a) are robust.
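A simple way to compare the two sets of results is sketched below. It treats a variable as robust when it is significant in both models with the same coefficient sign; this working definition and the 0.1 cutoff are assumptions, and the numbers are made-up placeholders rather than real results.

```python
import pandas as pd


def robustness_check(likes, comments, alpha=0.1):
    """Compare two coefficient tables with 'coef' and 'p_value' columns."""
    merged = likes.join(comments, lsuffix="_likes", rsuffix="_comments", how="inner")
    significant = (merged["p_value_likes"] < alpha) & (merged["p_value_comments"] < alpha)
    same_sign = (merged["coef_likes"] * merged["coef_comments"]) > 0
    # Assumed definition: significant in both models and same direction.
    merged["robust"] = significant & same_sign
    return merged


# Placeholder coefficient tables for illustration only.
variables = ["has_face", "text_overlay", "brightness"]
likes = pd.DataFrame(
    {"coef": [0.30, -0.12, 0.05], "p_value": [0.001, 0.03, 0.40]}, index=variables
)
comments = pd.DataFrame(
    {"coef": [0.22, 0.10, 0.02], "p_value": [0.01, 0.04, 0.70]}, index=variables
)

result = robustness_check(likes, comments)
print(result[["coef_likes", "coef_comments", "robust"]])
```

A variable that is significant in both models but flips sign, like the placeholder `text_overlay` here, would not count as robust under this definition and deserves discussion in the report.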

5) Conclusion: (a) Deliver your findings to your client by recommending effective image advertising content strategies with supporting evidence.

(b) Also, as one way of communicating your findings, generate TWO new advertising prototypes based on your recommendations to increase viewer engagement. Advertising agencies often give two versions of ad prototypes to their clients. It does not have to be a fancy ad. For example, if green turns out to be the best colour, you can use green in your prototype. Likewise, if other objects or text have significant effects, you can add those objects or text to the prototype. There are many online resources that you can use as benchmarks. Among them, Canva has many templates.



In completing this task, apply appropriate data analytics and consider the concepts introduced in class. Your report should not exceed the word limit, excluding the title page, relevant images, tables or charts.

Title page (1 page) includes (1) The title of your report, (2) The word count, (3) An executive summary (One paragraph) of your report, (4) the course name, tutorial session and group, tutor’s name, (5) Each team member’s first and last name & zID

Reference: Cite academic papers, newspaper articles, blogs, or industry reports using Endnote. Use APA (American Psychological Association) style in-text citations and a reference list at the end.


Format: Use a Word file (.doc), 12pt font, 1.5 line spacing, and margins of at least 2.5cm on all sides.

Submission instructions

Submit your report (only once per group) to Turnitin via Moodle.

1)   .doc contains your report. File name: “Tutorial_Group_A2.doc” (e.g., W12_1_A2.doc)

Submit other supporting files (data, image, paper and code) to Moodle submission folder.

1)   .xlsx file contains the dataset on which you run a regression.

2)   .ipynb contains all relevant code to get the results in your report. Make a zip file by combining all Colab files.

3)   .json contains all outcomes from Google Vision. Make a zip file by combining all JSON files.

4)   .xlsx also contains all the cited paper lists with a brief note about why you cited them.

5)   .zip contains all the cited papers. When you add papers to Endnote, save their pdf. Then, submit a zip file of all the cited pdfs.

6)   At the end of your report, copy the link of the G-drive folder containing all image files.

For each missing file among the above (1) to (7), -1 mark

Marking Criteria

Your assignment will be marked based on the following marking criteria:

1.   Analysis: Quality of advertising image data analytics

2.    Hypothesis: Quality of Hypothesis development

3.    Written Presentation: Quality of written report

For further information, see the marking rubric below.

Marking Rubric for Assessment 2: Industry Benchmarking Group Report







Quality of advertising image data analytics

Fail: Analysis does not meet the required standard.

Pass: Sufficient analysis of the advertising data, which identifies and measures Instagram data. Some attempt at regression analysis and interpretation of results, with some accuracy.

Credit: Proper analysis of the advertising data, which mostly identifies and accurately measures posted Instagram data and presents findings. Attempts regression analysis and includes some X variables, addressing multicollinearity issues. Interprets results to some extent.

Distinction: Effective and proper analysis of the advertising data, which accurately identifies and measures posted Instagram data and presents findings clearly in an appropriate format. Does regression analysis properly by including necessary X variables, addressing multicollinearity issues, and interpreting results to an appropriate standard.

High Distinction: Highly effective and proper analysis of the advertising data, which accurately identifies, measures, and compares posted Instagram data; clearly and accurately presents findings in an appropriate format. Does regression analysis properly by including necessary X variables, addressing multicollinearity issues, and interpreting the results properly.


Quality of hypothesis development

Fail: Hypothesis development does not meet the required standard.

Pass: Sufficient development of hypotheses by applying most of the required steps and addressing most of the relevant features; attempt to make a final hypothesis with some discussion provided. At least one hypothesis needs to be significant. Findings are mostly appropriately communicated. Attempt to provide recommendations.

Credit: Good development of hypotheses by applying required steps and addressing relevant visual features; the final hypothesis is appropriate; discussion is supported by considering previous data and results. At least one hypothesis needs to be significant. Findings are appropriately communicated, and some recommendations are provided.

Distinction: Effective development of hypotheses by accurately applying required and relevant steps and addressing relevant visual features; the final hypothesis is appropriate and evidence-based; discussion is supported by considering previous data, results and concepts. At least two hypotheses need to be significant. Findings are clearly communicated by recommending image advertising content strategies.

High Distinction: Highly effective development of hypotheses by accurately applying the required and relevant steps and addressing relevant visual features; the final hypothesis is highly appropriate, meaningful and evidence-based; discussion is supported by considering previous data and results, concepts, and scholarly articles. At least two hypotheses need to be significant. Relatively new insight is offered. Findings are effectively communicated by recommending effective image advertising content strategies with supporting evidence.