AD654: Marketing Analytics

Assignment V: Handling Time Series Data & Modeling with an Interaction Term

Once you have completed this assignment, you will upload two files to Blackboard: the .ipynb file that you create in Jupyter Notebook, and an .html file generated from that .ipynb file.  If you run into any trouble with submitting the .html file to Blackboard, you can submit it as a PDF instead.

For any question that asks you to perform some particular task, you just need to show your input and output in Jupyter Notebook.   Tasks will always be written in regular, non-italicized font.

For any question that asks you to include interpretation, write your answer in a Markdown cell in Jupyter Notebook.  Any homework question that needs interpretation will be written in italicized font.  Do not simply write your answer in a code cell as a comment, but use a Markdown cell instead.

Remember to be resourceful!  There are many helpful resources available to you, including the video library, the class slides, the recitation sessions, the Zoom office hours sessions, and the web.

Part I: Working with Time Series Data   (5 points)

 

A.  Pick any publicly-traded company that trades on the Nasdaq or the NYSE.

a.   What company did you select, and what is its ticker symbol?

B.  Go to Yahoo! Finance:  finance.yahoo.com.  Enter your company’s ticker symbol in the search bar near the top of your screen.  Next, click on “Historical Data” and then “Download.”  This will automatically download a .csv file with one year’s worth of the company’s data onto your computer.

C.  Bring the dataset into your environment.  For this step, use read_csv() from pandas -- but now, add some extra parameters to that function: index_col='Date' and parse_dates=True.

a.   Use the head() function to explore the variables, and show your results.

b.  Next, call the info() function on your dataset, and show your results.
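
As a hint for step C, here is a minimal sketch of how those two extra parameters behave. It assumes a tiny, invented stand-in for the Yahoo! Finance download (read from a string rather than a file); the column names and values are illustrative only, so your real file may differ.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded .csv (all values invented).
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2023-01-03,100.0,102.0,99.5,101.0,101.0,1200000
2023-01-04,101.0,103.5,100.5,103.0,103.0,1500000
2023-01-05,103.0,104.0,101.0,101.5,101.5,1100000
"""

# index_col='Date' makes the Date column the row index;
# parse_dates=True asks pandas to parse that index as datetimes.
df = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print(df.head())       # first rows, with Date shown as the index
df.info()              # column dtypes plus a description of the index
print(type(df.index))  # the class of the index object
```

With your own data, you would pass the path of the downloaded file to read_csv() instead of the StringIO object.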

D.  Is this dataframe indexed by time values? How do you know this?

E.   In your Jupyter Notebook, view the index attribute of your time series.

a.   Now, view the max and min values of your index attribute.

b.   Now, view the argmax and argmin values of your index attribute.

c.  What do the results of max, min, argmax, and argmin represent?
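
The calls in step E can be sketched as follows. This is only an illustration on a hypothetical five-day series (the dates and prices are invented); run the same calls on your own dataframe's index and interpret what you get.

```python
import pandas as pd

# Hypothetical index: five consecutive business days (dates invented).
idx = pd.date_range("2023-01-02", periods=5, freq="B")
prices = pd.Series([10.0, 10.5, 10.2, 10.8, 11.0], index=idx, name="Close")

print(prices.index)           # the index attribute itself
print(prices.index.max())     # a timestamp
print(prices.index.min())     # a timestamp
print(prices.index.argmax())  # an integer position
print(prices.index.argmin())  # an integer position
```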

F.   Let’s visualize the entire time series.

a.   First, just call .plot() on your dataframe object.

i.     Describe what you see here.  Why is this a challenging graph to interpret? What would make it easier to understand?

b.   Now, re-run the .plot() function, but this time, call that function on the ‘Close’ variable only.

i.     Now, in a couple of sentences, describe what you see.  Why is this graph more easily interpretable than the one you plotted in the previous step?

c.   Plotting a subset of your data

i.     Using a slice operation, plot the daily ‘Close’ variable from your dataset for any one-month period of your choice.

ii.     Now, show the plot you drew with the previous step, but with a new figsize, line color, and style.
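
For step F(c), one possible approach is sketched below. It assumes an invented ‘Close’ series and an arbitrary month (March 2023); the figsize, color, and linestyle values are just examples. The Agg backend line is there only so the sketch runs outside a notebook, and it is not needed in Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import numpy as np
import pandas as pd

# Hypothetical daily closing prices on business days (values invented).
idx = pd.date_range("2023-01-02", periods=120, freq="B")
close = pd.Series(100 + np.cumsum(np.sin(np.arange(120))), index=idx, name="Close")

# Slice one month by date labels; both endpoints are included.
one_month = close.loc["2023-03-01":"2023-03-31"]

# Re-draw the same slice with a custom size, line color, and line style.
ax = one_month.plot(figsize=(10, 4), color="green", linestyle="--")
ax.set_title("Close, March 2023 (hypothetical data)")
```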

G.  Rolling windows

a.   Generate a 3-period moving average for your ‘Close’ variable, and create a plot that overlays this 3-period average atop the actual daily closing prices.

b.   Next, generate a 30-period moving average for your ‘Close’ variable, and create a plot that overlays this 30-period average atop the actual daily closing prices.


c.   How are your two moving average plots different from one another? What are some pros and cons of shorter and longer moving average windows?
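
The rolling-window step can be sketched as below, assuming a short invented ‘Close’ series so that the arithmetic is easy to check by hand. The names ma3 and overlay are illustrative, not required.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices 1.0, 2.0, ..., 10.0 on business days.
idx = pd.date_range("2023-01-02", periods=10, freq="B")
close = pd.Series(np.arange(1.0, 11.0), index=idx, name="Close")

# 3-period moving average: each value is the mean of the current
# observation and the two before it, so the first two are NaN.
ma3 = close.rolling(window=3).mean()

# Put both series side by side so they can be drawn on shared axes.
overlay = pd.DataFrame({"Close": close, "MA(3)": ma3})
print(overlay)
```

Calling .plot() on a dataframe like overlay draws both lines on the same axes. The same pattern applies to a 30-period window by changing the window argument.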

H.  Next, we will try something called resampling.

a.   Resample your time series so that its values are the mean ‘Close’ values for one-month time periods, rather than daily values.

i.     Plot this newly-resampled time series.

ii.     Provide an example that explains why someone might care about resampling a time series.  To answer this, you may use ANY example that you can think of, or discover, from any field that uses time series data (health, weather, market forecasting, etc.).  You don’t need to perform any outside research or go too deeply into domain knowledge here -- 3-4 thoughtful sentences are all you need.
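
A minimal sketch of monthly resampling follows, assuming an invented daily series covering January through March 2023. It uses the "MS" (month start) alias, which labels each monthly mean with the first day of its month; this is one choice among several, and the month-end alias is spelled differently across pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical daily values 0, 1, 2, ... for Jan 1 through Mar 31, 2023.
idx = pd.date_range("2023-01-01", "2023-03-31", freq="D")
close = pd.Series(np.arange(len(idx), dtype=float), index=idx, name="Close")

# Group the daily observations by calendar month and average each group.
monthly = close.resample("MS").mean()
print(monthly)
```

The result has one row per month instead of one row per day, and calling .plot() on it draws the resampled series.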

Part II: Marketing Mix Modeling with an Interaction Term   (5 points)

For this part, we will use the dataset supreme_data.csv.  This dataset can be found in Blackboard -- it is posted in the same area where this assignment prompt is posted.  This dataset contains 150 periods of data, including the company’s total spending on ‘wraparound’ subway ads that cover the exterior of an entire subway car, spending on Instagram advertising in the Greater NYC region, and sales revenue for the period in the Greater NYC region.

 

We will not use a data partition here, as the model we will build is being used for explanatory purposes, rather than predictive purposes.

1.  After reading the file into your environment, the first question that you will explore here is whether there is any relationship between marketing spending and revenue.


a.   To explore this, first create a new variable that shows the total spending.  This variable’s value should be the sum of the subway ad spending and the Instagram ad spending among Instagram users based in Greater NYC.

b.   Now, find the correlation between this new total spending variable and Sales.

i.     What is the correlation between these variables?

ii.     What does this correlation suggest about the relationship between total marketing spending and sales?

c.   Next, let’s explore the relationship between the subway ad spending and the Instagram ad spending.  Examine the correlation between these two variables.

i.     Is this correlation so high that we might not be able to use them together in a linear model?
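
Steps a through c can be sketched as follows. The column names subway, instagram, and sales are assumptions made for this illustration, and the five rows are invented; check the actual column names in supreme_data.csv before adapting this.

```python
import pandas as pd

# Hypothetical stand-in for supreme_data.csv (names and values invented).
df = pd.DataFrame({
    "subway": [10, 20, 30, 40, 50],
    "instagram": [12, 9, 15, 11, 14],
    "sales": [110, 150, 240, 260, 330],
})

# New variable: total spending across both channels.
df["total_spend"] = df["subway"] + df["instagram"]

# Pearson correlation between total spending and sales.
r_total_sales = df["total_spend"].corr(df["sales"])

# Pearson correlation between the two spending variables themselves.
r_inputs = df["subway"].corr(df["instagram"])

print(r_total_sales, r_inputs)
```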

d.   Now, build a model that uses sales as the outcome variable, with subway ad spending and regional Instagram ad spending as the input variables.  Use the statsmodels library for this step, and for all of the remaining steps here.

i.     What is the p-value of the F-Statistic for this model?  What does this suggest about the model?

ii.     What are the p-values for each of the individual predictors used in this model? What does this suggest about these predictors?

e.   Build yet another model -- this time, you will again use Sales as the outcome variable.  Your inputs will be subway ad spending, regional Instagram ad spending, and an interaction term based on the relationship between subway ad spending and regional Instagram ad spending.

i.     What do you notice about the p-values for each of these predictors? Should you keep all these terms together in one model?

ii.     Demonstrate what your model would predict for a marketer using 100 units of subway ad spending and 100 units of regional Instagram ad spending.  What sales outcome should this marketer expect to see? How does this number compare to the one that the previous model (with no interaction term) would have predicted?

iii.     In a few sentences, how do you interpret this interaction effect? What is this effect showing us?  (No statistical jargon is required here, but this should make sense to someone who knows about marketing, but not about how to interpret a linear model.)
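
The two models in steps d and e can be sketched with the statsmodels formula interface, as below. This is one possible approach, not the only one. The data are simulated, and the column names subway, instagram, and sales are assumptions for this illustration; the real dataset's names and results will differ.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the real data (150 periods, invented values).
rng = np.random.default_rng(0)
n = 150
df = pd.DataFrame({
    "subway": rng.uniform(0, 200, n),
    "instagram": rng.uniform(0, 200, n),
})
df["sales"] = (50 + 2 * df["subway"] + 3 * df["instagram"]
               + 0.05 * df["subway"] * df["instagram"]
               + rng.normal(0, 20, n))

# Model without an interaction term.
main_only = smf.ols("sales ~ subway + instagram", data=df).fit()

# Model with main effects plus the subway-by-instagram interaction.
with_int = smf.ols("sales ~ subway + instagram + subway:instagram", data=df).fit()

print(main_only.f_pvalue)  # p-value of the overall F-statistic
print(with_int.pvalues)    # p-value for each term, including the interaction

# Predicted sales at 100 units of each kind of spending, under each model.
scenario = pd.DataFrame({"subway": [100], "instagram": [100]})
pred_main = main_only.predict(scenario).iloc[0]
pred_int = with_int.predict(scenario).iloc[0]
print(pred_main, pred_int)
```

Calling .summary() on either fitted model prints the full regression table, which shows the same p-values alongside the coefficients.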

f.    Find an example, or make up an example, of an interaction term in a model (this can be from the world of marketing, or from anywhere else).  A very good answer to the last part of this question will include some genuine reaction from you -- finding an example of an interaction effect is only a ‘half-credit’ answer here.

i.     In a 3-5 sentence paragraph, describe what you found (or invented). What are the variables that make up this interaction?  Is the effect of their interaction on the outcome positive or negative?  How do you feel about the interaction? Does it make sense to you? Or does it surprise you?