Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

Tableau Homework Assignment

BUAN 3065 – Spring 2023

Question 1 (15 points total)

The first set of data in question is information from the Freedom House, a group that attempts to measure political rights and civil liberties for all countries in the world. The data set you analyze is the recent information from this group and I’ve added some information about each country’s population and estimated amount of net migration. The Freedom House group quantifies measures such as the amount of property rights, judicial effectiveness, government integrity, fiscal health, freedoms for businesses, labor, and trade, and so on. There is a score for each measure and an overall “freedom score”. In addition, the data source provides economic information for the countries (inflation rate, amount of foreign direct investment, GDP per capita, and data on taxes/tariffs, for example).

Here are the descriptions of the measures in that file:

· Country ID: numeric ID for each country

· Country Name: text name of country

· Region: there are 5 regions: Europe, the Americas, Sub-Saharan Africa, the Middle East and North Africa and Asia-Pacific (this is text name). At the end of the file, there are (0, 1) variables on if a country is in that particular region or not.

· World Rank: rank: highest freedom score (=1) to lowest score (= 180)

The following are the “freedom” scores. All range from 0 to 100.

· Freedom Score: the overall freedom score (the higher the score the more measured freedom)

· Property Rights: the score for the amount of property rights in a country (higher is more)

· Gov’t Integrity: the score for the amount of government integrity (higher the score the more democratic and less autocratic)

· Business Freedom: measure of freedom to operate a business (higher means less gov’t intervention i.e. more “free”)

· Labor Freedom: measure of freedom for the labor market (higher means less gov’t intervention i.e. more “free”)

· Trade Freedom: measure of freedom to trade with other countries (higher means less gov’t intervention i.e. more “free”)

Here are the economic and demographic measures

· Tariff Rate: in %, effective tariff rate applied to imports

· Corporate Tax Rate: in %, highest stated national tax rate on corporations

· Tax Burden: in %, ratio of total taxes paid to GDP

· Per Capita GDP: GDP divided by population, converted to US dollars

· Inflation: in %, country’s most recent measured inflation rate

· FDI: in millions of US dollars, the net amount of foreign direct investment for the country; negative means more investment flowing out than in

· Pop: the country’s population

· Net_Migr: an estimate of the net migration into the country; negative means more people leaving than entering

Question 1.1 (8 points) - Use all countries – is it true that as the freedom score rises, on average, socio-economic conditions improve? Examine GDP per capita, Net_Migr and Inflation for this question as socio-economic measures.

a) (1 point) Statistically, provide the correlation coefficients between the freedom score and the three measures above, and provide your test to indicate the correlation coefficient is statistically different from 0 or not. Briefly discuss the results.

b) (2 points) Graphically, do two simple charts that can show the relationship between the freedom score and GDP per capita, and the freedom score and inflation. You probably want to use a feature from “analytics” to help show the relationships explicitly. Interpret your findings.

c) (3 points) Use the WORLD RANK and look at that indicator by quintiles (highest 20%, next highest 20%, etc.) Create a grouping using these quintiles. Use the mapping feature and show the freedom score that has 5 different colors based on the groupings.

o Can you detect a regional pattern?

o Then, change the type of map so that you can show both the impact of the freedom score (by color) and Net_Migr on the same map.

d) (2 points) Last, create a SCATTERPLOT that plots GDP/Capita on the x-axis and Inflation on the y-axis. Apply your color coding (based upon the grouping your created for 1.1c and discuss how the colors/findings relate to the charts created earlier.

Question 1.2 (7 points) - The next idea is to see if countries in one region are different with respect to some of the factors that make up the overall score. A clustered bar or clustered column (side-by-side) chart could be used here (a suggestion).

a) (2 points) use the region dimension to help you create a clustered bar or column chart that shows the average scores of Business Freedom, Labor Freedom and Trade Freedom for each region (so three bars or columns per region).

a. Can you spot any regional patterns/differences?

b) (3 points) Go back to the EXCEL file and sort your data by REGION. Then, create 5 new variables that will have different numbers of observations – one with the TRADE FREEDOM scores for Middle East/North Africa, one for the TRADE FREEDOM scores for Europe, one for the TRADE FREEDOM scores for the Americas, and so on. With these 5 new variables – perform a single-factor ANOVA test.

a. Do the results of this ANOVA test help confirm any pattern you see in the clustered bar/column charts?

c) (2 points) This part is just a graphical piece – look at the countries in the Middle East/North Africa region (there are just 14, that’s why I am picking this region). Complete a side-by-side (i.e. clustered) bar/chart graph for these countries looking at Property Rights and Business Freedom (2 bars/columns per country).

a. Describe any pattern (or lack of pattern) that you see.

Question 2 (10 points total)

The second set of data set is comprised of all homes in Kings County, Washington, totaling 21,613 observations (i.e., homes/apartments). Included in the data set is a range of variables that measure things such as price, attributes of the house/apartment, location, and other variables (details provided below).

The data allows for some interesting analysis on the determinants of price in a specific housing market. The data definitions are provided below, followed by a set of questions for you to answer.

Data set

ID: Unique ID for each home sold

Date: date of the home sale

Price: Price of each home sold (in $)

Bedrooms: Number of bedrooms

Bathrooms: Number of bathrooms, where .5 accounts for rooms with a toilet, but no shower

sqft_living: square footage of the apartments living space

sqft_lot: square footage of the land space

floors: Number of floors

waterfront: A dummy variable for whether the apartment is overlooking the waterfront or not

view: An index from 1 to 4 of how good the view of the property was

condition: An index from one to 5 on the condition of the apartment

grade: An index from 1 to 13, where 1-3 falls short of building construction and design, 7 has an average level of construction and design and 11-13 has a high quality of construction and design

sqft_above: the square footage of the interior housing space that is above ground level

sqft_basement: the square footage of the interior housing space that is below ground level

yr_built: the year the house was initially built

yr_renovated: the year of the house’s last renovation

zipcode: What zipcode area the house is in

lat: Latitude

long: Longitude

sqft_living15: The square footage of interior housing space for the nearest 15 neighbors

sqft_lot15: the square footage of land lots of the nearest 15 neighbors

Question 2.1 (2 points)

Find the impact of additional living space square footage on price. To do so, plot Price (on the Y axis) and sqft_living (on the X axis), choosing the appropriate measure to best represent the data (HINT: remember what Tableau defaults to) on the visual. Then, execute the following:

1. Add a trend line and make it a dashed line

2. Interpret the equation

Question 2.2 (3 points)

Develop a frequency distribution by placing “yr_built” in the ‘Columns’ field and the “Price” variable in the ‘Rows’ field. It is important to note that I want the number of homes built every year (i.e., this doesn’t use “bins”).

The “Price” variable measure will default to SUM (as the measure), but you’ll need to find the appropriate measure to develop this visual. Once complete, then add “Price” to the secondary axis (like I showed in class), change its measure to minimum price (MIN) and make it a line chart (you should have blue bars with an orange line).

Answer the following questions:

1. In what year was the cheapest house built? What was the price?

2. What years saw the lowest number of houses built? Why?

3. Provide general commentary on the housing market in this county – i.e., when is it up and when is it down?

Question 2.3 (5 points)

Answer the following question with the appropriate visuals and commentary:

1. Using only one visual, show the relationship between the average condition of the homes and the median prices of homes for each zip code. Is there a relationship? Which zip codes illustrate your point?

Question 3 (15 points total)

This set of data is a generic data set of sales from a fictitious “super store” in the U.S. (SuperStoreUS_2015_HW data F2020). In it, you will see a variety of purchase data and different attributes associated with individual orders placed in 2015. You will use this data to build a “Dashboard” and a “Story” using different techniques in Tableau.

Question 3.1 (7 points) - Building a “Dashboard” on US SuperStore profits

1. Step 1 – Create a sheet that shows Profit by Customer Segment

a. Add a “mark” of color coding for Customer Segment, title and appropriate labels

2. Sheet 2 - Create a sheet that shows Profit by Product Sub-Category (sorted from largest to smallest)

a. Add the Customer Segment color coding (should be the same as color coding in Sheet 1), title and appropriate labels

3. Sheet 3 – Create a sheet that shows Least Profitable Products (sorted from largest to smallest)

a. Add filter to only show products with negative profit

b. Add the Customer Segment color coding (should be the same as others) and title/labels

4. Create a dashboard that uses all three of the sheets above

a. Use the “Profit by Customer Segment” visual as a filter for the other two (you’ll need to execute the filter option)

5. Then, create a duplicate of that dashboard and on it, highlight any one of the Customer Segments – this is to show me you executed the step correctly (both the Product Sub-Category and Least Profitable Products charts should change if you did it correctly).

6. Answer the following three questions using the interactive dashboard you created:

a. How do the least profitable products of small business compare in magnitude to the least profitable products of the other three customer segments?

b. Compare the best-selling sub-category of the two segments that cater to businesses vs. the best-selling category of the two-segments that cater to home user (consumer and home office). - answer: Chairs and Binders

c. Looking across the four segments, what are the names of the products with profits that are less than -$5K dollars.

Question 3.2 (8 Points) - Creating a “Story” about the US SuperStore business

1. Step 1 – Create a sheet that shows Sales by Region and State (on one chart)

a. Add a color coding for Region and all appropriate titles/labels

2. Step 2 – Create a sheet that looks at Sales vs. Profits by Product Sub-Category

a. Create a Scatterplot with Sales on the X-axis and Profit on the Y-axis

b. Add a “mark” for color by Product Sub-Category

c. Add a “mark” for labels by Product Sub-Category

3. Step 3 – Create a sheet that looks at shipping costs by Product Sub-Category

a. Create a “Packed Bubbles” chart

b. Add a “mark” for color by Product Sub-Category

c. Add a “mark” for labels by Product Sub-Category

4. Step 4 – Create a sheet that looks at Sales by State on a map of the U.S.

a. Create a map of the U.S.

b. Then make it a “filled map” with Sales as the “mark” for color coding

c. Change color scheme to “Red-Green Diverging” (by using ‘edit colors’)

d. Change “Center” to 30,000 (you can do this by clicking ‘Advanced’)

5. Step 5 – Make a duplicate sheet of the scatterplot visual that you created in Step 2

a. Add shipping cost as the size of the bubbles

6. Step 6 – Build your story using the four sheets you created

a. Each “story point” should have a “caption”

i. You will need to create each caption – a small comment on the main point of the visual (i.e., what do you want to tell me about this visual?)

b. One the sheet with bubble chart, add a textbox and answer the following question: Where should the company focus efforts to reduce shipping costs to increase profitability?

Question 4 (5 points total)

The last part of the assignment requires you to construct a dashboard using the MegaMart 2017 Sales dataset. The instructions are as follows:

You need to build a dashboard that consists of the following four sheets:

1. Total Sales by Category with a color coding that distinguished the different Segments, titled “Sales by Category”

2. Historical Sales for the entire period viewed at a Weekly level, titled “Weekly Sales”

3. A scatterplot that looks at Sales (x-axis) vs. Profit (y-axis) with Customer Names as the points and includes a data label for their actual name, titled “Unprofitable Orders”

4. A table that has the following columns: OrderID, ProductID, Product Name and Profit, titled “Item Detail”

The dashboard should be made to interact as follows:

a) The Sales by Category visual should filter the entire dashboard

b) The Weekly Sales visual should filter the Unprofitable Orders scatterplot and the Item Detail table

c) The Unprofitable Orders scatterplot should filter the Item Detail table

Instructions for the final “deliverable” once you’ve built the working dashboard:

1. On the “Sales by Category”, click on Office Suppliers/Corporate and take a screenshot and paste onto a slide (Please note: the other visuals should change once you do this).

2. Then, on the Weekly Sales visual click on the Week of Order Date for March 13, 2016, and take a screenshot and paste onto a slide (Please note that certain visuals should change once you do this)

3. Then, on the Unprofitable Orders visual click Stuart Carmichael and take a screenshot and paste onto a slide (please note that certain visuals/tables should change once you do this)

As you drill down, do not unclick any of your selections. We’re trying to get to Stuart Carmichael’s order details so it’s necessary that Office Supplies/Corporate and March 13, 2016 are selected.

Hint: When you create “Actions” to filter your dashboards, make sure that “Run Action on – Select” and “Clearing the selection will keep filtered values” are your default settings for all filters you create.

Tableau Homework Assignment Details

· Due Date/Time: Monday, April 3rd at 11pm

· DropBox folder title: Tableau Assignment

· Format: PDF

· Data files needed for the assignment:

o world_freedom_F2020 (Excel file)

o KingsCounty Homes Data F2020 (Excel file)

o SuperStoreUS_2015_HW data (Excel file)

o MegaMart 2017 Sales Data HW S2023 (Excel file)

· How to get your file into a PDF

o Because we cannot export/download our Tableau workspace/sheets, you will need to do the following to create and submit a PDF

o Create all the sheets, dashboards and stories in Tableau and complete all analysis (where required) in Excel

o Then, use a blank PowerPoint template take screenshots of your work for every problem (using the Print Screen function – ‘PrtScr’ key) and paste onto individual slides

o Do the same for your analysis/tables in Excel – copy/paste to a slide

o You should any interpretation (where it is required) by simply adding a textbox to the slide and adding in your commentary

o Include a heading in the upper left corner of each slide that indicates which part/question the visuals are associated with (add in a Text box for the heading)

o Save the final PowerPoint document as a PDF file (e.g., JBorchardt_TableauHW.pdf)

· Missed Deadlines

o I will deduct 3 points for each day the assignment is late.