
INFS5730  Social Media Analytics in Practice

Week 3 Tutorial Activities

Required Reading

•    SAS Visual Text Analytics User Guide, provided on Moodle under Week 3 section.

A gentle reminder to complete the activities before you join the online tutorial or arrive at the face-to-face class.

During the Week 3 tutorial, we will get started with using SAS Visual Text Analytics to conduct text analytics.

The purpose of this tutorial activity is to use SAS Visual Text Analytics to conduct text analysis, explore the dataset, and discuss relevant predefined concepts and the topics automatically generated from a large unstructured dataset. We will also learn how to derive insights from the findings of the analysis.

The dataset is labelled TWITTER_US_AIRLINE_SENTIMENT and can be found in datasets available on SAS Visual Text Analytics.

The dataset has over 14,000 tweets about US Airlines, taken from Twitter.com. The data variables include the text of each tweet that we would like to analyse.

Conduct the following tasks and answer the questions when required.

1. Create a SAS Visual Text Analytics project following the steps described in the SAS Visual Text Analytics User Guide available on Moodle (under Week 3 section) as follows:

•    To access SAS Model Studio, which allows you to create Visual Text Analytics projects, please follow these steps:

◦   Visit https://www.sas.com/en_us/software/viya-for-learners.html

◦   Create an account using your UNSW email address

◦   Log in to https://www.sas.com/en_us/software/viya-for-learners.html

◦   Click "Access for students"

 

◦   Click the button "Launch SAS Viya for Learners 3.5"

 

◦   Click the icon in the upper left corner to show the menu

 

◦   Then click "Build Models"

 

◦   Model Studio will launch and allow you to create Visual Text Analytics projects.

•    Create a project as follows:

◦   Click the “New Project” button

 

◦  Set the project Name as “My first project” and select the project type as “Text Analytics”, then click the “Browse” button to choose the Data source

 

◦  Search for the dataset “TWITTER_US_AIRLINE_SENTIMENT” and click “OK”.

 

◦  Set the project language as English and click the “Save” button to save your project.

 

◦   Click the “View table” icon (shown below) to view the data. Note that the text of each tweet is in the variable “text”.

 

 

 

•   Assigning Variables in the Data Tab (refer to page 26 of the User Guide)

◦   Click the “Variables table” icon (shown below) to show the list of variables available in the dataset.

 

◦  Set the role of the variable “text” to “Text” to indicate to SAS that the content of this variable should be analysed in the Text Analysis.

 

◦  Set the role of the variable “airline” to “Category” to indicate to SAS that the content of this variable should be considered as an existing categorisation of the data. Also, make sure to check “Display variable”.

 

•    Before you run your text analytics pipeline, modify the settings as follows:

◦   Click Pipelines in the upper left corner of the data tab to access the pipelines view.

 

◦   Click on the Concepts node. The options panel for the Concepts node appears on the right side of the page. Select Include predefined concepts in the options panel for the Concepts node.

 

◦   Click on the Text Parsing node. The options panel for the Text Parsing node appears on the right side of the page. Select Enable misspelling detection near the bottom of the options panel for the Text Parsing node.

 

•    Run the Text Analytics Pipeline by clicking the Run Pipeline button in the upper right corner of the pipelines view.
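To make the variable roles assigned above more concrete, here is a minimal Python sketch of the idea, not of how SAS implements it. The sample rows, the placeholder airline names, and the helper names `variable_with_role` and `group_documents` are all hypothetical; only the variable names "text" and "airline" and the roles "Text" and "Category" come from this tutorial.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the real table, which has
# over 14,000 tweets. Airline names and tweet texts are placeholders.
rows = [
    {"airline": "Airline A", "text": "Flight delayed again."},
    {"airline": "Airline B", "text": "Great crew today!"},
    {"airline": "Airline A", "text": "Lost my bag."},
]

# Roles as assigned in the Data tab: one Text variable, one Category variable.
ROLES = {"text": "Text", "airline": "Category"}


def variable_with_role(roles, role):
    """Return the single variable name that carries the given role."""
    names = [name for name, assigned in roles.items() if assigned == role]
    if len(names) != 1:
        raise ValueError(f"expected exactly one variable with role {role!r}")
    return names[0]


def group_documents(rows, roles):
    """Group the Text variable's documents by the Category variable's value."""
    text_var = variable_with_role(roles, "Text")
    category_var = variable_with_role(roles, "Category")
    groups = defaultdict(list)
    for row in rows:
        groups[row[category_var]].append(row[text_var])
    return dict(groups)


groups = group_documents(rows, ROLES)
for airline, docs in groups.items():
    print(airline, len(docs))
```

The point of the sketch is that the Text role tells the tool which column holds the documents to analyse, while the Category role supplies an existing grouping against which results can be compared.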

 

2. After running the project, explore the Predefined Concepts in the Concepts Node

•    Right-click on the Concepts node and select Results.

 

Explore the summary of concept matches. How can these charts help you analyse the data?

 

•    Right-click on the Concepts node and select Open.

 

•    Click the expansion arrow to the left of Predefined Concepts in the upper left corner of the Concepts pane.

 

•    Explore THREE (3) relevant predefined concepts to unveil interesting insights from the data.

For example, you could explore what destinations (nlpPlace predefined concept) are mentioned in tweets and what insights you could unveil from your findings. Discuss the limitations of relying only on predefined concepts to uncover insights from this specific dataset.

◦  You can click on the predefined concept of interest and then click “Matched” to see how many documents include terms matching the predefined concept.

 

◦  You can also open the Text Parsing node and sort the terms by “role” to find which terms matching the predefined concept occur most often in the documents.

•    Go back to the pipelines by clicking on “My first project”.

•    Right-click on the Text Parsing node, and select Open.

•    Sort the terms by Role, and scroll down to the predefined concept of interest.
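The "sort by Role, then look for the most frequent terms" step above can be pictured with a small Python sketch. The term table below is entirely hypothetical (placeholder terms and document counts), and `top_terms` is an illustrative helper, not part of SAS; only the role names nlpPlace and Noun are taken from this tutorial.

```python
# Hypothetical extract of a terms table: each entry has a term, its role,
# and the number of documents it appears in. Values are placeholders.
terms = [
    {"term": "place a", "role": "nlpPlace", "docs": 42},
    {"term": "place b", "role": "nlpPlace", "docs": 97},
    {"term": "boarding", "role": "Noun", "docs": 60},
    {"term": "place c", "role": "nlpPlace", "docs": 15},
]


def top_terms(terms, role, n=2):
    """Keep terms with the given role and rank them by document count."""
    matching = [t for t in terms if t["role"] == role]
    ranked = sorted(matching, key=lambda t: t["docs"], reverse=True)
    return [(t["term"], t["docs"]) for t in ranked[:n]]


print(top_terms(terms, "nlpPlace"))
```

Ranking by document count within a single role is what lets you answer questions such as "which places are mentioned most often?" without reading every tweet.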

 

3. Explore the Terms extracted in the Text Parsing Node

•    Go back to the pipelines by clicking on “My first project”.

•    Right-click on the Text Parsing node, and select Results. Interpret the results.

 

 

 

•    Right-click on the Text Parsing node, and select Open.

 

•    Use the Filter field to search for the Noun (N) term “boarding”. We would like to explore what customers say about boarding at airports. Explore the matched documents.

 

•    Click the “Show term map” icon and explore the term map of “boarding”.

 


 

When you move the mouse over a node in the term map, additional information is indicated in a tooltip. What do the numbers in tooltips mean (e.g. the nodes “pass”, “prior to” and “mobile”)?

What insights could you derive from your findings in the term map?

What does the darkness of a node mean in the term map?

4. Explore the Topics window and select the topic “+story, love, true, true story, +life”

•    Go back to the pipelines by clicking on “My first project”.

•    Right-click on the Topics node, and select Results. Interpret the results.

 

•    Go back to the pipelines by clicking on “My first project”.

•    Right-click on the Topics node, and select Open.

•    Select the topic “+cancelled, +flight +flightled +flighted +rebook” and show its matched documents and its matched terms sorted by relevancy (from high to low).

 

Why do you think this topic is relevant?

•    Add the topic as a category by clicking on the icon “Add topics as categories”.

 

5. Explore the Categories window and select the topic

•    Go back to the pipelines by clicking on “My first project”.

•    Run the project again to apply the changes made in the previous step (adding a topic as a category).

•    Right-click on the Categories node, and select Results. Interpret the results.

 


 

•    Right-click on the Categories node, and select Open.

•    Select the category “+cancelled, +flight +flightled +flighted +rebook” and show its matched documents and its category rule. Rename the category as “Cancelled flights”.

 

•    Test whether the category rule categorises a new document as “Cancelled flights”. Use the following text for testing. It should have matching terms.

My flight is cancelled, and I would spend the night at the airport's hotel.
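
As a rough illustration of what testing a category rule against a new document involves, the following Python sketch applies a simplified, hypothetical rule to the test sentence above. It is an assumption, not the rule that SAS generates: the real category rule is written in SAS's own rule syntax and may combine or weight terms differently. Here a document matches when it contains a term starting with "cancel" together with a term starting with "flight" or "rebook".

```python
import re

# Hypothetical stems loosely based on the topic's terms; this is not the
# rule generated by SAS, only a stand-in for illustration.
STEMS = ("cancel", "flight", "rebook")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def matched_terms(text):
    """Return the distinct tokens that start with one of the rule stems."""
    return sorted({tok for tok in tokenize(text) if tok.startswith(STEMS)})


def is_cancelled_flights(text):
    """Assumed rule: a 'cancel' term plus a 'flight' or 'rebook' term."""
    terms = matched_terms(text)
    has_cancel = any(t.startswith("cancel") for t in terms)
    has_context = any(t.startswith(("flight", "rebook")) for t in terms)
    return has_cancel and has_context


sample = "My flight is cancelled, and I would spend the night at the airport's hotel."
print(matched_terms(sample))
print(is_cancelled_flights(sample))
```

Compare the terms this sketch matches with the matching terms that SAS highlights for the same sentence, and consider why the two might differ.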