INFS5730 Social Media Analytics in Practice

Week 4 Tutorial Activities

During the Week 4 tutorial, we will practise creating Custom Concepts and Custom Categories in SAS Visual Text Analytics. We will use the MOVIES_PLUS dataset, select the variable “overview” as the text to analyse, and include predefined concepts.

Complete the following tasks individually, then discuss them in your group.

1. Create a new SAS Visual Text Analytics project following the steps described in the SAS Visual Text Analytics User Guide available on Moodle (under Week 3 section) as follows:

•     Launch the Model Studio in SAS Viya for Learners 3.5

•    Create a project as follows:

◦   Click the “New Project” button

◦  Set the project name to “My second project” and select “Text Analytics” as the project type, then click the “Browse” button to choose the data source.

◦  Search for the dataset “MOVIES_PLUS” and click “OK”.

◦  Set the project language to English and click the “Save” button to save your project.

•   Assigning Variables in the Data Tab (refer to page 26 of the User Guide)

◦  Set the role of the variable “overview” to “Text” to tell SAS that the content of this variable should be analysed in the text analysis.

•    Before you run your text analytics pipeline, modify the settings as follows:

◦   Click Pipelines in the upper left corner of the data tab to access the pipelines view.

◦   Click on the Concepts node. The options panel for the Concepts node appears on the right side of the page. Select Include predefined concepts in the options panel for the Concepts node.

◦   Click on the Text Parsing node. The options panel for the Text Parsing node appears on the right side of the page. Select Enable misspelling detection near the bottom of the options panel for the Text Parsing node.

•    Run the Text Analytics Pipeline by clicking the Run Pipeline button in the upper right corner of the pipelines view.

2. After running the project, explore the Predefined Concepts in the Concepts Node

•    Right-click on the Concepts node and select Open.

•    Click the expansion arrow to the left of Predefined Concepts in the upper left corner of the Concepts pane.

•    Select nlpPerson from the list of predefined concepts and click on “Matched” in the Documents view. What would the predefined concept nlpPerson help you analyse in this dataset?

•    Select nlpMoney from the list of predefined concepts. What would the predefined concept nlpMoney help you analyse in this dataset?

•    Select nlpDate from the list of predefined concepts. What would the predefined concept nlpDate help you analyse in this dataset?

3. Create the following Custom Concepts:

•   A custom concept for Star Wars characters using the CLASSIFIER rule type.

Include Star Wars characters such as Han Solo, Princess Leia, Luke Skywalker and R2-D2. Why is the rule type CLASSIFIER adequate in this case?

Discuss the purpose of creating this custom concept.

•   A custom concept using the CONCEPT rule type and the nlpPlace predefined concept to match the following terms:

▪   king of α

where α can be any place.

Why is the rule type CONCEPT adequate in this case?

Discuss the purpose of creating this custom concept.

•   A custom concept using the C_CONCEPT rule type and the nlpPerson predefined concept to extract the people whose true stories (biographies) movies were made about:

▪  true story of α

where α can be any person.

Why is the rule type C_CONCEPT adequate in this case?

Discuss the purpose of creating this custom concept.

Why is the context marker _c used?

•   A custom concept using the REGEX rule type to match any of the following: 1900’s, 1910’s, 1920’s, 1930’s, 1940’s, 1950’s, 1960’s, 1970’s, 1980’s, 1990’s.

Why is the rule type REGEX adequate in this case?

Discuss the purpose of creating this custom concept.
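To reason about the rule types in task 3 before building them in SAS, it can help to mimic their behaviour outside the tool. The following is a minimal Python sketch, not SAS rule syntax. It assumes a fixed list of names as an analogue of CLASSIFIER, a capture group as an analogue of the _c context marker in C_CONCEPT, and a regular expression for the decades. The function names and the `PERSON` pattern are hypothetical; `PERSON` is only a crude stand-in for nlpPerson (one or more capitalised words).

```python
import re

# Analogue of a CLASSIFIER-style concept: a fixed list of literal strings.
STAR_WARS_CHARACTERS = ["Han Solo", "Princess Leia", "Luke Skywalker", "R2-D2"]


def find_characters(text):
    """Return the listed names that occur verbatim in the text."""
    return [name for name in STAR_WARS_CHARACTERS if name in text]


# Crude, hypothetical stand-in for nlpPerson: one or more capitalised words.
PERSON = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

# Analogue of a context rule: the whole phrase must be present,
# but only the captured person is returned as the match.
TRUE_STORY = re.compile(r"true story of (" + PERSON + r")")


def find_biography_subjects(text):
    """Return only the person that follows 'true story of'."""
    return TRUE_STORY.findall(text)


# Decades 1900's to 1990's, accepting a straight or a curly apostrophe.
DECADE = re.compile(r"\b19[0-9]0[’']s")


def find_decades(text):
    """Return every decade expression found in the text."""
    return DECADE.findall(text)
```

For example, `find_biography_subjects("Based on the true story of Jane Doe, a pilot.")` returns `["Jane Doe"]`: the surrounding phrase is required for a match, but only the person is extracted. That contrast between the full pattern and the returned part is worth keeping in mind when answering the question about the context marker.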

4. Create the following Custom Categories:

•    Create a custom category for each custom concept you have created previously.

•    Create the following custom categories:

◦   Movies about engagements or weddings of kings, queens, princes, or princesses. Use the Boolean operators AND and OR.

◦   Movies about kingdoms but not about army or war. Use the custom concept Kings_of that you created earlier and the Boolean operators AND, OR and NOT.

◦   Documents that mention both a term starting with “inter” and the term US (in capital letters, meaning the United States). Use a wildcard (*) and a case-sensitive rule (_C).
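The Boolean logic behind the three custom categories above can be checked on a few sample sentences before writing the rules in SAS. The sketch below is plain Python and only mirrors the logic; it is not SAS category syntax. The word lists, the function names, and the use of the phrase "king of" as a stand-in for the Kings_of concept are all assumptions made for illustration.

```python
import re


def words(text):
    """Lower-cased set of alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def royal_wedding(text):
    """(engagement OR wedding) AND (king OR queen OR prince OR princess)."""
    w = words(text)
    events = {"engagement", "engagements", "wedding", "weddings"}
    royals = {"king", "kings", "queen", "queens",
              "prince", "princes", "princess", "princesses"}
    return bool(w & events) and bool(w & royals)


def kingdom_not_war(text):
    """(king of ... OR kingdom) AND NOT (army OR war)."""
    w = words(text)
    about_kingdom = "king of" in text.lower() or "kingdom" in w
    about_conflict = bool(w & {"army", "war"})
    return about_kingdom and not about_conflict


def inter_and_us(text):
    """A term starting with 'inter' (wildcard) AND the upper-case token US."""
    has_inter = re.search(r"\binter\w*", text, flags=re.IGNORECASE) is not None
    has_us = re.search(r"\bUS\b", text) is not None  # case-sensitive on purpose
    return has_inter and has_us
```

For example, `inter_and_us("An international spy flees the US.")` is true, while `inter_and_us("An international spy joins us.")` is false, because the lower-case pronoun "us" does not satisfy the case-sensitive condition.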