Workshop 10:

Text Mining and Analysis

Question

Companies can learn a lot about customer experiences by monitoring the social media website Twitter. The file airlinetweets.xlsx contains a sample of 36 tweets from an airline’s customers. Each tweet describes a customer’s experience and highlights the quality of service received.

Required:

1. Normalize the terms by using stemming, then generate a frequency bar chart and a word cloud.

2. Using a binary document-term matrix, list the five most common terms occurring in these tweets. How often does each term appear?

3. Using Jaccard’s distance to compute dissimilarity between observations, apply hierarchical clustering with the complete linkage method to yield three clusters on the binary document-term matrix, using the following tokens as variables: agent, attend, bag, damag, and rude. How many documents are in each cluster? Give a description of each cluster.

4. How could management use the results obtained in part (3)?
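
For orientation, the mechanics behind parts (2) and (3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the expected solution: the six tweets below are made up and are not taken from airlinetweets.xlsx, the names `binary_row`, `jaccard_distance`, and `complete_linkage` are hypothetical helpers, and prefix matching on the stem is a crude stand-in for a real stemmer. For two binary rows, Jaccard's distance is taken as 1 minus the number of tokens present in both, divided by the number of tokens present in at least one.

```python
import re

# Stemmed tokens used as variables in part (3).
STEMS = ["agent", "attend", "bag", "damag", "rude"]

# Made-up toy tweets for illustration only; NOT from airlinetweets.xlsx.
TWEETS = [
    "The agent was rude and the attendant ignored us",
    "Rude agents at the gate again",
    "My bag arrived damaged",
    "Damaged bags, twice this month",
    "Flight attendants were lovely",
    "Attendant helped with my seat",
]


def binary_row(text, stems=STEMS):
    """One row of a binary document-term matrix (1 = stem present).

    Assumption: a word "contains" a stem if it starts with it, which is a
    rough substitute for proper stemming.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return tuple(int(any(w.startswith(s) for w in words)) for s in stems)


def jaccard_distance(a, b):
    """1 - |intersection| / |union| for two binary rows."""
    both = sum(x and y for x, y in zip(a, b))
    either = sum(x or y for x, y in zip(a, b))
    return 0.0 if either == 0 else 1.0 - both / either


def complete_linkage(rows, k):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    FARTHEST pair of members is closest, until k clusters remain."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(
                    jaccard_distance(rows[p], rows[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


dtm = [binary_row(t) for t in TWEETS]

# Part (2) idea: column sums of the binary matrix count the tweets
# in which each term occurs at least once.
term_counts = {s: sum(row[c] for row in dtm) for c, s in enumerate(STEMS)}

# Part (3) idea: three clusters from Jaccard distance + complete linkage.
clusters = complete_linkage(dtm, 3)
print(term_counts)
print(clusters)  # → [[0, 1], [2, 3], [4, 5]]
```

On this toy data the three clusters separate tweets about rude agents, damaged bags, and attendants. On the real file the same steps would normally be carried out with whatever text-mining tool the workshop uses, and the cluster sizes will differ.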