19.07.2018

Making Use of Social Media Data

Relevance of Social Media

Social media presence has become critically important in business over the past decade or so. This applies to almost every business, regardless of size and industry. Large consumer products companies in particular are seeking new ways to improve their social media performance. Proactivity and awareness are key success factors when it comes to marketing, visibility and responsiveness. It is also essential to keep track of industry trends. Data warehousing can be applied in conjunction with machine learning to enable effective social media management.

Sentiment Analysis

First, a suitable data pipeline has to be created in order to store publicly available social media content. The scope of your analysis is defined at this stage by applying the desired data filters, such as topic and channel. Once a live data source has been established, data quality verified and enough data acquired, sentiment analysis can begin.

Note that “out of the box” sentiment analysis features may be readily available, depending on the chosen technology. Sentiment analysis has its limitations, too. Available methods are not yet advanced enough to reliably capture irony and sarcasm, for instance. Linguistic complexity remains a challenge even for state-of-the-art algorithms.

Text analysis is usually based on word counts, unions, sequences and categorization. The available sentiment analysis techniques can be divided into lexicon-based and machine learning (ML) methods. Lexicon-based methods employ word dictionaries and polarity scores. They do not require training data, which is an advantage, although they tend to be less accurate in terms of recall. ML methods, for their part, are nowadays commonly used for sentiment analysis as well. In practice, a hybrid of the two can be a feasible choice.
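To make the lexicon-based approach concrete, here is a minimal sketch in Python. The word list, the polarity scores and the simple negation rule are all hypothetical and chosen only for illustration; a real solution would rely on a curated sentiment lexicon.

```python
import re

# Hypothetical toy lexicon: word -> polarity score between -1 and 1.
LEXICON = {
    "great": 1.0,
    "love": 0.8,
    "good": 0.5,
    "slow": -0.4,
    "bad": -0.6,
    "terrible": -1.0,
}

# Assumed negation words that flip the polarity of the following word.
NEGATIONS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text):
    """Return the mean polarity of the lexicon words found in the text.

    Posts without any lexicon words get 0.0, which shows the
    coverage problem of a purely dictionary-based method.
    """
    tokens = tokenize(text)
    scores = []
    for i, token in enumerate(tokens):
        if token in LEXICON:
            polarity = LEXICON[token]
            # Flip the sign if the previous token is a negation word.
            if i > 0 and tokens[i - 1] in NEGATIONS:
                polarity = -polarity
            scores.append(polarity)
    return sum(scores) / len(scores) if scores else 0.0


for post in ["I love this, great service", "Not good", "Arrived on Tuesday"]:
    print(post, "->", lexicon_score(post))
```

The last example post contains no lexicon words and therefore receives a neutral score, regardless of what the writer actually meant. This is one reason why a hybrid with ML methods may be worth considering.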

Data preprocessing typically includes omitting irrelevant words and constructing a token frequency matrix out of the available posts. In this matrix, each row represents an individual post, whereas each column represents a word or token. It is also possible to apply ad hoc rules as data manipulation is carried out. The matrix can be further reduced to a smaller number of uncorrelated components by applying a dimensionality reduction method. Finally, there is a variety of ML algorithms to choose from in the classification stage of sentiment analysis.
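The preprocessing step can be sketched roughly as follows. The stop word list and the example posts are made up for illustration, and the sketch stops at the token frequency matrix; dimensionality reduction and classification would typically be handled by an ML library.

```python
import re
from collections import Counter

# Hypothetical list of irrelevant words to omit.
STOP_WORDS = {"the", "a", "is", "and", "to", "of", "this"}


def tokenize(text):
    """Lowercase the text, split it into tokens and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def token_frequency_matrix(posts):
    """Build a matrix with one row per post and one column per token."""
    tokenized = [tokenize(post) for post in posts]
    # The sorted vocabulary fixes the column order.
    vocabulary = sorted({t for tokens in tokenized for t in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[t] for t in vocabulary])
    return vocabulary, matrix


# Made-up example posts.
posts = [
    "The service is great",
    "Great product and great price",
    "This delivery is slow",
]

vocabulary, matrix = token_frequency_matrix(posts)
print(vocabulary)
for row in matrix:
    print(row)
```

Even with three short posts most cells are zero. Real token frequency matrices are much wider and sparser, which is why dimensionality reduction is often applied before classification.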

Geographical Sentiment

Having set up the data pipeline and sentiment estimator, the solution can be connected to BI reporting software. Social media integration allows you to extract the text content, sentiment estimate and location of each public post. Given the large volume of posts, however, it is more appropriate to visualize the overall picture.

Using a map visualization is convenient as it provides an up-to-date geographical outlook on the operational environment from the viewpoint of social media presence. Raw sentiment values or a threshold-based classification scale, such as “negative”, “neutral” and “positive”, can be used for the purpose. Sentiment levels are indicated by colours in the map visualization. Potential data filters include at least timestamp, language and social media channel.
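The logic behind such a map can be sketched as below: filter posts by channel, average the sentiment per region and map the result onto a three-level scale. The field names, region labels, channel names and the threshold of 0.2 are assumptions made for this example, not features of any particular BI tool.

```python
from collections import defaultdict


def classify(score, threshold=0.2):
    """Map a raw sentiment score onto a three-level scale.

    The threshold value is an assumption for this sketch.
    """
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def mean_sentiment_by_region(posts, channel=None):
    """Average the sentiment per region, optionally for one channel only."""
    totals = defaultdict(lambda: [0.0, 0])  # region -> [sum, count]
    for post in posts:
        if channel is not None and post["channel"] != channel:
            continue
        totals[post["region"]][0] += post["sentiment"]
        totals[post["region"]][1] += 1
    return {region: s / n for region, (s, n) in totals.items()}


# Made-up posts with hypothetical field names.
posts = [
    {"region": "north", "channel": "channel_a", "sentiment": 0.6},
    {"region": "north", "channel": "channel_b", "sentiment": 0.2},
    {"region": "south", "channel": "channel_a", "sentiment": -0.5},
    {"region": "south", "channel": "channel_b", "sentiment": 0.4},
]

for region, score in mean_sentiment_by_region(posts).items():
    print(region, round(score, 2), classify(score))
```

In a reporting tool, the class (or the raw mean) would determine the colour of each region on the map, and the channel argument corresponds to a channel filter in the report.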

1. Sentiments Associated with Individual Posts

2. Sentiment Scoring for Recent Posts

3. Aggregate Sentiment Evolution

4. Filtering by Social Media Channel


Blog writer

Pekka Tiusanen

Bilot Alumnus