By Daniele Calixto Barros

NLTK vs TextBlob: Comparison of Sentiment Analysis Applied to Elon Musk's X Profile

NLTK, or Natural Language Toolkit, is a suite of Python libraries focused on human language data, while TextBlob is a Python library for processing textual data. Both offer sentiment analysis tools. Sentiment analysis is a Natural Language Processing (NLP) technique used to determine whether text data is positive, negative or neutral.

The aim of this article is to compare the results of the two tools applied to a real-life case and see whether there is a discrepancy between them. It would also be interesting to analyze generic phrases against a reference table of expected results; that is a suggestion for a future study.


Dataset

To perform the test, we'll start with Apify's Twitter Scraper. The idea is to have a sample of posts from a Twitter/X profile. The scraping tool is limited to 100 posts per profile, sorted by the number of likes they got. Elon Musk's Twitter/X profile was chosen.

We won't go into the details of the extraction because that's not the focus of the article, but this was the result when we ran the info() function on the Pandas dataframe:



Pre-Processing

We'll leave the pre-processing details for another article, but we used Python regular expressions to remove everything from the post other than the text itself, such as mentions, emojis, URLs and RT markers.

Examples of text cleaning:

Raw text: 🚀💫♥️ Yesss!!! ♥️💫🚀 https://t.co/0T9HzUHuh6
Cleaned text: Yesss!!!

Raw text: Let’s make Twitter maximum fun!
Cleaned text: Let’s make Twitter maximum fun!
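A minimal sketch of this cleaning step (the exact patterns used in the original pre-processing are not shown, so the regular expressions below are assumptions):

```python
import re

def clean_text(text):
    """Strip URLs, mentions, RT markers and emojis, keeping plain text."""
    text = re.sub(r'http\S+', '', text)        # URLs
    text = re.sub(r'@\w+', '', text)           # mentions
    text = re.sub(r'\bRT\b', '', text)         # retweet markers
    text = re.sub(r"[^\w\s.,!?'’]", '', text)  # emojis and stray symbols
    return ' '.join(text.split())              # collapse extra whitespace

print(clean_text('🚀💫 Yesss!!! https://t.co/0T9HzUHuh6'))  # -> Yesss!!!
```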


Sentiment Analysis Codes

Let's code! First of all, we have to make the imports:

import pandas as pd

from textblob import TextBlob

import nltk
nltk.download('vader_lexicon')
from nltk.sentiment.vader import SentimentIntensityAnalyzer

Setting aside the text pre-processing explained in the previous section, we'll load the dataset into a pandas dataframe:

file_path = 'your_file_path.csv'
df = pd.read_csv(file_path)

Then, we have to create the sentiment analysis functions. For TextBlob, we'll get two features. The polarity score is a float within the range [-1.0, 1.0], where -1.0 is very negative and 1.0 is very positive. The subjectivity is a float within the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective. The functions are:

def getSubjectivity(text):
  return TextBlob(text).sentiment.subjectivity

def getPolarity(text):
  return TextBlob(text).sentiment.polarity

For NLTK, we can get four features in this format: {'neg': 0.0, 'neu': 0.213, 'pos': 0.787, 'compound': 0.5719}. The neg, neu and pos scores are the proportions of the text rated negative, neutral and positive, and compound combines these three scores into a single normalized value in the range [-1.0, 1.0], where -1.0 is very negative and 1.0 is very positive. For this research, we'll consider compound. The function is:

analyzer = SentimentIntensityAnalyzer()

def get_sentiment(text):
    scores = analyzer.polarity_scores(text)
    return scores['compound']

Then, apply the functions to cleaned_text, adding the new columns to the dataframe:

df['textblob_polarity'] = df['cleaned_text'].apply(getPolarity)
df['textblob_subjectivity'] = df['cleaned_text'].apply(getSubjectivity)
df['nltk_polarity'] = df['cleaned_text'].apply(get_sentiment)

Results Comparison

To compare the results, we first map TextBlob's polarity and NLTK's compound from the [-1.0, 1.0] range into qualitative values: Negative, Positive and Neutral. This way, we can analyze the discrepancies.

Since the mapping is identical for both tools, a single function is enough:

def getSentiment(value):
  if value < 0:
    return 'Negative'
  elif value > 0:
    return 'Positive'
  else:
    return 'Neutral'

df['textblob_sentiment'] = df['textblob_polarity'].apply(getSentiment)
df['nltk_sentiment'] = df['nltk_polarity'].apply(getSentiment)

Now, we can make the imports for visualization:

import matplotlib.pyplot as plt
import seaborn as sns

Starting with a pie chart for both tools:

textblob_counts = df['textblob_sentiment'].value_counts()
plt.figure(figsize=(8, 6))
plt.pie(textblob_counts, labels=textblob_counts.index, autopct='%1.1f%%', startangle=140)
plt.title('TextBlob Sentiment Distribution')
plt.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.

# Showing the plot
plt.show()

nltk_counts = df['nltk_sentiment'].value_counts()
plt.figure(figsize=(8, 6))
plt.pie(nltk_counts, labels=nltk_counts.index, autopct='%1.1f%%', startangle=140)
plt.title('NLTK Sentiment Distribution')
plt.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.

# Showing the plot
plt.show()

We can see that the distributions are very similar, but NLTK assigned Negative sentiment to more posts. Let's look at a sample of posts where the two tools gave opposite sentiments: negative for one and positive for the other.

opposite_sentiments_df = df[((df['nltk_sentiment'] == 'Negative') & (df['textblob_sentiment'] == 'Positive')) |
                            ((df['nltk_sentiment'] == 'Positive') & (df['textblob_sentiment'] == 'Negative'))]
num_rows_opposite_sentiments = opposite_sentiments_df.shape[0]
print(num_rows_opposite_sentiments)

There are 7 cases, which means 7% of the sample. Let's analyze two examples:

Text: If I die under mysterious circumstances, it’s been nice knowin ya
NLTK Compound: -0.2732 | TextBlob Polarity: 0.3000 | TextBlob Subjectivity: 1.0000

Text: Those who want power are the ones who least deserve it
NLTK Compound: 0.0772 | TextBlob Polarity: -0.3000 | TextBlob Subjectivity: 0.4000

We also add TextBlob's subjectivity to the analysis, because it is an important factor in interpreting the classification. In both cases, neither tool scored the texts as extremely positive or negative (-1 or 1). But TextBlob attributed maximum subjectivity to the text "If I die under mysterious circumstances, it’s been nice knowin ya", which indicates low reliability in its positive or negative classification.

As a purely personal opinion, I agree with TextBlob's classification. As a tiebreaker, we also tried a trial of a third sentiment analysis tool, Google's Natural Language API. As we can see below, its classification is closer to TextBlob's:




This analysis alone is not enough to determine which tool is best. Let's extend it by analyzing the diverging classifications where one tool is neutral and the other is positive or negative.

mixed_sentiments_df = df[((df['nltk_sentiment'] == 'Positive') & (df['textblob_sentiment'] == 'Neutral')) |
                         ((df['nltk_sentiment'] == 'Negative') & (df['textblob_sentiment'] == 'Neutral')) |
                         ((df['textblob_sentiment'] == 'Negative') & (df['nltk_sentiment'] == 'Neutral')) |
                         ((df['textblob_sentiment'] == 'Positive') & (df['nltk_sentiment'] == 'Neutral'))
                        ]

There are 11 cases, which means 11% of the sample. Let's analyze two examples:

Text: Twitter DMs should have end to end encryption like Signal, so no one can spy on or hack your messages
NLTK Compound: 0.0018 | TextBlob Polarity: 0.0000 | TextBlob Subjectivity: 0.0000

Text: Congrats Morocco!!
NLTK Compound: 0.6103 | TextBlob Polarity: 0.0000 | TextBlob Subjectivity: 0.0000

In both cases, TextBlob assigned neutral polarity and 0.0 subjectivity, meaning it judged the texts fully objective, i.e. very likely neutral. In the first example, NLTK's compound is very low, so it is close to neutral as well. But in the second example, NLTK rated the text as clearly positive.

As a purely personal opinion, I agree with NLTK's classification in these cases. Let's try the Natural Language API again. As we can see below, its classification does not match either tool.



With these two analyses, which are still shallow, we can see that in 82% of cases the two tools classify the texts the same way. In the 18% of cases where the results differ, it is difficult to say which tool is better. Other visualizations have been created and are available in the GitHub repository, along with the full code.
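The agreement rate can be computed directly from the two sentiment columns. A sketch on a toy dataframe (the column names match the article, but the data here is made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'nltk_sentiment':     ['Positive', 'Negative', 'Neutral', 'Positive'],
    'textblob_sentiment': ['Positive', 'Positive', 'Neutral', 'Neutral'],
})

# Fraction of rows where both tools assigned the same label
agreement = (df['nltk_sentiment'] == df['textblob_sentiment']).mean()
print(f'{agreement:.0%} of posts classified the same way')  # 50% on this toy data
```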

Both tools are valid, but a comparative study using generic phrases and a reference table of expected results would be more conclusive. That's a suggestion for future research. Feel free to clone the repository and make improvements.


I am Daniele Calixto, a Data Scientist from Brazil. I hope this was useful. See you next time!

For more, see my portfolio.
