Evaluating Machine Learning Algorithms for Sentiment Analysis: A Comparative Study to Support Data-Driven Decision Making

Authors

  • Nor Hayati Shafii, Universiti Teknologi MARA Cawangan Perlis, Kampus Arau
  • Nur Hafiza Mohamad Daud, Universiti Teknologi MARA Cawangan Perlis, Kampus Arau
  • Diana Sirmayunie Md Nasir, Universiti Teknologi MARA Cawangan Perlis, Kampus Arau
  • Nur Fatihah Fauzi, Universiti Teknologi MARA Cawangan Perlis, Kampus Arau

Keywords:

Bernoulli Naïve Bayes, Machine learning, Sentiment Analysis, Support Vector Machine, Accuracy

Abstract

This research investigates the accuracy and robustness of sentiment analysis models through a comparative analysis of three machine learning algorithms: Bernoulli Naïve Bayes, Linear Support Vector Machine (LinearSVM), and Logistic Regression. The primary objective is to assess the performance of these models across various domains and datasets in sentiment analysis tasks. The study uses the IMDb 500k movie reviews dataset, and each of the three algorithms is trained on it to classify review sentiment. The evaluation shows differences in accuracy among the models. LinearSVM achieves the highest accuracy, 89% when rounded to the nearest hundredth. Bernoulli Naïve Bayes follows closely, reaching the same 89% after rounding, while Logistic Regression has the lowest accuracy of the three. These results highlight the importance of algorithm choice in sentiment analysis tasks, with LinearSVM and Bernoulli Naïve Bayes outperforming Logistic Regression. The research offers insight into the comparative performance of these algorithms and provides guidance for practitioners and researchers choosing models for sentiment analysis across diverse datasets and domains.
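The abstract reports one model as the most accurate while a second reaches the same 89% after rounding. The sketch below illustrates how that can happen: two classifiers with different raw accuracies can share the same value once rounded to two decimal places. It is a minimal illustration, not the authors' evaluation code. The labels, the prediction lists, and the names `model_a`, `model_b`, `model_c`, `accuracy`, `rank_models`, and `flip_first` are all hypothetical, and the numbers are made up for the example rather than taken from the study.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty label set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_models(
    y_true: Sequence[int], predictions: Dict[str, Sequence[int]]
) -> List[Tuple[str, float, float]]:
    """Return (name, raw accuracy, accuracy rounded to 2 d.p.), best first."""
    scores = [(name, accuracy(y_true, y_pred)) for name, y_pred in predictions.items()]
    scores.sort(key=lambda item: item[1], reverse=True)
    return [(name, acc, round(acc, 2)) for name, acc in scores]


def flip_first(labels: Sequence[int], k: int) -> List[int]:
    """Hypothetical helper: copy the labels and flip the first k to simulate k errors."""
    return [1 - y if i < k else y for i, y in enumerate(labels)]


# Made-up test labels: 1 = positive review, 0 = negative review.
y_true = [1, 0] * 500  # 1000 labels in total

# Made-up predictions with 107, 112, and 130 errors respectively.
predictions = {
    "model_a": flip_first(y_true, 107),
    "model_b": flip_first(y_true, 112),
    "model_c": flip_first(y_true, 130),
}

ranked = rank_models(y_true, predictions)
for name, raw, rounded in ranked:
    print(f"{name}: raw={raw:.3f} rounded={rounded:.2f}")
```

In this toy setup `model_a` and `model_b` differ in raw accuracy but report the same rounded value, so reporting an extra decimal place is what separates them.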

Published

2025-07-31