A Weighted Ensemble Approach to Sentiment Classification Using SVM and Naïve Bayes

Authors

  • Kshitiz Dhakal
  • Gokarna Adhikari
  • Dipesh Tripathi
  • Abhay Nepal

Keywords

Ensemble learning, Multinomial Naïve Bayes, Natural language processing, Sentiment analysis, Support vector machine, Text classification

Abstract

Sentiment analysis is a key area of natural language processing (NLP) concerned with identifying the sentiment polarity of text, which enables automated interpretation of opinions in domains such as product reviews, social media monitoring, and customer feedback analysis. In this study, we propose a weighted ensemble approach that combines Support Vector Machine (SVM) and Multinomial Naïve Bayes (MNB) classifiers to achieve higher sentiment classification accuracy than either model alone. The system processes input text through a multi-stage pipeline of validation, tokenization, stop-word removal, punctuation removal, negation handling, and vectorization before passing it to the classifiers. A comparison of SVM kernels showed that the linear kernel gave the highest accuracy on our dataset. The ensemble applies different weighting ratios, 21:76 (MNB:SVM) for short text inputs and 1:6 for longer inputs, chosen by empirical evaluation of classification performance. Experimental results show that SVM alone achieved an accuracy of 90.17% and MNB achieved 83.87%, while the weighted ensemble reached 90.53%, with a balanced accuracy of 85.90% and an F1-score of 0.903. These findings indicate that combining classifiers with empirically tuned weights can yield measurable improvements in sentiment classification. The approach is scalable and adaptable, and future work can extend it to multilingual datasets and to deep learning architectures.
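
The length-dependent weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives only the ratios (21:76 and 1:6, MNB:SVM), so the combination rule (a weighted average of positive-class probabilities), the token cutoff between short and long inputs, and all function and constant names below are assumptions made for illustration. It also assumes both classifiers emit a probability in [0, 1], which for a linear SVM would require a separate calibration step.

```python
# Weight ratios (MNB, SVM) as reported in the abstract.
SHORT_TEXT_WEIGHTS = (21, 76)
LONG_TEXT_WEIGHTS = (1, 6)

# Hypothetical cutoff: the abstract does not state where "short" ends.
SHORT_TEXT_MAX_TOKENS = 20


def ensemble_positive_probability(p_mnb, p_svm, n_tokens,
                                  short_max_tokens=SHORT_TEXT_MAX_TOKENS):
    """Weighted average of the two positive-class probabilities.

    Assumed combination rule; the weights are picked by input length.
    """
    if n_tokens <= short_max_tokens:
        w_mnb, w_svm = SHORT_TEXT_WEIGHTS
    else:
        w_mnb, w_svm = LONG_TEXT_WEIGHTS
    return (w_mnb * p_mnb + w_svm * p_svm) / (w_mnb + w_svm)


def classify(p_mnb, p_svm, n_tokens):
    """Label the input positive when the combined probability is at least 0.5."""
    score = ensemble_positive_probability(p_mnb, p_svm, n_tokens)
    return "positive" if score >= 0.5 else "negative"


if __name__ == "__main__":
    # Same classifier outputs, different input lengths.
    print(classify(0.9, 0.4, n_tokens=5))
    print(classify(0.9, 0.4, n_tokens=50))
```

Under these assumptions, when MNB is confident (0.9) and SVM leans negative (0.4), the short-text weights give MNB enough influence to tip the result to positive, while the long-text weights let SVM dominate and the result is negative.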

Published

2025-08-30

How to Cite

Kshitiz Dhakal, Gokarna Adhikari, Dipesh Tripathi, & Abhay Nepal. (2025). A Weighted Ensemble Approach to Sentiment Classification Using SVM and Naïve Bayes. International Journal of Data Science, Bioinformatics and Cyber Security, 1(2), 1–10. Retrieved from https://www.matjournals.net/engineering/index.php/IJDSBCS/article/view/2387