A Comprehensive Approach to Bias Mitigation for Sentiment Analysis of Social Media Data

Venugopal, Jothi Prakash; Subramanian, Arul Antran Vijay; Sundaram, Gopikrishnan; Rivera, Marco; Wheeler, Patrick

doi:10.3390/app142311471

Authors

Jothi Prakash Venugopal

Arul Antran Vijay Subramanian

Gopikrishnan Sundaram

Professor Marco Rivera marco.rivera@nottingham.ac.uk
Professor

Professor Patrick Wheeler pat.wheeler@nottingham.ac.uk
Professor of Power Electronic Systems

Abstract

Sentiment analysis is a vital component of natural language processing (NLP), enabling the classification of text into positive, negative, or neutral sentiments. It is widely used in customer feedback analysis and social media monitoring but faces a significant challenge: bias. Biases, often introduced through imbalanced training datasets, can distort model predictions and result in unfair outcomes. To address this, we propose a bias-aware sentiment analysis framework leveraging Bias-BERT (Bidirectional Encoder Representations from Transformers), a customized classifier designed to balance accuracy and fairness. Our approach begins with adapting the Jigsaw Unintended Bias in Toxicity Classification dataset by converting toxicity scores into sentiment labels, making it suitable for sentiment analysis. This process includes data preparation steps like cleaning, tokenization, and feature extraction, all aimed at reducing bias. At the heart of our method is a novel loss function incorporating a bias-aware term based on the Kullback–Leibler (KL) divergence. This term guides the model toward fair predictions by penalizing biased outputs while maintaining robust classification performance. Ethical considerations are integral to our framework, ensuring the responsible deployment of AI models. This methodology highlights a pathway to equitable sentiment analysis by actively mitigating dataset biases and promoting fairness in NLP applications.
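The two technical steps named in the abstract, converting toxicity scores into sentiment labels and adding a KL-divergence term to the loss, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the thresholds `low` and `high`, the weight `lam`, the function names, and the choice to penalize the divergence between each identity group's mean predicted distribution and the overall mean. The article itself should be consulted for the actual formulation used in Bias-BERT.

```python
import math

def toxicity_to_sentiment(score, low=0.3, high=0.7):
    """Map a toxicity score in [0, 1] to a sentiment label.
    The thresholds are hypothetical, not the paper's values."""
    if score >= high:
        return "negative"
    if score <= low:
        return "positive"
    return "neutral"

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mean_distribution(rows):
    """Element-wise mean of a list of probability vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def bias_aware_loss(probs, labels, groups, lam=0.1, eps=1e-12):
    """Cross-entropy plus lam times a fairness penalty (assumed form).

    probs  : per-example predicted class probabilities
    labels : per-example true class indices
    groups : per-example identity-group tags
    The penalty is the average KL divergence between each group's mean
    prediction and the overall mean prediction.
    """
    ce = -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(probs)
    overall = mean_distribution(probs)
    by_group = {}
    for p, g in zip(probs, groups):
        by_group.setdefault(g, []).append(p)
    penalty = sum(
        kl_divergence(mean_distribution(rows), overall)
        for rows in by_group.values()
    ) / len(by_group)
    return ce + lam * penalty
```

Under these assumptions the penalty vanishes when every group receives the same average prediction and grows as group-level predictions drift apart, which is one way a KL-based term could discourage group-dependent outputs while the cross-entropy term preserves classification performance.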

Citation

Venugopal, J. P., Subramanian, A. A. V., Sundaram, G., Rivera, M., & Wheeler, P. (2024). A Comprehensive Approach to Bias Mitigation for Sentiment Analysis of Social Media Data. Applied Sciences, 14(23), Article 11471. https://doi.org/10.3390/app142311471

Journal Article Type	Article
Acceptance Date	Dec 8, 2024
Online Publication Date	Dec 9, 2024
Publication Date	Dec 1, 2024
Deposit Date	Mar 11, 2025
Publicly Available Date	Mar 14, 2025
Journal	Applied Sciences
Electronic ISSN	2076-3417
Publisher	MDPI
Peer Reviewed	Peer Reviewed
Volume	14
Issue	23
Article Number	11471
DOI	https://doi.org/10.3390/app142311471
Public URL	https://nottingham-repository.worktribe.com/output/42840999
Publisher URL	https://www.mdpi.com/2076-3417/14/23/11471