Text Data Augmentations: Permutation, Antonyms and Negation
Authors
Giannis Haralabopoulos
Mercedes Torres Torres
Ioannis Anagnostopoulos
Derek McAuley
Abstract
Text has traditionally been used to train automated classifiers for a multitude of purposes, such as classification, topic modelling and sentiment analysis. State-of-the-art LSTM classifiers require a large number of training examples to avoid biases and generalise successfully. Labelled data greatly improves classification results, but not all modern datasets include large numbers of labelled examples. Labelling is a complex task that can be expensive and time-consuming, and can introduce biases. Data augmentation methods create synthetic data based on existing labelled examples, with the goal of improving classification results. These methods have been successfully used in image classification tasks, and recent research has extended them to text classification. We propose a method that uses sentence permutations to augment an initial dataset while retaining key statistical properties of the dataset. We evaluate our method on eight different datasets with a baseline Deep Learning process. This permutation method significantly improves classification accuracy by an average of 4.1%. We also propose two more text augmentations, antonym and negation, that reverse the classification of each augmented example. We test these two augmentations on three eligible datasets, and the results suggest an improvement in classification accuracy, averaged across all datasets, of 0.35% for antonym and 0.4% for negation when compared to our proposed permutation augmentation.
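As an illustration of the permutation idea described in the abstract, the following minimal sketch reorders the sentences of a labelled example and keeps the original label. It is not the authors' implementation: the regex-based sentence splitter, the function names `split_sentences` and `permutation_augment`, and the `max_new` and `seed` parameters are all assumptions made for this example. Because only sentence order changes, the words in each augmented example are exactly the words of the original, which is one simple sense in which word-level statistics are retained.

```python
import random
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def permutation_augment(text, label, max_new=3, seed=0):
    """Return up to `max_new` (text, label) pairs with the sentences reordered.

    The label is copied unchanged; the original ordering is never returned.
    """
    sentences = split_sentences(text)
    if len(sentences) < 2:
        return []  # nothing to permute

    rng = random.Random(seed)
    seen = {tuple(sentences)}  # orderings already used, including the original
    augmented = []
    # Sample shuffles instead of enumerating all n! orderings.
    for _ in range(max_new * 10):
        if len(augmented) >= max_new:
            break
        candidate = sentences[:]
        rng.shuffle(candidate)
        key = tuple(candidate)
        if key not in seen:
            seen.add(key)
            augmented.append((" ".join(candidate), label))
    return augmented


example = "I loved the film. The cast was great. Would watch again."
for new_text, new_label in permutation_augment(example, "positive"):
    print(new_label, "|", new_text)
```

The antonym and negation augmentations mentioned in the abstract would differ in that they change the text's meaning and flip the label, so they only apply to datasets whose labels have a natural opposite; they are not sketched here.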
Citation
Haralabopoulos, G., Torres, M. T., Anagnostopoulos, I., & McAuley, D. (2021). Text Data Augmentations: Permutation, Antonyms and Negation. Expert Systems with Applications, 177, Article 114769. https://doi.org/10.1016/j.eswa.2021.114769
Journal Article Type | Article |
---|---|
Acceptance Date | Feb 19, 2021 |
Online Publication Date | Mar 11, 2021 |
Publication Date | Sep 1, 2021 |
Deposit Date | Mar 15, 2021 |
Publicly Available Date | Mar 12, 2022 |
Journal | Expert Systems with Applications |
Print ISSN | 0957-4174 |
Electronic ISSN | 0957-4174 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 177 |
Article Number | 114769 |
DOI | https://doi.org/10.1016/j.eswa.2021.114769 |
Keywords | General Engineering; Artificial Intelligence; Computer Science Applications |
Public URL | https://nottingham-repository.worktribe.com/output/5396261 |
Publisher URL | https://www.sciencedirect.com/science/article/abs/pii/S0957417421002104?via%3Dihub |