Dr HUAMAO WANG Huamao.Wang@nottingham.ac.uk
ASSOCIATE PROFESSOR
Tension in big data using machine learning: Analysis and applications
Wang, Huamao; Yao, Yumei; Salhi, Said
Authors
Yumei Yao
Said Salhi
Abstract
© 2020 Elsevier Inc. The availability of machine learning techniques in popular programming languages and the exponentially expanding big data from social media, news, surveys, and markets provide exciting challenges and invaluable opportunities for organizations and individuals to explore implicit information for decision making. Nevertheless, users of machine learning often find that these sophisticated techniques can create considerable tension, caused by, among other factors, the choice of an appropriate training data set size. In this paper, we provide a systematic way of resolving such tensions by examining practical examples of predicting the popularity and sentiment of posts on Twitter and Facebook, blogs on Mashable, news on Google and Yahoo, the US house survey, and Bitcoin prices. The results show that, for big data, using around 20% of the full sample often leads to better prediction accuracy than using the full sample. This conclusion is consistent across a series of experiments. The managerial implication is that more data is not necessarily better: users need to be cautious about this sensitivity to training set size, as the simplistic approach of using everything may easily lead to inferior solutions with potentially detrimental consequences.
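The comparison the abstract describes, training on a fraction of the sample versus the full sample and scoring both on the same held-out data, can be sketched as a simple training-size sweep. The sketch below is illustrative only and is not the authors' method: the synthetic data, the nearest-centroid classifier, and every name in it (`make_data`, `fit_centroids`, `sweep_training_size`) are assumptions introduced here, and it makes no attempt to reproduce the paper's data sets or its 20% finding.

```python
import random

def make_data(n, rng):
    """Synthetic two-class data with one noisy feature (assumed, not from the paper)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.5)
        data.append((x, label))
    return data

def fit_centroids(train):
    """Nearest-centroid 'model': the mean feature value of each class."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs) if xs else 0.0
    return means

def accuracy(means, test):
    """Share of test points whose nearest class centroid matches the true label."""
    correct = sum(
        1 for x, y in test
        if min(means, key=lambda c: abs(x - means[c])) == y
    )
    return correct / len(test)

def sweep_training_size(fractions=(0.05, 0.2, 0.5, 1.0),
                        n_train=2000, n_test=1000, seed=0):
    """Train on random subsets of increasing size; score each on one fixed test set."""
    rng = random.Random(seed)
    train = make_data(n_train, rng)
    test = make_data(n_test, rng)
    results = {}
    for frac in fractions:
        k = max(2, int(frac * n_train))
        subset = rng.sample(train, k)  # sample without replacement
        results[frac] = accuracy(fit_centroids(subset), test)
    return results

if __name__ == "__main__":
    for frac, acc in sweep_training_size().items():
        print(f"train fraction {frac:.0%}: test accuracy {acc:.3f}")
```

Holding the test set fixed across fractions keeps the accuracies comparable. Whether a smaller fraction matches or beats the full sample depends on the data and the model, which is the sensitivity the abstract warns about.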
Citation
Wang, H., Yao, Y., & Salhi, S. (2020). Tension in big data using machine learning: Analysis and applications. Technological Forecasting and Social Change, 158, Article 120175. https://doi.org/10.1016/j.techfore.2020.120175
Journal Article Type | Article |
---|---|
Acceptance Date | Jun 17, 2020 |
Online Publication Date | Jun 30, 2020 |
Publication Date | Sep 1, 2020 |
Deposit Date | Jun 30, 2020 |
Publicly Available Date | Dec 31, 2021 |
Journal | Technological Forecasting and Social Change |
Print ISSN | 0040-1625 |
Electronic ISSN | 1873-5509 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 158 |
Article Number | 120175 |
DOI | https://doi.org/10.1016/j.techfore.2020.120175 |
Keywords | Big data; Machine learning; Data size; Prediction accuracy; Social media |
Public URL | https://nottingham-repository.worktribe.com/output/4739492 |
Publisher URL | https://www.sciencedirect.com/science/article/pii/S0040162520310015 |
Additional Information | This article is maintained by: Elsevier; Article Title: Tension in big data using machine learning: Analysis and applications; Journal Title: Technological Forecasting and Social Change; CrossRef DOI link to publisher maintained version: https://doi.org/10.1016/j.techfore.2020.120175; Content Type: article; Copyright: © 2020 Elsevier Inc. All rights reserved. |
Files
Tension In Big Data Using Machine Learning
(698 Kb)
PDF