Manuel Jiménez
Handling uncertainty in citizen science data: towards an improved amateur-based large-scale classification
Jiménez, Manuel; Triguero, Isaac; John, Robert
Authors
Abstract
Citizen Science, traditionally known as the engagement of amateur participants in research, is showing great potential for large-scale processing of data. In areas such as astronomy, biology, or geo-sciences, where emerging technologies generate huge volumes of data, Citizen Science projects enable image classification at a rate that experts alone could not achieve. However, this approach entails the spread of biases and uncertainty in the results, since the participants involved are typically non-experts in the problem domain and vary in skill. Consequently, the research community tends not to trust Citizen Science outcomes, claiming a generalised lack of accuracy and validation. We introduce a novel multi-stage approach to handle uncertainty within data labelled by amateurs in Citizen Science projects. Firstly, our method proposes a set of transformations that leverage the uncertainty in amateur classifications. Then, a hybridisation strategy provides the best aggregation of the transformed data for improving the quality of, and confidence in, the results. As a case study, we consider Galaxy Zoo, a project pursuing the labelling of galaxy images. A limited set of expert classifications allows us to validate the experiments, confirming that our approach is able to greatly boost accuracy and classify more images than the state of the art.
Citation
Jiménez, M., Triguero, I., & John, R. (2019). Handling uncertainty in citizen science data: towards an improved amateur-based large-scale classification. Information Sciences, 479, 301-320. https://doi.org/10.1016/j.ins.2018.12.011
Journal Article Type | Article |
---|---|
Acceptance Date | Dec 6, 2018 |
Online Publication Date | Dec 7, 2018 |
Publication Date | Apr 1, 2019 |
Deposit Date | Dec 10, 2018 |
Publicly Available Date | Dec 10, 2018 |
Journal | Information Sciences |
Print ISSN | 0020-0255 |
Electronic ISSN | 1872-6291 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 479 |
Pages | 301-320 |
DOI | https://doi.org/10.1016/j.ins.2018.12.011 |
Keywords | Control and Systems Engineering; Theoretical Computer Science; Software; Information Systems and Management; Artificial Intelligence; Computer Science Applications |
Public URL | https://nottingham-repository.worktribe.com/output/1395082 |
Publisher URL | https://www.sciencedirect.com/science/article/pii/S0020025518309538#ack0001 |
Contract Date | Dec 10, 2018 |
Files
Handling Uncertainty in Citizen Science Data (PDF, 984 Kb)