Handling uncertainty in citizen science data: towards an improved amateur-based large-scale classification

Jiménez, Manuel; Triguero, Isaac; John, Robert

Authors

Manuel Jiménez

Robert John



Abstract

Citizen Science, traditionally known as the engagement of amateur participants in research, is showing great potential for large-scale processing of data. In areas such as astronomy, biology, or geosciences, where emerging technologies generate huge volumes of data, Citizen Science projects enable image classification at a rate that experts alone could not achieve. However, this approach spreads biases and uncertainty through the results, since participants are typically non-experts in the problem and have varying skill levels. Consequently, the research community tends not to trust Citizen Science outcomes, citing a general lack of accuracy and validation. We introduce a novel multi-stage approach to handle uncertainty within data labelled by amateurs in Citizen Science projects. First, our method proposes a set of transformations that leverage the uncertainty in amateur classifications. Then, a hybridisation strategy provides the best aggregation of the transformed data to improve the quality of, and confidence in, the results. As a case study, we consider Galaxy Zoo, a project that pursues the labelling of galaxy images. A limited set of expert classifications allows us to validate the experiments, confirming that our approach greatly boosts accuracy and classifies more images than the state of the art.

Citation

Jiménez, M., Triguero, I., & John, R. (2019). Handling uncertainty in citizen science data: towards an improved amateur-based large-scale classification. Information Sciences, 479, 301-320. https://doi.org/10.1016/j.ins.2018.12.011

Journal Article Type Article
Acceptance Date Dec 6, 2018
Online Publication Date Dec 7, 2018
Publication Date Apr 1, 2019
Deposit Date Dec 10, 2018
Publicly Available Date Dec 10, 2018
Journal Information Sciences
Print ISSN 0020-0255
Electronic ISSN 1872-6291
Publisher Elsevier
Peer Reviewed Peer Reviewed
Volume 479
Pages 301-320
DOI https://doi.org/10.1016/j.ins.2018.12.011
Keywords Control and Systems Engineering; Theoretical Computer Science; Software; Information Systems and Management; Artificial Intelligence; Computer Science Applications
Public URL https://nottingham-repository.worktribe.com/output/1395082
Publisher URL https://www.sciencedirect.com/science/article/pii/S0020025518309538#ack0001
