Explaining the unsuitability of the kappa coefficient in the assessment and comparison of the accuracy of thematic maps obtained by image classification

Foody, Giles M.

doi:10.1016/j.rse.2019.111630

Authors

Professor GILES FOODY giles.foody@nottingham.ac.uk
PROFESSOR OF GEOGRAPHICAL INFORMATION

Abstract

The kappa coefficient is not an index of accuracy; indeed, it is not an index of overall agreement but one of agreement beyond chance. Chance agreement is, however, irrelevant in an accuracy assessment and is, in any case, inappropriately modelled in the calculation of a kappa coefficient for typical remote sensing applications. The magnitude of a kappa coefficient is also difficult to interpret. Values that span the full range of widely used interpretation scales, indicating a level of agreement that equates to that estimated to arise from chance alone all the way through to almost perfect agreement, can be obtained from classifications that satisfy demanding accuracy targets (e.g. for a classification with overall accuracy of 95% the range of possible values of the kappa coefficient is −0.026 to 0.900). Comparisons of kappa coefficients are particularly challenging if the classes vary in their abundance (i.e. prevalence) as the magnitude of a kappa coefficient reflects not only agreement in labelling but also properties of the populations under study. It is shown that all of the arguments put forward for the use of the kappa coefficient in accuracy assessment are flawed and/or irrelevant as they apply equally to other, sometimes easier to calculate, measures of accuracy. Calls for the kappa coefficient to be abandoned from accuracy assessments should finally be heeded and researchers are encouraged to provide a set of simple measures and associated outputs such as estimates of per-class accuracy and the confusion matrix when assessing and comparing classification accuracy.
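
The range quoted in the abstract can be checked with a short sketch. It assumes a two-class confusion matrix and the standard Cohen's kappa, (p_o − p_e) / (1 − p_e), where p_o is overall accuracy and p_e is the chance agreement implied by the row and column totals. The two matrices and the function names are illustrative assumptions, not data or code from the article.

```python
def overall_accuracy(matrix):
    """Proportion of cases on the main diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond the chance level implied by the marginals."""
    total = sum(sum(row) for row in matrix)
    p_o = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Expected chance agreement: sum over classes of (row share * column share).
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical matrices (rows = map label, columns = reference label).
# Both have 950 of 1000 cases correct, i.e. 95% overall accuracy.
rare_class = [[950, 25],
              [25, 0]]      # second class is rare and never labelled correctly
balanced = [[475, 25],
            [25, 475]]      # both classes equally abundant

for name, m in [("rare_class", rare_class), ("balanced", balanced)]:
    print(f"{name}: accuracy={overall_accuracy(m):.3f}, kappa={cohen_kappa(m):.3f}")
# rare_class: accuracy=0.950, kappa=-0.026
# balanced: accuracy=0.950, kappa=0.900
```

Under these assumptions the two matrices share the same overall accuracy yet land at the two ends of the range given in the abstract, which illustrates how strongly class prevalence drives the value of kappa.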

Citation

Foody, G. M. (2020). Explaining the unsuitability of the kappa coefficient in the assessment and comparison of the accuracy of thematic maps obtained by image classification. Remote Sensing of Environment, 239, Article 111630. https://doi.org/10.1016/j.rse.2019.111630

Journal Article Type	Article
Acceptance Date	Dec 28, 2019
Online Publication Date	Jan 9, 2020
Publication Date	Mar 15, 2020
Deposit Date	Jan 8, 2020
Publicly Available Date	Jan 10, 2021
Journal	Remote Sensing of Environment
Print ISSN	0034-4257
Electronic ISSN	1879-0704
Publisher	Elsevier
Peer Reviewed	Peer Reviewed
Volume	239
Article Number	111630
DOI	https://doi.org/10.1016/j.rse.2019.111630
Keywords	Computers in Earth Sciences; Soil Science; Geology
Public URL	https://nottingham-repository.worktribe.com/output/3690555
Publisher URL	https://www.sciencedirect.com/science/article/pii/S0034425719306509
Additional Information	This article is maintained by: Elsevier; Article Title: Explaining the unsuitability of the kappa coefficient in the assessment and comparison of the accuracy of thematic maps obtained by image classification; Journal Title: Remote Sensing of Environment; CrossRef DOI link to publisher maintained version: https://doi.org/10.1016/j.rse.2019.111630; Content Type: article; Copyright: © 2020 Elsevier Inc. All rights reserved.