Hui Yang
Automatic detection of protected health information from clinic narratives
Yang, Hui; Garibaldi, Jonathan M.
Abstract
This paper presents a natural language processing (NLP) system designed for the 2014 i2b2 de-identification challenge. The challenge task is to identify and classify seven main Protected Health Information (PHI) categories and 25 associated subcategories. We propose a hybrid model that combines machine learning techniques with keyword-based and rule-based approaches to deal with the complexity inherent in PHI categories. Our approaches exploit a rich set of linguistic features, both syntactic and word surface-oriented, which are further enriched by task-specific features and regular expression template patterns to characterize the semantics of various PHI categories. Our system achieved an overall micro-averaged F-measure of 93.6% on the challenge test data, making it the winning entry in this de-identification challenge.
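For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a micro-averaged F-measure is commonly computed: true positives, false positives, and false negatives are pooled across all categories before precision and recall are taken. The function name, the category labels, and the counts are illustrative assumptions; they are not taken from the paper or from the official i2b2 evaluation script.

```python
def micro_f_measure(counts):
    """Compute a micro-averaged F-measure from per-category counts.

    `counts` maps a category name to a dict with keys "tp", "fp", "fn".
    This layout is a hypothetical choice for illustration only.
    """
    # Pool the counts over all categories (this is what "micro" means).
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of pooled precision and pooled recall.
    return 2 * precision * recall / (precision + recall)


# Made-up counts for two hypothetical PHI categories.
example_counts = {
    "DATE": {"tp": 90, "fp": 5, "fn": 10},
    "NAME": {"tp": 40, "fp": 10, "fn": 5},
}
score = micro_f_measure(example_counts)
```

Because the counts are pooled, frequent categories weigh more heavily in the final score than rare ones, which is the main difference from a macro-averaged score.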
Citation
Yang, H., & Garibaldi, J. M. (2015). Automatic detection of protected health information from clinic narratives. Journal of Biomedical Informatics, 58(Suppl.), S30-S38. https://doi.org/10.1016/j.jbi.2015.06.015
Journal Article Type | Article |
---|---|
Acceptance Date | Jun 23, 2015 |
Online Publication Date | Jul 29, 2015 |
Publication Date | 2015-12 |
Deposit Date | Oct 14, 2016 |
Publicly Available Date | Oct 14, 2016 |
Journal | Journal of Biomedical Informatics |
Print ISSN | 1532-0464 |
Electronic ISSN | 1532-0480 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 58 |
Issue | Suppl. |
Pages | S30-S38 |
DOI | https://doi.org/10.1016/j.jbi.2015.06.015 |
Keywords | Protected Health Information (PHI); De-identification; Hybrid model; Natural language processing; Clinical text mining |
Public URL | https://nottingham-repository.worktribe.com/output/756185 |
Publisher URL | http://www.sciencedirect.com/science/article/pii/S1532046415001252 |
Additional Information | This article is maintained by: Elsevier; Article Title: Automatic detection of protected health information from clinic narratives; Journal Title: Journal of Biomedical Informatics; CrossRef DOI link to publisher maintained version: https://doi.org/10.1016/j.jbi.2015.06.015; Content Type: article; Copyright: © 2015 Elsevier Inc. |
Files
1-s2.0-S1532046415001252-main.pdf (PDF, 686 Kb)
Copyright Statement
Copyright information regarding this work can be found at the following address: http://creativecommons.org/licenses/by-nc-nd/4.0