Rachel Carrington
Invariance and identifiability issues for word embeddings
Carrington, Rachel; Bharath, Karthik; Preston, Simon
Authors
KARTHIK BHARATH Karthik.Bharath@nottingham.ac.uk
Professor of Statistics
SIMON PRESTON simon.preston@nottingham.ac.uk
Professor of Statistics and Applied Mathematics
Abstract
Word embeddings are commonly obtained as optimisers of a criterion function f of a text corpus, but assessed on word-task performance using a different evaluation function g of the test data. We contend that a possible source of disparity in performance on tasks is the incompatibility between classes of transformations that leave f and g invariant. In particular, word embeddings defined by f are not unique; they are defined only up to a class of transformations to which f is invariant, and this class is larger than the class to which g is invariant. One implication of this is that the apparent superiority of one word embedding over another, as measured by word task performance, may largely be a consequence of the arbitrary elements selected from the respective solution sets. We provide a formal treatment of the above identifiability issue, present some numerical examples, and discuss possible resolutions.
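The mismatch between the two invariance classes can be illustrated with a small numerical sketch. The abstract does not specify f or g, so everything below is an assumption made for illustration: f is taken to be a least-squares criterion that depends on the word vectors U and context vectors V only through the product U V^T, and g is taken to be the matrix of cosine similarities between rows of U. The names `f`, `g`, `M`, `C` and `Q` are hypothetical, and the data are random.

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, dim = 6, 3

# Hypothetical word and context embeddings (random, for illustration only).
U = rng.normal(size=(n_words, dim))
V = rng.normal(size=(n_words, dim))

# Hypothetical corpus statistic that the criterion tries to reconstruct.
M = rng.normal(size=(n_words, n_words))


def f(U, V):
    """Assumed criterion: depends on (U, V) only through U @ V.T."""
    return np.sum((M - U @ V.T) ** 2)


def g(U):
    """Assumed evaluation: cosine similarities between rows of U."""
    unit = U / np.linalg.norm(U, axis=1, keepdims=True)
    return unit @ unit.T


# An invertible but non-orthogonal transformation.
C = np.diag([1.0, 2.0, 5.0])
U_c = U @ C
V_c = V @ np.linalg.inv(C).T  # chosen so that U_c @ V_c.T equals U @ V.T

# An orthogonal transformation, taken from a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
U_q = U @ Q

f_before, f_after = f(U, V), f(U_c, V_c)
g_before, g_after_c, g_after_q = g(U), g(U_c), g(U_q)

print("f unchanged under C:", np.isclose(f_before, f_after))
print("g unchanged under C:", np.allclose(g_before, g_after_c))
print("g unchanged under Q:", np.allclose(g_before, g_after_q))
```

Under these assumptions, (U, V) and (U C, V C^{-T}) attain the same value of f, so the optimisation cannot distinguish between them, while the cosine similarities used for evaluation are preserved by the orthogonal Q but not by the non-orthogonal C.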
Field | Value |
---|---|
Conference Name | NeurIPS 2019 |
Start Date | Dec 8, 2019 |
End Date | Dec 14, 2019 |
Acceptance Date | Sep 3, 2019 |
Online Publication Date | Dec 14, 2019 |
Publication Date | Dec 14, 2019 |
Deposit Date | Oct 16, 2019 |
Publicly Available Date | Feb 15, 2020 |
Book Title | Advances in Neural Information Processing Systems 32 (NIPS 2019) |
Public URL | https://nottingham-repository.worktribe.com/output/2848777 |
Publisher URL | https://papers.nips.cc/paper/9650-invariance-and-identifiability-issues-for-word-embeddings |
Related Public URLs | https://papers.nips.cc/book/advances-in-neural-information-processing-systems-32-2019 |
Files
Nips Paper (PDF, 583 Kb)