Christopher Fallaize, Lecturer (Chris.Fallaize@nottingham.ac.uk)
Bayesian protein sequence and structure alignment
Fallaize, Christopher J.; Green, Peter; Mardia, Kanti; Barber, Stuart
Authors
Peter Green
Kanti Mardia
Stuart Barber
Abstract
© 2020 Royal Statistical Society
The structure of a protein is crucial in determining its functionality and is much more conserved than sequence during evolution. A key task in structural biology is to compare protein structures to determine evolutionary relationships, to estimate the function of newly discovered structures and to predict unknown structures. We propose a Bayesian method for protein structure alignment, with the prior on alignments based on functions which penalize ‘gaps’ in the aligned sequences. We show how a broad class of penalty functions fits into this framework, and how the resulting posterior distribution can be efficiently sampled. A commonly used gap penalty function is shown to be a special case, and we propose a new penalty function which alleviates an undesirable feature of the commonly used penalty. We illustrate our method on benchmark data sets and find that it competes well with popular tools from computational biology. Our method has the benefit of being able potentially to explore multiple competing alignments and to quantify their merits probabilistically. The framework naturally enables further information such as amino acid sequence to be included and could be adapted to other situations such as flexible proteins or domain swaps.
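As a rough illustration of what a gap-penalizing prior on alignments can look like, the following minimal sketch scores an alignment by summing a penalty over its gaps and uses the negated sum as an unnormalised log prior. Everything here is an assumption made for illustration: the affine form (an opening cost plus a per-position extension cost) is a common choice in sequence alignment generally, but this record does not state which penalty the paper treats as the commonly used one, nor the form of its proposed penalty. The function names and parameter values are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def gap_lengths(matched: Sequence[int]) -> List[int]:
    """Lengths of the runs of skipped positions between consecutive matched indices."""
    return [b - a - 1 for a, b in zip(matched, matched[1:]) if b - a > 1]


def affine_penalty(length: int, opening: float = 2.0, extension: float = 0.5) -> float:
    """Illustrative affine penalty: a cost to open a gap plus a cost per extra position.

    The parameter values are arbitrary placeholders, not taken from the paper.
    """
    return opening + extension * (length - 1)


def log_prior_unnormalised(
    matches: Sequence[Tuple[int, int]],
    penalty: Callable[[int], float] = affine_penalty,
) -> float:
    """Unnormalised log prior of an alignment: minus the total gap penalty.

    `matches` is an ordered list of (i, j) pairs meaning position i of the first
    protein is aligned with position j of the second (hypothetical encoding).
    """
    first = [i for i, _ in matches]
    second = [j for _, j in matches]
    gaps = gap_lengths(first) + gap_lengths(second)
    return -sum(penalty(g) for g in gaps)


# Hypothetical alignment with one gap of length 2 in each protein.
example = [(0, 0), (1, 1), (4, 2), (5, 5)]
print(log_prior_unnormalised(example))
```

Passing a different `penalty` callable is how one would swap in another member of a class of penalty functions under this sketch; the sampling of the posterior described in the abstract is not attempted here.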
Citation
Fallaize, C. J., Green, P., Mardia, K., & Barber, S. (2020). Bayesian protein sequence and structure alignment. Journal of the Royal Statistical Society: Series C, 69(2), 301-325. https://doi.org/10.1111/rssc.12394
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Nov 28, 2019 |
Online Publication Date | Jan 8, 2020 |
Publication Date | Apr 2020 |
Deposit Date | Dec 15, 2017 |
Publicly Available Date | Jan 9, 2021 |
Journal | Journal of the Royal Statistical Society: Series C (Applied Statistics) |
Print ISSN | 0035-9254 |
Electronic ISSN | 1467-9876 |
Publisher | Wiley |
Peer Reviewed | Peer Reviewed |
Volume | 69 |
Issue | 2 |
Pages | 301-325 |
DOI | https://doi.org/10.1111/rssc.12394 |
Keywords | Statistics, Probability and Uncertainty; Statistics and Probability |
Public URL | https://nottingham-repository.worktribe.com/output/1094029 |
Publisher URL | https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssc.12394 |
Additional Information | This is the peer reviewed version of the following article: Fallaize, C.J., Green, P.J., Mardia, K.V. and Barber, S. (2020), Bayesian protein sequence and structure alignment. J. R. Stat. Soc. C., which has been published in final form at https://doi.org/10.1111/rssc.12394. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions. |
Files
GapAlignFinal (PDF, 306 Kb)