Date of Creation
5-18-2023
Document Type
Departmental Honors Thesis
Department
Mathematics
First Advisor
John B. Little
Abstract
An algebraic statistical model is a parametrized family of probability distributions. Identifiability is a crucial property of a statistical model; when it holds, each probability distribution in the model uniquely determines the parameters that produce it. In phylogenetics, identifiable models can be used to ascertain evolutionary relationships between species from observed data, such as amino acid and DNA sequence data. However, global identifiability is a strong condition that may not always hold. A discrete parameter of a model is said to be generically identifiable if the set of probability distributions that do not uniquely determine the discrete parameter has measure zero. A recently developed algorithm by Hollering and Sullivant [11] determines the generic identifiability of the tree parameters of phylogenetic models using algebraic matroids associated with the model. In this thesis, we first discuss the biological, mathematical, and statistical concepts needed to understand the results in their paper. We then discuss how their method can be used to prove generic identifiability for the tree parameters of several phylogenetic models, and we verify several results and calculations in their paper.
Recommended Citation
Yacovone, Thomas J., "Identifiability of Phylogenetic Models" (2023). Math and Computer Science Honors Theses. 1.
https://crossworks.holycross.edu/math_honor/1
Comments
Reader: Gareth Roberts