Math and Computer Science Honors Theses

Date of Creation

5-18-2023

Document Type

Departmental Honors Thesis

Department

Mathematics

First Advisor

John B. Little

Abstract

An algebraic statistical model is a parametrized family of probability distributions. Identifiability is a crucial property of a statistical model; when it holds, each probability distribution in the model uniquely determines the parameters that produce it. In phylogenetics, identifiable models can be used to infer evolutionary relationships between species from observed data, such as amino acid and DNA sequences. However, global identifiability is a strong condition that may not always hold. A discrete parameter of a model is said to be generically identifiable if the set of probability distributions that do not uniquely determine the discrete parameter has measure zero. A recently developed algorithm by Hollering and Sullivant [11] addresses the question of whether the tree parameters of phylogenetic models are generically identifiable, using algebraic matroids associated with the model. In this thesis, we first discuss the biological, mathematical, and statistical concepts necessary to understand the results in their paper. We then discuss how their method can be used to prove generic identifiability for the tree parameters of several phylogenetic models, and we verify several results and calculations in their paper.

Comments

Reader: Gareth Roberts

Included in

Mathematics Commons
