Phylogenetic inference is the process of reconstructing evolutionary relationships between species from genomic sequence data. The reliability of a phylogenetic analysis depends on the quality of the data and the fit of the substitution model. Phylogenetic inference commonly uses substitution models that assume sequence evolution is stationary, reversible, and homogeneous (SRH). Many empirical and simulation studies have shown that assuming SRH conditions can lead to significant errors in phylogenetic inference when the data violate these assumptions. Yet the extent of SRH violations and their effects on the inference of tree topologies are not well understood. Moreover, time-reversible Markov models can produce only unrooted phylogenetic trees.
In my thesis, I introduce and apply new tests to assess the scale and impact of SRH violations in empirical and simulated datasets. In addition, I investigate the utility of non-reversible models for rooting empirical phylogenetic trees and introduce new test statistics that provide information on the statistical support for any given root position.