Much research builds on the evolutionary histories of species and genes, which are commonly represented as trees. Species and gene trees can be inferred from molecular sequences using explicitly model-based methods, and one such model, the multispecies coalescent (MSC), was the focus of my PhD. Researchers often avoid the MSC because of claims that available implementations are too computationally demanding. Instead, they infer the species history by concatenating the sequences from each gene. I began my thesis research by evaluating the effect of this approximation, and found that it is grossly inaccurate. To address the reluctance to use the MSC, I developed a faster implementation of the model called StarBEAST2, which is 13x faster than its predecessor.
The MSC has theoretical limitations. One is the assumption of a constant substitution rate, when in reality the rate varies and is associated with traits like body size. I addressed this by extending the MSC to estimate substitution rates for each species. Another assumption is that genetic material cannot be transferred horizontally. A more general model called the multispecies network coalescent (MSNC) permits introgression across species boundaries, and my collaborators and I developed an implementation of the MSNC. My final PhD project was to combine the MSC with the fossilized birth-death process, which models how species are fossilized and sampled through time. To demonstrate the utility of the combined model, I used it to reconstruct the evolutionary history of Caninae (dogs and foxes).