Making better models from tricky data – latest developments in species distribution models

My research focuses on species distribution modelling, in which data collected on the presence of species are used to predict the distribution of those species as a function of the environment. As simple as this sounds, species distribution modelling remains an active area of research, because difficulties arise both in the data available and in the way the models are fitted. In this talk, I will present the most recent developments I have been working on in this area:

  • Increasingly, species data from several sources are available for fitting a single species distribution model, for example presence-only data combined with presence-absence data, or presence-only data combined with occupancy data. We extend recent combined likelihood formulations in two ways: lasso-type penalties address potential overfitting, and area-interaction models accommodate spatial dependence. Together, these extensions make the proposed combined penalised likelihood approach more flexible and more useful in realistic data settings.
  • In some cases, species records may have uncertain species labels. This can happen if, for instance, the taxonomy of a group of species changes, so that observations made before the change can no longer be assigned to a single species and are confounded. In such cases, can we make use of the confounded data to improve fitted species distributions? We present new methods that use mixture modelling and machine learning techniques to investigate this question.