Open Research Newcastle

Statistical modelling of species distributions: from bridging the gap between statistics and ecology to conservation stakeholders’ challenges

thesis
posted on 2025-05-09, 20:02 authored by Emy Paulette Guilbault
Species distribution models (SDMs) are widely used to estimate the spatial distribution of species. Over the last decade, various tools and models have been developed to improve SDM predictions. Such information contributes greatly to informing appropriate conservation management decisions and improving our knowledge of species distributions. As new technologies make data more accessible, citizen science studies are becoming increasingly valuable. Anyone can access and collect species data; however, not all data are alike, and many inaccuracies can arise during data collection. The behaviour and distribution of observers can introduce biases and compromise the study of the true species distribution. The aim of this thesis is to present practical tools for species distribution modelling and simulation in ecology, incorporating various data types and accounting for common issues that degrade data quality. In this thesis, I extend existing methods and present new tools to account for data quality in SDMs using point process models. In particular, I present a new method for combining presence-only (PO) and occupancy data collected across multiple years. By integrating extinction and colonisation processes into a combined likelihood framework, I have created a dynamic combined model that makes use of both PO and occupancy data. Data quality also encompasses the accuracy of the observations themselves. To accommodate uncertainty in the species identity of observations, I have developed two algorithms that classify observations with unknown labels using mixture modelling and machine learning techniques. I show that some implementations of these models reliably outperform models that use only observations with known labels, in settings with varying levels of abundance, correlation among the species distributions, and model complexity.
I have applied these models to a case study involving three frog species of the genus Mixophyes from the north-eastern Australian coast. For these species, I classified old records as belonging to one of the three species and evaluated their distributions using this new information. To extend simulation practices, I have developed a large suite of new functions in R to study ecological datasets. I present these functions, along with tutorials, to produce simulated datasets that incorporate variation in data quality, in order to understand the impact of biases, changes, and errors in the observation process, as well as the dynamics of species distributions. Using a Cygwin platform simulator developed by Dr Panu Somervuo, I also studied the impact of observer behaviour on the spatio-temporal clustering of recorded observations, to see how a well-known method for correcting observer bias performs under different types of observer behaviour.

History

Year awarded

2022

Thesis category

  • Doctoral Degree

Degree

Doctor of Philosophy (PhD)

Supervisors

Renner, Ian (University of Newcastle); Beh, Eric (University of Newcastle); Mahony, Michael (University of Newcastle)

Language

  • en, English

College/Research Centre

College of Engineering, Science and Environment

School

School of Information and Physical Sciences

Rights statement

Copyright 2022 Emy Paulette Guilbault
