Open Research Newcastle

Improved similarity search for large data in machine learning and robotics

thesis
posted on 2025-05-09, 12:44 authored by Josiah Walker
This thesis presents techniques for accelerating similarity search methods on large datasets. Similarity search has applications in clustering, image segmentation, classification, robotic control and many other areas of machine learning and data analysis. Traditional database and data analysis applications of similarity search have focused on search throughput, but in online classification and robotic control it is also important to consider total memory usage, scalability, and sometimes the construction cost of the search structures. This thesis presents a locality-sensitive hash (LSH) code generation method which has a lower computational and technical cost than baseline methods, while maintaining performance across a range of datasets. The method has an exceptionally fast search structure construction time, which makes it suitable for accelerating even a small number of queries when no search structure has been built in advance. A simplified boosting framework for locality-sensitive hash collections is also presented. Applying this framework speeds up existing LSH boosting algorithms without loss of performance. A simplified boosting algorithm is also given which improves performance over a state-of-the-art method while being more efficient.

Year awarded

2017

Thesis category

  • Doctoral Degree

Degree

Doctor of Philosophy (PhD)

Supervisors

Chalup, Stephan K. (University of Newcastle); Brankovic, Ljiljana (University of Newcastle)

Language

  • en, English

College/Research Centre

Faculty of Engineering and Built Environment

School

School of Electrical Engineering and Computer Science

Rights statement

Copyright 2017 Josiah Walker
