Posted on 2025-05-09, 12:44, authored by Josiah Walker
This thesis presents techniques for accelerating similarity search on large datasets. Similarity search has applications in clustering, image segmentation, classification, robotic control, and many other areas of machine learning and data analysis. Applications of similarity search in databases and data analysis have traditionally focused on search throughput, but in online classification and robotic control it is also important to consider total memory usage, scalability, and sometimes the construction cost of the search structures. This thesis presents a locality-sensitive hash (LSH) code generation method that has a lower computational and technical cost than baseline methods while maintaining performance across a range of datasets. The method has an exceptionally fast search structure construction time, which makes it suitable for accelerating even a small number of queries when no search structure has been built in advance. A simplified boosting framework for locality-sensitive hash collections is also presented. Applying this framework speeds up existing LSH boosting algorithms without loss of performance. Finally, a simplified boosting algorithm is given that improves performance over a state-of-the-art method while also being more efficient.
History
Year awarded
2017
Thesis category
Doctoral Degree
Degree
Doctor of Philosophy (PhD)
Supervisors
Chalup, Stephan K. (University of Newcastle); Brankovic, Ljiljana (University of Newcastle)
Language
en, English
College/Research Centre
Faculty of Engineering and Built Environment
School
School of Electrical Engineering and Computer Science