Open Research Newcastle
Browse

Scalable and efficient multi-objective optimization algorithms for visual data exploration

thesis
posted on 2025-05-10, 15:15 authored by Claudio Sanhueza Lobos
Analyzing data has been a critical activity to enhance our decision-making procedures on many applications. However, the relentless growth in data generation brings new challenges when it comes to analyzing large amounts of data. New infrastructures need to be implemented to support the modern computation requirements. Complete computations are time-consuming and impractical, so the alternatives need to be fast using incomplete and noisy data. Data integration is another challenge in which data need to be aggregated from multiple sources. Solutions to these challenges are typically translated to digital products and services. Modern digital tools are used not only to store user data and to offer useful online services. They are also employed to extract complex relationships between data objects. The knowledge extracted from these relations are used on new features to improve the current products. Nowadays, meaningful relations need to be extracted from massive data collections that have been collected globally through many devices. Hand in hand with these activities, lies the design and development of novel and efficient algorithms that allow us to perform these analyses. Many applications and engineering challenges can be addressed by defining optimization problems (OPs). In doing so, we formally define a metric of interest (or objective function) that we will optimize (minimize or maximize). However, a more compelling strategy to model optimization problems is by assessing simultaneously several objectives, just like we do in our daily lives. Formally, these problems are known as multi-objective optimization problems (MOPs). In MOPs, the goal is to optimize multiple and possibly conflicting objectives. We say that two objectives functions are conflicting when improving the value of one of them worsens the second one. To address these problems, researchers and practitioners have proposed an extensive variety of multi-objective algorithms. The ultimate goal of multi-objective algorithms is to compute a set of solutions which represent a trade-off between the objectives that are optimized. This characteristic is a major difference compared to the classical single-objective optimization algorithms which aim to and a unique optimal solution. We argue that employing these multi-objective strategies applied to data analysis is an effort worth exploring. Specifically in this dissertation, we employ this paradigm to address a fundamental data analysis task: data visualization. To accomplish this goal, in this dissertation we present two related contributions. In the first contribution of this dissertation, we design and implement an efficient multiobjective memetic algorithm (MOMA) to address instances of the multi-objective quadratic assignment problem (mQAP). The mQAP is a generalization of the well-know quadratic assignment problem (QAP) in which the goal is to optimize the cost function of positioning objects on available positions. Our algorithm to tackle instances of the mQAP is based on the parallel island model in which several sub-populations are evolved independently. Each island uses evolutionary operators such as mutation, recombination and selection and a local search procedure to improve the quality of the solutions. Also, we use the set of islands to create a topology (i.e., an interconnection of islands) which we use to create a cooperation scheme. The idea is to migrate selected solutions to the neighboring solutions asynchronously. Our experiments show that this procedure allows us to improve the outcome of the algorithm compare against a state-of-the-art multi-objective algorithm. We called this algorithm the parallel asynchronous multi-objective quadratic assignment problem algorithm (PasMoQAP). Since data visualization algorithms play a relevant role in exploratory data analysis, our second contribution is a novel multi-objective optimization algorithm for computing data visualizations of large datasets (i.e., millions of data objects). As far as we can say, this is the first approach using a multi-objective optimization framework to tackle the data visualization task. We call the algorithm the multi-objective quadratic assignment problem visualization algorithm (mQAPViz), and it implements a divide-and-conquer strategy that solves, in parallel, several mQAP instances especially created for the visualization task at hand. The mQAP instances are solved efficiently using our first contribution PasMoQAP. To improve the efficiency of the algorithm, we implement an efficient graph data structure and a graph sampling techniques. We perform experiments on real-life and large-scale datasets to demonstrate that mQAPViz is competitive against state-of-the-art visualization algorithms. We also illustrate how the approach can be used on several types of data such as graphs and text.

History

Year awarded

2019.0

Thesis category

  • Doctoral Degree

Degree

Doctor of Philosophy (PhD)

Supervisors

Moscato, Pablo (University of Newcastle); Berretta, Regina (University of Newcastle)

Language

  • en, English

College/Research Centre

Faculty of Engineering and Built Environment

School

School of Electrical Engineering and Computer Science

Rights statement

Copyright 2019 Claudio Sanhueza Lobos

Usage metrics

    Theses

    Categories

    No categories selected

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC