Open Research Newcastle
Browse

Memetic algorithms for community detection and clustering problems

thesis
posted on 2025-05-10, 12:51 authored by Leila Moslemi Naeni
The 'community structure' is one of the most important properties of a graph, and the real system represented by the graph. The community detection problem was first introduced by Girvan and Newman in 2002. They defined a community as a group of vertices with dense connections within the group, and sparse connections with other groups. Vertices belong to such communities or clusters most likely play similar role in the system, or have common properties. Community detection is a very challenging problem in which the community structure of the graph is being searched and uncovered. Nevertheless, computationally, it is a very difficult problem, and has not been satisfactorily solved yet. Researchers from variety of disciplines developed wide range of methods and approaches to solve the community detection problem. The optimisation-based algorithms are among the most popular methods for dealing with the community detection problem. The main research question of this thesis is how we can propose a competent memetic algorithm for the community detection problem and how we can employed the community detection problem for investigating the gene expression data. Our ultimate goals can be summarised as: 1) designing and developing advanced and efficient memetic algorithms to detect high-quality community structures in graphs; 2) establishing a method to utilise community detection algorithms in a wider range of problems, for example, in the data clustering problems; and 3) creating a novel method to investigate the gene expression data by using the community detection algorithms. In this thesis, we proposed two memetic algorithms to detect community structure by maximising the modularity in unweighted and weighted graphs, respectively. We conducted experiments on real world benchmark graphs to evaluate the performance of our algorithms. We obtained remarkable results in compared to the state-of-the-art algorithms. To achieve our second goal, we developed a method to convert a clustering problem to a community detection problem. To evaluate the performance of our new clustering method, we applied that on two real-world datasets: dataset of Shakespearean era play and dataset of charity and not-for-profit sector. The performance of this method is compared with well-known clustering methods. The results are very promising, and show that our method competes well against popular clustering algorithms. Additionally, we developed a method to investigate the gene expression data. We extensively study and analyse a well-known breast cancer gene expression dataset (METABRIC) to identify a new way of subtyping breast cancer patients. Moreover, we obtained a list of potential biomarker genes that are capable of discriminating between different subtypes of the breast cancer patients. We evaluated our findings against three breast cancer subtyping methods. Based on all the experiments, the achievements of this thesis can be summarised as the two reliable and competent memetic algorithms for detecting community structure by means of modularity optimisation. Also, a new clustering method was proposed with outperforming results and great applications in investigating gene-expression data.

History

Year awarded

2017.0

Thesis category

  • Doctoral Degree

Degree

Doctor of Philosophy (PhD)

Supervisors

Moscato, Pablo (University of Newcastle); Berretta, Regina (University of Newcastle)

Language

  • en, English

College/Research Centre

Faculty of Engineering and Built Environment

School

School of Electrical Engineering and Computer Science

Rights statement

Copyright 2017 Leila Moslemi Naeni

Usage metrics

    Theses

    Categories

    No categories selected

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC