
Using support vector machine ensembles for target audience classification on Twitter

journal contribution
posted on 2025-05-10, 12:10 authored by Siaw Ling Lo, Raymond Chiong, David Cornforth
The vast amount and diversity of the content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, our aim is to investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation efforts. Topic domains were automatically discovered from contents shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using contents from different account owners of the various topic domains identified by Twitter LDA. Experimental results show that the methods presented are able to successfully identify a target audience with high accuracy. In addition, we show that using a statistical inference approach such as bootstrapping in over-sampling, instead of using random sampling, to construct training datasets can achieve a better classifier in an SVM ensemble. We conclude that such an ensemble system can take advantage of data diversity, which enables real-world applications for differentiating prospective customers from the general audience, leading to business advantage in the crowded social media space.
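The bootstrapped over-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a two-class setting with (text, label) pairs, where the smaller class is resampled with replacement until it matches the larger one, and each resulting dataset would then train one ensemble member. The function names (`bootstrap_oversample`, `build_ensemble_training_sets`, `majority_vote`) are hypothetical, and the SVM training step itself is left out; any SVM implementation could be fitted on each set.

```python
import random
from collections import Counter


def bootstrap_oversample(majority, minority, rng):
    """Build one balanced training set (hypothetical helper).

    Keeps every majority-class item and adds a bootstrap sample of the
    minority class, drawn with replacement, of the same size.
    """
    resampled = [rng.choice(minority) for _ in range(len(majority))]
    return majority + resampled


def build_ensemble_training_sets(majority, minority, n_members, seed=0):
    """Create one bootstrapped training set per ensemble member."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [bootstrap_oversample(majority, minority, rng)
            for _ in range(n_members)]


def majority_vote(predictions):
    """Combine per-member predictions by simple majority (an assumption;
    other combination rules are possible)."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: label 0 = general audience, label 1 = target audience.
    general = [(f"general tweet {i}", 0) for i in range(6)]
    target = [(f"target tweet {i}", 1) for i in range(2)]
    sets = build_ensemble_training_sets(general, target, n_members=3)
    for s in sets:
        print(len(s), Counter(label for _, label in s))
```

Because each member sees a different bootstrap sample of the smaller class, the members differ from one another, which is one way an ensemble can exploit diversity in the data.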

Journal title

PLoS ONE

Volume

10

Issue

4

Publisher

Public Library of Science (PLoS)

Language

  • en, English

College/Research Centre

Faculty of Science and Information Technology

School

School of Design, Communication and Information Technology

Rights statement

© 2015 Lo et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
