Open Research Newcastle

Fake news detection in low-resource languages: A novel hybrid summarization approach

journal contribution
posted on 2025-05-09, 03:43 authored by Jawaher Alghamdi, Yuqing Lin, Suhuai Luo
The proliferation of fake news across languages and domains on social media platforms poses a significant societal threat. Current automatic detection methods for low-resource languages (e.g., Swahili and Indonesian) are limited by two factors: sequence-length restrictions in pre-trained language models (PLMs) such as multilingual Bidirectional Encoder Representations from Transformers (mBERT), and noisy training data. This work proposes a novel and efficient multilingual fake news detection (MFND) approach that addresses both challenges. Our solution uses a hybrid extractive and abstractive summarization strategy to keep only the most relevant content from each news article. This significantly reduces input length while preserving the information that matters for fake news classification. The condensed text is then fed into mBERT for classification. Extensive evaluations on a publicly available multilingual dataset show that our approach outperforms state-of-the-art (SOTA) methods. Our quantitative and qualitative analysis highlights the strengths of the method, which sets new performance benchmarks, and shows how content condensation affects model accuracy and efficiency. This framework paves the way for faster, more accurate MFND and more robust information ecosystems.
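To illustrate the condensation idea in the abstract, here is a minimal sketch of an extractive step that keeps the highest-scoring sentences within a length budget. It is not the authors' method: the function names, the word-frequency scoring, the naive sentence splitter, and the `max_words` budget are all assumptions made for illustration. BERT-style models such as mBERT accept at most 512 subword tokens, so a word count is only a rough proxy for that limit. The abstractive step and the mBERT classifier are omitted.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter on ., ! or ? followed by whitespace (an assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_condense(text, max_words=400):
    """Keep the highest-scoring sentences that fit in max_words, in original order.

    Scoring is the average in-article word frequency, a hypothetical
    stand-in for whatever extractive scorer the paper actually uses.
    """
    sentences = split_sentences(text)
    tokenized = [re.findall(r"\w+", s.lower()) for s in sentences]
    freq = Counter(w for words in tokenized for w in words)

    def score(words):
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices from most to least salient.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]), reverse=True)

    chosen, used = [], 0
    for i in ranked:
        n = len(tokenized[i])
        if n and used + n <= max_words:
            chosen.append(i)
            used += n

    # Restore the original sentence order so the condensed text stays readable.
    return " ".join(sentences[i] for i in sorted(chosen))


article = ("Fake news spreads fast. The weather was nice. "
           "Fake news detection needs short news input.")
condensed = extractive_condense(article, max_words=12)
```

In a fuller pipeline, the condensed text would next pass through an abstractive summarizer and then the classifier; swapping the word budget for a real tokenizer's token count would give a tighter fit to the model's input limit.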

History

Journal title

Knowledge-Based Systems

Volume

296

Issue date

19 July 2024

Article number

111884

Publisher

Elsevier

Language

  • en, English

College/Research Centre

College of Engineering, Science and Environment

School

School of Information and Physical Sciences

Rights statement

© 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
