Open Research Newcastle
Learning deep representations for video-based intake gesture detection

journal contribution
posted on 2025-05-11, 17:41 authored by Philipp V. Rouast, Marc Adam
Automatic detection of individual intake gestures during eating occasions has the potential to improve dietary monitoring and support dietary recommendations. Existing studies typically make use of on-body solutions such as inertial and audio sensors, while video is used as ground truth. Intake gesture detection based directly on video has rarely been attempted. In this study, we address this gap and show that deep learning architectures can be successfully applied to the problem of video-based detection of intake gestures. For this purpose, we collect and label video data of eating occasions using 360-degree video of 102 participants. Applying state-of-the-art approaches from video action recognition, our results show that (1) the best model achieves an F1 score of 0.858, (2) appearance features contribute more than motion features, and (3) temporal context in the form of multiple video frames is essential for top model performance.
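The F1 score reported in the abstract is the harmonic mean of precision and recall over detected gestures. As a minimal illustrative sketch (the labels below are hypothetical, not data from the study), frame-level F1 for binary intake-gesture labels can be computed as:

```python
# Hypothetical sketch: frame-level F1 for binary intake-gesture labels.
# 1 = intake gesture present in the frame, 0 = absent.
def f1_score(y_true, y_pred):
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Illustrative example labels (not from the study's dataset):
y_true = [0, 1, 1, 0, 1, 0, 0, 1]  # ground-truth frames
y_pred = [0, 1, 0, 0, 1, 1, 0, 1]  # model predictions
print(round(f1_score(y_true, y_pred), 3))  # prints 0.75
```

The paper's evaluation detects discrete gestures rather than classifying every frame independently, so this frame-level variant is only a simplified stand-in for the metric's definition.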

History

Journal title

IEEE Journal of Biomedical and Health Informatics

Volume

24

Issue

6

Pagination

1727-1737

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Language

English

College/Research Centre

Faculty of Engineering and Built Environment

School

School of Electrical Engineering and Computer Science

Rights statement

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
