Geed Lab

Neuroplasticity and Motor Function Recovery after Stroke

Toward Sensor-to-Text Generation: Leveraging LLM-Based Video Annotations for Stroke Therapy Monitoring


Journal article


Mohammad Akidul Hoque, Shamim Ehsan, Anuradha Choudhury, Peter S. Lum, Monika Akbar, Shashwati Geed, M. S. Hossain
Bioengineering, 2025

APA
Hoque, M. A., Ehsan, S., Choudhury, A., Lum, P. S., Akbar, M., Geed, S., & Hossain, M. S. (2025). Toward Sensor-to-Text Generation: Leveraging LLM-Based Video Annotations for Stroke Therapy Monitoring. Bioengineering.


Chicago/Turabian
Hoque, Mohammad Akidul, Shamim Ehsan, Anuradha Choudhury, Peter S. Lum, Monika Akbar, Shashwati Geed, and M. S. Hossain. “Toward Sensor-to-Text Generation: Leveraging LLM-Based Video Annotations for Stroke Therapy Monitoring.” Bioengineering (2025).


MLA
Hoque, Mohammad Akidul, et al. “Toward Sensor-to-Text Generation: Leveraging LLM-Based Video Annotations for Stroke Therapy Monitoring.” Bioengineering, 2025.


BibTeX

@article{mohammad2025a,
  title = {Toward Sensor-to-Text Generation: Leveraging LLM-Based Video Annotations for Stroke Therapy Monitoring},
  year = {2025},
  journal = {Bioengineering},
  author = {Hoque, Mohammad Akidul and Ehsan, Shamim and Choudhury, Anuradha and Lum, Peter S. and Akbar, Monika and Geed, Shashwati and Hossain, M. S.}
}

Abstract

Stroke-related impairment remains a leading cause of long-term disability, limiting individuals’ ability to perform daily activities. While wearable sensors offer scalable monitoring solutions during rehabilitation, they struggle to distinguish functional from non-functional movements, and manual annotation of sensor data is labor-intensive and prone to inconsistency. In this paper, we propose a novel framework that uses large language models (LLMs) to generate activity descriptions from video frames of therapy sessions. These descriptions are aligned with concurrently recorded accelerometer signals to create labeled training data. Through exploratory analysis, we demonstrate that accelerometer signals exhibit distinct temporal and statistical patterns corresponding to specific activities, supporting the feasibility of generating natural language narratives directly from sensor data. Our findings lay the foundation for future development of sensor-to-text models that can enable automated, non-intrusive, and scalable stroke rehabilitation monitoring without the need for manual or video-based annotation.
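
To make the alignment step concrete, the sketch below (Python, with hypothetical data, column names, and window length; not the paper's actual implementation) illustrates how timestamped LLM-generated frame descriptions might be paired with concurrently recorded accelerometer windows and summarized with simple statistical features to form labeled (sensor, text) training examples.

# Minimal sketch, assuming timestamped LLM frame descriptions and a
# fixed-rate tri-axial accelerometer stream; all names and values below
# are illustrative placeholders, not the authors' pipeline.
import numpy as np
import pandas as pd

# Hypothetical LLM-generated descriptions, one per annotated video frame.
annotations = pd.DataFrame({
    "t_sec": [0.0, 2.0, 4.0],
    "description": [
        "reaches toward cup with paretic arm",
        "grasps cup and lifts it",
        "rests hand on table",
    ],
})

# Hypothetical accelerometer recording (assumed 50 Hz sampling rate).
fs = 50
t = np.arange(0, 6, 1 / fs)
accel = pd.DataFrame({
    "t_sec": t,
    "ax": np.random.randn(t.size) * 0.1,
    "ay": np.random.randn(t.size) * 0.1,
    "az": 1.0 + np.random.randn(t.size) * 0.1,
})

window_sec = 2.0  # assumed window length per annotated frame
rows = []
for _, ann in annotations.iterrows():
    # Take the accelerometer window starting at the annotated frame time.
    mask = (accel["t_sec"] >= ann["t_sec"]) & (accel["t_sec"] < ann["t_sec"] + window_sec)
    win = accel.loc[mask, ["ax", "ay", "az"]]
    mag = np.linalg.norm(win.values, axis=1)  # acceleration magnitude
    rows.append({
        "description": ann["description"],   # text label from the LLM
        "mean_mag": mag.mean(),              # simple statistical features
        "std_mag": mag.std(),
        "range_mag": mag.max() - mag.min(),
    })

labeled = pd.DataFrame(rows)  # paired sensor features and text labels
print(labeled)

In this toy setup, each row of the resulting table pairs a window-level feature summary with its text description, which is the kind of labeled data a sensor-to-text model could be trained on.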

