1- Physiotherapy Research Center, School of Rehabilitation, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
2- Department of Biomedical Engineering, Faculty of Engineering, Islamic Azad University, Mashhad Branch, Mashhad, Razavi Khorasan, Iran.
3- Biomedical Engineering Research Center, New Health Technologies, Baqiyatallah University of Medical Sciences, Tehran, Iran.
4- Department of Microbiology, Faculty of Basic Sciences, Islamic Azad University, Mashhad Branch, Mashhad, Razavi Khorasan, Iran.
5- Neuroscience Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran.
Abstract:
The use of brain–computer interfaces (BCIs) to decode imagined speech has significant clinical and assistive potential. This review covers twenty-six studies of covert speech decoding, published between 2009 and 2025, that used EEG, fNIRS, or hybrid EEG–fNIRS systems. Early work (2009–2012) focused primarily on decoding phonemes and syllables from EEG, achieving accuracy rates of around 75%. From 2013 to 2017, convolutional neural network (CNN)-based phoneme decoding produced highly variable results (40%–83%), with more complex multiclass tasks occasionally performing poorly (as low as 26.7%). Since 2018, binary paradigms such as yes/no responses have reached 64%–100% accuracy. CNN variants (about 83.4%), AlexNet (90.3%), and long short-term memory recurrent networks (LSTM-RNNs; 92.5%) showed notable improvements, whereas architectures such as EEGNet and SPDNet often underperformed (24.79%–66.93%). In hybrid EEG–fNIRS studies, CNNs achieved roughly 53% accuracy, while traditional classifiers such as support vector machines (SVM) and linear discriminant analysis (LDA) performed better, reaching 78%–79%. These results indicate that although deep learning and multimodal systems hold promise for improving imagined speech decoding, major challenges remain in generalization, variability, and robustness.
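To make the class of pipelines summarized above concrete, the sketch below shows a minimal binary imagined-speech classifier of the traditional kind discussed in the abstract: hand-crafted band-power features fed to an SVM with cross-validated accuracy. Everything in it is an illustrative assumption rather than the method of any reviewed study; the trial shapes, the feature choice, and the classifier settings are placeholders, and the synthetic data will score near chance by construction.

```python
# Minimal sketch of a binary imagined-speech EEG classifier (e.g. yes/no),
# assuming a traditional band-power + SVM pipeline. All shapes and settings
# here are hypothetical, not taken from the reviewed studies.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Assumed toy data: 200 trials x 64 channels x 512 samples (2 s at 256 Hz),
# with binary labels standing in for imagined "yes" vs. "no" trials.
X_raw = rng.standard_normal((200, 64, 512))
y = rng.integers(0, 2, size=200)

def band_power_features(trials):
    """Log power per channel via the FFT — one common hand-crafted feature."""
    spectra = np.abs(np.fft.rfft(trials, axis=-1)) ** 2
    # Collapse each channel's spectrum to a single log-power value per trial.
    return np.log(spectra.mean(axis=-1) + 1e-12)

X = band_power_features(X_raw)  # shape: (trials, channels)

# Standardize features, then classify with an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print(f"5-fold accuracy: {scores.mean():.2f} ± {scores.std():.2f}")
```

With real epoched EEG trials substituted for the synthetic arrays, the same feature-extraction and cross-validation skeleton applies; the deep-learning approaches compared in the review replace the hand-crafted feature step with learned representations.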
Type of Study: Review | Subject: Cognitive Neuroscience
Received: 2025/08/07 | Accepted: 2025/12/24