Abstract

Indonesian Sign Language (BISINDO) serves as the primary means of communication for the deaf community. However, limited public understanding and the lack of practical real-time translation technology remain significant barriers to effective two-way communication. Most prior research has focused on foreign sign languages or relied on sensor-based gloves, which are less flexible for everyday use. This study proposes a real-time BISINDO translation system that converts hand gestures into speech using a camera and an ESP32 microcontroller. The system employs a CNN-LSTM deep learning model implemented in Python to classify gestures representing letters A to J, then wirelessly transmits the classification results to the ESP32, which triggers the corresponding audio output. A custom gesture dataset was collected and enhanced through preprocessing and data augmentation to support model training. Evaluation results demonstrate a classification accuracy of 91.4%, with a precision of 89.7%, recall of 90.5%, and F1-score of 89.9%. The average communication latency was recorded at 3.1 seconds, and the speech output success rate reached 86.7%. The system has proven reliable for real-time automatic gesture-to-speech translation and holds potential for further development as an inclusive communication aid for individuals with hearing impairments in Indonesia. This study serves as an initial foundation for future advancements in assistive communication technologies.
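The pipeline the abstract describes (classify a gesture on the PC, then wirelessly send the result to the ESP32 to trigger audio) can be sketched in Python. The paper does not specify the wire format, so the JSON payload, label set A–J, and confidence threshold below are illustrative assumptions, not the authors' protocol:

```python
# Hypothetical sketch of the PC-side hand-off step: the paper does not
# publish its message format, so the JSON payload, the 0.8 confidence
# threshold, and the label list are assumptions for illustration only.
import json

# The ten gesture classes the CNN-LSTM distinguishes (letters A to J).
LABELS = [chr(ord("A") + i) for i in range(10)]

def build_esp32_message(class_index: int, confidence: float,
                        threshold: float = 0.8):
    """Map a predicted class index to its BISINDO letter and build the
    payload sent to the ESP32; return None for low-confidence predictions
    so the device does not play spurious audio."""
    if not 0 <= class_index < len(LABELS):
        raise ValueError("class index outside the A-J label range")
    if confidence < threshold:
        return None
    return json.dumps({"letter": LABELS[class_index],
                       "conf": round(confidence, 3)})
```

In a full system this string would be sent over the WiFi link (e.g. via a socket or HTTP request) and parsed on the ESP32, which then selects the matching audio clip; thresholding on confidence is one simple way to trade a little recall for fewer false speech outputs.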

Article Details

How to Cite
I Gusti Agung Made Yoga Mahaputra, Putri Alit Widyastuti Santiary, & I Ketut Swardika. (2025). Rancang Bangun Penerjemah BISINDO Real-time Berbasis Kamera dan Deep Learning dengan Kendali Suara ESP32 WiFi [Design and Development of a Real-time Camera- and Deep Learning-Based BISINDO Translator with ESP32 WiFi Speech Control]. Jurnal Elektro Dan Mesin Terapan, 11(1), 33–42. https://doi.org/10.35143/elementer.v11i1.6578