Human-Robot Interaction in Bengali language for Health Automation integrated with Speaker Recognition and Artificial Conversational Entity
Shehan Irteza Pranto, Rahad Arman Nabid, Ahnaf Mozib Samin, Nabeel Mohammed, Farhana Sarker, Mohammad Nurul Huda, Khondaker A. Mamun
This study presents an architecture for a Human-Robot Interaction (HRI) based Artificial Conversational Entity with integrated speaker recognition, intended to deliver modern healthcare services. During the Covid-19 pandemic, visiting hospitals has become hazardous for health workers and patients alike because of the high risk of virus transmission. To reduce crowding, the proposed architecture offers a cost-effective solution that automates the reception system through AI-based HRI and provides fast healthcare services in the context of Bangladesh. The architecture consists of two major subsystems: Speaker Recognition, and an Artificial Conversational Entity comprising Bengali Automatic Speech Recognition (ASR), an Interactive Agent, and Text-to-Speech synthesis. We used MFCC features as the acoustic parameters, a GMM statistical model to adapt to each speaker's voice, and the expectation-maximization (EM) algorithm to identify the speaker. The speaker recognition module achieved 94.38% average accuracy in noisy environments and 96.27% average accuracy in studio-quality environments. The Bangla ASR, built on the RNN-based Deep Speech 2 model, achieved a word error rate (WER) of 42.15%. The Artificial Conversational Entity achieved an average accuracy of 98.58% in a small-scale real-time environment.
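As an illustration of the GMM-based speaker identification step mentioned in the abstract, the following minimal Python sketch scores a sequence of feature frames against one diagonal-covariance GMM per enrolled speaker and selects the speaker with the highest average log-likelihood. It is not the authors' implementation: the function names, the two-dimensional toy frames standing in for MFCC vectors, and all model parameters are hypothetical, and the EM training that would normally fit each GMM is omitted.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under one GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over mixture components for numerical stability.
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def identify_speaker(frames, speaker_models):
    """Return the best-scoring speaker name and the score of every speaker."""
    scores = {
        name: gmm_log_likelihood(frames, gmm)
        for name, gmm in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical, hand-written models (assumption): in practice each GMM
# would be fitted with EM on MFCC frames from that speaker's recordings.
toy_models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [6.0, 6.0], [1.0, 1.0]), (0.5, [8.0, 8.0], [1.0, 1.0])],
}

# Toy 2-D frames standing in for MFCC vectors of an unknown utterance.
toy_frames = [[0.1, -0.2], [1.9, 2.2], [0.5, 0.4]]

best, scores = identify_speaker(toy_frames, toy_models)
print(best, scores)
```

Averaging the log-likelihood over frames keeps scores comparable across utterances of different lengths. A real system would use higher-dimensional MFCC vectors and more mixture components than this toy setup.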
Type Conference paper