How to find the word error rate of a spoken sentence for a regression-based model?

2 min read · Mar 11, 2024

I am working on visual speech synthesis using the GRID dataset, which consists of short sentences. The model I developed is regression-based: it takes a silent video as input and generates a speech signal. My aim is to compute the word error rate of the output speech signal, but I don't know how to separate the words in the input and output signals in order to compute it.


Word Error Rate (WER) is a widely used metric for evaluating Automatic Speech Recognition (ASR). To calculate WER for a visual speech synthesis (VSS) system, you need a reference word transcription and a hypothesis word transcription. Standard WER alignment between the two then gives the WER: the number of substituted, deleted, and inserted words divided by the number of words in the reference. Because the comparison is made on transcriptions, the speech signal itself does not have to be segmented into words. The transcriptions can be obtained in various ways. For example, the reference transcription might come from the labels of the visual dataset. The hypothesis transcription might come from the VSS system itself (if it has an intermediate representation in words) or from running ASR on the synthesized speech. Note that although WER is widely used, it does not capture all aspects of visual speech synthesis quality. Other evaluation metric
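The WER alignment described above can be sketched as a word-level edit distance. This is a minimal Python illustration, not the original author's code: the function name `word_error_rate` and the example sentences are assumptions made for the example. It also assumes the hypothesis transcription already exists, for instance from running an ASR system on the synthesized speech.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical GRID-style sentence pair (illustrative, not taken from the dataset labels).
reference = "bin blue at f two now"
hypothesis = "bin blue at s two"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

In this example the hypothesis has one substituted word and one missing word against a six-word reference, so the WER is 2/6. To score a whole test set, the usual practice is to sum the edit counts and the reference lengths over all sentences before dividing, rather than averaging per-sentence WER values.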
