Roman Bednarik, Hung-Hsuan Huang, Kristiina Jokinen, Yukiko I. Nakano
Researchers are proposing interactive machine translation as a way to make the language translation process more efficient and usable. Additional modalities such as eye gaze and speech are being explored to increase the interactivity of language translation systems. Unfortunately, the raw data provided by Automatic Speech Recognition (ASR) and eye tracking are noisy and error-prone. This paper describes a technique for reducing the errors in the two modalities, speech and eye gaze, by using each to help correct the other in the context of sight translation and reading. The two modalities were integrated through lattice representation and composition. F-measure for eye gaze and word accuracy for ASR were used as the evaluation metrics. In the reading task, we demonstrated a significant improvement in both eye-gaze F-measure and speech word accuracy. In the sight translation task, a significant improvement was found in gaze F-measure but not in ASR word accuracy.
GazeIn '13. Proceedings of the 6th Workshop on Eye Gaze in Intelligent Human Machine Interaction: Gaze in Multimodal Interaction, 2013, pp. 35-40