16th International Conference on Information and Knowledge Technology
From Faces to Words: An Efficient Persian Visual Lip Reading
Authors:
Mana Amini (1)
Sajjad Aemmi (2)
Azadeh Ashouri (3)
Reza Akhoundzadeh (4)
Kourosh Hassanzadeh (5)
Mohammad Reza Mohammadi (6)
1- PART AI Research Center
2- PART AI Research Center
3- PART AI Research Center
4- PART AI Research Center
5- PART AI Research Center
6- Iran University of Science and Technology
Keywords:
Lip Reading, Visual Speech Recognition, CTC Loss, LSTM, Video-Based Authentication
Abstract:
Visual speech recognition, or lip reading, is the task of transcribing spoken content directly from video frames of a speaker’s mouth without relying on audio. We develop an end-to-end visual lip reading system that processes cropped mouth regions from video sequences and decodes them into text using recurrent neural networks trained with CTC loss. To extend beyond existing English datasets, we collected and manually annotated a new Persian Lip Reading Dataset (PLRD), a resource for studying lip reading in morphologically rich languages. Our experiments show that the proposed system achieves competitive word error rates on PLRD. Beyond transcription, the model can also be employed in authentication scenarios, where it verifies whether a spoken phrase in a video matches a given reference text. This demonstrates the potential of lip reading systems not only for accessibility and robust speech recognition in noisy environments, but also for secure user verification.
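The abstract does not give implementation details, so the following is only a minimal sketch of how CTC output could be decoded and how a phrase-matching check might be built on word error rate. It assumes the network emits one label index per frame, that index 0 is the CTC blank, and that verification accepts a video when the word error rate against the reference text stays under a threshold. The function names, the blank index, and the `max_wer` threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (not specified in the abstract)


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Best-path CTC decoding: merge consecutive repeats, then drop blanks."""
    decoded: List[int] = []
    prev = None
    for label in frame_labels:
        # Keep a label only when it differs from the previous frame and is not blank.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev_row[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev_row = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        row = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev_row[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev_row[j] + 1
            insertion = row[j - 1] + 1
            row.append(min(substitution, deletion, insertion))
        prev_row = row
    return prev_row[-1] / max(len(ref), 1)


def verify_phrase(reference: str, hypothesis: str, max_wer: float = 0.2) -> bool:
    """Hypothetical authentication rule: accept if the transcript is close enough."""
    return word_error_rate(reference, hypothesis) <= max_wer
```

In this sketch, a frame sequence such as `[0, 1, 1, 0, 1, 2, 2, 0]` collapses to `[1, 1, 2]`: the blank between the two runs of `1` is what lets CTC represent a repeated symbol. A real system would map the decoded indices to characters before computing the word error rate, and would likely tune the acceptance threshold on held-out data.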