16th International Conference on Information and Knowledge Technology
A Deep Learning Framework for Phase-Aware Feature Representation to Improve Sound Source Direction and Distance Estimation
Authors :
Zahra Abolfazli (1)
Hamid Reza Abutalebi (2)
1- Yazd University
2- Yazd University
Keywords :
Sound Event Localization and Detection (SELD), phase spectrogram features, Conformer
Abstract :
This paper proposes a novel refinement of the network's input features to improve distance and direction-of-arrival (DOA) estimation in a sound event localization and detection (SELD) system. Instead of relying on Mel energies, we use phase spectrograms as the input feature; these preserve inter-channel time delays and capture crucial wave propagation characteristics. We also introduce architectural improvements for increased robustness. Specifically, the Huber loss replaces the mean squared error (MSE) loss, reducing sensitivity to noise, and the multi-head self-attention (MHSA) layers are replaced with Conformer blocks to better model both long-range dependencies and local interactions within the audio data. Our experimental results validate the proposed phase-based feature representation and optimized architecture, showing improvements in both DOA and distance estimation.
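The following is a minimal NumPy sketch, not the authors' implementation, of two ideas mentioned in the abstract: a phase spectrogram retains the inter-channel time delay, and the Huber loss grows only linearly for large errors. The sampling rate, FFT size, hop size, test tone, delay, and `delta` are all illustrative assumptions, and the helper names (`stft`, `huber`) are hypothetical.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT of a 1-D signal: (frames, n_fft // 2 + 1) complex."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def huber(err, delta=1.0):
    """Element-wise Huber loss: quadratic for |err| <= delta, linear beyond."""
    a = np.abs(err)
    return np.where(a <= delta, 0.5 * a ** 2, delta * (a - 0.5 * delta))

# Assumed toy setup: a 1 kHz tone reaching microphone 2 four samples late.
fs, n_fft, f0, delay = 16000, 512, 1000.0, 4
n = np.arange(fs)
mic1 = np.sin(2 * np.pi * f0 * n / fs)
mic2 = np.sin(2 * np.pi * f0 * (n - delay) / fs)

# Phase spectrograms (one per channel) are the kind of feature the abstract names.
phase1 = np.angle(stft(mic1, n_fft))
phase2 = np.angle(stft(mic2, n_fft))

# Wrapped inter-channel phase difference at the tone's frequency bin.
k = int(round(f0 * n_fft / fs))
ipd = np.angle(np.exp(1j * (phase2 - phase1)))[:, k].mean()

# A delay of d samples shifts the phase by -2*pi*f0*d/fs, so invert that.
estimated_delay = -ipd * fs / (2 * np.pi * f0)

# Huber versus squared error on a small and a large residual.
residuals = np.array([0.5, 3.0])
huber_values = huber(residuals, delta=1.0)
squared_values = 0.5 * residuals ** 2
```

In this toy case the delay is recoverable from the phase alone, whereas Mel energies of the two channels would be nearly identical. For the large residual the Huber value is well below the squared-error value, which is the usual reason it is considered less sensitive to outliers.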