16th International Conference on Information and Knowledge Technology
A Deep Learning Framework for Phase-Aware Feature Representation to Improve Sound Source Direction and Distance Estimation
Authors:
Zahra Abolfazli (1)
Hamid Reza Abutalebi (2)
1- Yazd University
2- Yazd University
Keywords:
Sound Event Localization and Detection (SELD), phase spectrogram features, Conformer
Abstract:
This paper proposes a novel refinement of the network's input features to improve distance and direction-of-arrival (DOA) estimation in sound event localization and detection (SELD) systems. Instead of relying on Mel energies, we propose using phase spectrograms as input features, since they preserve inter-channel time delays and capture crucial wave propagation characteristics. We also introduce two architectural changes for increased robustness. First, Huber loss replaces mean squared error (MSE), reducing sensitivity to noise. Second, multi-head self-attention (MHSA) layers are replaced with Conformer blocks to better model both long-range dependencies and local interactions within the audio data. Our experimental results validate the proposed phase-based feature representation and optimized architecture, showing improvements in both DOA and distance estimation.
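The two ideas in the abstract that lend themselves to a short illustration are the phase spectrogram feature and the Huber loss. The following is a minimal NumPy sketch, not the authors' implementation: the function names, the FFT size, the hop length, the Hann window, and the Huber threshold `delta` are all assumptions chosen for illustration, since the abstract does not specify them. It shows that the phase difference between two channels at a given frequency bin encodes the inter-channel delay, and how Huber loss grows only linearly for large errors.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Short-time Fourier transform of a 1-D signal using a Hann window."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (frames, n_fft // 2 + 1)

def phase_spectrogram(audio, n_fft=512, hop=256):
    """audio: (channels, samples) -> phase in radians, (channels, frames, bins)."""
    return np.angle(np.stack([stft(ch, n_fft, hop) for ch in audio]))

def inter_channel_phase_difference(phase, ref=0):
    """Phase of every channel relative to channel `ref`, wrapped to (-pi, pi]."""
    diff = phase - phase[ref]
    return np.angle(np.exp(1j * diff))

def huber_loss(pred, target, delta=1.0):
    """Quadratic for |error| <= delta, linear beyond it (delta is an assumed value)."""
    err = np.abs(pred - target)
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return np.mean(np.where(err <= delta, quadratic, linear))

# Hypothetical two-microphone example: channel 1 is channel 0 delayed by 3 samples.
n_fft, k0, delay = 512, 32, 3
t = np.arange(4096)
ch0 = np.cos(2 * np.pi * k0 * t / n_fft)
ch1 = np.cos(2 * np.pi * k0 * (t - delay) / n_fft)

phase = phase_spectrogram(np.stack([ch0, ch1]), n_fft=n_fft)
ipd = inter_channel_phase_difference(phase)[1, :, k0]

# For a pure tone at bin k0, a delay of d samples shifts the phase by -2*pi*k0*d/n_fft.
expected = -2 * np.pi * k0 * delay / n_fft
print("measured phase difference:", ipd.mean(), "expected:", expected)

# Compare Huber loss and MSE on one small and one large error.
pred, target = np.array([0.0, 0.0]), np.array([0.5, 3.0])
print("Huber:", huber_loss(pred, target), "MSE:", np.mean((pred - target) ** 2))
```

A Mel energy feature discards this phase term entirely, which is why the delay in the example would not be recoverable from magnitudes alone. In a training pipeline the loss would normally come from the deep learning framework in use rather than from a hand-written NumPy function.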