0% Complete
فارسی
Home
/
پانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
GanjNet: Leveraging Network Modeling with Large Language Models for Persian Word Sense Induction
Authors :
Amir Mohammad Kouyeshpour
1
Hadi Veisi
2
Saman Haratizadeh
3
1- دانشگاه تهران ٫ دانشکده علوم و فنون نوین
2- دانشگاه تهران ٫ دانشکده علوم و فنون نوین
3- دانشگاه تهران ٫ دانشکده علوم و فنون نوین
Keywords :
Word Sense Induction،Network Modeling،Community Detection،Large Language Models،Persian NLP،Lexical Semantics
Abstract :
Abstract—This paper introduces GanjNet, a novel approach to Word Sense Induction (WSI) in the Persian language that leverages network modeling and community detection in conjunction with large language models (LLMs). We present a method that constructs semantic graphs from lexical substitutes generated by LLMs and applies community detection algorithms to uncover and distinguish word senses in unannotated text. GanjNet addresses challenges such as limited annotated resources, high degrees of polysemy, and context-sensitive meanings in Persian. By leveraging unsupervised techniques, we enhance sense induction without relying on extensive labeled data. Our experiments demonstrate that GanjNet outperforms existing methods on a custom dataset derived from MirasText, achieving a V-measure of 47% and a paired F-score of 58%, compared to the best baseline method with a V-measure of 41% and a paired F-score of 53%. These results showcase the potential of integrating community detection and LLMs for unsupervised semantic tasks in morphologically rich languages like Persian. Moreover, GanjNet’s flexibility offers practical applicability across various domains, including automatic thesaurus and WordNet generation, as well as assisting writers in context-sensitive word choice, demonstrating its broader impact on natural language understanding.
Papers List
List of archived papers
دستهبندی متون خبری فارسی با یادگیری فعال
مینا طباطبائی - دکتر سعیده ممتازی
IT-based and Non-IT-based methods to separate and collect waste
Hoda Harati - Farzad Haghighi-Rad - Reza Yousefi Zenouz
حفظ حریم خصوصی در انتشار نسخه های متوالی دادههای شبکه اجتماعی با امکان افزایش یال
طاهره سرزهی - دکتر مهری رجایی طاهره سرزهی - مهری رجایی -
Challenges of Specification Mining-based Test Oracle for Cyber-Physical Systems
Maryam Raiyat Aliabadi - Dr Mojtaba Vahidi - Dr Ramak Ghavamizadeh
An integrated approach for estimating software cost estimation using Adaptive Neuro-Fuzzy Inference System and the Grey Wolf Optimization algorithm
Maryam Karimi - Taghi Javdani Gandomani - Mahdi Mosleh
IoT-Driven Water Quality Management System using Deep Q-Network
Shakiba Rajabi - Komeil Moghaddasi
A High-Speed Quantum Reversible Controlled Adder/Subtractor Circuit
Negin Mashayekhi - Mohammad Reza Reshadinezhad - Shekoofeh Moghimi
تشخیص خودکار اختلال عروقی ماکولا با عنوان عروق گسترش یافته در تصاویر آنژیوگرافی حاصل از تصویربرداری OCTA
راضیه گنجی - دکتر محسن ابراهیمی مقدم - دکتر رامین نوری نیا
مکانیابی بهینه آلودگی در شبکههای توزیع آب با استفاده از تکنولوژی اینترنت اشیاء بر مبنای پیشبینی سری زمانی چند متغیره
زینب محزون - امید بوشهریان
Cryptanalysis of two password authenticated key exchange schemes
Mohammad Ali Poorafsahi - Hamid Mala
more
Samin Hamayesh - Version 42.5.2