0% Complete
فارسی
Home
/
پانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
Evaluating LLMs in Persian News Summarization
Authors :
Arya VarastehNezhad
1
Reza Tavasoli
2
Mostafa Masumi
3
Seyed Soroush Majd
4
Mehrnoush Shamsfard
5
1- University of Tehran
2- University of South Carolina
3- Sharif University of Technology
4- shahid beheshti university
5- shahid beheshti university
Keywords :
Text Summarization،Large Language Models،Persian News،LLM Evaluation،Natural Language Processing،Artificial Intelligence
Abstract :
This study evaluates the performance of eight Large Language Models (LLMs) in Persian news summarization: GPT-4o, Claude-3.5-Sonnet, Gemini-Pro-1.5, Llama-3.1-405B, Command-R, Mistral-Large-2, DeepSeek V2.5, and Gemma-2-9B. We assess these models across five news categories: Economy, International, Sports, Technology, and Social, using the pn_summary dataset. Our evaluation employs multiple metrics, including BERTScore and ROUGE, across two input conditions: article-only and article-with-title. Results show that Llama-3.1-405b performed best against reference summaries in the article-only setting, achieving the highest BERTScore F1 (50.60) and ROUGE-L (33.96) scores. Notably, including article titles helped models produce summaries which aligned more closely to the reference summary, increasing the average BERTScore F1 from 48.31 to 50.16 across most models. Moreover, when comparing generated summaries to original articles, Mistral-Large-2 led with a BERTScore F1 of 48.09. In category-specific analysis, Mistral-Large-2 consistently outperformed the reference summaries across all news categories, with the most significant improvement in the Economic category. This study provides valuable insights into the current capabilities of LLMs for Persian summarization, highlighting their potential and the impact of input structure on performance. Our findings contribute to the growing body of research on multilingual summarization and have practical implications for Persian language processing applications.
Papers List
List of archived papers
Predicting Suicide Risk in Adolescents with Random Forest for Unbalanced Data Management
Fatemeh Rabbani - Dr Behrooz Masoumi - Dr Mohammad Reza Keyvanpour
PeCoQ: A Dataset for Persian Complex Question Answering over Knowledge Graph
Romina Etezadi - Mehrnoush Shamsfard
Enhancing Supervised Learning in Speech Emotion Recognition through Unsupervised Representations
Niloufar Faridani - Amirali Soltani Tehrani - Ramin Toosi
پیشبینی بستری مجدد بیماران با استفاده از استخراج مفاهیم زیستپزشکی از متون بالینی
فهیمه شاهرخ شهرکی - رسول سامانی - دکتر ناصر قدیری فهیمه شاهرخ شهرکی - رسول سامانی - ناصر قدیری -
Ensemble Model Based on an Improved Convolutional Neural Network with a Domain-agnostic Data Augmentation Technique
Faraz Fatahnaie - Armin Azhdehnia - Seyyed Amir Asghari - Mohammadreza Binesh Marvasti
طراحی واسط کاربری مبتنی بر رفتار و احساسات کاربران در سیستم های هوشمند
فاطمه صبائی - دکتر احمد عبداله زاده بارفروش
A New Method Based on Deep Learning and Time Stabilization of the Propagation Path for Fake News Detection
Fatemeh Torgheh - Dr Mohammad Reza Keyvanpour - Dr Behrooz Masoumi
Detection and Identification of Cyber-Attacks in Cyber-Physical Systems Based on Machine Learning Methods
Zohre Nasiri Zarandi
Sustainability analysis and improvement of model driven engineering and model transformation languages
Kevin Lano - Shekoufeh Kolahdouz Rahimi
روشی برای تشخیص مرحله پیشرفت آلزایمر در تصاویرFMRI مبتنی بر شبکه های عصبی چگال
فرساد زمانی بروجنی - عباس بهره دار
more
Samin Hamayesh - Version 42.5.2