0% Complete
English
صفحه اصلی
/
پانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
Evaluating LLMs in Persian News Summarization
نویسندگان :
Arya VarastehNezhad
1
Reza Tavasoli
2
Mostafa Masumi
3
Seyed Soroush Majd
4
Mehrnoush Shamsfard
5
1- University of Tehran
2- University of South Carolina
3- Sharif University of Technology
4- shahid beheshti university
5- shahid beheshti university
کلمات کلیدی :
Text Summarization،Large Language Models،Persian News،LLM Evaluation،Natural Language Processing،Artificial Intelligence
چکیده :
This study evaluates the performance of eight Large Language Models (LLMs) in Persian news summarization: GPT-4o, Claude-3.5-Sonnet, Gemini-Pro-1.5, Llama-3.1-405B, Command-R, Mistral-Large-2, DeepSeek V2.5, and Gemma-2-9B. We assess these models across five news categories: Economy, International, Sports, Technology, and Social, using the pn_summary dataset. Our evaluation employs multiple metrics, including BERTScore and ROUGE, across two input conditions: article-only and article-with-title. Results show that Llama-3.1-405b performed best against reference summaries in the article-only setting, achieving the highest BERTScore F1 (50.60) and ROUGE-L (33.96) scores. Notably, including article titles helped models produce summaries which aligned more closely to the reference summary, increasing the average BERTScore F1 from 48.31 to 50.16 across most models. Moreover, when comparing generated summaries to original articles, Mistral-Large-2 led with a BERTScore F1 of 48.09. In category-specific analysis, Mistral-Large-2 consistently outperformed the reference summaries across all news categories, with the most significant improvement in the Economic category. This study provides valuable insights into the current capabilities of LLMs for Persian summarization, highlighting their potential and the impact of input structure on performance. Our findings contribute to the growing body of research on multilingual summarization and have practical implications for Persian language processing applications.
لیست مقالات
لیست مقالات بایگانی شده
تحلیل کتابسنجی از مقالات حوزه دوقلوهای دیجیتال
فاطمه مکی زاده - سارا صراف - مصطفی شیرالی
A Hybrid Crow Search and Penguin Optimization Algorithm (CPMM) for Efficient Cloud Workflow Scheduling
Reza Akraminejad - Farhad Kazemipour - Mozhdeh Koreh Davoodi
بررسی روشها، مجموعههای داده و معیارهای ارزیابی در حوزهی پرسش از متون درون تصویر
کبری فرشیدی - حسن ختنلو - محرم منصوری زاده - الهام علی قارداش
Identifying Children's Personality Styles through Drawing Analysis using Machine Learning
Maedeh Mosharraf - Faezeh Banabazi
Short-Term Traffic Flow Prediction Based on a Recurrent Deep Neural Networks: Study in Tehran
Dr Monireh عبدوس - Taha Vajed Samei
تولید خودکار موارد آزمون برای پوشش مسیر اصلی با الگوریتم جایا
ُSaba Yadegari - Mohammad-Reza Keyvanpour
A Deep Neural Network-based Method for MmWave Time-varying Channel Estimation
Amirhossein Molazadeh - Zahra Maroufi - Mehrdad Ardebilipour
Distributed coordination protocol for event data exchange in IoT monitoring applications
Behnam Khazael - Hadi Tabatabaee Malazi
پیش بینی گره های رهبر در شبکه های اجتماعی با استفاده از پیش بینی پیوند
روح اله رشیدی - فرساد زمانی بروجنی - محمد رضا سلطان آقایی - هادی فرهادی
A Comparative Evaluation of Machine Learning Models for Anomaly-Based IDS in IoT Networks
Seyed Amir Mousavi - Mostafa Sadeghi - Mohammad Sadeq Sirjani
بیشتر
ثمین همایش، سامانه مدیریت کنفرانس ها و جشنواره ها - نگارش 42.5.2