12th International Conference on Information and Knowledge Technology
Conceptual Intelligent Model for Visual Question Answering using Attention Mechanism and Relational Reasoning
Authors:
Elham Alighardash (1), Hassan Khotanlou (2), Vahid Pour Amin (3)
1- Bu-Ali Sina University
2- Bu-Ali Sina University
3- Sayyed Jamaleddin Asadabadi University
Keywords:
visual question answering, attention mechanism, visual reasoning, zero-shot learning
Abstract:
In recent years, strong research interest in Visual Question Answering (VQA) has made it a hot topic in computer vision. Many sub-problems have been raised in this regard, and considerable effort has gone into solving them. Examples include attending to the salient elements of each modality, discovering inter- and intra-modality correlations, choosing a proper information fusion method, using supplementary information from external knowledge bases, visual reasoning, and accepting correct answers that were not seen in the training set. This paper focuses on reinforcing the model by reasoning about complex questions, applying the attention mechanism, and leveraging knowledge graphs (KG) to improve the generated answers. Moreover, the proposed conceptual model includes a zero-shot learning method that admits unlabeled correct answers by implementing a semantic space mapping approach. The research also suggests using the fact-based VQA knowledge base to integrate the scene graph with additional information. Implementing the proposed framework is expected to improve both the accuracy and the efficiency of predicting appropriate answers.
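The abstract names two mechanisms, question-guided attention and zero-shot answering through a semantic space mapping, without giving implementation details. The following is a minimal illustrative sketch of how such pieces could fit together, not the authors' model. Every name here (`attend`, `predict_answer`, the toy vectors) is hypothetical, and the sketch assumes the attended image vector already lives in the same semantic space as the answer embeddings, where a real system would learn that projection.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(question_vec, region_feats):
    """Question-guided soft attention (assumed dot-product scoring).

    Each image region is weighted by its relevance to the question,
    and the weighted sum of region features is returned.
    """
    weights = softmax([dot(question_vec, r) for r in region_feats])
    dim = len(region_feats[0])
    attended = [
        sum(w * r[i] for w, r in zip(weights, region_feats))
        for i in range(dim)
    ]
    return attended, weights


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def predict_answer(joint_vec, answer_embeddings):
    """Pick the answer whose embedding is nearest in the semantic space.

    Because candidates are compared by embedding rather than by a fixed
    classifier output, answers absent from the training labels can still
    be selected (the zero-shot case).
    """
    return max(
        answer_embeddings,
        key=lambda ans: cosine(joint_vec, answer_embeddings[ans]),
    )


# Toy data (hypothetical): two image regions and two candidate answers.
question_vec = [1.0, 0.0]
region_feats = [[5.0, 0.0], [0.0, 5.0]]
# "zebra" is assumed to be an answer never seen as a training label.
answer_embeddings = {"horse": [0.0, 1.0], "zebra": [1.0, 0.1]}

attended, weights = attend(question_vec, region_feats)
prediction = predict_answer(attended, answer_embeddings)
print(weights, prediction)
```

In this toy setup the first region dominates the attention weights, and the prediction is the unseen answer because its embedding is closest to the attended vector. Knowledge-graph facts and scene-graph reasoning, which the abstract also proposes, are outside the scope of this sketch.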