Information Graphic Summarization using a Collection of Multimodal Deep Neural Networks
Conference proceeding
Edward Kim, Connor Onweller, and Kathleen F. McCoy
2020 25th International Conference on Pattern Recognition (ICPR), pp. 10188–10195
10 Jan 2021

Abstract

Keywords: Blogs, Cognition, Deep learning, Neural networks, Social networking (online), Training data, Visualization
We present a multimodal deep learning framework that can generate summarization text supporting the main idea of an information graphic for presentation to a person who is blind or visually impaired. The framework utilizes the visual, textual, positional, and size characteristics extracted from the image to create the summary. Different and complementary neural architectures are optimized for each task using crowdsourced training data. From our quantitative experiments and results, we explain the reasoning behind our framework and show the effectiveness of our models. Our qualitative results showcase text generated from our framework and show that Mechanical Turk participants favor it over other automatic and human-generated summarizations. We describe the design and results of an experiment evaluating the utility of our system for people with visual impairments in the context of understanding Twitter Tweets containing line graphs.

Metrics

9 citations in Scopus

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from the InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
Web of Science research areas
Computer Science, Artificial Intelligence
Engineering, Electrical & Electronic
Imaging Science & Photographic Technology