Information Graphic Summarization using a Collection of Multimodal Deep Neural Networks
Conference proceeding
Edward Kim, Connor Onweller, and Kathleen F. McCoy
2020 25th International Conference on Pattern Recognition (ICPR), pp. 10188–10195
10 Jan 2021

Abstract

Keywords: Blogs, Cognition, Deep learning, Neural networks, Social networking (online), Training data, Visualization
We present a multimodal deep learning framework that can generate summarization text supporting the main idea of an information graphic for presentation to a person who is blind or visually impaired. The framework utilizes the visual, textual, positional, and size characteristics extracted from the image to create the summary. Different and complementary neural architectures are optimized for each task using crowdsourced training data. From our quantitative experiments and results, we explain the reasoning behind our framework and show the effectiveness of our models. Our qualitative results showcase text generated from our framework and show that Mechanical Turk participants favor it over other automatic and human-generated summarizations. We describe the design and results of an experiment evaluating the utility of our system for people with visual impairments in the context of understanding Twitter Tweets containing line graphs.

Metrics

9 citations in Scopus

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from the InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
Web of Science research areas
Computer Science, Artificial Intelligence
Engineering, Electrical & Electronic
Imaging Science & Photographic Technology