Conference proceeding
Fusion of Text and Audio Semantic Representations Through CCA
MULTIMODAL PATTERN RECOGNITION OF SOCIAL SIGNALS IN HUMAN-COMPUTER-INTERACTION, v 8869, pp 66-73
01 Jan 2015
Featured in Collection: UN Sustainable Development Goals @ Drexel
Abstract
Humans are natural multimedia processing machines. Multimedia is a multimodal domain that includes audio, text, and images. A central aspect of multimedia processing is the coherent integration of media from different modalities as a single identity. Multimodal information fusion architectures become a necessity when not all information channels are available at all times. In this paper, we introduce a multimodal fusion of audio signals and lyrics in a shared semantic space through canonical correlation analysis (CCA). We propose an audio retrieval system based on extended semantic analysis of audio signals, and we combine this model with a tf-idf representation of lyrics to obtain a multimodal retrieval system. CCA and supervised learning methods serve as the basis for relating audio and lyrics information. Our experimental evaluation indicates that the proposed model outperforms prior approaches based on simple canonical correlation methods. Finally, the efficiency of the proposed method makes it possible to handle large music and lyrics collections, enabling users to explore relevant lyrics information for music datasets.
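The fusion step named in the abstract can be illustrated with a minimal sketch of regularized CCA followed by cross-modal ranking. This is not the authors' implementation. The function names (`fit_cca`, `project`, `rank_lyrics`), the ridge term `reg`, the cosine-similarity ranking, and the synthetic data are all assumptions made for illustration. The random matrices merely stand in for audio semantic features and tf-idf lyrics vectors.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Fit regularized CCA on paired views X (n x p) and Y (n x q).

    Returns the view means, the two projection matrices, and the top-k
    canonical correlations. `reg` is an assumed ridge term for stability.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with its Cholesky factor: T = Lx^{-1} Cxy Ly^{-T}.
    Lx, Ly = np.linalg.cholesky(Cxx), np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the whitened directions back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return mx, my, Wx, Wy, s[:k]

def project(V, mean, W):
    """Center a view and project it into the shared space."""
    return (V - mean) @ W

def rank_lyrics(audio_query, lyrics_proj):
    """Return lyrics indices sorted by decreasing cosine similarity."""
    q = audio_query / (np.linalg.norm(audio_query) + 1e-12)
    norms = np.linalg.norm(lyrics_proj, axis=1, keepdims=True) + 1e-12
    return np.argsort(-((lyrics_proj / norms) @ q))

# Synthetic paired data: both views are noisy linear maps of one latent factor.
rng = np.random.default_rng(0)
n, latent_dim = 200, 3
Z = rng.standard_normal((n, latent_dim))
audio = Z @ rng.standard_normal((latent_dim, 10)) + 0.1 * rng.standard_normal((n, 10))
lyrics = Z @ rng.standard_normal((latent_dim, 8)) + 0.1 * rng.standard_normal((n, 8))

mx, my, Wx, Wy, corrs = fit_cca(audio, lyrics, k=latent_dim)
audio_proj = project(audio, mx, Wx)
lyrics_proj = project(lyrics, my, Wy)
ranking = rank_lyrics(audio_proj[0], lyrics_proj)  # lyrics ranked for track 0
```

In this sketch, a query from either modality is projected with its own matrix and compared against items of the other modality in the shared space, which is what lets retrieval proceed when only one channel is available.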
Details
- Title
- Fusion of Text and Audio Semantic Representations Through CCA
- Creators
- Kamelia Aryafar - Drexel University
- Ali Shokoufandeh - Drexel University
- Contributors
- F Schwenker (Editor)
- S Scherer (Editor)
- L P Morency (Editor)
- Publication Details
- MULTIMODAL PATTERN RECOGNITION OF SOCIAL SIGNALS IN HUMAN-COMPUTER-INTERACTION, v 8869, pp 66-73
- Series
- Lecture Notes in Artificial Intelligence
- Publisher
- Springer Nature
- Number of pages
- 8
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Computer Science
- Web of Science ID
- WOS:000360223900007
- Scopus ID
- 2-s2.0-84927928093
- Other Identifier
- 991019168075004721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Web of Science research areas
- Computer Science, Artificial Intelligence
- Computer Science, Theory & Methods