Thesis
Underwater acoustic target recognition on ShipsEar dataset: a convolution-free, purely attention-based approach with audio spectrogram transformer
Master of Science (M.S.), Drexel University
Jun 2023
DOI: https://doi.org/10.17918/00001751
Abstract
Classifying underwater audio sources has various marine-oriented applications, including maritime and environmental monitoring, detection of marine life, and underwater surveillance. However, Underwater Acoustic Target Recognition (UATR) remains challenging due to several factors: the high cost of acquiring acoustic data, the lack of publicly accessible labeled datasets, interference induced by both underwater and over-the-air noise sources, and other environmental factors such as sound speed variations, weather, and bottom boundary conditions. In this thesis, we propose a convolution-free, purely attention-based transformer architecture for classifying underwater acoustic sources. The architecture, adapted from the Audio Spectrogram Transformer (AST), which has so far been applied in air acoustic settings, is capable of capturing long-range global context for enhanced underwater acoustic signal classification. We further improve performance by leveraging transfer learning from a Vision Transformer (ViT) pretrained on the ImageNet dataset. The AST model is trained on the ShipsEar dataset, which consists of 90 recordings of ship-emitted underwater sounds covering 11 types of vessels and one background noise category. These recordings, obtained under real conditions in shallow waters, contain both natural and anthropogenic sounds. The current state-of-the-art model for this task, a Convolutional Long Short-Term Memory (ConvLSTM) network, achieves an accuracy of 0.9878 and an F1 score of 0.9878. Our proposed AST model surpasses the ConvLSTM, reaching an accuracy of 0.9890 and an F1 score of 0.9897.
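As a rough illustration of what "convolution-free, purely attention-based" means for a spectrogram input, the sketch below cuts a spectrogram into patches, embeds each patch with a linear projection, and applies one single-head self-attention layer so that every patch can attend to every other patch. It is not the thesis implementation: the spectrogram size, patch size, stride, embedding width, and random weights are all assumptions chosen for illustration, and the positional embeddings, classification token, and ImageNet-pretrained ViT weights mentioned in the abstract are omitted.

```python
import numpy as np

def split_into_patches(spec, patch=16, stride=16):
    """Cut a (freq, time) spectrogram into flattened patches using slicing only."""
    n_f = (spec.shape[0] - patch) // stride + 1
    n_t = (spec.shape[1] - patch) // stride + 1
    patches = [
        spec[i * stride:i * stride + patch, j * stride:j * stride + patch].reshape(-1)
        for i in range(n_f)
        for j in range(n_t)
    ]
    return np.stack(patches)  # shape: (num_patches, patch * patch)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence of tokens."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over all patches
    return weights @ v, weights

rng = np.random.default_rng(0)

# Hypothetical input: 128 frequency bins x 64 time frames (random stand-in data).
spec = rng.standard_normal((128, 64))

patches = split_into_patches(spec)  # 8 x 4 = 32 patches of 256 values each

# Linear patch embedding (assumed width of 32); random weights, no training.
d_model = 32
w_embed = rng.standard_normal((patches.shape[1], d_model)) * 0.02
tokens = patches @ w_embed

w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(3))
out, weights = self_attention(tokens, w_q, w_k, w_v)

print(patches.shape, out.shape, weights.shape)
```

Each row of `weights` spans all patches in the spectrogram, which is the sense in which attention captures long-range global context in a single layer.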
Details
- Title: Underwater acoustic target recognition on ShipsEar dataset
- Creators: Tam Phi
- Contributors: David Han (Advisor)
- Awarding Institution: Drexel University
- Degree Awarded: Master of Science (M.S.)
- Publisher: Drexel University; Philadelphia, Pennsylvania
- Number of pages: ix, 27 pages
- Resource Type: Thesis
- Language: English
- Academic Unit: College of Engineering (1970-2026); Electrical (and Computer) Engineering (1970-2026); Drexel University
- Other Identifier: 991021212415404721