ShipsEar; SpecAugment; Electric transformers; Underwater acoustics; Signal processing
Classifying underwater audio sources has various marine-oriented applications, including maritime and environmental monitoring, detection of marine life, and underwater surveillance. However, Underwater Acoustic Target Recognition (UATR) remains challenging due to several factors: the high cost of acquiring acoustic data, the lack of publicly accessible labeled datasets, interference from both underwater and over-the-air noise sources, and other environmental factors such as sound speed variations, weather, and bottom boundary conditions. In this thesis, we propose a convolution-free, purely attention-based transformer architecture for classifying underwater acoustic sources. The architecture, adapted from the Audio Spectrogram Transformer (AST), which has so far been applied in in-air acoustic settings, is capable of capturing long-range global context for enhanced underwater acoustic signal classification. We further improve performance by leveraging transfer learning from a Vision Transformer (ViT) pretrained on the ImageNet dataset. The AST model is trained on the ShipsEar dataset, which consists of 90 instances of ship-emitted underwater sounds covering 11 types of vessels and one background noise category. These recordings, obtained under real conditions in shallow waters, encapsulate both natural and anthropogenic sounds. Currently, the state-of-the-art Convolutional Long Short-Term Memory (ConvLSTM) model applied to this task achieves an accuracy of 0.9878 and an F1 score of 0.9878. Our proposed AST model surpasses the ConvLSTM performance, reaching an accuracy of 0.9890 and an F1 score of 0.9897.
Details
Title
Underwater Acoustic Target Recognition on ShipsEar Dataset
Creators
Tam Phi
Contributors
David Han (Advisor)
Awarding Institution
Drexel University
Degree Awarded
Master of Science (M.S.)
Publisher
Drexel University; Philadelphia, Pennsylvania
Number of pages
ix, 27 pages
Resource Type
Thesis
Language
English
Academic Unit
College of Engineering (1970-2026); Electrical (and Computer) Engineering [Historical]; Drexel University