Conference proceeding
Multi-Segment Cross-Correlation CNN-Transformer Network for Underwater Sound Source Localization
Oceans (New York. Online), pp 1-9
29 Sep 2025
Abstract
Sound source localization is a critical yet challenging task, especially in underwater environments where complex sea conditions, variable velocity profiles, and ambient noise complicate accurate estimation. Although deep learning approaches have advanced the field, many existing methods overlook the importance of multi-segment analysis. This limitation reduces their ability to capture multipath arrivals that occur beyond the initial second, which is essential for precise range estimation in reverberant underwater settings. To address these challenges, we introduce the Multi-Segment Cross-Correlation CNN-Transformer (MS-CC-CNN-Transformer). This hybrid deep learning architecture integrates CNN-based spatial encoding with Transformer-based temporal modeling over multiple seconds of input. By leveraging phase-enhanced features from GCC-PHAT and a windowed multi-segment design, our model effectively captures long-range dependencies and reverberation patterns that are crucial for localization in shallow-water environments. The modular design also enables adaptation to a variety of underwater acoustic tasks. Evaluated on the SWellEx-96 S5 dataset, the MS-CC-CNN-Transformer achieves state-of-the-art range prediction accuracy and substantially outperforms existing techniques. Experimental results demonstrate robust generalization across diverse array configurations, including vertical, tilted, and horizontal line arrays.
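As a rough illustration of the "GCC-PHAT plus windowed multi-segment" idea named in the abstract, the following minimal sketch computes phase-transform cross-correlations for every hydrophone pair in each consecutive segment of a recording. It is not the authors' implementation: the function names, the one-second default segment length, the `max_lag` value, and the output layout are all assumptions made for illustration, and the CNN and Transformer stages are omitted.

```python
import itertools

import numpy as np


def gcc_phat(x, y, max_lag):
    """GCC-PHAT of two equal-length 1-D signals for lags -max_lag..max_lag.

    Assumes 1 <= max_lag < len(x). A positive peak lag means x lags y.
    """
    n = 2 * len(x)                      # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12      # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that index max_lag corresponds to zero lag.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))


def multi_segment_gcc_phat(signals, fs, segment_seconds=1.0, max_lag=64):
    """Split a (channels, samples) array into consecutive segments and
    compute GCC-PHAT for every channel pair in each segment.

    Returns an array of shape (segments, pairs, 2 * max_lag + 1).
    The segment length and lag range are hypothetical defaults.
    """
    seg_len = int(segment_seconds * fs)
    n_seg = signals.shape[1] // seg_len
    pairs = list(itertools.combinations(range(signals.shape[0]), 2))
    out = np.empty((n_seg, len(pairs), 2 * max_lag + 1))
    for s in range(n_seg):
        seg = signals[:, s * seg_len:(s + 1) * seg_len]
        for p, (i, j) in enumerate(pairs):
            out[s, p] = gcc_phat(seg[i], seg[j], max_lag)
    return out
```

For a two-channel, three-second recording sampled at 1000 Hz, `multi_segment_gcc_phat(signals, fs=1000)` returns an array of shape `(3, 1, 129)`. In an architecture like the one described, a stack of this kind could serve as the per-segment input to a spatial encoder, with the segment axis treated as the sequence dimension for temporal modeling.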
Details
- Title
- Multi-Segment Cross-Correlation CNN-Transformer Network for Underwater Sound Source Localization
- Creators
- Priontu Chowdhury - Drexel University; Quoc Thinh Vo - Drexel University; Joseph T. Woods - Drexel University; David K. Han - Drexel University
- Publication Details
- Oceans (New York. Online), pp 1-9
- Publisher
- IEEE
- Grant note
- N00014-21-1-2790 / Office of Naval Research (10.13039/100000006)
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Electrical and Computer Engineering
- Scopus ID
- 2-s2.0-105029596246
- Other Identifier
- 991022138657804721