Vision-Language Artificial Intelligence for Robotic-Based Monitoring: Concrete Defect Detection, Classification, and Localization in Two-Dimensional Maps
Journal article (peer reviewed)

Farzad Azizi Zade and Arvin Ebrahimkhanlou
Journal of Computing in Civil Engineering, vol. 40, no. 2, 04025157
01 Mar 2026
Featured in Collection: Drexel's Newest Publications

Abstract

Technical Papers
This paper introduces a novel framework that combines vision-language models (VLMs) and localization techniques to detect, classify, and localize visual structural defects using moving platforms such as robots and handheld devices, with an emphasis on concrete defects. The framework interactively searches for defects by analyzing images captured from various locations and perspectives, employing, but not limited to, the vision transformer for open-world localization (OWL-ViT). Upon detection, the defect's location is estimated from the moving platform's position, orientation, view angles, and depth measurements, and a postprocessing module further improves detection relevancy by combining estimates from distinct views. Evaluations in the real world, in simulation, and on a custom dataset include prompt engineering and a comparison with classic models (e.g., YOLO). The framework achieves an average Euclidean error of 0.56 m with OWL-ViT's optimal prompt, compared to 0.75 m with YOLO and 0.97 m with DETR, demonstrating its potential for robotic inspection of concrete structures.
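The localization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a planar map, a pinhole camera, a depth reading treated as range along the viewing ray, and a plain average as the multi-view combination. All function and parameter names below are hypothetical.

```python
import math


def bearing_from_pixel(u, image_width, hfov):
    """Horizontal angle (rad) of pixel column u relative to the optical axis.

    Assumes a pinhole camera with horizontal field of view hfov (rad).
    Pixels left of the image center give positive (counter-clockwise) angles.
    """
    focal_px = (image_width / 2) / math.tan(hfov / 2)
    return math.atan2((image_width / 2) - u, focal_px)


def localize_defect(x, y, heading, bearing, depth):
    """Project one detection into 2D map coordinates.

    x, y    : platform position in the map frame (m)
    heading : platform yaw in the map frame (rad)
    bearing : detection angle relative to the camera axis (rad)
    depth   : assumed range to the defect along the viewing ray (m)
    """
    angle = heading + bearing
    return (x + depth * math.cos(angle), y + depth * math.sin(angle))


def fuse(estimates):
    """Combine estimates from several views by simple averaging (an assumption)."""
    n = len(estimates)
    return (sum(e[0] for e in estimates) / n, sum(e[1] for e in estimates) / n)


def euclidean_error(estimate, truth):
    """Euclidean distance (m) between an estimated and a reference position."""
    return math.hypot(estimate[0] - truth[0], estimate[1] - truth[1])


if __name__ == "__main__":
    # Two hypothetical views of the same defect from different poses.
    view_a = localize_defect(0.0, 0.0, 0.0, bearing_from_pixel(320, 640, math.pi / 2), 2.0)
    view_b = localize_defect(2.0, -2.0, math.pi / 2, 0.0, 2.1)
    fused = fuse([view_a, view_b])
    print(fused, euclidean_error(fused, (2.0, 0.0)))
```

Averaging is only one way to combine views; a weighted scheme based on detection confidence or viewing distance would fit the same interface.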

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
International collaboration
Web of Science research areas
Computer Science, Interdisciplinary Applications
Engineering, Civil