Journal of Scientific Innovation and Advanced Research (JSIAR)

Peer-reviewed | Open Access | Multidisciplinary

Published: March 2026 | Volume: 2, Issue: 3 | Pages: 151–165

Hybrid YOLOv9–RCNN Framework for Real-Time Underwater Fish Detection with Enhanced Localization in Low-Visibility Marine Environments

Original Research Article
Manoj Kumar Singh¹, Vivek Kumar², Harminder Kaur³
¹,²,³Department of Computer Science Engineering, Sharda University, Greater Noida, India
*Corresponding author: Manoj Kumar Singh
Department of Computer Science Engineering, Sharda University, Greater Noida, India
E-mail: manojbhu20@gmail.com

ABSTRACT

Continuous observation of underwater ecosystems plays a critical role in sustainable fisheries management, marine biodiversity assessment, and intelligent aquaculture operations. However, reliable monitoring remains challenging due to adverse imaging conditions such as light attenuation, turbidity, scattering, and dynamic backgrounds caused by aquatic vegetation and suspended particles. These factors significantly degrade the visual quality of underwater recordings, making manual analysis of marine footage inefficient and error-prone. Consequently, automated fish detection systems based on computer vision and deep learning have emerged as a promising solution for large-scale ecological monitoring and aquaculture surveillance. Recent advances in object detection have been driven largely by deep convolutional neural networks, particularly the You Only Look Once (YOLO) family and Region-Based Convolutional Neural Network (R-CNN) architectures. While modern YOLO detectors provide high inference speed and enable real-time analysis of video streams, their grid-based detection mechanism often struggles to localize small or overlapping objects precisely in cluttered underwater scenes. Conversely, region-based models such as Faster R-CNN offer superior bounding-box refinement and classification accuracy but incur higher computational overhead, limiting their applicability in real-time marine monitoring systems deployed on embedded platforms. To address these limitations, this study proposes a hybrid deep learning framework that integrates the rapid detection capability of YOLOv9 with the localization refinement strength of an R-CNN. In the proposed architecture, YOLOv9 serves as the primary detector, generating candidate object regions in real time, while the R-CNN module performs secondary verification and bounding-box optimization to improve detection reliability in complex underwater environments.
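The detect-then-verify cascade described above can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: `fast_detector`, `refiner`, the toy stand-in functions, and the 0.5 verification threshold are all hypothetical placeholders for the YOLOv9 and R-CNN stages.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

@dataclass
class Detection:
    box: Box
    score: float
    label: str

def hybrid_detect(frame,
                  fast_detector: Callable[[object], List[Detection]],
                  refiner: Callable[[object, Detection], Detection],
                  verify_threshold: float = 0.5) -> List[Detection]:
    """Stage 1: a fast YOLO-style detector proposes candidate regions.
    Stage 2: an R-CNN-style module re-scores and adjusts each box;
    candidates that fail verification are discarded."""
    candidates = fast_detector(frame)
    refined = [refiner(frame, d) for d in candidates]
    return [d for d in refined if d.score >= verify_threshold]

# Toy stand-ins, for demonstration only.
def toy_fast_detector(frame):
    return [Detection((10, 10, 50, 50), 0.9, "fish"),
            Detection((60, 60, 70, 70), 0.3, "fish")]

def toy_refiner(frame, det):
    # Pretend the second stage tightens the box and recalibrates confidence.
    x1, y1, x2, y2 = det.box
    return Detection((x1 + 1, y1 + 1, x2 - 1, y2 - 1), det.score * 0.95, det.label)

detections = hybrid_detect(None, toy_fast_detector, toy_refiner)
```

In this sketch, the weak 0.3 candidate is rejected during verification while the confident one survives with a tightened box, mirroring how the second stage suppresses false positives from the fast detector.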
In addition, lightweight image enhancement techniques and attention-driven feature extraction are incorporated to mitigate the effects of turbidity, low illumination, and background interference commonly observed in marine imagery. The proposed framework is evaluated on publicly available underwater datasets, including Fish4Knowledge and OBSEA, along with additional annotated underwater video samples collected from marine monitoring platforms. Performance is assessed using standard object detection metrics, namely mean Average Precision (mAP), precision, and recall, together with frames per second (FPS) to measure real-time processing capability. Experimental results demonstrate that the hybrid YOLOv9–RCNN model achieves a mean Average Precision exceeding 92% while maintaining real-time inference speeds above 35 FPS on GPU-enabled systems. Compared with standalone YOLO and Faster R-CNN baselines, the proposed approach significantly improves localization accuracy for small and partially occluded fish while preserving computational efficiency. The developed framework provides a practical and scalable solution for automated underwater visual monitoring. Its capability to perform accurate real-time fish detection makes it suitable for deployment in aquaculture farms, marine biodiversity studies, and intelligent ocean observation systems. Overall, this work contributes a hybrid detection architecture that effectively balances speed and accuracy, advancing the development of robust computer vision systems for underwater ecological monitoring.
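The precision and recall figures reported above are computed, in standard detection evaluation, by matching predicted boxes to ground-truth boxes at a fixed IoU threshold. A minimal sketch of that matching (greedy, one-to-one, highest-score first, at IoU ≥ 0.5) follows; it illustrates the metric definitions only and is not the evaluation code used in the study.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def precision_recall(predictions, ground_truth, iou_thr=0.5):
    """predictions: list of (box, score); ground_truth: list of boxes.
    Greedy one-to-one matching, highest-scoring predictions first."""
    preds = sorted(predictions, key=lambda p: p[1], reverse=True)
    matched, tp = set(), 0
    for box, _score in preds:
        best, best_i = 0.0, None
        for i, gt in enumerate(ground_truth):
            if i in matched:
                continue
            v = iou(box, gt)
            if v > best:
                best, best_i = v, i
        if best_i is not None and best >= iou_thr:
            matched.add(best_i)
            tp += 1
    fp = len(preds) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    return precision, recall
```

Averaging precision over recall levels and over classes (and, in COCO-style evaluation, over IoU thresholds) yields the mAP figure quoted in the results; FPS is measured separately as frames processed per second of wall-clock inference time.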

Keywords: Underwater Fish Detection, YOLOv9, Faster R-CNN, Deep Learning, Marine Monitoring, Object Detection, Aquaculture Surveillance