Ship Detection on Remote Sensing Synthetic Aperture Radar Data.

This project was conducted as part of my diploma thesis, which investigates methods for the effective detection of ships in synthetic aperture radar (SAR) satellite imagery using deep learning techniques. These methods use the Faster-RCNN and YOLOv5 network architectures to create three different detectors. More specifically, the first two models are based on the two-stage Faster-RCNN architecture and use normal (axis-aligned) and rotated bounding boxes, respectively, for detection. The third, one-stage model is based on the YOLOv5 architecture and uses normal bounding boxes to delimit the estimated targets. All models are trained and evaluated on the HRSID dataset. The models that use normal bounding boxes achieve the highest accuracy, while the model with rotated bounding boxes shows the largest localization errors and an increased number of false negative detections.

HRSID Properties.

The High-Resolution SAR Images Dataset (HRSID) contains 116 co-polarized and 20 cross-polarized SAR images.
The original images used to construct HRSID are 99 Sentinel-1B, 36 TerraSAR-X and 1 TanDEM-X images.
These 136 panoramic SAR images were cropped into 5604 high-resolution SAR images.
The 5604 images have dimensions of 800 × 800 pixels, a resolution of 96 dpi, and are in .jpeg format.
The colour depth of the images is 8 bits (one channel).
The 5604 cropped images contain 16951 ship instances.
The spatial resolutions of the SAR images are 0.5, 1 and 3 meters per pixel.
Each instance is annotated with its bounding box and the ship's outline.
The annotations of the SAR images are stored as .json files in MS COCO dataset format.
Paper Link: https://ieeexplore.ieee.org/abstract/document/9127939
Dataset Link: https://github.com/chaozhong2010/HRSID
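
Because the annotations follow the MS COCO layout, they can be inspected with the Python standard library alone. The snippet below is a minimal sketch, not code from this repository: the helper names `load_coco` and `summarize_coco` are hypothetical, and the in-memory example uses made-up values in place of a real HRSID annotation file. In the COCO layout, `bbox` is `[x, y, width, height]` and `segmentation` holds the outline as a flat polygon.

```python
import json
from collections import Counter


def load_coco(path: str) -> dict:
    """Read a COCO-style annotation file (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def summarize_coco(coco: dict) -> dict:
    """Return basic counts from a COCO-style annotation dictionary."""
    images = coco.get("images", [])
    annotations = coco.get("annotations", [])
    # Count how many annotated instances belong to each image id.
    per_image = Counter(ann["image_id"] for ann in annotations)
    return {
        "num_images": len(images),
        "num_instances": len(annotations),
        "max_instances_per_image": max(per_image.values(), default=0),
    }


# Tiny in-memory example in COCO layout (all values are made up).
example = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "width": 800, "height": 800},
        {"id": 2, "file_name": "b.jpg", "width": 800, "height": 800},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [10, 20, 30, 40],
         "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]]},
        {"id": 2, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 15, 50],
         "segmentation": [[100, 120, 115, 120, 115, 170, 100, 170]]},
        {"id": 3, "image_id": 2, "category_id": 1,
         "bbox": [300, 310, 60, 25],
         "segmentation": [[300, 310, 360, 310, 360, 335, 300, 335]]},
    ],
    "categories": [{"id": 1, "name": "ship"}],
}

summary = summarize_coco(example)
print(summary)
```

For a real file, `summarize_coco(load_coco(path))` would report the image and instance counts for whichever annotation file `path` points to.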

Proposed architectures of Faster-RCNN.

Faster-RCNN is a two-stage detection architecture with three submodules: a) a backbone network, b) a Region Proposal Network (RPN) and c) Fast-RCNN. In the proposed model, a Feature Pyramid Network (FPN) with a ResNet backbone creates the P2-P6 spatial levels. The RPN receives the P2-P6 feature maps one after another. For every level Pi it computes a hidden representation, shared between the regression and classification layers, and produces two output tensors: the predicted objectness logits and the anchor deltas for every anchor in Pi. Next, the predicted anchor deltas are applied to the corresponding anchors, and the resulting boxes are sorted by predicted objectness score at each Pi level. Then, after applying a confidence threshold and the NMS algorithm, the RPN retains a subset of the boxes, from which k ROIs are extracted. Finally, the ROI (Box) Head takes the outputs of the FPN and RPN, namely the multiscale feature maps and the ROIs, and uses the latter to crop the regions of interest from the feature maps. The cropped regions are then pooled (resized to the same dimensions) and fed as flattened feature vectors into a pair of fully connected layers, which output the class probabilities and the corresponding coordinates for a predefined number of boxes.
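
The delta-decoding and ranking step of the RPN can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions, not the Detectron2 implementation: it uses the (dx, dy, dw, dh) parameterization from the original Faster R-CNN formulation with unit weights, omits the thresholding and NMS steps, and the function names `apply_deltas` and `top_k_proposals` are hypothetical.

```python
import math


def apply_deltas(anchor, delta):
    """Decode (dx, dy, dw, dh) relative to an (x1, y1, x2, y2) anchor."""
    x1, y1, x2, y2 = anchor
    dx, dy, dw, dh = delta
    wa, ha = x2 - x1, y2 - y1                    # anchor width and height
    cxa, cya = x1 + 0.5 * wa, y1 + 0.5 * ha      # anchor centre
    cx, cy = cxa + dx * wa, cya + dy * ha        # shifted centre
    w, h = wa * math.exp(dw), ha * math.exp(dh)  # rescaled size
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def top_k_proposals(anchors, deltas, logits, k):
    """Decode all anchors, then keep the k boxes with the highest objectness."""
    boxes = [apply_deltas(a, d) for a, d in zip(anchors, deltas)]
    ranked = sorted(zip(logits, boxes), key=lambda p: p[0], reverse=True)
    return ranked[:k]


# Made-up anchors, deltas and objectness logits for one pyramid level.
anchors = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 40.0, 40.0)]
deltas = [(0.0, 0.0, 0.0, 0.0), (0.1, 0.0, 0.0, 0.0)]
logits = [0.2, 1.5]

proposals = top_k_proposals(anchors, deltas, logits, k=1)
print(proposals)
```

A zero delta leaves the anchor unchanged, while a positive `dx` shifts the box to the right by a fraction of the anchor width, which is why the second anchor ends up two pixels further right in this example.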

Image source: https://medium.com/@hirotoschwert/digging-into-detectron-2-part-4-3d1436f91266

Proposed architecture of YOLOv5.

YOLOv5 is a one-stage detector consisting of two networks: a) a feature extraction network (backbone) and b) PANet. The backbone performs feature extraction and is built mainly from C3 (VGP+FLOPS↓) and SPPF (multiscale feature fusion) modules. PANet creates feature maps at 3 different spatial scales (P3-P5), with 3 different anchors at every spatial location. These tensors (P3-P5) are then fed into the corresponding layers of the “Head” network, and after applying a confidence threshold and the NMS algorithm, the final bounding box predictions (class_id, x1, y1, x2, y2, confidence_score) are extracted.
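
The post-processing step described above, a confidence threshold followed by NMS, can be sketched in plain Python. This is an illustrative greedy, class-wise NMS and not the YOLOv5 code itself: the function names are hypothetical, and the threshold values 0.25 and 0.45 are illustrative assumptions rather than the settings used for the reported results. Detections use the same (class_id, x1, y1, x2, y2, confidence_score) layout as the text.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets, conf_thres=0.25, iou_thres=0.45):
    """Drop low-confidence detections, then apply greedy class-wise NMS.

    dets: list of (class_id, x1, y1, x2, y2, confidence_score) tuples.
    """
    candidates = sorted(
        (d for d in dets if d[5] >= conf_thres),
        key=lambda d: d[5],
        reverse=True,
    )
    kept = []
    for det in candidates:
        # Keep a box only if it does not overlap a stronger box of the same class.
        if all(det[0] != k[0] or iou(det[1:5], k[1:5]) <= iou_thres for k in kept):
            kept.append(det)
    return kept


# Made-up detections: two overlapping boxes, one separate box, one weak box.
dets = [
    (0, 10, 10, 50, 50, 0.9),
    (0, 12, 12, 52, 52, 0.8),
    (0, 100, 100, 140, 140, 0.6),
    (0, 200, 200, 220, 220, 0.1),
]

final = filter_and_nms(dets)
print(final)
```

In this example the second box is suppressed because it overlaps the higher-scoring first box, and the fourth box is discarded by the confidence threshold before NMS runs.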

Quantitative Evaluation

Mean Average Precision

Metric	Faster-RCNN (Normal Bboxes)	Faster-RCNN (Rotated Bboxes)	YOLOv5	STANet¹	DB-YOLO²
AP^0.50:.05:.95	68.1	42.9	71.1	69.5	72.0
AP^0.50	91.4	75.3	94.2	92.4	94.4
AP^0.75	79.3	45.5	82.0	81.1	-
AP^small	69.3	41.3	62.9	70.9	-
AP^medium	68.5	51.1	80.7	68.6	-
AP^large	44.1	20.9	55.1	37.8	-

Mean Average Recall

Metric	Faster-RCNN (Normal Bboxes)	Faster-RCNN (Rotated Bboxes)	YOLOv5	STANet¹	DB-YOLO²
AR^max=1	27.8	21.9	28.2	-	-
AR^max=10	61.6	44.9	63.5	-	-
AR^max=100	74.0	48.3	75.9	-	-
AR^small	73.5	46.4	69.5	-	-
AR^medium	79.1	57.9	84.5	-	-
AR^large	64.3	29.7	65.1	-	-

¹ SOTA two-stage detector (Wang et al.). See paper.
² SOTA one-stage detector (Zhu et al.). See paper.
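
The metric names in the tables above follow the COCO evaluation convention: AP^0.50:.05:.95 averages AP over ten IoU thresholds from 0.50 to 0.95, and the small, medium and large variants group objects by pixel area with limits at 32² and 96². The sketch below only illustrates these definitions. The helper names are hypothetical, the handling of areas exactly on a limit is a simplification, and the input values are made up rather than taken from the tables.

```python
def coco_iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 used for AP^0.50:.05:.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def size_bucket(area):
    """Assign an object to a COCO size group by its pixel area."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


def mean_over_thresholds(ap_per_threshold):
    """Average the AP values computed at each IoU threshold."""
    return sum(ap_per_threshold) / len(ap_per_threshold)


thresholds = coco_iou_thresholds()
print(thresholds)

# Made-up object areas in pixels, one per size group.
buckets = [size_bucket(a) for a in (20 * 20, 50 * 50, 120 * 120)]
print(buckets)
```

Under this convention a detector can score well on AP^0.50 yet lower on AP^0.50:.05:.95 when its boxes are roughly placed but not tightly aligned, since the stricter thresholds penalize localization error.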

Qualitative Evaluation

I created a short video from the large ALOS-2 scene provided in the official HRSID repository and ran the Faster-RCNN and YOLOv5 models with normal bounding boxes on it. The Detectron2 framework does not support rotated bounding boxes for video inference, so the Faster-RCNN model with rotated bounding boxes is not included.

Faster-RCNN with normal bounding boxes

Faster-RCNN-Normal.Bounding.Boxes.mp4

YOLOv5

YOLOv5.mp4

Requirements

torch == 1.7.1+cu110
torchvision == 0.8.2+cu110
pyyaml == 5.1
detectron2 == 0.5
cv2 == 4.1.2
wandb == 0.12.11

jasonmanesis/Ship-Detection-on-Remote-Sensing-Synthetic-Aperture-Radar-Data
