This repository has been archived by the owner on Feb 9, 2024. It is now read-only.

pszemraj/BoulderAreaDetector

BoulderAreaDetector

Deploys a deep-learning CNN that classifies satellite imagery, served as a Streamlit app for user testing.

  • The app is an MVP demo of BoulderSpot. The idea for BoulderSpot originated at the June 2021 CASSINI Hackathon; that repo is here.
  • BoulderSpot uses a model similar to the one included here to classify whether aerial images are potential boulder areas. The class results are then used as part of a graph-like framework to analyze aerial imagery across all of Switzerland. You can find more details on the website!
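The Streamlit deployment described above can be sketched roughly as below. This is a minimal illustration, not the repo's actual app code: the file name `model.pkl` and the helper names are assumptions.

```python
# app.py -- minimal Streamlit front end for a fastai image classifier (sketch).
# "model.pkl" and the UI strings are assumptions, not the repo's actual files.

def format_prediction(label: str, prob: float) -> str:
    """Human-readable prediction line shown in the app."""
    return f"Prediction: {label} ({prob:.1%})"

def main() -> None:
    # Heavy imports stay inside main() so the pure helper above is
    # importable without Streamlit/fastai installed.
    import streamlit as st
    from fastai.vision.all import PILImage, load_learner

    st.title("BoulderAreaDetector")
    uploaded = st.file_uploader("Upload an aerial image", type=["png", "jpg", "jpeg"])
    if uploaded is not None:
        learn = load_learner("model.pkl")      # exported fastai Learner
        img = PILImage.create(uploaded)
        pred_class, pred_idx, probs = learn.predict(img)
        st.image(img.to_thumb(256))
        st.write(format_prediction(str(pred_class), float(probs[pred_idx])))
```

A real `app.py` would call `main()` at module level and be launched with `streamlit run app.py`.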

An example of model predictions on a holdout set:

Predicted Class-climb_area Examples

A picture of some of the boulders the (full) model found after an in-person data validation trip:

Boulderspot-trip-03-Valhalla-06-min

Model Stats - CNN Classifier

In short, the predictor under the hood is a convolutional neural network built with the fastai library, trained on a labeled dataset of several thousand images in two classes (climb_area, other). The source imagery for training is mostly aerial (possibly some satellite), sampled from Switzerland.

Note: the model deployed in the Streamlit app has changed. The original model was a ResNet101, whose trained model file is ~170 MB. As GitHub has limits / special rules for files larger than 100 MB, the model was updated to MixNet-XL, which exhibits similar performance but is smaller (in parameters, and therefore in file size).

Also included in the repo is a zipped model file of a trained Big Transfer model that is more accurate than either of the two above. As this model is > 100 MB and Streamlit unzip-and-predict performance has not yet been tested, it is not deployed to the app, but it can be used locally.
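Using the zipped model locally might look like the sketch below; the archive name `big_transfer.zip` and the output directory are assumptions, not the repo's actual paths.

```python
# Sketch: unzip a bundled model archive and load it for local inference.
# "big_transfer.zip" is an assumed file name, not the repo's actual path.
import zipfile
from pathlib import Path

def load_local_model(zip_path: str = "big_transfer.zip", out_dir: str = "models"):
    """Extract the archive and return the fastai Learner inside it."""
    Path(out_dir).mkdir(exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)                  # unpack the exported .pkl
    pkl = next(Path(out_dir).glob("*.pkl"))     # first exported Learner found
    from fastai.vision.all import load_learner  # local import: fastai is heavy
    return load_learner(pkl)
```

Calling `load_local_model()` assumes fastai is installed and the archive contains a fastai-exported `.pkl`.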

A decent writeup on how to create, train, and save a fastai computer vision model is in this Medium article. BoulderAreaDetector uses a decently sized labeled dataset (several thousand 256x256 satellite images across the two classes), but has not yet had any significant hyperparameter optimization beyond fast.ai basics.
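The training recipe described in this README (fastai, two-class image folder, 20 epochs, default CrossEntropyLoss and Adam) can be sketched roughly as follows. The `data/` folder layout is an assumption; the repo's actual pipeline may differ.

```python
# Sketch of a fastai training loop matching the setup described above.
# Assumes data/ contains one subfolder per class: data/climb_area/, data/other/.

EPOCHS = 20
IMG_SIZE = 256
CLASSES = ("climb_area", "other")

def train(data_dir: str = "data"):
    # fastai imported locally so the constants above are importable without it.
    from fastai.vision.all import (
        ImageDataLoaders, Resize, error_rate, vision_learner,
    )

    dls = ImageDataLoaders.from_folder(
        data_dir, valid_pct=0.2, item_tfms=Resize(IMG_SIZE)
    )
    # "mixnet_xl" is the timm backbone name; CrossEntropyLoss and Adam
    # are fastai's defaults for this kind of classifier.
    learn = vision_learner(dls, "mixnet_xl", metrics=error_rate)
    learn.fine_tune(EPOCHS)
    learn.export("mixnet_xl.pkl")  # serialized Learner for deployment
    return learn
```

Calling `train()` assumes fastai (with timm support) and the dataset are available locally.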

MixNet: Model itself

  • MixNet *Note: this links to the timm source code; the MixNet paper is cited under Citations below*
  • package: fast.ai (pytorch)
  • trained for 20 epochs
  • Loss: FlattenedLoss of CrossEntropyLoss()
  • Optimizer: Adam
  • Total params: 11,940,824
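If timm is installed, the parameter count can be double-checked directly. A sketch; `mixnet_xl` is the assumed timm model name, and the raw backbone count will differ slightly from the 11,940,824 above because fastai replaces the 1000-class head with a 2-class one.

```python
# Sketch: count the parameters of the MixNet-XL backbone via timm.

def count_params(model_name: str = "mixnet_xl") -> int:
    import timm  # heavy import kept local so the helper is cheap to import

    model = timm.create_model(model_name, pretrained=False)
    return sum(p.numel() for p in model.parameters())
```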

MixNet: Confusion Matrix & Metrics

MixNet-XL Confusion Matrix

              precision    recall  f1-score   support

  climb_area       0.84      0.57      0.68       206
       other       0.98      0.99      0.99      3854

    accuracy                           0.97      4060
   macro avg       0.91      0.78      0.83      4060
weighted avg       0.97      0.97      0.97      4060

More details can be found in /info.
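The macro and weighted averages in the table follow directly from the per-class scores and supports; a quick check in plain Python, with the numbers copied from the table above:

```python
# Reproduce the macro/weighted average F1 from the per-class scores above.
f1 = {"climb_area": 0.68, "other": 0.99}
support = {"climb_area": 206, "other": 3854}

macro_f1 = sum(f1.values()) / len(f1)
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())

print(round(macro_f1, 2))     # 0.83 -- each class counts equally
print(round(weighted_f1, 2))  # 0.97 -- dominated by the large "other" class
```

The gap between the two averages reflects the class imbalance: `other` outnumbers `climb_area` almost 19 to 1 in the holdout set.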

Probability Distributions (on a holdout set)

#TODO

Examples / Inference

#TODO

Highest Loss Images (test set)

The following images had the highest loss when evaluated as part of the test (not holdout) set during training:

highest loss MixNet imgs
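In fastai, plots like the one above come from `ClassificationInterpretation`; a sketch, assuming a trained `Learner` object is at hand:

```python
# Sketch: inspect the highest-loss validation images for a trained fastai Learner.

def show_top_losses(learn, k: int = 9):
    from fastai.vision.all import ClassificationInterpretation

    interp = ClassificationInterpretation.from_learner(learn)
    interp.plot_top_losses(k, figsize=(12, 12))  # worst-predicted images
    interp.plot_confusion_matrix()               # matrix like the one above
    return interp
```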


Details on Original ResNet101 Fine-Tuned Model

This was the original model. It was replaced because its trained file size (~170 MB) was too large to host on GitHub.

  • ResNet101
  • package: fast.ai (pytorch)
  • trained for 20 epochs
  • Loss: FlattenedLoss of CrossEntropyLoss()
  • Optimizer: Adam

ResNet101 Confusion Matrix

More details:

              precision    recall  f1-score   support

  climb_area       0.79      0.76      0.77       101
       other       0.97      0.97      0.97       800

    accuracy                           0.95       901
   macro avg       0.88      0.87      0.87       901
weighted avg       0.95      0.95      0.95       901

Citations

MixNet

@misc{tan2019mixconv,
      title={MixConv: Mixed Depthwise Convolutional Kernels},
      author={Mingxing Tan and Quoc V. Le},
      year={2019},
      eprint={1907.09595},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Big Transfer

@misc{kolesnikov2020big,
      title={Big Transfer (BiT): General Visual Representation Learning},
      author={Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Joan Puigcerver and
      Jessica Yung and Sylvain Gelly and Neil Houlsby},
      year={2020},
      eprint={1912.11370},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

ResNet

@misc{he2015deep,
      title={Deep Residual Learning for Image Recognition},
      author={Kaiming He and Xiangyu Zhang and Shaoqing Ren and Jian Sun},
      year={2015},
      eprint={1512.03385},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}