Megalodon is an extensive library of all things shark. It contains several shark datasets, a model zoo, and pre-trained networks for various tasks.
- The dataset.csv file
- The bounding box ground-truth for detection tasks
- Spectrograms of shark calls
The dataset is highly imbalanced because some lesser-known shark species have few images; it depends on the images available on Google. A few sample images are included in the repository. The full set of images is not published and will not be published. However, the NumPy files and the code will be available for use. The frequency distribution of the different species is shown in the graph.
Spotting differences between some sharks is easy given their distinct features, for example Hammerhead Shark vs White Shark. Similarly, it is relatively easy to tell a Nurse Shark from a White or Tiger Shark. However, it is not so easy to distinguish a White Shark from a Tiger Shark, because the two types are strikingly similar (shown in the sample images below). It would be interesting to explore classification results on such a dataset.
Nurse Shark | White Shark | Tiger Shark |
---|---|---|
The entire dataset is available in the `Extensive-Data` folder. It has a single .csv file, `dataset.csv`, which is described below.
Column Name | Row Data |
---|---|
Shark Specie | String: the name of the shark species |
Image | URL: the link to the image |
Shark Call | URL: the link to the shark call |
Textual Description | String: the description of the shark species |
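
Because the dataset is imbalanced, it can help to check the per-species counts before training. The following is a minimal sketch, not the repository's own code. It assumes the CSV header row uses the column names exactly as listed in the table above and that the file sits at `Extensive-Data/dataset.csv`. The helper name `species_counts` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

# Assumed location and header name; adjust if the real file differs.
DATASET_PATH = "Extensive-Data/dataset.csv"
SPECIES_COLUMN = "Shark Specie"


def species_counts(csv_file):
    """Count rows per species in an open dataset.csv-style file (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    return Counter(row[SPECIES_COLUMN].strip() for row in reader)


if __name__ == "__main__":
    # Made-up rows so the sketch runs without the real file.
    sample = io.StringIO(
        "Shark Specie,Image,Shark Call,Textual Description\n"
        "White Shark,https://example.com/w1.jpg,https://example.com/w1.wav,placeholder\n"
        "Tiger Shark,https://example.com/t1.jpg,https://example.com/t1.wav,placeholder\n"
        "White Shark,https://example.com/w2.jpg,https://example.com/w2.wav,placeholder\n"
    )
    # Print species from most to least frequent.
    for species, count in species_counts(sample).most_common():
        print(f"{species}: {count}")
```

To run it against the real file, open it with `open(DATASET_PATH, newline="", encoding="utf-8")` and pass the file object to `species_counts`.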