A Rust implementation of Siamese Neural Networks for One-shot Image Recognition for benchmarking performance and results.
Python is a very flexible scripting language, well suited to machine learning experiments. However, it is widely recognized that the semantics of Python, and of interpreted languages in general, do not lend themselves well to performance, frugality, or scaling. Although many Python libraries are implemented in performant languages such as C++, there may still be sizeable gains in performance, safety, and efficiency from using a modern performant language directly. As of 2020, Rust seems mature enough to be used seriously. We aim to implement Siamese Neural Networks for One-shot Image Recognition in both Rust and Python (with popular libraries) and to compare their performance, efficiency, and anomalies (errors and discrepancies), in the hope of providing insight into whether it is worth using a more difficult programming language.
You can expedite the setup and run the basic test cases using `cargo test`.
git submodule update --init
Review the contents of `oneshot-data/data_augmented` and choose a `<background-dataset>` from the *background* datasets.
unzip oneshot-data/data_augmented/<background-dataset>.zip
To select `<num-background-pairs>` pairs of images from `<background-dataset>` for a background set, run the following script.
cargo run -- <dataset-directory> <num-pairs>
`sampling`: Sample `<num-background-pairs>` image pairs in a 50%/50% positive/negative split.
- Negative pairs using the same script strictly avoid using the same character.
- Pairs are represented by filenames relative to the repository root.
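
The sampling rule above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: `ImageRef`, `XorShift64`, `Pair`, and `sample_pairs` are hypothetical names, the character and path strings are assumed to be supplied by the caller, and a small xorshift generator stands in for whatever random source the project uses so that the example needs no external crates. Here every negative pair is drawn from two different characters, which is one simple way to satisfy the rule.

```rust
use std::collections::BTreeMap;

/// One image file tagged with the character it depicts.
/// Hypothetical type; the repository may organise this differently.
#[derive(Debug, Clone, PartialEq)]
pub struct ImageRef {
    pub character: String, // assumed identifier, e.g. "Latin/character01"
    pub path: String,      // filename relative to the repository root
}

/// Tiny xorshift64 generator (an assumption, used only to avoid dependencies).
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from a zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Index in `0..n`; the slight modulo bias is acceptable for a sketch.
    pub fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Two image paths plus a label (`true` = same character).
pub type Pair = (String, String, bool);

/// Alternates positive and negative pairs, so an even `num_pairs` gives an
/// exact 50%/50% split. Returns `None` if the input cannot supply both kinds.
pub fn sample_pairs(
    images: &[ImageRef],
    num_pairs: usize,
    rng: &mut XorShift64,
) -> Option<Vec<Pair>> {
    // Group image paths by character.
    let mut by_char: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for img in images {
        by_char
            .entry(img.character.as_str())
            .or_default()
            .push(img.path.as_str());
    }
    let groups: Vec<&Vec<&str>> = by_char.values().collect();
    // Only characters with at least two images can form a positive pair.
    let multi: Vec<&Vec<&str>> = groups.iter().copied().filter(|g| g.len() >= 2).collect();
    if groups.len() < 2 || multi.is_empty() {
        return None;
    }

    let mut pairs = Vec::with_capacity(num_pairs);
    for i in 0..num_pairs {
        if i % 2 == 0 {
            // Positive: two distinct images of one character.
            let g = multi[rng.below(multi.len())];
            let a = rng.below(g.len());
            let mut b = rng.below(g.len() - 1);
            if b >= a {
                b += 1; // skip index `a` so the two images differ
            }
            pairs.push((g[a].to_string(), g[b].to_string(), true));
        } else {
            // Negative: one image from each of two distinct characters.
            let ga = rng.below(groups.len());
            let mut gb = rng.below(groups.len() - 1);
            if gb >= ga {
                gb += 1; // skip group `ga` so the characters differ
            }
            let (x, y) = (groups[ga], groups[gb]);
            let left = x[rng.below(x.len())].to_string();
            let right = y[rng.below(y.len())].to_string();
            pairs.push((left, right, false));
        }
    }
    Some(pairs)
}
```

Grouping with a `BTreeMap` keeps the iteration order deterministic, so a fixed seed reproduces the same pair set.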
`data`: Given an image pair set, load the pairs into two parallel arrays of images (each image is a 105x105 matrix of `bool` values) along with another parallel array of labels (`bool` values for positive/negative). `bool` values were chosen because images in the Omniglot dataset are bilevel (black/white) and therefore each pixel can be represented by just one bit.
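
The parallel-array layout described above might look something like the sketch below. All names (`SIDE`, `Image`, `PairSet`, `to_bilevel`, `load_pairs`) are hypothetical, and image decoding is left to a caller-supplied closure because the decoder used by the project is not specified here. The threshold of 128 and the choice that dark pixels map to `true` are assumptions made only for illustration.

```rust
/// Side length of an Omniglot image in pixels.
pub const SIDE: usize = 105;

/// One bilevel image as a 105x105 matrix of `bool` values.
pub type Image = [[bool; SIDE]; SIDE];

/// Three parallel arrays: index `i` of each refers to the same pair.
pub struct PairSet {
    pub left: Vec<Image>,
    pub right: Vec<Image>,
    pub labels: Vec<bool>, // true = positive pair, false = negative pair
}

/// Converts row-major 8-bit grayscale pixels to a bilevel image.
/// Assumption: values below 128 count as ink and become `true`.
pub fn to_bilevel(gray: &[u8]) -> Option<Image> {
    if gray.len() != SIDE * SIDE {
        return None; // wrong image size
    }
    let mut img = [[false; SIDE]; SIDE];
    for (i, &px) in gray.iter().enumerate() {
        img[i / SIDE][i % SIDE] = px < 128;
    }
    Some(img)
}

/// Loads every pair through `decode`, a hypothetical caller-supplied function
/// that turns a path into an `Image`. Returns `None` if any image fails.
pub fn load_pairs<F>(pairs: &[(String, String, bool)], mut decode: F) -> Option<PairSet>
where
    F: FnMut(&str) -> Option<Image>,
{
    let mut set = PairSet {
        left: Vec::with_capacity(pairs.len()),
        right: Vec::with_capacity(pairs.len()),
        labels: Vec::with_capacity(pairs.len()),
    };
    for (a, b, label) in pairs {
        set.left.push(decode(a)?);
        set.right.push(decode(b)?);
        set.labels.push(*label);
    }
    Some(set)
}
```

Note that a Rust `bool` occupies one byte in memory, so storing exactly one bit per pixel would require a packed representation on top of this layout.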
`network`: (Work in Progress)
Lake, B. M., Salakhutdinov, R., and Tenenbaum, J. B. (2015). Human-level concept learning through probabilistic program induction. Science, 350(6266), 1332-1338.
- Update: Lake, B. M., Salakhutdinov, R., and Tenenbaum, J. B. (2019). The Omniglot Challenge: A 3-Year Progress Report. Preprint available on arXiv:1902.03477.