Implement the RBM layer to learn binary codes for large scale image retrieval #274
Using the Caffe reference ImageNet model, the features extracted from the training images of the ILSVRC 2013 image classification task occupy more than 25GB when stored in leveldb. This footprint must be reduced for efficient and scalable large-scale image retrieval. In recent years, Restricted Boltzmann Machines (RBMs) have been used successfully to embed real-valued features into binary codes for computation- and storage-efficient document retrieval [1] and image retrieval [2][3][4]. Deep Boltzmann Machines (DBMs) consisting of stacked RBMs can map the floating-point features of similar images or texts to neighboring points in Hamming space. Since each 32-bit floating-point value is replaced by a single bit, storage is reduced by at least 32 times while the similarities among the original features are preserved. The final binary features of the ILSVRC 2013 training images would be less than 1GB and fit comfortably in the main memory of a single machine or on a single GPU.
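To make the storage and retrieval argument concrete, here is a minimal NumPy sketch of the idea, not the layer implemented in this pull request. It assumes 4096-dimensional float features and a 128-bit code length (both chosen for illustration), and it uses random weights as a stand-in for a trained RBM. The helper names `binarize`, `pack_codes`, and `hamming_distances` are hypothetical.

```python
import numpy as np

def binarize(features, weights, bias):
    """Threshold RBM-style hidden unit probabilities at 0.5 to get 0/1 codes."""
    probabilities = 1.0 / (1.0 + np.exp(-(features @ weights + bias)))
    return (probabilities > 0.5).astype(np.uint8)

def pack_codes(bits):
    """Pack each row of 0/1 values into bytes (8 code bits per byte)."""
    return np.packbits(bits, axis=1)

def hamming_distances(query, database):
    """Hamming distance from one packed code to every packed code in database."""
    differing = np.bitwise_xor(database, query)
    return np.unpackbits(differing, axis=1).sum(axis=1)

# Illustrative data only: random features and random (untrained) weights.
rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 4096)).astype(np.float32)
weights = (0.01 * rng.standard_normal((4096, 128))).astype(np.float32)
bias = np.zeros(128, dtype=np.float32)

codes = pack_codes(binarize(features, weights, bias))  # shape (1000, 16)
dists = hamming_distances(codes[0], codes)
nearest = np.argsort(dists)[:5]  # indices of the 5 closest codes

print(features.nbytes, codes.nbytes)
```

With these assumed sizes, each image drops from 16384 bytes of floats to 16 bytes of packed bits. The reduction exceeds 32 times here because the code also has fewer dimensions than the input. Retrieval then reduces to XOR and bit counting over the packed codes.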
This work is in progress.
[1] Salakhutdinov, R. R. and Hinton, G. E. Semantic Hashing. Proceedings of the SIGIR Workshop on Information Retrieval and Applications of Graphical Models, Amsterdam. 2007.
[2] Torralba, A., Fergus, R. and Weiss, Y. Small codes and large image databases for recognition. Proc. of CVPR, pages 1–8. 2008.
[3] Ranzato, M., Krizhevsky, A. and Hinton, G. E. Factored 3-way restricted Boltzmann machines for modeling natural images. Proc. Thirteenth International Conference on Artificial Intelligence and Statistics. 2010.
[4] Krizhevsky, A. and Hinton, G.E. Using Very Deep Autoencoders for Content-Based Image Retrieval. European Symposium on Artificial Neural Networks ESANN-2011, Bruges, Belgium. 2011.