- Clone repository
git clone https://github.com/luca-ant/WhatsSee.git
or
git clone git@github.com:luca-ant/WhatsSee.git
- Install dependencies
sudo apt install python3-setuptools
sudo apt install python3-pip
sudo apt install python3-venv
or
sudo pacman -S python-setuptools
sudo pacman -S python-pip
sudo pacman -S python-virtualenv
- Create a virtual environment and install the required modules
cd WhatsSee
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt
- Training: Train the model. You can choose the dataset, the number of training and validation examples, and the number of epochs. All arguments are optional; use 0 as the value to select all examples. Caution: the whole dataset will be downloaded!
python whats_see.py train -d flickr -nt 6000 -nv 1000 -ne 50
- Resume: Resume the last saved training run and continue it.
python whats_see.py resume
- Evaluate: Evaluate the whole model on the test images and calculate BLEU scores. You can specify the number of test examples (use 0 as the value to select all examples).
python whats_see.py evaluate -n 1000
- Test: Test the model by generating a caption for a test image and comparing the generated caption with the real ones.
python whats_see.py test -f TEST_IMAGE_FILE
- Generate: Generate a caption for your own image.
python whats_see.py generate -f YOUR_IMAGE_FILE
To deploy the web application, simply run the start_server.sh script, then open a browser and navigate to localhost:4753.
./start_server.sh
A pre-trained model can be found on the releases page.
The neural network was trained on the training images of the Flickr dataset, and it achieved the following BLEU scores on the test images:
- BLEU-1: 49.3%
- BLEU-2: 30.5%
- BLEU-3: 21.7%
- BLEU-4: 11.1%
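For readers unfamiliar with the metric, the following is a minimal sketch of how a cumulative BLEU-n score can be computed for a single caption: clipped n-gram precision for each order up to n, combined by a geometric mean and scaled by a brevity penalty. It is an illustration only and not necessarily how the evaluate command computes the scores above, which are reported over the whole test set. The function names and the example captions are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most
    as many times as it occurs in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n):
    """Sentence-level cumulative BLEU with uniform weights, no smoothing."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: compare against the reference closest in length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    penalty = 1.0 if c > r else math.exp(1 - r / c)
    return penalty * math.exp(log_mean)


# Hypothetical captions, already tokenized by whitespace.
candidate = "a dog runs on the grass".split()
references = [
    "a dog is running on the grass".split(),
    "a brown dog runs through the grass".split(),
]

for n in range(1, 5):
    print(f"BLEU-{n}: {bleu(candidate, references, n):.3f}")
```

Because higher orders require longer exact matches with the reference captions, the cumulative score can only stay equal or fall as n grows, which is consistent with the decreasing BLEU-1 to BLEU-4 values listed above.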
- WhatsSee was developed by Luca Antognetti