Hikaru Shindo, Manuel Brack, Gopika Sudhakaran, Devendra Singh Dhami, Patrick Schramowski, Kristian Kersting
We propose DeiSAM, which integrates large pre-trained neural networks with differentiable logic reasoners. Given a complex, textual segmentation description, DeiSAM leverages Large Language Models (LLMs) to generate first-order logic rules and performs differentiable forward reasoning on generated scene graphs.
A Dockerfile is available in the .devcontainer folder.
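To give a rough feel for what "differentiable forward reasoning on scene graphs" can mean, here is a toy sketch. It is not DeiSAM's implementation: the facts, object names, confidence scores, the rule, and the choice of soft operators (product for conjunction, a log-sum-exp smooth maximum for disjunction) are all assumptions made for illustration.

```python
import math

# Hypothetical scene-graph facts with confidence scores in [0, 1].
facts = {
    ("type", "o1", "cake"): 0.9,
    ("type", "o2", "table"): 0.8,
    ("type", "o3", "cake"): 0.7,
    ("on", "o1", "o2"): 0.6,
}
objects = ["o1", "o2", "o3"]


def val(atom):
    """Confidence of a ground atom; atoms absent from the graph score 0."""
    return facts.get(atom, 0.0)


def softor(values, gamma=0.01):
    """Smooth maximum via log-sum-exp (one possible soft disjunction)."""
    m = max(values)
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))


def target_score(x):
    """One soft inference step for the illustrative rule
    target(X) :- type(X, cake), on(X, Y), type(Y, table).
    """
    # Soft conjunction (product) of the body atoms for every grounding of Y.
    bodies = [
        val(("type", x, "cake")) * val(("on", x, y)) * val(("type", y, "table"))
        for y in objects
    ]
    # Soft disjunction over all groundings, clipped to stay within [0, 1].
    return min(1.0, softor(bodies))


scores = {x: target_score(x) for x in objects}
best = max(scores, key=scores.get)
```

Because every step is a smooth function of the fact confidences, gradients could in principle flow from the final scores back to the inputs, which is the property the description above relies on.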
To install further dependencies, clone Grounded-Segment-Anything alongside the neumann/ folder, then run:
cd neumann/
pip install -e .
cd ../Grounded-Segment-Anything/
cd segment_anything
pip install -e .
cd ../GroundingDINO
pip install -e .
If an error appears regarding OpenCV (circular import), try:
pip uninstall opencv-python
pip uninstall opencv-contrib-python
pip uninstall opencv-contrib-python-headless
pip3 install opencv-contrib-python==4.5.5.62
Download the SAM ViT-H checkpoint:
wget https://huggingface.co/spaces/abhishek/StableSAM/resolve/main/sam_vit_h_4b8939.pth
The DeiVG datasets can be downloaded here: link. Please place the downloaded files in data/.
Please download the latest Visual Genome here: link. Place the downloaded files in data/visual_genome/.
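Before running anything, it may help to confirm that the downloads ended up where the instructions above expect them. The following is a minimal sketch, not part of the repository: the helper name missing_paths is hypothetical, and it assumes the checkpoint was downloaded into the repository root (wget saves to the current directory).

```python
from pathlib import Path

# Paths taken from the instructions above. The checkpoint location is an
# assumption: it presumes wget was run from the repository root.
EXPECTED_PATHS = [
    "sam_vit_h_4b8939.pth",
    "data",
    "data/visual_genome",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files and folders are present.")
```

This only checks that the paths exist; it does not validate the contents of the datasets or the integrity of the checkpoint.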
To solve DeiVG using DeiSAM:
python src/solve_deivg.py --api-key YOUR_OPENAI_API_KEY -c 1
python src/solve_deivg.py --api-key YOUR_OPENAI_API_KEY -c 2
python src/solve_deivg.py --api-key YOUR_OPENAI_API_KEY -c 3
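The three runs above can also be launched from a small wrapper, which avoids typing the API key on the command line each time. This is a sketch under assumptions: the function names are hypothetical, and reading the key from an OPENAI_API_KEY environment variable is a convention chosen here, not something the scripts themselves are known to support. The -c values 1, 2, and 3 are copied verbatim from the commands above.

```python
import os
import subprocess
import sys


def build_commands(api_key, settings=(1, 2, 3)):
    """Build one solve_deivg.py command per -c value, mirroring the README."""
    return [
        [sys.executable, "src/solve_deivg.py", "--api-key", api_key, "-c", str(c)]
        for c in settings
    ]


def run_all():
    """Run the commands in order, stopping at the first failure."""
    # Assumption: the key is stored in the OPENAI_API_KEY environment variable.
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise SystemExit("Set the OPENAI_API_KEY environment variable first.")
    for cmd in build_commands(api_key):
        subprocess.run(cmd, check=True)

# Call run_all() from the repository root to launch the three runs.
```

Passing check=True makes a failed run raise an error instead of silently continuing to the next setting.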
To perform learning on DeiSAM:
python src/learn_deisam.py --api-key YOUR_OPENAI_API_KEY -c 1 -sm VETO -su
python src/learn_deisam.py --api-key YOUR_OPENAI_API_KEY -c 2 -sm VETO -su
See LICENSE.