This is an attempt to run a chatbot locally using the falcon-7b-instruct model (7 billion parameters) on a machine with less than 8 GB of VRAM, using 4-bit quantization to reduce the memory footprint. Source to guide.
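A back-of-the-envelope estimate shows why 4-bit quantization matters here (a sketch; the parameter count is rounded to 7 billion and activations/KV-cache overhead is ignored):

```python
# Rough VRAM estimate for the model weights of falcon-7b-instruct.
params = 7_000_000_000

fp16_gb = params * 2 / 1024**3    # 16-bit weights: 2 bytes per parameter
int4_gb = params * 0.5 / 1024**3  # 4-bit weights: 0.5 bytes per parameter

print(f"fp16 weights: {fp16_gb:.1f} GB, 4-bit weights: {int4_gb:.1f} GB")
```

At 16-bit precision the weights alone (~13 GB) exceed the 8 GB budget, while 4-bit weights (~3.3 GB) leave headroom for activations and the KV cache.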
- Install Docker (with the NVIDIA Container Toolkit, which `--gpus all` requires for GPU access)
- `docker build -t chatbot-image:latest -f docker/Dockerfile .`
- `docker run -it --gpus all -v $(pwd):/workspace chatbot-image:latest bash`
- ...
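The `docker/Dockerfile` referenced above is not included here; a minimal sketch might look like the following (the base image, package list, and pinned versions are assumptions, not the actual file):

```dockerfile
# Sketch only: CUDA base image and library choices are assumptions.
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04

RUN apt-get update && apt-get install -y python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# transformers + bitsandbytes are a common stack for 4-bit loading
RUN pip3 install torch transformers accelerate bitsandbytes

WORKDIR /workspace
```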