0. Install make, gcc, and git-lfs on your system.
- Create a directory named os_model and change into it.
mkdir os_model
cd os_model
- Create a virtual environment named osenv and activate it (the activation command depends on your OS; the Linux/macOS version is shown below).
python -m venv osenv
source osenv/bin/activate
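To confirm the environment actually activated, you can check that `python` now resolves inside osenv. A quick sketch (Linux/macOS; on Windows the activation script is `osenv\Scripts\activate` instead):

```shell
python3 -m venv osenv
. osenv/bin/activate
# After activation, "python" should resolve to a path inside the osenv directory
command -v python
```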
- Install Git Large File Storage (LFS) and clone the sqlcoder-34b repo from Hugging Face. This takes a long time because the remote repo is nearly 140 GB; be patient and do not cancel the download.
git lfs install
git clone https://huggingface.co/defog/sqlcoder-34b-alpha
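Since the clone needs roughly 140 GB, it is worth checking free disk space before starting. A small sketch (the 150 GB threshold is an assumption, not from the original guide):

```shell
# Require roughly 150 GB free on the current filesystem before cloning
REQUIRED_GB=150
AVAIL_GB=$(df -Pk . | awk 'NR==2 {print int($4/1024/1024)}')
if [ "$AVAIL_GB" -lt "$REQUIRED_GB" ]; then
  echo "only ${AVAIL_GB} GB free; need about ${REQUIRED_GB} GB"
else
  echo "ok: ${AVAIL_GB} GB free"
fi
```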
- Download only the tokenizer.model file from https://huggingface.co/defog/sqlcoder-7b/tree/main and place it in the sqlcoder-34b-alpha folder.
- Clone llama.cpp and set up the Python conversion dependencies.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
- Create a directory called sqlcoder-34b-alpha inside the models directory of the llama.cpp folder.
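That directory can be created in one command from inside the llama.cpp folder (a sketch; `-p` also creates the parent models directory if it is missing):

```shell
# Run from inside the llama.cpp directory
mkdir -p models/sqlcoder-34b-alpha
```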
- Run the following commands to build llama.cpp and convert the model to GGUF.
make
python3 convert.py ../sqlcoder-34b-alpha --outfile ./models/sqlcoder-34b-alpha/ggml-sqlcoder-34b-f16.gguf --outtype f16
- Now quantize the generated f16 model to q4_k. For higher output quality at the cost of a larger file, quantize to q8_0 instead by replacing q4_k with q8_0 in the command below.
./quantize ./models/sqlcoder-34b-alpha/ggml-sqlcoder-34b-f16.gguf ./models/sqlcoder-34b-alpha/ggml-sqlcoder-34b-q4_k.gguf.bin q4_k
- The quantized model can now be used for various text-generation tasks; this particular model performs well at translating natural-language questions into SQL queries.
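As a final check, the quantized model can be run with llama.cpp's main binary. A sketch, assuming the paths produced by the steps above; the prompt layout and table schema here are illustrative, not the model's official prompt format:

```shell
# Hypothetical inference run; guard against the model file being absent
MODEL=./models/sqlcoder-34b-alpha/ggml-sqlcoder-34b-q4_k.gguf.bin
PROMPT="### Task
Generate a SQL query that answers: how many users signed up in 2023?

### Database Schema
CREATE TABLE users (id INT, signup_date DATE);

### SQL
"
if [ -f "$MODEL" ]; then
  ./main -m "$MODEL" -p "$PROMPT" -n 200
else
  echo "model not found: $MODEL (run the steps above first)"
fi
```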