Merge main in dev (#94)
* [IDEA] Auto convert MODEL_PATH model in .env (#19)

* Create convert.py

Has to be actively run.

Checks the model type of the MODEL_PATH model from .env: if it is the old ggml type, it converts it to ggjt and closes; if it is already ggjt, it just exits.

The code that checks the model type is at line 863, "def lazy_load_file(path: Path) -> ModelPlus:".

It could be interesting to call this automatically from startLLM, then after conversion rename the old file to oldmodel.bin and rename the new file to whatever is set in .env, and use that (see the sketch after this entry).

* Update README.md

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>
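
For illustration, a minimal sketch of the kind of format check convert.py performs, assuming the standard llama.cpp magic values for the old ggml format and the newer ggjt format; the real conversion logic lives in convert.py itself and may differ in detail.

```python
import os
import struct
from pathlib import Path

from dotenv import load_dotenv  # assumes python-dotenv is available

GGML_MAGIC = 0x67676D6C  # old unversioned ggml files
GGJT_MAGIC = 0x67676A74  # newer ggjt files

def needs_conversion(model_path: Path) -> bool:
    """Return True if the model is an old ggml file that should be converted to ggjt."""
    with model_path.open("rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
    if magic == GGJT_MAGIC:
        return False  # already ggjt, nothing to do
    if magic == GGML_MAGIC:
        return True  # old format, run the conversion
    raise ValueError(f"unrecognised model magic: {magic:#x}")

load_dotenv()
model_path = Path(os.environ["MODEL_PATH"])
print("convert" if needs_conversion(model_path) else "exit: already ggjt")
```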

* Better .env with more settings (#20)

* Improve .env by introducing model_stop & better documentation

* Update example.env

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>

* Create Dockerfile

* Update Dockerfile

* add docker support

* Update README.md

* Update README.md

* Update README.md

* fixed parsing error for literal linebreak + added missing .bin suffix

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update README.md

* simplify + change default LLM

referring to #24

* added .epub, .html support

see #24; might be missing the unstructured module

* Update requirements.txt

see #24

* Create docker-image.yml

* Update README.md

* Update README.md

* Delete docker-image.yml

* Create docker-image.yml

* Update docker-image.yml

* Update docker-image.yml

* Update docker-image.yml

* Update docker-image.yml

* Update docker-image.yml

* Update Dockerfile

* Update docker-image.yml

* Update docker-image.yml

* Update docker-image.yml

* Update docker-image.yml

* Update example.env

* Update docker-image.yml

* Update README.md

* Update docker-image.yml

* Update README.md

* Update README.md

* Update docker-image.yml

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update Dockerfile

* Update Dockerfile

* Update README.md

* Update README.md

* Update Dockerfile

* Simplify Dockerfile

* Update Dockerfile

* Update README.md

* Update README.md

* Update README.md

* Update docker-image.yml

* GUI SUPPORT (#21)

* GUI SUPPORT

Run `streamlit run .\gui.py` to use it.

* Preload models only once instead of on every query (see the sketch after this entry)

The text input doesn't disable after a query the way I intended.

* Preload models + better interface

Still needs some work.

* specified version of streamlit else None

* Working version: Added loading message and improved UI

Still need to do cleanup, but should perform properly.

* Code cleanup for commit

* Faster responses + GUI | Run: streamlit run .\gui.py

* Edit .env directly in GUI

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>
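
A minimal sketch of the preload-once pattern described in this entry, assuming a streamlit version that provides st.cache_resource; the actual gui.py wiring and the real qa_system construction differ.

```python
import streamlit as st

@st.cache_resource  # build heavy objects once and reuse them across reruns
def build_qa_system():
    # Placeholder for the expensive setup: embeddings, vector store and LLM.
    # The real gui.py constructs the retrieval-QA pipeline here.
    return lambda query: f"(stub answer for: {query})"

qa_system = build_qa_system()

query = st.text_input("Ask a question")
if query:
    with st.spinner("Thinking..."):
        st.write(qa_system(query))
```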

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Stable GUI (#29)

+ Way better qa_system initialization.
+ Tweaked UI

* Add poetry conf (#32)

* update requirements.txt

* remove state_of_the_union.txt
update gitignore
add multithreading

* add poetry config

* update streamlit version

* update Dockerfile

* update Dockerfile

* fix Dockerfile

* update Dockerfile

* update README.md

* update README.md

* update convert.py & pyproject.toml

* add tokenizer model

* update README & lint

* clean for merge

* cleanup

* Add pre-config, fix convert.py, (#35)

* update requirements.txt

* remove state_of_the_union.txt
update gitignore
add multithreading

* add poetry config

* update streamlit version

* update Dockerfile

* update Dockerfile

* fix Dockerfile

* update Dockerfile

* update README.md

* update README.md

* update convert.py & pyproject.toml

* add tokenizer model

* update README & lint

* add pre-commit

* run pre-commit

* fix README.md

* fix (?) convert.py

* fix (?) convert.py

* fix package versions

* clean for merge

* fix README.md

* update README.md for new convert

* redirect to main repo

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>

* fix ingest.py (#37)

* added a list to .loader -> multiple-file support

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>

* fix Dockerfile and README.md for streamlit (#40)

* update requirements.txt

* remove state_of_the_union.txt
update gitignore
add multithreading

* add poetry config

* update streamlit version

* update Dockerfile

* update Dockerfile

* fix Dockerfile

* update Dockerfile

* update README.md

* update README.md

* update convert.py & pyproject.toml

* add tokenizer model

* update README & lint

* add pre-commit

* run pre-commit

* fix README.md

* fix (?) convert.py

* fix (?) convert.py

* fix package versions

* clean for merge

* fix README.md

* update README.md for new convert

* redirect to main repo

* fix ingest.py

* rollback README.md

* fix Dockerfile and README.md for streamlit

* fix README.md

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>

* Update README.md

* changed workflow to stable

* Update README.md

* Merge hippa branch (#41)

* update requirements.txt

* remove state_of_the_union.txt
update gitignore
add multithreading

* add poetry config

* update streamlit version

* update Dockerfile

* update Dockerfile

* fix Dockerfile

* update Dockerfile

* update README.md

* update README.md

* update convert.py & pyproject.toml

* add tokenizer model

* update README & lint

* add pre-commit

* run pre-commit

* fix README.md

* fix (?) convert.py

* fix (?) convert.py

* fix package versions

* clean for merge

* fix README.md

* update README.md for new convert

* redirect to main repo

* fix ingest.py

* rollback README.md

* fix Dockerfile and README.md for streamlit

* fix README.md

* cleaner document handling in ingest.py
remove CI

* add support for ppt, docx

* add sample documents
clean gitignore
bump package version

* load env variables in centralized file
load more variables from env
lint

* remove CI on merge

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>

* removed code-analysis warning suppression

This lets GitHub flag the repo as an HTML codebase, which alters discoverability.

* fix the repo being flagged as HTML

* HTML flagging error already fixed

* Allow appending ingest to existing db (#43)

* Don't recreate the db each time we run ingest (see the sketch after this entry)

* Print progress of ingestion
* Add USE_MLOCK as env arg
* Skip prompt if empty
* lint

closes #42

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>
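
A rough sketch of the append-instead-of-recreate behaviour, shown here with LangChain's Chroma wrapper purely for illustration — the project's actual vector store and ingest code may differ.

```python
import os
from pathlib import Path

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

persist_dir = os.environ.get("PERSIST_DIRECTORY", "db")
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

def ingest(docs):
    """Add documents to the persisted collection, creating it only if it does not exist yet."""
    if Path(persist_dir).exists():
        db = Chroma(persist_directory=persist_dir, embedding_function=embeddings)
        db.add_documents(docs)  # append instead of rebuilding from scratch
    else:
        db = Chroma.from_documents(docs, embeddings, persist_directory=persist_dir)
    db.persist()
    return db
```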

* README cleanup (#44)

modified:   README.md

Edit the README .env to match the new example.env format
Fix typos

* Add HF embedings, add custom tuned prompts, add GPU accel (#45)

* update requirements.txt

* remove state_of_the_union.txt
update gitignore
add multithreading

* add poetry config

* update streamlit version

* update Dockerfile

* update Dockerfile

* fix Dockerfile

* update Dockerfile

* update README.md

* update README.md

* update convert.py & pyproject.toml

* add tokenizer model

* update README & lint

* add pre-commit

* run pre-commit

* fix README.md

* fix (?) convert.py

* fix (?) convert.py

* fix package versions

* clean for merge

* fix README.md

* update README.md for new convert

* redirect to main repo

* fix ingest.py

* rollback README.md

* fix Dockerfile and README.md for streamlit

* fix README.md

* cleaner document handling in ingest.py
remove CI

* add support for ppt, docx

* add sample documents
clean gitignore
bump package version

* load env variables in centralized file
load more variables from env
lint

* remove CI on merge

* check for empty query

* print embedding progress
allow reusing collection
add env USE_MLOCK

* fix model_stop

* fix model_stop

* several minor improvements to startLLM.py
fix empty MODEL_STOP
add CHAIN_TYPE in env

* pre-commit formatting

* Add support for HuggingFace embeddings

* - add custom prompt templates tailored for vic7b-5, and better than the default ones
- update vic7b to 5.1
- add instructions for GPU support

* update example.env

* update prompts

* fix typo

* fix typo

* update example.env

* re-add strip

* Add N_GPU_LAYERS to .env
fix GPU instructions

---------

Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>
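
The HuggingFace-embeddings support boils down to a switch on TEXT_EMBEDDINGS_MODEL_TYPE (the variable shown in the example .env further down this diff). A hedged sketch, not the exact load_env code:

```python
import os

from langchain.embeddings import HuggingFaceEmbeddings, LlamaCppEmbeddings

def get_embeddings():
    model = os.environ["TEXT_EMBEDDINGS_MODEL"]
    model_type = os.environ.get("TEXT_EMBEDDINGS_MODEL_TYPE", "LlamaCpp")
    if model_type == "HF":
        # e.g. sentence-transformers/all-MiniLM-L6-v2
        return HuggingFaceEmbeddings(model_name=model)
    # otherwise fall back to a local ggml model via llama.cpp
    return LlamaCppEmbeddings(model_path=model, n_ctx=int(os.environ.get("MODEL_N_CTX", "1024")))
```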

* Added document loaders for mail file types. (#48)

Co-authored-by: PATRICK  GEBERT <patrick.gebert@etecture.de>

* Adding missing dependency (#53)

* Added document loaders for mail file types.

* Updated dependencies for mail document loader.

---------

Co-authored-by: PATRICK  GEBERT <patrick.gebert@etecture.de>

* Revert "Adding missing dependency (#53)" (#54)

This reverts commit cc8cda6.

* Move to module, better terminal output, libgen script (#50)

* update requirements.txt

* remove state_of_the_union.txt
update gitignore
add multithreading

* add poetry config

* update streamlit version

* update Dockerfile

* update Dockerfile

* fix Dockerfile

* update Dockerfile

* update README.md

* update README.md

* update convert.py & pyproject.toml

* add tokenizer model

* update README & lint

* add pre-commit

* run pre-commit

* fix README.md

* fix (?) convert.py

* fix (?) convert.py

* fix package versions

* fix ingest.py

* rollback README.md

* fix Dockerfile and README.md for streamlit

* cleaner document handling in ingest.py
remove CI

* add support for ppt, docx

* add sample documents
clean gitignore
bump package version

* load env variables in centralized file
load more variables from env
lint

* check for empty query

* print embedding progress
allow reusing collection
add env USE_MLOCK

* fix model_stop

* several minor improvements to startLLM.py
fix empty MODEL_STOP
add CHAIN_TYPE in env

* pre-commit formatting

* Add support for HuggingFace embeddings

* - add custom prompt templates tailored for vic7b-5, and better than the default ones
- update vic7b to 5.1
- add instructions for GPU support

* update example.env

* update prompts

* fix typo

* fix typo

* update example.env

* re-add strip

* Add N_GPU_LAYERS to .env
fix GPU instructions

* move to package

* refactor ingest.py as class, add saving on the fly

* refactor ingest.py as class, add saving on the fly

* refactor ingest.py as class, add saving on the fly

* add script to ask questions to libgen

* format prompts and text in startLLM.py

* add formatting to ask_libgen.py

* add formatting to ingest.py

* handle errors

* escape strings

* fix logger level

* merge main

* update README.md

* clean for merge

* fix link

* fixed-extract-msg (#55)

* fixed extract-msg revert

* sorry for the inconvenience, I'm remote

* added runtime :stable badge

* Fix gui (#58)

* update requirements.txt

* remove state_of_the_union.txt
update gitignore
add multithreading

* add poetry config

* update streamlit version

* update Dockerfile

* update Dockerfile

* fix Dockerfile

* update Dockerfile

* update README.md

* update README.md

* update convert.py & pyproject.toml

* add tokenizer model

* update README & lint

* add pre-commit

* run pre-commit

* fix README.md

* fix (?) convert.py

* fix (?) convert.py

* fix package versions

* fix ingest.py

* rollback README.md

* fix Dockerfile and README.md for streamlit

* cleaner document handling in ingest.py
remove CI

* add support for ppt, docx

* add sample documents
clean gitignore
bump package version

* load env variables in centralized file
load more variables from env
lint

* check for empty query

* print embedding progress
allow reusing collection
add env USE_MLOCK

* fix model_stop

* several minor improvements to startLLM.py
fix empty MODEL_STOP
add CHAIN_TYPE in env

* pre-commit formatting

* Add support for HuggingFace embeddings

* - add custom prompt templates tailored for vic7b-5, and better than the default ones
- update vic7b to 5.1
- add instructions for GPU support

* update example.env

* update prompts

* fix typo

* fix typo

* update example.env

* re-add strip

* Add N_GPU_LAYERS to .env
fix GPU instructions

* move to package

* refactor ingest.py as class, add saving on the fly

* refactor ingest.py as class, add saving on the fly

* refactor ingest.py as class, add saving on the fly

* add script to ask questions to libgen

* format prompts and text in startLLM.py

* add formatting to ask_libgen.py

* add formatting to ingest.py

* handle errors

* escape strings

* fix logger level

* merge main

* update README.md

* clean for merge

* fix link

* Refactor the GUI

* add pypandoc_binary dep (#60)

* update readme

* fix pypandoc

* revert README.md

* automatically download models from HF (#61)

* automatically download models from HF
split load_env into load_env+utils

* add parameter for number of docs to get

* update env

* fix formatting

* fix default download

* put downloaded model in local

* fix downloading from datasets

* fix symlinks

* remove q4 model from readme, remove obsolete file
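
The automatic download can be approximated with huggingface_hub, treating MODEL_PATH as "repo_id/filename" the way the example .env suggests (eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin). A sketch under that assumption, not the project's exact utils code:

```python
from pathlib import Path

from huggingface_hub import hf_hub_download

def ensure_model(model_path: str, local_dir: str = "models") -> Path:
    """Download the model from the Hub if it is not already present locally."""
    local = Path(local_dir) / Path(model_path).name
    if local.exists():
        return local
    parts = model_path.split("/")
    repo_id = "/".join(parts[:2])    # e.g. eachadea/ggml-vicuna-7b-1.1
    filename = "/".join(parts[2:])   # e.g. ggml-vic7b-q5_1.bin
    # For dataset repos, hf_hub_download also accepts repo_type="dataset".
    return Path(hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir))
```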

* fixed link to samples

* Revert "automatically download models from HF (#61)" (#64)

This reverts commit ccdf849.

* Fixed HF download + gui stable  (#65)

* Update load_env.py

* Create utils.py

* Update gui.py

* Update startLLM.py

* Update ask_libgen.py

* Update ingest.py

* Delete meta.json

* Update README.md

* broken link

* Fix gui+dataset (#67)

* fix gui import

* fix dataset import

* missed updating example.env

* Better chain (#70)

* put n_gpu_layers in args thanks to new langchain version

* basic chain
+ fix get_num_tokens on llama
+ fix text formatting in HTML error

* add chain to startLLM
+ add MODEL_MAX_TOKENS parameter

* fix formatting issue in print_HTML

* tweak prompt

* add betterrefine

* small fix

* fix default chain
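
A rough sketch of how the pieces named in this entry (CHAIN_TYPE, MODEL_MAX_TOKENS, n_gpu_layers, number of forwarded documents) might fit together using LangChain's stock RetrievalQA; the project's own chain and its custom refine variant are not reproduced here, and the Chroma store is illustrative only.

```python
import os

from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import LlamaCpp
from langchain.vectorstores import Chroma

llm = LlamaCpp(
    model_path=os.environ["MODEL_PATH"],
    n_ctx=int(os.environ.get("MODEL_N_CTX", "1024")),
    max_tokens=int(os.environ.get("MODEL_MAX_TOKENS", "256")),
    temperature=float(os.environ.get("MODEL_TEMP", "0.8")),
    n_gpu_layers=int(os.environ.get("N_GPU_LAYERS", "0")),  # offload layers if built with cuBLAS
)

db = Chroma(
    persist_directory=os.environ.get("PERSIST_DIRECTORY", "db"),
    embedding_function=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"),
)
retriever = db.as_retriever(search_kwargs={"k": int(os.environ.get("N_FORWARD_DOCUMENTS", "6"))})

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type=os.environ.get("CHAIN_TYPE", "stuff"),
    retriever=retriever,
    return_source_documents=True,
)
print(qa({"query": "What does the sample document say?"})["result"])
```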

* fix embedding with llama + model download path (#72)

* fix embedding with llama

* fix downloaded model path

* Add issue templates (#73)

* add bug template from langchain- wip

* update bug-report.yml

* add config.yml

* update bug-report.yml

* add feature-request.yml

* add documentation.yml other.yml

* update bug-report.yml (#78)

* add update note

* Better docker file (#80)

* update Dockerfile

* python >=3.10

* update lockfile

* update Dockerfile
update project version

* update Dockerfile

* fix gui in docker

* fix for main

* move files around (#81)

* fix empty doc (#82)

* Parallelize ingestion (#85)

* fix empty doc

* ingest multithreading

* better multithreading

* change default val

* change default val
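
The parallel ingestion can be sketched with a standard thread pool over the per-file loaders; the loader classes and worker count here are illustrative, not the project's exact defaults.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

from langchain.document_loaders import PyMuPDFLoader, TextLoader

LOADERS = {".pdf": PyMuPDFLoader, ".txt": TextLoader}  # illustrative subset

def load_one(path: Path):
    loader_cls = LOADERS.get(path.suffix.lower())
    return loader_cls(str(path)).load() if loader_cls else []

def load_all(source_dir: str = "source_documents", workers: int = 8):
    files = [p for p in Path(source_dir).rglob("*") if p.is_file()]
    docs = []
    # Loading is mostly I/O and parsing, so a pool of workers speeds it up noticeably.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for loaded in pool.map(load_one, files):
            docs.extend(loaded)
    return docs
```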

* add .doc and .ppt support (#86)

* add .doc and .ppt support

* update lock
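
For the new .doc and .ppt support, the usual approach is to map those extensions to LangChain's Unstructured loaders; a hedged sketch — the exact loaders chosen in ingest.py may differ.

```python
from langchain.document_loaders import (
    UnstructuredPowerPointLoader,
    UnstructuredWordDocumentLoader,
)

# Extension -> loader mapping covering both the legacy and the OOXML variants.
OFFICE_LOADERS = {
    ".doc": UnstructuredWordDocumentLoader,
    ".docx": UnstructuredWordDocumentLoader,
    ".ppt": UnstructuredPowerPointLoader,
    ".pptx": UnstructuredPowerPointLoader,
}

# Hypothetical file path, for illustration only.
documents = OFFICE_LOADERS[".ppt"]("source_documents/sample.ppt").load()
```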

* Update README.md

* fix mp warning (#88)

* added escape characters

* Fix html formatting (#90)

* fix formatting

* fix formatting

* Update README.md

* Fix docker (#92)

* Better docker image, update readme instructions

* improved docker images

* Update README.md

---------

Co-authored-by: alxspiker <alxtheprogrammer@gmail.com>
Co-authored-by: su77ungr <69374354+su77ungr@users.noreply.github.com>
Co-authored-by: mlynar-czyk <64000170+mlynar-czyk@users.noreply.github.com>
Co-authored-by: Patrick Gebert <patrick.gebert@web.de>
Co-authored-by: PATRICK  GEBERT <patrick.gebert@etecture.de>
6 people authored May 18, 2023
1 parent 038d226 commit fd103b3
Showing 27 changed files with 2,802 additions and 454 deletions.
4 changes: 4 additions & 0 deletions .dockerignore
@@ -0,0 +1,4 @@
.git
models
.venv
db
111 changes: 111 additions & 0 deletions .github/ISSUE_TEMPLATE/bug-report.yml
@@ -0,0 +1,111 @@
name: "\U0001F41B Bug Report"
description: Submit a bug report to help us improve CASALIOY
labels: ["02 Bug Report"]
body:
- type: markdown
attributes:
value: >
Thank you for taking the time to file a bug report. Before creating a new
issue, please make sure to take a few moments to check the issue tracker
for existing issues about the bug.
- type: textarea
id: env
attributes:
label: .env
description: Please share your exact .env file. *format it with ``` as in the example below.*
placeholder: |
```
# Generic
MODEL_N_CTX=1024
TEXT_EMBEDDINGS_MODEL=sentence-transformers/all-MiniLM-L6-v2
TEXT_EMBEDDINGS_MODEL_TYPE=HF # LlamaCpp or HF
USE_MLOCK=true
# Ingestion
PERSIST_DIRECTORY=db
DOCUMENTS_DIRECTORY=source_documents
INGEST_CHUNK_SIZE=500
INGEST_CHUNK_OVERLAP=50
# Generation
MODEL_TYPE=LlamaCpp # GPT4All or LlamaCpp
MODEL_PATH=eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
MODEL_TEMP=0.8
MODEL_STOP=[STOP]
CHAIN_TYPE=stuff
N_RETRIEVE_DOCUMENTS=100 # How many documents to retrieve from the db
N_FORWARD_DOCUMENTS=6 # How many documents to forward to the LLM, chosen among those retrieved
N_GPU_LAYERS=4
```
validations:
required: true

- type: input
id: system-info-python
attributes:
label: Python version
placeholder: python 3.11.3
validations:
required: true
- type: input
id: system-info-system
attributes:
label: System
placeholder: Ubuntu-22.04
validations:
required: true
- type: input
id: system-info-casalioy
attributes:
label: CASALIOY version
placeholder: A release number (ex. `0.0.8`) or a commit id (ex `13cce0e`)
validations:
required: true

- type: checkboxes
id: information-scripts-examples
attributes:
label: Information
description: "The problem arises when using:"
options:
- label: "The official example scripts"
- label: "My own modified scripts"

- type: checkboxes
id: related-components
attributes:
label: Related Components
description: "Select the components related to the issue (if applicable):"
options:
- label: "Document ingestion"
- label: "GUI"
- label: "Prompt answering"

- type: textarea
id: reproduction
validations:
required: true
attributes:
label: Reproduction
description: |
Please provide a [code sample](https://stackoverflow.com/help/minimal-reproducible-example) that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
If you have code snippets, error messages, stack traces please provide them here as well.
Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
placeholder: |
Steps to reproduce the behavior:
1.
2.
3.
- type: textarea
id: expected-behavior
validations:
required: true
attributes:
label: Expected behavior
description: "A clear and concise description of what you would expect to happen."
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,2 @@
blank_issues_enabled: true
version: 2.1
19 changes: 19 additions & 0 deletions .github/ISSUE_TEMPLATE/documentation.yml
@@ -0,0 +1,19 @@
name: Documentation
description: Report an issue related to the LangChain documentation.
title: "DOC: <Please write a comprehensive title after the 'DOC: ' prefix>"
labels: [03 - Documentation]

body:
- type: textarea
attributes:
label: "Issue with current documentation:"
description: >
Please make sure to leave a reference to the document/code you're
referring to.
- type: textarea
attributes:
label: "Idea or request for content:"
description: >
Please describe as clearly as possible what topics you think are missing
from the current documentation.
30 changes: 30 additions & 0 deletions .github/ISSUE_TEMPLATE/feature-request.yml
@@ -0,0 +1,30 @@
name: "\U0001F680 Feature request"
description: Submit a proposal/request for a new CASALIOY feature
labels: ["02 Feature Request"]
body:
- type: textarea
id: feature-request
validations:
required: true
attributes:
label: Feature request
description: |
A clear and concise description of the feature proposal. Please provide links to any relevant GitHub repos, papers, or other resources if relevant.
- type: textarea
id: motivation
validations:
required: true
attributes:
label: Motivation
description: |
Please outline the motivation for the proposal. Is your feature request related to a problem? e.g., I'm always frustrated when [...]. If this is related to another GitHub issue, please link here too.
- type: textarea
id: contribution
validations:
required: true
attributes:
label: Your contribution
description: |
Is there any way that you could help, e.g. by submitting a PR?
18 changes: 18 additions & 0 deletions .github/ISSUE_TEMPLATE/other.yml
@@ -0,0 +1,18 @@
name: Other Issue
description: Raise an issue that wouldn't be covered by the other templates.
title: "Issue: <Please write a comprehensive title after the 'Issue: ' prefix>"
labels: [04 - Other]

body:
- type: textarea
attributes:
label: "Issue you'd like to raise."
description: >
Please describe the issue you'd like to raise as clearly as possible.
Make sure to include any relevant links or references.
- type: textarea
attributes:
label: "Suggestion:"
description: >
Please outline a suggestion to improve the issue here.
2 changes: 1 addition & 1 deletion .github/workflows/docker-image.yml
@@ -15,6 +15,6 @@ jobs:
DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
run: |
docker build . --file Dockerfile --tag su77ungr/casalioy:stable
docker build . -t su77ungr/casalioy:stable
docker login --username $DOCKER_USERNAME --password $DOCKER_PASSWORD
docker push su77ungr/casalioy:stable
50 changes: 42 additions & 8 deletions Dockerfile
@@ -1,14 +1,48 @@
FROM python:3.11
###############################################
# Base Image
###############################################
FROM python:3.11-slim as python-base
# We set POETRY_VERSION=1.3.2 because 1.4.x has some weird legacy issues
# CASALIOY_FORCE_CPU = we install cpu-only pytorch.
ENV PYTHONFAULTHANDLER=1 \
PYTHONUNBUFFERED=1 \
PYTHONHASHSEED=random \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_NO_INTERACTION=1 \
POETRY_VIRTUALENVS_IN_PROJECT=true \
POETRY_VERSION=1.3.2 \
CASALIOY_FORCE_CPU=true
RUN apt-get update && apt-get install -y build-essential git htop gdb nano unzip curl && rm -rf /var/lib/apt/lists/*
#RUN if [ "$CASALIOY_ENABLE_LLAMA_GPU" = "true" ]; then \
# apt-get install -y nvidia-cuda-toolkit nvidia-cuda-toolkit-gcc; \
# fi; \
RUN pip install --upgrade setuptools virtualenv

###############################################
# Builder Image
###############################################
FROM python-base as builder-base
RUN pip install "poetry==$POETRY_VERSION"
WORKDIR /srv
RUN git clone https://github.com/su77ungr/CASALIOY.git
WORKDIR CASALIOY
RUN poetry install --with GUI,LLM --without dev --sync
RUN . .venv/bin/activate && pip install --force streamlit
RUN . .venv/bin/activate && \
if [ "$CASALIOY_FORCE_CPU" = "true" ]; then \
pip install --force torch torchvision --index-url https://download.pytorch.org/whl/cpu; \
else \
pip install --force sentence_transformers; \
fi

RUN pip3 install poetry
RUN python3 -m poetry config virtualenvs.create false
RUN python3 -m poetry install
RUN python3 -m pip install --force streamlit sentence_transformers # Temp fix, see pyproject.toml
RUN python3 -m pip uninstall -y llama-cpp-python
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 python3 -m pip install llama-cpp-python # GPU support
RUN pre-commit install
###############################################
# Production Image
###############################################
FROM python-base as production
COPY --from=builder-base /srv /srv
WORKDIR /srv/CASALIOY
COPY example.env .env
RUN echo "source /srv/CASALIOY/.venv/bin/activate" >> ~/.bashrc
RUN . .venv/bin/activate && python -c "import nltk; nltk.download('averaged_perceptron_tagger')"
39 changes: 39 additions & 0 deletions Dockerfile-GPU
@@ -0,0 +1,39 @@
###############################################
# Base Image
###############################################
FROM nvidia/cuda:12.1.1-base-ubuntu22.04 as base
# We set POETRY_VERSION=1.3.2 because 1.4.x has some weird legacy issues
ENV PYTHONFAULTHANDLER=1 \
PYTHONUNBUFFERED=1 \
PYTHONHASHSEED=random \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_NO_INTERACTION=1 \
POETRY_VIRTUALENVS_IN_PROJECT=true \
POETRY_VERSION=1.3.2
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update && apt install -y software-properties-common && add-apt-repository -y ppa:deadsnakes/ppa && apt-get install -y python3.11 python3.11-venv python3-pip build-essential git htop gdb nano unzip curl && rm -rf /var/lib/apt/lists/*
RUN python3.11 -m pip install --upgrade setuptools virtualenv

###############################################
# Builder Image
###############################################
FROM base as builder-base
RUN python3.11 -m pip install "poetry==$POETRY_VERSION"
WORKDIR /srv
RUN git clone https://github.com/su77ungr/CASALIOY.git
WORKDIR CASALIOY
RUN python3.11 -m poetry install --with GUI,LLM --without dev --sync
RUN . .venv/bin/activate && pip install --force streamlit sentence_transformers
RUN . .venv/bin/activate && pip uninstall -y llama-cpp-python && CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --force llama-cpp-python

###############################################
# Production Image
###############################################
FROM base as production
COPY --from=builder-base /srv /srv
WORKDIR /srv/CASALIOY
COPY example.env .env
RUN echo "source /srv/CASALIOY/.venv/bin/activate" >> ~/.bashrc
RUN . .venv/bin/activate && python -c "import nltk; nltk.download('averaged_perceptron_tagger'); nltk.download('punkt')"