
Commit

Merge branch 'main' of github.com:rustformers/llama-rs into feature/path-to-ggml
philpax committed Apr 6, 2023
2 parents cd489a2 + eea9fc7 commit dcc69c4
Showing 4 changed files with 53 additions and 11 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -14,7 +14,7 @@ follows:
from inside the `CREDITS.md` file.
- Run the bindgen script:
```shell
-$ cargo run --bin generate_ggml_bindings ggml-sys
+$ cargo run --bin generate-ggml-bindings ggml-sys
```
- Fix any compiler errors that pop up due to the new version of the bindings and
test the changes.
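
For context on what that script does: a generator like `generate-ggml-bindings` is typically a thin wrapper over the `bindgen` crate. A minimal sketch, in which the header and output paths are assumptions rather than the project's actual layout:

```rust
// Minimal sketch of a bindgen-based generator; paths are assumptions.
fn main() {
    let bindings = bindgen::Builder::default()
        // Parse the C header that declares the ggml API.
        .header("ggml-sys/ggml/ggml.h")
        .generate()
        .expect("failed to generate ggml bindings");
    // Write the generated Rust FFI declarations into the -sys crate.
    bindings
        .write_to_file("ggml-sys/src/bindings.rs")
        .expect("failed to write bindings");
}
```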
21 changes: 21 additions & 0 deletions Dockerfile
@@ -0,0 +1,21 @@
# Start with a rust alpine image
FROM rust:alpine3.17 as builder
# This is important, see https://github.com/rust-lang/docker-rust/issues/85
ENV RUSTFLAGS="-C target-feature=-crt-static"
# if needed, add additional dependencies here
RUN apk add --no-cache musl-dev
# set the workdir and copy the source into it
WORKDIR /app
COPY ./ /app
# do a release build
RUN cargo build --release --bin llama-cli
RUN strip target/release/llama-cli

# use a plain alpine image, the alpine version needs to match the builder
FROM alpine:3.17
# if needed, install additional dependencies here
RUN apk add --no-cache libgcc
# copy the binary into the final image
COPY --from=builder /app/target/release/llama-cli .
# set the binary as entrypoint
ENTRYPOINT ["/llama-cli"]
28 changes: 19 additions & 9 deletions README.md
@@ -80,13 +80,12 @@ kinds of sources.
After acquiring the weights, it is necessary to convert them into a format that
is compatible with ggml. To achieve this, follow the steps outlined below:

> **Warning**
>
> To run the Python scripts, a Python version of 3.9 or 3.10 is required. 3.11
> is unsupported at the time of writing.

-``` shell
+```shell
# Convert the model to f16 ggml format
python3 scripts/convert-pth-to-ggml.py /path/to/your/models/7B/ 1

@@ -95,7 +94,7 @@ python3 scripts/convert-pth-to-ggml.py /path/to/your/models/7B/ 1
```

> **Note**
>
> The [llama.cpp repository](https://github.com/ggerganov/llama.cpp) has
> additional information on how to obtain and run specific models. With some
> caveats:
@@ -104,17 +103,15 @@ python3 scripts/convert-pth-to-ggml.py /path/to/your/models/7B/ 1
> (versioned) ggml formats, but not the mmap-ready version that was [recently
> merged](https://github.com/ggerganov/llama.cpp/pull/613).

-*Support for other open source models is currently planned. For models where
+_Support for other open source models is currently planned. For models where
 weights can be legally distributed, this section will be updated with scripts to
 make the install process as user-friendly as possible. Due to the model's legal
 requirements, this is currently not possible with LLaMA itself and a more
-lengthy setup is required.*
+lengthy setup is required._

- https://github.com/rustformers/llama-rs/pull/85
- https://github.com/rustformers/llama-rs/issues/75


### Running

For example, try the following prompt:
@@ -147,6 +144,19 @@ Some additional things to try:
A modern-ish C toolchain is required to compile `ggml`. A C++ toolchain
should not be necessary.
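
That toolchain is invoked from `ggml-sys/build.rs` through the `cc` crate, which is also where the CPU flags changed further down this page are set. A minimal sketch of that pattern, with the source path assumed:

```rust
// build.rs: sketch of compiling ggml's C source with the `cc` crate.
fn main() {
    cc::Build::new()
        // The C translation unit to compile; this path is an assumption.
        .file("ggml/ggml.c")
        // Only add flags the detected compiler actually accepts.
        .flag_if_supported("-pthread")
        .compile("ggml");
}
```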

### Docker

```shell
# To build (this will take some time, go grab some coffee):
docker build -t llama-rs .

# To run with prompt:
docker run --rm --name llama-rs -it -v ${PWD}/data:/data -v ${PWD}/examples:/examples llama-rs -m data/gpt4all-lora-quantized-ggml.bin -p "Tell me how cool the Rust programming language is:"

# To run with prompt file and repl (will wait for user input):
docker run --rm --name llama-rs -it -v ${PWD}/data:/data -v ${PWD}/examples:/examples llama-rs -m data/gpt4all-lora-quantized-ggml.bin -f examples/alpaca_prompt.txt --repl
```

## Q&A

### Why did you do this?
13 changes: 12 additions & 1 deletion ggml-sys/build.rs
@@ -56,7 +56,18 @@ fn main() {
         }
         "aarch64" => {
             if compiler.is_like_clang() || compiler.is_like_gnu() {
-                build.flag("-mcpu=native");
+                if std::env::var("HOST") == std::env::var("TARGET") {
+                    build.flag("-mcpu=native");
+                } else {
+                    #[allow(clippy::single_match)]
+                    match target_os.as_str() {
+                        "macos" => {
+                            build.flag("-mcpu=apple-m1");
+                            build.flag("-mfpu=neon");
+                        }
+                        _ => {}
+                    }
+                }
                 build.flag("-pthread");
             }
         }
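The `HOST`/`TARGET` comparison above works because Cargo sets both environment variables for every build script, and they differ exactly when cross-compiling, which is the case where `-mcpu=native` would tune for the wrong machine. A standalone sketch of the check, with illustrative output:

```rust
// build.rs: sketch of detecting cross-compilation in a Cargo build script.
fn main() {
    // Cargo provides HOST and TARGET to build scripts at run time.
    let host = std::env::var("HOST").expect("Cargo sets HOST");
    let target = std::env::var("TARGET").expect("Cargo sets TARGET");
    if host == target {
        // Native build: tuning flags like -mcpu=native are safe.
        println!("cargo:warning=native build for {target}");
    } else {
        // Cross build: choose an explicit CPU for the target instead.
        println!("cargo:warning=cross-compiling {host} -> {target}");
    }
}
```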
