0.2.0
Changes
- Add support for LLaMA and MPT models
- Change repetition_penalty default value from 1.0 to 1.1
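To illustrate what the new repetition_penalty default means in practice, here is a minimal sketch of a repetition penalty applied to raw logits. It assumes the commonly used formulation (positive logits of recently seen tokens are divided by the penalty, non-positive ones are multiplied by it); this is not confirmed to be exactly what this library does internally, and the function name apply_repetition_penalty is hypothetical.

```python
def apply_repetition_penalty(logits, recent_tokens, penalty=1.1):
    """Return a copy of `logits` with recently generated tokens made less likely.

    Assumed formulation (not taken from the library source):
    positive logits are divided by `penalty`, non-positive logits are
    multiplied by it, so in both cases the token's score goes down.
    """
    adjusted = list(logits)
    for token_id in set(recent_tokens):
        if adjusted[token_id] > 0:
            adjusted[token_id] /= penalty
        else:
            adjusted[token_id] *= penalty
    return adjusted


# Toy vocabulary of three tokens; tokens 0 and 1 were generated recently.
logits = [2.2, -1.0, 0.5]
recent = [0, 1]

old_default = apply_repetition_penalty(logits, recent, penalty=1.0)  # previous default
new_default = apply_repetition_penalty(logits, recent, penalty=1.1)  # default as of 0.2.0

print(old_default)
print(new_default)
```

With a penalty of 1.0 the logits are left unchanged, so passing repetition_penalty=1.0 explicitly should keep the previous behavior for callers who relied on it.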
Breaking Changes
- Update the GGML library, which introduces breaking changes to quantization formats. Models quantized with the old formats have to be re-quantized.
Some of the latest quantized models are available here.