NOTE: The model weights are not included because they are too large; the datasets are publicly available on Hugging Face. If you are interested, we can provide the weights of the pre-trained Q-LoRA adapters, which total roughly 300 MB.
Our repo provides a web UI that compares Piiranha against our LLaMA 3.1 Instruct approach. Through QLoRA fine-tuning, the latter offers an interesting, more generalizing take on the problem, moving it toward "foundation model" solutions.
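For orientation, the sketch below shows what such a QLoRA setup typically looks like with `transformers` and `peft`. The rank, alpha, target modules, and other hyperparameters are illustrative assumptions, not the exact configuration used to train our released adapters:

```python
# Minimal QLoRA sketch (hyperparameters are illustrative assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)

# Only the low-rank adapters are trainable, which is why the released
# checkpoint stays small (~300 MB) compared to the full 8B model.
lora_config = LoraConfig(
    r=16,                 # assumed rank
    lora_alpha=32,        # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```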
Run inference directly:

```bash
python inference.py
```

List the available inference options:

```bash
python main.py inference -h
```

Launch the web UI:

```bash
python main.py webui 0.0.0.0 7654 false
```

The positional arguments are `[HOST] [PORT] [PUBLISH]`. For more on evaluation, see `notebooks/evaluation.ipynb`.
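For illustration, a subcommand with this shape could be wired up with `argparse` as sketched below; this is a hypothetical reconstruction of the CLI, not the actual code in `main.py`:

```python
# Hypothetical sketch of how the webui subcommand parses [HOST] [PORT] [PUBLISH].
import argparse

parser = argparse.ArgumentParser(prog="main.py")
subparsers = parser.add_subparsers(dest="command", required=True)

webui = subparsers.add_parser("webui", help="Launch the comparison web UI")
webui.add_argument("host", help="Interface to bind, e.g. 0.0.0.0")
webui.add_argument("port", type=int, help="Port to listen on, e.g. 7654")
webui.add_argument(
    "publish",
    type=lambda s: s.lower() == "true",
    help="Whether to create a public share link (true/false)",
)

# Example: parse the command shown above.
args = parser.parse_args(["webui", "0.0.0.0", "7654", "false"])
print(args.host, args.port, args.publish)
```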
Model Name | Model Size | Author | Considerations for Privacy Masking in Multiple Languages |
---|---|---|---|
CodeGemma | 7B | Google | Dependable, but may not focus on multilingual text masking specifically. |
Code Llama | 7B, 13B, 34B, 70B | Meta AI | Potentially good for code tasks; may not be optimized for privacy masking, but Meta's models are often versatile. |
Danube2 | 1.8B | H2O.ai | Small size, may not perform as well on complex, multilingual tasks. |
Dolly | 3B, 7B, 12B | Databricks | Could be suitable for NLP tasks, but multilingual support needs to be confirmed. |
Falcon | 7B, 40B, 180B | TII UAE | Large variants available, potentially useful. Check for multilingual capabilities. |
FreeWilly2 | 70B | Stability AI | Very large model, might offer robust NLP capabilities including some multilingual features. |
Function Calling Llama 2 | 7B | Trelis | Focuses on function calls more than general NLP tasks. |
Gemma | 2B, 7B | Google | Similar considerations as CodeGemma above. |
Llama 2 | 7B, 13B, 70B | Meta AI | Llama models are generally strong for text-related tasks with some support for multilingual datasets. |
LongChat | 7B, 13B | LMSYS | Specially designed for dialogue, might have useful attention mechanisms for entity recognition. |
Mathstral | 7B | Mistral AI | Math-focused, less promising for privacy masking. |
MicroLlama | 300M | Ken Wang | Very small model, not suitable for complex tasks. |
Mixtral MoE | 8x7B | Mistral AI | Mixture of Experts model might bring robustness through ensemble but check multilingual compatibility. |
Mistral | 7B, 123B | Mistral AI | Multiple sizes available; larger variants may offer sophisticated language understanding and multilingual capabilities. |
Nous-Hermes | 7B, 13B, 70B | NousResearch | Variants' availability suggests potential robustness in NLP tasks. Multilingual support needs confirmation. |
OpenLLaMA | 3B, 7B, 13B | OpenLM Research | Open source nature makes it flexible, but multilingual capabilities should be verified. |
Phi 1.5 & 2 | 1.3B, 2.7B | Microsoft | Smaller models may not handle sophisticated tasks well. |
Platypus | 7B, 13B, 70B | Lee et al. | Check for multilingual support, potential for good language model performance. |
Pythia | 14M to 12B | EleutherAI | Wide range of sizes; verify for multilingual text masking. |
RedPajama-INCITE | 3B, 7B | Together | Check for robust text processing capabilities in multiple languages. |
StableLM | 3B, 7B | Stability AI | Variants available; assess for multilingual text masking. |
TinyLlama | 1.1B | Zhang et al. | Small model size not ideal for this task. |
Vicuna | 7B, 13B, 33B | LMSYS | Known for conversation, likely has various NLP capabilities. Confirm multilingual support. |
Shortlisted candidates:

- LlamaV2 13B
- Function Calling Llama 2 7B
- meta-llama/Meta-Llama-3.1-8B-Instruct ← selected
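As a usage illustration, masking a sentence with the selected model could look like the sketch below. The system prompt, the `[MASKED]` placeholder, and the adapter path are assumptions made for this example; see `inference.py` for the real pipeline:

```python
# Illustrative inference sketch: prompting the fine-tuned instruct model to mask PII.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, "adapters/")  # hypothetical adapter path

messages = [
    {"role": "system",
     "content": "Replace every piece of personal information with a [MASKED] placeholder."},
    {"role": "user",
     "content": "Hi, I'm Jane Doe, you can reach me at jane@example.com."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)

# Decode only the newly generated tokens (the masked sentence).
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```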