Neural-Wave/project-DroppedNeurons

Video pitch

NOTE: The model weights are not included because they are too heavy, while the datasets are publicly available on Hugging Face. If interested, we can provide the weights for the pre-trained QLoRA adapter, averaging just 300 MB.

Quickstart

Our repo provides a web UI that shows the effects of Piiranya versus our Llama 3.1 Instruct approach, which, through QLoRA fine-tuning, offers an interesting and more generalizable take on the problem, moving toward "foundation model" style solutions.

inference

```
python inference.py
python main.py inference -h
```
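
For reference, here is a minimal sketch of what adapter-based inference could look like, assuming the ~300 MB QLoRA adapter mentioned above was trained with PEFT on top of Llama 3.1 Instruct; the adapter path and the masking prompt are illustrative assumptions, not the repo's actual interface.

```python
# Minimal QLoRA-adapter inference sketch (paths and prompt are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Meta-Llama-3.1-8B-Instruct"
ADAPTER = "path/to/qlora-adapter"  # hypothetical path to the ~300 MB adapter

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned LoRA weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, ADAPTER)

prompt = "Mask all personal information: John Smith lives at 12 Main St."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```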

webui

```
python main.py webui 0.0.0.0 7654 false
```

The positional arguments are [HOST] [PORT] [PUBLISH].
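
Below is a minimal sketch of how the [HOST] [PORT] [PUBLISH] arguments could be wired into a web UI; the use of Gradio and the argument handling are assumptions about main.py, not its confirmed implementation.

```python
# Hypothetical standalone web UI; main.py additionally takes a `webui`
# subcommand before these positional arguments.
import sys
import gradio as gr

def mask(text: str) -> str:
    # Placeholder: the real app would call the fine-tuned model here.
    return text

host = sys.argv[1]
port = int(sys.argv[2])
publish = sys.argv[3].lower() == "true"

demo = gr.Interface(fn=mask, inputs="text", outputs="text")
# share=True asks Gradio for a public link, matching the PUBLISH flag.
demo.launch(server_name=host, server_port=port, share=publish)
```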

evaluation

Evaluation is available in the web UI; for more detail, see notebooks/evaluation.ipynb.
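
For illustration, here is a hedged sketch of one metric that suits PII masking, span-level F1 over predicted versus gold entity spans; the notebook's actual metrics may differ.

```python
# Span-level F1 for PII masking: an exact-match span counts as a true positive.
def span_f1(pred_spans: set[tuple[int, int]], gold_spans: set[tuple[int, int]]) -> float:
    tp = len(pred_spans & gold_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Example: one exact match, one missed gold span, one spurious prediction.
print(span_f1({(0, 10), (20, 25)}, {(0, 10), (30, 35)}))  # 0.5
```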

LLM Fine-Tuning

Model choice comparison table:

| Model Name | Model Size | Author | Considerations for Privacy Masking in Multiple Languages |
|---|---|---|---|
| CodeGemma | 7B | Google | Dependable but may not focus on multilingual text masking specifically. |
| Code Llama | 7B, 13B, 34B, 70B | Meta AI | Potentially good for code tasks; may not be optimized for privacy masking, but Meta's models are often versatile. |
| Danube2 | 1.8B | H2O.ai | Small size; may not perform as well on complex, multilingual tasks. |
| Dolly | 3B, 7B, 12B | Databricks | Could be suitable for NLP tasks, but multilingual support needs to be confirmed. |
| Falcon | 7B, 40B, 180B | TII UAE | Large variants available, potentially useful. Check for multilingual capabilities. |
| FreeWilly2 | 70B | Stability AI | Very large model; might offer robust NLP capabilities, including some multilingual features. |
| Function Calling Llama 2 | 7B | Trelis | Focuses on function calls more than general NLP tasks. |
| Gemma | 2B, 7B | Google | Similar considerations as CodeGemma above. |
| Llama 2 | 7B, 13B, 70B | Meta AI | Llama models are generally strong for text-related tasks, with some support for multilingual datasets. |
| LongChat | 7B, 13B | LMSYS | Specially designed for dialogue; might have useful attention mechanisms for entity recognition. |
| Mathstral | 7B | Mistral AI | Math-focused; less promising for privacy masking. |
| MicroLlama | 300M | Ken Wang | Very small model; not suitable for complex tasks. |
| Mixtral MoE | 8x7B | Mistral AI | Mixture-of-Experts model might bring robustness through ensembling, but check multilingual compatibility. |
| Mistral | 7B, 123B | Mistral AI | Large model sizes available may leverage sophisticated language understanding and multilingual capabilities. |
| Nous-Hermes | 7B, 13B, 70B | NousResearch | Availability of variants suggests potential robustness in NLP tasks; multilingual support needs confirmation. |
| OpenLLaMA | 3B, 7B, 13B | OpenLM Research | Open-source nature makes it flexible, but multilingual capabilities should be verified. |
| Phi 1.5 & 2 | 1.3B, 2.7B | Microsoft | Smaller models may not handle sophisticated tasks well. |
| Platypus | 7B, 13B, 70B | Lee et al. | Check for multilingual support; potential for good language-model performance. |
| Pythia | 14M to 12B | EleutherAI | Wide range of sizes; verify for multilingual text masking. |
| RedPajama-INCITE | 3B, 7B | Together | Check for robust text processing capabilities in multiple languages. |
| StableLM | 3B, 7B | Stability AI | Variants available; assess for multilingual text masking. |
| TinyLlama | 1.1B | Zhang et al. | Small model size not ideal for this task. |
| Vicuna | 7B, 13B, 33B | LMSYS | Known for conversation; likely has various NLP capabilities. Confirm multilingual support. |

Possible choices:

  • LlamaV2 13B
  • Function Calling Llama 2 7B
  • meta-llama/Meta-Llama-3.1-8B-Instruct ← selected (see the loading sketch below)
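
As an illustration of QLoRA preparation for the selected model, here is a minimal sketch of a 4-bit load with bitsandbytes; the quantization settings are common QLoRA defaults and an assumption, not the project's recorded configuration.

```python
# 4-bit base-model load for QLoRA fine-tuning (settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NF4 is QLoRA's default quantization
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16 for stability
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```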

Parameters

(Figure: fine-tuning parameter configuration; the source is linked in the original README.)
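
Since the figure is not reproduced here, the sketch below uses typical QLoRA hyperparameters as placeholders, continuing from the 4-bit load above; the values actually used by the project were shown in the figure and may differ.

```python
# Illustrative QLoRA hyperparameters; the project's actual values may differ.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# `model` is the 4-bit base model from the previous sketch.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                       # LoRA rank (assumed)
    lora_alpha=32,              # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
# Only the adapter weights train; saved, they are small (~300 MB, as noted above).
model.print_trainable_parameters()
```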