PyTorch implementation of normalization-free LLMs, investigating entropic behavior to find desirable activation functions
pythia leaky-relu relu privacy-preserving-machine-learning pytorch-implementation gelu gpt-2 model-optimization transformers-models normalization-free-training llm-inference llm-evaluation llm-architecture private-inference entropy-collapse attention-we
Updated Nov 2, 2024 - Python