The goal is to achieve the lowest perplexity score by rearranging the words of a given text.
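As a reminder of what is being minimized, perplexity is the exponential of the mean negative log-likelihood per token. The snippet below is a minimal sketch of that formula only; the function name `perplexity` and the hard-coded log-probabilities are illustrative assumptions, not code or values from this repository, where the log-probabilities would come from the Gemma-2 model.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical per-token log-probabilities for two orderings of the same words.
order_a = [math.log(0.5), math.log(0.25)]
order_b = [math.log(0.5), math.log(0.5)]

# The ordering whose tokens the model finds more likely gets the lower perplexity.
better = min((order_a, order_b), key=perplexity)
```

Here `order_b` wins because each of its tokens is at least as probable as the corresponding token in `order_a`.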
- Removed the sliding attention window, since the texts are short.
- Added the top-p calculation directly to the forward method.
- Prepared the code for compilation with torch.compile without graph breaks.
- Tree-based search over the possible permutations of the words.
- Tree-based search in reverse order: from the end of the text to the beginning.
- Tree-based search with Monte Carlo sampling.
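To make the idea of a tree-based search over word permutations concrete, here is a minimal sketch. It builds the ordering one word at a time and keeps only the best few prefixes at each depth. The pruning rule (a fixed-width beam), the names `tree_search`, `toy_score`, and `GOOD_BIGRAMS`, and the bigram-based score are all assumptions made for illustration; the scripts in this repository score candidates with the Gemma-2 model and may prune differently.

```python
from collections import Counter


def tree_search(words, score, beam_width=4):
    """Grow orderings word by word, keeping the `beam_width` lowest-scoring prefixes."""
    beam = [((), Counter(words))]  # each entry: (prefix, words still unused)
    for _ in range(len(words)):
        candidates = {}
        for prefix, remaining in beam:
            for word in remaining:
                new_prefix = prefix + (word,)
                if new_prefix in candidates:
                    continue  # same prefix reached through a duplicate word
                rest = remaining.copy()
                rest[word] -= 1
                if rest[word] == 0:
                    del rest[word]
                candidates[new_prefix] = rest
        ranked = sorted(candidates, key=score)  # lower score is better
        beam = [(p, candidates[p]) for p in ranked[:beam_width]]
    return list(beam[0][0])


# Toy stand-in for model perplexity: count adjacent pairs that are not "good".
GOOD_BIGRAMS = {("the", "cat"), ("cat", "sat"), ("sat", "down")}


def toy_score(prefix):
    return sum((a, b) not in GOOD_BIGRAMS for a, b in zip(prefix, prefix[1:]))


words = ["sat", "down", "cat", "the"]
best = tree_search(words, toy_score, beam_width=4)
```

A narrower beam is cheaper but can discard the prefix that leads to the best full ordering, which is one reason to combine the tree with sampling. Under the same assumptions, a reverse-order variant would prepend each new word to a suffix instead of appending it to a prefix.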
.
├── results
│   └── results.txt          # Results
├── algorithm_tbs_mcs.py     # Tree-based search with Monte Carlo sampling
├── algorithm_tbs.py         # Tree-based search
├── algorithm_tbs_reverse.py # Tree-based search in reverse order
├── config.py                # Gemma-2 configuration
├── generate.py              # Gemma-2 generation
├── model.py                 # Gemma-2 model
├── tokenizer.py             # Tokenizer
├── .gitignore
├── LICENSE
├── README.md
└── requirements.txt