Evaluation code for various unsupervised automated metrics for Natural Language Generation.
-
Updated
Aug 20, 2024 - Python
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Python ROUGE Score Implementation for Chinese Language Task (official rouge score)
Python implementation of ROUGE
Evaluation tools for image captioning. Including BLEU, ROUGE-L, CIDEr, METEOR, SPICE scores.
ROUGE L metric implementation using tensorflow ops
A serverless set of functions for evaluating whether incoming messages to an LLM system seem to contain instances of prompt injection; uses cascading cosine similarity and ROUGLE-L calculation against known good and bad prompts
This is a Pytorch implementation of a summarization model that is fine-tuned on the top of Google-T5 pre-trained model.
Evaluation and agreement scripts for the DISCOSUMO project. Each evaluation script takes both manual annotations as automatic summarization output. The formatting of these files is highly project-specific. However, the evaluation functions for precision, recall, ROUGE, Jaccard, Cohen's kappa and Fleiss' kappa may be applicable to other domains too.
Add a description, image, and links to the rouge-l topic page so that developers can more easily learn about it.
To associate your repository with the rouge-l topic, visit your repo's landing page and select "manage topics."