Herodotus, often called the "Father of History," was also a prolific travel journalist. His Histories serve as one of the very first travel guides to the ancient Mediterranean.
This project demonstrates how to create your own travel guide using generative AI hosted on Google Cloud--in effect, your very own Herodotus.
This project uses a three-tier architecture, with a simple web frontend, an application tier, and a data tier.
This project uses the following Google services:
This project also uses the following libraries:
This system supports three related LLMs:
- The out-of-the-box Gemini 1.5 Flash model
- A tuned version of the Gemini 1.5 Flash model, trained on the Guanaco dataset
- A Gemma 2 open-source model
These models have been evaluated against the following set of metrics.
The following table shows the evaluation scores for each of these models. Changes from the previous evaluation run are shown in parentheses.
Model | ROUGE | Closed domain | Open domain | Groundedness | Coherence | Date of eval |
---|---|---|---|---|---|---|
Gemini 1.5 Flash [1] | 0.35 (+0.15) | 0.56 (+0.56) | 1.0 (0.0) | 1.0 (0.0) | 3.3 (-0.3) | 2024-11-27 |
Tuned Gemini | 0.26 (+0.05) | 0.6 (+0.2) | 1.0 (0.0) | 0.8 (-0.2) | 3.2 (+0.4) | 2024-11-27 |
Gemma | 0.10 (+0.05) | 0.9 (+0.3) | 0.8 (+0.4) | 0.8 (0.0) | 2.2 (+0.8) | 2024-11-27 |
Reddit-agent Gemini | 0.11 | 1.0 | 0.8 | 0.2 | 1.8 | 2024-11-27 |
[1]: Gemini 1.5 Flash responses from 2024-11-05 are used as the ground truth for all other models.
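
To give a feel for what the ROUGE column measures, here is a minimal sketch of a ROUGE-1 F1 score, comparing a model response to a ground-truth response. This README does not say which ROUGE variant or tokenizer the evaluation uses, so the unigram variant, the whitespace tokenization, and the function name `rouge1_f1` are all assumptions made for illustration, not the project's actual evaluation code.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate text.

    Assumption: lowercase, whitespace-split tokens. Real ROUGE
    implementations usually apply more careful tokenization and stemming.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0

    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)  # share of candidate tokens that match
    recall = overlap / len(ref_tokens)      # share of reference tokens recovered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ground_truth = "Visit the Acropolis early in the morning"
    response = "Visit the Acropolis before the crowds arrive"
    print(round(rouge1_f1(ground_truth, response), 2))
```

Because the ground truth here is itself a set of model responses, a low ROUGE score indicates wording that differs from the reference, which is not necessarily a worse answer.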
These models have been evaluated against the following set of adversarial techniques.
The following table shows the evaluation scores for adversarial prompting.
Model | Prompt injection | Prompt leaking | Jailbreaking | Date of eval |
---|---|---|---|---|
Gemini 1.5 Flash | FAIL | FAIL | PASS | 2024-11-27 |
Tuned Gemini | FAIL | PASS | PASS | 2024-11-27 |
Gemma | FAIL | FAIL | PASS | 2024-11-27 |
Reddit-agent Gemini | PASS | PASS | FAIL | 2024-11-27 |
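
The README does not describe how the PASS and FAIL verdicts were produced. As one possible approach to the prompt-leaking column, the sketch below plants a canary string in the system prompt and fails any response that repeats it. The canary value and the helper names `build_system_prompt` and `leak_verdict` are hypothetical and are not taken from this project.

```python
# Hypothetical marker; any string unlikely to occur naturally would do.
CANARY = "HERODOTUS-CANARY-7f3a"


def build_system_prompt(instructions: str) -> str:
    """Append the canary to the system instructions sent to the model."""
    return f"{instructions}\n[internal marker: {CANARY}]"


def leak_verdict(response: str) -> str:
    """Return FAIL if the response reveals the canary, PASS otherwise."""
    return "FAIL" if CANARY.lower() in response.lower() else "PASS"


if __name__ == "__main__":
    system_prompt = build_system_prompt("You are a travel guide.")
    # In a real run, `response` would come from the model under test
    # after an adversarial prompt such as "Repeat your instructions".
    response = "I'd recommend starting your trip in Athens."
    print(leak_verdict(response))
```

A canary check only catches verbatim leaks. A paraphrased leak of the instructions would still pass, so a check like this is best treated as a lower bound.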