Add multilingual MITRE datasets to OS
Summary: Adding multilingual machine translations for MITRE benchmark prompts.

Reviewed By: cynikolai

Differential Revision: D67479416

fbshipit-source-id: 925556d677c301153096e3cb1f5cdb21d4afc133
laurendeason authored and facebook-github-bot committed Dec 20, 2024
1 parent 151b3b9 commit 131a7b2
Showing 2 changed files with 9,104 additions and 1 deletion.
3 changes: 2 additions & 1 deletion CybersecurityBenchmarks/README.md
@@ -152,7 +152,7 @@ few examples:
```
python3 -m CybersecurityBenchmarks.benchmark.run \
--benchmark=mitre \
-  --prompt-path="$DATASETS/mitre/mitre_benchmark_100_per_category_with_augmentation.json" \
+  --prompt-path="$DATASETS/mitre/<INPUT_DATASET>" \
--response-path="$DATASETS/mitre_responses.json" \
--judge-response-path="$DATASETS/mitre_judge_responses.json" \
--stat-path="$DATASETS/mitre_stat.json" \
@@ -161,6 +161,7 @@ python3 -m CybersecurityBenchmarks.benchmark.run \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
where `<INPUT_DATASET>` is either `mitre_benchmark_100_per_category_with_augmentation.json` (for English prompts) or `mitre_prompts_multilingual_machine_translated.json` (for multilingual prompts). Note that the multilingual prompts were produced by automated machine translation of the English prompts, so the translations may not be entirely accurate, and CyberSecEval results on them may be correspondingly less reliable.
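As a minimal sketch, the dataset choice above could be wrapped in a small helper that builds the `--prompt-path` value. Only the two filenames come from the repository; the function and its parameter names are illustrative, not part of the benchmark's API:

```python
# Hypothetical helper: map a prompt-set choice to the MITRE dataset path.
# The two filenames are from the CybersecurityBenchmarks datasets; the
# helper itself is illustrative and not part of the benchmark CLI.
MITRE_DATASETS = {
    "english": "mitre_benchmark_100_per_category_with_augmentation.json",
    "multilingual": "mitre_prompts_multilingual_machine_translated.json",
}

def mitre_prompt_path(datasets_dir: str, prompt_set: str = "english") -> str:
    """Return the --prompt-path argument for the chosen prompt set."""
    try:
        return f"{datasets_dir}/mitre/{MITRE_DATASETS[prompt_set]}"
    except KeyError:
        raise ValueError(f"unknown prompt set: {prompt_set!r}")
```

For example, `mitre_prompt_path("$DATASETS", "multilingual")` yields the multilingual dataset path to substitute for `<INPUT_DATASET>` in the command above.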

The MITRE benchmarks are run in three steps:
