Add multilingual FRR and PI to OS
Summary: Adding multilingual machine translations for FRR and PI benchmark prompts.

Reviewed By: cynikolai

Differential Revision: D67488258

fbshipit-source-id: 960a4ce941098125039392e1105ff0cae710d7c7
laurendeason authored and facebook-github-bot committed Dec 20, 2024
1 parent 131a7b2 commit 775c64a
Showing 3 changed files with 16,745 additions and 15 deletions.
20 changes: 5 additions & 15 deletions CybersecurityBenchmarks/README.md
@@ -181,12 +181,13 @@ original paper, we recommend using GPT-3.5 including an OpenAI API key.
```
python3 -m CybersecurityBenchmarks.benchmark.run \
--benchmark=frr \
- --prompt-path="$DATASETS/frr/frr.json" \
+ --prompt-path="$DATASETS/frr/<INPUT_DATASET>" \
--response-path="$DATASETS/frr/frr_responses.json" \
--stat-path="$DATASETS/frr/frr_stat.json" \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
+ where `<INPUT_DATASET>` is either `frr.json` (English prompts) or `frr_multilingual_machine_translated.json` (multilingual prompts). Note that the multilingual prompts were produced by automated machine translation of the English prompts, so the translations may not be entirely accurate, and, as a result, CyberSecEval results for these prompts may not be entirely accurate either.
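
For illustration, a multilingual FRR run might look like the following sketch; the `OPENAI::gpt-4::<YOUR_API_KEY>` value is only an assumed placeholder for the model specification, so substitute whatever `<SPECIFICATION>` format you use for the other benchmarks:

```
# Sketch: FRR benchmark over the machine-translated multilingual prompts.
# The OPENAI::gpt-4::<YOUR_API_KEY> specification is an assumed placeholder.
python3 -m CybersecurityBenchmarks.benchmark.run \
   --benchmark=frr \
   --prompt-path="$DATASETS/frr/frr_multilingual_machine_translated.json" \
   --response-path="$DATASETS/frr/frr_responses.json" \
   --stat-path="$DATASETS/frr/frr_stat.json" \
   --llm-under-test="OPENAI::gpt-4::<YOUR_API_KEY>" \
   --run-llm-in-parallel
```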

The FRR benchmark is run in two steps:

@@ -236,14 +237,15 @@ Both the instruct and autocomplete benchmarks are run in two steps:
```
python3 -m CybersecurityBenchmarks.benchmark.run \
--benchmark=prompt-injection \
- --prompt-path="$DATASETS/prompt_injection/prompt_injection.json" \
+ --prompt-path="$DATASETS/prompt_injection/<INPUT_DATASET>" \
--response-path="$DATASETS/prompt_injection/prompt_injection_responses.json" \
--judge-response-path="$DATASETS/prompt_injection/prompt_injection_judge_responses.json" \
--stat-path="$DATASETS/prompt_injection/prompt_injection_stat.json" \
- --judge-llm=<>SPECIFICATION_1 \
+ --judge-llm=<SPECIFICATION_1> \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
+ where `<INPUT_DATASET>` is either `prompt_injection.json` (English prompts) or `prompt_injection_multilingual_machine_translated.json` (multilingual prompts). Note that the multilingual prompts were produced by automated machine translation of the English prompts, so the translations may not be entirely accurate, and, as a result, CyberSecEval results for these prompts may not be entirely accurate either.
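
Similarly, a multilingual prompt injection run might look like the sketch below; both `OPENAI::gpt-4::<YOUR_API_KEY>` values are assumed placeholders for the judge and test model specifications:

```
# Sketch: prompt injection benchmark over the machine-translated multilingual prompts.
# Both specifications below are assumed placeholders; use your own <SPECIFICATION> values.
python3 -m CybersecurityBenchmarks.benchmark.run \
   --benchmark=prompt-injection \
   --prompt-path="$DATASETS/prompt_injection/prompt_injection_multilingual_machine_translated.json" \
   --response-path="$DATASETS/prompt_injection/prompt_injection_responses.json" \
   --judge-response-path="$DATASETS/prompt_injection/prompt_injection_judge_responses.json" \
   --stat-path="$DATASETS/prompt_injection/prompt_injection_stat.json" \
   --judge-llm="OPENAI::gpt-4::<YOUR_API_KEY>" \
   --llm-under-test="OPENAI::gpt-4::<YOUR_API_KEY>" \
   --run-llm-in-parallel
```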

Textual prompt injection benchmarks are run in the following two steps:

@@ -254,18 +256,6 @@ Textual prompt injection benchmarks are run in the following two steps:
another judge LLM, where the judge LLM will judge if the injected instruction
is executed

- ### Multilingual Text Prompt Injection Benchmark
-
- For the multilingual textual prompt injection benchmark, the same benchmark
- implementation is used, only the dataset is changed.
-
- ```
- python3 -m CybersecurityBenchmarks.benchmark.run \
-   --benchmark=prompt-injection \
-   --prompt-path="$DATASETS/prompt_injection/prompt_injection_multilingual.json" \
-   ... (rest same as above)
- ```

### Visual Prompt Injection Benchmark

```
