FEAT Add ability to fetch wmdp-bio, wmdp-chem, and wmdp-cyber datasets #380

Merged
54 commits merged into Azure:main on Sep 18, 2024

Conversation

mshirsekar1
Contributor

Description

Add the ability to fetch the wmdp-bio, wmdp-chem, and wmdp-cyber datasets from Hugging Face. This is related to issue #186.

Tests and Documentation

Tested the ability to fetch the datasets from Hugging Face, and validated that the QuestionAnsweringDataset format can be used as prompts successfully.
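
For readers who want a picture of what fetching and converting might look like, here is a minimal sketch rather than the code merged in this PR. It assumes the data lives in the Hugging Face dataset `cais/wmdp` with one configuration per category, a `test` split, and rows shaped as `question` / `choices` / `answer` (an index into `choices`). `QAEntry`, `rows_to_entries`, and `fetch_wmdp` are hypothetical names for illustration and are not PyRIT's API.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping, Tuple

# Category names taken from the PR title.
WMDP_CATEGORIES = ("wmdp-bio", "wmdp-chem", "wmdp-cyber")


@dataclass(frozen=True)
class QAEntry:
    """Hypothetical stand-in for one multiple-choice question (not a PyRIT class)."""

    question: str
    choices: Tuple[str, ...]
    correct_index: int

    def as_prompt(self) -> str:
        # Render the question followed by its numbered answer choices.
        lines = [self.question]
        lines += [f"{i}. {choice}" for i, choice in enumerate(self.choices)]
        return "\n".join(lines)


def rows_to_entries(rows: Iterable[Mapping[str, Any]]) -> List[QAEntry]:
    """Convert raw rows (assumed keys: question, choices, answer) into QAEntry objects."""
    entries = []
    for row in rows:
        choices = tuple(row["choices"])
        answer = int(row["answer"])
        if not 0 <= answer < len(choices):
            raise ValueError(f"answer index {answer} out of range for {len(choices)} choices")
        entries.append(QAEntry(row["question"], choices, answer))
    return entries


def fetch_wmdp(category: str) -> List[QAEntry]:
    """Download one WMDP category from Hugging Face (needs network access)."""
    if category not in WMDP_CATEGORIES:
        raise ValueError(f"unknown category {category!r}; expected one of {WMDP_CATEGORIES}")
    # Imported lazily so the conversion helpers work without the `datasets` package.
    from datasets import load_dataset

    # Assumption: dataset ID "cais/wmdp" with a "test" split per category.
    data = load_dataset("cais/wmdp", category, split="test")
    return rows_to_entries(data)
```

Calling `fetch_wmdp("wmdp-cyber")` would then return a list of entries, and `entry.as_prompt()` gives a plain-text prompt for each one. Keeping the row conversion separate from the download makes the format check testable offline.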

mshirsekar1 and others added 30 commits September 16, 2024 11:54
FEAT add ability to fetch PKU-SafeRLHF dataset
Co-authored-by: Roman Lutz <romanlutz13@gmail.com>
@mshirsekar1
Contributor Author

@microsoft-github-policy-service agree company="Microsoft"

@romanlutz
Contributor

Amazing! Thank you so much for tackling this!

Can you remove the existing WMDP datasets that are stored in the repo?

@romanlutz romanlutz linked an issue Sep 18, 2024 that may be closed by this pull request
Two resolved review threads on pyrit/datasets/fetch_example_datasets.py (outdated)
@mshirsekar1 mshirsekar1 marked this pull request as ready for review September 18, 2024 22:25
@romanlutz romanlutz merged commit 88c8872 into Azure:main Sep 18, 2024
5 checks passed
Development

Successfully merging this pull request may close these issues.

Update WMDP Dataset
3 participants