HuggingFace token detector not working properly (wrong number of characters) #6823

Closed
2 tasks done
DmitriyLewen opened this issue May 30, 2024 Discussed in #6784 · 2 comments · Fixed by #7216
Labels
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/bug: Categorizes issue or PR as related to a bug.
scan/secret: Issues relating to secret scanning.

Comments

@DmitriyLewen
Contributor

Discussed in #6784

Originally posted by asankov May 27, 2024

Description

While experimenting with the secret detector in https://github.com/aquasecurity/trivy/blob/main/pkg/fanal/secret/, I noticed that it does not detect Hugging Face tokens.

The HF regex expects 39 symbols after `hf_`. However, my HF token has only 34 symbols after the prefix.

Example HF token: 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 (already revoked)

Desired Behavior

Detect a HF token.

Actual Behavior

Not detecting a HF token.

Reproduction Steps

1. Create a Hugging Face account at https://huggingface.co/
2. Generate an API token at https://huggingface.co/settings/tokens
3. Provide that token as input to the `secret.Scanner`
4. Assert that it returns no findings

Target

Filesystem

Scanner

Secret

Output Format

None

Mode

Standalone

Debug Output

$ trivy fs hf --debug
2024-05-27T13:40:23+03:00	DEBUG	Parsed severities	severities=[UNKNOWN LOW MEDIUM HIGH CRITICAL]
2024-05-27T13:40:23+03:00	DEBUG	Ignore statuses	statuses=[]
2024-05-27T13:40:23+03:00	DEBUG	Cache dir	dir="/Users/asankov/Library/Caches/trivy"
2024-05-27T13:40:23+03:00	DEBUG	DB update was skipped because the local DB is the latest
2024-05-27T13:40:23+03:00	DEBUG	DB info	schema=2 updated_at=2024-05-27T06:12:09.854561954Z next_update=2024-05-27T12:12:09.854561794Z downloaded_at=2024-05-27T10:39:59.156462Z
2024-05-27T13:40:23+03:00	INFO	Vulnerability scanning is enabled
2024-05-27T13:40:23+03:00	DEBUG	Vulnerability type	type=[os library]
2024-05-27T13:40:23+03:00	INFO	Secret scanning is enabled
2024-05-27T13:40:23+03:00	INFO	If your scanning is slow, please try '--scanners vuln' to disable secret scanning
2024-05-27T13:40:23+03:00	INFO	Please see also https://aquasecurity.github.io/trivy/v0.51/docs/scanner/secret/#recommendation for faster secret detection
2024-05-27T13:40:23+03:00	DEBUG	Enabling misconfiguration scanners	scanners=[azure-arm cloudformation dockerfile helm kubernetes terraform terraformplan-json terraformplan-snapshot]
2024-05-27T13:40:23+03:00	DEBUG	[secret] No secret config detected	config_path="trivy-secret.yaml"
2024-05-27T13:40:23+03:00	DEBUG	[nuget] The nuget packages directory couldn't be found. License search disabled
2024-05-27T13:40:23+03:00	DEBUG	OS is not detected.
2024-05-27T13:40:23+03:00	DEBUG	Detected OS: unknown
2024-05-27T13:40:23+03:00	INFO	Number of language-specific files	num=0

Operating System

macOS

Version

Version: 0.51.4
Vulnerability DB:
  Version: 2
  UpdatedAt: 2024-05-27 06:12:09.854561954 +0000 UTC
  NextUpdate: 2024-05-27 12:12:09.854561794 +0000 UTC
  DownloadedAt: 2024-05-27 10:39:59.156462 +0000 UTC

Checklist

DmitriyLewen added the kind/bug, scan/secret, and help wanted labels May 30, 2024
@nikpivkin
Contributor

Would it make sense to create a topic about token format on the forum? https://discuss.huggingface.co/

@DmitriyLewen
Contributor Author

I asked about this in HuggingChat:

After further investigation, I found that the Hugging Face token can indeed have fewer characters.
According to the Hugging Face documentation, the token can be either 34 or 40 characters long, depending on the type of token.
So, to confirm, the schema for the Hugging Face token is:
hf_<34 or 40 alphanumeric characters>

So the token format is hf_<34 or 40 alphanumeric characters>.
