ignore image refs failing #93

Closed
himynamesdave opened this issue Nov 22, 2024 · 1 comment · Fixed by #99
Labels
bug Something isn't working

Comments

himynamesdave (Member) commented Nov 22, 2024

https://github.com/muchdogesec/txt2stix/blob/issue-with-ignores/tests/manual-tests/cases-standard-tests.md#ignore-image-refs

Input file

<img href="https://www.google.com/image.png" alt="tiktok.com" />

![facebook.com](https://www.twitter.com/image.png)

instagram.com

With the command below, the script should ignore https://www.twitter.com/image.png and https://www.google.com/image.png:

python3 txt2stix.py \
	--relationship_mode standard \
	--input_file tests/data/manually_generated_reports/embedded_img_ignore.txt \
	--name 'ignore image refs true' \
	--tlp_level clear \
	--confidence 100 \
	--use_extractions pattern_domain_name_only,pattern_domain_name_subdomain,pattern_url,pattern_file_name \
	--ignore_image_refs true \
	--report_id 649da017-4090-48b2-97da-b24d37418ee6

but the output, data--649da017-4090-48b2-97da-b24d37418ee6.json, contains only one extraction: instagram.com.

The script should ignore only the image link itself (not the alt text, etc.), so the expected extractions are:

instagram.com
tiktok.com
facebook.com
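
To illustrate the expected behaviour for the markdown case, here is a minimal sketch that drops the link of a markdown image while keeping its alt text available for extraction. The helper name `drop_markdown_image_links` and the regex are assumptions made for illustration only; this is not the txt2stix implementation.

```python
import re

# Hypothetical pattern (an assumption, not taken from txt2stix):
# matches ![alt](target) and captures only the alt text.
MD_IMAGE = re.compile(r'!\[([^\]]*)\]\([^)]*\)')


def drop_markdown_image_links(text: str) -> str:
    """Replace each markdown image with its alt text only.

    The image target (the URL in parentheses) is discarded, so it can no
    longer be extracted, while the alt text stays in the document.
    """
    return MD_IMAGE.sub(lambda m: m.group(1), text)


sample = "![facebook.com](https://www.twitter.com/image.png)\n\ninstagram.com\n"
cleaned = drop_markdown_image_links(sample)
print(cleaned)
```

With input like the test file above, the twitter.com image URL disappears before extraction runs, while facebook.com and instagram.com remain as plain text.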

himynamesdave added the enhancement (New feature or request) label Nov 22, 2024
github-project-automation bot moved this to Todo in Roadmap Nov 22, 2024
himynamesdave added the bug (Something isn't working) label and removed the enhancement (New feature or request) label Nov 22, 2024
fqrious added a commit that referenced this issue Nov 22, 2024
himynamesdave added a commit that referenced this issue Nov 24, 2024
* issue with ignores

* updating tests

* fix lines #93 #94

---------

Co-authored-by: Fadl <chaos@efqr.dev>

fqrious (Contributor) commented Nov 25, 2024

As such we should add flag ignore_image_refs which will ignore all content inside HTML or markdown image tags

I was asked to completely remove the image in the original issue.

img tags do not have an href attribute; they have src. So everything except the twitter link will be extracted after this change is implemented.
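
The distinction can be sketched as follows: only the src attribute of an img tag is removed, so a URL placed in a (non-standard) href attribute is left in place and still gets extracted. The function name `drop_img_src` and both regexes are hypothetical, written for illustration under that assumption; they are not the code from the referenced commits.

```python
import re

# Hypothetical patterns (assumptions, not taken from txt2stix).
IMG_TAG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)
# A src attribute preceded by whitespace, with a quoted or unquoted value.
SRC_ATTR = re.compile(r'''\ssrc\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)''', re.IGNORECASE)


def drop_img_src(html: str) -> str:
    """Remove only the src attribute from each <img> tag.

    Other attributes (alt, href, ...) are left untouched.
    """
    return IMG_TAG.sub(lambda m: SRC_ATTR.sub('', m.group(0)), html)


with_href = '<img href="https://www.google.com/image.png" alt="tiktok.com" />'
with_src = '<img src="https://www.google.com/image.png" alt="tiktok.com" />'

print(drop_img_src(with_href))  # unchanged: there is no src attribute to remove
print(drop_img_src(with_src))   # the src attribute is gone, alt is kept
```

Under this sketch, the img line from the original test file (which uses href) passes through unchanged, which matches the note that its URL would still be extracted.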

fqrious added a commit that referenced this issue Nov 25, 2024
fqrious moved this from Todo to Pushed in Roadmap Nov 25, 2024
github-project-automation bot moved this from Pushed to Attention in Roadmap Nov 25, 2024
himynamesdave pushed a commit that referenced this issue Nov 25, 2024
himynamesdave added a commit that referenced this issue Nov 25, 2024
* only remove src from img #93

* Update embedded_img_ignore.txt

---------

Co-authored-by: Fadl <chaos@efqr.dev>