Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EXR/HDR/TGA/DDS/KTX2/QOI support #25

Merged
merged 24 commits into from
May 6, 2023

Conversation

AllenDang
Copy link
Contributor

@AllenDang AllenDang commented May 3, 2023

Add EXR/HDR/TGA/DDS/KTX2/QOI support

@Roughsketch
Copy link
Owner

That's a lot, thanks for the PR! It'll probably take me a bit to review since I'm always busy so please be patient. I'm also not familiar with ktx2 or qoi files so I'll need to look them up.

One thing I'll note right now is I'm not 100% confident in the TGA header check which is the main reason why it wasn't supported so far. I know there's an optional footer with a more concrete value but I also don't like having to read the end of the file to verify the file type. Before I accept this I might need to verify that the chance of a false positive isn't super common. Don't want a bunch of random files being treated as TGAs after all.

@AllenDang AllenDang changed the title Add exr support Add EXR/HDR/TGA/DDS/KTX2/QOI support May 4, 2023
@AllenDang
Copy link
Contributor Author

@Roughsketch Yes, tga is tricky, currently implementation of type checking is not solid.
And may be we should consider to use file extension for type checking first?

@Roughsketch
Copy link
Owner

File extension checking doesn't work for this crate because I leave the option of parsing from byte arrays or readers which don't have extension information. I'm debating whether or not it makes sense to feature lock it so that people can opt-in if they need it and avoid false positives.

Also while looking up TGA header stuff I found this reference which includes the values 0, 32, and 33 as possible data type codes. Might be something you want to add to the header check.

For now I need to sleep, but I hope to have time to fully look at the rest of this by Sunday.

@AllenDang
Copy link
Contributor Author

File extension checking doesn't work for this crate because I leave the option of parsing from byte arrays or readers which don't have extension information. I'm debating whether or not it makes sense to feature lock it so that people can opt-in if they need it and avoid false positives.

Also while looking up TGA header stuff I found this reference which includes the values 0, 32, and 33 as possible data type codes. Might be something you want to add to the header check.

For now I need to sleep, but I hope to have time to fully look at the rest of this by Sunday.

tga type checking for 0, 32, 33 is added

Copy link
Owner

@Roughsketch Roughsketch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't check the image files themselves but here's a first pass of what stands out to me. Hopefully it's not too much.

src/formats/dds.rs Outdated Show resolved Hide resolved
src/formats/dds.rs Outdated Show resolved Hide resolved
src/formats/dds.rs Outdated Show resolved Hide resolved
src/formats/exr.rs Outdated Show resolved Hide resolved
src/formats/exr.rs Outdated Show resolved Hide resolved
src/formats/tga.rs Outdated Show resolved Hide resolved
Ok(ImageSize { width, height })
}

pub fn matches(header: &[u8]) -> bool {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is the most tricky part since I really want to avoid false positives.

Using Paul Bourke's TGA format specification I think there's a better way to verify if a header is TGA.

For example, lets take image_type 9:

Current implementation:

  • image_type is a valid number (9)
  • colormap_type of 0 and 1 are both valid

This means any file with the 3rd byte being 9 and the 2nd byte being 0 or 1 will be a TGA file.

My proposed extension of this would be using the information at the bottom of the page I linked:

  • image_type is 9 which is an expected value
  • With image_type 9, colormap_type must be 1 so 0 would be invalid/not TGA
  • Byte 8 (Color Map Entry Size) must be one of 16/24/32
  • Byte 18 (Image Descriptor Byte) bits 0-3 should be 0/1 if byte 8 is 16, 0 if byte 8 is 24, and 8 if byte 8 is 32. Bit 4 should also be 0.

I think this sort of cross referencing based on image_type would give a lot more confidence in an image being TGA or not. Unfortunately Paul's reference only has examples for types 1, 2, 9, and 10. I haven't really done digging to see if there are examples for 0, 3, 11, 32, or 33 yet.

If references for those are hard to come by I would prefer if this starts support with 1, 2, 9, and 10 with a comment about the other values and maybe when references are found we can add them.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way this is something I might want to benchmark, but if doing a seek to the end of the file and reading the TGA 2.0 footer doesn't take too long that is also an option that is much more robust. This does not handle earlier than 2.0 TGA files though.

If you want to try this out do something like:

reader.seek(SeekFrom::End(18));
// Check that next 16 bytes are TRUEVISION-XFILE
// Optionally also check that byte 17 is . and byte 18 is null

I'm not sure what happens if the reader is less than 18 bytes though.

src/util.rs Outdated Show resolved Hide resolved
tests/qoi.rs Outdated Show resolved Hide resolved

#[test]
fn exr_test() {
let dim = size("tests/images/tga/test.tga").unwrap();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't mention this in the larger TGA post but if we start checking files based on image_type, then we likely would want to source TGA files of varying types to verify they all work correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tga is sooooo tricky.... why they define a header like that...

@Roughsketch
Copy link
Owner

Thanks for the updates! I think most everything looks fine now except for the TGA matching stuff.

I'm wondering if it may be better to split TGA into its own PR since I believe the other stuff is fine. Alternatively I can merge the TGA stuff as-is and disable it internally until either you or I take a pass at it. I'm still not fully sure what to do for it but I don't want to hold up the other formats.

If you're done with everything besides TGA then the next step I want to do is run a fuzzer over the new formats to ensure nothing breaks and then after that it should be good to merge.

@AllenDang
Copy link
Contributor Author

Yes, it's done on my side

@Roughsketch
Copy link
Owner

@AllenDang I made a PR into your branch with the fuzzer fixes. If it looks good to you go ahead and merge it and then I can do final tests and merge this PR. I'll probably try to figure out what to do with TGA and HDR before releasing it as a version. Hopefully can get it out this weekend.

Apply fuzzer fixes and minor changes
@AllenDang
Copy link
Contributor Author

@AllenDang I made a PR into your branch with the fuzzer fixes. If it looks good to you go ahead and merge it and then I can do final tests and merge this PR. I'll probably try to figure out what to do with TGA and HDR before releasing it as a version. Hopefully can get it out this weekend.

It's done! Thanks for the PR.

@Roughsketch
Copy link
Owner

Thanks again for the PR! Will try to get this up this weekend if I have time to revisit the TGA stuff at the very least.

@Roughsketch Roughsketch merged commit c959a1e into Roughsketch:master May 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants