Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libyuv initial integration #4363

Closed
wants to merge 2 commits into from
Closed

libyuv initial integration #4363

wants to merge 2 commits into from

Conversation

ToSeven
Copy link
Contributor

@ToSeven ToSeven commented Aug 25, 2020

No description provided.

@@ -0,0 +1,2 @@
[libfuzzer]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since AFL and Honggfuzz don't respect max_len we generally discourage doing this.
It's better to enforce this in LLVMFuzzerTestOneInput.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, got it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the line
libyuv::I420ToNV21(src_y_data, width, src_u_data, (width + 1) / 2, src_v_data, (width + 1) / 2, dst_y_data, width, dst_uv_data, width, width, height);

The 3rd to last value is a stride that may be affected by odd width subsampling.
It should be( (width + 1) / 2) * 2 or (width + 1) & ~1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your reply! This is my mistake, so I have modified this line of code and show blow.
libyuv::I420ToNV21(src_y_data, width, src_u_data, (width + 1) / 2, src_v_data, (width + 1) / 2, dst_y_data, width, dst_uv_data,(width+1)/2*2, width, height);

unsigned int fastrand_seed;
inline int fastrand()
{
fastrand_seed = fastrand_seed * 214013u + 2531011u;
Copy link
Contributor

@jonathanmetzman jonathanmetzman Aug 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i run this fuzzer today and then I run the same binary tomorrow, can the same testcase/input/data behave differently?
I think this would be a major problem if I understand this right. You will get unreproducible crashes and fuzzing won't work properly because different processes will have different ideas on what testcases are doing.
I think https://github.com/google/fuzzing/blob/master/docs/split-inputs.md might allow you to do something similar in a way that is deterministic wrt the input (data). Can you use this? Or please let me know if I'm misunderstanding.
I this is nondeterministic because you use srand to set fastrand_seed and then use this in for setting height and width. So actually to me it looks like the same data won't behave the same even on execs of LLVMFuzzerTestOneInput in the same process. libFuzzer targets should to be deterministic and stateless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right! I have understanded this problem, and will read the split-inputs.md , fix this problem.

@jonathanmetzman
Copy link
Contributor

Are you one of the maintainers of libyuv? Which one? If not could you please ask one of them to comment here saying they want this integration? Thanks!

@ToSeven
Copy link
Contributor Author

ToSeven commented Aug 26, 2020

Are you one of the maintainers of libyuv? Which one? If not could you please ask one of them to comment here saying they want this integration? Thanks!

I am not a maintainer, but I have planning to contribute my code to the official repository. Should I email the maintainers to comment here? or I firstly commit the code to libyuv then contribute it to oss-fuzz.

@ToSeven
Copy link
Contributor Author

ToSeven commented Aug 26, 2020

Additionally, Is *.dict neccesary for the fuzzing in libyuv? I do not know how to write the keyword in *.dict. Can you give me some advices? thanks!

@jonathanmetzman
Copy link
Contributor

Are you one of the maintainers of libyuv? Which one? If not could you please ask one of them to comment here saying they want this integration? Thanks!

I am not a maintainer, but I have planning to contribute my code to the official repository. Should I email the maintainers to comment here? or I firstly commit the code to libyuv then contribute it to oss-fuzz.

Please get them to comment here first. We don't allow integrating projects without maintainer permission since as soon as it's integrating they will start getting emails for every bug OSS-Fuzz finds and if they aren't ready to fix them, the bug can be made public without a fix (after 90 days).

@jonathanmetzman
Copy link
Contributor

Additionally, Is *.dict neccesary for the fuzzing in libyuv? I do not know how to write the keyword in *.dict. Can you give me some advices? thanks!

I don't really know what you mean by "Is *.dict ". Dictionaries aren't necessary but help a lot.
Here's a dict for JavaScript: https://github.com/google/fuzzing/blob/master/dictionaries/js.dict
more docs: http://llvm.org/docs/LibFuzzer.html#dictionaries

Also note that you don't need to do kw="blah" you can just write "blah" on it's on line in a dict.

@ToSeven
Copy link
Contributor Author

ToSeven commented Aug 28, 2020

@jonathanmetzman I have invited the maintainer to comment here. there is a question. If I modified my code and merged it to the OSS-Fuzz, Would I get a $1,000 bonus for the initial integration? I am a student and want to earn some money to relieve my parents to afford. so it is important for me.

@inferno-chromium
Copy link
Collaborator

@fbarchard @dalecurtis - are you ok with libyuv integration to OSS-Fuzz, can you please take a review pass on fuzzer code. Are you ok with receiving with libyuv bugs.

@ToSeven - can you fix the projects/libyuv/project.yaml conflict so that CI can run

@ToSeven
Copy link
Contributor Author

ToSeven commented Sep 7, 2020

@fbarchard @dalecurtis - are you ok with libyuv integration to OSS-Fuzz, can you please take a review pass on fuzzer code. Are you ok with receiving with libyuv bugs.

@ToSeven - can you fix the projects/libyuv/project.yaml conflict so that CI can run
Ok!

@dalecurtis
Copy link
Member

Again defer to the project owner @fbarchard, but I think this is a good idea. lgtm

height = fastrand() % range_size ;
width = fastrand() % range_size;
kPixels = width * height;
kHalfPixels = (width + 1) / 2 * (height + 1) / 2;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kHalfPixels = (width + 1) / 2 * (height + 1) / 2; will evaluate the * before the / 2 ?
Suggest
kHalfPixels = ((width + 1) / 2) * ((height + 1) / 2);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -4,10 +4,14 @@ primary_contact: "mikhal@webrtc.org"
auto_ccs:
- "fbarchard@google.com"
- "frkoenig@google.com"
- "byone.heng@google.com"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a valid email?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I mistake this email address. It should be byone.heng@gmail.com

fuzzing_engines:
- libfuzzer
- afl
sanitizers:
- address
- memory
- undefined
architectures:
- x86_64

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libyuv supports arm and mips as well. Not sure if they are supported by fuzz

Copy link
Contributor Author

@ToSeven ToSeven Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fbarchard
Copy link

fbarchard commented Sep 8, 2020 via email

@ToSeven ToSeven force-pushed the master branch 2 times, most recently from 28dab49 to f141025 Compare September 10, 2020 09:55
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{

FuzzedDataProvider fuzzed_data(data, size);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't allocate enough memory for the source.
The amount of bytes consumed is
int range_size = sqrt((static_cast(size) + 1) / 2);
size_t width = fuzzed_data.ConsumeIntegralInRange(1, range_size);
size_t height = fuzzed_data.ConsumeIntegralInRange(1, range_size);
int kPixels = width * height;
int kHalfPixels = ((width + 1) / 2) * ((height + 1) / 2);
size_t new_size = kPixels + kHalfPixels * 2

libyuv::I420ToNV21 will consume new_size bytes if you pass these width and height parameters.

For example if you were passed a size of 1, your code computes width and height of 1.
the Y plane is 1x1 so 1 byte
the U and V planes are half width and height but round up to 1x1 so 1 byte each
total of 3 bytes consumed.

theres really no valid 1 byte YUV image. its 3 planes and each plane is a minimum of 1 byte.
if you're working backward from bytes, you'll need to be careful about how you compute width and height if you need to stay under 'size' bytes.
Your width and height assume 1 plane I think"?
Come up with a formula where width * height + ((width + 1) / 2) * ((height + 1) / 2) * 2 is <= size
e.g. for 4 bytes, width and height are still 1. or 2x1. but not 2x2. 2x2 takes 6 bytes.
it doesnt need to be square and for good code coverage it probably shouldnt be.
for example and interesting case is 1x16
a width of 1 requires the rows of u and v to be rounded out to 1
y = 1x16 = 16 bytes
u = 1x8
v = 1x8
total is 32 bytes.

Copy link
Contributor Author

@ToSeven ToSeven Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modify my code and show blew.

int range_size = (static_cast(size) + 1) / 2;
do
{
height = fuzzed_data.ConsumeIntegralInRange(0, range_size);
width = fuzzed_data.ConsumeIntegralInRange(0, range_size) ;
kPixels = width * height;
kHalfPixels = (width + 1) / 2 * (height + 1) / 2;
}while (kPixels + kHalfPixels * 2 > size);


FuzzedDataProvider fuzzed_data(data, size);

int range_size = sqrt((static_cast<int>(size) + 1) / 2);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider non-square?
if you come up with a randomized width that is less than size you can compute height

if (cpu_flag == 1)
{
//enable all cpu specific optimizations.
libyuv::MaskCpuFlags(-1);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this only tests 1 instruction set - whatever the host machine has. e.g. avx2.
there is code for ssse3 and avx2 for most functions. as well as c.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, got it.

const uint8_t *src_u_data = second_part.data();
const uint8_t *src_v_data = third_part.data();

libyuv::I420ToNV21(src_y_data, width, src_u_data, (width + 1) / 2, src_v_data, (width + 1) / 2, dst_y_data, width, dst_uv_data, ((width + 1) / 2) * 2, width, height);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this tests the simple case where width == stride.
the function will notice this and I420ToNV21 will treat the each plane as a single row.
e.g. the Y plane of I420 and NV21 are actually the same. If width = stride it will change the parameters internally to 0 for stride and width * height, 1 for width and height, making a single call to memcpy() for the entire image, testing very different code, even if the width was odd.
ideally you would test where width is less than stride, forcing the code to do 1 row at a time.
libyuv functions also allow a negative stride, which means the image will be inverted, but you will need to adjust the starting pointer if you care to test that.
the stride could be randomized to a value less than width, but the results of the image would be unpredictable and not all source bytes would be consumed.
the width could be randomized to be less than stride. it buffer size is determined by stride but you can process a subregion.

std::vector<uint8_t> second_part =
fuzzed_data.ConsumeBytes<uint8_t>(kHalfPixels);
std::vector<uint8_t> third_part =
fuzzed_data.ConsumeRemainingBytes<uint8_t>();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function expects to consume kHalfPixels worth of bytes, but due to size being too small, this vector wont allocate enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@ToSeven
Copy link
Contributor Author

ToSeven commented Sep 18, 2020

Please review my code, thanks!

cpu_info = -1;

#if defined(__arm__) || defined(__aarch64__)
return cpu_info &= ~(random_num & 0x04);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 is not safe... it could change in future.
It would be safe to literally mask the flags with a random number, disabling parts of the cpuinfo.
e.g. disable avx.
Its safe to disable anything, just not enable anything.

kPixels = width * height;
kHalfPixels = (width + 1) / 2 * (height + 1) / 2;
}
while (kPixels + kHalfPixels * 2 > size);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop effectively doesnt test odd widths, since they round the size up. It wont crash but it wont test interesting values

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 44th line should be kHalfPixels=(( width + 1) / 2 )*(( height + 1 ) / 2 ), I forgot to modify it.
I am sorry that I can't understand your meaning. ”widths“ is generated randomly by the fuzzer, and sometimes is odd. so what do "widths" mean? Could you tell me in more detail? just give me an example. thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please give me some more details. I am always waiting for your news. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fbarchard I not sure whether you are recently busy or not, but if I disturbed you, I am sorry. I look forward to hearing from you about more details on the issue soon.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the best person to review fuzz targets. An owner of the oss-fuzz should approve if they have someone willing to take on the support for the fuzz target itself.
I think you've got the basic call to the I420ToNV21 function right, but I think the fuzz end of things is wrong.
Someone familiar with fuzz should review. There seems to be 2 issues

  1. you are given a 'size' in bytes, and from that you attempt to come up with a width and height that fit. But if they dont, you loop. It would be better to not have a loop, but compute a width and height that are expected to work.
    For example come up with a width that is random and less than size or sqrt of size, and then calculate a height that will make an image less than size bytes.

  2. you allocate 'size' bytes for destination.
    thats not strictly right. the source and destination formats are different.
    In general it should compute how many bytes the destination will require based on the width and height.
    I assume in fuzz we dont care if the destination size is larger than 'size' which is used for source. It should allocate a buffer that is large enough.

@ToSeven ToSeven changed the title libyuv initial interagation libyuv initial integration Sep 25, 2020
kPixels = width * height;
kHalfPixels = (width + 1) / 2 * (height + 1) / 2;
}
while (kPixels + kHalfPixels * 2 > size);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the best person to review fuzz targets. An owner of the oss-fuzz should approve if they have someone willing to take on the support for the fuzz target itself.
I think you've got the basic call to the I420ToNV21 function right, but I think the fuzz end of things is wrong.
Someone familiar with fuzz should review. There seems to be 2 issues

  1. you are given a 'size' in bytes, and from that you attempt to come up with a width and height that fit. But if they dont, you loop. It would be better to not have a loop, but compute a width and height that are expected to work.
    For example come up with a width that is random and less than size or sqrt of size, and then calculate a height that will make an image less than size bytes.

  2. you allocate 'size' bytes for destination.
    thats not strictly right. the source and destination formats are different.
    In general it should compute how many bytes the destination will require based on the width and height.
    I assume in fuzz we dont care if the destination size is larger than 'size' which is used for source. It should allocate a buffer that is large enough.

int stride = width;
width = fuzzed_data.ConsumeIntegralInRange(0, width);

libyuv::I420ToNV21(src_y_data, stride, src_u_data, (stride + 1) / 2, src_v_data, (stride + 1) / 2, dst_y_data, stride, dst_uv_data, ((width + 1) / 2) * 2, width, height);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(stride + 1) / 2 is misleading.
a stride is how many bytes to advance from row to row.
You could compute the stride based on width or come up with some stride variables.
src_y_stride = width;
src_uv_stride = (width + 1) / 2;
dst_y_stride = width;
dst_uv_stride = ((width + 1) / 2) * 2;

The dst uv stride you didn't round. if the width is odd, the stride needs to be rounded up.
e.g. width =3
dst_uv_stride is 4.

@ToSeven
Copy link
Contributor Author

ToSeven commented Oct 13, 2020

@jonathanmetzman @fbarchard @inferno-chromium Please review my code, Thanks!

@ToSeven
Copy link
Contributor Author

ToSeven commented Oct 28, 2020

Can someone review my code? I have waited for several weeks.....

@ToSeven
Copy link
Contributor Author

ToSeven commented Nov 5, 2020

@fbarchard I know you are not the best person to review fuzz targets, but I can not find anyone to review my code. you only need to check if my code has problems with the libyuv. If okay, I will contact the owner of the oss-fuzz to review fuzz targets. I really want to contribute my code to the open-source world. Thank you!

@ToSeven
Copy link
Contributor Author

ToSeven commented Dec 12, 2020

@jonathanmetzman, please review my code, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants