libyuv initial integration #4363

ToSeven · 2020-08-25T07:16:42Z

No description provided.

jonathanmetzman · 2020-08-25T23:55:08Z

projects/libyuv/libyuv_fuzzer.options

@@ -0,0 +1,2 @@
+[libfuzzer]


since AFL and Honggfuzz don't respect max_len we generally discourage doing this.
It's better to enforce this in LLVMFuzzerTestOneInput.

ok, got it.

In the line
libyuv::I420ToNV21(src_y_data, width, src_u_data, (width + 1) / 2, src_v_data, (width + 1) / 2, dst_y_data, width, dst_uv_data, width, width, height);

The 3rd to last value is a stride that may be affected by odd width subsampling.
It should be( (width + 1) / 2) * 2 or (width + 1) & ~1

Thank you for your reply! This is my mistake, so I have modified this line of code and show blow.
libyuv::I420ToNV21(src_y_data, width, src_u_data, (width + 1) / 2, src_v_data, (width + 1) / 2, dst_y_data, width, dst_uv_data,(width+1)/2*2, width, height);

jonathanmetzman · 2020-08-25T23:59:59Z

projects/libyuv/libyuv_fuzzer.cc

+unsigned int fastrand_seed;
+inline int fastrand()
+{
+    fastrand_seed = fastrand_seed * 214013u + 2531011u;


If i run this fuzzer today and then I run the same binary tomorrow, can the same testcase/input/data behave differently?
I think this would be a major problem if I understand this right. You will get unreproducible crashes and fuzzing won't work properly because different processes will have different ideas on what testcases are doing.
I think https://github.com/google/fuzzing/blob/master/docs/split-inputs.md might allow you to do something similar in a way that is deterministic wrt the input (data). Can you use this? Or please let me know if I'm misunderstanding.
I this is nondeterministic because you use srand to set fastrand_seed and then use this in for setting height and width. So actually to me it looks like the same data won't behave the same even on execs of LLVMFuzzerTestOneInput in the same process. libFuzzer targets should to be deterministic and stateless.

You are right! I have understanded this problem, and will read the split-inputs.md , fix this problem.

jonathanmetzman · 2020-08-26T00:03:40Z

Are you one of the maintainers of libyuv? Which one? If not could you please ask one of them to comment here saying they want this integration? Thanks!

ToSeven · 2020-08-26T03:45:22Z

Are you one of the maintainers of libyuv? Which one? If not could you please ask one of them to comment here saying they want this integration? Thanks!

I am not a maintainer, but I have planning to contribute my code to the official repository. Should I email the maintainers to comment here? or I firstly commit the code to libyuv then contribute it to oss-fuzz.

ToSeven · 2020-08-26T04:29:56Z

Additionally, Is *.dict neccesary for the fuzzing in libyuv? I do not know how to write the keyword in *.dict. Can you give me some advices？ thanks!

jonathanmetzman · 2020-08-26T18:54:04Z

Are you one of the maintainers of libyuv? Which one? If not could you please ask one of them to comment here saying they want this integration? Thanks!

I am not a maintainer, but I have planning to contribute my code to the official repository. Should I email the maintainers to comment here? or I firstly commit the code to libyuv then contribute it to oss-fuzz.

Please get them to comment here first. We don't allow integrating projects without maintainer permission since as soon as it's integrating they will start getting emails for every bug OSS-Fuzz finds and if they aren't ready to fix them, the bug can be made public without a fix (after 90 days).

jonathanmetzman · 2020-08-26T18:56:44Z

Additionally, Is *.dict neccesary for the fuzzing in libyuv? I do not know how to write the keyword in *.dict. Can you give me some advices？ thanks!

I don't really know what you mean by "Is *.dict ". Dictionaries aren't necessary but help a lot.
Here's a dict for JavaScript: https://github.com/google/fuzzing/blob/master/dictionaries/js.dict
more docs: http://llvm.org/docs/LibFuzzer.html#dictionaries

Also note that you don't need to do kw="blah" you can just write "blah" on it's on line in a dict.

ToSeven · 2020-08-28T02:54:10Z

@jonathanmetzman I have invited the maintainer to comment here. there is a question. If I modified my code and merged it to the OSS-Fuzz, Would I get a $1,000 bonus for the initial integration? I am a student and want to earn some money to relieve my parents to afford. so it is important for me.

inferno-chromium · 2020-09-07T07:41:56Z

@fbarchard @dalecurtis - are you ok with libyuv integration to OSS-Fuzz, can you please take a review pass on fuzzer code. Are you ok with receiving with libyuv bugs.

@ToSeven - can you fix the projects/libyuv/project.yaml conflict so that CI can run

ToSeven · 2020-09-07T08:40:40Z

@fbarchard @dalecurtis - are you ok with libyuv integration to OSS-Fuzz, can you please take a review pass on fuzzer code. Are you ok with receiving with libyuv bugs.

@ToSeven - can you fix the projects/libyuv/project.yaml conflict so that CI can run
Ok!

dalecurtis · 2020-09-08T18:21:59Z

Again defer to the project owner @fbarchard, but I think this is a good idea. lgtm

fbarchard · 2020-09-08T21:36:18Z

projects/libyuv/libyuv_fuzzer.cc

+        height = fastrand() % range_size ;
+        width = fastrand() % range_size;
+        kPixels = width * height;
+        kHalfPixels = (width + 1) / 2 * (height + 1) / 2;


kHalfPixels = (width + 1) / 2 * (height + 1) / 2; will evaluate the * before the / 2 ?
Suggest
kHalfPixels = ((width + 1) / 2) * ((height + 1) / 2);

fbarchard · 2020-09-08T21:42:40Z

projects/libyuv/project.yaml

@@ -4,10 +4,14 @@ primary_contact: "mikhal@webrtc.org"
 auto_ccs:
    - "fbarchard@google.com"
    - "frkoenig@google.com"
+    - "byone.heng@google.com"


this is not a valid email?

sorry, I mistake this email address. It should be byone.heng@gmail.com

fbarchard · 2020-09-08T21:43:31Z

projects/libyuv/project.yaml

 fuzzing_engines:
    - libfuzzer
    - afl
 sanitizers:
    - address
    - memory
    - undefined
+architectures:
+    - x86_64


libyuv supports arm and mips as well. Not sure if they are supported by fuzz

https://google.github.io/oss-fuzz/getting-started/new-project-guide/

fbarchard · 2020-09-08T21:49:39Z

I dont maintain the OSS version. There is a chromium git version and an internal version. If a bug is found/reported, I can fix it up stream, but it'll need to be reproduced without OSS/fuzz.

…

On Tue, Sep 8, 2020 at 11:22 AM Dale Curtis ***@***.***> wrote: Again defer to the project owner @fbarchard <https://github.com/fbarchard>, but I think this is a good idea. lgtm — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#4363 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADRZQDJVV36HCUAYZAJRX5LSEZY5NANCNFSM4QKKDYEA> .

fbarchard · 2020-09-11T05:21:29Z

projects/libyuv/libyuv_fuzzer.cc

+extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
+{
+
+    FuzzedDataProvider fuzzed_data(data, size);


This doesn't allocate enough memory for the source.
The amount of bytes consumed is
int range_size = sqrt((static_cast(size) + 1) / 2);
size_t width = fuzzed_data.ConsumeIntegralInRange(1, range_size);
size_t height = fuzzed_data.ConsumeIntegralInRange(1, range_size);
int kPixels = width * height;
int kHalfPixels = ((width + 1) / 2) * ((height + 1) / 2);
size_t new_size = kPixels + kHalfPixels * 2

libyuv::I420ToNV21 will consume new_size bytes if you pass these width and height parameters.

For example if you were passed a size of 1, your code computes width and height of 1.
the Y plane is 1x1 so 1 byte
the U and V planes are half width and height but round up to 1x1 so 1 byte each
total of 3 bytes consumed.

theres really no valid 1 byte YUV image. its 3 planes and each plane is a minimum of 1 byte.
if you're working backward from bytes, you'll need to be careful about how you compute width and height if you need to stay under 'size' bytes.
Your width and height assume 1 plane I think"?
Come up with a formula where width * height + ((width + 1) / 2) * ((height + 1) / 2) * 2 is <= size
e.g. for 4 bytes, width and height are still 1. or 2x1. but not 2x2. 2x2 takes 6 bytes.
it doesnt need to be square and for good code coverage it probably shouldnt be.
for example and interesting case is 1x16
a width of 1 requires the rows of u and v to be rounded out to 1
y = 1x16 = 16 bytes
u = 1x8
v = 1x8
total is 32 bytes.

I modify my code and show blew.

int range_size = (static_cast(size) + 1) / 2;
do
{
height = fuzzed_data.ConsumeIntegralInRange(0, range_size);
width = fuzzed_data.ConsumeIntegralInRange(0, range_size) ;
kPixels = width * height;
kHalfPixels = (width + 1) / 2 * (height + 1) / 2;
}while (kPixels + kHalfPixels * 2 > size);

fbarchard · 2020-09-11T05:23:22Z

projects/libyuv/libyuv_fuzzer.cc

+
+    FuzzedDataProvider fuzzed_data(data, size);
+
+    int range_size = sqrt((static_cast<int>(size) + 1) / 2);


consider non-square?
if you come up with a randomized width that is less than size you can compute height

fbarchard · 2020-09-11T05:24:53Z

projects/libyuv/libyuv_fuzzer.cc

+    if (cpu_flag == 1)
+    {
+        //enable all cpu specific optimizations.
+        libyuv::MaskCpuFlags(-1);


this only tests 1 instruction set - whatever the host machine has. e.g. avx2.
there is code for ssse3 and avx2 for most functions. as well as c.

Okay, got it.

fbarchard · 2020-09-11T05:33:44Z

projects/libyuv/libyuv_fuzzer.cc

+    const uint8_t *src_u_data = second_part.data();
+    const uint8_t *src_v_data = third_part.data();
+
+    libyuv::I420ToNV21(src_y_data, width, src_u_data, (width + 1) / 2, src_v_data, (width + 1) / 2, dst_y_data, width, dst_uv_data, ((width + 1) / 2) * 2, width, height);


this tests the simple case where width == stride.
the function will notice this and I420ToNV21 will treat the each plane as a single row.
e.g. the Y plane of I420 and NV21 are actually the same. If width = stride it will change the parameters internally to 0 for stride and width * height, 1 for width and height, making a single call to memcpy() for the entire image, testing very different code, even if the width was odd.
ideally you would test where width is less than stride, forcing the code to do 1 row at a time.
libyuv functions also allow a negative stride, which means the image will be inverted, but you will need to adjust the starting pointer if you care to test that.
the stride could be randomized to a value less than width, but the results of the image would be unpredictable and not all source bytes would be consumed.
the width could be randomized to be less than stride. it buffer size is determined by stride but you can process a subregion.

fbarchard · 2020-09-11T05:35:36Z

projects/libyuv/libyuv_fuzzer.cc

+    std::vector<uint8_t> second_part =
+        fuzzed_data.ConsumeBytes<uint8_t>(kHalfPixels);
+    std::vector<uint8_t> third_part =
+        fuzzed_data.ConsumeRemainingBytes<uint8_t>();


The function expects to consume kHalfPixels worth of bytes, but due to size being too small, this vector wont allocate enough?

ToSeven · 2020-09-18T08:06:49Z

Please review my code, thanks!

fbarchard · 2020-09-22T04:00:32Z

projects/libyuv/fuzz_common.cc

+    cpu_info = -1;
+
+#if defined(__arm__) || defined(__aarch64__)
+    return cpu_info &= ~(random_num & 0x04);


4 is not safe... it could change in future.
It would be safe to literally mask the flags with a random number, disabling parts of the cpuinfo.
e.g. disable avx.
Its safe to disable anything, just not enable anything.

fbarchard · 2020-09-22T04:01:59Z

projects/libyuv/libyuv_fuzzer.cc

+        kPixels = width * height;
+        kHalfPixels = (width + 1) / 2 * (height + 1) / 2;
+    }
+    while (kPixels + kHalfPixels * 2 > size);


This loop effectively doesnt test odd widths, since they round the size up. It wont crash but it wont test interesting values

The 44th line should be kHalfPixels=(( width + 1) / 2 )*(( height + 1 ) / 2 ), I forgot to modify it.
I am sorry that I can't understand your meaning. ”widths“ is generated randomly by the fuzzer, and sometimes is odd. so what do "widths" mean? Could you tell me in more detail? just give me an example. thanks!

Please give me some more details. I am always waiting for your news. Thanks!

@fbarchard I not sure whether you are recently busy or not, but if I disturbed you, I am sorry. I look forward to hearing from you about more details on the issue soon.

I'm not the best person to review fuzz targets. An owner of the oss-fuzz should approve if they have someone willing to take on the support for the fuzz target itself.
I think you've got the basic call to the I420ToNV21 function right, but I think the fuzz end of things is wrong.
Someone familiar with fuzz should review. There seems to be 2 issues

you are given a 'size' in bytes, and from that you attempt to come up with a width and height that fit. But if they dont, you loop. It would be better to not have a loop, but compute a width and height that are expected to work.
For example come up with a width that is random and less than size or sqrt of size, and then calculate a height that will make an image less than size bytes.

you allocate 'size' bytes for destination.
thats not strictly right. the source and destination formats are different.
In general it should compute how many bytes the destination will require based on the width and height.
I assume in fuzz we dont care if the destination size is larger than 'size' which is used for source. It should allocate a buffer that is large enough.

fbarchard · 2020-10-08T05:25:07Z

projects/libyuv/libyuv_fuzzer.cc

+        kPixels = width * height;
+        kHalfPixels = (width + 1) / 2 * (height + 1) / 2;
+    }
+    while (kPixels + kHalfPixels * 2 > size);


I'm not the best person to review fuzz targets. An owner of the oss-fuzz should approve if they have someone willing to take on the support for the fuzz target itself.
I think you've got the basic call to the I420ToNV21 function right, but I think the fuzz end of things is wrong.
Someone familiar with fuzz should review. There seems to be 2 issues

you are given a 'size' in bytes, and from that you attempt to come up with a width and height that fit. But if they dont, you loop. It would be better to not have a loop, but compute a width and height that are expected to work.
For example come up with a width that is random and less than size or sqrt of size, and then calculate a height that will make an image less than size bytes.

you allocate 'size' bytes for destination.
thats not strictly right. the source and destination formats are different.
In general it should compute how many bytes the destination will require based on the width and height.
I assume in fuzz we dont care if the destination size is larger than 'size' which is used for source. It should allocate a buffer that is large enough.

fbarchard · 2020-10-08T05:30:26Z

projects/libyuv/libyuv_fuzzer.cc

+    int stride = width;
+    width = fuzzed_data.ConsumeIntegralInRange(0, width);
+
+    libyuv::I420ToNV21(src_y_data, stride, src_u_data, (stride + 1) / 2, src_v_data, (stride + 1) / 2, dst_y_data, stride, dst_uv_data, ((width + 1) / 2) * 2, width, height);


(stride + 1) / 2 is misleading.
a stride is how many bytes to advance from row to row.
You could compute the stride based on width or come up with some stride variables.
src_y_stride = width;
src_uv_stride = (width + 1) / 2;
dst_y_stride = width;
dst_uv_stride = ((width + 1) / 2) * 2;

The dst uv stride you didn't round. if the width is odd, the stride needs to be rounded up.
e.g. width =3
dst_uv_stride is 4.

ToSeven · 2020-10-13T11:29:23Z

@jonathanmetzman @fbarchard @inferno-chromium Please review my code, Thanks!

ToSeven · 2020-10-28T09:34:36Z

Can someone review my code? I have waited for several weeks.....

ToSeven · 2020-11-05T02:01:24Z

@fbarchard I know you are not the best person to review fuzz targets, but I can not find anyone to review my code. you only need to check if my code has problems with the libyuv. If okay, I will contact the owner of the oss-fuzz to review fuzz targets. I really want to contribute my code to the open-source world. Thank you!

ToSeven · 2020-12-12T10:23:38Z

@jonathanmetzman, please review my code, thanks!

jonathanmetzman reviewed Aug 25, 2020

View reviewed changes

jonathanmetzman reviewed Aug 26, 2020

View reviewed changes

ToSeven mentioned this pull request Sep 3, 2020

Potential new projects to integrate with OSS-Fuzz #402

Closed

44 tasks

DavidKorczynski mentioned this pull request Sep 6, 2020

[libyuv] MJPEG codec fuzzer #4411

Closed

fbarchard reviewed Sep 8, 2020

View reviewed changes

ToSeven force-pushed the master branch 2 times, most recently from 28dab49 to f141025 Compare September 10, 2020 09:55

fbarchard reviewed Sep 11, 2020

View reviewed changes

ToSeven force-pushed the master branch from f141025 to 536fced Compare September 15, 2020 01:31

fbarchard reviewed Sep 22, 2020

View reviewed changes

ToSeven changed the title ~~libyuv initial interagation~~ libyuv initial integration Sep 25, 2020

fbarchard reviewed Oct 8, 2020

View reviewed changes

libyuv initial integration

f86f29d

ToSeven force-pushed the master branch from 536fced to f86f29d Compare October 13, 2020 10:55

Merge branch 'google:master' into master

aebcbe4

oliverchang closed this Dec 1, 2022

DavidKorczynski mentioned this pull request Jan 3, 2024

Initial libyuv integration #11436

Open


		FuzzedDataProvider fuzzed_data(data, size);

		int range_size = sqrt((static_cast<int>(size) + 1) / 2);

libyuv initial integration #4363

libyuv initial integration #4363

Conversation

ToSeven commented Aug 25, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jonathanmetzman Aug 25, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jonathanmetzman commented Aug 26, 2020

ToSeven commented Aug 26, 2020 • edited by jonathanmetzman Loading

ToSeven commented Aug 26, 2020

jonathanmetzman commented Aug 26, 2020

jonathanmetzman commented Aug 26, 2020

ToSeven commented Aug 28, 2020 • edited Loading

inferno-chromium commented Sep 7, 2020

ToSeven commented Sep 7, 2020

dalecurtis commented Sep 8, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ToSeven Sep 9, 2020 • edited Loading

Choose a reason for hiding this comment

fbarchard commented Sep 8, 2020 via email

Choose a reason for hiding this comment

ToSeven Sep 11, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ToSeven commented Sep 18, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ToSeven commented Oct 13, 2020

ToSeven commented Oct 28, 2020

ToSeven commented Nov 5, 2020

ToSeven commented Dec 12, 2020

jonathanmetzman Aug 25, 2020 •

edited

Loading

ToSeven commented Aug 26, 2020 •

edited by jonathanmetzman

Loading

ToSeven commented Aug 28, 2020 •

edited

Loading

ToSeven Sep 9, 2020 •

edited

Loading

ToSeven Sep 11, 2020 •

edited

Loading