Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fold modulo operator on constant values and raise zero remainder error quickly #2252

Closed
wants to merge 1 commit into from

Conversation

itchyny
Copy link
Contributor

@itchyny itchyny commented Jan 24, 2021

Currently constant division is folded on parsing and zero division raises Division by zero? error quickly. There is Remainder by zero? error message found in parser.y but this will never be thrown because constant_fold does not fold the modulo operator. So I added folding on constant modulo and make jq -n "1%0" throws the Remainder by zero? error and exits with 3 (just like jq -n "1/0"). The infinity is caught by block_is_const_inf. Returning infinity here may look wired (zero modulo is not infinity) but it applies to division as well; zero division in jq is not infinity but it's ok because the infinity is filtered out by block_is_const_inf. This follows the commit b9c2a32.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.002%) to 84.136% when pulling 3442d21 on itchyny:fold-modulo-constant into 80052e5 on stedolan:master.

@nicowilliams
Copy link
Contributor

Rebased to run tests.

@itchyny itchyny force-pushed the fold-modulo-constant branch from a60140b to 45a0831 Compare July 25, 2023 00:44
@itchyny
Copy link
Contributor Author

itchyny commented Jul 25, 2023

I also updated the constant folding tests.

@itchyny itchyny added this to the 1.7 release milestone Jul 25, 2023
Copy link
Member

@wader wader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe parts of your commit message could be a code comment on why infinity is returned?

@@ -230,6 +230,7 @@ static block constant_fold(block a, block b, int op) {
case '-': res = jv_number(na - nb); break;
case '*': res = jv_number(na * nb); break;
case '/': res = jv_number(na / nb); break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to fix the division by zero here too. Just FYI.

@nicowilliams
Copy link
Contributor

Maybe parts of your commit message could be a code comment on why infinity is returned?

Oh, hmm, maybe there's a better way to deal with division by zero / modulo zero in this function: if the divisor is zero then return a no-op block so that the caller will not constant-fold this so that the jq program can get a run-time error instead of infinity. Or maybe we should try to make this a compile-time error. We'd need some way for constant_fold() to indicate "error" and then have the callers of gen_binop() FAIL().

What do you think, @itchyny?

@nicowilliams nicowilliams requested a review from emanuele6 July 27, 2023 16:49
@emanuele6
Copy link
Member

emanuele6 commented Jul 27, 2023

I agree, 100 % 0 should not be folded to infinity, it should either not be folded, or generate a compile time error

EDIT: I didn't realise INFINITY is used to signal errors to the caller when I wrote this comment, but yes, making consant_fold() return a no-op block sounds more sensible than using INFINITY, also because the way we are currently detecting division by zero with INFINITY relies on undefined behaviour.

@emanuele6
Copy link
Member

Also (maybe it does not matter much for constant folding since you can't have a constant nan, infinity, etc) things like jv_number_value(a) % jv_number_value(b) implicitly convert double values to integer values, which is undefined behaviour if converting nan and some other values.
Since in C, division/modulo by 0 is unspecified for all values, and casting double to int is unspecified for some values, we should probably have DIV() and MOD() macros or functions that handle all the edge cases, and use them everywhere instead of bare / and % when dealing with values returned by jv_number_value().

@itchyny
Copy link
Contributor Author

itchyny commented Jul 27, 2023

Ahh wait, zero division is undefined behavior in C? I didn't know that.

@emanuele6
Copy link
Member

emanuele6 commented Jul 27, 2023

I didn't know either! ^^

I only learned it a few days ago because OSS Fuzz tried to compile 0/0, and the constant folding triggered an UBSAN warning.

I wasn't able to reproduce the UBSAN warning locally for some reason, but I looked at the C standard, and apparently division/modulo with 0 as rhs is unspecified behaviour for all types.

@emanuele6
Copy link
Member

Since in C, division/modulo by 0 is unspecified for all values, and casting double to int is unspecified for some values, we should probably have DIV() and MOD() macros or functions that handle all the edge cases, and use them everywhere instead of bare / and % when dealing with values returned by jv_number_value().

Another alternative solution I like is to somehow export the f_plus, f_mod, f_divide, etc functions from src/builtin.c and use those for constant folding. Those functions should already handle all the edge cases, and do type checking by themselves, so if you use them we don't even have to duplicate the type checking and error messages parts. We can just have a look up table that calls f_plus for op == '+', f_divide for op == '/', etc. and then check if the return value is invalid, and if it is use that as error message.

@nicowilliams
Copy link
Contributor

My preference is that constant_fold() return gen_noop() whenever it's asked to divide by zero. This will push the handling of division by literal zero into run-time if this is all we do.

We could also then handle this as a compile-time error at the call-sites for gen_binop(), though that would be a bit more complex since we would need a convention for constant_fold() to return an error that gen_binop() can return and that the call-sites can check for. Since we're close to a release, I think I'd rather do the simple thing that is pushing the error to run-time by having constant_fold() return gen_noop() when the divisor is zero.

@@ -230,6 +230,7 @@ static block constant_fold(block a, block b, int op) {
case '-': res = jv_number(na - nb); break;
case '*': res = jv_number(na * nb); break;
case '/': res = jv_number(na / nb); break;
case '%': res = jv_number((intmax_t)nb == 0 ? INFINITY : (intmax_t)na % (intmax_t)nb); break;
Copy link
Contributor

@nicowilliams nicowilliams Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this instead:

diff --git a/src/parser.y b/src/parser.y
index 549f6f74..879774b4 100644
--- a/src/parser.y
+++ b/src/parser.y
@@ -230,7 +230,8 @@ static block constant_fold(block a, block b, int op) {
     case '+': res = jv_number(na + nb); break;
     case '-': res = jv_number(na - nb); break;
     case '*': res = jv_number(na * nb); break;
-    case '/': res = jv_number(na / nb); break;
+    case '/': if (nb == 0.0) return gen_noop(); res = jv_number(na / nb); break;
+    case '%': if (nb == 0.0) return gen_noop(); res = jv_number(na % nb); break;
     case EQ:  res = (cmp == 0 ? jv_true() : jv_false()); break;
     case NEQ: res = (cmp != 0 ? jv_true() : jv_false()); break;
     case '<': res = (cmp < 0 ? jv_true() : jv_false()); break;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turns division of a literal by a literal zero into a run-time error.

Copy link
Member

@emanuele6 emanuele6 Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nicowilliams @itchyny Yes, I say let's do that and stop treating division by zero (expressions that return INFINITY) compile time errors; that was just confusing.
Then let's also revert #2253, and close #2780 since we don't actually want 0/0 to return NaN.
After that the OSS fuzz issue should be fixed, and costant-folding's behaviour will be aligned with f_divide.

@itchyny Since you can't see it, the OSS fuzz issue is https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=60848 if you want to link it in the commit; the reproducer is jq '0/0' (it doesn't trigger UBSAN on my amd64 PC though)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants