Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mysterious case of vars dropped in iteration after rewrite #982

Closed
anderseknert opened this issue Aug 12, 2024 · 7 comments
Closed

Mysterious case of vars dropped in iteration after rewrite #982

anderseknert opened this issue Aug 12, 2024 · 7 comments

Comments

@anderseknert
Copy link
Member

anderseknert commented Aug 12, 2024

I was hoping to be able to reproduce this in OPA only, but so far I've not been able to do so. Therefore I'll file an issue here, and we can see where to take it later. This is by all means a bug somewhere, but since it doesn't affect Regal as the code currently is shipped, I won't label it as such.

In an effort to reduce the number of walk calls made, I've tried to replace two "static" rules with one dynamic. The code below is found in the search.rego file.

Old

found.refs[rule_index] contains value if {
    some i, rule in _rules

    # converting to string until https://github.com/open-policy-agent/opa/issues/6736 is fixed
    rule_index := sprintf("%d", [i])

    walk(rule, [_, value])

    is_ref(value)
}

found.symbols[rule_index] contains value.symbols if {
    some i, rule in _rules

    # converting to string until https://github.com/open-policy-agent/opa/issues/6736 is fixed
    rule_index := sprintf("%d", [i])

    walk(rule, [_, value])
}

New

found[type][rule_index] contains item if {
    some i, rule in _rules

    # converting to string until https://github.com/open-policy-agent/opa/issues/6736 is fixed
    rule_index := sprintf("%d", [i])

    walk(rule, [_, value])

    [type, item] := refs_or_symbols(value)
}

refs_or_symbols(value) := ["refs", value] if {
    is_ref(value)
    not value.symbols
}

refs_or_symbols(value) := ["symbols", value.symbols]

While this seems to work, the remaining "static" rule now shows some truly weird behavior:

found.vars[rule_index][context] contains var if {
    some i, rule in _rules

    # converting to string until https://github.com/open-policy-agent/opa/issues/6736 is fixed
    rule_index := sprintf("%d", [i])

    walk(rule, [path, value])

    some context, vars in _find_vars(value, regal.last(path))
    some var in vars
}

Storing the result of _find_vars(value, regal.last(path)) shows a few vars found in the input AST. However, printing the value of the context in the some iteration below shows only a few of those.

found.vars[rule_index][context] contains var if {
    some i, rule in _rules

    # converting to string until https://github.com/open-policy-agent/opa/issues/6736 is fixed
    rule_index := sprintf("%d", [i])

    walk(rule, [path, value])

    fv := _find_vars(value, regal.last(path))

    print(fv) # will print all variables as expected

    some context, vars in fv

    print(context) # will only print a few variables from fv

    some var in vars
}

Example input to reproduce:

p.rego

package p

import rego.v1

allow if {
    some x
    input[x]
}

Now run:

go run main.go lint --enable-print p.rego

Before the change, the output of print will be:

{"some": [{"location": {"col": 7, "row": 6, "text": "eA=="}, "type": "var", "value": "x"}]}
some
{"ref": [{"location": {"col": 8, "row": 7, "text": "eA=="}, "type": "var", "value": "x"}]}
ref

As expected. After the change, the output is:

{"some": [{"location": {"col": 7, "row": 6, "text": "eA=="}, "type": "var", "value": "x"}]}
{"ref": [{"location": {"col": 8, "row": 7, "text": "eA=="}, "type": "var", "value": "x"}]}
ref

As seen, the some var that should have been bound to context dissapeared somewhere.

@srenatus
Copy link
Member

I've been zooming in on what the compiler does to data.regal.ast.found. Without the change (old), we end up with this:

found.vars[__local290__][__local291__] contains __local295__ if {
	__local289__ = data.regal.ast._rules[__local288__]
	sprintf("%d", [__local288__], __local1972__)
	__local290__ = __local1972__
	walk(__local289__, [path, value])
	regal.last(path, __local1973__)
	data.regal.ast._find_vars(value, __local1973__, __local1974__)
	__local292__ = __local1974__[__local291__]
	__local295__ = __local292__[__local294__]
}

found.refs[__local298__] contains value if {
	__local297__ = data.regal.ast._rules[__local296__]
	sprintf("%d", [__local296__], __local1975__)
	__local298__ = __local1975__
	walk(__local297__, [_, value])
	data.regal.ast.is_ref(value)
}

found.symbols[__local301__] contains __local3060__ if {
	__local300__ = data.regal.ast._rules[__local299__]
	sprintf("%d", [__local299__], __local1976__)
	__local301__ = __local1976__
	walk(__local300__, [_, value])
	__local3060__ = value.symbols
}

With the change, we get:

found.vars[__local290__][__local291__] contains __local295__ if {
	__local289__ = data.regal.ast._rules[__local288__]
	sprintf("%d", [__local288__], __local1973__)
	__local290__ = __local1973__
	walk(__local289__, [path, value])
	regal.last(path, __local1974__)
	data.regal.ast._find_vars(value, __local1974__, __local1975__)
	__local292__ = __local1975__[__local291__]
	__local295__ = __local292__[__local294__]
}

found[__local299__][__local298__] contains __local300__ if {
	__local297__ = data.regal.ast._rules[__local296__]
	sprintf("%d", [__local296__], __local1976__)
	__local298__ = __local1976__
	walk(__local297__, [_, value])
	data.regal.ast.refs_or_symbols(value, __local1977__)
	[__local299__, __local300__] = __local1977__
}

Now, we know from looking at this in our zoom call, that for the second eval, this is what failed:

	__local292__ = __local1975__[__local291__]

(it's the some context, vars in fv). So that means when read head rules overlap like this, the evaluator has a problem with scoping thins correctly: it's keeping the previous iterations bindings when it shouldn't.

⚠️ None of this is new information, just putting down what we talked about. Maybe @johanfylling has an idea about the eval change there ☝️

@anderseknert
Copy link
Member Author

anderseknert commented Aug 13, 2024

I spent some time just now trying to narrow this one down somewhat, and while I still haven't got a clue as to what the cause of this is, I've at least found that:

  1. It's not a concurrency issue. Removing all the code launching goroutines for linting doesn't make any difference

  2. Disabling all other rules "fixes" the issue

go run main.go lint --enable-print --disable-all --enable use-some-for-output-vars  p.rego
{"some"}
some
{"ref"}
ref
{"ref"}
ref
1 file linted. No violations found.
  1. Updating the query sent to make a direct reference to data.regal.ast.found.vars "fixes" the issue

i.e. going from this

	lintQuery = ast.MustParseBody(`lint := {
		"violations": data.regal.main.lint.violations,
		"notices": data.regal.main.lint.notices,
	}`)

to this

	lintQuery = ast.MustParseBody(`lint := {
		"violations": data.regal.main.lint.violations,
		"notices": data.regal.main.lint.notices,
		"vars": data.regal.ast.found.vars,
	}`)
go run main.go lint --enable-print p.rego
{"some"}
some
{"ref"}
ref
{"ref"}
ref
1 file linted. No violations found.

I don't know what to make of this, but perhaps it'll help someone else get closer to a resolution of this mystery... :) CC @johanfylling @srenatus

@anderseknert
Copy link
Member Author

Starting to really narrow it down now! Turns out there's a single rule that when enabled triggers this, and when disabled things work as expected again.

go run main.go lint --enable-print --disable unused-output-variable  p.rego
{"some"}
some
{"ref"}
ref
{"ref"}
ref
1 file linted. No violations found.

There are many rules that query ast.found.vars, but this linter rule does it differently, and that seems to be the cause of the issue:

➜ ack ast.found.vars
bundle/regal/rules/bugs/unused_output_variable.rego
45:	var := ast.found.vars[rule_index].ref[_]

bundle/regal/rules/bugs/var_shadows_builtin.rego
11:	var := ast.found.vars[_][_][_]

bundle/regal/rules/style/prefer_snake_case.rego
20:	var := ast.found.vars[_][_][_]

bundle/regal/rules/testing/metasyntactic_variable.rego
46:	var := ast.found.vars[i][_][_]

bundle/regal/rules/custom/naming_convention.rego
84:	var := ast.found.vars[_][_][_]

@johanfylling
Copy link
Member

I wonder if this isn't a virtual-cache thing 🤔 ..

@anderseknert
Copy link
Member Author

anderseknert commented Aug 13, 2024

Finally able to reproduce this in OPA: open-policy-agent/opa#6926

Since this isn't an issue in any current code used in this project, I'll close this issue. Definitely coming back to this when fixed in OPA though!

@srenatus
Copy link
Member

Time to revisit this, perhaps? 🤔

@anderseknert
Copy link
Member Author

Yes, too much to do right now, but once we have OPA v0.68.0 out and about, let's do that!

And create a new issue about it being much slower than walking twice, lol. But could be that the virtual caching fixed that too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants