Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect undefined remote types in a separate pass #380

Closed

Conversation

erszcz
Copy link
Collaborator

@erszcz erszcz commented Feb 8, 2022

This tries fixing #379 by introducing a separate pass over the AST.

Remaining tasks:

  • fix typelib:remove_pos to actually remove pos from all remote_type elements
  • fix the failing test case

@erszcz erszcz marked this pull request as draft February 8, 2022 22:09
Copy link
Collaborator

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems you like the idea of a separate pass. :-)

I think it is slightly more complex than this though. What about remote types that expand to other remote types? There may even be circular dependencies between remote types.

E.g. here is a call to gradualizer_lib:get_type_definition, which expands types and calls remove pos internally.

Instead of a completely separate pass, how about checking for undefined remote types (and undefined user types and records) just after importing a type or spec from gradualizer_lib and just before removing the pos from these types? That code is currently inside gradualizer_lib:get_type_definition but we can split it out again if you prefer.

  • fix typelib:remove_pos to actually remove pos from all remote_type elements

Doesn't it already? I.e. it removes the line and column, but it keeps the filename annotation in user types and records, which is good because it these are different in different modules.

@erszcz
Copy link
Collaborator Author

erszcz commented Feb 9, 2022

Seems you like the idea of a separate pass. :-)

Well, it was relatively easy to pull off, at the very least to check if it solves the problem :)

I think it is slightly more complex than this though. What about remote types that expand to other remote types?

Good point!

Instead of a completely separate pass, how about [...]

I was thinking of incorporating this check in the existing pass, but it wasn't clear to me where we could do it. I'll try your suggestion out.

  • fix typelib:remove_pos to actually remove pos from all remote_type elements

Doesn't it already?

It doesn't remove pos from Mod and Name:

Gradualizer/src/typelib.erl

Lines 119 to 121 in b216594

remove_pos({remote_type, _, [Mod, Name, Params]}) ->
Params1 = lists:map(fun remove_pos/1, Params),
{remote_type, erl_anno:new(0), [Mod, Name, Params1]};

which do have it according to the trace from #379:

{trace,<0.93.0>,call,
 {typechecker,normalize_rec,
  [{remote_type,0,
    [{atom,3,'Elixir.Access'},
     {atom,3,get},
     [{atom,0,'Elixir.List'},
      {remote_type,0,[{atom,3,'Elixir.String'},{atom,3,t},[]]}]]},
...

@zuiderkwast
Copy link
Collaborator

It doesn't remove pos from Mod and Name:

Good point! It should, so that two identical remote types will compare as identical.

@erszcz erszcz force-pushed the issues/379-remote-type-position-handling branch from ae15ad6 to d95fd9e Compare February 9, 2022 15:46
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 9, 2022

@zuiderkwast

I think it is slightly more complex than this though. What about remote types that expand to other remote types?

It seems to me this case is covered by tests in undefined_errors_test.erl, specifically by undefined_errors: normalize_remote_type() which calls undefined_errors_helper:und_ty(), the spec of which in turn refers back to undefined_errors:undefined_type(). The implementation from yesterday couldn't catch it, but today I added a clause which looks for remote function calls and checks types used in specs of those remote functions.

If that's what you had in mind, I think we could live with this solution in a separate pass, as there are plenty of other bugs waiting to be fixed, and this is good enough to make CI green. WDYT?

@erszcz erszcz marked this pull request as ready for review February 9, 2022 16:59
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 9, 2022

It seems to me this case is covered by tests in undefined_errors_test.erl.

Ok, I'm taking that back, it's not tested yet. The one I described above is testing reference to a remote type via a spec, not via remote type expansion.

@erszcz erszcz force-pushed the issues/379-remote-type-position-handling branch from 5901bec to 29e59ab Compare February 9, 2022 17:32
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 9, 2022

What about remote types that expand to other remote types?

Ok, with the latest changes we also explicitly handle that 👍

@erszcz
Copy link
Collaborator Author

erszcz commented Feb 11, 2022

@zuiderkwast Any feedback on this one? I think it's ready for merging.

What about remote types that expand to other remote types?

We handle that.

Instead of a completely separate pass, how about checking for undefined remote types (and undefined user types and records) just after importing a type or spec from gradualizer_lib and just before removing the pos from these types?

I considered that, but I don't think that anytime before actually calling type_check_forms we can say with certainty that all files which ought to be imported into the DB, are already imported. In other words, if we check for an undefined remote type in gradualizer_db when importing a file, it might be the case that the type is undefined now, but it's defined in the next file queued up for importing. Unless I misunderstood you?

Comment on lines +4842 to +4843
check_undefined_types(Forms) ->
gradualizer_lib:fold_ast(fun check_remote_type/2, [], Forms).
Copy link
Collaborator

@zuiderkwast zuiderkwast Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong, but I can't see that we handle the following:

  • remote types which expand to structures containing user types or records which are undefined
  • remote types which expand to structures containing more remote types Handled
  • user types and records belong to a remote module (i.e. they come from expanding a remote type) which refer to other remote types, user types and records
  • cycles of the above

Some of it could be solved if gradualizer_db would perform linting when parsing a module, but I don't think it does (yet). (Btw, that's another topic: Unify the gradualizer_db and typechecker code to parse and extracts stuff from a parse tree.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the difference between:

remote types which expand to structures containing user types or records which are undefined

and

user types and records belong to a remote module (i.e. they come from expanding a remote type) which refer to other remote types, user types and records

be? Could you give an example?

I've added a number of tests - could you confirm if they capture what you had in mind above? Apparently, there was no need to change the check itself.

Copy link
Collaborator

@zuiderkwast zuiderkwast Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remote types which expand to structures containing user types or records which are undefined

-module(foo).
-export_type([t/0]).
-type t() :: {undef(), #undef{}}.

user types and records belong to a remote module (i.e. they come from expanding a remote type) which refer to other remote types, user types and records

-module(bar).
-export_type([t/0]).
-type t() :: [w() | #r{}].
-type w() :: {foo:undef()}.
-record(r, {f :: foo:undef()}).

If these are covered by tests, then I suspect the line and column are either zero (because I can't see that they're covered by the new pass, but only by the regular typechecking pass) or if they're non-zero, maybe line and column refer to the location in foo or bar where the error is rather than the module being type checked. We can probably report the line and column where foo:t() or bar:t() is used in the module being checked rather than where the error is in the foo and bar modules.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You were right, this work so far only improved one particular situation when the reference was to a remote type which was not exported. All the other reports were still based on type representation fetched from gradualizer_db where position information was already lost, so the reported lines were 0.

@erszcz erszcz force-pushed the issues/379-remote-type-position-handling branch from 50007a8 to 6835d36 Compare February 11, 2022 18:01
Remove position info when fetching types from gradualizer_db not when importing them.
This means the caller can decide if they want pos info available or removed.
This, in turn, means the check_undefined_types pass can fetch types with this info present,
while the main typechecker pass can discard position info for easy type comparison.
@erszcz erszcz force-pushed the issues/379-remote-type-position-handling branch from 96df3b5 to 81da18b Compare February 15, 2022 13:58
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 15, 2022

"Store position info in gradualizer_db" changes where type information is removed in gradualizer_db. Instead of removing it on import, it's kept in the DB and only removed before returning from the DB. The caller can decide whether to remove the position information or keep it. The main typechecker pass asks for removing it, the check_undefined_types pass asks to keep it.

Results from current master:

> gradualizer:type_check_file("test/misc/undefined_errors.erl", []).
test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type2/0 on line 0
test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type3/0 on line 0
test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type5/0 on line 0
test/misc/undefined_errors.erl: Undefined remote type undefined_errors_helper:not_exported_type/0 on line 0
test/misc/undefined_errors.erl: Undefined type undefined_type1/0 on line 0
test/misc/undefined_errors.erl: Undefined type undefined_type4/0 on line 0
test/misc/undefined_errors.erl: The atom on line 25 at column 41 is expected to have type #undefined_record1{} but it has type ok

-spec remote_struct_with_record() -> undefined_errors_helper:expands_to_struct_with_undefined_record().
remote_struct_with_record() -> {struct, ok}.
                                        ^^

test/misc/undefined_errors.erl: Call to undefined function undefined_errors_helper:undefined_call/0 on line 28 at column 41
test/misc/undefined_errors.erl: The function call on line 32 at column 20 is expected to have type #defined_record{} but it has type #undefined_record2{}

-record(defined_record, {a, b, c}).
-spec remote_record() -> #defined_record{}.
remote_record() -> undefined_errors_helper:und_rec().
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

nok
>
> gradualizer:type_check_file("priv/test/undefined_errors_helper.erl", []).
priv/test/undefined_errors_helper.erl: record undefined_record1 undefined on line 14 at column 61
priv/test/undefined_errors_helper.erl: record undefined_record2 undefined on line 16 at column 20
priv/test/undefined_errors_helper.erl: type undefined_type1() undefined on line 9 at column 14
priv/test/undefined_errors_helper.erl: type undefined_type4() undefined on line 13 at column 60
nok

Results with "Store position info in gradualizer_db":

> gradualizer:type_check_file("test/misc/undefined_errors.erl", []).
test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type2/0 on line 11
test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type3/0 on line 12
test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type5/0 on line 19
test/misc/undefined_errors.erl: Undefined remote type undefined_errors_helper:not_exported_type/0 on line 37 at column 25
test/misc/undefined_errors.erl: Undefined type undefined_type1/0 on line 0
test/misc/undefined_errors.erl: Undefined type undefined_type4/0 on line 0
test/misc/undefined_errors.erl: The atom on line 25 at column 41 is expected to have type #undefined_record1{} but it has type ok

-spec remote_struct_with_record() -> undefined_errors_helper:expands_to_struct_with_undefined_record().
remote_struct_with_record() -> {struct, ok}.
                                        ^^

test/misc/undefined_errors.erl: Call to undefined function undefined_errors_helper:undefined_call/0 on line 28 at column 41
test/misc/undefined_errors.erl: The function call on line 32 at column 20 is expected to have type #defined_record{} but it has type #undefined_record2{}

-record(defined_record, {a, b, c}).
-spec remote_record() -> #defined_record{}.
remote_record() -> undefined_errors_helper:und_rec().
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

nok
>
> gradualizer:type_check_file("priv/test/undefined_errors_helper.erl", []).
priv/test/undefined_errors_helper.erl: record undefined_record1 undefined on line 14 at column 61
priv/test/undefined_errors_helper.erl: record undefined_record2 undefined on line 16 at column 20
priv/test/undefined_errors_helper.erl: type undefined_type1() undefined on line 9 at column 14
priv/test/undefined_errors_helper.erl: type undefined_type4() undefined on line 13 at column 60
nok

Diff of the above:

--- remote-types.before.log	2022-02-15 14:44:11.000000000 +0100
+++ remote-types.after.log	2022-02-15 14:44:13.000000000 +0100
@@ -1,8 +1,8 @@
 > gradualizer:type_check_file("test/misc/undefined_errors.erl", []).
-test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type2/0 on line 0
-test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type3/0 on line 0
-test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type5/0 on line 0
-test/misc/undefined_errors.erl: Undefined remote type undefined_errors_helper:not_exported_type/0 on line 0
+test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type2/0 on line 11
+test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type3/0 on line 12
+test/misc/undefined_errors.erl: Undefined remote type undefined_errors:undefined_type5/0 on line 19
+test/misc/undefined_errors.erl: Undefined remote type undefined_errors_helper:not_exported_type/0 on line 37 at column 25
 test/misc/undefined_errors.erl: Undefined type undefined_type1/0 on line 0
 test/misc/undefined_errors.erl: Undefined type undefined_type4/0 on line 0
 test/misc/undefined_errors.erl: The atom on line 25 at column 41 is expected to have type #undefined_record1{} but it has type ok

As can be seen above we can still improve on:

 test/misc/undefined_errors.erl: Undefined type undefined_type1/0 on line 0
 test/misc/undefined_errors.erl: Undefined type undefined_type4/0 on line 0

and there's 1 test failing, but the direction is good 👍

@zuiderkwast
Copy link
Collaborator

I just added remove_pos calls to gradualizer_db in a PR very recently. :) TBH I don't like the direction with added complexity.

When we check mod.erl which refers to remote:t() which has some error, I think the error location should be that within mod.erl, so the position in gradualizer_db is not necessary. Those error positions will be returned when remote.erl is checked.

@erszcz
Copy link
Collaborator Author

erszcz commented Feb 15, 2022

I just added remove_pos calls to gradualizer_db in a PR very recently. :)

I realise that and I think it was a good move. I'm just thinking it might be needed in slightly different places in gradualizer_db, but I'm not fixed on this being the right or the only way to solve the problem.

When we check mod.erl which refers to remote:t() which has some error, I think the error location should be that within mod.erl, so the position in gradualizer_db is not necessary. Those error positions will be returned when remote.erl is checked.

Hmmm, so do you think we should provide this position information after we catch the thrown {undef, remote_type, ...}, but before we return from typechecker:type_check_forms? That might work 🤔

@zuiderkwast
Copy link
Collaborator

Hmmm, so do you think we should provide this position information after we catch the thrown {undef, remote_type, ...}, but before we return from typechecker:type_check_forms? That might work

Yes, that's what I mean, sort of. It doesn't necessarily have to be a catch. If we pass around the current location, we can throw a tuple with the right location when we get not_found from gradualizer_db. Example:

    case gradualizer_db:get_record_type(Module, RecName) of
        not_found ->
            throw({undef, record, Anno, {Module, RecName}});
            %% ... where Anno is within the currently checked module

If this is done in the separate pass, we probably also need to pass around a set to prevent cycles.

Then, we can hopefully remove all the not_found cases from the main typechecking pass, if we can avoid typechecking functions, specs and types which refer to undefined types.

@erszcz
Copy link
Collaborator Author

erszcz commented Feb 15, 2022

Superseded by #384.

@erszcz erszcz closed this Feb 15, 2022
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 15, 2022

It doesn't necessarily have to be a catch. If we pass around the current location [...]

Obviously. I looked if we could pass the spec down from type_check_function, but apparently it's not available there.

Using the function's own position info would be a bit lame, as it's not the same as that of the spec (it might be completely off, there's no requirement on specs actually being close to the functions they specify, see https://github.com/erlang/otp/blob/60497f1b78d9fc7f02e16b8dba035548abbf4e9f/lib/stdlib/src/otp_internal.hrl#L32-L36 for an example).

So the spec is only looked up in #env.fenv downstream of type_check_function. #env.fenv on the other hand is populated as early as in create_env. All in all, it seemed to be easier to catch and rethrow on the site of fetching the spec from #env.fenv - please see #384 for the actual solution. It's significantly simpler than this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants