Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function overloads with function args not narrowing inner function params #2758

Closed
CodeGreen0386 opened this issue Jul 14, 2024 · 4 comments · Fixed by #2765
Closed

Function overloads with function args not narrowing inner function params #2758

CodeGreen0386 opened this issue Jul 14, 2024 · 4 comments · Fixed by #2765

Comments

@CodeGreen0386
Copy link

How are you using the lua-language-server?

Visual Studio Code Extension (sumneko.lua)

Which OS are you using?

Windows

What is the issue affecting?

Annotations, Type Checking, Hover

Expected Behaviour

LuaLS correctly narrows a function overload based on the arguments of the function and the inner function params are also narrowed correctly.

Actual Behaviour

LuaLS either defaults to the base outer function definition, or narrows enough to get the hover over the inner function params correct, but type checking the params fail to narrow fields of said params.

Reproduction steps

I have no clue how to reproduce this, it just happens whenever it feels like it after a while and does not persist after a refresh. However, it still happens quite often and it is getting mildly annoying. Sometimes it happens after modifying the file a little bit, it just seems random.

Additional Notes

The Factorio modding community heavily uses automatically generated LuaLS annotations from the machine readable API documentation for Factorio. One function in particular, script.on_event(), seems to have endlessly caused trouble with LuaLS over the years and recently we have switched to using overloads, since that fit the purpose of the function better. Several of us have been encountering an issue with this method specifically, and nobody has been able to reproduce it consistently.

Here's a link to the API docs for the aforementioned function:
https://lua-api.factorio.com/latest/classes/LuaBootstrap.html#on_event

The first parameter, event, further referred to as the event ID, is either an enum value or a string, or an array of enum values or strings. For the purposes of this bug report, we can ignore the array, because it is irrelevant to the issue.

The second parameter, handler, is a function that takes one argument, also called event, which contains all of the information relevant to the event indicated by the enum. Each event ID has it's own overload for script.on_event, and each overload's event parameter is a separate class, all of which extend from a base class, EventData.

The third parameter, filters, is irrelevant to the bug.

Sometimes, for no apparent reason whatsoever, LuaLS fails to narrow the handler event parameter based on the event ID. The most commonly experienced bug is when someone tries to use event.player_index. A significant number of events have a field for player_index, an integer, but not all of them do, so it is not a part of the base EventData class. This means that if LuaLS ever fails to narrow an overload for one of these functions, event.player_index is a very common target for undefined-field warnings.

Here's a screenshot of the bug in action:
image

And here's a github gist for all of the related typedefs for script.on_event:
https://gist.github.com/CodeGreen0386/3f78b82dbeb7eadf07e66c983e6de637

Log File

No response

@justarandomgeek
Copy link

The third parameter, filters, is irrelevant to the bug.

well, it kind of is but it's mostly a repeat of the same thing as the second: it has narrower types when properly matched, or doesn't exist at all on some overloads.

I'd also note that our defines.events enum (like all of our enums) is a little odd specifically to make it an opaque enum - the actual numeric values are not supposed to be visible anywhere, so they're masked away to just their types here for the overload resolution to see. I suspect this is involved somehow, but i'm not sure why it would cause any trouble.

@tomlau10
Copy link
Contributor

tomlau10 commented Jul 15, 2024

I think I've been able to reproduce this issue more reliably.

Setup

First create a workspace with the following files:

  • .luarc.json
    this enables the inlay function param hint for better understanding what variable type that LuaLS is currently inferred.
{
    "hint.enable": true
}
  • meta.lua
    a minimized definition file modified from the provided gist
    each EventData sub class has a field of its name, with this we can know if the event type is correctly inferred
---@meta _

defines = {}

script = {
---@param event (defines.events)|(string)|(((defines.events)|(string))[])
---@param handler (fun(event:EventData))|(nil)
---@overload fun(event:string, handler:fun(event:EventData.CustomInputEvent))
---@overload fun(event:defines.events.on_ai_command_completed, handler:fun(event:EventData.on_ai_command_completed))
---@overload fun(event:defines.events.on_area_cloned, handler:fun(event:EventData.on_area_cloned))
---@overload fun(event:integer, handler:fun(event:EventData))
on_event = function(event, handler) end
}

---@enum defines.events
defines.events={
on_ai_command_completed=#{} --[[@as defines.events.on_ai_command_completed]],
on_area_cloned=#{} --[[@as defines.events.on_area_cloned]],
}

---@class EventData
---@field name defines.events
---@field tick integer
---@field mod_name? string

---@class EventData.CustomInputEvent:EventData
---@field CustomInputEvent true

---@class EventData.on_ai_command_completed:EventData
---@field on_ai_command_completed true

---@class EventData.on_area_cloned:EventData
---@field on_area_cloned true
  • test.lua
    yes, the event name is missing a d, but this just sets up the reproduction environment 🙂
local e = defines.events
script.on_event(e.on_ai_command_complete, function (event)
end)

Reproduction steps

  • Now with the above setup, e.on_ai_command_complete is actually undefined
    • so it is of unknown type
    • and from the hover preview of handler, it matches all function definitions and selected EventData as param type
    • somewhat expected ✅
  • Then we type the missing d => e.on_ai_command_completed
    • now it will be correctly displayed as defines.events.on_ai_command_completed
    • but the handler still matches all function definitions, type narrowing seems doesn't triggered ❌
  • ‼️ At this stage, any modification of the file seems to retrigger the type narrowing
    For example, just adding a trailing newline at the bottom
  • Save the file and reload vscode window now (use the Reload window cmd)
    • at file load, the handler type narrowing is correct
    • ‼️ But then, by just adding trailing a new line, the type narrow will stop working ... 🙄

Thoughts

  • I'm not very familiar with LuaLS's codebase, but just by skimming through the files, the logic of finding matched functions call seems to reside in vm.getExactMatchedFunctions
    function vm.getExactMatchedFunctions(func, args)
  • This returns an array of matched function signature (?), and its only caller is the matchCall() from here
    local function matchCall(source)
    local call = source.parent
    if not call
    or call.type ~= 'call'
    or call.node ~= source then
    return
    end
    local myNode = vm.getNode(source)
    if not myNode then
    return
    end
    local funcs = vm.getExactMatchedFunctions(source, call.args)
    if not funcs then
    return
    end
  • I tried to add a debug print there print("matchCall", #funcs)
    It correctly shows matched number when I type / delete the d in e.on_ai_command_completed
    • after I typed d to make it e.on_ai_command_completed => matchCall 2
    • then I deleted the d to revert it to e.on_ai_command_complete => matchCall 4
  • So this vm.getExactMatchedFunctions logic seems working correctly, and will trigger on every file change.
    Just that inferred view is not updated correctly 😕

@tomlau10
Copy link
Contributor

Some updates and findings on this issue:

But then, by just adding trailing a new line, the type narrow will stop working ...

By adding or removing trailing new lines on the above reproduction file that I provided, this issue randomly triggers 😕
Each time I add/remove newline, it may correct itself, or trigger the bug.

As far as I know, the inferred view of a source object is based on its node's view. And in order to change / add a view to a node, vm.setNode(src, node) is called. So I added some debugging print log / debug.traceback() there when the source object is matched.


Some interesting findings:

  • When the type narrowing is working, all the stacktraces start with:
	script/core/semantic-tokens.lua:893: in local 'callback'
	script/parser/guide.lua:710: in function 'parser.guide.eachSourceBetween'
	script/core/semantic-tokens.lua:888: in function 'core.semantic-tokens'
	script/provider/provider.lua:1116: in function <script/provider/provider.lua:1104>
	[C]: in function 'xpcall'
	script/proto/proto.lua:200: in function <script/proto/proto.lua:175>

and the node object will have 3 views in the end [function, doc.type.function*2]

  • On the other hand when the bug triggers, all the stacktraces start with:
	script/core/diagnostics/init.lua:119: in upvalue 'check'
	script/core/diagnostics/init.lua:181: in function 'core.diagnostics'
	[C]: in function 'xpcall'
	script/provider/diagnostic.lua:337: in function 'provider.diagnostic.doDiagnostic'
	[C]: in function 'xpcall'
	script/provider/diagnostic.lua:463: in function <script/provider/diagnostic.lua:457>

and the node object will have 6 views in the end [function, doc.type.function*5]


Each time the file is edited, the file will be re-parsed, and the diagnostics will be rechecked. And the above suggests that if a node recompilation is triggered by diagnostic, the final inferred view will be incorrect 😳 .

  • I then tried to delay the trigger of diagnostic by 1s here:
    await.call(function ()
    await.setID('diag:' .. uri)
    repeat
    await.sleep(0.1)
    until not m.isPaused()
    xpcall(m.doDiagnostic, log.error, uri)
    end)
    end

    changed to =>
await.call(function ()
    await.sleep(1) --<< sleep 1 second
    await.setID('diag:' .. uri)

🎉 The triggering chance of this random bug is greatly reduced (to <1% on each edit)


This is of course not a proper fix, but at worst may be a way to "mitigate" the effect of this bug. 🤔

@tomlau10
Copy link
Contributor

Good news 🎉 I have found the root cause of this issue. The logic doesn't clear node caches of call.args after doing the function type narrowing, thus their views are not recomputed based on the type narrowed function.

It's the call order of vm.compileNode() which matters. I guess in the problematic call order, the arg nodes are keeping the outdated cache of parent which has all overload signatures.

To fix this, the logic has to clear call.args node caches after setting the type narrowed node in matchCall():

if needRemove then
local newNode = myNode:copy()
newNode:removeNode(needRemove)
newNode.originNode = myNode
vm.setNode(source, newNode, true)

insert codes below at line 577 =>

        if call.args then
            -- clear node caches of args to allow recomputation with the type narrowed call
            for _, arg in ipairs(call.args) do
                vm.removeNode(arg)
            end
        end

I have tested this against the reproduction case above, and it solves the problem 🎉 Both the random bug triggered when adding / removing trailing newlines, as well as just after completed the typing e.on_ai_command_completed. I will soon open a PR for this 😄

Meanwhile can you guys test it out to confirm if it can solve the issue? Say by just modifying the source code of your installed extension locally. @CodeGreen0386 @justarandomgeek

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants