Infinite Loop in ssa.(*builder).resolveAlias #2329

Closed
clarkmcc opened this issue Oct 5, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@clarkmcc

clarkmcc commented Oct 5, 2024

Describe the bug
When running the compiler on Linux arm64, my module, which normally runs in microseconds, gets stuck in a loop and never exits.
[image: flame graph]

I don't have a reproduction case, so what I'm mainly asking is: what could this mean? I'm using extism and compiling the module (not using the interpreter), with a filesystem cache. When I switch to the interpreter, the problem goes away.

On a side note, it seems problematic that context cancellation can't break me out of this infinite loop.

clarkmcc added the bug label Oct 5, 2024
@ncruces
Collaborator

ncruces commented Oct 5, 2024

Does this happen when compiling the module, or when running the compiled module?

Did you try any platforms other than Linux arm64?

What does it mean that it "normally" runs in microseconds?

If it's compiling the module, why don't you have a reproducer? Do you not have access to the Wasm, or are you not willing/allowed to share it?

@clarkmcc
Author

clarkmcc commented Oct 5, 2024

I assumed it was during the compilation phase just based on the flamegraph. But if that isn't actually a safe assumption, I'll do some more testing this week and answer all your questions. I only tested on Linux arm64 as that is the only target I intend to run it on.

Happy to provide the WASM module itself. The framework we have around running the module will be more challenging to share, but I'll see what I can do.

Thanks for the quick reply

@ncruces
Collaborator

ncruces commented Oct 5, 2024

It looks like it's under compilation given the flame graph, but since you said it “normally runs in microseconds”, I sought clarification.

From the flame graph it also seems like the same should happen on amd64, but testing that also helps narrow things down.

Another thing we can do is look at the file with blame to see when it last changed, and test with a version before that.

But if you can provide the Wasm, we can test it more easily, yes.

Answering another of your points, the context isn't really used to stop compilation, only runtime.

@mathetake
Member

Sorry, I tried to compile the binary (which I received from @ncruces) and it compiled pretty much instantly. I am closing this until you can give us a real repro.

@mathetake
Member

Also, could you make sure this reproduces with the latest main, and check which wazero version your extism version uses?

@clarkmcc
Author

clarkmcc commented Oct 5, 2024

Sure, I'll work on a repro. And just to clarify, is this working for you on linux/arm64 on the main wazero?

@mathetake
Member

mathetake commented Oct 5, 2024

It compiled pretty much instantly on all platforms, (darwin, linux) x (amd64, arm64). Also, FWIW, the code there is independent of the platform.

@clarkmcc
Author

clarkmcc commented Oct 5, 2024

So, a couple of interesting notes:

  • I can't reproduce this running the same code on darwin/arm64. I did notice that this code appears to be platform independent, but it is interesting that I can only reproduce on my linux/arm64 environment.
  • My linux/arm64 environment is actually an embedded device running armbian. It's a 4-core device with 1 GB of memory running good ole' Linux, so I'm not sure why that would matter, but maybe it is relevant. I'm sure it is going to make this nearly impossible for you to reproduce.

I'm working on upgrading to 1.8.1 and will test again to see if I can provide any more info.

I'm guessing there aren't any hints you can give me about what the resolveAlias function does that could help me pin down the conditions under which it would never break out of a loop?

@ncruces
Collaborator

ncruces commented Oct 5, 2024

If it's Linux-specific, which would be surprising, I can test on Monday under QEMU.


3 participants