Add Yul Optimizer Step to Evaluate Yul constant Functions #15358

Open

quangloc99 wants to merge 55 commits into develop

Conversation

quangloc99

Issue with Constants

Constants in Solidity are generally assumed to be hardcoded, with their values replaced wherever they are used. However, this isn't entirely accurate. According to the Solidity documentation:

> For a constant variable, the expression assigned to it is copied to all the places where it is accessed and also re-evaluated each time. This allows for local optimizations.

While the documentation correctly states that the expression is copied, in many situations the copied expression cannot actually be optimized. Consider the following examples:

  • Constant that Stores a Power:
pragma solidity ^0.8.24;
contract Test {
    uint256 private constant A = 9;
    uint256 private constant B = 4;
    uint256 private constant C = B ** A;

    function test(uint256 x) public pure returns (uint256) {
        return x / C;
    }
}

In this example, the constant C is the result of a power operation, which is a checked operation. Because checked exponentiation is implemented with a helper that uses the leave statement, it is not inlined into the test function. As a result, the division cannot be optimized into a bit shift, which would save gas.
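
As a rough sketch (hand-written Yul with simplified names, not actual compiler output), the code generated for test looks roughly like this, and the division never gets a literal operand:

// illustrative only: real compiler output uses longer generated names,
// and the body of the checked helper is omitted here
function fun_test(x) -> ret {
    // B ** A is computed at runtime through a checked helper that uses
    // `leave` internally, so the inliner does not inline the call...
    let c := checked_exp(4, 9)
    // ...and the simplifier never sees div(x, 262144), which it could
    // otherwise rewrite as shr(18, x)
    ret := div(x, c)
}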

  • Multiple Constants Dependent on Each Other:
pragma solidity ^0.8.26;

contract Test {
    uint256 public constant A0 = 8;
    uint256 public constant A1 = A0 * 2;
    uint256 public constant A2 = A1 * 2;
    uint256 public constant A3 = A2 * 2;
    uint256 public constant A4 = A3 * 2;
    uint256 public constant A5 = A4 * 2;
    uint256 public constant A6 = A5 * 2;
    uint256 public constant A7 = A6 * 2;
    uint256 public constant A8 = A7 * 2;
    uint256 public constant A9 = A8 * 2;
    uint256 public constant A10 = A9 * 2;

    function test(uint256 x) public pure returns (uint256) {
        return x / A10;
    }
}

In this case, all constants are marked as public and used multiple times. Since the multiplication is also checked, the expressions used to compute these constants are too complex to be inlined and optimized.

This subtle detail affects how code is written and optimized. Contract writers often need to compute constants beforehand and hardcode the values into the contract, typically with a comment explaining the original expression. This approach is suboptimal, as a small change to one constant requires multiple manual updates.

Simply put, the experience of using constants in Solidity does not align with typical user expectations. Due to the lack of preprocessing and further optimization capabilities, the code is either unoptimized or painful to write.

Solution: New Optimization Step

To address this issue, I introduced a new optimization step called constantFunctionEvaluator. This component scans all functions without arguments and attempts to execute them. The execution is performed using the existing interpreter from the test suite.

If, during execution, the interpreter encounters no built-in function calls that could introduce side effects (such as mstore, mload, call, create, address, etc.), the component concludes that the function is constant. It then replaces the function's body with assignments of the evaluated values to the corresponding return variables.
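
A minimal before/after sketch of the rewrite in hand-written Yul (assuming the helper call is evaluated without hitting any side-effecting built-in; names are made up):

// before
function constant_C() -> ret {
    ret := checked_exp(4, 9)
}

// after the step
function constant_C() -> ret {
    ret := 262144
}

The now-trivial body can then be inlined and folded further by later optimizer steps.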

Implementation

The existing interpreter was moved from the tests directory to the source directory libyul. To restrict which built-ins can be called, the interpreter and expression evaluator were overridden to include additional checks. This interpreter is then used in the new optimization component.

Impact

  • This optimization step resolves the mentioned issues with constants. It reduces each constant's expression to a single number, allowing the optimizer to further inline and simplify the expressions.
  • In the Yul world, a function does not need to be marked pure to be evaluated. Thanks to the interpreter, the actual code flow is evaluated rather than relying on static analysis. This means some functions can even call side-effecting built-ins, yet still be simplified by this component if those built-ins are not encountered (see the sketch after this list).
  • Although this optimization step targets constants, contract writers can also benefit from it in (pure) functions. As long as the function does not accept any arguments and its execution flow does not introduce any side effects, it can be optimized.
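
As a contrived, hand-written example of the second point above: the sstore below is never reached, so interpretation succeeds even though a purely static analysis would flag the function as side-effecting:

function f() -> r {
    // the condition is always zero, so the side-effecting branch is skipped
    if iszero(1) { sstore(0, 1) }
    r := 42
}

The step can therefore still replace the body with r := 42.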

I hope this PR is helpful and can contribute to further optimizations. With the interpreter, optimizations such as evaluating expressions with only constant parameters can also be performed. I hope this PR serves as a proof of concept for that.

@ekpyron
Member

ekpyron commented Aug 28, 2024

Thanks a lot for the PR! The analysis and reasoning as well as the execution are very impressive!

The most important concern we have here is that the interpreter the new optimizer step relies on has so far been used solely for testing and fuzzing, and its correctness is not considered critical. Using it for optimizations like in this PR will make it critical, so merging this would first require reviewing the interpreter for the necessary robustness on the defined subset of Yul.

As you already note in the PR description, so far optimization like this is supposed to happen via inlining and expression simplification, but as you correctly lay out, this doesn't work in all cases. The checked exponentiation function could actually be refactored s.t. it doesn't involve any leave anymore; we could also investigate that as a quicker path to some improvement here. However, we'd need to check what effects on cost such a change would have in other real-world cases.

But, as you also point out correctly, this method also relies on the inlining heuristics actually choosing to inline in the cases at hand. The current inlining heuristics aren't particularly good, but improving on them is tricky, especially since we also need to watch out for the number of local variables in functions after inlining to prevent running out of stack space too quickly. It could be possible to do some analysis to make an educated prediction about whether inlining and simplification will in fact resolve into a constant, but that would already be similar to interpretation - and interpreting may also have good performance metrics. So given all that, this new step is definitely worthwhile to consider!

Would you be interested in joining one of our design calls to discuss how to proceed here (see https://docs.soliditylang.org/en/v0.8.26/contributing.html#team-calls)? If that doesn't work for you, we can also continue the discussion asynchronously.

We're currently in the process of preparing a release that will likely happen next week, but at the latest after that, we'll try to properly evaluate what would need to be done to bring the interpreter into a state in which it can be used in this manner (respectively whether that's feasible).

@quangloc99
Author

Hi ekpyron. Thank you so much for taking the time and for your patience in reviewing this PR, especially considering it’s not a small one.

Regarding the interpreter, I would also like to help improve it, whether that means addressing change requests to the interpreter itself, or adding more tests to ensure its correctness. I understand that including the interpreter can enable the development of more powerful optimizations, but with great power comes great responsibility. I would love to see this PR in action, but it must be absolutely correct. Therefore, I want to give it as much support as I can.

Thanks for the team call invitation. Unfortunately, the timing is not suitable for me, so I would prefer to communicate asynchronously instead. I will try to keep you updated as much as possible.

Please let me know if there is anything else I should do for this PR, such as addressing change requests, adding documentation, or adding more tests.

Again, thank you so much for your time. I’m really grateful that this PR is being considered!

@matheusaaguiar
Collaborator

Hey @quangloc99 ! Thanks for the PR!
The Solidity team has discussed the next steps for this PR and, as per @ekpyron's recommendation, it would be best to write a new interpreter for the task rather than reusing the one from test. It should handle only control flow and plain "pure" opcodes (e.g. arithmetic or others that work purely on values from the stack). It should just abort on anything else, like memory access, reverts, etc. It should also not handle memory/storage/calldata/returndata.

@quangloc99
Author

Hi @matheusaaguiar! Thanks again for taking the time to review this PR!

I will proceed to write a new interpreter. My original intention was that if there was already an interpreter implementation with existing tests, then using the existing one would be a nice and safe approach. But now I understand that the original interpreter was only meant to be a quick tool, so it makes sense to create a proper one.

With this in mind, I want to lay out what I plan to do with the interpreter:

  • First of all, I will still use the existing one as a reference. I think the current one already handles control flow very well and includes all the necessary calculations for operations.
  • Secondly, I’ll try to improve the interpreter a bit. One thing that comes to mind is the way exceptions are used. In this case, exceptions are just used to halt the interpreting process early. This isn't ideal since we might want to recover from some exceptions. I’m considering using something like std::expected. Maybe boost's outcome or std::variant could also work. Other than that, I’ll aim to clean up other parts.

I will try to keep the new interpreter restricted but with room for future extensions. It would be great if, in the future, more analysis could be done to include memory access as well.

@quangloc99
Author

Oh, one more thing I just thought of. To test the new interpreter, should I add a new test suite like test/libyul/YulInterpreterTest.h?

@matheusaaguiar
Collaborator

matheusaaguiar commented Sep 11, 2024

> Hi @matheusaaguiar! Thanks again for taking the time to review this PR!
>
> I will proceed to write a new interpreter. My original intention was that if there was already an interpreter implementation with existing tests, then using the existing one would be a nice and safe approach. But now I understand that the original interpreter was only meant to be a quick tool, so it makes sense to create a proper one.
>
> With this in mind, I want to lay out what I plan to do with the interpreter:
>
> * First of all, I will still use the existing one as a reference. I think the current one already handles control flow very well and includes all the necessary calculations for operations.
>
> * Secondly, I’ll try to improve the interpreter a bit. One thing that comes to mind is the way exceptions are used. In this case, exceptions are just used to halt the interpreting process early. This isn't ideal since we might want to recover from some exceptions. I’m considering using something like `std::expected`. Maybe `boost`'s `outcome` or `std::variant` could also work. Other than that, I’ll aim to clean up other parts.
>
> I will try to keep the new interpreter restricted but with room for future extensions. It would be great if, in the future, more analysis could be done to include memory access as well.

Sounds like a good plan. The existing one is a good basis; you can definitely use it as a reference.
I think you could use std::variant. There are some places where we already use it to return errors instead of letting them bubble up.

> Oh, one more thing I just thought of. To test the new interpreter, should I add a new test suite like test/libyul/YulInterpreterTest.h?

Yes, I think you will have to implement a specific test suite. You can use YulInterpreterTest.h as a reference. Unless it can be covered by the existing functionality, but I guess it is better to have a separate one.

Feel free to ping me if you have any questions or comments :)

@matheusaaguiar
Collaborator

matheusaaguiar commented Sep 11, 2024

Hey @quangloc99
I guess I was too eager to answer and forgot to check the compilation runtime and gas costs. It seems like this PR is not affecting them significantly, either for good or bad.
@cameel already ran some tests and will probably post the results later, but first we should investigate why the proposed optimizer step is not having an impact here.
Maybe it needs to be inserted into the optimizer sequence before the Function Specializer (F)?

You can check the gas benchmarks by downloading the "summarized" artifacts generated by our CI job (e.g. PR-artifact and develop-artifact) and running ./scripts/externalTests/benchmark_diff.py --style humanized --output-format markdown table develop-artifact PR-artifact. That should give you a table comparing the gas costs using some real-world projects.

@matheusaaguiar
Collaborator

matheusaaguiar commented Sep 11, 2024

Gas cost benchmarks

ir-no-optimize

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0%
euler
gnosis
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% -0% 0%
zeppelin

ir-optimize-evm+yul

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi +0%
ens -0.04% ✅ 0% 0%
euler +0.04% ❌
gnosis
gp2 0%
pool-together 0%
uniswap +0.01% ❌
yield_liquidator 0% +0% 0%
zeppelin +0.02% ❌ +0.06% ❌ -0.06% ✅

ir-optimize-evm-only

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0% 0% 0%
euler
gnosis
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% 0% 0%
zeppelin 0%

legacy-no-optimize

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0%
euler 0%
gnosis 0%
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% +0% 0%
zeppelin 0% -0% -0.23% ✅

legacy-optimize-evm+yul

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0% +0% 0%
euler 0%
gnosis 0%
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% +0% 0%
zeppelin 0% +0% -0.02% ✅

legacy-optimize-evm-only

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0% 0% 0%
euler 0%
gnosis 0%
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% 0% 0%
zeppelin 0% +0% -0.07% ✅

!V = version mismatch
!B = no value in the "before" version
!A = no value in the "after" version
!T = one or both values were not numeric and could not be compared
-0 = very small negative value rounded to zero
+0 = very small positive value rounded to zero

@quangloc99
Author

> Hey @quangloc99 I guess I was too eager to answer and forgot to check the compilation runtime and gas costs. It seems like this PR is not affecting them significantly, either for good or bad. @cameel already ran some tests and will probably post the results later, but first we should investigate why the proposed optimizer step is not having an impact here. Maybe it needs to be inserted into the optimizer sequence before the Function Specializer (F)?
>
> You can check the gas benchmarks by downloading the "summarized" artifacts generated by our CI job (e.g. PR-artifact and develop-artifact) and running ./scripts/externalTests/benchmark_diff.py --style humanized --output-format markdown table develop-artifact PR-artifact. That should give you a table comparing the gas costs using some real-world projects.

I'm guessing that, up till now, every project has already hardcoded its constants as precomputed values instead of complex expressions. I'm sure contract writers are aware that the current optimizer cannot handle all the cases needed to inline constants effectively.

That's why I'm motivated to create this PR! I believe the compiler should handle the preprocessing better to let the contract writer write better, more readable code.

@quangloc99
Author

> Maybe it needs to be inserted into the optimizer sequence before the Function Specializer (F)?

This is a very good suggestion! Let me try that.

@matheusaaguiar
Collaborator

> > Maybe it needs to be inserted into the optimizer sequence before the Function Specializer (F)?
>
> This is a very good suggestion! Let me try that.

Sorry! I meant after the Function Specializer.

@cameel
Member

cameel commented Sep 11, 2024

Hello. As @matheusaaguiar said, I did some benchmarking here to see how helpful the new step is in practice. The general concept of the new step does make sense, so it's actually surprising how little effect it seems to have. Even if real-life projects optimize their constants manually, this should work not just for constants - I'd expect it to find lots of optimization opportunities in the helpers generated internally by the compiler. We should also see some changes in gas expectations in our semantic test cases since we definitely do not hand-optimize constants in those.

Something seems off and I think it's important to investigate and understand what's happening before we jump into making this production-ready. I can at least confirm that it does have a positive effect on code with constants so it's not that it's broken. But it does seem less effective than it could be. My initial guess was that running it before FunctionSpecializer prevents it from running on most calls with constant arguments, and it probably does, but I don't think this is enough, because I tried that and saw no improvement in benchmarks. I think it may be that it also needs to run before any inlining happens since in the current design that irreversibly removes some optimization opportunities (we have no optimizer step that can reverse inlining).

In general I think that making full use of it requires some experimentation with the sequence to see where it can have some actual impact. Also, it's worth considering whether making the step simpler and relying more on composability with other steps wouldn't yield better results. For example, reducing its scope from a whole function to a single statement.

Statements with constant values are much more common, so this would let the step simplify more and not be affected by things being inlined. The step would not be able to eliminate calls, but that's the job that the FullInliner can do for you. It would also not simplify calculations spread over multiple statements, but if you run steps like ExpressionSplitter and LiteralRematerialiser before, you'll get anything that's side-effect free combined into a single statement, so it won't make a difference. If you look at our existing steps, that's the approach we took with them and the reason why they have lots of seemingly odd limitations.
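
A rough illustration of that interplay in hand-written Yul (assuming the usual behaviour of the named steps; the exact sequence would need experimentation):

{
    // after ExpressionSplitter and Rematerialiser/LiteralRematerialiser,
    // constant subexpressions end up with literal arguments only:
    let a := mul(8, 2)
    let b := mul(a, 2)
}

An expression-level evaluation pass folds mul(8, 2) into 16; after rematerialising again, mul(a, 2) becomes mul(16, 2) and folds on the next pass, without the step ever having to reason about whole functions.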

Benchmarks

Here's some detail on what I checked. It wasn't very comprehensive and @matheusaaguiar has already posted the results from external tests, which are much more meaningful, but I have the results so I may just as well post them here.

I also checked the difference in gas expectations but since there was no difference, there's nothing to post :)

Bytecode size

From test/benchmarks/local.sh:

| File | Pipeline | 0.8.27 | this PR | this PR + (k after F) |
|---|---|---|---|---|
| verifier.sol | legacy | 4899 bytes | 4963 bytes | 4975 bytes |
| verifier.sol | via-ir | 4324 bytes | 4388 bytes | 4400 bytes |
| OptimizorClub.sol | legacy | | | |
| OptimizorClub.sol | via-ir | 22082 bytes | 22274 bytes | 22310 bytes |
| chains.sol | legacy | 5869 bytes | 5901 bytes | 5907 bytes |
| chains.sol | via-ir | 21401 bytes | 21433 bytes | 21439 bytes |

It shows that there is some effect, but it's rather small and negative. It's not very conclusive though, since your PR is not based on 0.8.27, so the difference could be from something that was merged in the meantime.

Compilation time

From test/benchmarks/external.sh:

| Project | 0.8.27 | this PR |
|---|---|---|
| openzeppelin | 37 s | 40 s |
| uniswap-v4 | 147 s | 147 s |
| eigenlayer | 674 s | 669 s |

At least the PR does not seem to affect compilation time in a significant way, which was one of our concerns, since we're very focused on speeding up compilation via IR right now. It could still be only due to the step not having many optimization opportunities so we should recheck later when we make it more effective.

Semantic test with gas usage

This is how I verified that the PR actually does have an effect. It's a semantic test that you can drop into a .sol file somewhere under test/libsolidity/semanticTests/. Or actually, a diff of such a test between current develop and your PR.

 contract C {}

 contract D {
     // Do something that eats a lot of gas to get the numbers over the minimum of 100k
     C c = new C();

     uint256 public constant A0 = 8;
     uint256 public constant A1 = A0 * 2;
     uint256 public constant A2 = A1 * 2;
     uint256 public constant A3 = A2 * 2;
     uint256 public constant A4 = A3 * 2;
     uint256 public constant A5 = A4 * 2;
     uint256 public constant A6 = A5 * 2;
     uint256 public constant A7 = A6 * 2;
     uint256 public constant A8 = A7 * 2;
     uint256 public constant A9 = A8 * 2;
     uint256 public constant A10 = A9 * 2;

     function f(uint256 x) public pure returns (uint256) {
         return x / A10;
     }
 }
 // ----
 // constructor() ->
-// gas irOptimized: 119400
-// gas irOptimized code: 123600
+// gas irOptimized: 117495
+// gas irOptimized code: 99800
 // gas legacy: 144563
 // gas legacy code: 438800
 // gas legacyOptimized: 121483
 // f(uint256): 1 -> 0

Something where most of the cost is coming from constants rather than something unrelated might be a better test though.

@cameel
Member

cameel commented Sep 11, 2024

Also, some notes on the things that were discussed above:

> The Solidity team has discussed the next steps for this PR and, as per @ekpyron's recommendation, it would be best to write a new interpreter for the task rather than reusing the one from test. It should handle only control flow and plain "pure" opcodes (e.g. arithmetic or others that work purely on values from the stack). It should just abort on anything else, like memory access, reverts, etc. It should also not handle memory/storage/calldata/returndata.

I'd like to clarify this a bit. This is what we currently think is the best approach, but the actual next step was supposed to be to take a closer look at the current interpreter and make sure this is really a viable approach. We don't want to get into a task that will be too big and distract too much from the main point, which is the new step. If it turns out to be like that, we may still consider a different approach.

And anyway, like I said above, IMO the focus should be first on making sure that the step is actually effective and also agreeing that the overall approach is good. I.e. get the big decisions out of the way before committing to spending the effort needed to get the details right.

> • Secondly, I’ll try to improve the interpreter a bit. One thing that comes to mind is the way exceptions are used. In this case, exceptions are just used to halt the interpreting process early. This isn't ideal since we might want to recover from some exceptions. I’m considering using something like std::expected. Maybe boost's outcome or std::variant could also work. Other than that, I’ll aim to clean up other parts.

Just a note that if you do try to do that, it's best to do it in a separate refactor PR. We generally prefer smaller PRs. Anything that can be extracted from a bigger PR and still makes sense on its own, should be a separate PR to reduce the scope of reviews (which is the biggest bottleneck for us).

Also, std::expected might be nice, but I see it's C++23. We're still on C++17 (with a plan to migrate to C++20 in the near future). We do have a custom Result class you could use for this though. A variant is fine too.

> Yes, I think you will have to implement a specific test suite. You can use YulInterpreterTest.h as a reference. Unless it can be covered by the existing functionality, but I guess it is better to have a separate one.

I think it would be best to reuse the current suite and simply make it execute tests using both interpreters. This way we get test coverage for both at the same time and can be sure both give the same answer. This is the approach we took in semantic tests to cover both code generation pipelines and it works very well there.

@quangloc99
Author

Hello @cameel! Thank you for taking the time to review and benchmark everything!

I’m grateful for your advice on how this PR should proceed. Based on your feedback, I’d like to share my thoughts and some of the decisions I made when creating this PR.

My motivation came from how constants are handled in Solidity. I was surprised when many of my constants weren’t inlined and were still being computed within the compiled program. This led me to investigate further, eventually resulting in this PR.

My guess as to why the benchmark results were weak remains the same. I believe that since the optimizer wasn’t able to effectively calculate, inline, and transform the code, contract developers were forced to manually compute their constants. Because of this, I think the benchmark was biased: there wasn’t a strong use of complex constant expressions, so there wasn’t much for this optimizer step to act on.

Personally, I don’t see this PR as a typical optimizer PR but more as an improvement to the development experience. With this PR, I expect developers will be able to write code that better expresses their intentions without having to focus so much on how the code gets optimized. The effect of this PR is more like a syntax change than an optimization step. It gives developers a new tool to work with, as I believe it changes the underlying semantics of constants: instead of inlining the expression, it inlines the value.

This change also leads to the question: if I want to introduce this new tool, where should I contribute? Initially, I thought I might contribute to libsolidity, as it provides the direct tools and interfaces developers use. However, contributing to libsolidity wasn’t straightforward (due to the complexity of Solidity expressions and the lack of an evaluator), and it also overlapped with the Yul optimizer. That’s why I decided to contribute to the Yul optimizer instead. In my opinion, this contribution fits well with the rest of the compiler components, with no conflicts, and this optimizer step does exactly what it says.

I understand that the interpreter is very powerful and can do much more than what’s demonstrated in this PR. However, I didn’t see it being used. With this PR, I also wanted to make the push for the interpreter usage, not just for constants or simple statements, but anywhere it’s applicable. This PR is just a proof of concept, showcasing one potential use for the interpreter. I believe there are many more ways to leverage it, but for now, I wanted to focus on this simple optimization.

I’ve been mindful of the review workload for the team, which is another reason I initially reused the interpreter from the tests. Not only did this reduce my effort, but I also thought it would make the review process easier. I apologize for the size of this PR. I tried to keep it as simple as I could.

@quangloc99
Author

quangloc99 commented Sep 12, 2024

I’d like to share my thoughts on the advice as well.

I like the idea of moving this optimizer step to evaluate statements instead of constant functions. I understand that this will give the optimizer step a broader optimization impact. However, even though these two approaches to evaluation are similar, I think there are some differences that need to be addressed:

  • Evaluating statements is not always the same as evaluating constant functions. Suppose I have three variables that call the same constant function. Evaluating the statement would lead to the constant function being evaluated multiple times, which would also impact the compiler's runtime. So, if we proceed with statement evaluation, I think we still need to keep constant function evaluation.
  • It seems to me that statement evaluation requires more effort. To achieve this, I would either need to inspect the entire statement to see if all the parameters are pure before evaluating, or I’d need to rely on the DataflowAnalyzer. Both approaches seem more complicated than what I proposed in this PR.
  • As I mentioned, constant function evaluation gives developers a tool they can use proactively, similar to other programming languages. Statement evaluation doesn’t offer developers as much flexibility.

As I mentioned before, I want to advocate for the use of the interpreter. If the interpreter gets accepted and used in production, I believe there will be more ways to utilize it than what is demonstrated in this PR.

Regarding the composability point, I will experiment as suggested. However, I believe keeping this optimizer step focused on constant functions is fine as it is. The ultimate composability is how the interpreter is used across different optimizer steps rather than this one alone. I also believe that in the future, even better optimization steps will be introduced and this one may be deprecated, which is fine as the compiler evolves.

@quangloc99
Author

With that aside, I want your advice on how to proceed with this PR. I have not made any additional changes other than restoring the original interpreter in test. The interpreter I added is still inside libyul, and I will make changes to it.

But for now, should I focus on experimentation and benchmarking, or should I proceed with the suggested implementation changes?

Also, for benchmarking, I want to run and see the gas usage report locally. I tried to run ./test/external_tests.py, but it seems like it requires some environment variables, such as the RPC key. Can you help guide me through running it, please?

@cameel
Member

cameel commented Sep 16, 2024

> With that aside, I want your advice on how to proceed with this PR. I have not made any additional changes other than restoring the original interpreter in test. The interpreter I added is still inside libyul, and I will make changes to it.

We discussed it again today and we think that reusing a subset of the current interpreter should be good enough after all, but it will still require some work to make sure it's reliable. If you can make it solid, we'd be happy to merge it and then you could use it as a base for your new step.

I created an issue for that (#15435) and added some details about the implementation, which hopefully answer your questions related to that part. If you have more, let's continue the discussion on that topic there.

> Personally, I don’t see this PR as a typical optimizer PR but more as an improvement to the development experience.

Fair enough, I now see where you're coming from. I just think we can achieve both with not that much extra effort. In many ways it would even be simpler. We can have a step that can make constants usable and also make the optimizer more powerful.

But for now we decided we're OK with such a limited step targeting constants as an initial step. We'd generally prefer it to become something more generic, usable in more optimization scenarios and composable with other steps (it would be a waste not to do that after having gone through the trouble of preparing the interpreter for production use), but we can think about it when we're done with this PR. Having something like this is already helpful. It's a welcome contribution.

Just a note that we'd rather mark it as experimental and keep it out of the default sequence at first. We're a bit wary of potential issues in the interpreter and we treat optimizer bugs very seriously, so we really want to see people use it carefully first and report back. Then we'd consider including it in the sequence.

> Initially, I thought I might contribute to libsolidity, as it provides the direct tools and interfaces developers use.

Doing it at the Yul level is definitely the better idea for a contribution. It makes the scope actually manageable. We have #3157 for the Solidity-level equivalent, but that's a much more complex task and is actually blocked on some design decisions. It's a very important point on our roadmap, but it'll take us some time to get there.

@cameel
Member

cameel commented Sep 16, 2024

As to your more specific questions about my suggestion to make the step simpler:

> Evaluating statements is not always the same as evaluating constant functions. Suppose I have three variables that call the same constant function. Evaluating the statement would lead to the constant function being evaluated multiple times, which would also impact the compiler's runtime. So, if we proceed with statement evaluation, I think we still need to keep constant function evaluation.

In that scenario you would first evaluate statements inside the function, which would be done once. If the function is simple enough to be eliminated by your step, then mine would reduce it to something like return <number>. Then the function would be simple enough to be inlined at the point of use of each constant by the inliner.
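
In sketch form (hand-written Yul with a made-up name), that flow looks like this:

function constant_A10() -> ret {
    // expression-level evaluation inside the body reduces it to a literal
    ret := 8192
}

The function is now trivial, so the inliner can replace every call site, e.g. let v := constant_A10(), with let v := 8192, and the constant is only ever evaluated once.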

> It seems to me that statement evaluation requires more effort. To achieve this, I would either need to inspect the entire statement to see if all the parameters are pure before evaluating, or I’d need to rely on the DataflowAnalyzer. Both approaches seem more complicated than what I proposed in this PR.

Looking at how simple the step is currently, perhaps it would be a bit more complex. But it would be more powerful and still not very complex overall, because it would rely on other steps to do the complex stuff. For example, it would not need to use the Dataflow Analyzer. You could run the Structural Simplifier first, which already uses the Dataflow Analyzer and would deal with any constant control flow. You would also not have to check if variables are pure - the Rematerialiser and Literal Rematerialiser would convert the code into a form where the final value is directly in the expression - you'd only check if it's a literal and skip the expression if not.

Also keep in mind that it would not necessarily happen all at once, some steps or patterns get repeated multiple times, because each pass may reveal new opportunities. For example you might run your step first to evaluate conditionals, then Structural Simplifier to unroll the ifs and loops and then Expression Joiner to join what you can into a single expression and finally your step again to evaluate things that now potentially got consolidated into longer constant expressions.

BTW, I initially said it should work at a statement level, but it should rather be expression level. I was thinking about this being necessary to handle control flow, but then realized that Structural Simplifier will do it.

> As I mentioned, constant function evaluation gives developers a tool they can use proactively, similar to other programming languages. Statement evaluation doesn’t offer developers as much flexibility.

I think you're still not seeing fully what I mean. Yes, the step on its own would only deal with expressions/statements. But it would be combined with other steps, and together they would be able to do what yours does and even more. Basically, your current step is closer in complexity to a fixed sequence of our current simple steps. Ours are more flexible, because we can combine them in many different ways, like LEGOs.

> Also, for benchmarking, I want to run and see the gas usage report locally. I tried to run ./test/external_tests.py, but it seems like it requires some environment variables, such as the RPC key. Can you help guide me through running it, please?

Well, this one is a pain to run locally. Here's how we run it in CI. But it should not normally require any RPC key so something must have gone wrong.

Maybe try to run it for a single project first, say, OpenZeppelin. Check if this works:

test/external_tests.py test \
    --solc-binary-type native \
    --solc-binary-path /path/to/solc \
    --run zeppelin

This will take a while to run and drop gas measurements in reports/externalTests/. Then you'll have to run scripts/externalTests/summarize_benchmarks.py and scripts/externalTests/benchmark_diff.py on them if you want a nice table like the one posted by @matheusaaguiar above.

But we usually don't run it locally because that's a royal pain to do for all ~10 projects. We go to the artifacts of c_ext_benchmarks job on CircleCI and just grab the file from there and run scripts/externalTests/benchmark_diff.py on that.

We also have scripts/externalTests/download_benchmarks.py that automates downloading them, but that one unfortunately does require an API key. It used to work without it but now CircleCI requires tokens.

@quangloc99
Author

Hi cameel!

I really appreciate your feedback and for welcoming this PR!

Regarding the open issue with the side-effect-free interpreter, I think the spec is clear. I’ll create a separate PR for that. Once it's ready, I’ll integrate the new interpreter into this PR and then we can wrap it up.

On the discussion side, I understand the LEGO analogy, and I know the current step seems very limited for now. It’s still hard to mix it with other steps to really push it to its potential. I would love to contribute to this in the future, but I agree that we should treat this PR as a base step and work on making it more powerful later.

I’m also curious about how to make this an experimental feature. How would that work? Would we add a flag to enable a new optimization sequence? And how would we gather feedback on it? I apologize if I’m asking too many questions, but I’m genuinely interested.

Once again, thank you for your feedback and for your patience with all my questions!

@cameel
Member

cameel commented Sep 18, 2024

> I’m also curious about how to make this an experimental feature.

There isn't that much to it. We don't have any specific mechanism for that (aside from the experimental pragma, but that's really only viable for language features, not compiler features). It's more conceptual really: such features are never enabled by default, they are usually undocumented, we do not provide a changelog for them, and they're not even guaranteed to actually work in a release. So it makes working on them less stringent until we decide to make them official. In case of your step, the most important aspect would be not including it in the default sequence.

> And how would we gather feedback on it?

We'll probably say something about it in release notes and also advertise it a bit using the usual channels, like the forum or social media. Just making it clear that it's experimental and people should not rely on it in production.
