-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider combining "foo" "bar" into single *StringNode. #1479
Comments
Right now, if you have Of course that doesn't address the issue of post-processing. But I'm guessing it wouldn't be worth increasing the size of the serialized buffer? |
@kddnewton are you saying the long term goal will be to keep source in order to build source? I believe this will make us have to abandon the idea of lazy method evaluation if so (e.g. we will be keeping serialized + source in memory). If it ends up in a constant pool then it will then become less memory. The interpolated case will be less memory (alhough admittedly much less common) either way. This could be a case of something serialization perhaps want but the main parser doesn't? Has compile.c tackled these mixed interpolated cases yet? I am curious if they prefer this as-is. |
@kddnewton I may be speaking on this wrong. I see that loadString will populate from the source when making the tree so we can do lazy method building but it means that we need to fully generate its AST. I am ok with this and you can ignore my previous comment past it being aspirational. By aspirational I mean the dream would be to get to be able to save offsets to a method and not generate a tree for that section. In that case we would need strings to be in a pool and not source. |
@kddnewton ok another idea. StringConcat is always normal Strings and when mixed with an InterpolatedString it is combined into the InterpolatedString during parse? Then it would make the complexity much less. It should reduce interpolated cases size (and just be somethign we already can parse). |
I think there are a couple of things we can take away from this that can move us toward the aspirational goal:
As for combining them all, the parser gem does that and I think it would be possible. What they do is have a parent "a #{b} c" "d" "e #{f} g" would get parsed as:
This works because we still get to keep all of the opening/closing/location information for the child strings, and it removes the need to have a string concat node at all so at least it can be consistently compiled. What do you think of that approach? |
Re lazy parsing my intention is to not deserialize at all (i.e. not read the serialized bytes of) lazy methods (until that method is executed). That can mean either keeping the source around, or rereading it from disk, or indeed populating the constant pool eagerly, which is probably the best/simplest. An alternative would be when serializing to copy & embed any string/constant pool entry in the serialized output and therefore not need the source at all when deserializing. What do you think of this alternative? I think it might be the best and also be fairly straightforward to implement. |
So that means replacing |
This is my aspirational goal as well. It may be that we eagerly put strings into the node when serializing in the first place (and still need to retain the constant pool).
I also believe you mean as @eregon just said that means using InterpolatedStringNode (DStr is this with legacy parsers). I think that is a reasonable solution. |
The build logic for this is pretty ugly:
Legacy parsers do this combination during parse. The only reason I can think of not to add this enhancement is syntax tooling probably needs to know this. With that said this:
I keep thinking we need to have two profiles (semantics, syntax) during parse which does the thing each type of tool wants.
The text was updated successfully, but these errors were encountered: