-
-
Notifications
You must be signed in to change notification settings - Fork 496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AST Node ID #4188
Comments
I'm planning to look into this question as part of review of semantic. I've suggested a solution in the meantime on rolldown/rolldown#1578 to unblock them. |
This comment was marked as resolved.
This comment was marked as resolved.
Another use case for oxc/crates/oxc_semantic/src/builder.rs Lines 263 to 276 in c65198f
|
Rolldown needs to implement |
I have suggested to them that they could use either:
fn addr_of_import(decl: &Box<ImportDeclaration>) -> usize {
&**decl as *const _ as usize
} Either of the above will work as a unique identifier, and neither depends on I'm not saying that we shouldn't add |
I may need to try it out myself at this point. I'll ask for a walk through to better understand the problem today. |
Let's decide on this one next week. I'm a little bit worn out on discussing about this but constantly coming up with hacks. |
I talked to @overlookmotel and here's what we agreed on: To minimize code change and support Rolldown's use case as a start point, we should
@rzvxa It seems like this task is going to be on you again, feel free to ask questions to clear uncertainties. |
I'm going to think about implementation over the course of this week and come back with a proposal. A useful first step would be to gather the use cases we have for Node IDs. Can anyone who is aware of such needs please comment below? For other discussion, we have a thread on Discord: https://discord.com/channels/1079625926024900739/1272506817238405160 |
AST node ids are super useful for caching data in a multi-pass compiler architecture. A few points:
Concrete examples:
|
Example: #4767 |
I attempted making
|
Lint rules use
I'm currently in favor of a single As a thought, if we want to store statement/expression-specific information, we could discriminate using an enum: // nodes.rs in oxc_semantic
enum SpecificNodeData<'a> {
Statement(StatementNodeData<'a>),
Expression(ExpressionNodeData<'a>)
}
struct AstNodes<'a> {
// ...
specific_data: IndexVec<AstNodeId, SpecificNodeData<'a>>
} |
If the goal is to add it to all the types that have an I propose we do the following: // since we no longer need the struct that had the same name before
trait AstNode {
fn ast_node_id(&self) -> AstNodeId;
}
impl From<AstNode> for AstNodeId {
fn from(node: &AstNodeId) -> AstNodeId {
node.ast_node_id()
}
}
impl<'a> AstNode for Statement<'a> {
fn ast_node_id(&self) -> AstNodeId {
self.node_id
}
}
impl<'a> AstNode for AstKind<'a> {
fn ast_node_id(&self) -> AstNodeId {
match self {
// kinds => kind.ast_node_id(),
}
}
} Then we can rewrite all the methods in the // from:
#[inline]
pub fn kind(&self, ast_node_id: AstNodeId) -> AstKind<'a> {
self.nodes[ast_node_id].kind
}
// to:
#[inline]
pub fn kind(&self, ast_node_id: impl Into<AstNodeId>) -> AstKind<'a> {
self.kinds[ast_node_id.into()]
} On the subject of specific ast IDs, We can use something like this to provide access to both types. // we put this in our structs
struct AstKindId<T> {
id: AstNodeId,
_marker: PhantomData<T>,
}
impl<T> AstNode for AstKindId<T> {
fn ast_node_id(&self) -> AstNodeId {
self.id
}
} Then we can use this generic type to do all sorts of type-safe stuff like this: pub fn reference_id(&self, ident_ref_id: AstKindId<IdentifierReference>) -> ReferenceId
// ...
} I'm not saying this is better, Both approaches have their pros and cons, But having a conversion from "more specific -> less specific" is possible, So we aren't losing that much, especially if we use the I'm looking forward to seeing what are your thoughts on it. |
Thanks all for input into this. @rzvxa Will come back on your proposals for APIs. But right now I'm still at the stage of trying to figure out what are all the use cases we want to support. I know you've brought some up previously - one related to CFG, if I remember right, but I think there were more. Could you please remind me? In particular, are there any things we want to do which are currently impossible, or extremely painful/expensive without Node IDs? |
@DonIsaac Do I understand right that both these things are already possible with what we have now? Is there anything in linter which is currently blocked but having node IDs stored inline in the AST types would make possible? |
Just to throw one of my own ideas into the mix (which may or may not be a good idea): If each AST type had node ID stored inline, This would also allow us to add UTF-16 spans and/or line/column spans without bloating the AST (#959) - they'd also go in side arrays. This may not be a good idea because it makes the AST less ergonomic to use (especially for those consuming Oxc as a library), but it does also have some positive effects. |
Correct, these are all currently possible. However, I don't think they'd be possible with an The main thing that is not possible right now is walking back up the AST in contexts where an AST node or an ID is not available. It's possible to work around this, but it would be quite helpful to have one on hand in each node. |
Thanks for coming back Don. Can you give an example of where walking back up the AST from a node (ancestors) is useful, and what is the current workaround? |
For examples of iterating up the AST, see no_unused_vars/usage.rs. Since iteration occurs over AstNodes, we have access to an AstNodeId on each loop iteration. For examples of where upwards traversal could be made cleaner with AstNodeIds in each struct, see no_useless_spread.rs. On my phone so I can't elaborate too much further on how/why, but you should find a redundant match on AstKind between the first check function and the second two. |
Here's my take on this, I think the root of our issue is that we can't walk down the tree using oxc/crates/oxc_linter/src/rules/eslint/no_fallthrough.rs Lines 443 to 477 in a226962
There is also the need for In my opinion, the most valuable thing is that we can now keep a bunch of secondary data all over the place - where we currently don't have access to |
@Boshen Can I just clarify: Is that really type for expressions, or is it type of bindings? i.e. could that info be in an array indexed by |
I'm a bit confused why this is problematic in parser. All |
I don't think so, but I'm not certain. AstBuilder is |
Yeah it was me that made |
close: #3943 ## Further improvements There is a double visit here. We need to collect all react hooks calling in `Function` and `ArrowFunctionExpression`. If we want to remove this implementation, we may wait for #4188. https://github.com/oxc-project/oxc/blob/d797e595d286c613848b773c256bd43124ad1981/crates/oxc_transformer/src/react/refresh.rs#L744-L947 ## Tests All tests copy from https://github.com/facebook/react/blob/main/packages/react-refresh/src/__tests__/ReactFresh-test.js There are still 4 tests that have not been passed **1. refresh/can-handle-implicit-arrow-returns/input.jsx** Related to #4767. transform correct, just output doesn't match the expected output **2. refresh/registers-identifiers-used-in-jsx-at-definition-site/input.jsx** **3. refresh/registers-identifiers-used-in-react-create-element-at-definition-site/input.jsx** Blocked by #4746 **4. refresh/supports-typescript-namespace-syntax/input.tsx** oxc transforms ts to js first, so probably we can ignore this case. If we really want to pass this test, we also need to turn off `TypeScript` plugin. ## What's next? ### Options: 1. Support transform `refresh_reg` and `refresh_sig` options to `MemberExpression`. Currently `import.meta.xxxx` still is an `Identifier` 2. Support `emit_full_signatures` option ### Other NAPI, testing in `monitor-oxc`, etc..
close: #3943 ## Further improvements There is a double visit here. We need to collect all react hooks calling in `Function` and `ArrowFunctionExpression`. If we want to remove this implementation, we may wait for #4188. https://github.com/oxc-project/oxc/blob/d797e595d286c613848b773c256bd43124ad1981/crates/oxc_transformer/src/react/refresh.rs#L744-L947 ## Tests All tests copy from https://github.com/facebook/react/blob/main/packages/react-refresh/src/__tests__/ReactFresh-test.js There are still 4 tests that have not been passed **1. refresh/can-handle-implicit-arrow-returns/input.jsx** Related to #4767. transform correct, just output doesn't match the expected output **2. refresh/registers-identifiers-used-in-jsx-at-definition-site/input.jsx** **3. refresh/registers-identifiers-used-in-react-create-element-at-definition-site/input.jsx** Blocked by #4746 **4. refresh/supports-typescript-namespace-syntax/input.tsx** oxc transforms ts to js first, so probably we can ignore this case. If we really want to pass this test, we also need to turn off `TypeScript` plugin. ## What's next? ### Options: 1. Support transform `refresh_reg` and `refresh_sig` options to `MemberExpression`. Currently `import.meta.xxxx` still is an `Identifier` 2. Support `emit_full_signatures` option ### Other NAPI, testing in `monitor-oxc`, etc..
I have finally written up my thoughts: #5688 Side note: #5674 gets rid of |
Continue in #5689 |
Blocking issues:
InputOptions.inject
rolldown/rolldown#1578constructor-super
. #3780Related discussions:
Previous attempts:
ast_node_id
field to all ast nodes #2818AstNode
trait and provide access toAstNodeId
#2861Research:
The text was updated successfully, but these errors were encountered: