explore: alternative CSS selector parsers #2560
Comments
I looked at Crass a bit yesterday and today, but it doesn't return a fine-grained enough AST for selectors; we'd have to take its tokens and implement some sort of parser on top of them to make it work. syntax_tree-css, by contrast, is incomplete but definitely produces a well-formed AST for selectors. I've started kicking the tires and making basic improvements to see how far I can take it.
PRs against syntax_tree-css to get it to where we can integrate it:
PRs against nokogiri with this goal in mind:
I've got a branch where the hand-written parser work is progressing, in case anybody wants to follow along: https://github.com/sparklemotion/nokogiri/tree/2560-start-custom-css-parser
The CSS selector parser we have is complex, and selector parsing is really a separable concern from Nokogiri proper. It would be nice if we were able to use an existing parser.
(Side note: generating XPath from the CSS is a Nokogiri concern, though, since the generated XPath query is often tightly coupled to the version of libxml or the C extension. Perhaps we can spin this off as a separate gem/concern at some point, but it would need to be pluggable to do Nokogiri-specific XPath things, and I don't feel like that's worth the effort right now.)
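To make that split concrete, here is a minimal sketch of a selector AST plus a pluggable XPath generator. Everything in it is hypothetical: the node structs, `XPathVisitor`, `ExtensionClassVisitor`, and the `ext:has-class` XPath function are illustration-only names, not the API of Nokogiri, Crass, or syntax_tree-css. It only handles type selectors, class selectors, and three combinators.

```ruby
# Hypothetical AST node types (NOT classes from any real gem).
TypeSelector     = Struct.new(:name)                      # e.g. "div"
ClassSelector    = Struct.new(:name)                      # e.g. ".note"
CompoundSelector = Struct.new(:parts)                     # e.g. "div.note"
ComplexSelector  = Struct.new(:left, :combinator, :right) # e.g. "a > b"

# Generic AST-to-XPath visitor. The parser that builds the AST could live in
# a separate gem; only this part needs to know about XPath.
class XPathVisitor
  COMBINATORS = {
    " " => "//",                   # descendant
    ">" => "/",                    # child
    "~" => "/following-sibling::", # subsequent sibling
  }.freeze

  def to_xpath(node)
    "//" + visit(node)
  end

  def visit(node)
    case node
    when ComplexSelector
      visit(node.left) + COMBINATORS.fetch(node.combinator) + visit(node.right)
    when CompoundSelector
      visit_compound(node)
    else
      raise ArgumentError, "unsupported node: #{node.inspect}"
    end
  end

  private

  def visit_compound(node)
    type = node.parts.grep(TypeSelector).first
    predicates = node.parts.grep(ClassSelector).map { |c| "[#{class_test(c.name)}]" }
    (type ? type.name : "*") + predicates.join
  end

  # Overridable hook: portable XPath 1.0 test for a class token.
  def class_test(name)
    "contains(concat(' ', normalize-space(@class), ' '), ' #{name} ')"
  end
end

# A library-specific subclass can swap in its own XPath extension function.
# `ext:has-class` is a made-up function name used only for this sketch.
class ExtensionClassVisitor < XPathVisitor
  private

  def class_test(name)
    "ext:has-class(@class, '#{name}')"
  end
end

# AST for the selector "div.note > p", built by hand in place of a parser.
ast = ComplexSelector.new(
  CompoundSelector.new([TypeSelector.new("div"), ClassSelector.new("note")]),
  ">",
  CompoundSelector.new([TypeSelector.new("p")])
)

puts XPathVisitor.new.to_xpath(ast)
puts ExtensionClassVisitor.new.to_xpath(ast)
```

The point of the overridable `class_test` hook is that the tree walk stays generic while anything tied to a particular libxml version or C extension is confined to a subclass.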
Some things to look at that generate ASTs for CSS:
I'd also like to fix some outstanding bugs in the current implementation:
foo~:nth-child(2) gives incorrect XPath #707
Note, though, that the behavior changes needed to fix these bugs probably justify a 2.0 major release, because they're going to break existing apps.
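This thread doesn't say what the incorrect XPath for foo~:nth-child(2) looks like, so the following is only an illustration of why this selector shape is easy to mistranslate, not a diagnosis of #707. In CSS, :nth-child(2) counts among all children of the parent, while a positional predicate placed directly on an XPath following-sibling step (for example `//foo/following-sibling::*[2]`) counts along that axis. A translation that keeps the CSS meaning would be something like `//foo/following-sibling::*[count(preceding-sibling::*) = 1]`. The sketch below models one parent's children as a plain array with a single `foo`; the helper names and sample data are hypothetical.

```ruby
# CSS meaning of `foo ~ :nth-child(2)`: an element that comes after a `foo`
# sibling AND is the 2nd child of its parent (index 1 among all children).
def css_matches(children)
  foo_index = children.index("foo")
  return [] if foo_index.nil?

  children.select.with_index { |_, i| i > foo_index && i == 1 }
end

# Naive reading: "the 2nd sibling after foo", i.e. the position is counted
# along the following-sibling axis instead of among all children.
def axis_position_matches(children)
  foo_index = children.index("foo")
  return [] if foo_index.nil?

  [children[foo_index + 2]].compact
end

# Hypothetical sample data: element names of one parent's children, in order.
[%w[foo b c], %w[a foo b c]].each do |children|
  puts "#{children.inspect}: css=#{css_matches(children).inspect} " \
       "axis=#{axis_position_matches(children).inspect}"
end
```

With `foo` as the first child, the CSS reading selects `b` and the axis-position reading selects `c`. With `foo` as the second child, the CSS reading selects nothing, because the second child is `foo` itself, while the axis-position reading still selects `c`.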
And then I think we can also introduce some new features:
:not CSS pseudo-class #3207