Changed processing in REXML::Parsers::BaseParser#pull_event from regular expression to processing using StringScanner.

## Why
Improve maintainability by restructuring `pull_event` so that parsing proceeds step by step through the input using StringScanner#scan.
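As a rough illustration of the style this change moves toward, the sketch below walks a string with `StringScanner#scan`, consuming one token per call. The `tokenize_markup` helper and its simplified patterns are hypothetical and are not REXML's actual code.

```ruby
require "strscan"

# Hypothetical mini-tokenizer, not REXML's implementation: it only shows how
# StringScanner#scan matches at the current position and advances past the match.
def tokenize_markup(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if scanner.scan(/<!--(.*?)-->/m)
      tokens << [:comment, scanner[1]]
    elsif scanner.scan(/<\/([A-Za-z_][\w.-]*)\s*>/)
      tokens << [:end_element, scanner[1]]
    elsif scanner.scan(/<([A-Za-z_][\w.-]*)\s*>/)
      tokens << [:start_element, scanner[1]]
    elsif scanner.scan(/[^<]+/)
      tokens << [:text, scanner.matched]
    else
      # Nothing matched at the current position: report where parsing stopped.
      raise ArgumentError, "unexpected input at offset #{scanner.pos}"
    end
  end
  tokens
end

tokens = tokenize_markup("<a>hi<!-- note --></a>")
p tokens
```

Because each `scan` call is anchored at the scanner's position, every branch consumes exactly the text it recognizes, which tends to keep the control flow easier to follow than repeated matches against a shrinking buffer.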

## Changed
- Added Source#string= method for error message output.
- Added TestParseDocumentTypeDeclaration#test_no_name test case.
- Added handling for `Comment`s, which begin with "<!", when processing the "<!" declarations in the `intSubset` of DOCTYPE.
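A minimal sketch of the kind of input the last item concerns: a DOCTYPE whose internal subset contains a comment next to an element declaration. It assumes the `rexml` gem is installed and collects only the event type of each pulled event. The sample document is made up for illustration.

```ruby
require "rexml/parsers/baseparser"

# Made-up sample document: the internal subset holds a comment and an
# element declaration, both of which start with "<!".
xml = <<~XML
  <!DOCTYPE r [
    <!-- a comment inside the internal subset -->
    <!ELEMENT r EMPTY>
  ]>
  <r/>
XML

parser = REXML::Parsers::BaseParser.new(xml)
event_types = []
while parser.has_next?
  # Each pulled event is an array whose first element is the event type.
  event_types << parser.pull.first
end

p event_types
```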

[intSubset Spec]
https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-doctypedecl
> [28] 	doctypedecl   ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('[' intSubset ']' S?)? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-intSubset
> [28b] intSubset   ::=  (markupdecl | DeclSep)*

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-markupdecl
> [29]  markupdecl   ::= elementdecl | AttlistDecl | EntityDecl | NotationDecl | PI | Comment

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-elementdecl
> [45]  elementdecl   ::=   '<!ELEMENT' S Name S contentspec S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-AttlistDecl
> [52] 	AttlistDecl   ::=   '<!ATTLIST' S Name AttDef* S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EntityDecl
> [70] 	EntityDecl   ::=   GEDecl | PEDecl
> [71] 	GEDecl	   ::=   '<!ENTITY' S Name S EntityDef S? '>'
> [72] 	PEDecl	   ::=   '<!ENTITY' S '%' S Name S PEDef S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-NotationDecl
> [82] 	NotationDecl   ::=   '<!NOTATION' S Name S (ExternalID | PublicID) S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-PI
> [16] 	PI	   ::=   '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-Comment
> [15] 	Comment	   ::=   '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-DeclSep
> [28a] DeclSep	   ::=   PEReference | S

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-PEReference
> [69]  PEReference   ::=   '%' Name ';'
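
The productions quoted above can be told apart by their leading literals. The sketch below is a hypothetical classifier, not REXML code, that names each item of an internal subset from its prefix. It deliberately simplifies the grammar: declarations are skipped up to the next ">", so a ">" inside a quoted literal would be misread.

```ruby
require "strscan"

# Leading literals of the "<!" declarations in the grammar quoted above.
DECLARATION_KINDS = {
  "<!ELEMENT"  => :elementdecl,
  "<!ATTLIST"  => :AttlistDecl,
  "<!ENTITY"   => :EntityDecl,
  "<!NOTATION" => :NotationDecl,
}.freeze

# Hypothetical helper: returns the production name of each item in `text`,
# which is assumed to be the content between "[" and "]" of a DOCTYPE.
def classify_int_subset(text)
  scanner = StringScanner.new(text)
  kinds = []
  until scanner.eos?
    next if scanner.skip(/\s+/) # DeclSep: S

    if scanner.scan(/<!--.*?-->/m)
      kinds << :Comment
    elsif scanner.scan(/<\?.*?\?>/m)
      kinds << :PI
    elsif scanner.scan(/%[^;\s]+;/)
      kinds << :PEReference
    elsif scanner.scan(/<![A-Z]+/)
      keyword = scanner.matched
      kind = DECLARATION_KINDS.fetch(keyword) do
        raise ArgumentError, "unknown declaration #{keyword}"
      end
      # Simplification: treat the next ">" as the end of the declaration.
      scanner.skip_until(/>/) or raise ArgumentError, "unterminated #{keyword}"
      kinds << kind
    else
      raise ArgumentError, "unexpected input at offset #{scanner.pos}"
    end
  end
  kinds
end

subset = <<~SUBSET
  <!-- note -->
  <!ELEMENT r (#PCDATA)>
  <!ENTITY % pe "x">
  %pe;
  <?target data?>
SUBSET

kinds = classify_int_subset(subset)
p kinds
```

Checking for `<!--` before the generic `<![A-Z]+` branch reflects the point of the change list: comments share the "<!" prefix with the declarations and need their own branch.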

[Benchmark]

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.0/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     11.240      10.569        17.173       18.219 i/s -     100.000 times in 8.896882s 9.461267s 5.823007s 5.488884s
                 sax     31.812      30.716        48.383       52.532 i/s -     100.000 times in 3.143500s 3.255655s 2.066861s 1.903600s
                pull     36.855      36.354        56.718       61.443 i/s -     100.000 times in 2.713300s 2.750693s 1.763099s 1.627523s
              stream     34.176      34.758        49.801       54.622 i/s -     100.000 times in 2.925991s 2.877065s 2.008003s 1.830779s

Comparison:
                              dom
         after(YJIT):        18.2 i/s
        before(YJIT):        17.2 i/s - 1.06x  slower
              before:        11.2 i/s - 1.62x  slower
               after:        10.6 i/s - 1.72x  slower

                              sax
         after(YJIT):        52.5 i/s
        before(YJIT):        48.4 i/s - 1.09x  slower
              before:        31.8 i/s - 1.65x  slower
               after:        30.7 i/s - 1.71x  slower

                             pull
         after(YJIT):        61.4 i/s
        before(YJIT):        56.7 i/s - 1.08x  slower
              before:        36.9 i/s - 1.67x  slower
               after:        36.4 i/s - 1.69x  slower

                           stream
         after(YJIT):        54.6 i/s
        before(YJIT):        49.8 i/s - 1.10x  slower
               after:        34.8 i/s - 1.57x  slower
              before:        34.2 i/s - 1.60x  slower

```

- YJIT=ON: after runs at 1.06x - 1.10x the speed of before (faster)
- YJIT=OFF: after runs at 0.94x - 1.01x the speed of before (roughly unchanged, up to 6% slower)
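
The ratios in this summary can be recomputed from the i/s figures in the benchmark table by dividing each "after" value by the matching "before" value. The sketch below does that. The hash layout and key names are only an illustrative arrangement of the numbers reported above.

```ruby
# Iterations per second copied from the benchmark table above.
results = {
  dom:    { before: 11.240, after: 10.569, before_yjit: 17.173, after_yjit: 18.219 },
  sax:    { before: 31.812, after: 30.716, before_yjit: 48.383, after_yjit: 52.532 },
  pull:   { before: 36.855, after: 36.354, before_yjit: 56.718, after_yjit: 61.443 },
  stream: { before: 34.176, after: 34.758, before_yjit: 49.801, after_yjit: 54.622 },
}

# A ratio above 1.0 means "after" is faster than "before".
ratios = results.transform_values do |r|
  {
    yjit_off: r[:after] / r[:before],
    yjit_on:  r[:after_yjit] / r[:before_yjit],
  }
end

ratios.each do |parser, ratio|
  puts format("%-6s YJIT=OFF %.3fx  YJIT=ON %.3fx", parser, ratio[:yjit_off], ratio[:yjit_on])
end
```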

Co-authored-by: Sutou Kouhei <kou@clear-code.com>
naitoh and kou committed Feb 26, 2024
1 parent 0656925 commit 54b0298
Showing 3 changed files with 200 additions and 164 deletions.