Skip to content

Commit

Permalink
Use more StringScanner based API to parse XML (ruby#114)
Browse files Browse the repository at this point in the history
## Why?

Improve maintainability by optimizing the process so that the parsing
process proceeds using StringScanner#scan.

## Changed
- Change `REXML::Parsers::BaseParser` from `frozen_string_literal:
false` to `frozen_string_literal: true`.
- Added `Source#string=` method for error message output.
- Added TestParseDocumentTypeDeclaration#test_no_name test case.
- Of the `intSubset` of DOCTYPE, "<!" added consideration for processing
`Comments` that begin with "<!".

## [Benchmark]

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.0/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     11.240      10.569        17.173       18.219 i/s -     100.000 times in 8.896882s 9.461267s 5.823007s 5.488884s
                 sax     31.812      30.716        48.383       52.532 i/s -     100.000 times in 3.143500s 3.255655s 2.066861s 1.903600s
                pull     36.855      36.354        56.718       61.443 i/s -     100.000 times in 2.713300s 2.750693s 1.763099s 1.627523s
              stream     34.176      34.758        49.801       54.622 i/s -     100.000 times in 2.925991s 2.877065s 2.008003s 1.830779s

Comparison:
                              dom
         after(YJIT):        18.2 i/s
        before(YJIT):        17.2 i/s - 1.06x  slower
              before:        11.2 i/s - 1.62x  slower
               after:        10.6 i/s - 1.72x  slower

                              sax
         after(YJIT):        52.5 i/s
        before(YJIT):        48.4 i/s - 1.09x  slower
              before:        31.8 i/s - 1.65x  slower
               after:        30.7 i/s - 1.71x  slower

                             pull
         after(YJIT):        61.4 i/s
        before(YJIT):        56.7 i/s - 1.08x  slower
              before:        36.9 i/s - 1.67x  slower
               after:        36.4 i/s - 1.69x  slower

                           stream
         after(YJIT):        54.6 i/s
        before(YJIT):        49.8 i/s - 1.10x  slower
               after:        34.8 i/s - 1.57x  slower
              before:        34.2 i/s - 1.60x  slower

```

- YJIT=ON : 1.06x - 1.10x faster
- YJIT=OFF : 0.94x - 1.01x faster

---------

Co-authored-by: Sutou Kouhei <kou@clear-code.com>
  • Loading branch information
naitoh and kou authored Feb 27, 2024
1 parent fb7ba27 commit 370666e
Show file tree
Hide file tree
Showing 3 changed files with 203 additions and 168 deletions.
Loading

0 comments on commit 370666e

Please sign in to comment.