Commit
Use more StringScanner based API to parse XML (ruby#114)
## Why?

Improve maintainability by streamlining the parser so that parsing proceeds through `StringScanner#scan`.

## Changed

- Changed `REXML::Parsers::BaseParser` from `frozen_string_literal: false` to `frozen_string_literal: true`.
- Added the `Source#string=` method for error message output.
- Added the `TestParseDocumentTypeDeclaration#test_no_name` test case.
- In the `intSubset` of DOCTYPE, the handling of "<!" now takes into account `Comments`, which also begin with "<!".

## Benchmark

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.0/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     11.240      10.569        17.173       18.219 i/s -     100.000 times in 8.896882s 9.461267s 5.823007s 5.488884s
                 sax     31.812      30.716        48.383       52.532 i/s -     100.000 times in 3.143500s 3.255655s 2.066861s 1.903600s
                pull     36.855      36.354        56.718       61.443 i/s -     100.000 times in 2.713300s 2.750693s 1.763099s 1.627523s
              stream     34.176      34.758        49.801       54.622 i/s -     100.000 times in 2.925991s 2.877065s 2.008003s 1.830779s

Comparison:
                              dom
         after(YJIT):        18.2 i/s
        before(YJIT):        17.2 i/s - 1.06x  slower
              before:        11.2 i/s - 1.62x  slower
               after:        10.6 i/s - 1.72x  slower

                              sax
         after(YJIT):        52.5 i/s
        before(YJIT):        48.4 i/s - 1.09x  slower
              before:        31.8 i/s - 1.65x  slower
               after:        30.7 i/s - 1.71x  slower

                             pull
         after(YJIT):        61.4 i/s
        before(YJIT):        56.7 i/s - 1.08x  slower
              before:        36.9 i/s - 1.67x  slower
               after:        36.4 i/s - 1.69x  slower

                           stream
         after(YJIT):        54.6 i/s
        before(YJIT):        49.8 i/s - 1.10x  slower
               after:        34.8 i/s - 1.57x  slower
              before:        34.2 i/s - 1.60x  slower
```

- YJIT=ON : 1.06x - 1.10x faster
- YJIT=OFF : 0.94x - 1.01x faster

---------

Co-authored-by: Sutou Kouhei <kou@clear-code.com>
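To illustrate the general idea of driving a parser with `StringScanner#scan`, here is a minimal sketch. It is not REXML's actual implementation: the helper name `scan_internal_subset`, its regular expressions, and the token format are all assumptions made for illustration. It shows why comments need separate consideration inside an internal subset: both comments and markup declarations begin with "<!", so the comment pattern is tried first.

```ruby
require "strscan"

# Hypothetical helper (not part of REXML): splits a simplified DOCTYPE
# internal subset into comment and declaration tokens.
def scan_internal_subset(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    if scanner.scan(/\s+/)
      # Skip whitespace between declarations.
      next
    elsif scanner.scan(/<!--(.*?)-->/m)
      # Comments start with "<!" too, so they are matched before declarations.
      tokens << [:comment, scanner[1]]
    elsif scanner.scan(/<!([A-Z]+)\s+([^>]*)>/m)
      # Simplified markup declaration such as <!ELEMENT ...>.
      tokens << [:declaration, scanner[1], scanner[2]]
    else
      raise ArgumentError, "Malformed internal subset at offset #{scanner.pos}"
    end
  end
  tokens
end

SUBSET = <<~DTD
  <!-- a comment that also begins with "<!" -->
  <!ELEMENT note (#PCDATA)>
DTD

TOKENS = scan_internal_subset(SUBSET)
TOKENS.each { |token| p token }
```

Because `scan` only matches at the current scan position and advances past the match, each branch consumes exactly one construct and the loop needs no manual index bookkeeping.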