Changed processing in REXML::Parsers::BaseParser#pull_event from regular expressions to StringScanner-based processing.

## Why

Improve maintainability by restructuring the parser so that it advances through the input with StringScanner#scan.

## Changed

- Added Source#string= method for error message output.
- Added TestParseDocumentTypeDeclaration#test_no_name test case.
- In the `intSubset` of DOCTYPE, added handling for `Comment`s, which also begin with "<!" like the other markup declarations.

[intSubset Spec]

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-doctypedecl
> [28] doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('[' intSubset ']' S?)? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-intSubset
> [28b] intSubset ::= (markupdecl | DeclSep)*

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-markupdecl
> [29] markupdecl ::= elementdecl | AttlistDecl | EntityDecl | NotationDecl | PI | Comment

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-elementdecl
> [45] elementdecl ::= '<!ELEMENT' S Name S contentspec S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-AttlistDecl
> [52] AttlistDecl ::= '<!ATTLIST' S Name AttDef* S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EntityDecl
> [70] EntityDecl ::= GEDecl | PEDecl
> [71] GEDecl ::= '<!ENTITY' S Name S EntityDef S? '>'
> [72] PEDecl ::= '<!ENTITY' S '%' S Name S PEDef S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-NotationDecl
> [82] NotationDecl ::= '<!NOTATION' S Name S (ExternalID | PublicID) S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-PI
> [16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-Comment
> [15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-DeclSep
> [28a] DeclSep ::= PEReference | S

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-PEReference
> [69] PEReference ::= '%' Name ';'

[Benchmark]

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.0/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     11.240      10.569        17.173       18.219 i/s -     100.000 times in 8.896882s 9.461267s 5.823007s 5.488884s
                 sax     31.812      30.716        48.383       52.532 i/s -     100.000 times in 3.143500s 3.255655s 2.066861s 1.903600s
                pull     36.855      36.354        56.718       61.443 i/s -     100.000 times in 2.713300s 2.750693s 1.763099s 1.627523s
              stream     34.176      34.758        49.801       54.622 i/s -     100.000 times in 2.925991s 2.877065s 2.008003s 1.830779s

Comparison:
                              dom
         after(YJIT):        18.2 i/s
        before(YJIT):        17.2 i/s - 1.06x  slower
              before:        11.2 i/s - 1.62x  slower
               after:        10.6 i/s - 1.72x  slower

                              sax
         after(YJIT):        52.5 i/s
        before(YJIT):        48.4 i/s - 1.09x  slower
              before:        31.8 i/s - 1.65x  slower
               after:        30.7 i/s - 1.71x  slower

                             pull
         after(YJIT):        61.4 i/s
        before(YJIT):        56.7 i/s - 1.08x  slower
              before:        36.9 i/s - 1.67x  slower
               after:        36.4 i/s - 1.69x  slower

                           stream
         after(YJIT):        54.6 i/s
        before(YJIT):        49.8 i/s - 1.10x  slower
               after:        34.8 i/s - 1.57x  slower
              before:        34.2 i/s - 1.60x  slower
```

- YJIT=ON : 1.06x - 1.10x faster
- YJIT=OFF : 0.94x - 1.01x faster

Co-authored-by: Sutou Kouhei <kou@clear-code.com>
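
[Illustration]

To show the style of parsing described above, here is a minimal sketch of walking a DOCTYPE `intSubset` with StringScanner#scan. It is not the code from this commit: `scan_int_subset` is a hypothetical helper, the patterns are simplified (for example, a declaration is assumed to end at the first ">", which is wrong for entity values that contain ">"), and PE reference names are matched loosely. The point it shows is that `Comment` has to be tried as its own case, since it shares the "<!" prefix with the other markup declarations.

```ruby
require "strscan"

# Hypothetical helper, NOT REXML's implementation: classify the pieces of a
# DOCTYPE internal subset by advancing a StringScanner one token at a time.
def scan_int_subset(text)
  scanner = StringScanner.new(text)
  events = []
  until scanner.eos?
    if scanner.scan(/\s+/)
      next # DeclSep: whitespace between declarations
    elsif scanner.scan(/%[^\s;]+;/)
      events << [:pe_reference, scanner.matched] # DeclSep: PEReference
    elsif scanner.scan(/<!--(.*?)-->/m)
      events << [:comment, scanner[1]] # Comment also starts with "<!"
    elsif scanner.scan(/<\?.*?\?>/m)
      events << [:pi, scanner.matched]
    elsif scanner.scan(/<!(ELEMENT|ATTLIST|ENTITY|NOTATION)\s+/)
      name = scanner[1] # save before the next scan resets the match data
      # Simplifying assumption: the declaration ends at the first ">".
      body = scanner.scan_until(/>/)
      raise ArgumentError, "unterminated <!#{name} declaration" unless body
      events << [name.downcase.to_sym, body.chomp(">").strip]
    else
      raise ArgumentError, "malformed internal subset at offset #{scanner.pos}"
    end
  end
  events
end

subset = '<!-- note --> <!ELEMENT doc (#PCDATA)> %ext; <!ENTITY a "b">'
events = scan_int_subset(subset)
# events is:
# [[:comment, " note "], [:element, "doc (#PCDATA)"],
#  [:pe_reference, "%ext;"], [:entity, "a \"b\""]]
```

Because each branch consumes exactly what it matched, the scanner position always marks where parsing stopped, which is convenient when building an error message.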