Skip to content

Commit

Permalink
bug #38228 [Yaml Parser] Fix edge cases when parsing multiple documen…
Browse files Browse the repository at this point in the history
…ts (digilist)

This PR was merged into the 3.4 branch.

Discussion
----------

[Yaml Parser] Fix edge cases when parsing multiple documents

| Q             | A
| ------------- | ---
| Branch?       | 4.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Tickets       |
| License       | MIT
| Doc PR        |

I identified some edge cases when parsing multiple YAML documents with the same parser instance, because the totalNumberOfLines was not reset and so any subsequent parsing considered the number of lines of the first document.

Consider this document:
```yaml
a:
    b: |
        row
        row2
c: d
```

Normally, `a.b` would be parsed as `row\nrow2\n`. But if the parser parsed a shorter document before, the `\n` after row2 was missing, as the parser considered it as the end of the file (that's why the `c: d` at the end is important).

So this fix resets the `totalNumberOfLines` in the YAML parser to `null` so that any subsequent parsing will initialize the value for the new document and does not use the file length of the first parsed document.

I stumbled upon this because of a flickering unit test that was using the translation component. Sometimes the translated string contained a trailing `\n` and sometimes not. In the end it was based on this bug, as the translation files were not loaded in the same order every time (not really sure why. It's somehow related to the cache state, but even with a warm cache it was not totally deterministic).

Commits
-------

012ee4fa59 [Yaml Parser] Fix edge cases when parsing multiple documents
  • Loading branch information
fabpot committed Sep 18, 2020
2 parents 50f080d + c6d162b commit a2b5a78
Show file tree
Hide file tree
Showing 2 changed files with 34 additions and 0 deletions.
1 change: 1 addition & 0 deletions Parser.php
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,7 @@ public function parse($value, $flags = 0)
$this->refs = [];
$this->skippedLineNumbers = [];
$this->locallySkippedLineNumbers = [];
$this->totalNumberOfLines = null;

if (null !== $e) {
throw $e;
Expand Down
33 changes: 33 additions & 0 deletions Tests/ParserTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -2362,6 +2362,39 @@ public function testParseValueWithNegativeModifiers()
$this->parser->parse($yaml)
);
}

/**
* This is a regression test for a bug where a YAML block with a nested multiline string using | was parsed without
* a trailing \n when a shorter YAML document was parsed before.
*
* When a shorter document was parsed before, the nested string did not have a \n at the end of the string, because
* the Parser thought it was the end of the file, even though it is not.
*/
public function testParsingMultipleDocuments()
{
$shortDocument = 'foo: bar';
$longDocument = <<<YAML
a:
b: |
row
row2
c: d
YAML;

$expected = ['a' => ['b' => "row\nrow2\n"], 'c' => 'd'];

// The parser was not used before, so there is a new line after row2
$this->assertSame($expected, $this->parser->parse($longDocument));

$parser = new Parser();
// The first parsing set and fixed the totalNumberOfLines in the Parser before, so parsing the short document here
// to reproduce the issue. If the issue would not have been fixed, the next assertion will fail
$parser->parse($shortDocument);

// After the total number of lines has been rset the result will be the same as if a new parser was used
// (before, there was no \n after row2)
$this->assertSame($expected, $parser->parse($longDocument));
}
}

class B
Expand Down

0 comments on commit a2b5a78

Please sign in to comment.