Multi-lined envelope reading corruption caused by direct reference into bufio.Reader's internal buffer rollover. #214
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
BUG: #213
If we're dealing with multi-lined envelope (either by rows or by header/footer), readLine() will be called several times, thus whatever ios.ByteReadLine, which uses bufio.Reader underneath, returns in a previous call may be potentially be invalidated due to bufio.Reader's internal buf rollover. If we read the previous line directly, it would cause corruption.
To fix the problem the easiest solution would be simply copying the return []byte from ios.ByteReadLine every single time. But for files with single-line envelope, which are the vast majority cases, this copy becomes unnecessary and burdensome on gc. So the trick is to has a flag on reader.linesBuf's last element to tell if it contains a reference into the bufio.Reader's internal buffer, or it's a copy. Every time before we call bufio.Reader read, we check reader.liensBuf's last element flag, if it is not a copy, then we will turn it into a copy.
This way, we optimize for the vast majority cases without needing allocations, and avoid any potential corruptions in the multi-lined envelope cases.
@paulstadler