Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

importccl: Preserve '\r\n' during CSV import #28181

Merged
merged 1 commit into from
Aug 2, 2018
Merged

Conversation

neeral
Copy link
Contributor

@neeral neeral commented Aug 1, 2018

See #25344.

@neeral neeral requested a review from a team as a code owner August 1, 2018 23:01
@neeral neeral requested a review from a team August 1, 2018 23:01
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor Author

@neeral neeral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of the lint rules are failing because of imported code. e.g. you don't need to check for errors on Write or Flush because you call Error immediately afterwards. This is the semantics of how the csv writer is designed. How can ask the linter to ignore these?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@benesch
Copy link
Contributor

benesch commented Aug 2, 2018

Can you post the output of the failed lint rule? I'm not seeing it in the failed TeamCity job.

Copy link
Contributor Author

@neeral neeral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benesh this is the link to TC lint job:
https://teamcity.cockroachdb.com/viewLog.html?buildTypeId=Cockroach_UnitTests_Check&buildId=810996&branch_Cockroach_UnitTests_Check=28181

[/TestUnused] lint_test.go:879: /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/csv/reader.go:85:2: var ErrTrailingComma is unused (U1000)
lint_test.go:879: /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/csv/reader.go:134:2: field TrailingComma is unused (U1000)

[23:49:08][TestLint] /TestErrCheck
[23:49:08]

[/TestErrCheck] lint_test.go:1145: pkg/util/encoding/csv/example_test.go:121:12:	w.WriteAll(records) // calls Flush internally <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer.go:103:11:	w.w.Flush() <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer_test.go:69:9:	f.Write([]string{"abc"}) <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer_test.go:78:9:	f.Write([]string{"abc"}) <- unchecked error

[/TestForbiddenImports] lint_test.go:1031: 
	github.com/cockroachdb/cockroach/pkg/util/encoding/csv: log <- please use "util/log" instead of "log"

The Unused variables is easy to remove.
The ErrCheck rule conflicts with the semantics of write/flush.
Use of log package means adding contexts.

I don't want to change this stdlib code more than I have to, as it will make it hard to maintain. I'm open to suggestions on how you normally handle this.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Contributor Author

@neeral neeral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job ran on revision1 of this PR - in case the line numbers don't align with the current version. I tried to fix some of these but did a bad job.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

See cockroachdb#25344.

It appears this is being caused by the behaviour of Golang's
encoding/csv library, which folds \r\n into \n when reading. This was
fixed in golang/go#21201 but then reverted golang/go#22746. It appears
based on that second issue that Go is unlikely to change that behavior.

Check in the stdlib `encoding/csv` into `pkg/util` with
golang/go#22746 reverted.

Release note:
`\r\n` characters in CSV files were silently converted into `\n`. This
causes imported data to be different. This is now fixed.
Copy link
Contributor Author

@neeral neeral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added test case for this in pkg/ccl/importccl/import_stmt_test.go

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@benesch benesch requested a review from dt August 2, 2018 17:55
@benesch
Copy link
Contributor

benesch commented Aug 2, 2018

To suppress the errcheck warnings, I'd be OK with you adding -ignorepkg ./pkg/util/encoding/csv to the errcheck invocation here:

If you do so, please add a comment indicating why it's OK.

@dt
Copy link
Member

dt commented Aug 2, 2018

I too would be fine with exempting from linters to minimize diff from stdlib, but also LGTM as-is. Thanks!

Copy link
Contributor

@benesch benesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d rather not list (*bufio.Writer).Flush in errcheck_excludes. It’s not in general safe to ignore its error. It’s only safe here because encoding/csv knows that it can find that error later by calling w.Write(nil). Which actually seems like an undocumented implementation detail to rely on; but whatever, parity with the stdlib definitely wins here.

Copy link
Contributor Author

@neeral neeral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-ignorepkg doesn't do what we thought. -ignorepkg ./pkg/util/encoding/csv doesn't work - need to do -ignorepkg bufio which defeats the purpose of what we want to do.

https://github.com/kisielk/errcheck

The -ignorepkg flag takes a comma-separated list of package import paths to ignore

It's not which packages to ignore when running the checks.

If you are both happy with this, could @benesh or @dt give an LGTM and trigger bors?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/ccl/importccl/import_stmt_test.go, line 179 at r2 (raw file):

			create: `t text`,
			typ:    "CSV",
			data:   "\"hello\r\nworld\"\n\"friend\nfoe\"\n\"mr\rmrs\"",

Is this the correct way to handle this?

@benesch
Copy link
Contributor

benesch commented Aug 2, 2018

D’oh, sorry to lead you astray. Your approach is definitely the lesser poison. Leaving to @dt to answer your other question and merge.

Copy link
Member

@dt dt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/ccl/importccl/import_stmt_test.go, line 179 at r2 (raw file):

Previously, neeral (Neeral Dodhia) wrote…

Is this the correct way to handle this?

Yeah, looks right to me! It might be a little easier to read using backticks rather than escaping all the quotes, but not enough to justify another CI cycle.

@dt
Copy link
Member

dt commented Aug 2, 2018

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 2, 2018

👎 Rejected by code reviews

Copy link
Contributor Author

@neeral neeral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dt could you give an LGTM, bors is complaining no one has approved this PR

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@benesch
Copy link
Contributor

benesch commented Aug 2, 2018

Sorry, my review was in the way.

bors r+

@benesch
Copy link
Contributor

benesch commented Aug 2, 2018

Thanks for the fix, @neeral!

@craig
Copy link
Contributor

craig bot commented Aug 2, 2018

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request Aug 2, 2018
28181: importccl: Preserve '\r\n' during CSV import r=benesch a=neeral

See #25344.

Co-authored-by: neeral <neeral@users.noreply.github.com>
@craig
Copy link
Contributor

craig bot commented Aug 2, 2018

Build succeeded

@craig craig bot merged commit 67ffc7e into cockroachdb:master Aug 2, 2018
@neeral neeral deleted the csv branch August 2, 2018 23:32
craig bot pushed a commit that referenced this pull request Aug 6, 2018
27868: backport-2.0: storage: prevent unbounded raft log growth without quorum r=nvanbenschoten a=nvanbenschoten

Backport 2/2 commits from #27774.

/cc @cockroachdb/release

---

Fixes #27772.

This change adds safeguards to prevent cases where a raft log
would grow without bound during loss of quorum scenarios. It
also adds a new test that demonstrates that the raft log does
not grow without bound in these cases.

There are two cases that need to be handled to prevent the
unbounded raft log growth observed in #27772.
1. When the leader proposes a command and cannot establish a
   quorum. In this case, we know the leader has the entry in
   its log, so there's no need to refresh it with `reasonTicks`.
   To avoid this, we no longer use `refreshTicks` as a leader.
2. When a follower proposes a command that is forwarded to the
   leader who cannot establish a quorum. In this case, the
   follower can't be sure (currently) that the leader got the
   proposal, so it needs to refresh using `reasonTicks`. However,
   the leader now detects duplicate forwarded proposals and
   avoids appending redundant entries to its log. It does so
   by maintaining a set of in-flight forwarded proposals that
   it has received during its term as leader. This set is reset
   after every leadership change.

Both of these cases are tested against in the new
TestLogGrowthWhenRefreshingPendingCommands. Without both of
the safeguards introduced in this commit, the test fails.

Release note (bug fix): Prevent loss of quorum situations from
allowing unbounded growth of a Range's Raft log.


28225: release-2.0: importccl: Preserve '\r\n' during CSV import r=dt a=dt

Backport 1/1 commits from #28181.

/cc @cockroachdb/release

---

See #25344.


Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: neeral <neeral@users.noreply.github.com>
Co-authored-by: David Taylor <tinystatemachine@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants