
nsq_to_file: check for rotate size specifically in rev-incr loop #1123

Merged
merged 4 commits into from
Jan 9, 2019

Conversation

ploxiln
Member

@ploxiln ploxiln commented Jan 8, 2019

Check for the rotate size directly instead of calling needsRotation(), because it is theoretically possible that the time format could roll over in the middle of the updateFile() rev-incr loop; needsRotation() would then always return true, and the loop would never terminate.

cc @mreiferson @jehiah

I thought of this while playing with @mccutchen's nsq_to_file_jepsen.go and tweaking it to work with non-gzip mode as well. That doesn't work, because all the concurrent nsq_to_file instances using the same filename format at the same time can open the same file as each other in append mode (when not using gzip); only one of them successfully moves the file, and the others fail because the source file is missing and exit with failure. (This seems like "working as designed" to me.)

While figuring that out, I figured that calling needsRotation() inside updateFile() is an iffy proposition - f.filename and f.openTime could cause needsRotation() to return true, but that won't update f.filename or f.openTime. It seemed like duplicating the filesize check was the simplest thing to do here. (And note that updateFile() is always called immediately after needsRotation().)

instead of calling needsRotation()

because it is theoretically possible that the time-format could
roll-over in the middle of the updateFile() rev-incr loop,
so needsRotation() would always return true,
and the loop would never terminate
@mreiferson
Member

do we need the rotate interval check copied too?

@ploxiln
Member Author

ploxiln commented Jan 8, 2019

The rotate interval should not be hit, because f.openTime was updated at the beginning of updateFile(), and updateFile() should not take a significant amount of time. Put another way, I guess you're asking if we should try to figure out an older openTime for an existing file we just re-opened in append mode?

This part of the loop is really about corner cases. What if a later REV of the same datestamp file already exists, and we are not gzipping so we can open an existing file with append mode, but then the existing file already meets the rotate size? I guess this is possible if nsq_to_file is restarted, and it needs to skip over already-full REVs for the current datestamp.

So back to the rotate-time check: I don't think we can figure out the create time of an existing file on disk in a good way. Posix offers last-metadata-change-time and last-contents-change-time. Maybe if rotating based on time, the best we can do is just always pick a new file instead of an existing file?

@mreiferson
Member

mreiferson commented Jan 8, 2019

Would some of this be easier if we just always used exclusive mode? What is the argument for preserving the behavior of multiple processes appending to the same file? Is that even safe to do?

@ploxiln
Member Author

ploxiln commented Jan 8, 2019

Multiple processes at the same time is not the intended case ... the point is that when a service restarts, it can pick up where it left off, with no need for REVs (if there is no gzip and no size or time limit, just the date file format).

@mreiferson
Member

mreiferson commented Jan 8, 2019

I think I'm confused from your comment above:

That doesn't work because all the concurrent nsq_to_file instances using the same filename format at the same time can open the same file as each other in append mode (if not using gzip), so only one successfully moves the file and the others fail due to source file missing, and exit with failure. (This seems like "working as designed" to me.)

It doesn't seem like we want it to be possible for two processes to open the same file. The only way to do that would be to always use O_EXCL, which means in the single-process case you'd generate more files when restarting. I feel like systems built to use the output files of nsq_to_file already expect to have to handle multiple revisions, so this doesn't seem like a bad tradeoff. But, this would mean that <REV> is required, which is a backwards-incompatible change we cannot make lightly.

Regarding the addition of rotate time checking in this loop, your explanation makes sense. I'm still generally unhappy with the complexity of the code in FileLogger, so I'm looking for ways to simplify and reduce edge cases.

Close() should use current filename and rev if it needs
to resolve an exclusiveRename() destination conflict
@ploxiln
Member Author

ploxiln commented Jan 9, 2019

The "nsq_to_file_jepsen" test starts a bunch of nsq_to_file instances at the same time with all the same options (and on the same system), to test them conflicting with each other. But without the -gzip option, they can open the same file at the same time, in append mode. I don't think we should worry too much about that exact case though - I think the more realistic case is that one nsq_to_file (per server) is intended to be run for a particular dir/topic/channel.

What we do want to handle is the service being restarted. On restart, continuing to append to the same file is probably OK - if the previous run exited cleanly it definitely is, and even if it didn't, there's still a good chance that a whole message write finished last. But there's a chance that half of the last written message (un-acked) is effectively prepended to the first new message (which will be acked), corrupting it.

We may also want to try to handle two runs accidentally overlapping (on the same server). If, in this case, one exits because it can't move the file (because the other one already did), that seems fine. There's the more worrying chance of interleaved messages not being atomic when both processes append to the same file, but it's still possible (and somewhat likely) for small message writes to all be cleanly appended, effectively atomically. But I don't want to worry too much about this case ... you could use gzip or (now) rotate-interval to avoid it, if your setup could provoke it.

I pushed up two more small tweaks, take a look. Maybe they're enough for now :)

@ploxiln
Member Author

ploxiln commented Jan 9, 2019

Successfully tested with the modified pseudo-jepsen:

--- nsq_to_file_jepsen.go.orig	2019-01-08 19:33:04.000000000 -0500
+++ nsq_to_file_jepsen.go	2019-01-08 19:28:43.000000000 -0500
@@ -45,6 +45,7 @@
 	"os/exec"
 	"os/signal"
 	"path/filepath"
+	"strings"
 	"syscall"
 	"time"
 
@@ -97,7 +98,7 @@
 			"-work-dir", filepath.Join(outputRoot, "work"),
 			"-filename-format", "<TOPIC>/<TOPIC>.<DATETIME>+0000.<HOST><REV>.log",
 			"-datetime-format", "%Y-%m-%d_%H-%M",
-			"-gzip",
+			"-rotate-interval", "30s",
 		)
 		cmd.Stdout = logSink
 		cmd.Stderr = logSink
@@ -284,13 +285,18 @@
 		return fmt.Errorf("error opening output file %s: %s", path, err)
 	}
 	defer f.Close()
-	gzf, err := gzip.NewReader(f)
-	if err != nil {
-		return fmt.Errorf("error opening gzip reader: %s: %s", path, err)
+	var rf io.Reader = f
+
+	if strings.HasSuffix(path, ".gz") {
+		gzf, err := gzip.NewReader(f)
+		if err != nil {
+			return fmt.Errorf("error opening gzip reader: %s: %s", path, err)
+		}
+		defer gzf.Close()
+		rf = gzf
 	}
-	defer gzf.Close()
 
-	scanner := bufio.NewScanner(gzf)
+	scanner := bufio.NewScanner(rf)
 	for scanner.Scan() {
 		line := scanner.Bytes()
 		msg := &Msg{}

Member

@mreiferson mreiferson left a comment


OK, LGTM

@mreiferson mreiferson merged commit e070367 into nsqio:master Jan 9, 2019
@ploxiln
Member Author

ploxiln commented Jan 9, 2019

I ran nsq_to_file_jepsen without -gzip or -rotate-interval, having all 10 nsq_to_file instances append to the same file at the same time (!), and without -work-dir so that they did not have to move it. With this additional tweak to file_logger.go, it passed (!) (under both macOS 10.13 and ubuntu 16.04).

--- a/apps/nsq_to_file/file_logger.go
+++ b/apps/nsq_to_file/file_logger.go
@@ -112,16 +112,11 @@ func (f *FileLogger) router() {
                                f.updateFile()
                                sync = true
                        }
-                       _, err := f.Write(m.Body)
+                       _, err := f.Write(append(m.Body, byte('\n')))
                        if err != nil {
                                f.logf(lg.FATAL, "writing message to disk: %s", err)
                                os.Exit(1)
                        }
-                       _, err = f.Write([]byte("\n"))
-                       if err != nil {
-                               f.logf(lg.FATAL, "writing newline to disk: %s", err)
-                               os.Exit(1)
-                       }
                        output[pos] = m
                        pos++
                        if pos == cap(output) {

without that tweak, the newlines were mixed up:

2019/01/09 00:15:50 report
2019/01/09 00:15:50 ======================================================
2019/01/09 00:15:50 failure: error parsing output msg: {"id":14,"topic":"foo"}{"id":11,"topic":"foo"}{"id":13,"topic":"foo"}: invalid character '{' after top-level value
exit status 1

@mreiferson
Member

mreiferson commented Jan 9, 2019

Interesting, those separate writes stood out to me when I was refactoring, but I suspect that the append(...) you've added forces an allocation (which is basically the reason I wrote them separately to begin with).

Ideally we'd use something like vectorio (writev) to write multiple slices with one syscall. For a long time there was no support for this in the Go standard library, but it looks like at some point (1.8 via https://go-review.googlesource.com/c/go/+/29951/) it was officially added as net.Buffers?

Want to give it a try?

@ploxiln
Member Author

ploxiln commented Jan 9, 2019

I think the special writev() handling is implemented only for net.Conn and not os.File, and it's still being considered: golang/go#21676 (comment)

The message size affects whether an extra copy or an extra syscall is slower ...

It'll be at least a few days before I revisit this, so if someone else is interested, go ahead :)

@mreiferson
Member

heh, of course

@ploxiln ploxiln deleted the nsq_file_rotatesize branch February 1, 2019 23:22