Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using gs.File.CopyToFile after reading the file results in 0 byte file at the destination #78

Closed
aucampia opened this issue Jun 24, 2021 · 7 comments · Fixed by #80
Closed

Comments

@aucampia
Copy link
Contributor

This is because the same reader is used for copying and for reading, so by the time you get to CopyToFile the reader is already at the end of the file.

Probably the simplest solution to this fix will be to just Seek(0,0) before doing the copy.

As a workaround the seek can be done in user code.

Any inputs on possible resolutions is welcome, I think it would be good to just call Seek(0, 0) and I will prepare a patch for that.

@funkyshu
Copy link
Member

Is the behavior different for S3? OS?

@aucampia
Copy link
Contributor Author

aucampia commented Jun 24, 2021

I think the same problem will occur if the copy happens from S3 to non S3, and OS to non OS. But I can confirm copying from OS to OS works fine regardless of whether the file is read first.

The difference when it comes to google storage is that copying from gs to gs somehow does use reader. Which maybe should not happen.

I will submit unit test shortly for this, which may make it a bit easier to see what is up.

I think the safest resolution to this, which won't not affect anyone relying on state, is to create a secondary reader for the temp file.

But this will have to be fixed in other back-ends also for cases where where the file is copied to another back-end.

@aucampia aucampia changed the title Using gs.File.CopyToFile after reading it results in 0 byte file at the destination Using gs.File.CopyToFile after reading gf results in 0 byte file at the destination Jun 24, 2021
@aucampia aucampia changed the title Using gs.File.CopyToFile after reading gf results in 0 byte file at the destination Using gs.File.CopyToFile after reading the file results in 0 byte file at the destination Jun 24, 2021
@funkyshu
Copy link
Member

So this is a consequence of the fact that we chose vfs.File to implement io.Reader, io.Writer, I.Closer, io.Seeker. For whose we must keep state of cursor position.

However, I think it's fair to say that our intention for the MoveToFile, MoveToLocation, CopyToFile, CopyToLocation functions was that they were meant to always copy/move ALL of the contents of a give file such that Seek(0, 0) was assumed.

The question we're facing here is whether calling one of the move/copy functions should 1.) alter the state of the cursor position (Close is always called at the end of each so the position would return to 0 afterwards), or 2.) that the state of the cursor position used by io.XXX functions should remain independent of the Move/Copy functions.

Internal to C2FO code, I'd say we've never intentionally used both io.XXX and Move/Copy functions at the same time. I'm struggling a bit understand what desirable scenario there might be to do that. Should we instead simply prohibit (throw error) Move/Copy functions when cursor is not already 0,0? @Roosh513

You solution is of using an independent handle seems logical but what happens after a move operation. It's likely the original io.xxx handle will fail.

@aucampia
Copy link
Contributor Author

Should we instead simply prohibit (throw error) Move/Copy functions when cursor is not already 0,0? @Roosh513

This may be a good interim option, at the very least it would make people aware that something is happening which they likely don't expect. I think even in such a case subsequent io.XXX operations will have surprising results to users.

Will wait for more input, at the moment I am also working around this by setting cursor to 0,0.

Maybe the best option in long term is to have separate File and FileHandle concepts, and File being moveable, and FileHandle provider io.XXX operations. This will break compatibility though.

@aucampia
Copy link
Contributor Author

aucampia commented Jun 26, 2021

Probably gs.File.MoveToFile() and/or gs.File.Delete() should reset gs.File.tempFile?

@funkyshu
Copy link
Member

funkyshu commented Jul 6, 2021

@aucampia Sorry for taking so long. After meeting with the other devs here, we decide for now to return an error after Move/Copy functions.
We found the following to be true:

  • It was our original intent that one would use the io.Read, Seek, Write, Close functions independently of the Move/Copy functions.
  • Close() is an integral function in most implementations to "commit" a write to the file system. There are several reasons for this but partially to accommodate Seek. Because of this, it makes little sense to have an open handle before a copy/move.
  • In cases where we've done this in our own internal codebase, we've always follow the rule that we should first call Close(). See example:
    file, err := vfssimple.NewFile("s3://somebucket/path/to/file.text")
    _, err :+ file.Write([]byte("write some text")
    //commit the writes to s3
    err := file.Close()

    //only now should I call MoveToFile()
    err := file.MoveToFile(otherfile)

Adding an error if cursor is not at 0,0 should actually add safety to our own code, besides making it unambiguous to a new consumer of the library as what is happening.

@aucampia
Copy link
Contributor Author

aucampia commented Jul 6, 2021

Thanks for the input, I will look into this tonight to see the best way to implement this in code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants