Skip to content

Commit

Permalink
Allow uploading IO objects with unknown size
Browse files Browse the repository at this point in the history
Currently Shrine requires all IO objects to respond to #size. This is
not necessary, because storages like FileSystem and S3 allow uploading
IO objects of unknown size (with some modifications). The size of the IO
object can be unknown if it's a pipe, socket, or a general stream of
data.

We remove #size from the required IO methods. We also deprecates the
Shrine::IO_METHODS constant, because now that it's public we cannot
remove :size from it without affecting backwards compatibility.
Shrine::IO_METHODS will be private in Shrine 3.

We modify S3 storage to use multipart upload in case the size is
unknown. There is aws/aws-sdk-ruby#1711 which
adds Aws::S3::Object#upload_stream, but it hasn't been merged yet at
this moment, and we unfortunately don't control aws-sdk-s3 version so
we have to support older versions anyway.

We also modify FileSystem and S3 storages to backfill the "size"
metadata of the IO object in case it's not known.

Fixes #252
  • Loading branch information
janko committed Mar 3, 2018
1 parent 05aea5b commit 84f534f
Show file tree
Hide file tree
Showing 11 changed files with 288 additions and 87 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

* Inherit the logger on subclassing `Shrine` and make it shared across subclasses (@hmistry)

* Don't require IO size to be known on upload (@janko-m)

* Deprecate `Shrine::IO_METHODS` constant (@janko-m)

## 2.9.0 (2018-01-27)

* Support arrays of files in `versions` plugin (@janko-m)
Expand Down
2 changes: 1 addition & 1 deletion doc/testing.md
Original file line number Diff line number Diff line change
Expand Up @@ -169,7 +169,7 @@ class FakeIO
end

extend Forwardable
delegate Shrine::IO_METHODS.keys => :@io
delegate %i[read rewind eof? close size] => :@io
end
```

Expand Down
5 changes: 3 additions & 2 deletions lib/shrine.rb
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ def initialize(io, missing_methods)
size: [],
close: [],
}
deprecate_constant(:IO_METHODS) if RUBY_VERSION > "2.3"

# Core class that represents a file uploaded to a storage. The instance
# methods for this class are added by Shrine::Plugins::Base::FileMethods, the
Expand Down Expand Up @@ -279,7 +280,7 @@ def extract_mime_type(io)

# Extracts the filesize from the IO object.
def extract_size(io)
io.size
io.size if io.respond_to?(:size)
end

# It first asserts that `io` is a valid IO object. It then extracts
Expand Down Expand Up @@ -356,7 +357,7 @@ def get_metadata(io, context)
# object doesn't respond to one of these methods, a Shrine::InvalidFile
# error is raised.
def _enforce_io(io)
missing_methods = IO_METHODS.select { |m, a| !io.respond_to?(m) }
missing_methods = %i[read eof? rewind close].select { |m| !io.respond_to?(m) }
raise InvalidFile.new(io, missing_methods) if missing_methods.any?
end

Expand Down
4 changes: 3 additions & 1 deletion lib/shrine/storage/file_system.rb
Original file line number Diff line number Diff line change
Expand Up @@ -126,8 +126,10 @@ def initialize(directory, prefix: nil, host: nil, clean: true, permissions: 0644

# Copies the file into the given location.
def upload(io, id, shrine_metadata: {}, **upload_options)
IO.copy_stream(io, path!(id))
bytes_copied = IO.copy_stream(io, path!(id))
path(id).chmod(permissions) if permissions

shrine_metadata["size"] ||= bytes_copied
end

# Moves the file to the given location. This gets called by the `moving`
Expand Down
2 changes: 1 addition & 1 deletion lib/shrine/storage/linter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ def initialize(content)
end

extend Forwardable
delegate Shrine::IO_METHODS.keys => :@io
delegate %i[read rewind eof? close size] => :@io
end
end
end
Expand Down
51 changes: 49 additions & 2 deletions lib/shrine/storage/s3.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,9 +15,11 @@
raise exception
end
end

require "down/chunked_io"
require "uri"
require "cgi"
require "tempfile"

class Shrine
module Storage
Expand Down Expand Up @@ -172,6 +174,8 @@ module Storage
# [Transfer Acceleration]: http://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration.html
# [object lifecycle]: http://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Object.html#put-instance_method
class S3
MIN_PART_SIZE = 5 * 1024 * 1024 # 5MB

attr_reader :client, :bucket, :prefix, :host, :upload_options

# Initializes a storage for uploading to S3.
Expand Down Expand Up @@ -248,7 +252,8 @@ def upload(io, id, shrine_metadata: {}, **upload_options)
if copyable?(io)
copy(io, id, **options)
else
put(io, id, **options)
bytes_uploaded = put(io, id, **options)
shrine_metadata["size"] ||= bytes_uploaded
end
end

Expand Down Expand Up @@ -396,8 +401,50 @@ def put(io, id, **options)
# use `upload_file` for files because it can do multipart upload
options = { multipart_threshold: @multipart_threshold[:upload] }.merge!(options)
object(id).upload_file(path, **options)
File.size(path)
else
object(id).put(body: io, **options)
io.open if io.is_a?(UploadedFile)

if io.respond_to?(:size) && io.size
object(id).put(body: io, **options)
io.size
else
# IO has unknown size, so we have to use multipart upload
multipart_put(io, id, **options)
end
end
end

# Uploads the file to S3 using multipart upload.
def multipart_put(io, id, **options)
multipart_upload = object(id).initiate_multipart_upload(**options)
parts = upload_parts(multipart_upload, io)
bytes_uploaded = parts.inject(0) { |size, part| size + part.delete(:size) }
multipart_upload.complete(multipart_upload: { parts: parts })
bytes_uploaded
rescue
multipart_upload.abort if multipart_upload
raise
end

def upload_parts(multipart_upload, io)
1.step.inject([]) do |parts, part_number|
parts << upload_part(multipart_upload, io, part_number)
break parts if io.eof?
parts
end
end

def upload_part(multipart_upload, io, part_number)
Tempfile.create("shrine-s3-part-#{part_number}") do |body|
multipart_part = multipart_upload.part(part_number)

IO.copy_stream(io, body, MIN_PART_SIZE)
body.rewind

response = multipart_part.upload(body: body)

{ part_number: part_number, size: body.size, etag: response.etag }
end
end

Expand Down
12 changes: 12 additions & 0 deletions test/file_system_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,18 @@ def root
@storage = file_system(root, directory_permissions: nil)
@storage.upload(fakeio, "a/b/c/file.jpg")
end

it "backfills size metadata if missing" do
io = fakeio("content")
io.instance_eval { undef size }
uploaded_file = @shrine.new(:file_system).upload(io)
assert_equal 7, uploaded_file.metadata["size"]

io = fakeio("content")
io.instance_eval { def size; 3; end }
uploaded_file = @shrine.new(:file_system).upload(io)
assert_equal 3, uploaded_file.metadata["size"]
end
end

describe "#movable?" do
Expand Down
Loading

0 comments on commit 84f534f

Please sign in to comment.