[S3] Get object with range header returns empty(wrong) content. #349

Closed
ZhaoX opened this issue Apr 9, 2015 · 8 comments

Comments

@ZhaoX

ZhaoX commented Apr 9, 2015

Purpose

  • What purpose of LeoFS do you use?
    • Test

Environments

  • LeoFS 1.2.7
  • Erlang R16B03-1 (erts-5.10.4)
  • uname -a: Darwin Xin-Air.local 14.1.0 Darwin Kernel Version 14.1.0: Thu Feb 26 19:26:47 PST 2015; root:xnu-2782.10.73~1/RELEASE_X86_64 x86_64

What happened and How to reproduce if possible

  • What did you do?
    1. Ran bootstrap start integration-test to create a test cluster;
    2. Used leofs-adm to create a user and add a bucket named test;
    3. Used s3cmd to put an object (~600MB) into the bucket test;
    4. Used erlcloud to get this object with an HTTP Range header (bytes=400000000-400000001).
  • What did you expect to see?
    An HTTP response with content-length=2
  • What did you see instead?
    {ok,{{"HTTP/1.1",206,"Partial Content"},
    [{"connection","keep-alive"},
    {"date","Thu, 09 Apr 2015 03:17:11 GMT"},
    {"server","LeoFS"},
    {"content-length","0"},
    {"content-type","application/octet-stream"}],
    []}}

Plus

  • When getting the object with a small range start value, LeoFS returns the correct content:
    request: bytes=0-1
    response:
    {ok,{{"HTTP/1.1",206,"Partial Content"},
    [{"connection","keep-alive"},
    {"date","Thu, 09 Apr 2015 03:23:45 GMT"},
    {"server","LeoFS"},
    {"content-length","2"},
    {"content-type","application/octet-stream"}],
    [0,0]}}
  • Object info
    s3cmd ls s3://test
    2015-04-08 11:43 630237184 s3://test/test
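
As a rough illustration of why content-length=2 is the expected result, the sketch below computes how many bytes a single HTTP byte range should yield for an object of a given size. The function name `range_length` is hypothetical and is not part of LeoFS, s3cmd, or erlcloud; it only handles a single `bytes=` range and assumes the inclusive first-last semantics of HTTP range requests.

```python
def range_length(range_header: str, object_size: int) -> int:
    """Return the number of bytes a single-range request should yield.

    Hypothetical helper for illustration only. Byte ranges are inclusive,
    so "bytes=0-1" covers two bytes.
    """
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, sep, last = spec.strip().partition("-")
    if not sep or (first == "" and last == ""):
        raise ValueError("malformed range")
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the object.
        return min(int(last), object_size)
    start = int(first)
    # Open-ended form "bytes=N-" runs to the end of the object;
    # an end past the last byte is clamped to the last byte.
    end = object_size - 1 if last == "" else min(int(last), object_size - 1)
    if start >= object_size or start > end:
        raise ValueError("range not satisfiable")
    return end - start + 1


# The object from this report is 630237184 bytes long.
print(range_length("bytes=400000000-400000001", 630237184))
print(range_length("bytes=0-1", 630237184))
```

Both requests in this report should therefore produce a two-byte body; only the one with the small start offset did.
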
@mocchira
Member

mocchira commented Apr 9, 2015

@ZhaoX Thank you for your report.

This is a known restriction (we need to document it).
This issue can occur if you put a large object using multipart upload with a part size larger than LeoFS's chunk size.

The latest s3cmd's default part size is 15MB (https://github.com/s3tools/s3cmd/blob/44712cc2723e156e270beaa94b3bde44da2acfa1/S3/Config.py#L91) and LeoFS's default chunk size is 5MB, so this should happen with default settings.

For now, you can avoid this issue by setting multipart_chunk_size_mb in s3cmd's configuration to 5MB or less, so that it does not exceed LeoFS's chunk size.

We plan to fix this issue fundamentally in the next major update, 1.4.
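
A minimal sketch of the condition described above, assuming only what this comment states: the problem can appear when the client's multipart part size is larger than LeoFS's chunk size (5MB by default). The helper names `plan_parts` and `exceeds_chunk_size` are hypothetical and are not part of s3cmd or LeoFS.

```python
MB = 1024 * 1024


def plan_parts(object_size: int, part_size_mb: int) -> tuple:
    """Return (number of parts, size of the last part in bytes).

    Hypothetical helper: splits an object into fixed-size parts the way a
    multipart upload client would, with a shorter final part.
    """
    if object_size <= 0 or part_size_mb <= 0:
        raise ValueError("sizes must be positive")
    part_size = part_size_mb * MB
    count = -(-object_size // part_size)  # ceiling division on integers
    last = object_size - (count - 1) * part_size
    return count, last


def exceeds_chunk_size(part_size_mb: int, chunk_size_mb: int = 5) -> bool:
    """True when the part size is larger than the chunk size, which is the
    condition under which this restriction is said to apply."""
    return part_size_mb > chunk_size_mb


# The reported object (630237184 bytes) with s3cmd's default 15MB parts.
print(plan_parts(630237184, 15), exceeds_chunk_size(15))
# The same object with 5MB parts, matching LeoFS's default chunk size.
print(plan_parts(630237184, 5), exceeds_chunk_size(5))
```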

@ZhaoX
Author

ZhaoX commented Apr 9, 2015

Thank you for your reply.

@yosukehara yosukehara modified the milestones: 1.2.8, 1.4.0 Apr 13, 2015
@yosukehara
Member

@ZhaoX I've fixed this issue. We will include the fix in LeoFS v1.2.8 after testing retrieval of a part of an object.

@yosukehara
Member

You can easily build LeoFS v1.2.8-dev as follows, then check this issue.

@yosukehara
Member

Here is a brief method for checking this issue:

Check s3cmd's version

$ ./s3cmd --version
s3cmd version 1.5.2

Check this issue

Case-1 - Chunk size: 15MB:

## Check the file's attributes
$ ls -la ~/Downloads/leofs-1.2.7-1.dmg
-rw-r-----@ 1 yosuke yosuke 64789044 Mar  9 10:45 /Users/yosuke/Downloads/leofs-1.2.7-1.dmg

## PUT an object
$ ./s3cmd --multipart-chunk-size-mb=15 put ~/Downloads/leofs-1.2.7-1.dmg s3://test/
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 1 of 5, 15MB]
 15728640 of 15728640   100% in    0s    78.69 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 2 of 5, 15MB]
 15728640 of 15728640   100% in    0s    79.84 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 3 of 5, 15MB]
 15728640 of 15728640   100% in    0s    81.93 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 4 of 5, 15MB]
 15728640 of 15728640   100% in    0s    82.52 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 5 of 5, 1830kB]
 1874484 of 1874484   100% in    0s    65.87 MB/s  done

## GET a part of the object
$ curl --range 52428800-52428801 http://test.localhost:8080/leofs-1.2.7-1.dmg -o ./test-01-15M -v
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 127.0.0.1...
* Connected to test.localhost (127.0.0.1) port 8080 (#0)
> GET /leofs-1.2.7-1.dmg HTTP/1.1
> Range: bytes=52428800-52428801
> User-Agent: curl/7.37.1
> Host: test.localhost:8080
> Accept: */*
>
< HTTP/1.1 206 Partial Content
< transfer-encoding: chunked
< connection: keep-alive
< date: Mon, 13 Apr 2015 05:30:37 GMT
* Server LeoFS is not blacklisted
< server: LeoFS
< Content-Type: application/octet-stream
<
{ [data not shown]
100     2    0     2    0     0     87      0 --:--:-- --:--:-- --:--:--    90
* Connection #0 to host test.localhost left intact

Case-2 - Chunk size: 10MB:

## PUT an object
$ ./s3cmd --multipart-chunk-size-mb=10 put ~/Downloads/leofs-1.2.7-1.dmg s3://test/
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 1 of 7, 10MB]
 10485760 of 10485760   100% in    0s    75.92 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 2 of 7, 10MB]
 10485760 of 10485760   100% in    0s    78.28 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 3 of 7, 10MB]
 10485760 of 10485760   100% in    0s    80.33 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 4 of 7, 10MB]
 10485760 of 10485760   100% in    0s    77.94 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 5 of 7, 10MB]
 10485760 of 10485760   100% in    0s    76.99 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 6 of 7, 10MB]
 10485760 of 10485760   100% in    0s    76.44 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 7 of 7, 1830kB]
 1874484 of 1874484   100% in    0s    64.81 MB/s  done

## GET a part of the object
$ curl --range 52428800-52428801 http://test.localhost:8080/leofs-1.2.7-1.dmg -o ./test-01-10M -v
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 127.0.0.1...
* Connected to test.localhost (127.0.0.1) port 8080 (#0)
> GET /leofs-1.2.7-1.dmg HTTP/1.1
> Range: bytes=52428800-52428801
> User-Agent: curl/7.37.1
> Host: test.localhost:8080
> Accept: */*
>
< HTTP/1.1 206 Partial Content
< transfer-encoding: chunked
< connection: keep-alive
< date: Mon, 13 Apr 2015 05:31:40 GMT
* Server LeoFS is not blacklisted
< server: LeoFS
< Content-Type: application/octet-stream
<
{ [data not shown]
100     2    0     2    0     0    234      0 --:--:-- --:--:-- --:--:--   250
* Connection #0 to host test.localhost left intact

Case-3 - Chunk size: 5MB:

## PUT an object
$ ./s3cmd --multipart-chunk-size-mb=5 put ~/Downloads/leofs-1.2.7-1.dmg s3://test/
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 1 of 13, 5MB]
 5242880 of 5242880   100% in    0s    71.25 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 2 of 13, 5MB]
 5242880 of 5242880   100% in    0s    77.25 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 3 of 13, 5MB]
 5242880 of 5242880   100% in    0s    82.76 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 4 of 13, 5MB]
 5242880 of 5242880   100% in    0s    77.64 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 5 of 13, 5MB]
 5242880 of 5242880   100% in    0s    77.01 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 6 of 13, 5MB]
 5242880 of 5242880   100% in    0s    77.07 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 7 of 13, 5MB]
 5242880 of 5242880   100% in    0s    80.46 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 8 of 13, 5MB]
 5242880 of 5242880   100% in    0s    80.39 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 9 of 13, 5MB]
 5242880 of 5242880   100% in    0s    75.60 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 10 of 13, 5MB]
 5242880 of 5242880   100% in    0s    77.67 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 11 of 13, 5MB]
 5242880 of 5242880   100% in    0s    78.85 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 12 of 13, 5MB]
 5242880 of 5242880   100% in    0s    74.96 MB/s  done
/Users/yosuke.hara/Downloads/leofs-1.2.7-1.dmg -> s3://test/leofs-1.2.7-1.dmg  [part 13 of 13, 1830kB]
 1874484 of 1874484   100% in    0s    66.54 MB/s  done

## GET a part of the object
$ curl --range 52428800-52428801 http://test.localhost:8080/leofs-1.2.7-1.dmg -o ./test-01-5M -v
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 127.0.0.1...
* Connected to test.localhost (127.0.0.1) port 8080 (#0)
> GET /leofs-1.2.7-1.dmg HTTP/1.1
> Range: bytes=52428800-52428801
> User-Agent: curl/7.37.1
> Host: test.localhost:8080
> Accept: */*
>
< HTTP/1.1 206 Partial Content
< transfer-encoding: chunked
< connection: keep-alive
< date: Mon, 13 Apr 2015 05:31:00 GMT
* Server LeoFS is not blacklisted
< server: LeoFS
< Content-Type: application/octet-stream
<
{ [data not shown]
100     2    0     2    0     0    244      0 --:--:-- --:--:-- --:--:--   250
* Connection #0 to host test.localhost left intact

Compare the results with each other

$ diff test-01-5M  test-01-15M
$ diff test-01-10M test-01-5M
$ diff test-01-15M test-01-10M
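
The empty diff output shows that the three downloads agree with each other. To also confirm that they match the uploaded file, one could compare each download against the same byte range read from the local source. The following is a sketch under that assumption; `read_range` and `check_downloads` are hypothetical helper names, not part of any tool used in this thread.

```python
from pathlib import Path


def read_range(path, start: int, end: int) -> bytes:
    """Read the inclusive byte range [start, end] from a local file."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start + 1)


def check_downloads(source, start: int, end: int, downloads) -> dict:
    """Map each downloaded file to True if its full content equals the
    requested byte range of the local source file, else False."""
    expected = read_range(source, start, end)
    return {str(d): Path(d).read_bytes() == expected for d in downloads}
```

With the files from this comment, that would be a call such as `check_downloads("leofs-1.2.7-1.dmg", 52428800, 52428801, ["test-01-5M", "test-01-10M", "test-01-15M"])`, run from a directory containing the source file and the three downloads; every value in the returned dictionary should be `True`.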

@yosukehara
Member

Fixed this issue more correctly: leo-project/leo_gateway@919444e

@ZhaoX
Author

ZhaoX commented Jun 5, 2015

@yosukehara Thank you for sharing.
I found another related problem. I use aws-java-sdk-s3 to access LeoFS v1.2.10. When getting an object with a range, the client SDK throws an AmazonClientException.

S3Object s3Object = s3Client.getObject(new GetObjectRequest(ConfigProps.S3_BUKET, objectKey)
                .withRange(0, 3));

Exception in thread "main" com.amazonaws.AmazonClientException: More data read than expected: dataLength=4; expectedLength=0; includeSkipped=true; in.getClass()=class com.amazonaws.services.s3.AmazonS3Client$2; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0
    at com.amazonaws.util.LengthCheckInputStream.checkLength(LengthCheckInputStream.java:155)
    at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:110)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:73)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:154)
    at java.io.BufferedReader.readLine(BufferedReader.java:317)
    at java.io.BufferedReader.readLine(BufferedReader.java:382)
    at com.lenovo.leoss.client.Utils.displayTextInputStream(Utils.java:43)
    at com.lenovo.leoss.client.Main.main(Main.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)

It seems that aws-java-sdk-s3 needs a Content-Length header in the response, but LeoFS uses chunked transfer encoding.
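
To make the difference concrete, here is a small sketch of the length-related headers a 206 response with an explicit length would typically carry, next to a check for whether a response declares its length at all. Both function names are hypothetical, the total size of 100 bytes in the example is made up, and this is not how the AWS SDK or LeoFS is implemented; it only illustrates the Content-Range and Content-Length arithmetic for an inclusive byte range.

```python
from typing import Mapping, Optional


def partial_content_headers(start: int, end: int, total: int) -> dict:
    """Length-related headers for a 206 response covering the inclusive
    byte range [start, end] of an object that is `total` bytes long."""
    if not (0 <= start <= end < total):
        raise ValueError("range not satisfiable")
    return {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(end - start + 1),
    }


def declared_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Content-Length as an int, or None when the response does
    not declare one (for example, a chunked response)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("content-length")
    return int(value) if value is not None else None


# withRange(0, 3) asks for four bytes; the object size here is made up.
print(partial_content_headers(0, 3, 100))
# A chunked response like the ones in this thread declares no length.
print(declared_length({"transfer-encoding": "chunked", "server": "LeoFS"}))
```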

@yosukehara
Member

@ZhaoX Thank you for your report. I've filed this as a separate ticket: #376
