
Multipart uploads of large files produce partially corrupted data when the upload chunk size differs from the configured chunk size #190

Closed
Licenser opened this issue May 15, 2014 · 11 comments

@Licenser
Contributor

LeoFS is configured with 5MB chunks.

Upload a file with multipart upload, split into 1MB chunks:

  • Download the file with multipart download in 5MB chunks -> the file is corrupted
  • Download as one file -> the file is not corrupted

Upload a file with 5MB chunks:

  • Download the file with multipart download in 5MB chunks -> the file is fine
  • Download as one file -> the file is fine

Uploading:

[root@go-test ~/fifo_s3]# md5sum 4b6c9c1e-ab43-11e3-b6af-0799fb0203af
735b95a9640a67f7295a6665dbdee8d1  4b6c9c1e-ab43-11e3-b6af-0799fb0203af

[root@go-test ~/fifo_s3]# ./fifo_s3 -c 1 -b fifo-images -h ny4.storage.lucerahq.com put test 4b6c9c1e-ab43-11e3-b6af-0799fb0203af
Done

[root@go-test ~/fifo_s3]# ./fifo_s3 -s 1048576 -c 1 -b fifo-images -h ny4.storage.lucerahq.com put test-1M 4b6c9c1e-ab43-11e3-b6af-0799fb0203af
Done

Checking md5:

[root@go-test ~/fifo_s3]# ./fifo_s3 -s 1048576 -b fifo-images -h ny4.storage.lucerahq.com md5 test-1M
MD5 Sum: 5d05ec45a8788e940b3245bd5c196901

[root@go-test ~/fifo_s3]# ./fifo_s3 -b fifo-images -h ny4.storage.lucerahq.com md5 test-1M
MD5 Sum: 5d05ec45a8788e940b3245bd5c196901

[root@go-test ~/fifo_s3]# ./fifo_s3 -s 1048576 -b fifo-images -h ny4.storage.lucerahq.com md5 test
MD5 Sum: 735b95a9640a67f7295a6665dbdee8d1

[root@go-test ~/fifo_s3]# ./fifo_s3 -b fifo-images -h ny4.storage.lucerahq.com md5 test
MD5 Sum: 735b95a9640a67f7295a6665dbdee8d1

Cross-check with s3cmd (s3cmd does not use multipart):

[root@go-test ~/fifo_s3]# s3cmd get s3://fifo-images/test --force
s3://fifo-images/test -> ./test  [1 of 1]
 227597244 of 227597244   100% in    2s    90.74 MB/s  done
WARNING: MD5 signatures do not match: computed=735b95a9640a67f7295a6665dbdee8d1, received="cc4525ef9f8e8fe1f7304f383adf3086"

[root@go-test ~/fifo_s3]# s3cmd get s3://fifo-images/test-1M --force
s3://fifo-images/test-1M -> ./test-1M  [1 of 1]
 227597244 of 227597244   100% in    1s   125.88 MB/s  done
WARNING: MD5 signatures do not match: computed=735b95a9640a67f7295a6665dbdee8d1, received="f0dea482c247937112a1890f8265c6a6"

[1] fifo_s3: https://github.com/project-fifo/fifo_s3

@Licenser
Contributor Author

A possible solution would be for LeoFS to refuse multipart uploads whose part size does not match its configured chunk size.
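
In the meantime, a possible client-side workaround, sketched here with aws-cli (the endpoint and file names are reused from the examples above; this assumes an S3-compatible client that lets you pin the part size), is to align the multipart part size with LeoFS's 5MB chunk size, which is the combination shown to work above:

$ aws configure set default.s3.multipart_chunksize 5MB
$ aws --endpoint-url http://ny4.storage.lucerahq.com s3 cp ./4b6c9c1e-ab43-11e3-b6af-0799fb0203af s3://fifo-images/test

With fifo_s3 itself, passing -s 5242880 (or omitting -s, as in the working upload above) keeps the parts at 5MB.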

@yosukehara
Member

Thank you for your report. We'll check this issue.

Related issue: #177

@yosukehara
Member

We're planning to fix this issue in v1.2, as we first need to improve the internal architecture.
Please wait.

@yosukehara yosukehara added this to the 1.2 milestone Sep 10, 2014
@yosukehara yosukehara self-assigned this Sep 10, 2014
@mocchira mocchira modified the milestones: 1.4, 1.2 Oct 7, 2014
@windkit
Contributor

windkit commented Jun 24, 2015

This issue is related to range GET queries (issues #376, #382).
The bugs are fixed on the develop branch.

File MD5 (50MB generated from /dev/urandom)

/tmp/fifo_s3$ md5sum testfile
3cd55bbe40389beac4d493d3f91cd687  testfile
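
For reference, a sketch of how such a test file can be generated (the exact command is an assumption; the comment above only states 50MB from /dev/urandom):

/tmp/fifo_s3$ dd if=/dev/urandom of=testfile bs=1M count=50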

Multipart upload (1MB and 5MB part sizes)

/tmp/fifo_s3$ ./fifo_s3 -c 1 -s 1048576 -b test -h localhost -p 8443 put test-1M2 testfile
Done
/tmp/fifo_s3$ ./fifo_s3 -c 1 -s 5242880 -b test -h localhost -p 8443 put test-1M3 testfile
Done

/tmp/fifo_s3$ ./fifo_s3 -b test -h localhost -p 8443 md5 test-1M2
MD5 Sum: 3cd55bbe40389beac4d493d3f91cd687
/tmp/fifo_s3$ ./fifo_s3 -b test -h localhost -p 8443 md5 test-1M3
MD5 Sum: 3cd55bbe40389beac4d493d3f91cd687

OT:
The part size should be at least 5MB according to the Amazon S3 documentation:
http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html
But I guess there is no harm in supporting smaller part sizes.

@shooding

shooding commented Jul 3, 2015

Hi, what is the maximum number of parts in an upload job?

@mocchira
Member

mocchira commented Jul 3, 2015

@shooding you can configure the maximum number of parts in leo_gateway.conf, as described below:
http://leo-project.net/leofs/docs/configuration/configuration_3.html?highlight=max_chunked_objs
So there is no hard limit.
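
For illustration, a minimal leo_gateway.conf sketch (the key names follow the linked configuration page; the values are examples, not necessarily the defaults):

## maximum number of chunks per large object
large_object.max_chunked_objs = 1000
## length of one internal chunk in bytes (5MB here)
large_object.chunked_obj_len = 5242880

The effective maximum object size is then max_chunked_objs * chunked_obj_len, e.g. 1000 * 5MB = 5GB in this sketch.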

@windkit
Contributor

windkit commented Jan 6, 2016

Related report from the Google Group:

https://groups.google.com/forum/#!topic/leoproject_leofs/k7jAppwuovs

@yosukehara
Member

From the ML thread, I recognized that we still have an issue with retrieving high byte ranges of an object, as shown below:

$ ./leofs-adm whereis kavi_upload/test2/data
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------------------
 del?  |           node           |             ring address             |    size    |   checksum   |  # of chunks   |     clock      |             when            
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------------------
       | storage_0@127.0.0.1      | 457624893fec7fa2aa3caab95b80342b     |     60909K |   9a5443928f |              5 | 526e9abb8d4b0  | 2015-12-15 11:42:13 +0550

### OK-1:
$ curl --range 1-60909000 http://kavi_upload.localhost:8080/test2/data > /tmp/down1 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 58.0M  100 58.0M    0     0   182M      0 --:--:-- --:--:-- --:--:--  183M

### OK-2:
$ curl --range 11000000-20909000 http://kavi_upload.localhost:8080/test2/data > /tmp/down1 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 9676k  100 9676k    0     0   100M      0 --:--:-- --:--:-- --:--:--  100M


### NG-1:
$ curl --range 41000000-51000000 http://kavi_upload.localhost:8080/test2/data > /tmp/down1 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0 9765k    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
curl: (18) transfer closed with 10000001 bytes remaining to read

### NG-2:
$ curl --range 41000000-60909000 http://kavi_upload.localhost:8080/test2/data > /tmp/down1 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0 18.9M    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
curl: (18) transfer closed with 19909001 bytes remaining to read
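
For anyone reproducing this, a verification sketch (the local source path is assumed, and iflag=skip_bytes,count_bytes requires GNU dd): cut the same byte range out of the original file and compare checksums. Note that the HTTP range 41000000-51000000 is inclusive, i.e. 10000001 bytes, which matches the byte count in the curl error for NG-1.

$ curl -s --range 41000000-51000000 http://kavi_upload.localhost:8080/test2/data > /tmp/range.remote
$ dd if=./data.orig of=/tmp/range.local bs=4M skip=41000000 count=10000001 iflag=skip_bytes,count_bytes
$ md5sum /tmp/range.remote /tmp/range.local

The two sums should match once the range handling is fixed.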

@yosukehara
Member

I've checked this issue with this script, but I could not reproduce the same situation. It seems the versions of Leo's libraries are not correct.

$ leofs-adm status
 [System Confiuration]
-----------------------------------+----------
 Item                              | Value
-----------------------------------+----------
 Basic/Consistency level
-----------------------------------+----------
                    system version | 1.2.18
                        cluster Id | leofs_1
                             DC Id | dc_1
                    Total replicas | 2
          number of successes of R | 1
          number of successes of W | 1
          number of successes of D | 1
 number of rack-awareness replicas | 0
                         ring size | 2^128
-----------------------------------+----------
 Multi DC replication settings
-----------------------------------+----------
        max number of joinable DCs | 2
           number of replicas a DC | 1
-----------------------------------+----------
 Manager RING hash
-----------------------------------+----------
                 current ring-hash | 3923d007
                previous ring-hash | 3923d007
-----------------------------------+----------

 [State of Node(s)]
-------+--------------------------+--------------+----------------+----------------+----------------------------
 type  |           node           |    state     |  current ring  |   prev ring    |          updated at
-------+--------------------------+--------------+----------------+----------------+----------------------------
  S    | storage_0@127.0.0.1      | running      | 3923d007       | 3923d007       | 2016-01-06 13:22:23 +0900
  S    | storage_1@127.0.0.1      | running      | 3923d007       | 3923d007       | 2016-01-06 13:22:23 +0900
  S    | storage_2@127.0.0.1      | running      | 3923d007       | 3923d007       | 2016-01-06 13:22:23 +0900
  S    | storage_3@127.0.0.1      | running      | 3923d007       | 3923d007       | 2016-01-06 13:22:22 +0900
-------+--------------------------+--------------+----------------+----------------+----------------------------

$ leofs-adm whereis "test/test.file"
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------------------
 del?  |           node           |             ring address             |    size    |   checksum   |  # of chunks   |     clock      |             when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------------------
       | storage_2@127.0.0.1      | cd50d0ec8a32f5f3457d4c96c4f595c5     |     63477K |   7a465a5e92 |              5 | 528a34cc0605d  | 2016-01-06 14:05:16 +0900
       | storage_3@127.0.0.1      | cd50d0ec8a32f5f3457d4c96c4f595c5     |     63477K |   7a465a5e92 |              5 | 528a34cc0605d  | 2016-01-06 14:05:16 +0900

@mocchira
Member

mocchira commented Jan 6, 2016

To make sure we cover the case reported by Vansh,
I've added some test cases:
leo-project/leofs_client_tests@b7f1ca7
(range GETs over the last ranges of a large, multipart-uploaded object)
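
For context, a sketch of the kind of request those tests exercise (the object name is reused from the example above; the range value is illustrative): a suffix range that asks only for the tail of a multipart-uploaded object.

$ curl -s --range -1048576 http://kavi_upload.localhost:8080/test2/data | wc -c

The printed byte count should be 1048576 once the fix is in place.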

@yosukehara
Member

We've checked this issue with the current development version, 1.2.18-dev.
In conclusion, we could not reproduce the issue, so I've closed it.

@yosukehara yosukehara modified the milestones: 1.2.18, 1.4.0 Jan 6, 2016