AWS S3 sync does not sync all the files #3273
To help in reproducing, could you get me some information about one of the files that isn't syncing?
Example command run: several files are missing after the sync. There are several cases like that in a list of around 50,000 files; all of the missing files have timestamps at various times on 20 Jun 2017. Using --exact-timestamps shows many more files to download even though their contents are identical, and it still misses the files in the example above.
Same issue here. dist/index.html and s3://bucket/index.html have the same file size, but their modification times differ. Sometimes awscli uploads the file, sometimes it does not.
Same here.
We experienced this issue as well today/last week. Again, index.html is the same file size, but the contents and modified times are different.
Is anybody aware of a workaround for this?
I just ran into this. Same problem as reported by @icymind and @samdammers: the contents of my (local)
Server: ... After ... Example of file ... Had to use ...
Same issue here: an XML file of the same size but a different timestamp is not synced correctly. I was able to reproduce this issue (bug.tar.gz): you'll see that even though repomd.xml in directories a and b differs in contents and timestamps, it is not synced. Tested on ...
I'm seeing the same issue: trying to sync a directory of files from S3, where one file was updated, to a local directory; that file does not get updated in the local directory.
I'm seeing this too. In my case it's a React app with an index.html that refers to generated .js files. I'm syncing them with the --delete option to delete old files which are no longer referred to. The index.html is sometimes not uploaded, resulting in an old index.html that points to .js files which no longer exist; hence my website stops working! I'm currently clueless as to why this is happening. Does anyone have any ideas or workarounds?
We have the same problem, but we just found a workaround. I know it is not the best way, but it works: it seems to us that the copy operation works fine, so first we copy everything, and after that we use the sync command with --delete to remove files which are no longer present.
I added ...
We've met this problem, and ...
I'm seeing this issue, and it's very obvious because each call only has to copy a handful (under a dozen) of files. The situation in which it happens is just as reported above: if the folder being ... We ended up changing our scripts to ...
I saw this as well with an HTML file.
I copy-pasted the ...
Just ran into this issue with build artifacts being uploaded to a bucket. Our HTML tended to change only the hash codes in asset links, so the size was always the same. S3 sync was skipping these files if the build ran too soon after a previous one. Example: 10:01 - Build 1 runs; Build 2 later has HTML files with a timestamp of 10:05; however, the HTML files uploaded to S3 by build 1 have a timestamp of 10:06, as that is when the objects were created. This results in them being ignored by s3 sync, as the remote files are "newer" than the local files. I'm now using ... Hope this might be helpful to someone.
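The build race described in the comment above can be sketched with a simplified model of sync's change detection. This is only a rough illustration of the behavior reported in this thread, not the actual aws-cli source; the function name and parameters are mine:

```python
from datetime import datetime, timedelta

def should_sync(local_size, local_mtime, remote_size, remote_mtime,
                exact_timestamps=False):
    """Rough model of `aws s3 sync` change detection, as described in
    this thread. Real aws-cli logic differs in details and direction."""
    if local_size != remote_size:
        return True  # a size change always triggers a copy
    if exact_timestamps:
        # with --exact-timestamps, any timestamp difference counts
        return local_mtime != remote_mtime
    # default: same size is skipped unless the source is strictly newer
    return local_mtime > remote_mtime

# The race above: build 2's HTML (10:05) vs the object build 1 created
# at 10:06 -- same size, remote is "newer", so the file is skipped.
t = datetime(2018, 5, 1, 10, 5)
print(should_sync(1024, t, 1024, t + timedelta(minutes=1)))        # False: skipped
print(should_sync(1024, t, 1024, t + timedelta(minutes=1), True))  # True: copied
```

Under this model, --exact-timestamps works around the race only because it turns "source must be strictly newer" into "timestamps must match exactly", which is consistent with several reports above that the flag helps in some cases but not all.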
I had the same issue earlier this week; I was not using ...
The same here: files with the same name but different timestamps and contents are not synced from S3 to local, and --delete does not help.
We experience the same issue. An index.html with the same size but a newer timestamp is not copied. This issue was reported over a year ago; why is it not fixed? It effectively makes the sync command useless.
--exact-timestamps fixed the issue
I am also affected by this issue. I added --exact-timestamps, and it seemed to fix the files I was looking at, though I have not done an exhaustive search. I have on the order of 100k files and 20 GB, a lot less than the others here.
I have faced the same issue, ...
I had this issue when building a site with Hugo and I finally figured it out. I use submodules for my Hugo theme and was not pulling them down on CI. This was causing warnings in Hugo but not failures.
Once I updated the submodules everything worked as expected. |
We've also been affected by this issue, so much so that a platform went down for ~18 hours after a new ... This is a very strange problem, and I can't believe the issue has been open for this long. Why would a sync not use hashes instead of last-modified times? It makes no sense. For future Googlers, a redacted error I was getting: ...
I'm experiencing the issue as well. Is there any plan for a fix?
This just tripped us up as well with our build system when deploying to S3 between various feature branches. Some of them were built at different times, so of course the timestamps were different (and sometimes older). Our quick (albeit kinda gross, because it negates the large benefits of ...
AWS, please help us. We shouldn't have to resort to such workarounds. The fact that the ...
Totally support this request!
Friends, use https://github.com/peak/s5cmd instead; it's much faster too.
Using --exact-timestamps doesn't always solve the problem, for some reason. We still occasionally have deploys that go bad. Looking into switching to rclone or s5cmd.
We are having this issue with a Snowball device and have been billed thousands in overages trying to compensate for the sync command not functioning properly. This is basic stuff...
Yikes! I think I'm seeing this issue now. I'm ... When ... The publish app instances did not seem to have problems; they were also ... It seems to work, but it also seems like it might be slower? One great thing about ... What a WASTE of time, resources, and money. I wonder if redoing the ...
I'm seeing this with Elastic Beanstalk deploys to empty folders such as ./images and ./media: just the odd new file not coming across. But we can see the files in S3, and a manual CLI call from the instance does fetch them. It is almost as if the read of S3 is cached and has not been flushed on the update of a new file, or that flush is delayed. Our workaround is a cron job.
I experienced an issue where ... So, in summary: check the KMS keys for any objects ignored by ...
Ran into this trying to set up a custom build cache for our React app. The issue was solved by using ... Details: we had no problem deploying new builds of our app, but hit this bug when trying to redeploy old cached versions, even with ... i.e. the following command didn't work! In particular, ...
Possibly noteworthy is that ...
@abaaslx The fact that some of your files are not being transferred to S3 won't be fixed by that option. This issue is the one we are all seeing in this thread, and AFAIK there has been no indication of a root cause or a solution.
This should hopefully solve the issue of `aws s3 sync` saying it uploaded a file but not actually doing it. See aws/aws-cli#3273 for more details.
Is this an issue or not? I'd appreciate your insight on why --exact-timestamps won't work. I would like to get a solution here, as this is preventing me from doing deployments (numerous files are not uploading). Is there anything we can provide to help the maintainers work towards a solution?
Still occurring, same problem. Also important to mention: I'm pretty sure that the one important thing that ...
Still an issue for us in 2023.
Having the same issue as well, running the AWS CLI on Windows: files updated in the S3 bucket are not downloaded locally.
There is some issue with aws sync: aws/aws-cli#3273. We use aws cp instead.
Sorry for clogging this thread, but I'll try to sum up what I've seen in my case and what I have read in this issue discussion. What we know for sure (mentioned several times in the thread, but worth repeating): ...
What we are still trying to find out: there is a solid-looking attempt at explaining the issue, and it would be great if it were true; unfortunately, we have evidence that proves otherwise.
That is a true statement, and it could sometimes lead to the discussed issue; for example, when a newer CI build finished before the previous build's results were uploaded to S3.
And if this were true, it would explain a lot. But it isn't: in the case I observed, my local file's timestamp was 6 minutes newer than the file's timestamp in S3. We have a script that demonstrates that ... Another observation is that this issue may occur when the local file is newer than the S3 copy by less than 1 day, which suggests the root cause may be related to timezones and the difference between ...
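The timezone hypothesis above can be illustrated with a small sketch. This is purely speculative, one possible failure mode rather than a confirmed root cause; the UTC-5 offset and variable names are assumptions. S3 reports LastModified in UTC, so a comparison that strips the zone from the local mtime can make a genuinely newer local file look older by up to the UTC offset, which matches the "less than 1 day" window:

```python
from datetime import datetime, timezone, timedelta

# Remote object's LastModified, as S3 reports it (UTC):
remote_utc = datetime(2023, 3, 1, 12, 0, tzinfo=timezone.utc)

# Local file modified 6 minutes *after* the object, on a machine in a
# hypothetical UTC-5 zone: 12:06 UTC displays as 07:06 local wall time.
local_wall = datetime(2023, 3, 1, 7, 6)

# A naive comparison (zone information dropped) misorders the two:
looks_older = local_wall < remote_utc.replace(tzinfo=None)  # local "older" by ~5h

# Attaching the real offset shows the local file is actually 6 min newer:
local_aware = local_wall.replace(tzinfo=timezone(timedelta(hours=-5)))
truly_newer = local_aware > remote_utc
print(looks_older, truly_newer)
```

If sync's "is the source newer?" check ever saw the naive values, the local file here would be skipped even though it is the newer one, which is exactly the symptom described in the comment above.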
Just hit this issue as well and can confirm everything said. It was an HTML file, the same size as the one in S3; using ...
I just got bit by this too. In my case I was doing ... Adding ...
Is this still an issue with v2? This issue has been open for five and a half years and seems like a serious flaw.
It just occurred here with v2. Touching the target files on the local disk still helps.
+1
We removed our ...
We have several hundred thousand files, and S3 reliably syncs most of them. However, we have noticed several files that were changed about a year ago; they are different, but they do not sync or update.
The source and destination timestamps also differ, but the sync never happens. S3 has the more recent file.
The command is as follows:
aws s3 sync s3://source /local-folder --delete
All the files that do not sync have the same date but are spread across multiple different folders.
Is there an S3 touch command to change the timestamp and possibly get the files to sync again?
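On the closing question: there is no built-in `aws s3 touch`, but a common trick is to copy an object onto itself while replacing its metadata, which creates a fresh LastModified timestamp. A minimal boto3-style sketch (the helper name is mine, and the client is passed in rather than constructed, so any object with a boto3-style `copy_object` method works):

```python
def s3_touch(s3_client, bucket, key):
    """'Touch' an S3 object by copying it onto itself.

    S3 rejects a plain self-copy unless something changes, so REPLACE-ing
    the metadata forces a rewrite with a new LastModified timestamp.
    `s3_client` is anything exposing the boto3-style copy_object method,
    e.g. boto3.client("s3").
    """
    s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        MetadataDirective="REPLACE",
    )
```

Note that this rewrites the whole object server-side (and drops any user metadata unless you pass it back in), so for the scenario in this issue it only nudges the timestamp comparison; it does not address the underlying sync behavior discussed above.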