-
Notifications
You must be signed in to change notification settings - Fork 28
UploadedFile::store() uses md5_file() to generate a file name #161
Comments
It appears that the function is So i think this might be alright |
I agree that an md5 hash is not the best solution.
What if you upload the same file to two records, and then delete one of the records? Presumably you'd have some sort of trigger that also deletes the file. Now the other record will be linking to a non-existent file 😕
|
@alexbowers No... it returns the hash of the file contents. I think you're reading it wrong. "Calculates the MD5 hash of the file specified by the filename parameter" (from http://php.net/manual/en/function.md5-file.php) So it means that the file is specified by the filename parameter, not that it calculates the MD5 hash of the filename. |
Ah, i misunderstood your question at the top. |
@JosephSilber I guess that your code will have to check in the trigger function for if there is another reference to the file. If there isn't, then remove the file. |
Of course there are workarounds. I'm just saying its unexpected.
|
I think using tempnam() could solve this problem |
@Skysplit no that will not work on cloud storage disks like S3 |
@jeroennoten Ah, you're right. I forgot that there could be different storage drivers. What about appending |
That will not generate a unique name. Better use something like |
IIRC MD5 also has problems with conflicts that can be generated by appending well-crafted bytes to a known file. I guess that could be abused to overwrite an existing file with a malicious version.. M |
I have multiple projects that involve content uploads to cloud , in the end there is no way to generate unique name without actually checking for existence first(from the file system or db). uniqid is good base but it's not 100% guarantee to uniqueness |
@boynet You're completely right. |
With this way, we should never remove the files from the disks because their are linked with more than once. GUID seems a approach, no? |
As @franzliedke, there can be problems with collisions, but php also has a sha1_file, which might be more suitable collision-wise. So I propose that it's either:
It could even be possible for the If interested I'd gladly create a PR to illustrate the implementation. |
If the same file is uploaded twice, then also So the solution should be a generated ID that is 'as unique as possible'. |
@jeroennoten Yes, collisions between two files with the same contents are intended, of course. The problem with MD5 is that there are algorithms that allow you to generate files with different content that collide with an existing file. |
Yeah that's right, but we want uniqueness for files with the same content as well. |
This is now fixed and merged in PR 16193 |
@jeroennoten it's not totally solved just using str_random(32)) still good chance of duplicate... in conclusion for any future readers: if you need more than basic usage don't use this method |
@boynet Have you calculated possibility to have the same filename using |
@shukshin-ivan Are you arguing to use str_random, which has already been merged in laravel/framework#16193, or are you arguing the chance of any collisions? |
The chance of collision in file names of two files in one folder. |
I believe this bug should be reopened. To summarize the requirements:
Simply using a UUID is the best bet. It meets the requirements above, is a known unique key standard, and is supported by other database / key indexing software systems. |
Here is some working example code. But I'm not sure if using the default Storage:: facade is the appropriate thing to do. What if the user wants to use a specific driver?
This example uses the laravel uuid package.
I saw that Faker has a UUID but I wasn't sure if those were valid or just a repeating set. |
@dereks I see no problem using the Storage facade. The way you uses it will make it default to the default disk, but you could as well send in a disk name into the function. (Or perhaps even a Filesystem instance.)
However, when it comes to the actual application logic, then I see a race condition between calling exists and actually writing to the file. It's minimal with the use of uuid v4, but is probably more prominent with other ways to generate a filename. I would also like to add another important thing; reproducability for testing. I presume you use ramsey/uuid which allow me to use a custom uuid factory to make sure I get reproducible values. I would like that the final implementation to have this functionality too. |
Laravel uses |
A great new addition to Laravel 5.3 is the
UploadedFile::store()
, which also generates a filename. The generated filename is the MD5 hash of the contents of the file. So if the same file is uploaded twice, it will overwrite the first one (correct me if I'm wrong). This could be seen as a feature (it saves storage), but I think it is not the right thing to do (imagine modifying the file without changing the name). I would propose something like uniqid() to generate the filename.The text was updated successfully, but these errors were encountered: