Skip to content
This repository has been archived by the owner on May 22, 2021. It is now read-only.

Is it necessary that the SHA256 signature and name of the file are sent to the server in plaintext? #417

Closed
jepler opened this issue Aug 3, 2017 · 12 comments
Labels

Comments

@jepler
Copy link

jepler commented Aug 3, 2017

I notice that the SHA256 signature (as aad) and client-side name (as filename) of the uploaded file are transmitted to the server and logged there. Is this necessary in order to allow send to work properly? If not necessary, it feels like this information should not be transmitted or logged.

Tested at v1.0.3.

{
  "Timestamp": 1501{redacted},
  "Logger": "FirefoxSend",
  "Type": "send.server.meta",
  "Severity": 6,
  "Pid": {redacted},
  "EnvVersion": "2.0",
  "Fields": {
    "aad": "d8db2{redacted}",
    "id": "9829e{redacted}",
    "filename": "{redacted}.png",
    "delete": "d984f{redacted}"
  }
}
@ehuggett
Copy link
Contributor

ehuggett commented Aug 4, 2017

@jepler
Copy link
Author

jepler commented Aug 4, 2017

@ehuggett that page does not mention that the filename and cryptographic signature of the file are gathered.

I notice that at present, https://testpilot.firefox.com/experiments/send states that

When you use Send, Mozilla receives an encrypted copy of the file you upload, and basic information about the file, such as filename and file size.

so at least the collection of the filename is disclosed.

@ehuggett
Copy link
Contributor

ehuggett commented Aug 4, 2017

Neither it does, my Apologies. I hope the information about the other metrics being collected was still of interest.

This looks like an omission from the metrics file to me?

I did initially write a comment in support of your sentiment that the collection of filenames does not appear to have an obvious (documented) justification but i wasn't satisfied with the "tone" of what i had written so i didn't include it.

To expand on the problem slightly with hopefully better wording

Knowing either of the file hash or filename would allow the server operator to find a copy of the file hosted elsewhere in some cases. I don't believe it is relevant that this implies the contents of the file may not be unique/confidential (if it can be found with a search engine/etc) as the sensitive information may be the fact that its being transferred (ie a generic document about a specific medical condition).

@pdehaan
Copy link
Collaborator

pdehaan commented Aug 4, 2017

Ref: #69 "Encrypt filename"
Ref: #293 "Consider padding files out to some consistent size"

@dannycoates
Copy link
Contributor

dannycoates commented Aug 4, 2017

There's a pull request to update the privacy section on the Test Pilot site with the file hash mozilla/testpilot#2705

Logging of the hash on the server was removed in 8cfb459 and will be part of our release next week.

With the current functionality of the site it isn't strictly necessary to send the file hash in plain text, however we want to be able to test features that require the hash of the file. One specifically is to check uploads against a malware database.

@rugk
Copy link

rugk commented Aug 4, 2017

Logging of the hash on the server was removed in 8cfb459 and will be part of our release next week.

So then also do not sent the hash of the file to the server. It is somehow ridiculous to encrypt the file, but send the hash in plain text. The hash is more than enough to identify a file, so please do not leak it.

@ehuggett
Copy link
Contributor

ehuggett commented Aug 4, 2017

direct and straight to the point rugk, having the client calculate the checksum and send it without the users permission because the server can't do it itself is fairly amusing!

I did see the malware database suggestion and assumed it was a bad example, as it would have to do something like this issue is about. But i guess being regarded as a 'risky' place to download a file from won't help adoption, and possibly prevent it, even if you think it should be the users responsibility (their own AV / common sense about links from strangers etc).

Perhaps a fair compromise would be to ask the user before the client sends a unique identifier (the hash) for the file(s) anywhere? If the question has a short warning that summarises the implications, namely it can be equivalent to sending the entire file, with a 'read more' link then i would think that could be considered informed consent?

edit: I would prefer for it to be as easy as possible to download (and upload, if your going to do that too) without the check being done, but that might not work for everyone. Letting users opt into a check ideally should not also give them a false sense of security about the file they are receiving.

@rugk
Copy link

rugk commented Aug 4, 2017

Asking just complicates the thing and we (and Mozilla) want privacy-by-default, so no… just don't do it. There is not really the idea of including any AV into Firefox Send? Next step would be USB cables getting an AV… This service is just for transferring a file and thus should not know anything/look into the files. Nobody wants an AV here, that's just privacy invasion and you know… the next "AV feature" is a "porn filter", "bad speech filter", … Don't make it possible to use the service for censoring.

Otherwise you do not need to encrypt the data at all. Period.

@ehuggett
Copy link
Contributor

ehuggett commented Aug 4, 2017

@dannycoates I believe there was a misinterpretation. When you say the server does not log the file hash you did not say it will not store it at all?

The http response headers for the request that fetches the encrypted file also contain a "X-File-Metadata" header with the SHA256 hash of the file (reworded, you can discover the SHA256 hash for any hosted file if you know its file_id and don't mind deleting it?) and it appears to be used to check the decrypted file "has not been tapered with".

I'm also a little confused, could someone clarify this for me? AES-GCM is being used for encryption and the documentation states this is an "authenticated" encryption mode, does this not make the check with the SHA256 hash superfluous or is there something about the way its being used here that makes the extra (time/battery consuming) SHA256 hashing necessary to ensure the file has not (been) changed?

@rugk
Copy link

rugk commented Aug 5, 2017

(reworded, you can discover the SHA256 hash for any hosted file if you know its file_id and don't mind deleting it?)

😱

does this not make the check with the SHA256 hash superfluous

It is. Unless the server replaces the whole content of the file maybe… but it has to know the decryption key for this, so I have no idea why they do this.
And if they really want to check the SHA256 sum, they could still pass it in the share link/URL or so…

@rugk
Copy link

rugk commented Aug 7, 2017

Just FYI – if you care – or at least the marketing department. In Germany you are getting bad media coverage, because of 1) Google Analytics and 2) this issue here (the article even links to this issue), respectively #69.

@dannycoates
Copy link
Contributor

Thank you all for your comments. The file hash has been removed completely in #470. Any further discussion about the file name can happen in #69

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

5 participants