-
Notifications
You must be signed in to change notification settings - Fork 442
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
snprintf instead of unsafe sprintf #1586
Comments
Such warnings are usually bogus and counter-productive. Htslib and samtools are subject to very rigorous testing, including continuous fuzz testing to look for buffer overflows. We have received reports before of "bugs" due to use of sprintf that are nothing more than false positives by over-zealous security scanning tools. There are numerous places to change, and to date we haven't deemed it necessary to fix non-security bugs just for the sake of silencing some (IMO broken) third-party tools. Eg this code: https://github.com/samtools/htslib/blob/develop/sam.c#L5393-L5395
Yes we could change it to snprintf instead, but should we blindly change all such uses? What other impact will it have? Speed maybe due to additional checks? (Although for performant things we usually use kstring instead.) It's clearly wrong to label this a security issue. (Edit: accepted though if the world ever switches to a CPU where "int" is 64-bit by default, then such code would need fixing. I think this is a long way off, if ever, and there will be many many greater problems to fix first.) |
I'm afraid that not only R's behavior can be an issue. The deprecation of I have no knowledge and cannot comment on speed or other potential impacts or security issues. |
Ugh, that's "courageous" of them! While they may wish to deprecate it, in doing so they are departing from the C standards which have not deprecated these functions. I assume they also include strcpy too? My concern is blindly changing one function for another function doesn't in itself lead to more secure code. While it's safer in general, it's not foolproof and it still punts the onus on setting the correct length to the programmer. If you trust the programmer, you maybe can also trust they know how to allocate memory correctly for sprintf too. Additionally every code change comes with risks of new bugs introduced, so change for the sake of change potentially has a higher security risk than doing nothing. That said, if this is just the norm now, then grrr! Maybe we will indeed have to do something (even if it's just turning off warnings on MacOS!). Sigh |
Out of interest, how are you compiling htslib? What compiler for example? We're doing CI on ARM Ventura macos and it's building without warnings. I'm wondering if this warning only exists when using a C++ compiler or similar. See https://cirrus-ci.com/task/5540763010859008?logs=compile#L95 for our logs. Edit: even explicitly asking for xcode 14 doesn't trigger these: https://cirrus-ci.com/task/5232657760518144 |
In R, I am linking to its prepackaged version, The warnings I complain about are R warnings during routine check at CRAN's and Bioconductor's CI. Even though I use recent Mac (and Linux), I don't see any compiler warnings myself (yet). However, since other projects are substituting Here's how |
Ah, I see this:
So basically it's doing a symbol table scan on the object files, and this has nothing specifically to do with MacOS deprecation by the looks of it. I think that's maybe a red herring here. Removing some of the exit/abort calls is on our todo list, but mostly they're deprecated APIs which we cannot change (as they don't have any way to return error codes and we have newer interfaces which people should use instead), or they're cases which are programming checks rather than user-input checks, so nothing the user can do could trigger these unless we've code a huge coding error, in which case all bets are off at that point. Stdout/stderr will be diagnostics, some of it is logging which can be increased with verbosity, but some is indeed problematic as we ought to have a better way of returning errors as integers with a hts_strerror type interface to get string forms. That mostly hasn't been written yet. However it still won't silence R warnings here as we're always going to have the debug symbols there for when people want to diagnose problems. Silencing just the sprintf cases doesn't seem like a priority in the grand scheme of things, given all the other warnings will still be there and the R build logs will continue to flag as "WARNINGS". It's unrealistic to change everything that triggers R complaints, and doing so would require a new incompatible ABI and numerous updates to other tools. |
Just a note: it is only the |
It looks like there's only 10 instances of |
Do you have a list of functions that gets scanned for? Do we need to change all strcpy to strncpy too for example (equally capable of having buffer overflows as sprintf)? |
I'd already done this search and found 34, although 14 are in test. |
You should know that For another data point, |
I know But if it's just sprintf and vsprintf then it's doable. I still argue though that it's pointless effort! A tick box exercise for people who like to be seen to do something. There are millions of way code can be broken. sprintf can trivially be secure (unlike gets which can never). But I guess I'm just swimming against the tide on this one. Edit: on the bright side, it looks like snprintf only adds 2-3% extra CPU overhead, so it's not a concern for heavily used things like kputd (and the additional |
Do you have a link or other citation to Apple documentation describing such a policy? |
Can't find any proper description or discussion from Apple (just this), only changes in open source projects. There are threads with complaints that Xcode 14 gives an error for OSX developers, Google finds them too. Issue to give the same error for I guess, when Apple changes something they don't really discuss it with the community. People then just have to decide if they want to drop OSX support or adjust accordingly... |
Thanks for accepting this issue.
Would C++14 attributes work instead?
|
We can't use C++14 in our C code. However we already have If this is needed in samtools and/or bcftools they ought to have their own issues too, as we'd probably not do them all in a single step and at the very least it'll obviously be different commits. I don't see the need for samtools/bcftools though as they're not a library and can't be wrapped up by R / Bioconductor can they? |
Not sure why, but R package contains some |
The R package's samtools code is there because they included a few functions they considered generally useful from the samtools code (rather than raising HTSlib issues to discuss adding such functions or equivalents to the HTSlib library) and they resurrected the legacy Samtools API, both presumably for the benefit of other Bioconductor packages still using such legacy APIs from the Rsamtools days. This is a copy of (a subset of) the samtools code within the Rhtslib repository. So the R developers would be free to tweak their local copy to avoid |
Several of the Instead of Footnotes
|
I see. Thanks for the explanation |
#1594 gets rid of the I haven't attempted to ban it in the code yet. Both pragmas and deprecations may be a bit messy because they have to be applied at exactly the right time, between including the system headers and any HTSlib ones. For now we can rely on maintainer vigilance, but it could be possible to scan for it using |
Thanks very much, guys. And for such a great library too |
For the record, here is what the current (13.3, 22E245) macOS SDK's __swift_unavailable("Use snprintf instead.")
#if !defined(_POSIX_C_SOURCE)
__deprecated_msg("This function is provided for compatibility reasons only. Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead.")
#endif
int sprintf(char * __restrict, const char * __restrict, ...) __printflike(2, 3);
// …
#if defined (__GNUC__) && _FORTIFY_SOURCE > 0 && !defined (__cplusplus)
/* Security checking functions. */
#include <secure/_stdio.h>
#endif secure/_stdio.h #defines
(See libtiff's useful summary and links for details.) For HTSlib, this means that builds would normally never see this deprecation. AddressSanitizer builds, which are inherently non-production builds, would at present get the deprecation warning. At present, HTSlib does not define Hence, as long as Apple has that |
Thanks for the research. Personally, I think the security issues fixed in snprintf don't buy that much over peer-reviwed code using sprintf, and why focus on that one function and not a myriad of others like strcpy, strcat, etc? There's a whole raft of potential problems. However it's done now so shrug. It's a good point about those macros though. We probably ought to be defining |
HTSLib
is a dependency for >20 different R packages including the core ones of Bioconductor. Next R release (v4.3) will issue a warning upon use ofsprintf
in C/C++ code, which is then treated as an error in CRAN'sR CMD check
(routine check for all R packages). Besides, this warning is simply annoying and indeed flags a security risk.Suggested solution for this is to substitute all
[v]sprintf
s with[v]snprintf
s. This probably should be done inHTSlib
as well as insamtools
code. For an experienced C/C++ (HTSlib
) developer it is very easy, so I'm not sure it is a good idea to open a PR myself. But please let me know if it is of interest.Thanks for looking into this!
The text was updated successfully, but these errors were encountered: