add code coverage support #3759
Conversation
I think the link/compile options might be too broad. We need to only enable them when LLVM is the compiler; otherwise we should probably just enable the coverage flag. Or perhaps the coverage flag should always be enabled. The reports are not particularly nice looking. We should add an additional step to link the summary table to the line-level coverage. I think if we just shove all of the things into relative HTML paths, that should be doable.
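A guard along these lines might work. This is just a sketch: the `COVERAGE` option name is the one from the PR description, but everything else is illustrative.

```cmake
# Sketch: only enable the LLVM source-based coverage flags when the
# compiler is actually Clang, instead of adding them unconditionally.
option(COVERAGE "Enable LLVM source-based code coverage" OFF)

if(COVERAGE)
    if(NOT CMAKE_C_COMPILER_ID MATCHES "Clang")
        message(FATAL_ERROR "COVERAGE requires a Clang compiler")
    endif()
    add_compile_options(-fprofile-instr-generate -fcoverage-mapping)
    add_link_options(-fprofile-instr-generate -fcoverage-mapping)
endif()
```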
This commit adds a few files that will be necessary for code coverage in CI runs. It breaks a little bit from tradition by avoiding the omniscript, but I think that is quite justifiable in this case, since the unit test setup in the omniscript (s2n_codebuild.sh) doesn't work for me anyways. I will be attempting to slot most functionality into the correct codebuild phases, since it will make the codebuild logs a bit nicer to look at.
This commit should add the functionality necessary to run the proper build and upload reports to codecov and S3. Once they are uploaded to S3, they will be fronted by a CloudFront distribution, which will allow them to be publicly accessible.
post-build is not valid. It should be post_build. :( In my defence, I feel like CodeBuild should really support arbitrary phase names anyways.
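For reference, a minimal buildspec skeleton with the fixed phase names CodeBuild actually accepts (the commands are just placeholders):

```yaml
version: 0.2
phases:
  install:
    commands:
      - echo "install dependencies"
  pre_build:
    commands:
      - echo "configure the build"
  build:
    commands:
      - echo "run the unit tests"
  # note the underscore: "post-build" (with a hyphen) is rejected
  post_build:
    commands:
      - echo "upload coverage reports"
```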
I always forget about this part. Silly inodes sneakily making themselves part of the source control. I'm also sourcing the environment setup. It would be nice to minimally separate those out rather than chaining them together, but that feels like a task for a different time.
I didn't actually do that in the last commit even though I thought that I did.
I should go take a nap. These silly mistakes are irritating me.
I'm kinda just copying the fuzz setup at this point. There are path difficulties in the build, and it seems somewhat reasonable that this might solve it? Uncertain.
I don't think that the fuzzers get their clang from any of our CI stuff :( But it is stabbing me in the back, because I don't think the fuzzer specs require cmake, but yours truly most certainly does. Two build images will make this go faster, right? Amend: nvm, I'm just sticking with the one.
We should just be setting the CC environment variable. Unfortunately, we still haven't found clang, so I'm not sure how that's supposed to work.
So, it looks like our source control isn't really authoritative here. It's more of a "best-effort" to track our CI jobs. The previous fuzz jobs that I was trying to copy were outdated. Hoping these ones are a bit more in style.
I made progress that time, but so many logs were generated that I couldn't see what the original error message was. Now we should abort on failure.
Talked to Doug, and it looks like LatestClang isn't necessarily expected to handle this. I was expecting it to set the CC/CXX variables, but since that isn't going to be the case, we can just set them in the buildspec.
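Something like this in the buildspec should do it (a sketch, assuming clang/clang++ are already on the image's PATH):

```yaml
version: 0.2
env:
  variables:
    CC: clang
    CXX: clang++
phases:
  build:
    commands:
      - echo "cmake will now pick up clang via CC/CXX"
```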
My build ended up failing a random OpenSSL test :( Not quite sure why, and it is going to be very painful to debug in CI. Trying to focus on getting coverage actually working, so I'm punting that task off to some other day.
The last run was able to successfully go through my manually invoked CI run, so I'm moving towards pushing this into CR. I did find out that codecov really doesn't like some of the branch misses in the macros, and our overall codecov coverage is significantly lower than the LLVM-reported coverage because of that. There doesn't seem to be a way to turn it off either.
More cleanup for pending PR
What is the end result of this PR? Is it that a coverage report will be generated per PR? How will a reviewer/author navigate from the PR to seeing that report?
CMake configuration to support code coverage, and a CI job that runs unit tests, calculates code coverage, and then uploads that to S3/CloudFront and CodeCov.io.
Yes, each time the GeneralBatch coverage job is run, the coverage will be computed and uploaded. I expect that codecov will handle deduping for each PR.
My expectation is that the codecov app will post a link onto the PR: https://github.com/apps/codecov

Also worth calling out that the codecov reports should all be stored individually (per-PR) here: https://app.codecov.io/gh/aws/s2n-tls, while the current CloudFront report is just "last write wins" in the S3 bucket. I think that we can get the reports to write to a per-PR prefix. So the exact shape of the codecov.io integration is a bit uncertain right now, but the next steps are to get it running in CI and iterate from there.
Currently codecov just reports a little badge in our Readme. So technically we are already using codecov. I assume the reports would show up automatically, but I don't think we know... 🤔

You're probably right. The best thing to do to figure it out is get this running in our CI and go from there.
We've been splitting out the buildspec changes so as to allow the CodeBuild checks to run (or include forced/overridden job links).
Force-pushed from 7e6000e to 02b7948
Codecov Report
```
@@            Coverage Diff            @@
##             main    #3759     +/-  ##
=========================================
+ Coverage   68.56%   69.08%   +0.52%
=========================================
  Files         233      233
  Lines       21913    21630     -283
  Branches     7757     7676      -81
=========================================
- Hits        15024    14944      -80
+ Misses       1898     1769     -129
+ Partials     4991     4917      -74
```
This commit addresses some of the PR feedback. I don't think I actually need to be doing any of the third-party test dependency things.
Force-pushed from 02b7948 to d4630b6
Sorry for the ugly commit history, I had some sort of dirty merge/rebase happen, but it should be good now.
CodeBuild jobs won't run for yml files, but I still want this to be reviewed along with the CodeBuild configuration stuff.
lgtm
codebuild/bin/coverage_upload.sh (outdated)

```sh
# upload to codecov.io
curl -Os https://uploader.codecov.io/latest/linux/codecov
```
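For context, a sketch of how the downloaded uploader would then be invoked; the info file name follows the one used elsewhere in this PR:

```sh
# Make the downloaded uploader executable, then point it at the lcov file.
chmod +x codecov
./codecov -f unit_test_coverage.info
```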
If codecov isn't accurate for our codebase, is this a useful step?
But I'm torn on whether it IS useful, from your description. The problem is that some single-line macros are very important and both branches SHOULD be tested. But some single-line macros aren't important and should be literally impossible to fail or test, like most of our extremely paranoid null checks.
I've gone back and forth on this, but I've mostly arrived at "I can't fault CodeCov.io for pedantically accurate coverage". I think it is still nice to have from the diff perspective, but don't have a strong opinion on that.
One other option that we could consider is modifying the unit_test_coverage.info to lie about our branch coverage, so that CodeCov.io reporting would be equivalent to the reporting at https://dx1inn44oyl7n.cloudfront.net/index.html
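If we went that route, one hypothetical way to "lie" would be to strip the branch records out of the lcov tracefile before uploading; BRDA/BRF/BRH are the branch-coverage record types in the lcov format:

```sh
# Hypothetical: remove all branch-coverage records from the lcov tracefile
# so that codecov.io only sees line coverage.
sed -i '/^BRDA:/d; /^BRF:/d; /^BRH:/d' unit_test_coverage.info
```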
I think my point is that LLVM is arguably just as inaccurate as codecov.io, just in our favor. If I'm understanding the behavior you're describing, then codecov.io isn't being pedantic, it's being more accurate. We can't assume all single-line macros don't need to be tested any more than we can assume all single-line macros do need to be tested. We've got a weird mix.
- comment for fatal error
- declare cmake option
- add libcrypto root thing for build script

Previously I think we were just using the default libcrypto from the environment.
Had a chat with team members and arrived at the following conclusions: codecov.io requires all branches to be checked before a line counts as covered, including those in macros. codecov.io probably isn't the best fit for us, because of our paranoid usage of safety and assertion macros, which we never expect to be hit. So the route forward is to rely a little bit more heavily on the lcov reports. To do that we want to:

- remove the codecov upload after talking to Lindsay
- add branch coverage to the lcov report
- upload the text summary to S3
- upload the report to S3
- make S3 uploads go to a specific PR path
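Roughly, those steps could look like the following sketch; the bucket name, PR variable, and library path are illustrative, not the actual CI configuration:

```sh
# Export lcov data (with branch coverage) from the merged LLVM profile.
llvm-cov export build/lib/libs2n.so \
    -instr-profile=unit_tests.profdata -format=lcov > unit_test_coverage.info

# Render the html report and a plain-text summary.
genhtml --branch-coverage unit_test_coverage.info --output-directory coverage_report
lcov --summary unit_test_coverage.info > coverage_summary.txt

# Upload both to a PR-specific prefix.
aws s3 sync coverage_report "s3://coverage-bucket/${PR_NUMBER}/"
aws s3 cp coverage_summary.txt "s3://coverage-bucket/${PR_NUMBER}/summary.txt"
```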
#3781 tracks a proposed future enhancement for code coverage.
This commit adds CMake configuration to support source-based code coverage. It also adds the CI configuration necessary to upload code coverage to codecov.io during CI runs.
Description of changes:
We already use source-based code coverage in our codebase for fuzzing coverage information, but that is only enabled through make, which we are trying to move away from.
Coverage is enabled with the COVERAGE cmake option. Note that compiling to a static archive (.a) seems to break something, and LLVM will report that the coverage information is corrupted. To get around this we just compile to a shared library (.so).

Each process then writes out profiling information to a file, which is specified by the LLVM_PROFILE_FILE environment variable. We set LLVM_PROFILE_FILE="ut_%p.profraw" in our CI, which causes each process to write to its own file. Otherwise the profiles would get overwritten.
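In other words, something like this sketch (the test invocation and library path are illustrative):

```sh
# %p expands to the PID, so every test process gets its own raw profile.
export LLVM_PROFILE_FILE="ut_%p.profraw"
make -C build test

# Merge the per-process raw profiles into a single indexed profile.
llvm-profdata merge -sparse ut_*.profraw -o unit_tests.profdata

# Summarize coverage against the instrumented shared library.
llvm-cov report build/lib/libs2n.so -instr-profile=unit_tests.profdata
```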
The CI configuration is a bit of a break from tradition, since it doesn't rely on the omniscript (s2n_codebuild.sh). Rather than adding configuration to the run_unit_tests method in the omniscript, I thought that it would be neater to try and keep the complexity contained to the coverage-specific stuff. It also has the nice advantage of fitting more neatly into codebuild's different phases.

Infrastructure added for CI:
CallOuts
This PR doesn't currently include any codecov configuration. Their documentation isn't exactly great (Can I upload an lcov info file? Idk, read the source code and find out), so my current approach is to add some minimal integration in this PR and modify it as needed once we have a better idea of what it looks like.
We don't run CI on changes to our buildspecs, so the buildspec has a txt extension. There will need to be a follow-up PR to change this to a yml extension. I preferred this txt approach since that way the buildspec can be reviewed alongside the relevant changes.

I think we'll also need to go in and modify the codebuild jobs manually for these changes to take effect. This will require editing the build spec for the s2nGeneralBatch job to include the new buildspec. I included this in the buildspec_omnibus, but it looks like our actual CI jobs have diverged from the buildspecs in that folder.
Also, the s3 upload (and publicly visible coverage report) is currently just under the latest/ path, which means that there won't be PR-specific reports visible through cloudfront: https://dx1inn44oyl7n.cloudfront.net/index.html

Additionally, codecov.io is being nasty and doesn't count any lines with missed branches as covered, which means that our coverage score is really low, because we use lots of macros and rarely test all of the branches on every line of the macros. Not quite sure what the solution is here. So LLVM reports line coverage of ~90% but codecov says something more like 65%.
Testing:
I reran the build to ensure that the coverage information remained relatively constant from run to run. There were some branch and region count differences, but the code coverage generally remains very close across runs, and most of the differences looked extremely nasty, e.g. branches in __typeof(n) macros. While I'm sure it's possible to make our unit tests totally reproducible, I expect it to be rather expensive to do so.

I also manually kicked off a CI job to confirm that the coverage reports got written to S3, and that the coverage report was uploaded to codecov.io.
Not sure about the exact codecov configuration.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.