Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spec checker tool #8722

Merged
merged 42 commits into from
Apr 15, 2021
Merged

Spec checker tool #8722

merged 42 commits into from
Apr 15, 2021

Conversation

farazdagi
Copy link
Contributor

@farazdagi farazdagi commented Apr 8, 2021

What type of PR is this?

Feature

What does this PR do? Why is it needed?

  • This tool allows to download specs documentation, parse Python code blocks from it, and then use parsed blocks to compare with documentation pseudo code we are using when implementing specs functions. Otherwise, it is a simple CLI utility relying on AST parsing.
  • The delta of this PR is big, due to inclusion of parsed reference specs (see the rationale for this below).
  • This PR implements the separate CLI tool (and not the static analyzer), because go_tool_library doesn't support embededsrcs attribute (which go_library does support), and our tool relies on go:embed. Once rules_go are updated, we can turn this tool into static analyzer, easily.
  • Multiple definitions with the same function names are possible (for example you can have def func_name(): defined for both phase1 and phase0). As long as one of the definitions matches our comments, we consider comments to be up to date.

Which issues(s) does this PR fix?

Fixes #5717

How to update referenced specs?

bazel run //tools/specs-checker download -- --dir=$PWD/tools/specs-checker/data

This will pull the files defined in specDirs (see main.go), parse them (extract Python code snippets, discarding any other text), and save them to the folder from which bazel run //tools/specs-checker check will be able to embed (using go:embed):

image

Note: see included README for more details.

How to run tests?

bazel run //tools/specs-checker check -- --dir $PWD/beacon-chain
bazel run //tools/specs-checker check -- --dir $PWD/validator
bazel run //tools/specs-checker check -- --dir $PWD/shared

Or, to check the whole project:

bazel run //tools/specs-checker check -- --dir $PWD

How do results of the check look like?
image

Why save parsed reference definitions, why not discard them after the check?
There are several reasons for doing that:

  • I found it particularly useful when downloading updates, since reference definitions are versioned, on git diff I immediately see what files have changed, to better understand the scope of updates I need to do on our comments.
  • When used as static analyzer, re-downloading from GitHub on every build attempt is not the best idea, so having those results cached comes handy. I could have implemented more sophisticated caching scheme, where tool will check the update time of the spec files and re-download only if older than a given period, but I consider it as a bit of an overkill for the tool that simple. So, it is up to us to decide how often do we want to pull updated specs from the upstream.
  • Finally, it is way easier to debug, as you have all the reference definitions saved within our project. As a side-kick: when updating the comments (if it is out of date), you can just grep the saved reference docs, and do not have to fish through specs repo site in order to find the up to date version of a snippet.

@farazdagi farazdagi self-assigned this Apr 8, 2021
@codecov
Copy link

codecov bot commented Apr 10, 2021

Codecov Report

Merging #8722 (8cfa92a) into develop (169cd78) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop    #8722      +/-   ##
===========================================
- Coverage    60.54%   60.54%   -0.01%     
===========================================
  Files          519      519              
  Lines        36271    36320      +49     
===========================================
+ Hits         21961    21989      +28     
- Misses       11146    11161      +15     
- Partials      3164     3170       +6     

@farazdagi farazdagi added the Ready For Review A pull request ready for code review label Apr 10, 2021
@nisdas nisdas added this to the v1.3.8 milestone Apr 12, 2021
Copy link
Member

@prestonvanloon prestonvanloon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing. Is there any issue filed with rules_go to support embed in go tool library?

Comment on lines +171 to +173
if matchesPerfectly {
return true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to check that all refDefs match perfectly or just any of them match perfectly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I loop through all versions of a given definition (say, in altair we have the function with the same name but different implementation, I want to make sure that both comments in altair-related code and phase0 related code are checked).

So, at the beginning of loop that checks that all definition's lines match our comment, the matchesPerfectly flag is set to true. Line checking loop breaks when non-matching line is found, the matchesPerfectly flag is set to false. So, a single matching definition is enough for flag to survive, and us be sure that our comment matches at least one known version of spec's function.

NB: Will search rules go for an issue, and if none exists will create my own -- somehow missed that part, thanks for pointing that out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing. Is there any issue filed with rules_go to support embed in go tool library?

@farazdagi farazdagi marked this pull request as draft April 14, 2021 03:25
@farazdagi
Copy link
Contributor Author

Turning back to draft, will implement as static code analyzer (using workaround with go_embed_data).

@farazdagi
Copy link
Contributor Author

It doesn't seem that I can use go_embed_data target within go_tool_library, whenever I do so, I get circular dependency error:
image

Let's merge it as is -- and once patch is available, will update the implementation to become static analyzer.

@farazdagi farazdagi marked this pull request as ready for review April 14, 2021 21:55
Copy link
Member

@prestonvanloon prestonvanloon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job. This is great!

@prylabs-bulldozer prylabs-bulldozer bot merged commit 3d3b9d1 into develop Apr 15, 2021
@delete-merged-branch delete-merged-branch bot deleted the spec-checker-tool branch April 15, 2021 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready For Review A pull request ready for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ensure comment Spec pseudocode definition... aligns with upsteam
4 participants