Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHub: File validation interacts badly with PRs adding submodules #396

Closed
StephanTLavavej opened this issue Dec 18, 2019 · 2 comments · Fixed by #526 or #2832
Closed

GitHub: File validation interacts badly with PRs adding submodules #396

StephanTLavavej opened this issue Dec 18, 2019 · 2 comments · Fixed by #526 or #2832
Labels
fixed Something works now, yay! infrastructure Related to repository automation

Comments

@StephanTLavavej
Copy link
Member

Our automation to "validate" files for whitespace, line endings, and non-ASCII characters doesn't handle submodule additions very well. We have a hardcoded list of directories to skip when validating:

static constexpr array skipped_directories{
".git"sv,
".vs"sv,
".vscode"sv,
"llvm-project"sv,
"out"sv,
"vcpkg"sv,
};

Here's the problem scenario:

  1. A PR adds a cat submodule, and updates validate.cpp to skip that directory.
  2. The VMs properly init that submodule, and skip it while validating.
  3. At this point, the cat PR hasn't been merged yet!
  4. A totally separate PR (modifying mouse machinery or whatever) triggers checks on those VMs.
  5. The VMs don't clean out the cat directory, and the mouse PR doesn't contain validate.cpp changes to skip cat, so all of the cat files are scanned and fail to validate.

I suspect that this is involved:

- checkout: self
submodules: recursive

According to the YAML schema, do we need to set clean: true so we don't have to worry about lingering submodules?

@StephanTLavavej StephanTLavavej added enhancement Something can be improved infrastructure Related to repository automation labels Dec 18, 2019
@StephanTLavavej StephanTLavavej removed the enhancement Something can be improved label Feb 6, 2020
@StephanTLavavej
Copy link
Member Author

@cbezault just encountered another form of this - submitting a PR that emits new files/directories that are also added to .gitignore contaminates the VMs.

@StephanTLavavej StephanTLavavej linked a pull request Feb 25, 2020 that will close this issue
4 tasks
barcharcraz added a commit that referenced this issue Feb 25, 2020
@StephanTLavavej StephanTLavavej added the fixed Something works now, yay! label Feb 25, 2020
@StephanTLavavej StephanTLavavej removed the fixed Something works now, yay! label Jun 23, 2022
@StephanTLavavej
Copy link
Member Author

This wasn't cured after all - #2780 (adding a benchmarks/google-benchmark submodule) contaminated a run for #2811 (making an unrelated visualizer change). We probably would have noticed this earlier if adding submodules happened more often.

The raw_log.txt reveals the problem:

...
##[command]git clean -ffdx
Removing tests/std/tests/Dev09_056375_locale_cleanup/__pycache__/
Removing tests/std/tests/Dev09_172666_tr1_tuple_odr/__pycache__/
Removing tests/std/tests/GH_000545_include_compare/__pycache__/
Removing tests/std/tests/P0607R0_inline_variables/__pycache__/
Removing tests/std/tests/P1502R1_standard_library_header_units/__pycache__/
Removing tests/std/tests/VSO_0000000_any_calling_conventions/__pycache__/
Removing tests/std/tests/VSO_0000000_matching_npos_address/__pycache__/
Removing tests/utils/stl/__pycache__/
Removing tests/utils/stl/test/__pycache__/
##[command]git reset --hard HEAD
HEAD is now at 2542aa00 Merge 36798a7cd9176b5de0b5f45719f0be83dc61ce9c into 7f04137880e1c3df5125c1baf808f16f399dee2e
...
##[command]git checkout --progress --force refs/remotes/pull/2811/merge
warning: unable to rmdir 'benchmarks/google-benchmark': Directory not empty
Previous HEAD position was 2542aa00 Merge 36798a7cd9176b5de0b5f45719f0be83dc61ce9c into 7f04137880e1c3df5125c1baf808f16f399dee2e
HEAD is now at f25805ba Merge 84a9095de7a38d21148f29b0970a0573e889d2a5 into 7f04137880e1c3df5125c1baf808f16f399dee2e
...

The clean: true documentation says "When set to true, the pipeline runs execute git clean -ffdx && git reset --hard HEAD before fetching the repo." This appears to be accurate, but it doesn't solve the problem we have! The git clean -ffdx is running before checking out the commit we want to build. In this "before" state, the submodule still exists, so the clean skips it. It's when we git checkout the commit to build, that git notices that the new commit doesn't contain the submodule, and (to avoid data loss) dutifully reports that it will not rmdir the non-empty directory. If we cleaned after checking out the commit, git clean -ffdx would remove the contaminating submodule - but there appears to be no way to request specifically that.

However, other documentation suggests that we may be able to use workspace: clean: all. If I understand the docs right, this would be a true clean.

In theory we can confirm that this works, although doing so would require a delicate dance. It might be easier to just inspect the logs to verify the true clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed Something works now, yay! infrastructure Related to repository automation
Projects
None yet
2 participants
@StephanTLavavej and others