Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow use of pull_request_target to run pull_request code #26170

Open
phlax opened this issue Mar 20, 2023 · 19 comments
Open

Allow use of pull_request_target to run pull_request code #26170

phlax opened this issue Mar 20, 2023 · 19 comments
Labels
area/ci area/security no stalebot Disables stalebot from closing an issue

Comments

@phlax
Copy link
Member

phlax commented Mar 20, 2023

There is currently a PR that uses pull_request_target and would allow PRs to run with repo secrets/permissions

This pattern has been discussed quite a bit previously in the context of shifting off of AZP for CI

Essentially pull_request_target runs whatever code in the "context" of the target branch - ie main and has access to repo secrets and permissions.

Github absolutely advises against this, but you can kinda hack it, so that it runs PRs (ie untrusted code) with that context and permissions/secrets

The problem is that there really is no other way to run PRs with secrets on github (its mostly what kept us on AZP until now)

You can do various things to mitigate the risk, but it opens up attack vectors that are incredibly hard to prevent either programatically or by visual inspection so the advice is just dont do it. Security tools that look for vulnerable repos will flag the repo just for using the pattern of pull_request_target + PR git checkout - irrespective of what happens after

There are several possible vectors, including cache poisoning - the defaults are incredibly insecure and the github token is available on disk (by default) and in memory regardless of what you do in a job

The biggest risk is probably to secrets/permissions

Example attack vector

This is the example given by Github

In the following code, altho the npm job (ie untrusted code) is not directly given access to secrets, it can inject code to change how any following jobs work, and thereby snaffle secrets from other steps, or change how they run using the repo authority

# INSECURE. Provided as an example only.
on:
  pull_request_target

jobs:
  build:
    name: Build and test
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
      with:
        ref: ${{ github.event.pull_request.head.sha }}

    - uses: actions/setup-node@v1
    - run: |
        npm install
        npm build

    - uses: completely/fakeaction@v2
      with:
        arg1: ${{ secrets.supersecret }}

    - uses: fakerepo/comment-on-pr@v1
      with:
        message: |
          Thank you!

Mitigations

Addressing the insecure defaults and preventing the checkout action writing the token to disk can help to some extent, but ultimately are a maginot line if you can still steal the creds from mem, or inject unwanted code.

One possible mitigation is only running untrusted code in containers - and doing so in such a way that the code can neither access secrets (unless explicitly allowed) nor influence future action. Unfortunately this wont help with the mac (mobile) jobs

The key point about migitation i think is that while you can setup such a system securely, keeping it so, is an art in itself

Refs

@phlax phlax added triage Issue requires triage area/security area/ci and removed triage Issue requires triage labels Mar 20, 2023
@phlax
Copy link
Member Author

phlax commented Mar 20, 2023

cc @envoyproxy/security-team @envoyproxy/senior-maintainers @envoyproxy/envoy-maintainers

cc @joycebrum @Yannic @jpsim

@phlax
Copy link
Member Author

phlax commented Mar 20, 2023

one thing that would make everyones life easier if having multiple secret stashes - so you can associate public-rotating-keys-stash with pull_request_target jobs and limit their damage and yet have another completely-private-never-exposed-to-prs-stash containing secrets that should never be exposed

@phlax
Copy link
Member Author

phlax commented Mar 20, 2023

or even just a flag on existing secrets "available to pull_request_targets" or somesuch

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Apr 19, 2023
@phlax phlax added no stalebot Disables stalebot from closing an issue and removed stale stalebot believes this issue/PR has not been touched recently labels Apr 19, 2023
@joycebrum
Copy link

Sorry for the late response, I wasn't notified through email and lost it ;-;

Let me see if I got it correctly: you want to bring some automations from this AZP to github workflow but the problem is that it would need write permission to run and the only way to do it would be using pull_request_target, right?

If that's the case I'd say for us to first define which steps would need write permission and which write permission would be needed.

This way we can try using the workflow run trigger to run immediately after the on pull_request workflow. This workflow would only run the steps that needs right permission.

I have an example of a change I've proposed to remove the pull_request_target used and divide the workflow in two: nlohmann/json#3969. I can also help you with that if that's the case.

@phlax
Copy link
Member Author

phlax commented May 10, 2023

np @joycebrum thanks for getting back to us - been trying to think through the points you raise

If that's the case I'd say for us to first define which steps would need write permission and which write permission would be needed.

yeah - tbh i dont think any of the Envoy pr jobs require gh write permissions - altho i think that was what was being added for mobile in #25770) - not entirely clear what the needs were there - cc @Yannic

This way we can try using the workflow run trigger to run immediately after the on pull_request workflow.

so i think some of the tasks that we want to move from azp can work well with this pattern - eg publishing dev docs can be separated from the build, and then the job with creds just pushes html to the expected place etc

i think the harder problem is the RBE creds as they are needed whenever bazel is invoked - and everything mostly works with bazel

tbh i think that is something that we may need to think about differently, and i think mobile is already providing these to PRs anyway without the use of a pull_request_target

maybe there is something im missing here tho - cc @lizan @mattklein123

@phlax
Copy link
Member Author

phlax commented May 17, 2023

This way we can try using the workflow run trigger to run immediately after the on pull_request workflow.

experimenting with the workflow_run trigger - it seems it has to be on main - which makes it harder to test in prs etc, but also makes it a lot harder to use with release branches with their own ci

@joycebrum
Copy link

Another aspect to consider is that if the "write" permission will be granted through other tokens or secrets (example: an external token to publish doc, or the creds you might need for bazel), it can easily run on pull request without the need of a pull_request_target.

Although it is also important to protect the credentials used as much as possible: setting minimal token permissions (GITHUB_TOKEN), using hash pinning and avoiding dangerous workflows (pull_request_target with checkout, code injections, etc)

Ping me if you want help on evaluating security of these workflows in the future. I'll be happy to help! 😄

@phlax
Copy link
Member Author

phlax commented May 31, 2023

since we have landed #27600 im going to close this - if we need to adapt/iterate the policy i think we can do that in the policy file

@phlax phlax closed this as completed May 31, 2023
@phlax phlax reopened this Aug 29, 2023
@phlax
Copy link
Member Author

phlax commented Aug 29, 2023

reopening this as we seem to have hit an iresolvable blocker that requires reconsidering current policy

cc @joycebrum

@joycebrum
Copy link

Hi @phlax! what is the current blocker? Any specific workflow not working properly?

@phlax
Copy link
Member Author

phlax commented Aug 29, 2023

its just the same RBE question coming round again - seems that we cant even expose vars to PR workflows - so i think there really isnt any other way around it

this is for engflow RBE which has a GITHUB_TOKEN attestation scheme for auth so hopefully we can minimize the potential surface here

ive been working on other (reusable) workflows that dont need creds, but are triggered by non-pull_request events so have similar potential issues - im thinking i can repurpose/adapt them for this also

@gabibguti
Copy link

Hi @phlax! I'm joining the discussion here along with Joyce.

Yes, that's indeed a complicated blocker. If I understand the case correctly, the problem is that you cannot create a workflow that runs on pull_request and runs Bazel CI because it needs access to GitHub Secrets. So, you would need to use on pull_request_target to run Bazel CI. And, yes, vars are not encrypted so they would expose your Bazel credentials more easily than pull_request_target. The options are:

  • not running the Bazel CI on pull requests to avoid exposing your Bazel credentials
  • or, run the Bazel CI on pull_request_target and risk exposing your Bazel credentials

You can mitigate this risk by using the label workaround. The idea is to review if the PR is "safe for testing" before testing, works as a kind of approval so the workflow will not run immediately.

Plus, a question to ask here is: what damage can be done if the Bazel credentials are exposed? Depending on the answer, it may be worth it to run on pull_request_target if the damage is minimum to you, and the benefits of knowing the tests and builds are passing at pull requests is higher.

@phlax
Copy link
Member Author

phlax commented Sep 8, 2023

hi @gabibguti are on envoy slack ?

@gabibguti
Copy link

No, I'm not.

@phlax
Copy link
Member Author

phlax commented Sep 8, 2023

would you be able to join? it would be easier to explore the questions you raised i think

@gabibguti
Copy link

Sure, I believe I joined #envoy-users channel.

@gabibguti
Copy link

gabibguti commented Sep 20, 2023

Hey! Do you have an example of Bazel command that needs Bazel credentials? I wanted to create an issue with an example on https://github.com/bazelbuild/bazel. I searched and there are other similar issues like bazelbuild/bazel#14278 requesting auth / credentials helpers to run commands.

@phlax
Copy link
Member Author

phlax commented Sep 21, 2023

pretty much all of our CI commands use bazel and RBE - the best place to look is in ci/do_ci.sh for specific examples

the auth is set between .bazelrc and some env vars:

envoy/.bazelrc

Lines 474 to 482 in cfdc99a

# RBE (Google)
build:rbe-google --google_default_credentials=true
build:rbe-google --remote_cache=grpcs://remotebuildexecution.googleapis.com
build:rbe-google --remote_executor=grpcs://remotebuildexecution.googleapis.com
build:rbe-google --remote_timeout=7200
build:rbe-google --remote_instance_name=projects/envoy-ci/instances/default_instance
build:rbe-google-bes --bes_backend=grpcs://buildeventservice.googleapis.com
build:rbe-google-bes --bes_results_url=https://source.cloud.google.com/results/invocations/

and here:

envoy/ci/setup_cache.sh

Lines 17 to 20 in cfdc99a

bash -c 'echo "${GCP_SERVICE_ACCOUNT_KEY}"' | base64 --decode > "${GCP_SERVICE_ACCOUNT_KEY_FILE}"
export BAZEL_BUILD_EXTRA_OPTIONS+=" --google_credentials=${GCP_SERVICE_ACCOUNT_KEY_FILE}"

with the Engflow RBE it works differently - in that case it uses a GITHUB_TOKEN to attest read permissions to an (arbitrary) github repo container registry.

This is also setup in .bazelrc:

envoy/.bazelrc

Line 490 in cfdc99a

build:rbe-engflow --credential_helper=*.engflow.com=%workspace%/bazel/engflow-bazel-credential-helper.sh

and uses the GITHUB_TOKEN from the env

atm its setting up Engflow RBE in github for public use that is the focus

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ci area/security no stalebot Disables stalebot from closing an issue
Projects
None yet
Development

No branches or pull requests

3 participants