Create Robots.txt Action

An action to create a robots.txt file from a variety of sources.

| Input name | Example | Description |
| --- | --- | --- |
| output-file | robots.txt | Where to write the resulting robots.txt file |
| input-file | base-robots.txt | An existing robots.txt. Will be added to the top of the output file. Must not be the same as the output-file |
| append-allow-rule | true | Whether to add an allow for all unspecified user agents to the end of the file |
| allowed-bot-names | | Multiline string. Names of bots that should not be included in the blocklist |
| blocked-bot-names | | Multiline string. Names of bots that should be included in the blocklist |
| cloudflare-api-token | | An API token for Cloudflare. Enables Cloudflare's bot categories as a source for bots |
| cloudflare-categories | AI Crawler | Bot categories to add to the blocklist. Required if cloudflare-api-token is set |
| dark-visitors-api-token | | An API token for Dark Visitors. Enables Dark Visitors' user agent categories as a source for bots |
| dark-visitors-categories | AI Data Scraper | User agent categories to add to the blocklist. Required if dark-visitors-api-token is set |
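
As a rough illustration of how these inputs could combine, the TypeScript sketch below assembles a robots.txt from a base file, a blocklist, and an allowlist. It is not the action's actual source: `RobotsOptions`, `parseMultiline`, and `buildRobotsTxt` are hypothetical names, and the case-insensitive matching, deduplication, and exact output layout are assumptions made for the example.

```typescript
// Hypothetical option names mirroring the action's inputs (not its real internals).
interface RobotsOptions {
  baseContent?: string; // contents of `input-file`, if one is given
  blockedBotNames: string[]; // `blocked-bot-names` plus any provider categories
  allowedBotNames: string[]; // `allowed-bot-names`
  appendAllowRule: boolean; // `append-allow-rule`
}

// Split a multiline input into trimmed, non-empty lines.
function parseMultiline(value: string): string[] {
  return value
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

function buildRobotsTxt(options: RobotsOptions): string {
  // Assumption: names are compared case-insensitively.
  const allowed = new Set(options.allowedBotNames.map((n) => n.toLowerCase()));
  const seen = new Set<string>();
  const blocked: string[] = [];
  for (const name of options.blockedBotNames) {
    const key = name.toLowerCase();
    if (allowed.has(key) || seen.has(key)) continue; // skip allowed bots and duplicates
    seen.add(key);
    blocked.push(name);
  }

  const sections: string[] = [];
  // The existing robots.txt goes at the top of the output.
  if (options.baseContent && options.baseContent.trim() !== "") {
    sections.push(options.baseContent.trim());
  }
  // One group that disallows every blocked user agent.
  if (blocked.length > 0) {
    const agents = blocked.map((n) => `User-agent: ${n}`).join("\n");
    sections.push(`${agents}\nDisallow: /`);
  }
  // Optional catch-all allow rule at the end of the file.
  if (options.appendAllowRule) {
    sections.push("User-agent: *\nAllow: /");
  }
  return sections.join("\n\n") + "\n";
}
```

In this sketch, a bot named in both lists is left out of the blocklist, which matches the description of allowed-bot-names above.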

Example workflow.yml

Note

You will need to enable the "Allow GitHub Actions to create and approve pull requests" option under Settings > Actions > General in your repository. Without it, the Create Pull Request step cannot open a pull request.

name: Update robots.txt

on:
  workflow_dispatch:
  schedule:
    - cron: "13 6 * * 1"

jobs:
  update-robots-txt:
    name: Update robots.txt
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Create robots.txt
        uses: s-thom/create-robots-txt-action@v1
        with:
          output-file: public/robots.txt
          append-allow-rule: true
          allowed-bot-names: |
            Chrome-Lighthouse
          cloudflare-api-token: ${{ secrets.CLOUDFLARE_RADAR_API_TOKEN }}
          cloudflare-categories: |
            YOUR BOT CATEGORIES HERE
          dark-visitors-api-token: ${{ secrets.DARK_VISITORS_API_TOKEN }}
          dark-visitors-categories: |
            YOUR BOT CATEGORIES HERE

      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v7
        with:
          add-paths: |
            public/robots.txt
          commit-message: "Update robots.txt"
          branch: robots-txt
          delete-branch: true
          author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
          committer: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
          title: "Update robots.txt"
          body: |
            # Automated update of robots.txt

            Generated by [s-thom/create-robots-txt-action](https://github.com/s-thom/create-robots-txt-action)

Known bot categories for each provider

Cloudflare

To create an API token, go to the API Tokens page in your Cloudflare profile settings. Use the "Read Cloudflare Radar data" template, or create a custom token with the "Account > Radar > Read" permission. Then add the token as an Actions secret in your GitHub repository.

It is worth creating a new token for this workflow even if you already have one set up in your repository. It is good practice to grant only the minimum privileges needed to any token passed to third-party code, such as this action.

  • Accessibility
  • Advertising & Marketing
  • Aggregator
  • AI Assistant
  • AI Crawler
  • AI Search
  • Archiver
  • Feed Fetcher
  • Monitoring & Analytics
  • Page Preview
  • Search Engine Crawler
  • Search Engine Optimization
  • Security
  • Social Media Marketing
  • Webhooks
  • Other

Dark Visitors

To find your API token, go to the Settings page for your project. The access token is visible on this page. Copy the token and add it as an actions secret in your GitHub repository.

  • AI Assistants
  • AI Data Scrapers
  • AI Search Crawlers

Note

Dark Visitors also defines the categories below, but its API does not return bots from them.

  • Archivers
  • Developer Helpers
  • Fetchers
  • Headless Browsers
  • Intelligence Gatherers
  • Scrapers
  • Search Engine Crawlers
  • SEO Crawlers
  • Uncategorized
  • Undocumented AI Agents

Development instructions

Initial Setup

After you've cloned the repository to your local machine or codespace, you'll need to perform some initial setup steps before you can develop your action.

Note

You'll need to have a reasonably modern version of Node.js handy (20.x or later should work!). If you are using a version manager like nodenv or fnm, this template has a .node-version file at the root of the repository that can be used to automatically switch to the correct version when you cd into the repository. Additionally, this .node-version file is used by GitHub Actions in any actions/setup-node actions.

  1. 🛠️ Install the dependencies

    npm install

  2. 🏗️ Package the TypeScript for distribution

    npm run bundle

  3. ✅ Run the tests

    $ npm test
    
    PASS  ./index.test.js
      ✓ throws invalid number (3ms)
      ✓ wait 500 ms (504ms)
      ✓ test runs (95ms)
    
    ...

Update the Action Metadata

The action.yml file defines metadata about your action, such as input(s) and output(s). For details about this file, see Metadata syntax for GitHub Actions.

When you copy this repository, update action.yml with the name, description, inputs, and outputs for your action.

Update the Action Code

The src/ directory is the heart of your action! This contains the source code that will be run when your action is invoked. You can replace the contents of this directory with your own code.

There are a few things to keep in mind when writing your action code:

  • Most GitHub Actions toolkit and CI/CD operations are processed asynchronously. In main.ts, you will see that the action is run in an async function.

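    import * as core from "@actions/core";
    //...

    async function run(): Promise<void> {
      try {
        //...
      } catch (error) {
        // `error` is typed `unknown` in strict TypeScript, so narrow it before reading `.message`
        if (error instanceof Error) core.setFailed(error.message);
      }
    }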

    For more information about the GitHub Actions toolkit, see the documentation.

So, what are you waiting for? Go ahead and start customizing your action!

  1. Create a new branch

    git checkout -b releases/v1

  2. Replace the contents of src/ with your action code

  3. Add tests to __tests__/ for your source code

  4. Format, test, and build the action

    npm run all

    This step is important! It will run ncc to build the final JavaScript action code with all dependencies included. If you do not run this step, your action will not work correctly when it is used in a workflow. This step also includes the --license option for ncc, which will create a license file for all of the production node modules used in your project.

  5. (Optional) Test your action locally

    The @github/local-action utility can be used to test your action locally. It is a simple command-line tool that "stubs" (or simulates) the GitHub Actions Toolkit. This way, you can run your TypeScript action locally without having to commit and push your changes to a repository.

    The local-action utility can be run in the following ways:

    • Visual Studio Code Debugger

      Make sure to review and, if needed, update .vscode/launch.json

    • Terminal/Command Prompt

      # npx local-action <action-yaml-path> <entrypoint> <dotenv-file>
      npx local-action . src/main.ts .env

    You can provide a .env file to the local-action CLI to set environment variables used by the GitHub Actions Toolkit, such as the inputs and event payload data used by your action. For more information, see the example file, .env.example, and the GitHub Actions Documentation.

  6. Commit your changes

    git add .
    git commit -m "My first action is ready!"

  7. Push them to your repository

    git push -u origin releases/v1

  8. Create a pull request and get feedback on your action

  9. Merge the pull request into the main branch

Your action is now published! 🚀

For information about versioning your action, see Versioning in the GitHub Actions toolkit.
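
Step 5 above mentions setting inputs through a .env file. The GitHub Actions Toolkit reads each input from an environment variable named INPUT_ followed by the input name, upper-cased, with spaces replaced by underscores (hyphens are kept). The TypeScript sketch below derives those names for this action's inputs. `inputEnvName` and `toDotenv` are hypothetical helpers, and it is an assumption that the local-action .env loader accepts hyphenated keys and that single-line values need no quoting.

```typescript
// Mirrors the naming rule the toolkit uses when it looks up an input:
// spaces become underscores, the name is upper-cased, hyphens are left alone.
function inputEnvName(inputName: string): string {
  return `INPUT_${inputName.replace(/ /g, "_").toUpperCase()}`;
}

// Render a set of single-line input values as .env lines (hypothetical helper).
function toDotenv(inputs: Record<string, string>): string {
  return Object.keys(inputs)
    .map((name) => `${inputEnvName(name)}=${inputs[name]}`)
    .join("\n");
}

const example = toDotenv({
  "output-file": "public/robots.txt",
  "append-allow-rule": "true",
});
console.log(example);
```

Multiline inputs such as allowed-bot-names would need whatever quoting the .env loader supports, which this sketch does not attempt.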

Validate the Action

You can now validate the action by referencing it in a workflow file. For example, ci.yml demonstrates how to reference an action in the same repository.

steps:
  - name: Checkout
    id: checkout
    uses: actions/checkout@v4

  - name: Test Local Action
    id: test-action
    uses: ./
    with:
      milliseconds: 1000

  - name: Print Output
    id: output
    run: echo "${{ steps.test-action.outputs.time }}"

For example workflow runs, check out the Actions tab! 🚀

Usage

After testing, you can create version tag(s) that developers can use to reference different stable versions of your action. For more information, see Versioning in the GitHub Actions toolkit.

To include the action in a workflow in another repository, you can use the uses syntax with the @ symbol to reference a specific branch, tag, or commit hash.

steps:
  - name: Checkout
    id: checkout
    uses: actions/checkout@v4

  - name: Test Local Action
    id: test-action
    uses: actions/typescript-action@v1 # Commit with the `v1` tag
    with:
      milliseconds: 1000

  - name: Print Output
    id: output
    run: echo "${{ steps.test-action.outputs.time }}"

Publishing a New Release

This project includes a helper script, script/release, designed to streamline the process of tagging and pushing new releases for GitHub Actions.

GitHub Actions allows users to select a specific version of the action to use, based on release tags. This script simplifies this process by performing the following steps:

  1. Retrieving the latest release tag: The script starts by fetching the most recent SemVer release tag of the current branch, by looking at the local data available in your repository.
  2. Prompting for a new release tag: The user is then prompted to enter a new release tag. To assist with this, the script displays the tag retrieved in the previous step and validates the format of the entered tag (vX.X.X). The user is also reminded to update the version field in package.json.
  3. Tagging the new release: The script then tags a new release and syncs the separate major tag (e.g. v1, v2) with the new release tag (e.g. v1.0.0, v2.1.2). When the user is creating a new major release, the script auto-detects this and creates a releases/v# branch for the previous major version.
  4. Pushing changes to remote: Finally, the script pushes the necessary commits, tags and branches to the remote repository. From here, you will need to create a new release in GitHub so users can easily reference the new tags in their workflows.
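
The tag handling in steps 2 and 3 can be sketched in TypeScript as follows. This is an illustration only, not the contents of script/release: `TAG_PATTERN`, `ReleasePlan`, and `planRelease` are hypothetical names, and treating a missing or malformed latest tag as an error is an assumption of the sketch.

```typescript
// Tags must look like vX.X.X, for example v1.0.0 or v2.1.2.
const TAG_PATTERN = /^v(\d+)\.(\d+)\.(\d+)$/;

interface ReleasePlan {
  newTag: string; // the full release tag to create
  majorTag: string; // the moving major tag to sync, e.g. v1
  previousMajorBranch: string | null; // releases/v# branch to create on a major bump
}

function planRelease(latestTag: string, newTag: string): ReleasePlan {
  const latest = TAG_PATTERN.exec(latestTag);
  const next = TAG_PATTERN.exec(newTag);
  if (!latest) throw new Error(`Latest tag is not in vX.X.X format: ${latestTag}`);
  if (!next) throw new Error(`New tag is not in vX.X.X format: ${newTag}`);

  const latestMajor = Number(latest[1]);
  const nextMajor = Number(next[1]);

  return {
    newTag,
    majorTag: `v${nextMajor}`,
    // A higher major number means the previous major version gets its own branch.
    previousMajorBranch: nextMajor > latestMajor ? `releases/v${latestMajor}` : null,
  };
}
```

For instance, moving from v1.4.2 to v2.0.0 yields a major tag of v2 and a releases/v1 branch, while moving to v1.5.0 only re-points v1.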