feature: Introduce pre-commit for linting and formatting
Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>
harshad16 committed Nov 7, 2023
1 parent 1162813 commit e965de5
Showing 72 changed files with 1,366 additions and 540 deletions.
16 changes: 16 additions & 0 deletions .github/workflows/pre-commit.yml
@@ -0,0 +1,16 @@
---
# using the pre-commit action from: https://github.com/pre-commit/action
name: pre-commit

on: # yamllint disable-line rule:truthy
  pull_request:
  push:
    branches: [main]

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v3
      - uses: pre-commit/action@v3.0.0
36 changes: 36 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,36 @@
---
repos:
  - repo: https://github.com/Lucas-C/pre-commit-hooks
    rev: v1.3.1
    hooks:
      - id: remove-tabs

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: check-added-large-files
      - id: check-ast
      - id: check-byte-order-marker
      - id: check-case-conflict
      - id: check-docstring-first
      - id: check-json
      - id: check-merge-conflict
      - id: check-symlinks
      - id: check-toml
      - id: check-yaml
        args: [--allow-multiple-documents]
      - id: debug-statements
      - id: detect-private-key
      - id: end-of-file-fixer
      - id: trailing-whitespace

  - repo: https://github.com/psf/black
    rev: 23.10.0
    hooks:
      - id: black

  - repo: https://github.com/PyCQA/flake8
    rev: '6.1.0'
    hooks:
      - id: flake8
        args: ['--max-line-length=120']
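As a quick usage sketch, individual hooks from this configuration can also be run on demand with the standard pre-commit CLI (hook ids taken from the file above):

```bash
# Run a single hook (e.g. black) against the whole repository
pre-commit run black --all-files

# Bump every hook's rev to the latest released tag
pre-commit autoupdate
```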
17 changes: 16 additions & 1 deletion CONTRIBUTING.md
@@ -15,6 +15,7 @@ Pull requests are the best way to propose changes to the notebooks repository:

- Configure name and email in git
- Fork the repo and create your branch from main.
- Install [pre-commit](https://pre-commit.com/) into your [git hooks](https://githooks.com/) by running `pre-commit install`. See [linting](#linting) for more.
- Sign off your commit using the -s, --signoff option. Write a good commit message (see [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/))
- If you've added code that should be tested, [add tests](https://github.com/openshift/release/blob/master/ci-operator/config/opendatahub-io/notebooks/opendatahub-io-notebooks-main.yaml).
- Ensure the test suite passes.
@@ -35,7 +36,7 @@ Pull requests are the best way to propose changes to the notebooks repository:
# Your comment here
.PHONY: jupyter-${NOTEBOOK_NAME}-ubi8-python-3.8
jupyter-${NOTEBOOK_NAME}-ubi8-python-3.8: jupyter-minimal-ubi8-python-3.8
-$(call image,$@,jupyter/${NOTEBOOK_NAME}/ubi8-python-3.8,$<)
+$(call image,$@,jupyter/${NOTEBOOK_NAME}/ubi8-python-3.8,$<)
```
- Add the paths of the new pipfiles under `refresh-pipfilelock-files`
- Test the changes locally by manually running `$ make jupyter-${NOTEBOOK_NAME}-ubi8-python-3.8` from the terminal.
@@ -56,3 +57,17 @@ Pull requests are the best way to propose changes to the notebooks repository:
### Testing your PR locally

- Test the changes locally by manually running `$ make jupyter-${NOTEBOOK_NAME}-ubi8-python-3.8` from the terminal. This helps a lot in the initial phase.

### Linting

To run linting and formatting checks, we use [pre-commit](https://pre-commit.com/).

The [pre-commit](https://pre-commit.com) configuration for this repository lives in [.pre-commit-config.yaml](.pre-commit-config.yaml).
To [use pre-commit](https://pre-commit.com/#usage), install it with `pip3 install pre-commit` and then either:

* Run `pre-commit install` after you clone the repo; from then on, `pre-commit` runs all the checkers/linters/formatters automatically on every `git commit`.
  * If any of the checks fail, add and commit the changes made by pre-commit. Once the pre-commit checks pass, you can open your PR.
  * If you later want to commit without running the hooks, pass `-n`/`--no-verify` to `git commit`.

or

* Run `pre-commit run --all-files` to run all the checkers/linters/formatters manually.
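For example, a typical first-time setup and run might look like this (assuming `pip3` and `git` are already available):

```bash
# One-time setup: install pre-commit and register it as a git hook
pip3 install pre-commit
pre-commit install

# Run all configured hooks against every file in the repository
pre-commit run --all-files

# Hooks now run automatically on every commit;
# pass -n/--no-verify to skip them for a single commit
git commit -s -m "feature: my change"
git commit -s -n -m "wip: skip hooks this time"
```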
5 changes: 2 additions & 3 deletions Makefile
@@ -331,8 +331,8 @@ test-%: bin/kubectl
$(KUBECTL_BIN) wait --for=condition=ready pod -l app=$(NOTEBOOK_NAME) --timeout=600s
$(KUBECTL_BIN) port-forward svc/$(NOTEBOOK_NAME)-notebook 8888:8888 & curl --retry 5 --retry-delay 5 --retry-connrefused http://localhost:8888/notebook/opendatahub/jovyan/api ; EXIT_CODE=$$?; echo && pkill --full "^$(KUBECTL_BIN).*port-forward.*"; \
$(eval FULL_NOTEBOOK_NAME = $(shell ($(KUBECTL_BIN) get pods -l app=$(NOTEBOOK_NAME) -o custom-columns=":metadata.name" | tr -d '\n')))
-# Tests notebook's functionalities

+# Tests notebook's functionalities
if echo "$(FULL_NOTEBOOK_NAME)" | grep -q "minimal-ubi9"; then \
$(call test_with_papermill,minimal,ubi9,python-3.9) \
elif echo "$(FULL_NOTEBOOK_NAME)" | grep -q "datascience-ubi9"; then \
@@ -467,4 +467,3 @@ refresh-pipfilelock-files:
cd runtimes/tensorflow/ubi8-python-3.8 && pipenv lock
cd runtimes/tensorflow/ubi9-python-3.9 && pipenv lock
cd base/c9s-python-3.9 && pipenv lock
-
2 changes: 1 addition & 1 deletion README.md
@@ -223,7 +223,7 @@ make undeployX-${NOTEBOOK_NAME}
## Validating Runtimes

The runtimes image requires curl and python to be installed,
-so that additional packages can be installed at runtime.
+so that additional packages can be installed at runtime.

Deploy the runtime images in your Kubernetes environment using deploy8-${WORKBENCH_NAME} for ubi8 or deploy9-${WORKBENCH_NAME} for ubi9:

2 changes: 1 addition & 1 deletion UPDATES.md
@@ -4,7 +4,7 @@
This document aims to provide an overview of the rebuilding plan for the notebook images. There are two types of updates that are implemented:

1. *Release updates* - These updates will be carried out twice a year and will incorporate major updates to the notebook images.

2. *Patch updates* - These updates will be carried out weekly and will focus on incorporating security updates to the notebook images.

## Scope and frequency of the updates
2 changes: 1 addition & 1 deletion base/anaconda-python-3.8/Dockerfile
@@ -101,4 +101,4 @@ RUN curl -L https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/sta
# Fix permissions to support pip in Openshift environments
RUN fix-permissions /opt/app-root -P

-WORKDIR /opt/app-root/src
+WORKDIR /opt/app-root/src
4 changes: 2 additions & 2 deletions ci/check-json.sh
@@ -24,13 +24,13 @@ function check_json() {
if grep --quiet --extended-regexp "${string}" "${f}"; then
#if $(grep -e "${string}" "${f}"); then
jsons=$(yq -r ".spec.tags[].annotations.\"${string}\"" "${f}")

while IFS= read -r json; do
echo " ${json}"
echo -n " > "; echo "${json}" | json_verify || ret_code="${?}"
done <<< "${jsons}"
else
-echo "    Ignoring as this file doesn't contain necessary key field '${string}' for check"
+echo "    Ignoring as this file doesn't contain necessary key field '${string}' for check"
fi

return "${ret_code}"
@@ -23,9 +23,9 @@ function process_extending_files() {
# Custom file is preferred
if [ -f $custom_dir/$filename ]; then
source $custom_dir/$filename
-elif [ -f $default_dir/$filename ]; then
+elif [ -f $default_dir/$filename ]; then
source $default_dir/$filename
-fi
+fi
done <<<"$(get_matched_files "$custom_dir" "$default_dir" '*.sh' | sort -u)"
-}
+}
2 changes: 1 addition & 1 deletion codeserver/c9s-python-3.9/run-code-server.sh
@@ -6,7 +6,7 @@ source ${SCRIPT_DIR}/utils/*.sh

# Start nginx and fastcgiwrap
run-nginx.sh &
-spawn-fcgi -s /var/run/fcgiwrap.socket -M 766 /usr/sbin/fcgiwrap
+spawn-fcgi -s /var/run/fcgiwrap.socket -M 766 /usr/sbin/fcgiwrap

# Add .bashrc for custom prompt if not present
if [ ! -f "/opt/app-root/src/.bashrc" ]; then
2 changes: 1 addition & 1 deletion codeserver/c9s-python-3.9/run-nginx.sh
@@ -23,4 +23,4 @@ else
envsubst '${BASE_URL}' < /etc/nginx/nginx.conf | tee /etc/nginx/nginx.conf
fi

-nginx
+nginx
2 changes: 1 addition & 1 deletion codeserver/c9s-python-3.9/utils/process.sh
@@ -16,4 +16,4 @@ function start_process() {

function stop_process() {
kill -TERM $PID
-}
+}
42 changes: 21 additions & 21 deletions docs/developer-guide.md
@@ -2,7 +2,7 @@ The following sections are aimed to provide a comprehensive guide for developers

## Getting Started
This project utilizes three branches for development: the **main** branch, which hosts the latest development, and **two additional branches for each release**.
-These release branches follow a specific naming format: YYYYx, where "YYYY" represents the year, and "x" is an increasing letter. Thus, they help to keep working on minor updates and bug fixes on the supported versions (N & N-1) of each workbench.
+These release branches follow a specific naming format: YYYYx, where "YYYY" represents the year, and "x" is an increasing letter. Thus, they help to keep working on minor updates and bug fixes on the supported versions (N & N-1) of each workbench.

## Architecture
The structure of the notebook's build chain is derived from the parent image. To better comprehend this concept, refer to the following graph.
@@ -19,30 +19,30 @@ Detailed instructions on how developers can contribute to this project can be fo
## Workbench ImageStreams

ODH supports multiple out-of-the-box pre-built workbench images ([provided in this repository](https://github.com/opendatahub-io/notebooks)). For each of those workbench images, there is a dedicated ImageStream object definition. This ImageStream object references the actual image tag(s) and contains additional metadata that describe the workbench image.

### **Annotations**

Aside from the general ImageStream config values, there are additional annotations that can be provided in the workbench ImageStream definition. This additional data is leveraged further by the [odh-dashboard](https://github.com/opendatahub-io/odh-dashboard/).

-### **ImageStream-specific annotations**
-The following labels and annotations are specific to the particular workbench image. They are provided in their respective sections in the `metadata` section.
+### **ImageStream-specific annotations**
+The following labels and annotations are specific to the particular workbench image. They are provided in their respective sections in the `metadata` section.
```yaml
metadata:
  labels:
    ...
  annotations:
    ...
```
-### **Available labels**
+### **Available labels**
- **`opendatahub.io/notebook-image:`** - a flag that determines whether the ImageStream references a workbench image that is meant to be shown in the UI
### **Available annotations**
- **`opendatahub.io/notebook-image-url:`** - a URL reference to the source of the particular workbench image
- **`opendatahub.io/notebook-image-name:`** - a desired display name string for the particular workbench image (used in the UI)
-- **`opendatahub.io/notebook-image-desc:`** - a desired description string of the particular workbench image (used in the UI)
+- **`opendatahub.io/notebook-image-desc:`** - a desired description string of the particular workbench image (used in the UI)
- **`opendatahub.io/notebook-image-order:`** - an index value for the particular workbench ImageStream (used by the UI to list available workbench images in a specific order)
- **`opendatahub.io/recommended-accelerators`** - a string that represents the list of recommended hardware accelerators for the particular workbench ImageStream (used in the UI)

-### **Tag-specific annotations**
+### **Tag-specific annotations**
One ImageStream can reference multiple image tags. The following annotations are specific to a particular workbench image tag and are provided in its `annotations:` section.
```yaml
spec:
  tags:
    - annotations:
        ...
      from:
        kind: DockerImage
        name: image-repository/tag
      name: tag-name
```
-### **Available annotations**
+### **Available annotations**
- **`opendatahub.io/notebook-software:`** - a string that represents the technology stack included within the workbench image. Each technology in the list is described by its name and the version used (e.g. `'[{"name":"CUDA","version":"11.8"},{"name":"Python","version":"v3.9"}]'`)
- **`opendatahub.io/notebook-python-dependencies:`** - a string that represents the list of Python libraries included within the workbench image. Each library is described by its name and currently used version (e.g. `'[{"name":"Numpy","version":"1.24"},{"name":"Pandas","version":"1.5"}]'`)
- **`openshift.io/imported-from:`** - a reference to the image repository where the workbench image was obtained (e.g. `quay.io/repository/opendatahub/workbench-images`)
- **`opendatahub.io/workbench-image-recommended:`** - a flag that allows the ImageStream tag to be marked as Recommended (used by the UI to distinguish which tags are recommended for use, e.g., when the workbench image offers multiple tags to choose from)

-### **ImageStream definitions for the supported out-of-the-box images in ODH**
+### **ImageStream definitions for the supported out-of-the-box images in ODH**

The ImageStream definitions of the out-of-the-box workbench images for ODH can be found [here](https://github.com/opendatahub-io/notebooks/tree/main/manifests).

-### **Example ImageStream object definition**
+### **Example ImageStream object definition**

An exemplary, non-functioning ImageStream object definition that uses all the aforementioned annotations is provided below.

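A minimal sketch of such a definition might look like the following; all names, URLs, and version values here are illustrative placeholders rather than a real image definition:

```yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: example-workbench            # placeholder name
  labels:
    opendatahub.io/notebook-image: "true"
  annotations:
    opendatahub.io/notebook-image-url: "https://github.com/example/workbench"
    opendatahub.io/notebook-image-name: "Example Workbench"
    opendatahub.io/notebook-image-desc: "An illustrative workbench image"
    opendatahub.io/notebook-image-order: "1"
    opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
spec:
  tags:
    - name: "2023.2"
      annotations:
        opendatahub.io/notebook-software: '[{"name":"Python","version":"v3.9"}]'
        opendatahub.io/notebook-python-dependencies: '[{"name":"Numpy","version":"1.24"}]'
        openshift.io/imported-from: "quay.io/opendatahub/workbench-images"
        opendatahub.io/workbench-image-recommended: "true"
      from:
        kind: DockerImage
        name: quay.io/opendatahub/workbench-images:example-tag   # placeholder reference
```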
@@ -114,11 +114,11 @@ The opendatahub-io-ci-image-mirror job will be used to mirror the images from th
```
tests:
- as: ${NOTEBOOK_IMAGE_NAME}-image-mirror
  steps:
-    dependencies:
-      SOURCE_IMAGE_REF: ${NOTEBOOK_IMAGE_NAME}
-    env:
-      IMAGE_REPO: notebooks
-    workflow: opendatahub-io-ci-image-mirror
+    dependencies:
+      SOURCE_IMAGE_REF: ${NOTEBOOK_IMAGE_NAME}
+    env:
+      IMAGE_REPO: notebooks
+    workflow: opendatahub-io-ci-image-mirror
```
The images are mirrored under 2 different scenarios:
1. A new PR is opened.
@@ -128,7 +128,7 @@ The Openshift CI is also configured to run the unit and integration tests:
```
tests:
-- as: notebooks-e2e-tests
+- as: notebooks-e2e-tests
  steps:
    test:
    - as: ${NOTEBOOK_IMAGE_NAME}-e2e-tests
```
@@ -146,15 +146,15 @@ This GitHub action is configured to be triggered on a weekly basis, specifically
### **Sync the downstream release branch with the upstream** [[Link]](https://github.com/red-hat-data-services/notebooks/blob/main/.github/workflows/sync-release-branch-2023a.yml)
-This GitHub action is configured to be triggered on a weekly basis, specifically every Tuesday at 08:00 AM UTC. Its main objective is to automatically update the downstream release branch with the upstream branch.
+This GitHub action is configured to be triggered on a weekly basis, specifically every Tuesday at 08:00 AM UTC. Its main objective is to automatically update the downstream release branch with the upstream branch.
### **Digest Updater workflow on the manifests** [[Link]](https://github.com/opendatahub-io/odh-manifests/blob/master/.github/workflows/notebooks-digest-updater-upstream.yaml)
This GitHub action is designed to be triggered on a weekly basis, specifically every Friday at 12:00 AM UTC. Its primary purpose is to automate the process of updating the SHA digest of the notebooks. It achieves this by fetching the new SHA values from the quay.io registry and updating the [param.env](https://github.com/opendatahub-io/odh-manifests/blob/master/notebook-images/base/params.env) file, which is hosted on the odh-manifest repository. By automatically updating the SHA digest, this action ensures that the notebooks remain synchronized with the latest changes.
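For illustration, the entries this workflow rewrites in that params.env file are plain key=value image references; a hypothetical entry (illustrative key, elided digest) might look like:

```
odh-minimal-notebook-image-n=quay.io/opendatahub/workbench-images@sha256:<digest>
```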
### **Digest Updater workflow on the live-builder** [[Link]](https://gitlab.cee.redhat.com/data-hub/rhods-live-builder/-/blob/main/.gitlab/notebook-sha-digest-updater.yml)
-This GitHub action works with the same logic as the above and is designed to be triggered on a weekly basis, specifically every Friday. It also updates the SHA digest of the images in the [CSV](https://gitlab.cee.redhat.com/data-hub/rhods-live-builder/-/blob/main/rhods-operator-live/bundle/template/manifests/clusterserviceversion.yml.j2#L725) file on the live-builder repo.
+This GitHub action works with the same logic as the above and is designed to be triggered on a weekly basis, specifically every Friday. It also updates the SHA digest of the images in the [CSV](https://gitlab.cee.redhat.com/data-hub/rhods-live-builder/-/blob/main/rhods-operator-live/bundle/template/manifests/clusterserviceversion.yml.j2#L725) file on the live-builder repo.
[Previous Page](https://github.com/opendatahub-io/notebooks/wiki/Workbenches) | [Next Page](https://github.com/opendatahub-io/notebooks/wiki/User-Guide)
11 changes: 5 additions & 6 deletions docs/user-guide.md
@@ -1,7 +1,7 @@
The following sections are aimed to provide a comprehensive guide on effectively utilizing an out-of-the-box notebook by a user.
There are two options for launching a workbench image: either through the Enabled applications or the Data Science Project.

## Notebook Spawner
-## Notebook Spawner
+## Notebook Spawner
In the ODH dashboard, you can navigate to Applications -> Enabled -> Launch Application from the Jupyter tile. The notebook server spawner page displays a list of available container images you can run as a single user.

@@ -36,11 +36,10 @@ During the release lifecycle, which is the period during which the update is sup
Our goal is to ensure that notebook images are supported for a minimum of one year, meaning that typically two supported images will be available at any given time. This provides sufficient time for users to update their code to use components from the latest notebook images. We will continue to make older images available in the registry for users to add as custom notebook images, even if they are no longer supported. This way, users can still access the older images if needed.
Example lifecycle (not actual dates):

-2023-01-01 - only one version of the notebook images is available - version 1 for all images.
-2023-06-01 - release updated images - version 2 (v2023a). Versions 1 & 2 are supported and available for selection in the UI.
-2023-12-01 - release updated images - version 3 (v2023b). Versions 2 & 3 are supported and available for selection in the UI.
-2024-06-01 - release updated images - version 4 (v2024a). Versions 3 & 4 are supported and available for selection in the UI.
+2023-01-01 - only one version of the notebook images is available - version 1 for all images.
+2023-06-01 - release updated images - version 2 (v2023a). Versions 1 & 2 are supported and available for selection in the UI.
+2023-12-01 - release updated images - version 3 (v2023b). Versions 2 & 3 are supported and available for selection in the UI.
+2024-06-01 - release updated images - version 4 (v2024a). Versions 3 & 4 are supported and available for selection in the UI.


[Previous Page](https://github.com/opendatahub-io/notebooks/wiki/Developer-Guide)
