Ignore linkcheck for dummy URLs
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
keshav-space committed Oct 17, 2024
1 parent 2e31b8e commit 0d2a593
Showing 4 changed files with 36 additions and 29 deletions.
1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -35,6 +35,7 @@
"https://anongit.gentoo.org/git/data/glsa.git", # Git only link
"https://www.softwaretestinghelp.com/how-to-write-good-bug-report/", # Cloudflare protection
"https://www.openssl.org/news/vulnerabilities.xml", # OpenSSL legacy advisory URL, not longer available
+ "https://example.org/api/non-existent-packages",
]

# Add any Sphinx extension module names here, as strings. They can be
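
For context on the ``conf.py`` hunk above: the list being extended appears to be Sphinx's ``linkcheck_ignore`` option, which takes regular expressions for URIs that the linkcheck builder should skip. The following is a minimal sketch of that matching, assuming the patterns are applied with ``re.match`` (that is, anchored at the start of the URI). The helper ``is_ignored`` is hypothetical and is not part of Sphinx.

```python
import re

# Single entry copied from the hunk above; the real conf.py list has more entries.
linkcheck_ignore = [
    "https://example.org/api/non-existent-packages",
]


def is_ignored(uri, patterns=linkcheck_ignore):
    """Hypothetical helper: True if any pattern matches the start of ``uri``."""
    return any(re.match(pattern, uri) for pattern in patterns)


print(is_ignored("https://example.org/api/non-existent-packages"))
print(is_ignored("https://example.org/other"))
```

Because the entries are treated as regular expressions, a plain URL also matches any longer URI that starts with it.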
6 changes: 4 additions & 2 deletions docs/source/contributing.rst
@@ -54,8 +54,10 @@ For more established contributors, you can contribute to the codebase in several
  - Create a `new issue <https://github.com/aboutcode-org/vulnerablecode/issues>`_ to request a
  feature, submit a feedback, or ask a question.

- * Want to add support for a new importer pipeline? See the detailed tutorial here: :ref:`tutorial_add_importer_pipeline`.
- * Interested adding a new improver pipeline? Check out the tutorial here: :ref:`tutorial_add_improver_pipeline`.
+ * Want to add support for a new importer pipeline? See the detailed tutorial here:
+ :ref:`tutorial_add_importer_pipeline`.
+ * Interested adding a new improver pipeline? Check out the tutorial here:
+ :ref:`tutorial_add_improver_pipeline`.

.. note::
Make sure to check existing `issues <https://github.com/aboutcode-org/vulnerablecode/issues>`_,
23 changes: 12 additions & 11 deletions docs/source/tutorial_add_importer_pipeline.rst
@@ -23,20 +23,21 @@ Pipeline
We use `aboutcode.pipeline <https://github.com/aboutcode-org/scancode.io/tree/main/aboutcode/pipeline>`_
for importing and improving data. At a very high level, a working pipeline contains classmethod
``steps`` that defines what steps to run and in what order. These steps are essentially just
- functions. Pipeline provides an easy and effective way to log events inside these steps (it
- automatically handles rendering and dissemination for these logs.)
+ functions. Pipeline provides an easy and effective way to log events inside these steps (it
+ automatically handles rendering and dissemination for these logs.)

It also includes built-in progress indicator, which is essential since some of the jobs we run
in the pipeline are long-running tasks that require proper progress indicators. Pipeline provides
way to seamlessly records the progress (it automatically takes care of rendering and dissemination
of these progress).

Additionally, the pipeline offers a consistent structure, making it easy to run these pipeline steps
- with message queue like RQ and store all events related to a particular pipeline for
+ with message queue like RQ and store all events related to a particular pipeline for
debugging/improvements.

This tutorial contains all the things one should know to quickly implement an importer pipeline.
- Many internal details about importer pipeline can be found inside the `vulnerabilities/pipelines/__init__.py
+ Many internal details about importer pipeline can be found inside the
+ `vulnerabilities/pipelines/__init__.py
<https://github.com/aboutcode-org/vulnerablecode/blob/main/vulnerabilities/pipelines/__init__.py>`_ file.
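
The ``steps`` pattern described in this hunk can be sketched as follows. This is only an illustration under stated assumptions: ``MiniPipeline``, its ``log`` and ``execute`` methods, and ``ExamplePipeline`` are hypothetical stand-ins written for this example and are not the ``aboutcode.pipeline`` API.

```python
class MiniPipeline:
    """Hypothetical stand-in for a pipeline base class (not aboutcode.pipeline)."""

    @classmethod
    def steps(cls):
        # Subclasses return the step functions in the order they should run.
        raise NotImplementedError

    def log(self, message):
        # Collect events in memory; a real pipeline would render and store them.
        self.events.append(message)

    def execute(self):
        self.events = []
        for step in self.steps():
            self.log(f"Step [{step.__name__}] starting")
            step(self)
            self.log(f"Step [{step.__name__}] completed")
        return self.events


class ExamplePipeline(MiniPipeline):
    @classmethod
    def steps(cls):
        return (cls.fetch, cls.parse)

    def fetch(self):
        # Placeholder for downloading raw data.
        self.raw = "a,b,c"

    def parse(self):
        self.items = self.raw.split(",")
        self.log(f"Parsed {len(self.items)} items")


pipeline = ExamplePipeline()
events = pipeline.execute()
print("\n".join(events))
```

Each step is an ordinary function that receives the pipeline instance, so the order returned by ``steps`` is the only place where sequencing is defined.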


@@ -95,13 +96,13 @@ Create file for the new importer pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All pipelines, including the importer pipeline, are located in the
- `vulnerabilities/pipelines/
+ `vulnerabilities/pipelines/
<https://github.com/aboutcode-org/vulnerablecode/tree/main/vulnerabilities/pipelines>`_ directory.

The importer pipeline is implemented by subclassing **VulnerableCodeBaseImporterPipeline**
and implementing the unimplemented methods. Since most tasks, such as inserting **AdvisoryData**
into the database and creating package-vulnerability relationships, are the same regardless of
- the source of the advisory, these tasks are already taken care of in the base importer pipeline,
+ the source of the advisory, these tasks are already taken care of in the base importer pipeline,
i.e., **VulnerableCodeBaseImporterPipeline**. You can simply focus on collecting the raw data and
parsing it to create proper **AdvisoryData** objects.

@@ -134,7 +135,7 @@ and that's it.
In some cases, it could be difficult to get the exact total number of advisories that would
be collected without actually processing the advisories. In such case returning the best
estimate will also work.
-
+
**advisories_count** is used to enable a proper progress indicator and is not used beyond that.
If it is impossible (a super rare case) to compute the total advisory count beforehand,
just return ``0``.
@@ -174,7 +175,7 @@ At this point, an example importer will look like this:
    def advisories_count(self) -> int:
        raise NotImplementedError
-
+
    def collect_advisories(self) -> Iterable[AdvisoryData]:
        raise NotImplementedError
@@ -291,7 +292,7 @@ version management from `univers <https://github.com/aboutcode-org/univers>`_.
.. important::
Steps should include ``collect_and_store_advisories`` and ``import_new_advisories``
in the order shown above. They are defined in **VulnerableCodeBaseImporterPipeline**.
-
+
It is the **collect_and_store_advisories** that is responsible for making calls to
**collect_advisories** and **advisories_count**, and hence **collect_advisories** and
**advisories_count** should never be directly added in steps.
@@ -307,7 +308,7 @@ Register the Importer Pipeline
------------------------------

Finally, register your pipeline in the importer registry at
- `vulnerabilities/importers/__init__.py
+ `vulnerabilities/importers/__init__.py
<https://github.com/aboutcode-org/vulnerablecode/blob/main/vulnerabilities/importers/__init__.py>`_

.. code-block:: python
@@ -363,4 +364,4 @@ Now, run the importer.
INFO 2024-10-16 10:15:10.563 Pipeline completed in 0 seconds
- See :ref:`command_line_interface` for command line usage instructions.
+ See :ref:`command_line_interface` for command line usage instructions.
35 changes: 19 additions & 16 deletions docs/source/tutorial_add_improver_pipeline.rst
@@ -20,16 +20,16 @@ Pipeline
We use `aboutcode.pipeline <https://github.com/aboutcode-org/scancode.io/tree/main/aboutcode/pipeline>`_
for importing and improving data. At a very high level, a working pipeline contains classmethod
``steps`` that defines what steps to run and in what order. These steps are essentially just
- functions. Pipeline provides an easy and effective way to log events inside these steps (it
- automatically handles rendering and dissemination for these logs.)
+ functions. Pipeline provides an easy and effective way to log events inside these steps (it
+ automatically handles rendering and dissemination for these logs.)

It also includes built-in progress indicator, which is essential since some of the jobs we run
in the pipeline are long-running tasks that require proper progress indicators. Pipeline provides
way to seamlessly records the progress (it automatically takes care of rendering and dissemination
of these progress).

Additionally, the pipeline offers a consistent structure, making it easy to run these pipeline steps
- with message queue like RQ and store all events related to a particular pipeline for
+ with message queue like RQ and store all events related to a particular pipeline for
debugging/improvements.

This tutorial contains all the things one should know to quickly implement an improver pipeline.
@@ -42,10 +42,10 @@ The new improver design lets you do all sorts of cool improvements and enhanceme
Some of those are:

  * Let's suppose you have a certain number of packages and vulnerabilities in your database,
- and you want to make sure that the packages being shown in VulnerableCode do indeed exist upstream.
- Oftentimes, we come across advisory data that contains made-up package versions. We can write
- (well, we already have) a pipeline that iterates through all the packages in VulnerableCode and
- labels them as ghost packages if they don't exist upstream.
+ and you want to make sure that the packages being shown in VulnerableCode do indeed exist
+ upstream. Oftentimes, we come across advisory data that contains made-up package versions.
+ We can write (well, we already have) a pipeline that iterates through all the packages in
+ VulnerableCode and labels them as ghost packages if they don't exist upstream.


  - A basic security advisory only contains CVE/aliases, summary, fixed/affected version, and
@@ -64,17 +64,20 @@ be absolutely sure of what you're doing and should have robust tests for these p
Writing an Improver Pipeline
-----------------------------

- **Scenario:** Suppose we come around a source that curates and stores the list of packages that don't
- exist upstream and makes it available through the REST API endpoint https://example.org/api/non-existent-packages,
- which gives a JSON response with a list of non-existent packages.
- Let's write a pipeline that will use this source to flag these non-existent package as ghost package.
+ **Scenario:** Suppose we come around a source that curates and stores the list of packages that
+ don't exist upstream and makes it available through the REST API endpoint
+ https://example.org/api/non-existent-packages, which gives a JSON response with a list of
+ non-existent packages.
+
+ Let's write a pipeline that will use this source to flag these non-existent package as
+ ghost package.


Create file for the new improver pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All pipelines, including the improver pipeline, are located in the
- `vulnerabilities/pipelines/
+ `vulnerabilities/pipelines/
<https://github.com/aboutcode-org/vulnerablecode/tree/main/vulnerabilities/pipelines>`_ directory.

The improver pipeline is implemented by subclassing `VulnerableCodePipeline`.
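
The scenario in this hunk can be sketched without any VulnerableCode models. Everything here is an assumption made for illustration: the response is taken to be a JSON list of package URL strings, packages are held in a plain dict, and the names ``parse_non_existent_packages``, ``flag_ghost_packages`` and ``is_ghost`` are hypothetical.

```python
import json

# Assumed response body for the dummy endpoint: a JSON list of package URLs.
SAMPLE_RESPONSE = '["pkg:npm/foo@9.9.9", "pkg:pypi/bar@0.0.1"]'


def parse_non_existent_packages(body):
    """Turn the (assumed) JSON list into a set for fast membership checks."""
    return set(json.loads(body))


def flag_ghost_packages(packages, non_existent):
    """Mark known packages that the source reports as non-existent.

    ``packages`` maps a package URL to a record dict with an ``is_ghost`` flag.
    Returns the number of records newly flagged.
    """
    flagged = 0
    for purl, record in packages.items():
        if purl in non_existent and not record["is_ghost"]:
            record["is_ghost"] = True
            flagged += 1
    return flagged


# In-memory stand-in for the package table.
packages = {
    "pkg:npm/foo@9.9.9": {"is_ghost": False},
    "pkg:npm/foo@1.0.0": {"is_ghost": False},
}
non_existent = parse_non_existent_packages(SAMPLE_RESPONSE)
count = flag_ghost_packages(packages, non_existent)
print(count)
```

Splitting the fetch/parse part from the flagging part mirrors the two unimplemented methods shown later in the diff, ``fetch_response`` and ``flag_ghost_packages``.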
@@ -124,7 +127,7 @@ At this point improver will look like this:
    def fetch_response(self):
        raise NotImplementedError
-
+
    def flag_ghost_packages(self):
        raise NotImplementedError
@@ -194,7 +197,7 @@ Register the Improver Pipeline
------------------------------

Finally, register your improver in the improver registry at
- `vulnerabilities/improvers/__init__.py
+ `vulnerabilities/improvers/__init__.py
<https://github.com/aboutcode-org/vulnerablecode/blob/main/vulnerabilities/improvers/__init__.py>`_


@@ -253,8 +256,8 @@ See :ref:`command_line_interface` for command line usage instructions.

.. tip::

- If you need to improve package vulnerability relations created using a certain pipeline,
- simply use the **pipeline_id** to filter out only those items. For example, if you want
+ If you need to improve package vulnerability relations created using a certain pipeline,
+ simply use the **pipeline_id** to filter out only those items. For example, if you want
to improve only those **AffectedByPackageRelatedVulnerability** entries that were created
by npm_importer pipeline, you can do so with the following query:
