Separate import and improve operations #525

Merged
merged 40 commits into from
Jan 25, 2022
40 commits
de518f7
Get rid of added_advisory and batches
Hritik14 Aug 10, 2021
f783915
Implement improver registry and management command
Hritik14 Aug 12, 2021
fa74025
Implement improver
Hritik14 Aug 13, 2021
5fa3cd5
Use new Advisory model for importers
Hritik14 Aug 28, 2021
08605cd
Remove Inference->Advisory encapsulation
Hritik14 Aug 29, 2021
43c8ab9
Fix UI break
Hritik14 Aug 29, 2021
96a8cb3
Use to_dict, add docs, infer as function name
Hritik14 Sep 12, 2021
c3f34e9
Use OSV design for AffectedPackages
Hritik14 Sep 12, 2021
c2ef79c
Modify nginx importer to adopt OSV design
Hritik14 Oct 12, 2021
7c1e24c
Assert Inference fields
Hritik14 Sep 23, 2021
31c1054
Process one advisory in one transaction
Hritik14 Oct 11, 2021
2694c4e
Improve docs and cleanup code
Hritik14 Oct 12, 2021
c60e030
Hotfix parse_version from univers
Hritik14 Oct 18, 2021
fd0f909
Disable failing importers, add docs and refactor
lchritik Oct 22, 2021
be31a0e
Use latest univers branch
pombredanne Nov 23, 2021
1971e5b
Update TODOs and docstrings
Hritik14 Nov 30, 2021
1216244
Adopt new univers for importers and tiny refactor
Hritik14 Dec 3, 2021
8065e3c
Enable doctest in pytest
Hritik14 Dec 3, 2021
5ec989d
Rewrite nginx importer to use new univers design
Hritik14 Dec 3, 2021
18e3d4b
Refactor according to code review on 2021-12-04
Hritik14 Dec 8, 2021
22d7132
Update improvers to accept new univers design
Hritik14 Dec 14, 2021
7cd4044
Add AdvisoryData.to_inference and AffectedPackage.merge
Hritik14 Jan 14, 2022
c7bc550
Add to_reper to DataSource
Hritik14 Jan 14, 2022
f399af6
Apply formatting changes
Hritik14 Jan 14, 2022
73f8682
Adopt new vers spec
Hritik14 Jan 14, 2022
d89e3d3
Implement NginxBasicImprover
Hritik14 Jan 14, 2022
4988452
Introduce aliases
Hritik14 Jan 15, 2022
299d6f1
Reformat Alias structure and process_improver
Hritik14 Jan 23, 2022
b9ba26a
Change __repr__ to qualified_name fn
Hritik14 Jan 23, 2022
e3a1ea5
Update .gitignore for junk files
Hritik14 Jan 23, 2022
c3e71f4
Fix few tests to recent structure, dump ugettext_lazy
Hritik14 Jan 23, 2022
e822c41
Ignore outdated tests
Hritik14 Jan 23, 2022
e9f940d
Disable outdated importers
Hritik14 Jan 23, 2022
318663e
Reset migrations
Hritik14 Jan 23, 2022
86d175e
Partition without numerical index
Hritik14 Jan 25, 2022
bd390d4
Add get_fixed_purl in AffectedPackage, fix from_dict
Hritik14 Jan 25, 2022
6ab6f06
Add FIXME about wrong filters in PackageSearchView
Hritik14 Jan 25, 2022
33d0c0d
Teeny weeny fixes
Hritik14 Jan 25, 2022
21c6178
Correct file format
pombredanne Jan 25, 2022
0e74bea
Use univers 30+. Bump lxml
pombredanne Jan 25, 2022
9 changes: 9 additions & 0 deletions .gitignore
@@ -125,3 +125,12 @@ Pipfile

# VSCode
.vscode

# Various junk and temp files
.DS_Store
*~
.*.sw[po]
.build
.ve
*.bak
/.cache/
32 changes: 31 additions & 1 deletion pytest.ini
@@ -1,4 +1,34 @@
[pytest]
DJANGO_SETTINGS_MODULE = vulnerablecode.settings
markers =
webtest
webtest
addopts =
--doctest-modules
# Ignore the following doctests until these files are migrated to
# import-improve structure
--ignore=vulnerabilities/importers/alpine_linux.py
--ignore=vulnerabilities/importers/apache_httpd.py
--ignore=vulnerabilities/importers/apache_kafka.py
--ignore=vulnerabilities/importers/apache_tomcat.py
--ignore=vulnerabilities/importers/archlinux.py
--ignore=vulnerabilities/importers/debian.py
--ignore=vulnerabilities/importers/elixir_security.py
--ignore=vulnerabilities/importers/gentoo.py
--ignore=vulnerabilities/importers/github.py
--ignore=vulnerabilities/importers/istio.py
--ignore=vulnerabilities/importers/kaybee.py
--ignore=vulnerabilities/importers/npm.py
--ignore=vulnerabilities/importers/nvd.py
--ignore=vulnerabilities/importers/openssl.py
--ignore=vulnerabilities/importers/postgresql.py
--ignore=vulnerabilities/importers/project_kb_msr2019.py
--ignore=vulnerabilities/importers/redhat.py
--ignore=vulnerabilities/importers/retiredotnet.py
--ignore=vulnerabilities/importers/ruby.py
--ignore=vulnerabilities/importers/rust.py
--ignore=vulnerabilities/importers/safety_db.py
--ignore=vulnerabilities/importers/suse_backports.py
--ignore=vulnerabilities/importers/suse_scores.py
--ignore=vulnerabilities/importers/ubuntu_usn.py
--ignore=vulnerabilities/management/commands/create_cpe_to_purl_map.py
--ignore=vulnerabilities/lib_oval.py
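
With `--doctest-modules` in `addopts`, pytest also collects the `>>>` examples found in module docstrings, which is why the not-yet-migrated importers are ignored above. A minimal sketch of a docstring it would pick up; `normalize_alias` is a hypothetical helper and not part of this PR:

```python
import doctest


def normalize_alias(alias: str) -> str:
    """
    Return ``alias`` stripped of surrounding whitespace and upper-cased.

    Hypothetical helper, used only to illustrate a doctest.

    >>> normalize_alias(" cve-2021-1234 ")
    'CVE-2021-1234'
    """
    return alias.strip().upper()


if __name__ == "__main__":
    # Run the docstring examples of this module directly with the
    # standard-library doctest runner.
    doctest.testmod()
```

A module whose docstring examples no longer match its behavior fails collection-time checks, so listing it under `--ignore` keeps the rest of the suite green until it is migrated.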
4 changes: 2 additions & 2 deletions requirements.txt
@@ -8,12 +8,12 @@ django-widget-tweaks>=1.4.8
packageurl-python>=0.9.4
binaryornot>=0.4.4
GitPython>=3.1.17
-univers>=21.4.16.6
+univers>=30.0.0
saneyaml>=0.5.2
beautifulsoup4>=4.9.3
python-dateutil>=2.8.1
toml>=0.10.2
-lxml>=4.6.3
+lxml>=4.6.4
gunicorn>=20.1.0
django-environ==0.4.5
defusedxml==0.7.1
98 changes: 98 additions & 0 deletions vulnerabilities/data_inference.py
@@ -0,0 +1,98 @@
import dataclasses

Collaborator:
For consistency, please add a license header. We still need to update the license everywhere per #277, but it is best to have whatever we have now applied consistently across all code files.

Collaborator Author:
Noted. How about I add all the license headers in the last commit? That might make things clearer. Let me know.

Collaborator:
👍

import logging
from typing import List
from typing import Optional
from uuid import uuid4

from packageurl import PackageURL
from django.db.models.query import QuerySet

from vulnerabilities.data_source import Reference
from vulnerabilities.data_source import AdvisoryData

logger = logging.getLogger(__name__)

MAX_CONFIDENCE = 100


@dataclasses.dataclass(order=True)
class Inference:
    """
    This data class expresses the contract between data improvers and the improve runner.

    Only inferences with highest confidence for one vulnerability <-> package
    relationship is to be inserted into the database
    """

    vulnerability_id: str = None
    aliases: List[str] = dataclasses.field(default_factory=list)
    confidence: int = MAX_CONFIDENCE
    summary: Optional[str] = None
    affected_purls: List[PackageURL] = dataclasses.field(default_factory=list)
    fixed_purl: PackageURL = dataclasses.field(default_factory=list)
    references: List[Reference] = dataclasses.field(default_factory=list)

    def __post_init__(self):
        if self.confidence > MAX_CONFIDENCE or self.confidence < 0:
            raise ValueError

        assert (
            self.vulnerability_id
            or self.aliases
            or self.summary
            or self.affected_purls
            or self.fixed_purl
            or self.references
        )

        versionless_purls = []
        for purl in self.affected_purls + [self.fixed_purl]:
            if not purl.version:
                versionless_purls.append(purl)

        assert (
            not versionless_purls
        ), f"Version-less purls are not supported in an Inference: {versionless_purls}"

    @classmethod
    def from_advisory_data(cls, advisory_data, confidence, affected_purls, fixed_purl):
        """
        Return an Inference object while keeping the same values as of advisory_data
        for vulnerability_id, summary and references
        """
        return cls(
            aliases=advisory_data.aliases,
            confidence=confidence,
            summary=advisory_data.summary,
            affected_purls=affected_purls,
            fixed_purl=fixed_purl,
            references=advisory_data.references,
        )
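
The `Inference` docstring says that only the highest-confidence inference for a given vulnerability/package relationship should reach the database. The sketch below illustrates that rule in isolation. It is not the improve runner's code: `SimpleInference` is a hypothetical stand-in that uses plain strings instead of `PackageURL` objects, and `pick_best` is an assumed helper name.

```python
import dataclasses
from typing import Dict, List, Tuple

MAX_CONFIDENCE = 100


@dataclasses.dataclass
class SimpleInference:
    # Hypothetical stand-in for Inference; the purl is a plain string here.
    vulnerability_id: str
    purl: str
    confidence: int = MAX_CONFIDENCE

    def __post_init__(self):
        # Same bounds check as Inference.__post_init__ in the diff.
        if self.confidence > MAX_CONFIDENCE or self.confidence < 0:
            raise ValueError(f"confidence out of range: {self.confidence}")


def pick_best(inferences: List[SimpleInference]) -> List[SimpleInference]:
    """Keep the highest-confidence inference per (vulnerability, purl) pair."""
    best: Dict[Tuple[str, str], SimpleInference] = {}
    for inference in inferences:
        key = (inference.vulnerability_id, inference.purl)
        current = best.get(key)
        # Replace the stored inference only when the new one is more confident.
        if current is None or inference.confidence > current.confidence:
            best[key] = inference
    return list(best.values())
```

Ties keep the first inference seen; whether the real runner breaks ties that way is not stated in this diff.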


class Improver:
    """
    Improvers are responsible to improve the already imported data by a datasource.
    Inferences regarding the data could be generated based on multiple factors.
    """

    @property
    def interesting_advisories(self) -> QuerySet:
        """
        Return QuerySet for the advisories this improver is interested in
        """
        raise NotImplementedError

    def get_inferences(self, advisory_data: AdvisoryData) -> List[Inference]:
        """
        Generate and return Inferences for the given advisory data
        """
        raise NotImplementedError

    @classmethod
    def qualified_name(cls):
        """
        Fully qualified name prefixed with the module name of the improver
        used in logging.
        """
        return f"{cls.__module__}.{cls.__qualname__}"
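
A concrete improver subclasses `Improver` and implements both hooks. The following is a rough sketch under stated assumptions, not code from this PR: the base class is repeated in reduced form (without Django types) so the block runs on its own, and `FakeAdvisory`, `EchoImprover`, the plain list standing in for a `QuerySet`, and the dict standing in for an `Inference` are all hypothetical.

```python
from typing import List, NamedTuple


class Improver:
    # Reduced copy of the base class from this diff, minus Django types.
    @property
    def interesting_advisories(self):
        raise NotImplementedError

    def get_inferences(self, advisory_data):
        raise NotImplementedError

    @classmethod
    def qualified_name(cls):
        return f"{cls.__module__}.{cls.__qualname__}"


class FakeAdvisory(NamedTuple):
    # Hypothetical stand-in for AdvisoryData.
    aliases: List[str]
    summary: str


class EchoImprover(Improver):
    # Hypothetical improver: emits one dict "inference" per advisory.
    def __init__(self, advisories):
        self._advisories = advisories

    @property
    def interesting_advisories(self):
        # A real improver would return a Django QuerySet here.
        return self._advisories

    def get_inferences(self, advisory_data):
        return [
            {
                "aliases": advisory_data.aliases,
                "summary": advisory_data.summary,
                "confidence": 100,
            }
        ]


improver = EchoImprover([FakeAdvisory(["EXAMPLE-1"], "example summary")])
for advisory in improver.interesting_advisories:
    # qualified_name() gives a module-prefixed label suitable for log lines.
    print(improver.qualified_name(), improver.get_inferences(advisory))
```

Keeping advisory selection (`interesting_advisories`) separate from inference generation (`get_inferences`) lets a runner iterate over advisories and handle each one independently.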